How to Find Duplicate Values in a SQL Table using Python?

27 July 2024

4

MySQL server is an open-source relational database management system that is a major support for web-based applications. Databases and related tables are the main component of many websites and applications as the data is stored and exchanged over the web. In order to access MySQL databases from a web server, we use various modules in Python such as PyMySQL, mysql.connector, etc.

In this article, we are going to find duplicate values in a specific MySQL table in a Database. First, we are going to connect to a database having a MySQL table. The SQL query that is going to be used is:

SELECT * FROM table-name 
GROUP BY col_1, col_2,..., col_n 
HAVING COUNT(*) > 1;

If the table has a primary key then the below query can also be used:

SELECT * FROM table-name 
GROUP BY primary-key
HAVING COUNT(*) > 1;

The above queries will generate only duplicate rows in a table, and then these rows will be displayed as output.

Below are some programs which depict how to find duplicate values in a specific MySQL table in a Database:

Example 1

Below is the table Documentary in database geek which is going to be accessed by a Python script:

Below is the program to get the duplicate rows in the MySQL table:

Python3

# import required module
import mysql.connector
 
# connect python with mysql with your hostname, 
# database, user and password
db = mysql.connector.connect(host='localhost',
                             database='gfg',
                             user='root',
                             password='')
 
# create cursor object
cursor = db.cursor()
 
# get the sum of rows of a column
cursor.execute("SELECT * FROM Documentary \
               GROUP BY Name, Production \
               HAVING COUNT(*) > 1;")
 
# fetch duplicate rows and display them
print('Duplicate Rows:')               
for row in cursor.fetchall(): print(row)
 
# terminate connection
db.close()

Output:

Example 2:

Here is another example to find the duplicate rows from a table in a given database, below is the table scheme and rows:

As we can see the Roll attribute is the primary key of the Student table, hence it only can be used with the GROUP BY statement in the query to generate the duplicate rows, below is the python script to get row count from the table Student:

Python3

# import required module
import mysql.connector
 
# connect python with mysql with your hostname, 
# database, user and password
db = mysql.connector.connect(host='localhost',
                             database='gfg',
                             user='root',
                             password='')
 
# create cursor object
cursor = db.cursor()
 
# get the sum of rows of a column
cursor.execute("SELECT * FROM Student \
               GROUP BY Roll \
               HAVING COUNT(*) > 1;")
 
# fetch duplicate rows and display them
print('Duplicate Rows:')               
for row in cursor.fetchall(): print(row)
 
# terminate connection
db.close()

Output:

How to Find Duplicate Values in a SQL Table using Python?

Python3

Python3

Java Program for Longest Common Subsequence

Maximum height of Tree when any Node can be considered as Root

Print Fibonacci sequence using 2 variables

LEAVE A REPLY Cancel reply

Most Popular

Sticky Password vs. LastPass 2024: Which Is Better? by Katarina Glamoslija

Galaxy S25 on-device AI capability expands, reducing reliance on the cloud

OnePlus 13R launches with a huge battery upgrade, starting in China

This is my surprise phone of the year [Video]

Recent Comments

EDITOR PICKS

Sticky Password vs. LastPass 2024: Which Is Better? by Katarina Glamoslija

Galaxy S25 on-device AI capability expands, reducing reliance on the cloud

OnePlus 13R launches with a huge battery upgrade, starting in China

POPULAR POSTS

Sticky Password vs. LastPass 2024: Which Is Better? by Katarina Glamoslija

Galaxy S25 on-device AI capability expands, reducing reliance on the cloud

OnePlus 13R launches with a huge battery upgrade, starting in China

POPULAR CATEGORY

ABOUT US

FOLLOW US