Wednesday, November 27, 2024
Google search engine
HomeData Modelling & AIHow To Find Duplicate Values in MySQL

How To Find Duplicate Values in MySQL

Introduction

MySQL is a database application that stores data in rows and columns of different tables to avoid duplication. Duplicate values can occur, which can impact MySQL performance.

This guide will show you how to find duplicate values in a MySQL database.

tutorial on how to check duplicates in MySQL Databasetutorial on how to check duplicates in MySQL Database

Prerequisites

  • An existing installation of MySQL
  • Root user account credentials for MySQL
  • A command line / terminal window

Setting up a Sample Table (Optional)

This step will help you create a sample table to work with. If you already have a database to work on, skip to the next section.

Open a terminal window, and switch to the MySQL shell:

mysql –u root –p

Note: If you get ‘ERROR 1698’, try opening MySQL using sudo mysql instead. This error occurs on some Ubuntu installations and is caused by authentication settings.

List existing databases:

SHOW databases;
The list of MySQL databases when using the show databases commandThe list of MySQL databases when using the show databases command

Create a new database that doesn’t already exist:

CREATE database sampledb;

Select the table you just created:

USE sampledb;

Create a new table with the following fields:

CREATE TABLE dbtable (
  id INT PRIMARY KEY AUTO_INCREMENT,
  date_x VARCHAR(10) NOT NULL,
  system_x VARCHAR(50) NOT NULL,
  test VARCHAR(50) NOT NULL
);

Insert rows into the table:

INSERT INTO dbtable (date_x,system_x,test)
VALUES ('01/03/2020','system1','hard_drive'),
  ('01/04/2020','system2','memory'),
  ('01/10/2020','system2','processor'),
  ('01/14/2020','system3','hard drive'),
  ('01/10/2020','system2','processor'),
  ('01/20/2020','system4','hard drive'),
  ('01/24/2020','system5','memory'),
  ('01/29/2020','system6','hard drive'),
  ('02/02/2020','system7','motherboard'),
  ('02/04/2020','system8','graphics card'),
  ('02/02/2020','system7','motherboard'),
  ('02/08/2020','system9','hard drive');

Run the following SQL query:

SELECT * FROM dbtable
ORDER BY date_x;
MySQL output for the select from table commandMySQL output for the select from table command

Finding Duplicates in MySQL

Find Duplicate Values in a Single Column

Use the GROUP BY function to identify all identical entries in one column.  Follow up with a COUNT() HAVING function to list all groups with more than one entry.

SELECT
  test,
  COUNT(test)
FROM 
  dbtable
GROUP BY test
HAVING COUNT(test) > 1;

Find Duplicate Values in Multiple Columns

You may want to list exact duplicates, with the same information in all three columns.

SELECT
  date_x, COUNT(date_x),
  system_x, COUNT(system_x),
  test, COUNT(test)
FROM
  dbtable
GROUP BY
  date_x,
  system_x,
  test
HAVING COUNT(date_x)>1
  AND COUNT(system_x)>1
  AND COUNT(test)>1;
Select multiple columns and find exact duplicates in MySQL databaseSelect multiple columns and find exact duplicates in MySQL database

This query works by selecting and testing for the >1 condition on all three columns. The result is that only rows with duplicate values are returned in the output.

Check for Duplicates in Multiple Tables With INNER JOIN

Use the INNER JOIN function to find duplicates that exist in multiple tables.

Sample syntax for an INNER JOIN function looks like this:

SELECT column_name
  FROM table1
  INNER JOIN table2
  ON table1.column_name = table2.column name;

To test this example, you need a second table that contains some information duplicated from the sampledb table we created above.

SELECT dbtable.date_x
  FROM dbtable
  INNER JOIN new_table
  ON dbtable.date_x = new_table.date_x;

This will display any duplicate dates that exist between the existing data and the new_table.

output of the MySQL inner join command to check for duplicate valuesoutput of the MySQL inner join command to check for duplicate values

Note: The DISTINCT command can be used to return results while ignoring duplicates. Also, newer versions of MySQL use a strict mode, which can affect operations that attempt to select all columns. If you get an error, make sure that you’re selecting specific individual columns.

Conclusion

Now you can check for duplicates in MySQL data in one or multiple tables and understand the INNER JOIN function. Make sure you created the tables correctly and that you select the right columns.

Now that you have found duplicate values, learn how to remove MySQL duplicate rows.

Was this article helpful?
YesNo

RELATED ARTICLES

Most Popular

Recent Comments