Removing duplicate records in MySQL table

Question

I have a table with several thousand rows. The table contains two columns, name and email. I have multiple duplicate lines, for example:

John Smith | john@smith.com
John Smith | john@smith.com
Erica Smith | erica@smith.com
Erica Smith | erica@smith.com

What would be the easiest way to remove all duplicate results? For example, to make the table contents = SELECT name, DISTINCT(email) FROM table.

+3

sql mysql

David542 26 Mar 12 at 21:22

5 answers

The easiest way is to copy all the distinct rows into a new table:

-- MySQL has no SELECT ... INTO <table>; CREATE TABLE ... AS SELECT does the same job
create table NewTable as
select distinct *
from MyTable
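
A minimal sketch of the copy-the-distinct-rows idea above, run from Python against an in-memory SQLite database so it can be tried without a MySQL server. SQLite is only a stand-in here, and the table names `people` and `people_dedup` are made up for the example; the `CREATE TABLE ... AS SELECT DISTINCT` statement has the same shape in MySQL.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT, email TEXT)")  # hypothetical table
conn.executemany(
    "INSERT INTO people (name, email) VALUES (?, ?)",
    [
        ("John Smith", "john@smith.com"),
        ("John Smith", "john@smith.com"),
        ("Erica Smith", "erica@smith.com"),
        ("Erica Smith", "erica@smith.com"),
    ],
)

# Copy only the distinct (name, email) rows into a new table.
conn.execute(
    "CREATE TABLE people_dedup AS SELECT DISTINCT name, email FROM people"
)

dedup_rows = conn.execute(
    "SELECT name, email FROM people_dedup ORDER BY name"
).fetchall()
print(dedup_rows)
```

The original table is left untouched, so the result can be checked before anything is dropped or renamed.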

+2

keyser 26 Mar 12 at 21:25

DELETE FROM `table`
WHERE id NOT IN (
    SELECT A.id
    FROM (
        -- keep the row with the highest id in each group of duplicates
        SELECT MAX(id) AS id
        FROM `table`
        GROUP BY name, email
    ) A
)
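
To see what this kind of DELETE leaves behind, here is a small sketch using Python's sqlite3 module with an in-memory database as a stand-in for MySQL. The table name `people` and its `id` column are assumptions for the example (the table in the question has no id column), and the grouping is done on both name and email.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in for MySQL (assumption)
# Hypothetical table with a surrogate key; ids are assigned 1..4 on insert.
conn.execute(
    "CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT, email TEXT)"
)
conn.executemany(
    "INSERT INTO people (name, email) VALUES (?, ?)",
    [
        ("John Smith", "john@smith.com"),
        ("John Smith", "john@smith.com"),
        ("Erica Smith", "erica@smith.com"),
        ("Erica Smith", "erica@smith.com"),
    ],
)

# Keep the row with the highest id in each (name, email) group, delete the rest.
cur = conn.execute(
    """
    DELETE FROM people
    WHERE id NOT IN (
        SELECT keep.id
        FROM (
            SELECT MAX(id) AS id
            FROM people
            GROUP BY name, email
        ) AS keep
    )
    """
)
deleted = cur.rowcount

remaining = conn.execute(
    "SELECT id, name, email FROM people ORDER BY id"
).fetchall()
print(deleted, remaining)
```

Swapping MAX for MIN would keep the oldest row of each group instead of the newest.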

+1

Teja 26 Mar 12 at 21:26

Add an auto-increment column to the table. I believe that when you add it, it will be filled in for the existing rows. Since MySQL does not allow a DELETE that uses a subquery on the same table, the easiest solution is to then dump the entire dataset into a temp table for processing. Assuming you have called the new field RowId and the temp table tempTable, you can use the following code:

DELETE NameAndEmail
FROM NameAndEmail
INNER JOIN
(     SELECT name, email, MAX(RowId) AS MaxRowId
      FROM tempTable
      GROUP BY name, email
) AS MaxId
   ON NameAndEmail.Email = MaxId.email
  AND NameAndEmail.Name = MaxId.name
-- delete every row except the one holding the highest RowId of its group
WHERE NameAndEmail.RowId <> MaxId.MaxRowId

+1

Rose 26 Mar 12 at 21:34

Add a unique index

The easiest way to clean up a table with duplicate data is to simply add a unique index:

set session old_alter_table=1;
ALTER IGNORE TABLE `table` ADD UNIQUE INDEX (name, email);

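Pay special attention to the first SQL statement: without it the IGNORE flag is ignored and the ALTER TABLE statement will fail. Note that ALTER IGNORE TABLE was removed in MySQL 5.7.4, so this approach only works on older versions.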

+1

AD7six 26 Mar 12 at 21:41

Umbrella · Accepted Answer · 2012-03-26T21:27:16+0000

You could easily do this by selecting the de-duplicated rows into a different table and then renaming it to replace the original.

CREATE TABLE `table2` (
  `name` varchar(255),
  `email` varchar(255),
  UNIQUE KEY `email` (`email`));
-- IGNORE skips rows whose email is already in table2, leaving one row per email
INSERT IGNORE INTO `table2` (`name`, `email`) SELECT `name`, `email` FROM `table`;
RENAME TABLE `table` TO `table1`;
RENAME TABLE `table2` TO `table`;

Note that this CREATE needs to be adjusted to your actual table format. I added a unique key to the email field as a suggestion on how to prevent duplicates in the first place.
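
As a rough illustration of how a unique key stops duplicates from coming back, the sketch below uses Python's sqlite3 module with an in-memory database standing in for MySQL. The table and index names (`people`, `people_email`) are invented for the example; MySQL would report a duplicate-key error at the same point.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in for MySQL (assumption)
conn.execute("CREATE TABLE people (name TEXT, email TEXT)")  # hypothetical table
# Unique index on email, mirroring the UNIQUE KEY suggested in the answer.
conn.execute("CREATE UNIQUE INDEX people_email ON people (email)")

conn.execute(
    "INSERT INTO people (name, email) VALUES (?, ?)",
    ("John Smith", "john@smith.com"),
)

# A second row with the same email violates the unique index.
try:
    conn.execute(
        "INSERT INTO people (name, email) VALUES (?, ?)",
        ("John Smith", "john@smith.com"),
    )
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM people").fetchone()[0]
print(rejected, row_count)
```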

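Alternatively, you can run this in a loop:

DELETE FROM `table`
WHERE `email` IN (
  -- the extra derived table works around MySQL's refusal to select from the table being deleted from
  SELECT `email` FROM (
    SELECT `email` FROM `table` GROUP BY `email` HAVING count(*) > 1
  ) AS dup
) LIMIT 1

Each call deletes one duplicate entry. The LIMIT is what matters here: without it, every row of each duplicate group would be deleted, including the one you want to keep.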
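
The loop itself has to live in a host language or a stored procedure. Below is one possible sketch in Python, again using an in-memory SQLite database as a stand-in for MySQL. SQLite builds usually lack DELETE ... LIMIT, so this version picks the single row to delete through a `LIMIT 1` subquery on SQLite's implicit `rowid`; the table name `people` and the sample rows are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in for MySQL (assumption)
conn.execute("CREATE TABLE people (name TEXT, email TEXT)")  # hypothetical table
conn.executemany(
    "INSERT INTO people (name, email) VALUES (?, ?)",
    [
        ("John Smith", "john@smith.com"),
        ("John Smith", "john@smith.com"),
        ("John Smith", "john@smith.com"),
        ("Erica Smith", "erica@smith.com"),
        ("Erica Smith", "erica@smith.com"),
    ],
)

# Delete exactly one row whose email still occurs more than once.
DELETE_ONE_DUPLICATE = """
    DELETE FROM people
    WHERE rowid IN (
        SELECT rowid FROM people
        WHERE email IN (
            SELECT email FROM people GROUP BY email HAVING COUNT(*) > 1
        )
        LIMIT 1
    )
"""

deletions = 0
while True:
    cur = conn.execute(DELETE_ONE_DUPLICATE)
    if cur.rowcount == 0:  # nothing left that occurs more than once
        break
    deletions += 1

remaining = conn.execute(
    "SELECT name, email FROM people ORDER BY name"
).fetchall()
print(deletions, remaining)
```

Because only one row goes per call, the loop needs one call per surplus row, which is slow on large tables compared with the single-statement approaches.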
