Finding Duplicates using SQL

While migrating old data to a new database schema I was unable to activate a PRIMARY KEY on the legacy_customer_id field because there were duplicates. It turned out that the old application did not clean the data really well, and it allowed duplicates to be created where one of the customer_ids had a space character making it unique.

I used the following query to test for others:

SELECT
    customer_id,
    COUNT(customer_id)
FROM
    customers
GROUP BY
    customer_id
HAVING
    (COUNT(customer_id) > 1);

This allowed me to find all customer_ids that had duplicates and clean things up.

Published by

Adam

Speaker, author, consultant, OSS contributor, SoFloPHP UG and SunshinePHP Conf organizer, RunGeekRadio Host, Long distance runner and ultra marathoner.

Leave a Reply

Your email address will not be published. Required fields are marked *