Finding Duplicates using SQL

While migrating old data to a new database schema I was unable to activate a PRIMARY KEY on the legacy_customer_id field because there were duplicates. It turned out that the old application did not clean the data really well, and it allowed duplicates to be created where one of the customer_ids had a space character making it unique.

I used the following query to test for others:

SELECT
    customer_id,
    COUNT(customer_id)
FROM
    customers
GROUP BY
    customer_id
HAVING
    (COUNT(customer_id) > 1);

This allowed me to find all customer_ids that had duplicates and clean things up.

Published by

Adam

Application Architect from Florida

Leave a Reply

Your email address will not be published. Required fields are marked *