Finding Duplicates using SQL

While migrating old data to a new database schema, I was unable to add a PRIMARY KEY on the legacy_customer_id field because it contained duplicates. It turned out that the old application did not clean its input very well: it allowed duplicates to be created where one of the customer_ids contained a space character, which made the value look unique.

I used the following query to test for others:

SELECT
    customer_id,
    COUNT(customer_id)
FROM
    customers
GROUP BY
    customer_id
HAVING
    (COUNT(customer_id) > 1);

This allowed me to find all customer_ids that had duplicates and clean things up.
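Whether the query above groups 'ABC' and 'ABC ' together depends on the column type and collation, and a leading space is always treated as a different value. A minimal sketch of a variant that groups on the trimmed value instead, assuming the same customers table and customer_id column (the aliases trimmed_id and occurrences are just illustrative names):

```sql
-- Group on the value with leading/trailing spaces removed, so that
-- 'ABC', ' ABC' and 'ABC ' are counted as the same customer_id.
SELECT
    TRIM(customer_id) AS trimmed_id,
    COUNT(*) AS occurrences
FROM
    customers
GROUP BY
    TRIM(customer_id)
HAVING
    COUNT(*) > 1;
```

Note that TRIM() only removes spaces by default, so tabs or other whitespace would need separate handling.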

Using JOIN within the Zend Framework

I found the documentation very sparse on the subject of using JOIN with the Zend Framework, so I set out on a quest of many hours to figure out how to get it to work. Here is what I ended up with.

I do not claim that this is the best way to do it, or that it is correct, but here is how I solved this and got JOIN working within Zend Framework.

Continue reading Using JOIN within the Zend Framework

MySQL FIND_IN_SET

I was faced with a field in the database that held a comma-separated list of integers, with a space after each comma. The application searched this field to generate the recordset to display on a web page.

First I tried using LIKE, with something along this line: ($state_id is being passed by PHP)

SELECT
    *
FROM
    some_table  -- placeholder: the original post omits the table name
WHERE
    `assoc_states` LIKE '%$state_id%';

Well, as you can guess, if I was searching for a single-digit INT I would get anything that had that digit in it. (Ex.: searching for ‘%4%’ would match not only 4 but also entries containing 14, 24, 34, 42, 44, etc.)
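The false positives are easy to reproduce with literal strings. A small sketch, using made-up list values rather than real data:

```sql
-- The list values below are invented for illustration.
-- In MySQL, LIKE evaluates to 1 for a match and 0 for no match.
SELECT '1, 14, 24' LIKE '%4%';  -- 1: matches, although 4 itself is not in the list
SELECT '1, 2, 3' LIKE '%4%';    -- 0: no digit 4 anywhere in the string
```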

I may have been able to use RLIKE, but I didn’t fully understand it and there were no good examples of doing what I needed.

That was when I stumbled across FIND_IN_SET. Here is what MySQL.com has to say about it:

Returns a value in the range of 1 to N if the string str is in the string list strlist consisting of N substrings. A string list is a string composed of substrings separated by “,” characters. If the first argument is a constant string and the second is a column of type SET, the FIND_IN_SET() function is optimized to use bit arithmetic. Returns 0 if str is not in strlist or if strlist is the empty string. Returns NULL if either argument is NULL. This function does not work properly if the first argument contains a comma (“,”) character.

SELECT
    FIND_IN_SET('b','a,b,c,d');

-- output -> 2

Here is what I ended up with: ($state_id is being passed by PHP)

SELECT
    *
FROM
    some_table  -- placeholder: the original post omits the table name
WHERE
    FIND_IN_SET('$state_id', `assoc_states`) > 0;
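One caveat worth checking: FIND_IN_SET splits only on the comma, so with a list stored as "1, 4, 14" the elements are '1', ' 4' and ' 14', and a search for '4' does not match. A hedged sketch of one way around this, stripping the space after each comma with REPLACE() before searching (the literal lists are invented for illustration):

```sql
-- Elements are '1', ' 4', ' 14', so '4' is not found.
SELECT FIND_IN_SET('4', '1, 4, 14');                      -- 0

-- Remove the space after each comma first, then search.
SELECT FIND_IN_SET('4', REPLACE('1, 4, 14', ', ', ','));  -- 2
```

Applied to a column, the same idea would read FIND_IN_SET('$state_id', REPLACE(`assoc_states`, ', ', ',')) > 0. Wrapping the column in a function like this means an index on it cannot be used for the lookup.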

MySQL ADDDATE or DATE_ADD

This weekend I needed a couple of MySQL query items that were out of the ordinary, so I thought I would write about them.

ADDDATE and DATE_ADD work the same way, and they came in very handy. I was struggling with an application where I needed the date 42 days from now. I could not find anything in PHP that was easy to use, and after a few tries I decided to turn to MySQL. Below is what MySQL.com has to say on these:

When invoked with the INTERVAL form of the second argument, ADDDATE() is a synonym for DATE_ADD(). The related function SUBDATE() is a synonym for DATE_SUB(). For information on the INTERVAL unit argument, see the discussion for DATE_ADD().

SELECT DATE_ADD('1998-01-02', INTERVAL 42 DAY);
        -> '1998-02-13'
SELECT ADDDATE('1998-01-02', INTERVAL 42 DAY);
        -> '1998-02-13'
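For the "42 days from now" case, the fixed date can be swapped for CURDATE(). A minimal sketch of three equivalent forms (the alias due_date is just an illustrative name; the result depends on the day the query is run):

```sql
-- INTERVAL form of DATE_ADD, starting from today's date.
SELECT DATE_ADD(CURDATE(), INTERVAL 42 DAY) AS due_date;

-- Two-argument form of ADDDATE: the second argument is a number of days.
SELECT ADDDATE(CURDATE(), 42) AS due_date;

-- INTERVAL arithmetic with the + operator.
SELECT CURDATE() + INTERVAL 42 DAY AS due_date;
```

Using NOW() instead of CURDATE() would keep the time of day in the result.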