Please read this before continuing: Filter an unfiltered table against a whitelist table
So, I currently have a whitelist table set up as shown in the referenced link, and I'm encountering yet another issue brought up by said table, that is, to check the UNIQUENESS of each column. As MySQL's specification, it is not possible to set NULL column as UNIQUE, so, I've decided to come up with a different solution to check if rows are duplicated or not by using a SELECT GROUP BY query as follows.
SELECT GROUP_CONCAT(ID) AS IDs, country, region, item, count(*) AS amount
FROM whitelist
Now, to check if the item is duplicated, I've warpped it on top of another layer.
SELECT IDs, country, region, item, amount
FROM (SELECT GROUP_CONCAT(ID) AS IDs, country, region, item, count(*) AS amount
FROM whitelist) tmp
WHERE amount > 1
Still works fine as intended, but the question starts here.
Is it possible for me to use this data, and RE-SELECT the whitelist table so I can get each entry as a row with something like ...
SELECT ID, country, region, item
FROM whitelist
WHERE ID IN (SELECT group_concat(ID)
FROM (SELECT group_concat(ID) AS ID, country, region, item, COUNT(*) AS AMOUNT
FROM whitelist
GROUP BY country, region, item) tmp
WHERE AMOUNT > 1)
Of course, I could just use PHP and explode the group_concat IDs and re-select it, but I'm wondering if it's possible to do it in one SQL query call instead of two.
Edit: Oops, the example above had an error in it (accidentally used real schema there xD)
Edit2: Doh, I suddenly thought why complicate things and why not just simply go with this ...
SELECT wl1.ID, wl1.country, wl1.region, wl1.item, wl1.reason
FROM whitelist wl1,
(SELECT country, region, item
FROM whitelist
GROUP BY country, region, item
HAVING count(*) > 1) wl2
WHERE wl1.country = wl2.country AND
wl1.region = wl2.region AND
wl1.item = wl2.reason
... but still fails too, because you cannot use = on two NULL columns. Urgh, so close yet so far >.<
To: Bill Karwin
That is exactly the issue here. If I set a unique key on country, region, item, and I perform the following SQL, this will happen.
INSERT INTO whitelist(country, region, item) VALUES ('Taiwan', 'Asia', 'PC');
INSERT INTO whitelist(country, region, item) VALUES ('Taiwan', 'Asia', 'PC');
-- Would fail due to UNIQUE check
However, if I include any of the wildcards, aka NULL, and this would happen.
INSERT INTO whitelist(country, region, item) VALUES (NULL, 'Asia', 'Rice');
INSERT INTO whitelist(country, region, item) VALUES (NULL, 'Asia', 'Rice');
-- Would succeed due to UNIQUE does not check NULL columns.
Hence the idea of this post is to list all repeating whitelist in a list so that the operator can decide what to keep and what to delete.