可以将文章内容翻译成中文,广告屏蔽插件可能会导致该功能失效(如失效,请关闭广告屏蔽插件后再试):
问题:
I have a query that returns a result set similar to the one below (in reality it is far bigger, thousands of rows):
A | B | C | D
-----|----|----|-----
1 NULL | d0 | d0 | NULL
2 NULL | d0 | d1 | NULL
3 NULL | d0 | d2 | a0
4 d0 | d1 | d1 | NULL
5 d0 | d2 | d2 | a0
Two of the rows are considered duplicates, 1 and 2, because A, B and D are the same. To eliminate this, I could use SELECT DISTINCT A, B, D
but then I do not get column C in my result set. Column C is necessary information for rows 3, 4 and 5.
So how do I come from the result set above to this one (the result appearing in C4 can also be NULL
instead of d1
):
A | B | C | D
-----|----|------|-----
1 NULL | d0 | NULL | NULL
3 NULL | d0 | d2 | a0
4 d0 | d1 | d1 | NULL
5 d0 | d2 | d2 | a0
回答1:
DECLARE @YourTable TABLE (
A VARCHAR(2)
, B VARCHAR(2)
, C VARCHAR(2)
, D VARCHAR(2))
INSERT INTO @YourTable VALUES (NULL, 'd0', 'd0', NULL)
INSERT INTO @YourTable VALUES (NULL, 'd0', 'd1', NULL)
INSERT INTO @YourTable VALUES (NULL, 'd0', 'd2', 'a0')
INSERT INTO @YourTable VALUES ('d0', 'd1', 'd1', NULL)
INSERT INTO @YourTable VALUES ('d0', 'd2', 'd2', 'a0')
SELECT A, B, C = MIN(C), D
FROM @YourTable
GROUP BY A, B, D
SELECT A, B, CASE WHEN MIN(C) = MAX(C) THEN MIN(C) ELSE NULL END, D
FROM @YourTable
GROUP BY A, B, D
SELECT A, B, CASE WHEN MIN(COALESCE(C, 'dx')) = MAX(COALESCE(C, 'dx')) THEN MIN(C) ELSE NULL END, D
FROM @YourTable
GROUP BY A, B, D
回答2:
Use Dense_Rank() to partition by A, B, and D
(Thanks Lieven, for the temp table query, I had to use it for demo to be consistent ;))
According to MSDN,
The rank of a row is one plus the number of distinct ranks that come before the row in question
Partitioning by A, B, C
and then sorting by A, B, C, D
will give you the rank of 1 for the first distinct value where uniqueness is defined by A, B, D
. That is where filtering by 1
came from.
where DenseRank = 1
Here is the result
Here is the code:
DECLARE @YourTable TABLE (
A VARCHAR(2)
, B VARCHAR(2)
, C VARCHAR(2)
, D VARCHAR(2))
INSERT INTO @YourTable VALUES (NULL, 'd0', 'd0', NULL)
INSERT INTO @YourTable VALUES (NULL, 'd0', 'd1', NULL)
INSERT INTO @YourTable VALUES (NULL, 'd0', 'd2', 'a0')
INSERT INTO @YourTable VALUES ('d0', 'd1', 'd1', NULL)
INSERT INTO @YourTable VALUES ('d0', 'd2', 'd2', 'a0')
;with DistinctTable as (
select *,
DenseRank = Dense_Rank() over (Partition By A, B, D order by A, B, C, D)
from @YourTable
)
select A, B, C, D
from DistinctTable
where DenseRank = 1
回答3:
A subquery perhaps?
SELECT A,B,C,D FROM table1 WHERE EXISTS ( SELECT DISTINCT A,B,D FROM table1 );
回答4:
The fact you have NULLs in A and D compicates matters for any EXISTS.
Any MIN/MAX solution on C may not give you NULL as I think you want. Otherwise, use MIN(C) and a simple group by.
You have to extract the unique keys first (A, B, D), then use that to determine extract the rows again and work out what to do with C
DECLARE @TheTable TABLE (
A varchar(2) NULL,
B varchar(2) NULL,
C varchar(2) NULL,
D varchar(2) NULL
)
INSERT INTO @TheTable VALUES (NULL, 'd0', 'd0', NULL)
INSERT INTO @TheTable VALUES (NULL, 'd0', 'd1', NULL)
INSERT INTO @TheTable VALUES (NULL, 'd0', 'd2', 'a0')
INSERT INTO @TheTable VALUES ('d0', 'd1', 'd1', NULL)
INSERT INTO @TheTable VALUES ('d0', 'd2', 'd2', 'a0')
SELECT DISTINCT
T.A,
T.B,
CASE Number WHEN 1 THEN T.C ELSE NULL END,
T.D
FROM
(SELECT
COUNT(*) AS Number,
A, B, D
FROM
@TheTable
GROUP BY
A, B, D
) UQ
JOIN
@TheTable T ON ISNULL(T.A, '') = ISNULL(UQ.A, '') AND ISNULL(T.B, '') = ISNULL(UQ.B, '') AND ISNULL(T.D, '') = ISNULL(UQ.D, '')
回答5:
if you have an unique id in the table, then i would go for something like this:
SELECT A,B,C,D FROM table WHERE id IN (SELECT DISTINCT A,B,D)
The problem is that you would always get the first value of C, not the first one with an value.