Counting DISTINCT over multiple columns

Is there a better way of doing a query like this:

SELECT COUNT(*) 
FROM (SELECT DISTINCT DocumentId, DocumentSessionId
      FROM DocumentOutputItems) AS internalQuery

I need to count the number of distinct items from this table but the distinct is over two columns.

My query works fine but I was wondering if I can get the final result using just one query (without using a sub-query)

标签： sql sql-server tsql

17条回答

何必那么认真

2楼-- · 2019-01-05 09:36

It works for me. In oracle:

SELECT SUM(DECODE(COUNT(*),1,1,1))
FROM DocumentOutputItems GROUP BY DocumentId, DocumentSessionId;

In jpql:

SELECT SUM(CASE WHEN COUNT(i)=1 THEN 1 ELSE 1 END)
FROM DocumentOutputItems i GROUP BY i.DocumentId, i.DocumentSessionId;

0人赞添加讨论(0) 举报

傲

3楼-- · 2019-01-05 09:38

How about something like:

select count(*)
from
  (select count(*) cnt
   from DocumentOutputItems
   group by DocumentId, DocumentSessionId) t1

Probably just does the same as you are already though but it avoids the DISTINCT.

0人赞添加讨论(0) 举报

smile是对你的礼貌

4楼-- · 2019-01-05 09:38

I wish MS SQL could also do something like COUNT(DISTINCT A, B). But it can't.

At first JayTee's answer seemed like a solution to me bu after some tests CHECKSUM() failed to create unique values. A quick example is, both CHECKSUM(31,467,519) and CHECKSUM(69,1120,823) gives the same answer which is 55.

Then I made some research and found that Microsoft does NOT recommend using CHECKSUM for change detection purposes. In some forums some suggested using

SELECT COUNT(DISTINCT CHECKSUM(value1, value2, ..., valueN) + CHECKSUM(valueN, value(N-1), ..., value1))

but this is also not conforting.

You can use HASHBYTES() function as suggested in TSQL CHECKSUM conundrum. However this also has a small chance of not returning unique results.

I would suggest using

SELECT COUNT(DISTINCT CAST(DocumentId AS VARCHAR)+'-'+CAST(DocumentSessionId AS VARCHAR)) FROM DocumentOutputItems

0人赞添加讨论(0) 举报

一纸荒年 Trace。

5楼-- · 2019-01-05 09:39

There's nothing wrong with your query, but you could also do it this way:

WITH internalQuery (Amount)
AS
(
    SELECT (0)
      FROM DocumentOutputItems
  GROUP BY DocumentId, DocumentSessionId
)
SELECT COUNT(*) AS NumberOfDistinctRows
  FROM internalQuery

0人赞添加讨论(0) 举报

Explosion°爆炸

6楼-- · 2019-01-05 09:41

Hope this works i am writing on prima vista

SELECT COUNT(*) 
FROM DocumentOutputItems 
GROUP BY DocumentId, DocumentSessionId

0人赞添加讨论(0) 举报

爷的心禁止访问

7楼-- · 2019-01-05 09:42

This was posed and answered on Quora (https://www.quora.com/In-SQL-how-to-I-count-DISTINCT-over-multiple-columns):

select col1, col2, col3, count(*)
from table
group by col1, col2, col3

I was working on this in SAS, and SAS Proc SQL does not like DISTINCT with more than one column.

0人赞添加讨论(0) 举报

1 2 3 下一页

Counting DISTINCT over multiple columns

采纳回答

编辑标签

举报内容

检举类型

检举原因

检举说明(必填)

打开微信“扫一扫”，打开网页后点击屏幕右上角分享按钮

付费偷看金额在0.1-10元之间