Is there a better way of doing a query like this:
SELECT COUNT(*)
FROM (SELECT DISTINCT DocumentId, DocumentSessionId
FROM DocumentOutputItems) AS internalQuery
I need to count the number of distinct items from this table but the distinct is over two columns.
My query works fine but I was wondering if I can get the final result using just one query (without using a sub-query)
What is it about your existing query that you don't like? If you are concerned that
DISTINCT
across two columns does not return just the unique permutations why not try it?It certainly works as you might expect in Oracle.
edit
I went down a blind alley with analytics but the answer was depressingly obvious...
edit 2
Given the following data the concatenating solution provided above will miscount:
So we to include a separator...
Obviously the chosen separator must be a character, or set of characters, which can never appear in either column.
Many (most?) SQL databases can work with tuples like values so you can just do:
SELECT COUNT(DISTINCT (DocumentId, DocumentSessionId)) FROM DocumentOutputItems;
If your database doesn't support this, it can be simulated as per @oncel-umut-turer's suggestion of CHECKSUM or other scalar function providing good uniqueness e.g.COUNT(DISTINCT CONCAT(DocumentId, ':', DocumentSessionId))
.A related use of tuples is performing
IN
queries such as:SELECT * FROM DocumentOutputItems WHERE (DocumentId, DocumentSessionId) in (('a', '1'), ('b', '2'));
How about this,
This will get us the count of all possible combinations of DocumentId, and DocumentSessionId
You can just use the Count Function Twice.
In this case, it would be:
Edit: Altered from the less-than-reliable checksum-only query I've discovered a way to do this (in SQL Server 2005) that works pretty well for me and I can use as many columns as I need (by adding them to the CHECKSUM() function). The REVERSE() function turns the ints into varchars to make the distinct more reliable