If I have a table
CREATE TABLE users (
id int(10) unsigned NOT NULL auto_increment,
name varchar(255) NOT NULL,
profession varchar(255) NOT NULL,
employer varchar(255) NOT NULL,
PRIMARY KEY (id)
)
and I want to get all unique values of profession
field, what would be faster (or recommended):
SELECT DISTINCT u.profession FROM users u
or
SELECT u.profession FROM users u GROUP BY u.profession
?
In MySQL, "
Group By
" uses an extra step:filesort
. I realizeDISTINCT
is faster thanGROUP BY
, and that was a surprise.Group by is expensive than Distinct since Group by does a sort on the result while distinct avoids it. But if you want to make group by yield the same result as distinct give order by null ..
is equal to
Here is a simple approach which will print the 2 different elapsed time for each query.
OR try SET STATISTICS TIME (Transact-SQL)
It simply displays the number of milliseconds required to parse, compile, and execute each statement as below:
After heavy testing we came to the conclusion that GROUP BY is faster
SELECT sql_no_cache opnamegroep_intern FROM
telwerken
WHEREopnemergroep
IN (7,8,9,10,11,12,13) group by opnamegroep_intern635 totaal 0.0944 seconds Weergave van records 0 - 29 ( 635 totaal, query duurde 0.0484 sec)
SELECT sql_no_cache distinct (opnamegroep_intern) FROM
telwerken
WHEREopnemergroep
IN (7,8,9,10,11,12,13)635 totaal 0.2117 seconds ( almost 100% slower ) Weergave van records 0 - 29 ( 635 totaal, query duurde 0.3468 sec)
(more of a functional note)
There are cases when you have to use GROUP BY, for example if you wanted to get the number of employees per employer:
In such a scenario
DISTINCT u.employer
doesn't work right. Perhaps there is a way, but I just do not know it. (If someone knows how to make such a query with DISTINCT please add a note!)SELECT DISTINCT will always be the same, or faster, than a GROUP BY. On some systems (i.e. Oracle), it might be optimized to be the same as DISTINCT for most queries. On others (such as SQL Server), it can be considerably faster.