What's faster, SELECT DISTINCT or GROUP BY in-第2页回答

If I have a table

CREATE TABLE users (
  id int(10) unsigned NOT NULL auto_increment,
  name varchar(255) NOT NULL,
  profession varchar(255) NOT NULL,
  employer varchar(255) NOT NULL,
  PRIMARY KEY  (id)
)

and I want to get all unique values of profession field, what would be faster (or recommended):

SELECT DISTINCT u.profession FROM users u

SELECT u.profession FROM users u GROUP BY u.profession

标签： mysql sql database group-by distinct

15条回答

路过你的时光

2楼-- · 2019-01-01 12:41

In MySQL, "Group By" uses an extra step: filesort. I realize DISTINCT is faster than GROUP BY, and that was a surprise.

0人赞添加讨论(0) 举报

孤独总比滥情好

3楼-- · 2019-01-01 12:43

Group by is expensive than Distinct since Group by does a sort on the result while distinct avoids it. But if you want to make group by yield the same result as distinct give order by null ..

SELECT DISTINCT u.profession FROM users u

is equal to

SELECT u.profession FROM users u GROUP BY u.profession order by null

0人赞添加讨论(0) 举报

呛了眼睛熬了心

4楼-- · 2019-01-01 12:45

Here is a simple approach which will print the 2 different elapsed time for each query.

DECLARE @t1 DATETIME;
DECLARE @t2 DATETIME;

SET @t1 = GETDATE();
SELECT DISTINCT u.profession FROM users u; --Query with DISTINCT
SET @t2 = GETDATE();
PRINT 'Elapsed time (ms): ' + CAST(DATEDIFF(millisecond, @t1, @t2) AS varchar);

SET @t1 = GETDATE();
SELECT u.profession FROM users u GROUP BY u.profession; --Query with GROUP BY
SET @t2 = GETDATE();
PRINT 'Elapsed time (ms): ' + CAST(DATEDIFF(millisecond, @t1, @t2) AS varchar);

OR try SET STATISTICS TIME (Transact-SQL)

SET STATISTICS TIME ON;
SELECT DISTINCT u.profession FROM users u; --Query with DISTINCT
SELECT u.profession FROM users u GROUP BY u.profession; --Query with GROUP BY
SET STATISTICS TIME OFF;

It simply displays the number of milliseconds required to parse, compile, and execute each statement as below:

 SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 2 ms.

0人赞添加讨论(0) 举报

其实，你不懂

5楼-- · 2019-01-01 12:45

After heavy testing we came to the conclusion that GROUP BY is faster

SELECT sql_no_cache opnamegroep_intern FROM telwerken WHERE opnemergroep IN (7,8,9,10,11,12,13) group by opnamegroep_intern

635 totaal 0.0944 seconds Weergave van records 0 - 29 ( 635 totaal, query duurde 0.0484 sec)

SELECT sql_no_cache distinct (opnamegroep_intern) FROM telwerken WHERE opnemergroep IN (7,8,9,10,11,12,13)

635 totaal 0.2117 seconds ( almost 100% slower ) Weergave van records 0 - 29 ( 635 totaal, query duurde 0.3468 sec)

0人赞添加讨论(0) 举报

与君花间醉酒

6楼-- · 2019-01-01 12:47

(more of a functional note)

There are cases when you have to use GROUP BY, for example if you wanted to get the number of employees per employer:

SELECT u.employer, COUNT(u.id) AS "total employees" FROM users u GROUP BY u.employer

In such a scenario DISTINCT u.employer doesn't work right. Perhaps there is a way, but I just do not know it. (If someone knows how to make such a query with DISTINCT please add a note!)

0人赞添加讨论(0) 举报

琉璃瓶的回忆

7楼-- · 2019-01-01 12:49

SELECT DISTINCT will always be the same, or faster, than a GROUP BY. On some systems (i.e. Oracle), it might be optimized to be the same as DISTINCT for most queries. On others (such as SQL Server), it can be considerably faster.

0人赞添加讨论(0) 举报

What's faster, SELECT DISTINCT or GROUP BY in

采纳回答

编辑标签

举报内容

检举类型

检举原因

检举说明(必填)

打开微信“扫一扫”，打开网页后点击屏幕右上角分享按钮

付费偷看金额在0.1-10元之间