What's faster, SELECT DISTINCT or GROUP BY in

If I have a table

CREATE TABLE users (
  id int(10) unsigned NOT NULL auto_increment,
  name varchar(255) NOT NULL,
  profession varchar(255) NOT NULL,
  employer varchar(255) NOT NULL,
  PRIMARY KEY  (id)
)

and I want to get all unique values of profession field, what would be faster (or recommended):

SELECT DISTINCT u.profession FROM users u

SELECT u.profession FROM users u GROUP BY u.profession

标签： mysql sql database group-by distinct

15条回答

笑指拈花

2楼-- · 2019-01-01 12:34

They are essentially equivalent to each other (in fact this is how some databases implement DISTINCT under the hood).

If one of them is faster, it's going to be DISTINCT. This is because, although the two are the same, a query optimizer would have to catch the fact that your GROUP BY is not taking advantage of any group members, just their keys. DISTINCT makes this explicit, so you can get away with a slightly dumber optimizer.

When in doubt, test!

0人赞添加讨论(0) 举报

与风俱净

3楼-- · 2019-01-01 12:34

If you don't have to do any group functions (sum, average etc in case you want to add numeric data to the table), use SELECT DISTINCT. I suspect it's faster, but i have nothing to show for it.

In any case, if you're worried about speed, create an index on the column.

0人赞添加讨论(0) 举报

ら面具成の殇う

4楼-- · 2019-01-01 12:36

This is not a rule

For each query .... try separately distinct and then group by ... compare the time to complete each query and use the faster ....

In my project sometime I use group by and others distinct

0人赞添加讨论(0) 举报

几人难应

5楼-- · 2019-01-01 12:37

Go for the simplest and shortest if you can -- DISTINCT seems to be more what you are looking for only because it will give you EXACTLY the answer you need and only that!

0人赞添加讨论(0) 举报

只靠听说

6楼-- · 2019-01-01 12:40

It seems that the queries are not exactly the same. At least for MySQL.

Compare:

describe select distinct productname from northwind.products
describe select productname from northwind.products group by productname

The second query gives additionally "Using filesort" in Extra.

0人赞添加讨论(0) 举报

情到深处是孤独

7楼-- · 2019-01-01 12:41

All of the answers above are correct, for the case of DISTINCT on a single column vs GROUP BY on a single column. Every db engine has its own implementation and optimizations, and if you care about the very little difference (in most cases) then you have to test against specific server AND specific version! As implementations may change...

BUT, if you select more than one column in the query, then the DISTINCT is essentially different! Because in this case it will compare ALL columns of all rows, instead of just one column.

So if you have something like:

// This will NOT return unique by [id], but unique by (id,name)
SELECT DISTINCT id, name FROM some_query_with_joins

// This will select unique by [id].
SELECT id, name FROM some_query_with_joins GROUP BY id

It is a common mistake to think that DISTINCT keyword distinguishes rows by the first column you specified, but the DISTINCT is a general keyword in this manner.

So people you have to be careful not to take the answers above as correct for all cases... You might get confused and get the wrong results while all you wanted was to optimize!

0人赞添加讨论(0) 举报

1 2 3 下一页

What's faster, SELECT DISTINCT or GROUP BY in

采纳回答

编辑标签

举报内容

检举类型

检举原因

检举说明(必填)

打开微信“扫一扫”，打开网页后点击屏幕右上角分享按钮

付费偷看金额在0.1-10元之间