How to determine which is more efficient: DISTINCT or EXISTS

Posted 2019-01-25 10:05

Question:

For example, I have three tables: user, group and permission, and two many-to-many relationships between them: user_groups and group_permission.

I need to select all permissions of a given user, without duplicates. Every time I run into a problem like this, I can't tell which version of the query is better:

SELECT permisson_id FROM group_permission WHERE EXISTS(
    SELECT 1 FROM user_groups 
        WHERE user_groups.user_id = 42 
          AND user_groups.group_id = group_permission.group_id
)

SELECT DISTINCT permisson_id FROM group_permission
    INNER JOIN user_groups ON user_groups.user_id = 42 
           AND user_groups.group_id = group_permission.group_id 

I have enough experience to draw conclusions from EXPLAIN. The first query has a subquery, but in my experience it is the faster one, perhaps because of the large number of permissions filtered out of the result.

What would you do in this situation? Why? Thanks!

Answer 1:

Use EXISTS Rather than DISTINCT

You can suppress the display of duplicate rows using DISTINCT; you use EXISTS to check for the existence of rows returned by a subquery. Whenever possible, you should use EXISTS rather than DISTINCT because DISTINCT sorts the retrieved rows before suppressing the duplicate rows.

In your case there would be a lot of duplicate rows, so EXISTS should be faster.

Source: http://my.safaribooksonline.com/book/-/9780072229813/high-performance-sql-tuning/ch16lev1sec10
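
If you want to check this on your own data rather than rely on a rule of thumb, you can run both queries side by side and look at their plans. Below is a minimal sketch, not a benchmark. It assumes SQLite through Python's sqlite3 module (the question does not say which database is used), and the sample rows are made up for illustration. It also shows something worth verifying before comparing speed: EXISTS only avoids the duplicates created by the join, so if the same permission is granted through two of the user's groups, the EXISTS query still returns it twice.

```python
import sqlite3

# In-memory SQLite database; the engine is an assumption, the thread names none.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user_groups (user_id INTEGER, group_id INTEGER);
    CREATE TABLE group_permission (group_id INTEGER, permisson_id INTEGER);
""")

# Hypothetical sample data: user 42 is in groups 1 and 2,
# and both of those groups grant permission 10.
conn.executemany("INSERT INTO user_groups VALUES (?, ?)",
                 [(42, 1), (42, 2), (7, 3)])
conn.executemany("INSERT INTO group_permission VALUES (?, ?)",
                 [(1, 10), (1, 11), (2, 10), (2, 12), (3, 13)])

# The two queries from the question (column name spelled as in the original).
EXISTS_QUERY = """
    SELECT permisson_id FROM group_permission WHERE EXISTS (
        SELECT 1 FROM user_groups
        WHERE user_groups.user_id = 42
          AND user_groups.group_id = group_permission.group_id
    )
"""
DISTINCT_QUERY = """
    SELECT DISTINCT permisson_id FROM group_permission
    INNER JOIN user_groups ON user_groups.user_id = 42
           AND user_groups.group_id = group_permission.group_id
"""

def plan(query):
    # EXPLAIN QUERY PLAN is SQLite-specific; other databases have their own
    # EXPLAIN variants. The last column of each row is the readable detail.
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + query)]

exists_rows = [r[0] for r in conn.execute(EXISTS_QUERY)]
distinct_rows = [r[0] for r in conn.execute(DISTINCT_QUERY)]

# Permission 10 appears twice for EXISTS because groups 1 and 2 both grant it.
print("EXISTS  :", sorted(exists_rows))    # [10, 10, 11, 12]
print("DISTINCT:", sorted(distinct_rows))  # [10, 11, 12]

for name, query in (("EXISTS", EXISTS_QUERY), ("DISTINCT", DISTINCT_QUERY)):
    print(name, "plan:")
    for step in plan(query):
        print("   ", step)
```

So the two queries are only interchangeable when no permission can reach the user through more than one group. If it can, the EXISTS version still needs some form of de-duplication, and the fair comparison is between queries that return the same rows.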