sql statements with equals vs in

2019-01-25 01:30发布

问题:

Say that someone came up to you and said we're going to cut down the amount of SQL that we write by replacing equals with IN. The use would be both for single scalar values and lists of numbers.

SELECT * 
  FROM table 
 WHERE id = 1

OR

SELECT * 
  FROM table 
 WHERE id IN (1)

Are these statement equivalent to what the optimizer produces?

This looks really simple on the surface, but it leads to simplification for two reasons: 1. large blocks of SQL don't need to be duplicated, and 2. we don't overuse dynamic SQL.

This is a contrived example, but consider the following.

select a.* from tablea a 
join tableb b on a.id = b.id
join tablec c on b.id2 = c.id2
left join tabled d on c.id3 = c.id3
where d.type = 1

... and the same again for the more than one case

select a.* from tablea a 
join tableb b on a.id = b.id
join tablec c on b.id2 = c.id2
left join tabled d on c.id3 = c.id3
where d.type in (1,2,3,4)

(this isn't even a large statement)

conceivably you could do string concatenation, but this isn't desirable in light of ORM usage, and dynamic SQL string concatenation always starts off with good intentions (at least in these parts).

回答1:

The two will produce the same execution plan - either a table scan, index scan, or index seek, depending on if/how you have your table indexed.

You can see for yourself - Displaying Graphical Execution Plans (SQL Server Management Studio) - See the section called "Using the Execution Plan Options".



回答2:

Those two specific statements are equivalent to the optimizer (did you compare the execution plans?), but I think the more important benefit you get out of the latter is that

WHERE id = 1 OR id = 2 OR id = 3 OR id = 4 OR id = 5

Can be expressed as the following, much more concise and readable (but semantically equivalent to the optimizer) version:

WHERE id IN (1,2,3,4,5)

The problem is that when expressed as the latter most people think they can pass a string, like @list = '1,2,3,4,5' and then say:

WHERE id IN (@list)

This does not work because @list is a single scalar string, and not an array of integers.

For cases where you have a single value, I don't see how that "optimization" helps anything. You haven't written less SQL, you've actually written more. Can you outline in more detail how this is going to lead to less SQL?