Performance differences between equal (=) and IN w

Posted 2019-02-02 22:46

Question:

How do SQL engines differ when the equals sign and the IN operator are given the same single value? Does the execution time change?

The first one uses the equality operator:

WHERE column_value = 'All'

The second one uses the IN operator with a single value:

WHERE column_value IN ('All')

Does the SQL engine change IN to = if there is only one value?

Is there any difference in this respect between MySQL and PostgreSQL?

Answer 1:

There is no difference between those two statements: the optimizer will transform the IN into = when the IN list has just one element.

That said, when you have a question like this, just run both statements, look at their execution plans, and compare them. Here you won't find any differences.
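The "run both and compare the plans" advice can be sketched in a few lines. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module and SQLite's EXPLAIN QUERY PLAN, which is not one of the engines asked about, and the table items and index idx_items_column_value are hypothetical names made up for the example. On MySQL, PostgreSQL, or Oracle you would use that engine's own EXPLAIN facility instead.

```python
import sqlite3


def query_plan(conn, sql):
    """Return the human-readable 'detail' column of EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The detail text is the last column of every row.
    return [row[-1] for row in rows]


# Hypothetical table and index, created in memory just for this comparison.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, column_value TEXT)")
conn.execute("CREATE INDEX idx_items_column_value ON items (column_value)")
conn.executemany(
    "INSERT INTO items (column_value) VALUES (?)",
    [("All",), ("Some",), ("None",)],
)

eq_plan = query_plan(conn, "SELECT * FROM items WHERE column_value = 'All'")
in_plan = query_plan(conn, "SELECT * FROM items WHERE column_value IN ('All')")

print("=  :", eq_plan)
print("IN :", in_plan)
print("same plan:", eq_plan == in_plan)
```

The exact wording of the plan text varies between SQLite versions, so the sketch compares the two plans with each other rather than against a fixed string.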

After a long search online, I found a SQL document that supports this (I assume it applies to all DBMSs):

If there is only one value inside the parentheses, this command is equivalent to

WHERE "column_name" = 'value1'

Here is the link to the document.

Here is the execution plan of both queries in Oracle (most DBMSs will process this the same way):

EXPLAIN PLAN FOR
select * from dim_employees t
where t.identity_number = '123456789';

Plan hash value: 2312174735
-----------------------------------------------------
| Id  | Operation                   | Name          |
-----------------------------------------------------
|   0 | SELECT STATEMENT            |               |
|   1 |  TABLE ACCESS BY INDEX ROWID| DIM_EMPLOYEES |
|   2 |   INDEX UNIQUE SCAN         | SYS_C0029838  |
-----------------------------------------------------

And for IN():

EXPLAIN PLAN FOR
select * from dim_employees t
where t.identity_number in('123456789');

Plan hash value: 2312174735
-----------------------------------------------------
| Id  | Operation                   | Name          |
-----------------------------------------------------
|   0 | SELECT STATEMENT            |               |
|   1 |  TABLE ACCESS BY INDEX ROWID| DIM_EMPLOYEES |
|   2 |   INDEX UNIQUE SCAN         | SYS_C0029838  |
-----------------------------------------------------

As you can see, both are identical. This is on an indexed column. The same goes for an unindexed column (both just do a full table scan).



Answer 2:

There is no difference when you use it with a single value. If you check whether each of the two queries above does a table scan, an index scan, or an index seek, you will find no difference between them.

Is there any difference in this respect between MySQL and PostgreSQL?

No, it makes no difference on either engine (in fact, it is the same for most databases, including SQL Server and Oracle). Both engines will convert IN to =.



Answer 3:

There are no big differences really, but if your column_value is indexed, a query using the IN operator may not use the index.

I encountered this problem once, so be careful.



Answer 4:

Teach a man to fish, etc. Here's how to see for yourself what variations on your queries will do:

mysql> EXPLAIN SELECT * FROM sentence WHERE sentence_lang_id = "AMH"\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: sentence
         type: ref
possible_keys: sentence_lang_id
          key: sentence_lang_id
      key_len: 153
          ref: const
         rows: 442
        Extra: Using where

And let's try it the other way:

mysql> EXPLAIN SELECT * FROM sentence WHERE sentence_lang_id in ("AMH")\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: sentence
         type: ref
possible_keys: sentence_lang_id
          key: sentence_lang_id
      key_len: 153
          ref: const
         rows: 442
        Extra: Using where

You can read here about how to interpret the results of a MySQL EXPLAIN request. For now, note that we got identical output for both queries: exactly the same "execution plan" is generated. The type row tells us that the query uses a non-unique index (a foreign key, in this case), and the ref row tells us that the query is executed by comparing a constant value against this index.



Answer 5:

For an IN clause with a single value, there is no difference. Below is a demo using an Emps table I have.

select * from emps where empid in (1)
select * from emps where empid=1

Predicate for the first query in the execution plan:

[PerformanceV3].[dbo].[Emps].[empID]=CONVERT_IMPLICIT(int,[@1],0)

Predicate for the second query in the execution plan:

[PerformanceV3].[dbo].[Emps].[empID]=CONVERT_IMPLICIT(int,[@1],0)

If you have multiple values in the IN clause, it is better to convert them to joins.
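As a rough illustration of what "converting an IN list to a join" can look like, the sketch below loads the wanted values into a temporary table and joins against it. It assumes Python's built-in sqlite3 module rather than SQL Server, and the name column and the wanted_ids table are hypothetical. It only shows that the two forms return the same rows; whether the join is actually faster depends on the engine and the data, so check the execution plan on your own system.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# emps/empid follow the answer above; the name column is a made-up extra.
conn.execute("CREATE TABLE emps (empid INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO emps VALUES (?, ?)",
    [(i, f"emp{i}") for i in range(1, 11)],
)

wanted = [2, 5, 7]

# Version 1: an IN list, built from one placeholder per value.
placeholders = ", ".join("?" for _ in wanted)
in_rows = conn.execute(
    f"SELECT empid, name FROM emps WHERE empid IN ({placeholders}) "
    "ORDER BY empid",
    wanted,
).fetchall()

# Version 2: put the values in a (hypothetical) temp table and join on it.
conn.execute("CREATE TEMP TABLE wanted_ids (empid INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO wanted_ids VALUES (?)", [(i,) for i in wanted])
join_rows = conn.execute(
    "SELECT e.empid, e.name FROM emps AS e "
    "JOIN wanted_ids AS w ON w.empid = e.empid "
    "ORDER BY e.empid"
).fetchall()

print(in_rows)
print(join_rows)
print("same rows:", in_rows == join_rows)
```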



Answer 6:

Just to add a different perspective: one of the main points of an RDBMS is that it rewrites your query for you and picks the best execution plan for that query and all equivalent ones. This means that as long as two queries are logically identical, they should always generate the same execution plan on a given RDBMS.

That being said, many queries are equivalent (same result set) only because of constraints the database itself is unaware of, so be careful about those cases (e.g. for a flag field with values 1-6, the database doesn't know that < 3 is the same as IN (1, 2)). But at the end of the day, if you're just thinking about the legibility of AND and OR statements, it won't make a difference for performance which way you write them.
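The flag example can be made concrete with a small sketch. It assumes Python's built-in sqlite3 module and a hypothetical tickets table with no constraint on flag. The two predicates agree only as long as the data happens to stay within 1-6, which is exactly why the database cannot treat them as the same query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: the schema places no constraint on the flag values.
conn.execute("CREATE TABLE tickets (id INTEGER PRIMARY KEY, flag INTEGER)")
conn.executemany(
    "INSERT INTO tickets (flag) VALUES (?)",
    [(f,) for f in range(1, 7)],
)


def matching_ids(where):
    """Return the ids of rows matching a WHERE clause, in id order."""
    sql = f"SELECT id FROM tickets WHERE {where} ORDER BY id"
    return [row[0] for row in conn.execute(sql)]


# With flags limited to 1-6, both predicates select the same rows.
same_on_valid_data = matching_ids("flag < 3") == matching_ids("flag IN (1, 2)")

# Nothing in the schema stops a value outside 1-6 from being stored.
conn.execute("INSERT INTO tickets (flag) VALUES (0)")
same_after_stray_row = matching_ids("flag < 3") == matching_ids("flag IN (1, 2)")

print("same on valid data:", same_on_valid_data)
print("same after a stray 0:", same_after_stray_row)
```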



Answer 7:

You will need to get the execution plan for both queries and compare the results.

I believe they will have the same execution plan, since IN() is handled the same way as a plain = when only one value is placed inside it.

There is no reason for the optimizer to behave any differently on a query like this.