-->

New cardinality estimator (SQL Server 2014) is way

2020-05-25 06:47发布

问题:

I have a data warehouse database and I'm facing problems with the new cardinality estimator of SQL Server 2014.

After upgrading the database server to SQL Server 2014 I have observed a big difference in query performance. Some queries are executing much slower (30 sec in SQL 2012 vs. 5 minutes in SQL 2014). After researching execution plans I've seen that the cardinality estimates on the SQL Server 2014 are way off and I can't find a reason for it.

Here's an example of a query execution plan (top-left operator) in SQL 2012 vs. SQL 2014:

Some details:

  • My queries are typical data warehouse fact table load queries. I query a transactional table and join a lot (15-20) dimension tables (there's always either 0 or 1 record that is joined from the dimensional table).

  • I have updated statistics of all tables (with FULLSCAN) to be sure that the statistics is up-to-date.

  • The business keys of the dimension tables are indexed (unique non-clusted index). It seems to me that because of the uniqueness of this index the old cardinality estimator (SQL 2012) correctly assumes that there's max. 1 record that joins (the estimated number of records does not change in the execution plan).

I tried to narrow down the issue to the simplest example – SELECT with 2 joins:

Here's the cardinality estimation on operators 1 and 2 in SQL 2012 vs. SQL 2014:

           | Est.rows - SQL2012 | Est.rows - SQL2014
Operator 1 |               7653 |               7653
Operator 2 |               7653 |              10000

As you can see, SQL Server 2014 misses the estimation by more than 30% (10000 vs. 7653). Because I have cca. 15-20 joins in a typical query, the final estimate goes way off.

I can put the database in the lower compatibility mode (110) and it works fine then (same like on SQL Server 2012), but I would really like to know what is the reason for this behaviour. Why is the result of cardinality estimator of SQL Server 2014 wrong?

回答1:

I think there is no simple answer today to this interesting question. The best answer I know is the following video: http://channel9.msdn.com/events/TechEd/NorthAmerica/2014/DBI-B331#fbid=. It has numerous examples of new and old estimators. The video is about 50+ mins long but it is worth the time.

A summary of the video that relates to this question:

Old assumptions of cardinality estimates:

  1. Uniformity – data is uniformly distributed.
  2. Independence – column 1 has no relation to column 2.
  3. Containment – when two attributes might be the same, they are assumed to be the same.
  4. Inclusion – there should be a match.

To Use SQL SERVER 2012 cardinality estimator in SQL SERVER 2014 use the following option:

  • Option (querytraceon 9481) --revert to 2012

What is new estimator doing (based on video):

  • SQL Server uses average selectivity in the index and estimates number of rows by multiplying density of the key to total numbers of rows in the index.
  • New estimator does not work very well with jagged distributions.
  • Most differences between the estimators are based on WHERE clause.
  • New cardinality estimator believes there is a correlation between the tables.
  • You can create filtered statistics to improve queries. (http://msdn.microsoft.com/en-us/library/ms188038.aspx )

To do / checklist:

1. Auto Create / Update Stats
2.  Check database compatibility mode (120/110)
3.  Test using query trace flags
4.  XML showplan

Update What's new in cardinality estimator (SQL Server 2016)

  1. The more accurate.
  2. The CE predicts how many rows your query will likely return
  3. SQL Server 2016 the query store
  4. Another option for tracking the cardinality predictions of the CE is to use the extended event named query_optimizer_estimate_cardinality
  5. CE understands maximum value might be higher than when statistics were last gathered
  6. CE understands that filtered predicates on the same table are often correlated
  7. CE no longer assumes any correlation between filtered predicates from different tables

More details:

https://docs.microsoft.com/en-us/sql/relational-databases/performance/cardinality-estimation-sql-server

https://www.sqlshack.com/query-optimizer-changes-in-sql-server-2016-explained/



回答2:

I wonder if you are running into this issue around multicolumn selectivity estimates:

http://www.sqlskills.com/blogs/kimberly/multi-column-statistics-exponential-backoff/

it seems that there are still some quirks with the new CE try also using TF 4137 as outlined and see if that helps.

finally make sure you are on the latest CU and are running with TF 4199 to blanket enable all query optimizer fixes as always test this in a non-production environment if possible first and be mindful of regressions in other queries when enable settings globally



回答3:

This is not a direct answer to this question but it might help those who are facing similar performance issue related to that SCCM (aka ConfigMgr) database pertaining to Cardinality Estimator (CE) changes. SQL queries can timeout or your ConfigMgr console can run slow because of the new Cardinality Estimator (CE) changes in SQL Server 2014 and SQL Server 2016. Microsoft has given a solution to this problem here which suggests to apply an appropriate SQL Cardinality Estimator (CE) compatibility level as shown in the table below:

SQL Server version    Supported compatibility       Recommended compatibility   
                      level values                  level for ConfigMgr

SQL Server 2016       130, 120, 110, 100            130 

SQL Server 2014       120, 110, 100                 110

Hope this helps!