SQL Server Latches and their indication of perform

Posted 2020-05-19 07:37

Question:

I am trying to understand a potential performance issue with our database (SQL Server 2008), and in particular one performance counter: SQLServer:Latches\Total Latch Wait Time (ms). We are seeing a slowdown in DB response times, and the only correlating spike I can match it with is a spike in Total Latch Wait Time and Latch Waits/sec. I am not seeing any particular bottleneck in disk I/O, CPU usage or memory.

The common explanation of a SQL Server latch is that it is a lightweight lock, but I am trying to get a more detailed understanding of what a latch is, how it differs from a lock, and what the high number of latch waits I am seeing may indicate.

Answer 1:

I recommend you look into sys.dm_os_latch_stats and see which latch classes show increased contention and wait times compared to a previous baseline.
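If no earlier baseline exists, one way to get a comparison is to snapshot the view, wait, and diff against the live counters. The following is only a sketch: the temp table name #latch_baseline and the one-minute sample interval are arbitrary choices of mine, not something from the original answer.

```sql
-- Snapshot the cumulative latch counters (hypothetical temp table name).
SELECT latch_class, waiting_requests_count, wait_time_ms
INTO #latch_baseline
FROM sys.dm_os_latch_stats;

-- Let the workload run for a sample interval (one minute is an arbitrary choice).
WAITFOR DELAY '00:01:00';

-- Show only the latch classes whose wait time grew during the interval.
SELECT cur.latch_class,
       cur.waiting_requests_count - base.waiting_requests_count AS waits_in_interval,
       cur.wait_time_ms - base.wait_time_ms AS wait_ms_in_interval
FROM sys.dm_os_latch_stats AS cur
JOIN #latch_baseline AS base
    ON base.latch_class = cur.latch_class
WHERE cur.wait_time_ms > base.wait_time_ms
ORDER BY wait_ms_in_interval DESC;

DROP TABLE #latch_baseline;
```

Running this once during a quiet period and once during a slowdown should make it easier to see which latch class accounts for the spike.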

If you see a spike in the BUFFER class latches, it is driven by updates contending to modify the same page. The other latch classes have short explanations on MSDN that can guide you toward the root cause. For those marked 'internal use only', you will have to open a support case with Microsoft, as a detailed explanation of what they mean is on the verge of NDA.

You should also look into sys.dm_os_wait_stats. If you see an increase in PAGELATCH_* waits, it is the same problem as the BUFFER class latch above: contention in trying to modify the same page, also known as an update hot spot. If you see an increase in PAGEIOLATCH_* waits, then your problem is the I/O subsystem: it takes too long to load pages into memory when they are needed.
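As a rough illustration of that check, a query along these lines lists both families of waits side by side. The avg_wait_ms column is my own addition for readability. These counters are cumulative since the instance last started (or since the statistics were last cleared), so they are best compared against an earlier reading rather than read in isolation.

```sql
-- PAGELATCH_*   : in-memory page contention (update hot spot)
-- PAGEIOLATCH_* : waiting for pages to be read from disk (I/O subsystem)
SELECT wait_type,
       waiting_tasks_count,
       wait_time_ms,
       max_wait_time_ms,
       wait_time_ms / NULLIF(waiting_tasks_count, 0) AS avg_wait_ms
FROM sys.dm_os_wait_stats
WHERE wait_type LIKE 'PAGELATCH[_]%'      -- [_] matches a literal underscore
   OR wait_type LIKE 'PAGEIOLATCH[_]%'
ORDER BY wait_time_ms DESC;
```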



Answer 2:

This may be a really basic error to a professional DBA, but this is what I found with our high latch problem. Since this thread ranks very high in search results, I thought I'd share our experience in case it helps someone else.

On newer dual or multi-processor servers using a NUMA memory architecture, the max degree of parallelism should be set to the actual number of cores per processor. In our case we had dual Xeons with 4 cores each, which with hyper-threading appear as 16 logical processors to SQL Server.

Changing this value from the default of 0 to 4 cut the high latch waits on some queries immediately.

Our latch waits ran from over 1,000 ms up to 30,000 ms on some occasions.
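Before changing the setting, it may help to check what the instance currently uses and how SQL Server sees the CPU layout. The queries below are a sketch using system views available in SQL Server 2008. Note that hyperthread_ratio counts the logical processors exposed by one physical package, so on a hyper-threaded box it does not by itself tell you the number of physical cores.

```sql
-- Current MAXDOP setting (0 lets SQL Server use all available processors).
SELECT name, value, value_in_use
FROM sys.configurations
WHERE name = 'max degree of parallelism';

-- Logical CPUs visible to SQL Server and logical CPUs per physical package.
SELECT cpu_count,
       hyperthread_ratio,
       cpu_count / hyperthread_ratio AS physical_sockets
FROM sys.dm_os_sys_info;

-- Online schedulers per NUMA node.
SELECT parent_node_id, COUNT(*) AS online_schedulers
FROM sys.dm_os_schedulers
WHERE status = 'VISIBLE ONLINE'
GROUP BY parent_node_id;
```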



Answer 3:

Reference taken from this blog:

Using sys.dm_db_index_operational_stats:

-- Latch waits per object and index in the current database
SELECT
    OBJECT_NAME(object_id) AS [object_name]
    ,page_latch_wait_count
    ,page_latch_wait_in_ms
    ,tree_page_latch_wait_count
    ,tree_page_latch_wait_in_ms
    ,page_io_latch_wait_count
    ,page_io_latch_wait_in_ms
FROM sys.dm_db_index_operational_stats (DB_ID(), NULL, NULL, NULL);

Using sys.dm_os_latch_stats:

-- The latch class is named BUFFER; upper case also works under a case-sensitive collation
SELECT * FROM sys.dm_os_latch_stats
WHERE latch_class = 'BUFFER';


Answer 4:

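-- 'max degree of parallelism' is an advanced option, so expose advanced options first
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
GO
-- Cap parallelism at 8 (choose a value that suits your cores per processor)
EXEC sp_configure 'max degree of parallelism', 8;
RECONFIGURE;
GO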