Forcing SQL Server to pre-cache entire database in

2019-01-30 19:37发布

We have a client site with a 50Gb SQL 2012 database on a server with 100+ Gb of RAM.

As the application is used, SQL server does a great job of caching the db into memory but the performance increase from the caching occurs the SECOND time a query is run, not the first.

To try to maximize cache hits the first time queries are run, we wrote a proc that iterates through every index of every table within the entire DB, running this:

SELECT * INTO #Cache 
FROM ' + @tablename + ' WITH (INDEX (' + @indexname + '))'

In an attempt to force a big, ugly, contrived read for as much data as possible. We have it scheduled to run every 15 minutes, and it does a great job in general.

Without debating other bottlenecks, hardware specs, query plans, or query optimization, does anybody have any better ideas about how to accomplish this same task?

UPDATE
Thanks for the suggestions. Removed the "INTO #Cache". Tested & it didn't make a difference on filling the buffer.
Added: Instead of Select *, I'm selecting ONLY the keys from the Index. This (obviously) is more to-the-point and is much faster.
Added: Read & Cache Constraint Indexes also.

Here's the current code: (hope it's useful for somebody else)

CREATE VIEW _IndexView
as
-- Easy way to access sysobject and sysindex data
SELECT 
so.name as tablename,
si.name as indexname,
CASE si.indid WHEN 1 THEN 1 ELSE 0 END as isClustered,
CASE WHEN (si.status & 2)<>0 then 1 else 0 end as isUnique,
dbo._GetIndexKeys(so.name, si.indid) as Keys,
    CONVERT(bit,CASE WHEN EXISTS (SELECT * FROM sysconstraints sc WHERE object_name(sc.constid) = si.name) THEN 1 ELSE 0 END) as IsConstraintIndex
FROM    sysobjects so
INNER JOIN sysindexes si ON so.id = si.id
WHERE   (so.xtype = 'U')--User Table
AND     ((si.status & 64) = 0) --Not statistics index
AND (   (si.indid = 0) AND (so.name <> si.name) --not a default clustered index
        OR
        (si.indid > 0)
    )
AND si.indid <> 255 --is not a system index placeholder

UNION
SELECT 
so.name as tablename,
si.name as indexname,
CASE si.indid WHEN 1 THEN 1 ELSE 0 END as isClustered,
CASE WHEN (si.status & 2)<>0 then 1 else 0 end as isUnique,
dbo._GetIndexKeys(so.name, si.indid) as Keys,
CONVERT(bit,0) as IsConstraintIndex
FROM    sysobjects so
INNER JOIN sysindexes si ON so.id = si.id
WHERE   (so.xtype = 'V')--View
AND     ((si.status & 64) = 0) --Not statistics index
GO


CREATE PROCEDURE _CacheTableToSQLMemory
@tablename varchar(100)
AS
BEGIN
DECLARE @indexname varchar(100)
DECLARE @xtype varchar(10)
DECLARE @SQL varchar(MAX)
DECLARE @keys varchar(1000)

DECLARE @cur CURSOR
SET @cur = CURSOR FOR
SELECT  v.IndexName, so.xtype, v.keys
FROM    _IndexView v
INNER JOIN sysobjects so ON so.name = v.tablename
WHERE   tablename = @tablename

PRINT 'Caching Table ' + @Tablename
OPEN @cur
FETCH NEXT FROM @cur INTO @indexname, @xtype, @keys
WHILE (@@FETCH_STATUS = 0)
BEGIN
        PRINT '    Index ' + @indexname
        --BEGIN TRAN
            IF @xtype = 'V'
                SET @SQL = 'SELECT ' + @keys + ' FROM ' + @tablename + ' WITH (noexpand, INDEX (' + @indexname + '))' --
            ELSE
                SET @SQL = 'SELECT ' + @keys + ' FROM ' + @tablename + ' WITH (INDEX (' + @indexname + '))' --

            EXEC(@SQL)
        --ROLLBACK TRAN
        FETCH NEXT FROM @cur INTO @indexname, @xtype, @keys
END
CLOSE @cur
DEALLOCATE @cur

END
GO

2条回答
三岁会撩人
2楼-- · 2019-01-30 20:18

This is not an answer, but to supplement Joel Coehoorn's answer, you can look at the table data in the cache using this statement. Use this to determine whether all the pages are staying in the cache as you'd expect:

USE DBMaint
GO
SELECT COUNT(1) AS cached_pages_count, SUM(s.used_page_count)/COUNT(1) AS total_page_count,
name AS BaseTableName, IndexName,
IndexTypeDesc
FROM sys.dm_os_buffer_descriptors AS bd
INNER JOIN
(
SELECT s_obj.name, s_obj.index_id,
s_obj.allocation_unit_id, s_obj.OBJECT_ID,
i.name IndexName, i.type_desc IndexTypeDesc
FROM
(
SELECT OBJECT_NAME(OBJECT_ID) AS name,
index_id ,allocation_unit_id, OBJECT_ID
FROM sys.allocation_units AS au
INNER JOIN sys.partitions AS p
ON au.container_id = p.hobt_id
AND (au.type = 1 OR au.type = 3)
UNION ALL
SELECT OBJECT_NAME(OBJECT_ID) AS name,
index_id, allocation_unit_id, OBJECT_ID
FROM sys.allocation_units AS au
INNER JOIN sys.partitions AS p
ON au.container_id = p.partition_id
AND au.type = 2
) AS s_obj
LEFT JOIN sys.indexes i ON i.index_id = s_obj.index_id
AND i.OBJECT_ID = s_obj.OBJECT_ID ) AS obj
ON bd.allocation_unit_id = obj.allocation_unit_id
INNER JOIN sys.dm_db_partition_stats s ON s.index_id = obj.index_id AND s.object_id = obj.object_ID
WHERE database_id = DB_ID()
GROUP BY name, obj.index_id, IndexName, IndexTypeDesc
ORDER BY obj.name;
GO
查看更多
唯我独甜
3楼-- · 2019-01-30 20:32

First of all, there is a setting called "Minumum Server Memory" that looks tempting. Ignore it. From MSDN:

The amount of memory acquired by the Database Engine is entirely dependent on the workload placed on the instance. A SQL Server instance that is not processing many requests may never reach min server memory.

This tells us that setting a larger minimum memory won't force or encourage any pre-caching. You may have other reasons to set this, but pre-filling the buffer pool isn't one of them.

So what can you do to pre-load data? It's easy. Just set up an agent job to do a select * from every table. You can schedule it to "Start automatically when Sql Agent Starts". In other words, what you're already doing is pretty close to the standard way to handle this.

However, I do need to suggest three changes:

  1. Don't try to use a temporary table. Just select from the table. You don't need to do anything with the results to get Sql Server to load your buffer pool: all you need to do is the select. A temporary table could force sql server to copy the data from the buffer pool after loading... you'd end up (briefly) storing things twice.
  2. Don't run this every 15 minutes. Just run it once at startup, and then leave it alone. Once allocated, it takes a lot to get Sql Server to release memory. It's just not needed to re-run this over and over.
  3. Don't try to hint an index. Hints are just that: hints. Sql Server is free to ignore those hints, and it will do so for queries that have no clear use for the index. The best way to make sure the index is pre-loaded is to construct a query that obviously uses that index. One specific suggestion here is to order the results in the same order as the index. This will often help Sql Server use that index, because then it can "walk the index" to produce the results.
查看更多
登录 后发表回答