Question:
I'm trying to insert rows of in-memory data into a table on SQL Server Express 2005. It is running what seems to me very slowly - about 5 seconds per 1000 rows inserted. I am just using a basic "INSERT INTO" command. The slowness does not depend on the table data - it is still slow with a table with one int column and no index. It has nothing to do with my software - it is just as slow running SQL in a loop from Management Studio. There is nothing else accessing the database at the same time. On a 3 GHz Xeon (old, I know), this will take about 10 seconds to execute:
declare @i int
set @i = 0
set nocount on
while @i < 2000
begin
    insert into testdb(testcolumn)
    values (1)
    set @i = @i + 1
end
Is there a better way to insert bulk in-memory data than looping on INSERT? Or some configuration I should change in SQL Server?
Answer 1:
You perform each insert inside its own transaction. Beginning and committing a transaction is very expensive in SQL Server.
Enclose everything into a single transaction block:
declare @i int
set @i = 0
set nocount on
BEGIN TRANSACTION
while @i < 2000
begin
    insert into testdb(testcolumn)
    values (1)
    set @i = @i + 1
end
COMMIT
To generate sample data, you can use a recursive CTE, which will be faster:
WITH q (num) AS
(
    SELECT 1
    UNION ALL
    SELECT num + 1
    FROM q
    WHERE num < 2000
)
INSERT
INTO testdb(testcolumn)
SELECT 1
FROM q
OPTION (MAXRECURSION 0)
Answer 2:
1) Log flush on commit. Every transaction has to ensure the log is flushed to disk before the commit returns. Every INSERT statement is an implicit transaction. Batch the commits:
declare @i int
set @i = 0
set nocount on
begin transaction
while @i < 2000
begin
    insert into testdb(testcolumn)
    values (1)
    set @i = @i + 1
    if (@i % 1000 = 0)
    begin
        commit;
        begin transaction;
    end
end
commit
2) Slow disk. Check the Avg. Disk sec/Transfer performance counter for your data and your log disks.
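If you prefer checking from T-SQL rather than PerfMon, here is a rough sketch using the sys.dm_io_virtual_file_stats DMV (available in SQL Server 2005). It reports average I/O stall per read/write since the last restart rather than the exact PerfMon counter, but it points at the same problem:
-- Average I/O stall (ms) per read and per write for each database file since the
-- last SQL Server restart; high values on the log file suggest a slow log disk.
SELECT DB_NAME(vfs.database_id) AS database_name,
       mf.physical_name,
       vfs.io_stall_read_ms / NULLIF(vfs.num_of_reads, 0) AS avg_read_stall_ms,
       vfs.io_stall_write_ms / NULLIF(vfs.num_of_writes, 0) AS avg_write_stall_ms
FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
JOIN sys.master_files AS mf
    ON mf.database_id = vfs.database_id
   AND mf.file_id = vfs.file_id
ORDER BY avg_write_stall_ms DESC;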
3) Too many indices (unlikely on a test table). Each index is nearly as expensive as a 'table' for inserts.
4) Triggers (again, unlikely)
Ultimately, measure. Follow the guidelines of a whitepaper like Troubleshooting Performance Problems in SQL Server 2005 if you don't know where to start.
Answer 3:
You have plenty of tools/techniques to get more performance out of this type of workload.
- If appropriate, bulk load anything you can. Some things you can't - for example, when validation needs to run against each record, or the destination table has nullable columns...
- Consider moving complex Data Warehousing/ETL operations to a staging database with minimal transaction logging (simple recovery model). This will improve performance greatly. Then batch/bulk the data into the destination system (see the sketch after this list).
- Batch non-bulk-load insert operations. Commit every n records; start with 1,000 and performance tune from there.
- Improve the speed of your disk storage. Smaller, faster disks are much better than bigger, slower ones. On the last db performance tuning project I worked on, we moved from local 10,000 RPM disks to a SAN, then back to solid state disks on the server for some operations. Solid state most definitely rocks! But it is expensive.
- Use the force, um, the performance tuning tools for SQL Server to find less obvious bottlenecks. Sometimes the best course of action might be to drop and rebuild indexes based on what % of records are being inserted/deleted compared to the table size; disable triggers during certain operations; and modify the sparseness of records in data blocks.
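As a rough sketch of the staging/bulk-load idea above - the StagingDb database, dbo.StagingRows table, and file path are placeholders I made up, not names from the question:
-- Hypothetical staging database kept in the simple recovery model to cut log overhead.
ALTER DATABASE StagingDb SET RECOVERY SIMPLE;
-- Hypothetical flat file and staging table; adjust the terminators to match your file.
BULK INSERT StagingDb.dbo.StagingRows
FROM 'C:\load\rows.csv'
WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n', TABLOCK, BATCHSIZE = 10000);
-- Then move the rows to the destination table in one set-based statement.
INSERT INTO testdb (testcolumn)
SELECT testcolumn
FROM StagingDb.dbo.StagingRows;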
Answer 4:
In addition to indices, if your actual scenario is as per your example, you could do a set-based approach to insert 2000 records like this:
INSERT testdb(testcolumn)
SELECT 1
FROM master..spt_values
WHERE type = 'P' AND number BETWEEN 1 AND 2000
Answer 5:
Insert speed is driven by the following things:
- The speed of your log disk. In particular, it's important that the log be on a volume by itself, so that disk seeks don't slow things down (can be a 40x effect)
- The structure of your table and associated indexes / keys / triggers, etc.
- The size of your transactions. Larger transactions require fewer round-trips to the log disk, and less associated overhead.
- The size of your command batches. Larger batches are more efficient than many individual ones.
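As a minimal illustration of the transaction-size and batch-size points (the literal values are placeholders; real data would come from your application):
-- One transaction, one statement: far fewer log flushes and round-trips than
-- issuing 2000 separate INSERT statements.
BEGIN TRANSACTION;
INSERT INTO testdb (testcolumn)
SELECT 1
UNION ALL SELECT 1
UNION ALL SELECT 1; -- ...and so on for the rest of the batch
COMMIT;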
In case it's of any interest, I go through this in detail in my book (Ultra-Fast ASP.NET), including benchmarks and example code.
Answer 6:
Having a clustered index (usually the primary key) actually increases insert speed, so verify you have one of those. And running 1000 transactions against a table isn't the fastest way if you can have all of the data at once and insert it into the table in one go (this can be accomplished by using table-valued parameters in SQL Server 2008 or XML parameters in 2005).
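A minimal sketch of the XML-parameter route on 2005, reusing the question's testdb/testcolumn names; @rows stands in for a parameter your application would pass:
-- @rows stands in for an xml parameter passed from the client application.
DECLARE @rows xml;
SET @rows = N'<rows><r v="1"/><r v="1"/><r v="1"/></rows>';
-- Shred the XML once and insert every row in a single set-based statement.
INSERT INTO testdb (testcolumn)
SELECT r.c.value('@v', 'int')
FROM @rows.nodes('/rows/r') AS r(c);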
Answer 7:
I would google "SQL Server tuning"... There are many books written on the subject. It is a very hard thing to solve as there are MANY things that affect speed, from query syntax, to RAM allocated to the server, to proportions of allocated RAM (which part of SQL Server you allocate RAM to), to RAID array configuration, and MANY other factors. You can have a database server optimized for inserts/updates (OLTP) or for querying (data warehouse type of stuff). In other words, don't expect a single, simple answer to this, even though your problem seems straightforward.
This is why you have database server administrators.
Or you could just not sweat the server-side issues and optimize your client code as much as possible, if timing is not very important to you.
I would look into prepared statements and transactions as a way to begin to optimize. Then look at indexing (if this is a set of inserts that do not happen very often, I would consider dropping the indices, doing the import, and recreating the indices afterwards).
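A minimal sketch of that drop-import-recreate pattern; IX_testdb_testcolumn is a hypothetical index name, not one from the question:
-- Hypothetical nonclustered index; substitute the real index definition from your table.
DROP INDEX IX_testdb_testcolumn ON testdb;
-- ...perform the bulk import here...
CREATE NONCLUSTERED INDEX IX_testdb_testcolumn ON testdb (testcolumn);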