How does BULK INSERT work internally?

2020-02-26 08:50发布

问题:

Could someone please explain how does BULK INSERT internally work and why is it much faster than the normal INSERT operations ?

Regards, Shishir.

回答1:

BULK INSERT runs in-process with the database engine of SQL Server and thus avoids passing data through the network layer of the Client API - this makes it faster than BCP and DTS / SSIS.

Also, with BULK INSERT, you can specify the ORDER BY of the data, and if this is the same as the PK of the table, then the locking occurs at a PAGE level. Writes to the transaction logs happen at a page level rather than a row level as well.

In the case of regular INSERT, the locking and the Transaction log writes are at a row level. That makes BULK INSERT faster than an INSERT statement.



回答2:

At the very least it can avoid the overhead of creating and commiting a transaction for each record to be inserted. There are other savings to be had: for example shipping larger chunks of data across the network rather than one record at a time will help.

Then, perhaps more interestingly, the DB implementor can (I don't know if any specific vendor actually does this) start getting clever - they can look at the set of records to be inserted and rather than reorganising pages of data to accommodate single records can make bulk reorganisations of the pages to allow for new insertions, likewise we can imagine that shuffling indices could be done more efficiently if we know that we are inserting a set of records.



回答3:

Internally, it converts to an OLEDB recordset, then loads the table.

I've not given it much thought, but I'd assume this is quicker when run on the SQL Server box rather than the client. Plus you can manage constraints that you can't with normal INSERT

From BOL