by Ahmed
24. October 2010 13:10
Bulk copying is the preferred method for loading high volume data. The SqlBulkCopy API in ADO.NET provides the ability to bulk copy using interfaces that .NET developers are more accustomed to e.g. IDataReader and DataTable. It’s similar to the SQL BCP utility however it can provide much more control over the loads. Bulk copying performance over individual inserts is achieved mainly due to a few factors:
- Significant reduction in database round trips
- Minimal logging at the database level (under certain conditions)
- Reduced constraint checking by default
Contrary to popular belief, bulk copy operations can also participate in transactions. SqlBulkCopy has been working quite well for a high volume data loader I developed loading approximate 5 million rows daily on average. Here is an in depth research on SqlBulkCopy API behavior in different scenarios (thanks to the author for all the research): SqlBulkCopy Performance Analysis
Some notes from the article:
- SQL Server scales quite well for concurrent loads up to a certain threshold.
- Loading into a heap is must faster than an indexed table for obvious reasons. However if indexes are required then it maybe better (in some cases) to disable them during loading and enable them after.
- The batch size can impact performance. The bigger the batch size the better.