T-SQL vs CLR Integration Performance

by Admin 30. July 2007 19:40

I had an interesting discussion with a colleague today on T-SQL performance compared with CLR integration. CLR integration is a huge feature of SQL 2005 and vastly expands the number of options available to developers when it comes to SQL Server programming. But do all these options come at a cost was the big question? My assumption was that T-SQL would generally outperform CLR in certain scenarios mainly due to 2 reasons:

  1. T-SQL is optimized for certain types of operations such as set-based operations
  2. Some kind of penalty is incurred for managed method invocation

The MSDN product documentation has a section that sheds more light on this issue. From the documentation its apparent that MS has optimized the architecture of SQL CLR integration as much as possible. Here's an excerpt from the actual documentation:

"During compilation of SQL expressions, when a reference to a managed routine is encountered, a Microsoft intermediate language (MSIL) stub is generated. This stub includes code to marshal the routine parameters from SQL Server to the CLR, invoke the function, and return the result. This "glue" code is based on the type of parameter and on parameter direction (in, out, or reference).

The glue code enables type-specific optimizations and ensures efficient enforcement of SQL Server semantics, such as nullability, constraining facets, by-value, and standard exception handling. By generating code for the exact types of the arguments, you avoid type coercion or wrapper object creation costs (called "boxing") across the invocation boundary.

The generated stub is then compiled to native code and optimized for the particular hardware architecture on which SQL Server executes, using the JIT (just-in-time) compilation services of the CLR. The JIT services are invoked at the method level and allow the SQL Server hosting environment to create a single compilation unit that spans both SQL Server and CLR execution. Once the stub is compiled, the resulting function pointer becomes the run-time implementation of the function. This code generation approach ensures that there are no additional invocation costs related to reflection or metadata access at run time."

The documentation goes on to provide additional details but the general idea one gets is that there are no huge drawbacks in most cases. In fact, CLR can outperform T-SQL in certain scenarios.

Then I came across these performance analysis results and as per the analysis the T-SQL performance stood out. Of course, one has to understand the type of tests performed since it does not apply to every requirement. And finally there is the general performance rule when deciding between native & managed procedure:

  1. Use T-SQL for CRUD operations and CLR for algorithms involving complex calculations or string manipulation
  2. Test it on your data & servers to be sure

So to summarize my findings use the right tool for the job. CLR integration is optimized and makes available a host of functionality that wasn't possible or simple to achieve with T-SQL before. As long as you are not doing something like this you should be fine ;)

Comments are closed