SQL Server is optimized to work in sets, but this is certainly a paradigm shift for someone accustomed to row-by-row operations.
We will look at T-SQL features that can break set-based processing — generally meaning poor performance — and gather empirical evidence to support that assertion. We’ll also look at some constructs to overcome these issues and make your T-SQL scripts run faster and use fewer resources.
- Slide deck (PDF)
- CorpDB Database (19 MB) used by the scripts. This is a backup from SQL Server 2012. If you restore to 2014+ be sure to upgrade the compatibility level for best results. Last updated 4 January 2017.
To run the demos, restore the CorpDB database, set the compatibility level (if needed), then run the “001 – Create DB Objects.sql” script to create some needed database objects. If you are unable or unwilling to enable CLR on your system, skip everything in the script starting with the sp_configure statement.
Before running the SSIS demo, copy the “UpdatedPriceList.csv” file to c:\temp. To run the SSIS demos, execute both the “050 – Price List Update – Command.dtsx” and “051 – Price List Update – Staging.dtsx” packages, either through Visual Studio or through the Execute Package Utility application. Check the results by running the corresponding SQL scripts (“050…” and “051…”).
To run the .NET App demos, execute the “ThinkInSetsTest06x.exe” utility. Modify the “ThinkInSetsTest06x.exe.config” file first to point to your SQL instance and database and to indicate a path on the file system where file names will be collected. Run the app without parameters to execute both tests, or include a single parameter as indicated below to run an individual test. The results will be written to the console.
- Test 060 (Singleton insert test): “ThinkInSetsTest06x singleton” or “ThinkInSetsTest06x 060”.
- Test 061 (Bulk copy test): “ThinkInSetsTest06x bulkcopy” or “ThinkInSetsTest06x 061”.
Please be aware that the “Scalar UDF Data Access,” “Multi-Statement TVF” and “CLR” demos will take some time to run (about 15 minutes on my system), and that the “Triangle Join” demo will take a very long time (I never had the patience to let it finish, but my guess is that would take about a week to finish).
Here are the results of the various tests performed during the session:
|Test #||Test Name||Run time, ms|
|20||Subquery in SELECT||3,479|
|21||Rewrite as APPLY||3,113|
|22||Rewrite using #temp||926|
|23||Rewrite as RowNum||467|
|24||Subquery in WHERE||3,151|
|25||Rewrite as APPLY||3.575|
|26||Rewrite using #temp||1,416|
|27||Rewrite as RowNum||230|
|30||Scalar UDF Data Access||183,561|
|31||Scalar UDF Constant||116|
|40||Running Total - Cursor||5,158|
|41||Running Total - Triangle Join||100,000,000|
|42||Running Total -|
|50||SSIS - Command Component||2,279|
|51||SSIS - Staging||565|
|60||C# Singleton Insert||2,739|