Presentation: Set Me Up: How to Think in Sets

Abstract:

SQL Server is optimized to work in sets, but this is certainly a paradigm shift for someone accustomed to row-by-row operations.

We will look at T-SQL features that can break set-based processing — generally meaning poor performance — and gather empirical evidence to support that assertion. We’ll also look at some constructs to overcome these issues and make your T-SQL scripts run faster and use fewer resources.

  • Slide deck (PDF)
  • Scripts
  • CorpDB Database (19 MB) used by the scripts. This is a backup from SQL Server 2012. If you restore to 2014+ be sure to upgrade the compatibility level for best results. Last updated 4 January 2017.

To run the demos, restore the CorpDB database, set the compatibility level (if needed), then run the “001 – Create DB Objects.sql” script to create some needed database objects. If you are unable or unwilling to enable CLR on your system, skip everything in the script starting with the sp_configure statement.

Before running the SSIS demo, copy the “UpdatedPriceList.csv” file to c:\temp. To run the SSIS demos, execute both the “050 – Price List Update – Command.dtsx” and “051 – Price List Update – Staging.dtsx” packages, either through Visual Studio or through the Execute Package Utility application. Check the results by running the corresponding SQL scripts (“050…” and “051…”).

To run the .NET App demos, execute the “ThinkInSetsTest06x.exe” utility. Modify the “ThinkInSetsTest06x.exe.config” file first to point to your SQL instance and database and to indicate a path on the file system where file names will be collected. Run the app without parameters to execute both tests, or include a single parameter as indicated below to run an individual test. The results will be written to the console.

  • Test 060 (Singleton insert test): “ThinkInSetsTest06x singleton” or “ThinkInSetsTest06x 060”.
  • Test 061 (Bulk copy test): “ThinkInSetsTest06x bulkcopy” or “ThinkInSetsTest06x 061”.

Please be aware that the “Scalar UDF Data Access,” “Multi-Statement TVF” and “CLR” demos will take some time to run (about 15 minutes on my system), and that the “Triangle Join” demo will take a very long time (I never had the patience to let it finish, but my guess is that would take about a week to finish).

Here are the results of the various tests performed during the session:

Test #Test NameRun time, ms
11Cursor15,306
12Cursor ROFF10,421
13While Loop9,200
14Set-based131
20Subquery in SELECT3,479
21Rewrite as APPLY3,113
22Rewrite using #temp926
23Rewrite as RowNum467
24Subquery in WHERE3,151
25Rewrite as APPLY3.575
26Rewrite using #temp1,416
27Rewrite as RowNum230
30Scalar UDF Data Access183,561
31Scalar UDF Constant116
32Multi-statement TVF194,591
33CLR184,538
34APPLY1,901
35Inline TVF1,779
36RowNum249
40Running Total - Cursor5,158
41Running Total - Triangle Join100,000,000
42Running Total -
Windowing
555
50SSIS - Command Component2,279
51SSIS - Staging565
60C# Singleton Insert2,739
61C# SqlBulkCopy85

Presentation History: