Category Archives: SQL Server 2016

Always Encrypted Storage Overhead

The new Always Encrypted feature in SQL Server 2016 adds some overhead to the storage requirements of a column. I wanted to do some analysis to determine just how much overhead is incurred for various data types. This test was run against CTP 3.2.

Methodology

After creating a database along with a column master key and the column encryption key, I created four tables to store the encrypted data.

if exists (select * from sys.tables where name = 'VarcharAnalysis')
	drop table VarcharAnalysis;
create table VarcharAnalysis
(
	CharCount int null,
	CharData varchar(8000) collate Latin1_General_BIN2
		encrypted with (column_encryption_key = [CEK_Auto1], 
		encryption_type = Deterministic, 
		algorithm = 'AEAD_AES_256_CBC_HMAC_SHA_256') null
);
 
create clustered index cl_VarcharAnalysis on VarcharAnalysis (CharCount);
 
if exists (select * from sys.tables where name = 'NVarcharAnalysis')
	drop table NVarcharAnalysis;
create table NVarcharAnalysis
(
	CharCount int null,
	CharData nvarchar(4000) collate Latin1_General_BIN2
		encrypted with (column_encryption_key = [CEK_Auto1], 
		encryption_type = Deterministic, 
		algorithm = 'AEAD_AES_256_CBC_HMAC_SHA_256') null
);
 
create clustered index cl_NVarcharAnalysis on NVarcharAnalysis (CharCount);
 
if exists (select * from sys.tables where name = 'VarbinaryAnalysis')
	drop table VarbinaryAnalysis;
create table VarbinaryAnalysis
(
	ByteCount int null,
	BinaryData varbinary(8000)
		encrypted with (column_encryption_key = [CEK_Auto1], 
		encryption_type = Deterministic, 
		algorithm = 'AEAD_AES_256_CBC_HMAC_SHA_256') null
);
 
create clustered index cl_VarbinaryAnalysis on VarbinaryAnalysis (ByteCount);
 
if exists (select * from sys.tables where name = 'FixedSizeAnalysis')
	drop table FixedSizeAnalysis;
create table FixedSizeAnalysis
(
	TinyIntData tinyint
		encrypted with (column_encryption_key = [CEK_Auto1], 
		encryption_type = Deterministic, 
		algorithm = 'AEAD_AES_256_CBC_HMAC_SHA_256') null,
	SmallIntData smallint
		encrypted with (column_encryption_key = [CEK_Auto1], 
		encryption_type = Deterministic, 
		algorithm = 'AEAD_AES_256_CBC_HMAC_SHA_256') null,
	IntData int
		encrypted with (column_encryption_key = [CEK_Auto1], 
		encryption_type = Deterministic, 
		algorithm = 'AEAD_AES_256_CBC_HMAC_SHA_256') null,
	BigIntData bigint
		encrypted with (column_encryption_key = [CEK_Auto1], 
		encryption_type = Deterministic, 
		algorithm = 'AEAD_AES_256_CBC_HMAC_SHA_256') null,
	DecimalData decimal(38,0)
		encrypted with (column_encryption_key = [CEK_Auto1], 
		encryption_type = Deterministic, 
		algorithm = 'AEAD_AES_256_CBC_HMAC_SHA_256') null,
	BitData bit
		encrypted with (column_encryption_key = [CEK_Auto1], 
		encryption_type = Deterministic, 
		algorithm = 'AEAD_AES_256_CBC_HMAC_SHA_256') null,
	DateTimeData datetime
		encrypted with (column_encryption_key = [CEK_Auto1], 
		encryption_type = Deterministic, 
		algorithm = 'AEAD_AES_256_CBC_HMAC_SHA_256') null,
	DateTime2Data datetime2
		encrypted with (column_encryption_key = [CEK_Auto1], 
		encryption_type = Deterministic, 
		algorithm = 'AEAD_AES_256_CBC_HMAC_SHA_256') null,
	GuidData uniqueidentifier
		encrypted with (column_encryption_key = [CEK_Auto1], 
		encryption_type = Deterministic, 
		algorithm = 'AEAD_AES_256_CBC_HMAC_SHA_256') null
);

I then created a small C# application to first insert data into the tables, and then to extract the encrypted values out of the tables.

using (Dal dal = new Dal(instanceName, databaseName, userName, password, true))
{
	dal.InsertVarcharAnalysis(null);
	for (int i = 0; i <= 8000; i++)
	{
		string data = new string('*', i);
		dal.InsertVarcharAnalysis(data);
	}
}
 
using (Dal dal = new Dal(instanceName, databaseName, userName, password, true))
{
	dal.InsertNVarcharAnalysis(null);
	for (int i = 0; i <= 4000; i++)
	{
		string data = new string('*', i);
		dal.InsertNVarcharAnalysis(data);
	}
}
 
using (Dal dal = new Dal(instanceName, databaseName, userName, password, true))
{
	dal.InsertVarbinaryAnalysis(null);
	byte[] template = new byte[8000];
	Enumerable.Range(0, 8000).ToList().ForEach(i => template[i] = (byte)(i % 256));
	for (int i = 0; i <= 8000; i++)
	{
		byte[] data = new byte[i];
		Array.Copy(template, data, i);
		dal.InsertVarbinaryAnalysis(data);
	}
}
 
 
using (Dal dal = new Dal(instanceName, databaseName, userName, password, true))
{
	dal.InsertFixedSizeAnalysis(null, null, null, null, null, null, null, null, null);
	dal.InsertFixedSizeAnalysis(byte.MaxValue, short.MaxValue, int.MaxValue, long.MaxValue, decimal.MaxValue, true,
			DateTime.Now, DateTime.Now, Guid.NewGuid());
}

In my Dal object’s constructor, the final parameter is a boolean value indicating whether to set the Column Encryption Setting to “Enabled”. This is necessary because the inserts must be done when encryption is enabled, but we need to read back the tables with encryption disabled so that we get back the raw, encrypted bytes instead of the decrypted data.

The analysis code looks like this:

using (Dal dal = new Dal(instanceName, databaseName, userName, password, false))
{
	List<CharDataRecord> records = dal.ReadVarcharAnalysis();
	int? previousRecordSize = int.MinValue;
	foreach (CharDataRecord record in records)
	{
		if (record.Data?.Length != previousRecordSize)
		{
			Console.WriteLine("{0} - {1}", record.Length, record.Data?.Length);
			previousRecordSize = record.Data?.Length;
		}
	}
}
 
using (Dal dal = new Dal(instanceName, databaseName, userName, password, false))
{
	List<CharDataRecord> records = dal.ReadNVarcharAnalysis();
	int? previousRecordSize = int.MinValue;
	foreach (CharDataRecord record in records)
	{
		if (record.Data?.Length != previousRecordSize)
		{
			Console.WriteLine("{0} - {1}", record.Length, record.Data?.Length);
			previousRecordSize = record.Data?.Length;
		}
	}
}
 
using (Dal dal = new Dal(instanceName, databaseName, userName, password, false))
{
	List<CharDataRecord> records = dal.ReadVarbinaryAnalysis();
	int? previousRecordSize = int.MinValue;
	foreach (CharDataRecord record in records)
	{
		if (record.Data?.Length != previousRecordSize)
		{
			Console.WriteLine("{0} - {1}", record.Length, record.Data?.Length);
			previousRecordSize = record.Data?.Length;
		}
	}
}
 
using (Dal dal = new Dal(instanceName, databaseName, userName, password, false))
{
	List<FixedSizeDataRecord> records = dal.ReadFixedSizeAnalysis();
	foreach (FixedSizeDataRecord record in records)
	{
		Console.WriteLine("TinyInt: {0}", record.TinyIntData?.Length);
		Console.WriteLine("SmallInt: {0}", record.SmallIntData?.Length);
		Console.WriteLine("Int: {0}", record.IntData?.Length);
		Console.WriteLine("BigInt: {0}", record.BigIntData?.Length);
		Console.WriteLine("Decimal: {0}", record.DecimalData?.Length);
		Console.WriteLine("Bit: {0}", record.BitData?.Length);
		Console.WriteLine("DateTime: {0}", record.DateTimeData?.Length);
		Console.WriteLine("DateTime2: {0}", record.DateTime2Data?.Length);
		Console.WriteLine("Guid: {0}", record.GuidData?.Length);
	}
}

Results

Let’s start by looking at the varchar data. Here are the first few rows in the table:

[Image: VarcharAnalysisTopRows]

One thing to note is that, not surprisingly, a NULL input to the encryption function yields a NULL output; this was true throughout the test. However, a blank (zero-length) string generates 65 bytes of output, as does an input containing 1 character, 2 characters, and so on. At 16 characters of input, the output grows to 81 bytes, and at 32 characters it reaches 97 bytes.

Input Size (characters)    Output Size (bytes)
0                          65
16                         81
32                         97
48                         113
64                         129
80                         145
96                         161
112                        177
128                        193

The pattern is quickly apparent. We can express this relationship very simply as:

C = 16 * floor(P / 16) + 65

where P is the input plaintext size in characters and C is the output ciphertext size in bytes.
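
As a quick sanity check, the formula is easy to evaluate directly. Here is a minimal T-SQL sketch (the 50-character example and variable name are mine; integer division supplies the floor for non-negative values):

-- Expected ciphertext size for a fully populated varchar(50) value.
declare @P int = 50;   -- plaintext size in characters (one byte each for varchar)
select 16 * (@P / 16) + 65 as ExpectedCiphertextBytes;   -- 16 * 3 + 65 = 113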

Next, the nvarchar data. This has similar growth patterns but, not surprisingly, the encryption output grows at 8-character boundaries.

Input Size (characters)    Output Size (bytes)
0                          65
8                          81
16                         97
24                         113
32                         129
40                         145
48                         161
56                         177
64                         193
72                         209

C = 16 * floor(P / 8) + 65

This is the same result as before, once you take into account that each input character is now two bytes instead of one.

Varbinary data, as you might expect, follows the same pattern as varchar data:

Input Size (bytes)    Output Size (bytes)
0                     65
16                    81
32                    97
48                    113
64                    129
80                    145
96                    161
112                   177
128                   193

Finally, let’s look at fixed-width data. It turns out that the tinyint, smallint, int and bigint data types each occupy 65 bytes once encrypted, as do the bit, datetime and datetime2 columns. The decimal(38,0) and uniqueidentifier data types require 81 bytes when encrypted. Again, this fits the pattern we have observed so far.

Data Type           Unencrypted (bytes)   Encrypted (bytes)
tinyint             1                     65
smallint            2                     65
int                 4                     65
bigint              8                     65
decimal(38,0)       17                    81
bit                 0.125                 65
datetime            8                     65
datetime2           8                     65
uniqueidentifier    16                    81

Conclusions

The size of the encrypted data throughout this analysis is very consistently

C = 16 * floor(P / 16) + 65

where P is the size, in bytes, of the unencrypted data, and C is the resulting size, again in bytes, of the encrypted data. This means that small inputs incur a significant amount of overhead, but as the input grows the overhead becomes only a tiny fraction. The greatest overhead is observed with the bit data type, where the encrypted data size is 520 times the unencrypted data size.

Coda

You may have noticed that I created all of the tables with the encryption type set to Deterministic. I re-ran this test using Randomized encryption throughout, and the results are identical.

Observe that these results are based on the encrypted data size after the initial insert into the table. I’m not sure if there is anything later in the process, such as key rotation, that may cause the data size to grow, so keep that limitation in mind.

Always Encrypted and sys.dm_exec_describe_first_result_set

Suppose we have the following table created in SQL Server 2016 CTP 3.2 (presuming that the encryption keys have already been created):

create table dbo.Customer
(
	CustomerID int NULL,
	FirstName varchar(50) collate Latin1_General_BIN2
		ENCRYPTED WITH (COLUMN_ENCRYPTION_KEY = CEK_Auto1,
		ENCRYPTION_TYPE = Deterministic,
		ALGORITHM = 'AEAD_AES_256_CBC_HMAC_SHA_256') NULL,
	LastName varchar(50) collate Latin1_General_BIN2
		ENCRYPTED WITH (COLUMN_ENCRYPTION_KEY = CEK_Auto1,
		ENCRYPTION_TYPE = Deterministic,
		ALGORITHM = 'AEAD_AES_256_CBC_HMAC_SHA_256') NULL,
	Address varchar(50) NULL,
	City varchar(50) NULL,
	State varchar(50) NULL
);

Now suppose we connect to the database using a connection where Column Encryption Setting=Disabled and execute:

select column_ordinal, name, system_type_name
from sys.dm_exec_describe_first_result_set(N'select * from dbo.Customer', null, null);

Then we get the following results:

[Image: DescribeFirstResultSet_EncryptionDisabled]

The first and last name columns, rather than coming back as varchar(50), are instead shown as varbinary(113). This isn’t too surprising given that we are on an unencrypted connection; if we actually execute the select against the Customer table we will, in fact, get back binary data. (The 113 also matches the storage overhead formula above: 16 * floor(50 / 16) + 65 = 113 bytes for a fully populated varchar(50).)

So surely if we use a connection where Column Encryption Setting=Enabled, we will get the correct data type back, right?

Nope.

[Image: DescribeFirstResultSet_EncryptionEnabled]

We still get back varbinary(113).

I’m not sure how I feel about these results. On the one hand, the results are consistent regardless of the column encryption setting, and they do reflect the true data type at the storage engine level. However, the real usefulness of sys.dm_exec_describe_first_result_set (and its companion stored procedure sys.sp_describe_first_result_set, which gives identical results) is the ability to discover the data types coming back from a particular T-SQL statement. If used as intended (that is, on a connection with the column encryption setting enabled), the metadata coming back doesn’t match what will be returned by the actual query, significantly reducing the value of this DMF.
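
For reference, the procedure form of the same check is a one-liner, and, as noted, it returns the same varbinary(113) metadata:

exec sys.sp_describe_first_result_set N'select * from dbo.Customer', null, 0;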

The official documentation for the DMF doesn’t address the impact of Always Encrypted, and the wording is sufficiently vague (“describes the metadata”) that it’s hard to say what the “correct” results should be in this case.

Of course, it could very well be something that Microsoft is already planning to address before RTM.

Defeating Dynamic Data Masking

I can’t recall exactly where I heard it or who said it, but at some point over this past summer someone mentioned that the new dynamic data masking feature in SQL Server 2016 could be easily worked around by selecting the data into a temporary table. I shelved that thought for several months, and when I finally started playing with it this past weekend I found that the issue had been fixed in CTP 2.4. However, I wanted to document some of the things I tried, just to show that it is no longer so easy to defeat the benefits of this new feature. These tests were run in CTP 3.2.

Let’s start with the sample table from Books Online:

CREATE TABLE Membership
  (MemberID int IDENTITY PRIMARY KEY,
   FirstName varchar(100) NULL,
   LastName varchar(100) NOT NULL,
   Phone# varchar(12) NULL,
   EMail varchar(100) NULL);
 
ALTER TABLE Membership ALTER COLUMN FirstName ADD MASKED WITH (FUNCTION = 'partial(1,"XXXXXXX",0)');
ALTER TABLE Membership ALTER COLUMN Phone# ADD MASKED WITH (FUNCTION = 'default()');
ALTER TABLE Membership ALTER COLUMN Email ADD MASKED WITH (FUNCTION = 'email()');
 
INSERT Membership (FirstName, LastName, Phone#, EMail) VALUES 
('Roberto', 'Tamburello', '555.123.4567', 'RTamburello@contoso.com'),
('Janice', 'Galvin', '555.123.4568', 'JGalvin@contoso.com.co'),
('Zheng', 'Mu', '555.123.4569', 'ZMu@contoso.net');
 
CREATE USER TestUser WITHOUT LOGIN;
GRANT SELECT ON Membership TO TestUser;

We can test that DDM is working by running:

EXECUTE AS USER = 'TestUser';
SELECT * FROM Membership;
REVERT;

The results are:

[Image: ddmResults1]

Next, we will try to select the results into a temporary table and then select from the temp table.

EXECUTE AS USER = 'TestUser';
if object_id('tempdb..#Membership') is not null
	drop table #Membership;
SELECT * into #Membership FROM Membership;
select * from #Membership;
REVERT;

The results are identical.

[Image: ddmResults2]

In some cases attributes of the table get copied as part of a SELECT … INTO statement, so the next test is to explicitly create the temp table:

EXECUTE AS USER = 'TestUser';
if object_id('tempdb..#Membership') is not null
	drop table #Membership;
CREATE TABLE #Membership
  (MemberID int IDENTITY PRIMARY KEY,
   FirstName varchar(100) NULL,
   LastName varchar(100) NOT NULL,
   Phone# varchar(12) NULL,
   EMail varchar(100) NULL);
 
set identity_insert #Membership on;
 
insert #Membership (MemberID, FirstName, LastName, [Phone#], Email)
SELECT MemberID, FirstName, LastName, [Phone#], Email FROM Membership;
 
set identity_insert #Membership off;
 
select * from #Membership;
REVERT;

The results are identical:

[Image: ddmResults3]

What if we do something very similar, but use one of the columns in an expression?

EXECUTE AS USER = 'TestUser';
if object_id('tempdb..#Membership') is not null
	drop table #Membership;
CREATE TABLE #Membership
  (MemberID int IDENTITY PRIMARY KEY,
   FirstName varchar(100) NULL,
   LastName varchar(100) NOT NULL,
   Phone# varchar(12) NULL,
   EMail varchar(100) NULL);
set identity_insert #Membership on;
 
insert #Membership (MemberID, FirstName, LastName, [Phone#], Email)
SELECT MemberID, FirstName + '', LastName, [Phone#], Email FROM Membership; 
set identity_insert #Membership off;
select * from #Membership;
REVERT;

[Image: ddmResults4]

Interestingly, this yields only “xxxx” in the FirstName column: even the first character is masked now, presumably because using the masked column inside an expression causes the result to receive the default mask rather than the column’s partial mask.

We can check whether the “masked” attribute was copied to the temp table created by the SELECT … INTO statement.

EXECUTE AS USER = 'TestUser';
if object_id('tempdb..#Membership') is not null
	drop table #Membership;
SELECT * into #Membership FROM Membership;
 
alter table #Membership alter column FirstName drop masked;
 
select * from #Membership;
REVERT;

Msg 16007, Level 16, State 0, Line 32
The column 'FirstName' does not have a data masking function.

Nope, the masking attribute doesn’t get copied to the temp table.
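
Another way to confirm this is to look at the masking metadata directly. Here is a minimal sketch using the sys.masked_columns catalog view (run it from the same session that created the temp table):

-- Masked columns on the base table: should list FirstName, Phone# and EMail.
select name, masking_function
from sys.masked_columns
where object_id = object_id('Membership');
 
-- Masked columns on the temp table: should return no rows, consistent with the error above.
select name, masking_function
from tempdb.sys.masked_columns
where object_id = object_id('tempdb..#Membership');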

Next, we try selecting into a variable first, then seeing what is in the variable:

EXECUTE AS USER = 'TestUser';
declare @FirstName varchar(100);
select @FirstName = FirstName from Membership where MemberID = 1;
select @FirstName;
REVERT;

Now we get:

[Image: ddmResults5]

Let’s select the characters individually. Based on what we’ve seen to this point, this probably isn’t going to work.

EXECUTE AS USER = 'TestUser';
declare @c1 char(1), @c2 char(1), @c3 char(1), @c4 char(1), @c5 char(1), @c6 char(1), @c7 char(1)
select @c1 = substring(FirstName, 1, 1), @c2 = substring(FirstName, 2, 1), @c3 = substring(FirstName, 3, 1), @c4 = substring(FirstName, 4, 1),
	@c5 = substring(FirstName, 5, 1), @c6 = substring(FirstName, 6, 1), @c7 = substring(FirstName, 7, 1)
from Membership where MemberID = 1;
select @c1, @c2, @c3, @c4, @c5, @c6, @c7;
REVERT;

And indeed we only get:

[Image: ddmResults6]

Finally, we will create another database and create a user as database owner:

create database AnotherDatabase;
GO
use AnotherDatabase;
 
create login AnotherDBUser with password = 'B@dP@$$w0rd';
create user AnotherDBUser for login AnotherDBUser;
 
alter role db_owner add member AnotherDBUser;

Now switch back to the database containing the Membership table, and grant this login read-only access.

create user AnotherDBUser for login AnotherDBUser;
alter role [db_datareader] add member AnotherDBUser;

And try it out:

execute as login = 'AnotherDBUser';
SELECT * FROM Membership;
REVERT;
 
execute as user = 'AnotherDBUser';
SELECT * FROM Membership;
REVERT;

Both queries return the same masked result set.

[Image: ddmResults7]

SQL Server 2016 CTP2 Initial Impressions: Temporal Table Performance

One of the interesting new features I came across in CTP2 of SQL Server 2016 is temporal tables, which essentially give you the ability to see your records from a point-in-time perspective. My hope was to find something with an easier-to-use, Change Data Capture-like feel, and specifically with the sort of performance that CDC gives you. What I found was not quite that.

Here is the setup. First, I create a baseline table. This will not be temporal; it is there purely to compare the results between a temporal table and a normal table.

CREATE TABLE nottemporal
(
id INT IDENTITY(1, 1) PRIMARY KEY CLUSTERED
,col1 VARCHAR(200)
,col2 DATETIME2
,col3 INT
);

Next, I will create my temporal table. It will have the same structure as my non-temporal table, plus the additional columns and clauses that make a table temporal.

CREATE TABLE testtemporal
(
id INT IDENTITY(1, 1) PRIMARY KEY CLUSTERED
,col1 VARCHAR(200)
,col2 DATETIME2
,col3 INT
,SysStartTime DATETIME2 GENERATED ALWAYS AS ROW START NOT NULL
,SysEndTime DATETIME2 GENERATED ALWAYS AS ROW END NOT NULL
,PERIOD FOR SYSTEM_TIME (SysStartTime, SysEndTime)
) WITH (SYSTEM_VERSIONING = ON);

The syntax is a bit different from what we would normally use in SQL Server, and it is required to make temporal tables work. It is also worth noting that things look a bit different for a temporal table in Management Studio: there is a little note saying the table is System-Versioned, and when you expand the table there is a reference to the default history table that was created for us to hold this data. You can supply your own history table (so that it is named nicely), but that is for another time.

[Image: Temporal Table Object Explorer]
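
The history table can also be located from T-SQL. Here is a minimal sketch using the temporal metadata columns on sys.tables (assuming CTP2 exposes them as documented for SQL Server 2016):

-- List system-versioned tables and the history tables backing them.
select t.name as temporal_table,
       h.name as history_table
from sys.tables as t
join sys.tables as h
    on h.object_id = t.history_table_id
where t.temporal_type = 2;   -- SYSTEM_VERSIONED_TEMPORAL_TABLE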

Now I want to populate this table. Before doing that we need a numbers table to simplify things (plus, Adam Machanic tells us that we require one!). If you create your numbers table with 1 million rows, just as Adam describes, my scripts will work for you without any changes.
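
If you do not already have a numbers table, here is a minimal sketch of one way to build a million-row version using the names the scripts below expect (the population method is just one of many options):

-- Build a 1,000,000-row numbers table.
CREATE TABLE Numbers
(
    number INT NOT NULL PRIMARY KEY CLUSTERED
);
 
INSERT INTO Numbers (number)
SELECT TOP (1000000) ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM sys.all_columns AS a
CROSS JOIN sys.all_columns AS b;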

For reference, I am running all of these on an Azure VM created with the SQL 2016 CTP2 image with 4 cores and 3.5 GB of RAM. Pretty puny but it will do for this.

SET STATISTICS IO ON;
SET STATISTICS TIME ON;
 
INSERT INTO testtemporal(col1, col2, col3)
Select 'test data ' + cast(number as varchar(10))
,dateadd(ss, number, getdate())
,number
From numbers;
GO
 
INSERT INTO nottemporal(col1, col2, col3)
Select 'test data ' + cast(number as varchar(10))
,dateadd(ss, number, getdate())
,number
From numbers;
GO

From a time perspective, these two ran in roughly the same amount of time, and the IO is slightly higher for the temporal table, which is not surprising since, if nothing else, it has a couple of extra columns.

For the Non Temporal Table:

Table 'nottemporal'. Scan count 0, logical reads 28318, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Numbers'. Scan count 1, logical reads 1615, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 7812 ms, elapsed time = 57097 ms.

For the Temporal Table:

Table 'testtemporal'. Scan count 0, logical reads 38198, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Numbers'. Scan count 1, logical reads 1615, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 7844 ms, elapsed time = 58777 ms.

But it is when we start updating that this gets interesting. In this case, I am going to update all 1 million rows. Perhaps not a real-world scenario, but it is the best way to show what I see as a concern.

Update testtemporal
set col1 = 'Modified Data ' + cast(col3 as varchar(10));
GO
Update nottemporal
set col1 = 'Modified Data ' + cast(col3 as varchar(10));

For the Non Temporal Table:

Table 'nottemporal'. Scan count 1, logical reads 61298, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 8766 ms, elapsed time = 42353 ms.

For the Temporal Table:

Table 'MSSQL_TemporalHistoryFor_565577053'. Scan count 0, logical reads 15993, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'testtemporal'. Scan count 1, logical reads 83217, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 31109 ms, elapsed time = 68449 ms.

This is where it gets interesting from an IO/time perspective. Elapsed time is not something you can bank on, but the jump here is noticeable, though obviously your mileage may vary: 68 seconds to update the temporal table versus 42 seconds to update the non-temporal table. I would say it is comparable to having a trigger on the table (without real data to back that up; just a feeling).

The reads are higher, but it is not significant … not yet anyway.

Seeing how poorly the first update performed, I figured it was worth checking what would happen if I updated those records back to their original values.

Update testtemporal
set col1 = 'test data ' + cast(col3 as varchar(10));
GO
Update nottemporal
set col1 = 'test data ' + cast(col3 as varchar(10))

And the IO/Time data for this gets even crazier.

For the Non Temporal Table:

Table 'nottemporal'. Scan count 1, logical reads 11165, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 5063 ms, elapsed time = 15502 ms.

For the Temporal Table:

Table 'MSSQL_TemporalHistoryFor_565577053'. Scan count 0, logical reads 3242447, physical reads 0, read-ahead reads 3, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'testtemporal'. Scan count 1, logical reads 15156, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 66359 ms, elapsed time = 165560 ms.

This is where it went nuts, in my opinion. The IO on the history table jumps from roughly 16,000 logical reads for the first update to over 3.2 million on the second (and, as I later found, on a number of updates after that), and the time to complete was an order of magnitude higher. Obviously, from a performance perspective this could be problematic.

Digging into the why of this, it becomes pretty clear just by looking at the structure of the history table that SQL Server creates for us: the clustering key is built on our own primary key, with the SysStartTime and SysEndTime datetime2 columns appended.

[Image: Temporal Table Clustering Keys]
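
To confirm the key columns without the table designer, a quick catalog query works too (a sketch; the history table name embeds the parent table's object_id and will differ on your system):

-- Key columns of the clustered index on the auto-created history table.
select i.name as index_name, c.name as column_name, ic.key_ordinal
from sys.indexes as i
join sys.index_columns as ic
    on ic.object_id = i.object_id and ic.index_id = i.index_id
join sys.columns as c
    on c.object_id = ic.object_id and c.column_id = ic.column_id
where i.object_id = object_id('MSSQL_TemporalHistoryFor_565577053')
  and i.index_id = 1
order by ic.key_ordinal;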

Running a quick DBCC SHOWCONTIG to see how things are laid out after those updates shows a fair amount of fragmentation, as would be expected for a clustering key of this nature.
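
The invocation is simply the following (my reconstruction; the table name is the auto-generated history table from this run):

DBCC SHOWCONTIG ('MSSQL_TemporalHistoryFor_565577053');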

DBCC SHOWCONTIG scanning 'MSSQL_TemporalHistoryFor_565577053' table...
Table: 'MSSQL_TemporalHistoryFor_565577053' (597577167); index ID: 1, database ID: 5
TABLE level scan performed.
- Pages Scanned................................: 12149
- Extents Scanned..............................: 1520
- Extent Switches..............................: 7212
- Avg. Pages per Extent........................: 8.0
- Scan Density [Best Count:Actual Count].......: 21.06% [1519:7213]
- Logical Scan Fragmentation ..................: 99.99%
- Extent Scan Fragmentation ...................: 0.79%
- Avg. Bytes Free per Page.....................: 2256.4
- Avg. Page Density (full).....................: 72.12%
DBCC execution completed. If DBCC printed error messages, contact your system administrator.

In summary, temporal tables look like they could be a very cool feature in SQL Server 2016 and future releases, but they will definitely come with a price.

UPDATE 7/20/2015: Tested on CTP 2.1 with identical results.

UPDATE 9/30/2015: Tested on CTP 2.4 with similar results.

SQL Server 2016 CTP2 Initial Impressions: Live Statistics and Dynamic Data Masking

I have been trying out the CTP2 release that came out a couple of days ago and wanted to write about a couple of things I’ve seen so far.

First, I kind of like the Live Query Statistics feature.  The first time I ran a query with live statistics, everything seemed to hang for about four minutes, displaying a message to the effect that it was preparing the query.  I’m guessing this is more a function of initializing things under the hood on the underpowered VM that I’m using, because subsequent runs have been much speedier.

In the few queries that I’ve tested so far there hasn’t been too much excitement, but I can certainly see how live statistics can be a useful diagnostic tool at times.

I also tried playing around with Dynamic Data Masking.  Unfortunately, I wasn’t able to get things to work, even using the sample from the documentation.  Specifically, the line:

ALTER TABLE Membership ALTER COLUMN FirstName ADD MASKED WITH (FUNCTION = 'partial(1,"XXXXXXX",0)');

results in an error:

Msg 102, Level 15, State 1, Line 8
Incorrect syntax near 'masked'.

The syntax appears to be correct according to the ALTER TABLE documentation, so I’m not sure if the syntax has been changed or if something isn’t quite working correctly in SQL Server.

One more thing.  When I install Management Studio, one of the first things I change is the setting to send results to a new tab and to automatically switch to that tab when query execution is done (Tools -> Options -> Query Results -> SQL Server -> Results to Grid).  To my annoyance, queries execute fine, but SSMS does not automatically switch to the new tab.  This happens even after a restart of SSMS.  However, I have not yet tried it on a different installation.

All that said, I really like a lot of the new features in SQL Server 2016 and look forward to further experimentation.

UPDATE:  It turns out that a pair of trace flags needed to be enabled in CTP 2.0 for the dynamic data masking to work correctly.  Starting in CTP 2.1, these trace flags were enabled by default and the sample code works as expected.

As of CTP 3.0, the switch-to-results-tab feature is still not working in Management Studio.  I have submitted a Connect item to address this bug.

UPDATE: Awesome! As of CTP 3.1, switch to results tab looks to be working!