Hekaton concurrent updates: Large vs. small transactions

Here is another experiment that I performed with In-Memory OLTP (Hekaton) on CTP2 of SQL Server 2014.  I started by creating a table and a sequence:

create table Employee
(
	EmployeeId bigint not null
		primary key nonclustered hash with (bucket_count = 20000000),
	Salary money
) with (memory_optimized = on, durability = schema_only);
 
create sequence EmployeeSequence as bigint
start with 1 increment by 1;

Next, I added about 10 million rows to the table (the three-way cross join below yields 216³ = 10,077,696 of them). The EmployeeId was generated from the sequence object, and the Salary was a random value from 25,000 to 75,000:

with Records as (select top (216) status from master.dbo.spt_values)
insert	Employee (EmployeeId, Salary)
select	next value for EmployeeSequence as EmployeeId,
	cast(cast(25000.00 + 50000.00 * (abs(cast(binary_checksum(newid()) as int)))
		* 1.0 / 2147483648.0 as int) as money) as Salary
from	Records r1 cross join Records r2 cross join Records r3;
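
As a sanity check (this query is my addition, not part of the original test), the row count and salary range are easy to verify after the load:

select	count(*) as TotalRows,		-- expect 10,077,696
	min(Salary) as MinSalary,	-- expect no less than 25,000.00
	max(Salary) as MaxSalary	-- expect under 75,000.00
from	Employee;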

The first few rows looked like this:

[Screenshot: the first 10 rows of the Employee table]
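
For reference, a query along these lines would produce that output (my reconstruction, not the original; with no ORDER BY, the rows come back in whatever order the scan happens to visit them):

select	top (10) EmployeeId, Salary
from	Employee;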

I don’t know enough about the internals of Hekaton to be sure how the records in the Employee table are accessed during a table scan (or whether the access order is even deterministic), but my guess is that it happens in EmployeeId order. I set up the following queries in two separate connections:

Connection 1 – Give all employees a 5% raise:

update Employee
set Salary *= 1.05;

Connection 2 – Set a specific salary for a specific employee:

update Employee
set Salary = 80000.00
where EmployeeId = 9999999;

Before running anything, row 9,999,999 looked like this:

[Screenshot: the row for employee 9,999,999 before the updates]

I then fired off the query in connection 1 and, as quickly as possible, switched to connection 2 and executed the statement there. The query in connection 2 completed quickly; the one in connection 1 ran for a few seconds and then generated an error:

Msg 41302, Level 16, State 110, Line 1
The current transaction attempted to update a record that has been updated since this transaction started. The transaction was aborted.
The statement has been terminated.

The bottom line is that even though query 1 started first and had almost surely done far more work (in terms of the number of updates performed before it reached the record for employee 9,999,999), it nonetheless gets the short end of the stick and is the “victim” in this case. At this point, the target employee has this value:

[Screenshot: the row for employee 9,999,999 after the updates]
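
As an aside, the race can probably be reproduced without hand timing. If I understand the write-write conflict detection correctly, leaving the small update open in an explicit transaction should make the large update fail deterministically when its scan reaches that row (the snapshot table hint is required when interpreted T-SQL touches a memory-optimized table inside an explicit transaction). A sketch, which I have not run the way the original test was:

-- Connection 2: update the row, but leave the transaction open
begin transaction;

update	Employee with (snapshot)
set	Salary = 80000.00
where	EmployeeId = 9999999;

-- Connection 1: this should now abort with error 41302 as soon as
-- the scan reaches employee 9,999,999
update	Employee
set	Salary *= 1.05;

-- Connection 2: finally, commit (or roll back)
commit;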

This behavior, where whichever transaction touches a contested row second is simply killed off, is the price that is paid for having no locking. My concern is that it will limit the usefulness of Hekaton tables for OLTP purposes. I can certainly envision a scenario where a batch update repeatedly fails because small one-off updates taking place across the same set of records repeatedly cause the larger transaction to be terminated. And batch updates are just a part of life, at least in my experience, for most OLTP systems. That being said, I still envision great utility in using the In-Memory tables in situations where concurrent updates won’t be likely, such as for staging tables.
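
For cases where a big update must coexist with small ones, the usual mitigation is to treat error 41302 as retryable. A minimal sketch (my addition; the retry count and delay are arbitrary choices):

declare @retries int = 5, @done bit = 0;

while @retries > 0 and @done = 0
begin
	begin try
		update	Employee
		set	Salary *= 1.05;

		set @done = 1;	-- success, stop looping
	end try
	begin catch
		-- 41302 = update conflict; the other Hekaton validation errors
		-- (41301, 41305, 41325) could be handled the same way.
		-- Anything else is re-raised.
		if error_number() <> 41302
			throw;

		set @retries -= 1;
		waitfor delay '00:00:00.100';	-- brief pause before retrying
	end catch
end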
