Missing Rows After Updating SQL Server Index Key Column
"T-SQL Query Query" ( http://www.amazon.com/Inside-Microsoft-Querying-Developer-Reference/dp/0735626030 ) has an interesting example where querying a table by default transaction isolation level when updating columns clustered index, you can skip a line or read a line twice. It looks acceptable as updating the key table / entity is not a good idea anyway. However, I've updated this example so that it happens when you update a non-clustered index column value.
Below is the structure of the table:
SET NOCOUNT ON;
USE master;
IF DB_ID('TestIndexColUpdate') IS NULL CREATE DATABASE TestIndexColUpdate;
GO
USE TestIndexColUpdate;
GO
IF OBJECT_ID('dbo.Employees', 'U') IS NOT NULL DROP TABLE dbo.Employees;
CREATE TABLE dbo.Employees
(
empid CHAR(900) NOT NULL, -- this column should be big enough, so that 9 rows fit on 2 index pages
salary MONEY NOT NULL,
filler CHAR(1) NOT NULL DEFAULT('a')
);
CREATE INDEX idx_salary ON dbo.Employees(salary) include (empid); -- include empid into index, so that test query reads from it
ALTER TABLE dbo.Employees ADD CONSTRAINT PK_Employees PRIMARY KEY NONCLUSTERED(empid);
INSERT INTO dbo.Employees(empid, salary) VALUES
('A', 1500.00),('B', 2000.00),('C', 3000.00),('D', 4000.00),
('E', 5000.00),('F', 6000.00),('G', 7000.00),('H', 8000.00),
('I', 9000.00);
This is what needs to be done in the first join (on every update, the line will jump between the two index pages):
SET NOCOUNT ON;
USE TestIndexColUpdate;
WHILE 1=1
BEGIN
UPDATE dbo.Employees SET salary = 10800.00 - salary WHERE empid = 'I'; -- on each update, "I" employee jumps between 2 pages
END
This is what needs to be done in the second connection:
SET NOCOUNT ON;
USE TestIndexColUpdate;
DECLARE @c INT
WHILE 1 = 1
BEGIN
SELECT salary, empid FROM dbo.Employees
if @@ROWCOUNT <> 9 BREAK;
END
Typically, this query should return the 9 records that we entered in the first sample code. However, very soon I see 8 records coming back. This query reads all data from the "idx_salary" index, which is updated by the previous code example. This appears to be a rather weak relationship to data consistency with SQL Server. I would expect some lock coordination when data is read from the index and its key column is updated.
Am I interpreting this behavior correctly? Does this mean that even non-clustered index keys should not be updated?
UPDATE: To fix this problem you need to enable snapshots on the db (READ_COMMITTED_SNAPSHOT ON). No more deadlocks or missing lines. I've tried to summarize the whole thing here: http://blog.konstantins.net/2015/01/missing-rows-after-updating-sql-server.html
UPDATE 2: This seems to be the same problem as in this good old article: http://blog.codinghorror.com/deadlocked/
source to share
Am I interpreting this behavior correctly?
Yes.
Does this mean that even non-clustered index keys should not be updated?
Not. You must use the proper isolation level or force the application to tolerate the inconsistencies it allows READ COMMITTED
.
This missing row issue is not limited to clustered indexes. This is caused by moving a line in the b-tree. Clustered and non-clustered indexes are implemented as b-trees with little physical difference between them.
So you see the same physical phenomenon. It is applied every time your query reads a series of rows from a b-tree. The contents of this range can move.
Use the isolation level that gives you the guarantees you need. For read-only transactions, the snapshot isolation level is usually a very elegant and complete solution for concurrency. This seems to be relevant to your case.
This appears to be a rather weak relationship to data consistency with SQL Server. I would expect some lock coordination as data is read from the index and its key column is updated.
This is a clear request. On the other hand, you specifically requested a low isolation level. You can dial the number before SERIALIZABLE
. SERIALIZABLE
presents you with a somehow consistent execution.
Missing lines are just one special case of many of the effects it allows READ COMMITTED
. It makes no sense to specifically warn them, allowing all kinds of other inconsistencies.
SET NOCOUNT ON;
USE TestIndexColUpdate;
SET TRANSACTION ISOLATION LEVEL READ COMMITTED
DECLARE @c INT
WHILE 1 = 1
BEGIN
DECLARE @count INT
SELECT @count = COUNT(*) FROM dbo.Employees WITH (INDEX (idx_salary))
WHERE empid > '' AND CONVERT(NVARCHAR(MAX), empid) > '__'
AND salary > 0
if @count <> 9 BREAK;
END
source to share