Missing Rows After Updating SQL Server Index Key Column

"T-SQL Query Query" ( http://www.amazon.com/Inside-Microsoft-Querying-Developer-Reference/dp/0735626030 ) has an interesting example where querying a table by default transaction isolation level when updating columns clustered index, you can skip a line or read a line twice. It looks acceptable as updating the key table / entity is not a good idea anyway. However, I've updated this example so that it happens when you update a non-clustered index column value.

Below is the structure of the table:

SET NOCOUNT ON;
USE master;
IF DB_ID('TestIndexColUpdate') IS NULL CREATE DATABASE TestIndexColUpdate;
GO
USE TestIndexColUpdate;
GO

IF OBJECT_ID('dbo.Employees', 'U') IS NOT NULL DROP TABLE dbo.Employees;
CREATE TABLE dbo.Employees
(
empid CHAR(900) NOT NULL, -- this column should be big enough, so that 9 rows fit on 2 index pages
salary MONEY NOT NULL,
filler CHAR(1) NOT NULL DEFAULT('a')
);
CREATE INDEX idx_salary ON dbo.Employees(salary) include (empid); -- include empid into index, so that test query reads from it
ALTER TABLE dbo.Employees ADD CONSTRAINT PK_Employees PRIMARY KEY NONCLUSTERED(empid);

INSERT INTO dbo.Employees(empid, salary) VALUES
('A', 1500.00),('B', 2000.00),('C', 3000.00),('D', 4000.00),
('E', 5000.00),('F', 6000.00),('G', 7000.00),('H', 8000.00),
('I', 9000.00);

      

This is what needs to be done in the first join (on every update, the line will jump between the two index pages):

SET NOCOUNT ON;
USE TestIndexColUpdate;

WHILE 1=1
BEGIN
    UPDATE dbo.Employees SET salary = 10800.00 - salary WHERE empid = 'I'; -- on each update, "I" employee jumps between 2 pages
END

      

This is what needs to be done in the second connection:

SET NOCOUNT ON;
USE TestIndexColUpdate;

DECLARE @c INT
WHILE 1 = 1
BEGIN
    SELECT salary, empid FROM dbo.Employees
    if @@ROWCOUNT <> 9 BREAK;
END

      

Typically, this query should return the 9 records that we entered in the first sample code. However, very soon I see 8 records coming back. This query reads all data from the "idx_salary" index, which is updated by the previous code example. This appears to be a rather weak relationship to data consistency with SQL Server. I would expect some lock coordination when data is read from the index and its key column is updated.

Am I interpreting this behavior correctly? Does this mean that even non-clustered index keys should not be updated?

UPDATE: To fix this problem you need to enable snapshots on the db (READ_COMMITTED_SNAPSHOT ON). No more deadlocks or missing lines. I've tried to summarize the whole thing here: http://blog.konstantins.net/2015/01/missing-rows-after-updating-sql-server.html

UPDATE 2: This seems to be the same problem as in this good old article: http://blog.codinghorror.com/deadlocked/

+3


source to share


1 answer


Am I interpreting this behavior correctly?

Yes.

Does this mean that even non-clustered index keys should not be updated?

Not. You must use the proper isolation level or force the application to tolerate the inconsistencies it allows READ COMMITTED

.

This missing row issue is not limited to clustered indexes. This is caused by moving a line in the b-tree. Clustered and non-clustered indexes are implemented as b-trees with little physical difference between them.

So you see the same physical phenomenon. It is applied every time your query reads a series of rows from a b-tree. The contents of this range can move.



Use the isolation level that gives you the guarantees you need. For read-only transactions, the snapshot isolation level is usually a very elegant and complete solution for concurrency. This seems to be relevant to your case.

This appears to be a rather weak relationship to data consistency with SQL Server. I would expect some lock coordination as data is read from the index and its key column is updated.

This is a clear request. On the other hand, you specifically requested a low isolation level. You can dial the number before SERIALIZABLE

. SERIALIZABLE

presents you with a somehow consistent execution.

Missing lines are just one special case of many of the effects it allows READ COMMITTED

. It makes no sense to specifically warn them, allowing all kinds of other inconsistencies.


SET NOCOUNT ON;
USE TestIndexColUpdate;
SET TRANSACTION ISOLATION LEVEL READ COMMITTED

DECLARE @c INT
WHILE 1 = 1
BEGIN
    DECLARE @count INT
    SELECT @count = COUNT(*) FROM dbo.Employees WITH (INDEX (idx_salary))
    WHERE empid > '' AND CONVERT(NVARCHAR(MAX), empid) > '__'
        AND salary > 0
    if @count <> 9 BREAK;
END

      

+3


source







All Articles