Write locking and concurrency

My logical model looks like this (diagram not included):
A header record can contain multiple child records.

Multiple computers can insert child records through a stored procedure that accepts the child record information and a value.

  • When a child record is inserted, it may be necessary to insert a header record if it does not exist with the specified value.
  • You only want one header record to be inserted for any given value. Therefore, if two child records are inserted with the same "Value", the header only needs to be created once. This requires concurrency management during inserts.

Multiple computers can request unprocessed header records through a stored procedure.

  • A header record should be returned if it has a specific set of child records and the header record has not been processed.
    • You only want one computer to query and process each header record. There should never be an instance where a header record and its children are processed by more than one computer. This requires concurrency management at selection time.

So basically, my header query looks like this:

BEGIN TRANSACTION;
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
SELECT TOP 1
    *
INTO
    #unprocessed
FROM
    Header h WITH (READPAST, UPDLOCK)
JOIN
    Child part1 ON part1.HeaderID = h.HeaderID AND part1.Name = 'XYZ'
JOIN
    Child part2 ON part1.HeaderID = part2.HeaderID AND 
WHERE
    h.Processed = 0x0;

UPDATE
    Header
SET
    Processed = 0x1
WHERE
    HeaderID IN (SELECT [HeaderID] FROM #unprocessed);  

SELECT * FROM #unprocessed

COMMIT TRAN

      

This way, the above query ensures that concurrent queries never return the same record.

I think my problem is with the insert query. Here's what I have:

DECLARE @HeaderID INT

BEGIN TRAN

--Create the header record if it doesn't exist; otherwise get its HeaderID
MERGE INTO
    Header WITH (HOLDLOCK) as target
USING
(
    SELECT 
        [Value] = @Value,  --stored procedure parameter
        [HeaderID]

) as source ([Value], [HeaderID]) ON target.[Value] = source.[Value] AND
                                     target.[Processed] = 0    
WHEN MATCHED THEN 
    UPDATE SET
        --Get the ID of the existing header
        @HeaderID = target.[HeaderID],
        [LastInsert] = sysdatetimeoffset() 
WHEN NOT MATCHED THEN
    INSERT
    (
        [Value]
    )
    VALUES
    (
        source.[Value]
    ); --a MERGE statement must be terminated with a semicolon


--Get new or existing ID
SELECT @HeaderID = COALESCE(@HeaderID , SCOPE_IDENTITY());

--Insert child with the new or existing HeaderID
INSERT INTO 
    [Correlation].[CorrelationSetPart]
    (
        [HeaderID],
        [Name]  
    )
VALUES
(
    @HeaderID,
    @Name --stored procedure parameter
);

COMMIT TRAN

      

My problem is that the insert query is often blocked by the selection query above, and I get timeouts. The selection query is called by a broker, so it can be called in quick succession. Is there a better way to do this? Please note that I am in control of the database schema.


1 answer


To answer the second part of the question

You only want one computer to query and process each header record. There should never be an instance where a header record and its children are processed by more than one computer.

Have a look at sp_getapplock.

I am using application locks in a similar scenario. I have a table of objects to be processed, similar to your header table. The client application starts multiple threads at the same time. Each thread executes a stored procedure that returns the next object for processing from the object table. Thus, the main purpose of the stored procedure is not the processing itself, but the return of the first object in the queue that needs processing. The code might look something like this:

CREATE PROCEDURE [dbo].[GetNextHeaderToProcess]
AS
BEGIN
    -- SET NOCOUNT ON added to prevent extra result sets from
    -- interfering with SELECT statements.
    SET NOCOUNT ON;

    BEGIN TRANSACTION;
    BEGIN TRY

        DECLARE @VarHeaderID int = NULL;

        DECLARE @VarLockResult int;
        EXEC @VarLockResult = sp_getapplock
            @Resource = 'GetNextHeaderToProcess_app_lock',
            @LockMode = 'Exclusive',
            @LockOwner = 'Transaction',
            @LockTimeout = 60000,
            @DbPrincipal = 'public';

        IF @VarLockResult >= 0
        BEGIN
            -- Acquired the lock
            -- Find the most suitable header for processing
            SELECT TOP 1
                @VarHeaderID = h.HeaderID
            FROM
                Header h
                JOIN Child part1 ON part1.HeaderID = h.HeaderID AND part1.Name = 'XYZ'
                JOIN Child part2 ON part1.HeaderID = part2.HeaderID
            WHERE
                h.Processed = 0x0
            ORDER BY ....;
            -- sorting is optional, but often useful
            -- for example, order by some timestamp to process oldest/newest headers first

            -- Mark the found Header to prevent multiple processing.
            UPDATE Header
            SET Processed = 2 -- in progress. Another procedure that performs the actual processing should set it to 1 when processing is complete.
            WHERE HeaderID = @VarHeaderID;
            -- There is no need to explicitly verify if we found anything. 
            -- If @VarHeaderID is null, no rows will be updated
        END;

        -- Return found Header, or no rows if nothing was found, or failed to acquire the lock
        SELECT
            @VarHeaderID AS HeaderID
        WHERE
            @VarHeaderID IS NOT NULL
        ;

        COMMIT TRANSACTION;
    END TRY
    BEGIN CATCH
        ROLLBACK TRANSACTION;
    END CATCH;

END

      

This procedure should be called from the code that does the actual processing. In my case, the client application does the actual processing; in your case it might be another stored procedure. The idea is that the application lock is held only for a short time here. Of course, if the actual processing is fast, you can put it inside the lock, so that only one header is processed at a time.

Once the lock is obtained, we look for the most suitable header to process and then set its Processed flag. Depending on the nature of your processing, you can immediately set the flag to 1 (processed), or set it to some intermediate value, for example 2 (in progress), and set it to 1 (processed) later. In either case, once the flag is nonzero, the header will not be selected for processing again.
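As a rough sketch of the "set it to 1 later" step, the code that finishes the processing could call something like the procedure below. The procedure name MarkHeaderProcessed and its parameter are hypothetical, and the sketch assumes that Processed is stored in a column type that can actually hold the value 2 (for example tinyint rather than bit).

```sql
-- Hypothetical companion procedure; the name and parameter are assumptions.
CREATE PROCEDURE [dbo].[MarkHeaderProcessed]
    @HeaderID int
AS
BEGIN
    SET NOCOUNT ON;

    -- Only a header currently marked "in progress" (2) is completed,
    -- so a second call for the same header changes nothing.
    UPDATE Header
    SET Processed = 1
    WHERE HeaderID = @HeaderID
      AND Processed = 2;

    -- 1 if this call completed the header, 0 otherwise.
    RETURN @@ROWCOUNT;
END
```

No application lock is needed here, because a header with Processed = 2 is no longer a candidate for GetNextHeaderToProcess.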



These application locks are separate from the regular locks the DB places on rows when reading and updating them, and should not interfere with inserts. Either way, it should be better than locking the entire table as you do with WITH (UPDLOCK).

Returning to the first part of the question

You only want one header record to be inserted for any given value. Therefore, if two child records are inserted with the same "Value", the header only needs to be created once.

You can use the same approach: acquire an application lock at the beginning of the insert procedure (with a different resource name than the application lock used in the selection procedure). This way, you ensure that the inserts happen sequentially and not concurrently. By the way, in practice the inserts most likely cannot run truly simultaneously anyway. The DB will execute them sequentially internally: they will wait for each other, because each insert locks the table for the update. In addition, every insert is written to the transaction log, and all writes to the transaction log are also sequential. So, just add sp_getapplock to the beginning of the insert procedure and remove the WITH (HOLDLOCK) hint in MERGE.

The caller of GetNextHeaderToProcess must correctly handle the situation when the procedure returns no rows. This can happen if the lock timeout has expired, or if there are simply no headers to process. Usually the caller just retries after a while.

The insert procedure should check whether it failed to acquire the lock and either retry the insert or report the problem to the caller. I usually return the generated ID of the inserted row (ChildID in your case) to the caller. If the procedure returns 0, it means that the insert failed. The caller can decide what to do.
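A minimal sketch of an insert procedure along these lines is shown below. Several things in it are assumptions rather than part of the question: the procedure name InsertChild, the lock resource name, the parameter types, an identity column ChildID on the Child table, and a default of 0 for Header.Processed. It also replaces the MERGE with a plain SELECT followed by an INSERT, which is only reasonable here because the application lock serializes the callers.

```sql
-- Hypothetical insert procedure; names and parameter types are assumptions.
CREATE PROCEDURE [dbo].[InsertChild]
    @Value nvarchar(100),   -- assumed type
    @Name  nvarchar(100)    -- assumed type
AS
BEGIN
    SET NOCOUNT ON;

    DECLARE @HeaderID int = NULL;
    DECLARE @ChildID int = 0;      -- 0 means "insert failed"
    DECLARE @VarLockResult int;

    BEGIN TRANSACTION;
    BEGIN TRY
        -- A different resource name than the one used by GetNextHeaderToProcess.
        EXEC @VarLockResult = sp_getapplock
            @Resource = 'InsertChild_app_lock',
            @LockMode = 'Exclusive',
            @LockOwner = 'Transaction',
            @LockTimeout = 60000,
            @DbPrincipal = 'public';

        -- 0 or 1 means the lock was granted; negative values mean
        -- timeout, cancellation, deadlock victim or another error.
        IF @VarLockResult >= 0
        BEGIN
            -- Look for an existing unprocessed header with this value.
            SELECT @HeaderID = h.HeaderID
            FROM Header h
            WHERE h.[Value] = @Value
              AND h.Processed = 0;

            IF @HeaderID IS NULL
            BEGIN
                -- No header yet: create it (Processed is assumed to default to 0).
                INSERT INTO Header ([Value]) VALUES (@Value);
                SET @HeaderID = SCOPE_IDENTITY();
            END
            ELSE
            BEGIN
                UPDATE Header
                SET LastInsert = SYSDATETIMEOFFSET()
                WHERE HeaderID = @HeaderID;
            END;

            INSERT INTO Child (HeaderID, [Name]) VALUES (@HeaderID, @Name);
            SET @ChildID = SCOPE_IDENTITY();
        END;

        COMMIT TRANSACTION;
    END TRY
    BEGIN CATCH
        IF @@TRANCOUNT > 0 ROLLBACK TRANSACTION;
        SET @ChildID = 0;
    END CATCH;

    -- Generated ChildID on success, 0 if the lock was not acquired or an error occurred.
    RETURN @ChildID;
END
```

A caller would run EXEC @Result = [dbo].[InsertChild] @Value = ..., @Name = ... and treat a result of 0 as a failed insert that can be retried or reported.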
