Remove duplicate rows in a group ordered by date

I doubt my title makes a lot of sense, I will do my best to explain what I am asking. I need to clear the audit table that keeps track of when the state of an object has changed. For one reason or another, multiple records are created with new dates, while the state of the object remains unchanged. I need to keep the first record of each state change and then delete all subsequent records where the state is the same. Oh, and there is no primary key. Yes !: |

Here's an example dataset:

ObjectID   ObjectState     DateOfEntry
101144      1           2007-08-14 12:39:30.587
101144      1           2007-08-14 12:41:52.620
101144      1           2007-08-14 12:42:11.150
101144      1           2007-08-14 12:42:24.197
101144      3           2007-08-14 12:44:06.403
101144      3           2007-08-14 12:44:06.467
101144      3           2007-08-14 12:46:12.573
101144      3           2007-08-14 12:50:51.670
101144      3           2007-08-14 12:50:51.750
101144      3           2007-08-14 12:56:34.330
101144      4           2007-08-14 17:28:59.280
101144      3           2007-08-14 17:32:26.313
101144      3           2007-08-14 17:32:48.720
101144      3           2007-08-14 17:45:07.460
101144      3           2007-08-14 17:46:31.740
101144      3           2007-08-14 17:47:04.380
101144      3           2007-08-14 17:47:29.507
101144      3           2007-08-14 17:49:13.460
101144      3           2007-08-14 17:54:15.320
101144      3           2007-08-14 17:55:57.540
101144      3           2007-08-14 19:50:11.913
101144      3           2007-08-14 19:53:10.820
101144      3           2007-08-14 20:03:44.900
101144      3           2007-08-16 10:34:56.477
101144      3           2007-08-16 10:36:06.477
101144      3           2007-08-16 10:36:24.570
101144      3           2007-11-06 09:19:26.157
101144      3           2007-11-06 09:24:28.200
101144      4           2010-09-27 14:11:03.287
101144      4           2014-01-27 17:31:58.077

      

The result of the final table should be:

ObjectID   ObjectState     DateOfEntry
101144      1           2007-08-14 12:39:30.587
101144      3           2007-08-14 12:44:06.403
101144      4           2007-08-14 17:28:59.280
101144      3           2007-08-14 17:32:26.313
101144      4           2010-09-27 14:11:03.287

      

I tried to use RANK()

, but the problem is I can't just sort by ObjectState

, because the values ObjectState

might be repeated out of order. I have to order them with DateOfEntry

. But if I do RANK() OVER(ORDER BY DateOfEntry)

, then I basically end up with line numbering.

How can I create a SQL query that will allow me to order by DateOfEntry

, but then group by ObjectState

, so that I can delete all rows within this "object state group" except for the minimum group?

+3


source to share


3 answers


An alternative solution that I finally thought of using LAG () would eliminate CTEs.

DELETE @Audits
FROM @Audits a1
INNER JOIN (SELECT ObjectID, DateOfEntry
                FROM (SELECT ObjectID, DateOfEntry, ObjectState, 
                        LAG(ObjectState) OVER(PARTITION BY ObjectID ORDER BY DateOfEntry) AS [PreviousObjectState]
                          FROM @Audits) AS Audits
             WHERE Audits.ObjectState = PreviousObjectState
             ) a2
 ON a2.ObjectID = a1.ObjectID AND a2.DateOfEntry = a1.DateOfEntry

SELECT * FROM @Audits

      

Long proof version (I duplicated dataset with a different id to check the section, works as expected)



DECLARE @Audits TABLE (ObjectID INT, ObjectState INT, DateOfEntry DATETIME)
INSERT @Audits
    SELECT 101144,1,'2007-08-14 12:39:30.587' UNION ALL
    SELECT 101144,1,'2007-08-14 12:41:52.620' UNION ALL
    SELECT 101144,1,'2007-08-14 12:42:11.150' UNION ALL
    SELECT 101144,1,'2007-08-14 12:42:24.197' UNION ALL
    SELECT 101144,3,'2007-08-14 12:44:06.403' UNION ALL
    SELECT 101144,3,'2007-08-14 12:44:06.467' UNION ALL
    SELECT 101144,3,'2007-08-14 12:46:12.573' UNION ALL
    SELECT 101144,3,'2007-08-14 12:50:51.670' UNION ALL
    SELECT 101144,3,'2007-08-14 12:50:51.750' UNION ALL
    SELECT 101144,3,'2007-08-14 12:56:34.330' UNION ALL
    SELECT 101144,4,'2007-08-14 17:28:59.280' UNION ALL
    SELECT 101144,3,'2007-08-14 17:32:26.313' UNION ALL
    SELECT 101144,3,'2007-08-14 17:32:48.720' UNION ALL
    SELECT 101144,3,'2007-08-14 17:45:07.460' UNION ALL
    SELECT 101144,3,'2007-08-14 17:46:31.740' UNION ALL
    SELECT 101144,3,'2007-08-14 17:47:04.380' UNION ALL
    SELECT 101144,3,'2007-08-14 17:47:29.507' UNION ALL
    SELECT 101144,3,'2007-08-14 17:49:13.460' UNION ALL
    SELECT 101144,3,'2007-08-14 17:54:15.320' UNION ALL
    SELECT 101144,3,'2007-08-14 17:55:57.540' UNION ALL
    SELECT 101144,3,'2007-08-14 19:50:11.913' UNION ALL
    SELECT 101144,3,'2007-08-14 19:53:10.820' UNION ALL
    SELECT 101144,3,'2007-08-14 20:03:44.900' UNION ALL
    SELECT 101144,3,'2007-08-16 10:34:56.477' UNION ALL
    SELECT 101144,3,'2007-08-16 10:36:06.477' UNION ALL
    SELECT 101144,3,'2007-08-16 10:36:24.570' UNION ALL
    SELECT 101144,3,'2007-11-06 09:19:26.157' UNION ALL
    SELECT 101144,3,'2007-11-06 09:24:28.200' UNION ALL
    SELECT 101144,4,'2010-09-27 14:11:03.287' UNION ALL
    SELECT 101144,4,'2014-01-27 17:31:58.077' UNION ALL
    SELECT 101145,1,'2007-08-14 12:39:30.587' UNION ALL
    SELECT 101145,1,'2007-08-14 12:41:52.620' UNION ALL
    SELECT 101145,1,'2007-08-14 12:42:11.150' UNION ALL
    SELECT 101145,1,'2007-08-14 12:42:24.197' UNION ALL
    SELECT 101145,3,'2007-08-14 12:44:06.403' UNION ALL
    SELECT 101145,3,'2007-08-14 12:44:06.467' UNION ALL
    SELECT 101145,3,'2007-08-14 12:46:12.573' UNION ALL
    SELECT 101145,3,'2007-08-14 12:50:51.670' UNION ALL
    SELECT 101145,3,'2007-08-14 12:50:51.750' UNION ALL
    SELECT 101145,3,'2007-08-14 12:56:34.330' UNION ALL
    SELECT 101145,4,'2007-08-14 17:28:59.280' UNION ALL
    SELECT 101145,3,'2007-08-14 17:32:26.313' UNION ALL
    SELECT 101145,3,'2007-08-14 17:32:48.720' UNION ALL
    SELECT 101145,3,'2007-08-14 17:45:07.460' UNION ALL
    SELECT 101145,3,'2007-08-14 17:46:31.740' UNION ALL
    SELECT 101145,3,'2007-08-14 17:47:04.380' UNION ALL
    SELECT 101145,3,'2007-08-14 17:47:29.507' UNION ALL
    SELECT 101145,3,'2007-08-14 17:49:13.460' UNION ALL
    SELECT 101145,3,'2007-08-14 17:54:15.320' UNION ALL
    SELECT 101145,3,'2007-08-14 17:55:57.540' UNION ALL
    SELECT 101145,3,'2007-08-14 19:50:11.913' UNION ALL
    SELECT 101145,3,'2007-08-14 19:53:10.820' UNION ALL
    SELECT 101145,3,'2007-08-14 20:03:44.900' UNION ALL
    SELECT 101145,3,'2007-08-16 10:34:56.477' UNION ALL
    SELECT 101145,3,'2007-08-16 10:36:06.477' UNION ALL
    SELECT 101145,3,'2007-08-16 10:36:24.570' UNION ALL
    SELECT 101145,3,'2007-11-06 09:19:26.157' UNION ALL
    SELECT 101145,3,'2007-11-06 09:24:28.200' UNION ALL
    SELECT 101145,4,'2010-09-27 14:11:03.287' UNION ALL
    SELECT 101145,4,'2014-01-27 17:31:58.077'

DELETE @Audits
FROM @Audits a1
INNER JOIN (SELECT ObjectID, DateOfEntry
                FROM (SELECT ObjectID, DateOfEntry, ObjectState, 
                        LAG(ObjectState) OVER(PARTITION BY ObjectID ORDER BY DateOfEntry) AS [PreviousUserState]
                          FROM @Audits) AS Audits
             WHERE Audits.ObjectState = PreviousUserState
             ) a2
 ON a2.ObjectID = a1.ObjectID AND a2.DateOfEntry = a1.DateOfEntry

SELECT * FROM @Audits

      

Sets this output

ObjectID    ObjectState DateOfEntry
----------- ----------- -----------------------
101144      1           2007-08-14 12:39:30.587
101144      3           2007-08-14 12:44:06.403
101144      4           2007-08-14 17:28:59.280
101144      3           2007-08-14 17:32:26.313
101144      4           2010-09-27 14:11:03.287
101145      1           2007-08-14 12:39:30.587
101145      3           2007-08-14 12:44:06.403
101145      4           2007-08-14 17:28:59.280
101145      3           2007-08-14 17:32:26.313
101145      4           2010-09-27 14:11:03.287

      

0


source


The short answer is:

; WITH Records AS (
    SELECT
        ObjectId,
        ObjectState,
        DateOfEntry,
        ROW_NUMBER() OVER (PARTITION BY ObjectID ORDER BY DateOfEntry) AS RowNum
    FROM @Audits
)
    DELETE R2
    FROM Records R1
        INNER JOIN Records R2
            ON R1.ObjectId = R2.ObjectId
                AND R1.ObjectState = R2.ObjectState
                AND R1.RowNum + 1 = R2.RowNum

      

Proof of Solution



DECLARE @Audits TABLE (ObjectID INT, ObjectState INT, DateOfEntry DATETIME)
INSERT @Audits
    SELECT 101144,1,'2007-08-14 12:39:30.587' UNION ALL
    SELECT 101144,1,'2007-08-14 12:41:52.620' UNION ALL
    SELECT 101144,1,'2007-08-14 12:42:11.150' UNION ALL
    SELECT 101144,1,'2007-08-14 12:42:24.197' UNION ALL
    SELECT 101144,3,'2007-08-14 12:44:06.403' UNION ALL
    SELECT 101144,3,'2007-08-14 12:44:06.467' UNION ALL
    SELECT 101144,3,'2007-08-14 12:46:12.573' UNION ALL
    SELECT 101144,3,'2007-08-14 12:50:51.670' UNION ALL
    SELECT 101144,3,'2007-08-14 12:50:51.750' UNION ALL
    SELECT 101144,3,'2007-08-14 12:56:34.330' UNION ALL
    SELECT 101144,4,'2007-08-14 17:28:59.280' UNION ALL
    SELECT 101144,3,'2007-08-14 17:32:26.313' UNION ALL
    SELECT 101144,3,'2007-08-14 17:32:48.720' UNION ALL
    SELECT 101144,3,'2007-08-14 17:45:07.460' UNION ALL
    SELECT 101144,3,'2007-08-14 17:46:31.740' UNION ALL
    SELECT 101144,3,'2007-08-14 17:47:04.380' UNION ALL
    SELECT 101144,3,'2007-08-14 17:47:29.507' UNION ALL
    SELECT 101144,3,'2007-08-14 17:49:13.460' UNION ALL
    SELECT 101144,3,'2007-08-14 17:54:15.320' UNION ALL
    SELECT 101144,3,'2007-08-14 17:55:57.540' UNION ALL
    SELECT 101144,3,'2007-08-14 19:50:11.913' UNION ALL
    SELECT 101144,3,'2007-08-14 19:53:10.820' UNION ALL
    SELECT 101144,3,'2007-08-14 20:03:44.900' UNION ALL
    SELECT 101144,3,'2007-08-16 10:34:56.477' UNION ALL
    SELECT 101144,3,'2007-08-16 10:36:06.477' UNION ALL
    SELECT 101144,3,'2007-08-16 10:36:24.570' UNION ALL
    SELECT 101144,3,'2007-11-06 09:19:26.157' UNION ALL
    SELECT 101144,3,'2007-11-06 09:24:28.200' UNION ALL
    SELECT 101144,4,'2010-09-27 14:11:03.287' UNION ALL
    SELECT 101144,4,'2014-01-27 17:31:58.077'

; WITH Records AS (
    SELECT
        ObjectId,
        ObjectState,
        DateOfEntry,
        ROW_NUMBER() OVER (PARTITION BY ObjectID ORDER BY DateOfEntry) AS RowNum
    FROM @Audits
)
    DELETE R2
    FROM Records R1
        INNER JOIN Records R2
            ON R1.ObjectId = R2.ObjectId
                AND R1.ObjectState = R2.ObjectState
                AND R1.RowNum + 1 = R2.RowNum
SELECT * FROM @Audits

      

Sets this output

ObjectID    ObjectState DateOfEntry
----------- ----------- -----------------------
101144      1           2007-08-14 12:39:30.587
101144      3           2007-08-14 12:44:06.403
101144      4           2007-08-14 17:28:59.280
101144      3           2007-08-14 17:32:26.313
101144      4           2010-09-27 14:11:03.287

      

+3


source


If @table is your table maybe below sql will help you. I assumed sorting DateOfEntry.

DELETE      B
FROM        (SELECT *,ROW_NUMBER() OVER(ORDER BY DateOfEntry) [ROW] FROM @table) A
LEFT JOIN   (SELECT *,ROW_NUMBER() OVER(ORDER BY DateOfEntry) [ROW] FROM @table) B 
            ON  A.[Row] = B.[Row] - 1 
            AND ABS(a.ObjectState - b.ObjectState) = 0

      

Result before and after Me

enter image description here

Note. Please disregard the data in the DateOfEntry column. I have kept it for convenience.

0


source







All Articles