SQL returns second difference between two strings
This question is related to SQL Server. I have the following table:
id size batch code product code additiontime
--------------------------------------------------------
1 91 55555 BigD Red 2017-05-15 13:00:00
2 91 55555 BigD Red 2017-05-15 13:00:05
3 94 55555 BigD Red 2017-05-15 13:00:15
4 91 44444 BigD Blue 2017-05-15 14:10:00
5 92 44444 BigD Blue 2017-05-15 14:15:00
6 93 44444 BigD Blue 2017-05-15 14:20:00
7 94 44444 BigD Blue 2017-05-15 14:30:00
8 91 33333 BigD Orange 2017-05-15 15:00:00
9 91 33333 BigD Orange 2017-05-15 15:00:10
10 94 33333 BigD Orange 2017-05-15 15:00:15
When I execute this SQL statement:
select *
from mytable y1
where size = 91
and not exists (select 1
from mytable y2
where y1.productcode = y2.productcode and y2.size = 92)
I get the following results:
id size batch code product code additiontime
--------------------------------------------------------
1 91 55555 BigD Red 2017-05-15 13:00:00
2 91 55555 BigD Red 2017-05-15 13:00:05
8 91 33333 BigD Orange 2017-05-15 15:00:00
9 91 33333 BigD Orange 2017-05-15 15:00:10
I would like to modify the above query to display one line in each batch code with an extra column added to show the difference in seconds between the first and second entry of each batch of batch codes, e.g .:
id size batch code product code additiontime seconds difference
--------------------------------------------------------------------------------------
1 91 55555 BigD Red 2017-05-15 13:00:00 5
8 91 33333 BigD Orange 2017-05-15 15:00:00 10
I tried the following SQL, which almost does the trick, but it returns multiple lines instead of one for each batch code:
WITH rows AS
(
SELECT
*,
ROW_NUMBER() OVER (ORDER BY AdditionTime) AS rn
FROM
mytable y1
WHERE
size = 91
AND NOT EXISTS (SELECT *
FROM mytable y2
WHERE y1.productcode = y2.productcode AND y2.size = 92)
)
SELECT DATEDIFF(second, mc.AdditionTime, mp.AdditionTime)
FROM rows mc
JOIN rows mp ON mc.rn = mp.rn - 1
My SQL knowledge is not great. I'm sure this may lead to correct results, but I've tried all the ways with no success. Thanks in advance for any pointers.
source to share
This gives you the result you want with a self-join over the result of cte. Also, I added PARTITION BY
in ROW_NUMBER()
:
CREATE TABLE #mytable
(
id INT ,
size INT ,
batchCode INT ,
productCode NVARCHAR(30) ,
additiontime DATETIME
);
INSERT INTO #mytable
( id, size, batchCode, productCode, additiontime )
VALUES ( 1, 91, 55555, 'BigD Red', '2017-05-15 13:00:00' ),
( 2, 91, 55555, 'BigD Red', '2017-05-15 13:00:05' ),
( 3, 94, 55555, 'BigD Red', '2017-05-15 13:00:15' ),
( 4, 91, 44444, 'BigD Blue', '2017-05-15 14:10:00' ),
( 5, 92, 44444, 'BigD Blue', '2017-05-15 14:15:00' ),
( 6, 93, 44444, 'BigD Blue', '2017-05-15 14:20:00' ),
( 7, 94, 44444, 'BigD Blue', '2017-05-15 14:30:00' ),
( 8, 91, 33333, 'BigD Orange', '2017-05-15 15:00:00' ),
( 9, 91, 33333, 'BigD Orange', '2017-05-15 15:00:10' ),
( 10, 94, 33333, 'BigD Orange', '2017-05-15 15:00:15' );
WITH rows
AS ( SELECT * ,
ROW_NUMBER() OVER ( PARTITION BY y1.batchCode ORDER BY additiontime ) AS rn
FROM #mytable y1
WHERE size = 91
AND NOT EXISTS ( SELECT *
FROM #mytable y2
WHERE y1.productCode = y2.productCode
AND y2.size = 92 )
)
SELECT t1.id ,
t1.size ,
t1.batchCode ,
t1.productCode ,
DATEDIFF(SECOND, t1.additiontime, t2.additiontime) SecondsDiff
FROM rows t1
INNER JOIN rows t2 ON t2.batchCode = t1.batchCode
AND t1.id != t2.id
WHERE t1.rn = 1;
DROP TABLE #mytable
Outputs:
id size batchCode productCode SecondsDiff
8 91 33333 BigD Orange 10
1 91 55555 BigD Red 5
Note. You should test with a large dataset to ensure accuracy and account for any other scenarios.
source to share
You are actually very close! :) The only thing missing is to add another condition in JOIN
CTE
so that you only compare strings with the same product code
as below:
WITH rows AS
(
SELECT *, ROW_NUMBER() OVER (ORDER BY AdditionTime) AS rn
FROM mytable y1
WHERE size = 91 AND NOT EXISTS (
SELECT *
FROM mytable y2
WHERE y1.productcode = y2.productcode and y2.size = 92)
)
SELECT mc.*, DATEDIFF(second, mc.AdditionTime, mp.AdditionTime) AS Diff
FROM rows mc JOIN rows mp ON mc.rn = mp.rn - 1 AND mc.ProductCode = mp.ProductCode
PS. Assuming you know you have two lines for the product.
source to share
;With cte(id,size,batchcode,productcode,additiontime)
AS
(
SELECT 1 , 91, 55555,'BigD Red' ,'2017-05-15 13:00:00' UNION ALL
SELECT 2 , 91, 55555,'BigD Red' ,'2017-05-15 13:00:05' UNION ALL
SELECT 3 , 94, 55555,'BigD Red' ,'2017-05-15 13:00:15' UNION ALL
SELECT 4 , 91, 44444,'BigD Blue' ,'2017-05-15 14:10:00' UNION ALL
SELECT 5 , 92, 44444,'BigD Blue' ,'2017-05-15 14:15:00' UNION ALL
SELECT 6 , 93, 44444,'BigD Blue' ,'2017-05-15 14:20:00' UNION ALL
SELECT 7 , 94, 44444,'BigD Blue' ,'2017-05-15 14:30:00' UNION ALL
SELECT 8 , 91, 33333,'BigD Orange','2017-05-15 15:00:00' UNION ALL
SELECT 9 , 91, 33333,'BigD Orange','2017-05-15 15:00:10' UNION ALL
SELECT 10, 94, 33333,'BigD Orange','2017-05-15 15:00:15'
)
SELECT id
,size
,batchcode
,productcode
,additiontime
,(SecondDiff - LEADadditiontimeSec) AS secondsDifference
FROM (
SELECT *
,DATEPART(SECOND, additiontime) LEADadditiontimeSec
,DATEPART(SECOND, LEADadditiontime) SecondDiff
FROM (
SELECT *
,LEAD(additiontime) OVER (
PARTITION BY batchcode
,size ORDER BY batchcode
) LEADadditiontime
FROM cte o
WHERE size = 91
AND NOT EXISTS ( SELECT * FROM cte i
WHERE o.productCode = i.productCode
AND i.size = 92 )
) Dt
) Final
WHERE Final.SecondDiff IS NOT NULL
ORDER BY 1
Output
id size batchcode productcode additiontime secondsDifference
------------------------------------------------------------------------------
1 91 55555 BigD Red 2017-05-15 13:00:00 5
8 91 33333 BigD Orange 2017-05-15 15:00:00 10
Hopefully the following query will do the job for you -
SELECT id, size, [batch code], [product code], additiontime, DATEDIFF(SECOND, additiontime, next_addition_time) AS [seconds difference]
FROM
(
SELECT *
, LEAD(additiontime) OVER (PARTITION BY [batch code] ORDER BY additiontime) AS next_addition_time
, ROW_NUMBER() OVER (PARTITION BY [batch code] ORDER BY additiontime) AS row_num
FROM mytable
) AS t
WHERE row_num = 1;
The subquery first puts the order of the line numbers, adding the time for each batch. It also stores the next value of the time of addition. The outer query then simply fetches the first row and the difference between the current and next added time.
Please note that you must move the subquery to a temporary table if the data is huge. Also you can do ORDER BY and provide filters in where clause.
source to share