Simplify timeline in SQL (Netezza)

I want to generalize / simplify (I don't know what to call) my timeline.

So I have an ID with a timeline. I am trying to get rid of the overlap timeline within the same id.

Here's some sample data. What do I have:

ID   START_TIME   END_TIME
 1        a          b
 1        c          d
 1        e          f
 1        g          h

      

enter image description here
As you can see from the picture, they [a,b],[c,d],[e,f]

overlap, [g,h]

not overlap, so I just want [a,f]

and [g,h]

. What I want:

ID   START_TIME   END_TIME
 1        a          f
 1        g          h

      

+3


source to share


2 answers


I think @ shA.T is close. The problem lies in the multiple overlaps that can break. You may need to turn this into a multi-step process

Step 1 (Make a sample table):

 create temp table stack (
 id integer
 ,start_time timestamp
 ,end_time timestamp
 )

 insert into stack  values(1, date('2020-01-01'),date('2020-01-01') + interval '3 hours');
 insert into stack  values(1,date('2020-01-01') + interval '2 hours',date('2020-01-01') + interval '4 hours');
 insert into stack  values(1,date('2020-01-01') + interval '3.5 hours',date('2020-01-01') + interval '5 hours');
 insert into stack  values(1,date('2020-01-01') + interval '5.5 hours',date('2020-01-01') + interval '6.5 hours');
 insert into stack  values(1,date('2020-01-01') + interval '7.5 hours',date('2020-01-01') + interval '9.5 hours');
 insert into stack  values(1,date('2020-01-01') + interval '8.5 hours',date('2020-01-01') + interval '10.5 hours');

      

Step 2 (find single slabs):



create temp table stack2 as
 SELECT ID, ps2 as start_time, max(e) AS End_Time
    FROM (
        SELECT t1.ID, t1.START_TIME AS s, MAX(t1.END_TIME) AS e,
               max(t2.START_TIME) As ps, MAX(t2.END_TIME) AS pe
               ,CASE WHEN pe between s and e THEN ps ELSE s END ps2
        FROM stack AS t1
        JOIN stack AS t2 ON t1.START_TIME > t2.START_TIME
        GROUP BY t1.ID, t1.START_TIME) AS DT
    GROUP BY
        ID, ps2
    ORDER BY ps2

      

Step 3 (Combining double slabs):

 SELECT ID, ps2 as start_time, max(e) AS End_Time
    FROM (
        SELECT t1.ID, t1.START_TIME AS s, MAX(t1.END_TIME) AS e,
               max(t2.START_TIME) As ps, MAX(t2.END_TIME) AS pe
               ,CASE WHEN pe between s and e THEN ps ELSE s END ps2
        FROM stack2 AS t1
        JOIN stack2 AS t2 ON t1.START_TIME > t2.START_TIME
        GROUP BY t1.ID, t1.START_TIME) AS DT
    GROUP BY
        ID, ps2
    ORDER BY ps2

      

+1


source


I found a solution without adding an extra column like this:



SELECT ID, MIN(CASE WHEN pe between s and e THEN ps ELSE s END) AS START_TIME, MAX(e) AS End_Time
FROM (
    SELECT t1.ID, t1.START_TIME AS s, t1.END_TIME AS e, 
           MAX(t2.START_TIME) As ps, MAX(t2.END_TIME) AS pe
    FROM t AS t1
    JOIN t AS t2 ON t1.START_TIME > t2.START_TIME
    GROUP BY t1.ID, t1.START_TIME, t1.END_TIME ) AS DT
GROUP BY
    ID, CASE WHEN pe between s and e THEN 1 ELSE 0 END
ORDER BY s

      

0


source







All Articles