MySQL - Summing Interval Spans

I ran into the following problem:

I would like to sum the hours for each name, giving the total time covered by the intervals between START and END. It would be simple if I could just subtract the start from the end for each entry, but the intervals can overlap. For example, Mary started one action at 13:00 and finished at 15:00, and she also had another action running from 14:00 to 16:00. I would like the result to be 3 (she used 3 hours of her time to do both activities).

For example:

Name    |    START               |    END                 |
-----------------------------------------------------------
KATE    | 2014-01-01 13:00:00    | 2014-01-01 14:00:00    |
MARY    | 2014-01-01 13:00:00    | 2014-01-01 15:00:00    |
TOM     | 2014-01-01 13:00:00    | 2014-01-01 16:00:00    |
KATE    | 2014-01-01 12:00:00    | 2014-01-02 04:00:00    |
MARY    | 2014-01-01 14:00:00    | 2014-01-01 16:00:00    |
TOM     | 2014-01-01 12:00:00    | 2014-01-01 18:00:00    |
TOM     | 2014-01-01 22:00:00    | 2014-01-02 02:00:00    |

      

result:

KATE    16 hours
MARY    3 hours
TOM     10 hours

      



2 answers


Have you tried a GROUP BY with an aggregate function?

SELECT Name, SUM(UNIX_TIMESTAMP(End) - UNIX_TIMESTAMP(Start)) FROM myTable
GROUP BY Name 

      



This returns the total number of seconds across the intervals for each name. You can then divide by 3600 to display hours.

I also highly recommend grouping by a primary key or some other ID instead of the name string, but I realize the example may just be simplified for the question.
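One thing to watch with a plain sum is that time covered by two overlapping rows is counted twice. The following is a minimal Python sketch (not MySQL; the `naive_hours` helper is hypothetical and only illustrates the arithmetic) using MARY's two rows from the question:

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"

# MARY's two rows from the question's table.
mary = [
    ("2014-01-01 13:00:00", "2014-01-01 15:00:00"),
    ("2014-01-01 14:00:00", "2014-01-01 16:00:00"),
]

def naive_hours(rows):
    """Sum END - START for every row, ignoring any overlap between rows."""
    total_seconds = sum(
        (datetime.strptime(end, FMT) - datetime.strptime(start, FMT)).total_seconds()
        for start, end in rows
    )
    return total_seconds / 3600

print(naive_hours(mary))  # 4.0: the hour from 14:00 to 15:00 is counted twice
```

The question asks for 3 hours here, so the overlapping part has to be merged before summing.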



I found this problem interesting, so I spent a little more time developing a solution. What I came up with involves sorting the rows by name and start time, and then using MySQL user variables to account for overlapping ranges. I start by sorting the table and adding columns that carry the name and times from one row to the next:

SELECT [expounded below]
FROM (SELECT * FROM tbl ORDER BY Name, START, END) AS u,
     (SELECT  @x := 0, @gap := 0, @same_name:='',
              @beg := (SELECT MIN(START) FROM tbl),
              @end := (SELECT MAX(END) FROM tbl)) AS t

      

This attaches the name and the outer bounds of the time range to each row of the table, and also sorts the table so that rows with the same name are adjacent and ordered by start time. For each row we now have @same_name, @beg and @end carrying values forward from one row to the next, while @x and @gap accumulate hours.

Now we need to reason about the overlaps that might occur. Any two intervals either do not overlap, overlap partially, or one lies entirely inside the other:

Non-overlapping:   beg--------end      START-------END

Overlapping:  beg-----------end                                beg---------end
                    START--------------END          START-----------END

Subset: beg---------------------------------end
              START-----END

      

When two rows are adjacent, we can decide whether their ranges overlap by comparing their start and end points. They overlap if each one begins no later than the other ends:

IF( @end >= START AND @beg <= END,
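The same overlap test can be tried outside the database. This is a small Python sketch; the function name `overlaps` and the helper `at` are assumptions made for illustration, with `beg`/`end` standing in for the running interval and `start`/`stop` for the current row:

```python
from datetime import datetime

def overlaps(beg, end, start, stop):
    """True when [beg, end] and [start, stop] share at least one instant."""
    return end >= start and beg <= stop

def at(hour):
    """Hypothetical helper: a timestamp on 2014-01-01 at the given hour."""
    return datetime(2014, 1, 1, hour)

print(overlaps(at(13), at(15), at(14), at(16)))  # True: MARY's two rows
print(overlaps(at(12), at(18), at(22), at(23)))  # False: disjoint ranges
print(overlaps(at(13), at(14), at(14), at(15)))  # True: touching ends count
```

Because the comparison uses `>=` and `<=`, intervals that merely touch are treated as overlapping, which is harmless when the goal is to sum covered time.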

      

If they overlap, then the combined span is the difference between the outer edges of the two intervals:

TIMESTAMPDIFF(HOUR, LEAST(@beg, START), GREATEST(@end, END))
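As a rough Python equivalent of that expression (a sketch only; `merged_hours` is a hypothetical name, and whole hours are obtained by integer division to mirror the integer result of TIMESTAMPDIFF with the HOUR unit):

```python
from datetime import datetime

def merged_hours(beg, end, start, stop):
    """Whole hours between the outer edges of two overlapping intervals."""
    outer_start = min(beg, start)   # plays the role of LEAST(@beg, START)
    outer_end = max(end, stop)      # plays the role of GREATEST(@end, END)
    return int((outer_end - outer_start).total_seconds() // 3600)

# MARY's two rows: 13:00-15:00 and 14:00-16:00.
hours = merged_hours(
    datetime(2014, 1, 1, 13), datetime(2014, 1, 1, 15),
    datetime(2014, 1, 1, 14), datetime(2014, 1, 1, 16),
)
print(hours)  # 3
```

This is only valid when the two intervals really do overlap; for disjoint intervals the outer edges would also include the empty time between them.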

      



If they don't overlap, we can just add the new interval's length to the previous total.

We also need to know the gap between the intervals, which is the difference from the end of the first to the beginning of the second. This is necessary to calculate the hours when there are more than two intervals and only some of them overlap.

1-----------2           3----------4
                        3--------------------5
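The diagram can be worked through numerically. In this Python sketch the endpoints are plain numbers taken from the diagram's labels (1 to 5), and `covered_hours` is a hypothetical helper that subtracts the accumulated gaps from the outer span:

```python
def covered_hours(intervals):
    """Outer span minus the gaps between intervals (endpoints as plain numbers)."""
    intervals = sorted(intervals)
    span_start = intervals[0][0]
    reach = intervals[0][1]        # furthest end point seen so far
    gaps = 0
    for start, end in intervals[1:]:
        if start > reach:          # nothing covers the time from reach to start
            gaps += start - reach
        reach = max(reach, end)
    return (reach - span_start) - gaps

# 1-2 stands alone; 3-4 and 3-5 overlap each other.
print(covered_hours([(1, 2), (3, 4), (3, 5)]))  # 3
```

The outer span is 5 - 1 = 4 and the single gap is 3 - 2 = 1, which leaves 3 covered hours.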

      

Combining this, we get a per-row computation where each row computes the union of its interval with the one above it. Each variable has to be reset when the name changes:

SELECT Name, START, END,

   @x := IF(@same_name = Name,
            IF( @end >= START AND @beg <= END, -- does it overlap?
                TIMESTAMPDIFF(HOUR, LEAST(@beg, START), GREATEST(@end, END)),
                @x + TIMESTAMPDIFF(HOUR, START, END) ),
            TIMESTAMPDIFF(HOUR, START, END) ) AS hr,

   @gap := IF(@same_name = Name,
                IF(@end >= START AND @beg <= END,  -- does it overlap?
                    @gap,
                    @gap + TIMESTAMPDIFF(HOUR, @end, START)),
                0) AS gap,

   @beg := IF(@same_name = Name,
                CAST(LEAST(@beg, START) AS DATETIME), -- expand interval
                START) AS beg,                        -- reset interval

   @end := IF(@same_name = Name,
                CAST(GREATEST(@end, END) AS DATETIME), -- expand interval
                END) AS finish,                        -- reset interval
   @same_name := Name AS sameName
FROM
   (SELECT * FROM tbl ORDER BY Name, START, END) AS u,
   (SELECT  @x := 0, @gap := 0, @same_name := '',
            @beg := (SELECT MIN(START) FROM tbl),
            @end := (SELECT MAX(END) FROM tbl)) AS t

      

This still gives us as many rows as there were in the original table. Hours and gaps accumulate for each name, so we have to select the highest values and group by name:

SELECT Name, MAX(hr) - MAX(gap) AS HOURS
 FROM ( [insert above query here] ) AS intermediateCalculation
GROUP BY Name;

      

Edit: And of course, a moment after posting, it occurs to me that (a) there is a bug for names that have no overlapping intervals; and (b) all @x does is build an interval from MIN(START) to MAX(END) for each name, which can be done with a simpler query and a join. I'll leave that as an exercise for the reader. :-)
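For checking results from any query against the question's table, the merge can also be done in application code. What follows is a Python sketch of a standard sort-and-merge pass, not the MySQL query described above; the `ROWS` list copies the question's data and `hours_per_name` is a hypothetical helper:

```python
from datetime import datetime
from itertools import groupby

FMT = "%Y-%m-%d %H:%M:%S"

# The seven rows from the question's table.
ROWS = [
    ("KATE", "2014-01-01 13:00:00", "2014-01-01 14:00:00"),
    ("MARY", "2014-01-01 13:00:00", "2014-01-01 15:00:00"),
    ("TOM", "2014-01-01 13:00:00", "2014-01-01 16:00:00"),
    ("KATE", "2014-01-01 12:00:00", "2014-01-02 04:00:00"),
    ("MARY", "2014-01-01 14:00:00", "2014-01-01 16:00:00"),
    ("TOM", "2014-01-01 12:00:00", "2014-01-01 18:00:00"),
    ("TOM", "2014-01-01 22:00:00", "2014-01-02 02:00:00"),
]

def hours_per_name(rows):
    """Merge each name's intervals and return the covered hours per name."""
    parsed = sorted(
        (name, datetime.strptime(start, FMT), datetime.strptime(end, FMT))
        for name, start, end in rows
    )
    totals = {}
    for name, group in groupby(parsed, key=lambda row: row[0]):
        seconds = 0.0
        cur_start = cur_end = None
        for _, start, end in group:
            if cur_end is None or start > cur_end:
                # Disjoint from the running interval: bank it and start anew.
                if cur_end is not None:
                    seconds += (cur_end - cur_start).total_seconds()
                cur_start, cur_end = start, end
            else:
                # Overlapping or touching: extend the running interval.
                cur_end = max(cur_end, end)
        seconds += (cur_end - cur_start).total_seconds()
        totals[name] = seconds / 3600
    return totals

for name, hours in hours_per_name(ROWS).items():
    print(name, hours)
```

Sorting by name and start time means each row only has to be compared with the running merged interval, so names with disjoint intervals and names with overlapping ones go through the same code path. For MARY the two rows merge into 13:00 to 16:00, which is the 3 hours the question asks for.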
