Using an aggregate function that resets after a condition is met?
I am working with event data and am trying to work out the time spent in each application by summing the differences between consecutive timestamps. The problem is that I need to reset this running sum every time the value of the packageName column changes. I've tried the following:
SELECT
SUM(timeDifference) OVER(PARTITION BY packageName ORDER BY sNumber, timestamp) as accTime,
*
FROM table.name
ORDER BY
sNumber, timestamp
However, the window remembers too much: when a package shows up again later, the sum continues from its earlier total. I need the aggregation to start over after each consecutive section instead of remembering the previous results and accumulating on top of them.
My question is whether there is a way to reset it. Below are examples of what I get and what I want. Any help would be much appreciated.
What I get:
**accTime diff packageName**
10 10 com.package.1
20 20 com.package.1
10 10 com.package.2
20 20 com.package.2
30 10 com.package.1
What I want:
**accTime diff packageName**
10 10 com.package.1
20 20 com.package.1
10 10 com.package.2
20 20 com.package.2
10 10 com.package.1
The second example shows the accumulated time for com.package.1 being reset when that package reappears, which is the part I need help with.
To explain further, here's a sample of the raw data:
**timestamp packageName sNumber eventID diff**
1433119125117 com.package.1 xx123xx event1 null
1433119125200 com.package.1 xx123xx event2 83
1433119125400 com.package.2 xx123xx event3 200
1433119125600 com.package.2 xx123xx event4 200
1433119125800 com.package.1 xx123xx event5 200
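To pin down the expected result independently of any SQL dialect, here is a minimal Python sketch of the reset behaviour on the sample rows above. Treating a null diff as 0 is an assumption, and the function name `running_total_with_reset` is made up for illustration.

```python
from itertools import groupby

# Sample rows from the question: (timestamp, packageName, diff).
# None stands for the null diff on the first event.
rows = [
    (1433119125117, "com.package.1", None),
    (1433119125200, "com.package.1", 83),
    (1433119125400, "com.package.2", 200),
    (1433119125600, "com.package.2", 200),
    (1433119125800, "com.package.1", 200),
]


def running_total_with_reset(rows):
    """Return one running total per row, restarting whenever packageName changes."""
    ordered = sorted(rows, key=lambda r: r[0])  # order by timestamp
    totals = []
    # groupby yields runs of *consecutive* equal keys, so a package that
    # reappears later starts a fresh run instead of continuing the old one.
    for _, run in groupby(ordered, key=lambda r: r[1]):
        acc = 0
        for _, _, diff in run:
            acc += diff or 0  # assumption: a null diff counts as 0
            totals.append(acc)
    return totals


acc_time = running_total_with_reset(rows)
print(acc_time)  # [0, 83, 200, 400, 200]
```

The last value is 200 rather than 283 because the final com.package.1 row belongs to a new run.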
Using the LAG function (you'll notice my answer is similar to Pentium's), I THINK this is what you want ...
I'm not 100% sure, since your accTime seems to behave strangely relative to its diff ... to me, accTime should be the previous accTime + diff, no? (If I'm wrong, tell me how the query should behave instead; it's easy to adjust :))
SELECT
timestamp,package,sNumber,eventID,diff,
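-- First row overall: there is no previous package, so start at 0.
CASE WHEN lagPackage IS NULL THEN 0
-- Package changed: restart from the current diff.
WHEN package != lagPackage THEN diff
-- Same package: current diff plus the previous row's diff only,
-- so this equals a full running total only for runs of up to two rows.
ELSE (diff + IF(lagDiff IS NULL, 0, lagDiff)) END AS accTime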
FROM (
SELECT
*,
LAG(package,1) OVER (ORDER BY timestamp) AS lagPackage,
LAG(diff,1,0) OVER (ORDER BY timestamp) AS lagDiff
FROM (
SELECT
1433119125117 AS timestamp,
'com.package.1' AS package,
'xxx123xxx' AS sNumber,
'event1' AS eventID,
NULL AS diff),
(
SELECT
1433119125200 AS timestamp,
'com.package.1' AS package,
'xxx123xxx' AS sNumber,
'event2' AS eventID,
83 AS diff),
(
SELECT
1433119125400 AS timestamp,
'com.package.2' AS package,
'xxx123xxx' AS sNumber,
'event3' AS eventID,
200 AS diff),
(
SELECT
1433119125600 AS timestamp,
'com.package.2' AS package,
'xxx123xxx' AS sNumber,
'event4' AS eventID,
200 AS diff),
(
SELECT
1433119125800 AS timestamp,
'com.package.1' AS package,
'xxx123xxx' AS sNumber,
'event5' AS eventID,
200 AS diff),
ORDER BY
timestamp )
For the sample rows you gave, this returns:
Row timestamp package sNumber eventID diff accTime
1 1433119125117 com.package.1 xxx123xxx event1 null 0
2 1433119125200 com.package.1 xxx123xxx event2 83 83
3 1433119125400 com.package.2 xxx123xxx event3 200 200
4 1433119125600 com.package.2 xxx123xxx event4 200 400
5 1433119125800 com.package.1 xxx123xxx event5 200 200
In the meantime I experimented with some sample data. This is not a complete answer, but it might help someone: it flags every row where the label changes.
select
pos,label,diff,
if (lag!=label or lag is null,1,0) as reset
from(
select
pos,label,diff,
LAG(label, 1) OVER (ORDER BY pos asc) lag,
from (select 10 as diff,'first' as label, 1 as pos),
(select 20 as diff,'first' as label, 2 as pos),
(select 10 as diff,'second' as label, 3 as pos),
(select 20 as diff,'second' as label, 4 as pos),
(select 10 as diff,'first' as label, 5 as pos),
(select 11 as diff,'first' as label, 6 as pos),
(select 12 as diff,'first' as label, 7 as pos),
order by pos
)
This returns:
+-----+-----+--------+------+-------+
| Row | pos | label  | diff | reset |
+-----+-----+--------+------+-------+
| 1   | 1   | first  | 10   | 1     |
| 2   | 2   | first  | 20   | 0     |
| 3   | 3   | second | 10   | 1     |
| 4   | 4   | second | 20   | 0     |
| 5   | 5   | first  | 10   | 1     |
| 6   | 6   | first  | 11   | 0     |
| 7   | 7   | first  | 12   | 0     |
+-----+-----+--------+------+-------+
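One way to finish this off is to turn the reset flag into a group id with a running sum, then sum diff within each group. The sketch below tries that idea in SQLite through Python's `sqlite3` module (window functions need SQLite 3.25 or newer), not in BigQuery. The table name `events` and the column `grp` are made up for illustration, and the syntax may need adjusting for other dialects.

```python
import sqlite3

# Same sample rows as in the query above: (pos, label, diff).
rows = [
    (1, "first", 10), (2, "first", 20),
    (3, "second", 10), (4, "second", 20),
    (5, "first", 10), (6, "first", 11), (7, "first", 12),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (pos INTEGER, label TEXT, diff INTEGER)")
conn.executemany("INSERT INTO events VALUES (?, ?, ?)", rows)

query = """
WITH flagged AS (
    -- reset = 1 where the label differs from the previous row,
    -- or where there is no previous row (LAG returns NULL)
    SELECT pos, label, diff,
           CASE WHEN LAG(label) OVER (ORDER BY pos) = label
                THEN 0 ELSE 1 END AS reset
    FROM events
),
grouped AS (
    -- running count of resets = id of the current consecutive run
    SELECT pos, label, diff,
           SUM(reset) OVER (ORDER BY pos) AS grp
    FROM flagged
)
SELECT pos, label, diff,
       SUM(diff) OVER (PARTITION BY grp ORDER BY pos) AS accTime
FROM grouped
ORDER BY pos
"""

result = conn.execute(query).fetchall()
for row in result:
    print(row)
# accTime column: 10, 30, 10, 30, 10, 21, 33
```

Partitioning by `grp` instead of by label is what makes the second "first" section start again from 10.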