PostgresQL: Help required
Consider a table with the following PostgresQL database table structure (split here for demonstration purpose):
time: Timestamp
type: Text
value: Integer
The table fills up with many events over time. We now need a SQL database for reporting that aggregates values over time (for example, every hour) and averages and counts them for a specific type. So the report looks like this:
[example report for type="standard"]
Time Count Avg
00:00 30 20
01:00 12 24
02:00 9 19
...
This is pretty straight forward so far, so the statement for the above report is:
select extract(hour from time) time, count(1), avg(value)
from reportdata
where type = 'standard'
group by time;
Now comes the tricky part - I need to show a report for each type with the total of all types and the percentage in relation to the specific type in each time interval. To do this, I need an instruction that creates a row for each time frame and each possible type (which can be selected from a separate table containing all possible types) from which the code can retrieve the report tab for each type without having to query the database again. Thus, the result should look like this (note the "empty" lines for the time interval, where no values were found for this type):
[example report for all types assuming there are 3 possible types]
Time Type Total Count Percent Avg
00:00 standard 40 30 0.75 20
00:00 special 40 10 0.25 8
00:00 super 40 0 0 0
01:00 standard 12 12 1.0 24
01:00 special 12 0 0 0
01:00 super 12 0 0 0
02:00 standard 9 3 0.33 20
02:00 special 9 0 0 0
02:00 super 9 6 0.67 15
...
What would the expression of this product look like?
source to share
select
s.hour as "Time",
s.type as "Type",
s.total as "Total",
coalesce(r.total, 0) as "Count",
round(coalesce(r.total, 0) * 1.0/s.total, 2) as "Percent",
round(coalesce(r.avg, 0), 2) as "Avg"
from (
select
date_trunc('hour', time) as hour,
type,
count(*) as total,
avg(value) as avg
from reportdata
group by hour, type
) r
right outer join (
select
date_trunc('hour', time) as hour,
t.type,
count(*) as total
from reportdata
inner join type t on true
group by hour, t.type
) s on s.hour = r.hour and s.type = r.type
order by s.hour, s.type
;
Time | Type | Total | Count | Percent | Avg
---------------------+----------+-------+-------+---------+-------
2012-04-02 00:00:00 | special | 40 | 10 | 0.25 | 8.00
2012-04-02 00:00:00 | standard | 40 | 30 | 0.75 | 20.00
2012-04-02 00:00:00 | super | 40 | 0 | 0.00 | 0.00
2012-04-02 01:00:00 | special | 12 | 0 | 0.00 | 0.00
2012-04-02 01:00:00 | standard | 12 | 12 | 1.00 | 24.00
2012-04-02 01:00:00 | super | 12 | 0 | 0.00 | 0.00
2012-04-02 02:00:00 | special | 9 | 0 | 0.00 | 0.00
2012-04-02 02:00:00 | standard | 9 | 3 | 0.33 | 20.00
2012-04-02 02:00:00 | super | 9 | 6 | 0.67 | 15.00
(9 rows)
I used date_trunc
in timestamp because I think you want to separate every hour of every day. If combining every hour of all days is what you want, just go back to yourextract
Update as per the new requirement in the comment:
select
s.hour as "Time",
s.type as "Type",
s.total as "Total",
coalesce(r.total, 0) as "Count",
case s.total when 0 then round(0, 2) else
round(coalesce(r.total, 0) * 1.0/s.total, 2)
end as "Percent",
round(coalesce(r.avg, 0), 2) as "Avg"
from (
select
date_trunc('hour', time) as hour,
type,
count(*) as total,
avg(value) as avg
from reportdata
group by hour, type
) r
right outer join (
select
date_trunc('hour', d) as hour,
t.type,
count(r.time) as total
from reportdata r
right outer join (
select d
from generate_series(
(select min(time) from reportdata),
(select max(time) from reportdata),
'1 hour'
) g(d)
) g on date_trunc('hour', g.d) = date_trunc('hour', r.time)
inner join type t on true
group by hour, t.type
) s on s.hour = r.hour and s.type = r.type
order by s.hour, s.type
;
Time | Type | Total | Count | Percent | Avg
---------------------+----------+-------+-------+---------+-------
2012-04-01 22:00:00 | special | 1 | 0 | 0.00 | 0.00
2012-04-01 22:00:00 | standard | 1 | 1 | 1.00 | 10.00
2012-04-01 22:00:00 | super | 1 | 0 | 0.00 | 0.00
2012-04-01 23:00:00 | special | 0 | 0 | 0.00 | 0.00
2012-04-01 23:00:00 | standard | 0 | 0 | 0.00 | 0.00
2012-04-01 23:00:00 | super | 0 | 0 | 0.00 | 0.00
2012-04-02 00:00:00 | special | 40 | 10 | 0.25 | 8.00
2012-04-02 00:00:00 | standard | 40 | 30 | 0.75 | 20.00
2012-04-02 00:00:00 | super | 40 | 0 | 0.00 | 0.00
2012-04-02 01:00:00 | special | 12 | 0 | 0.00 | 0.00
2012-04-02 01:00:00 | standard | 12 | 12 | 1.00 | 24.00
2012-04-02 01:00:00 | super | 12 | 0 | 0.00 | 0.00
2012-04-02 02:00:00 | special | 9 | 0 | 0.00 | 0.00
2012-04-02 02:00:00 | standard | 9 | 3 | 0.33 | 20.00
2012-04-02 02:00:00 | super | 9 | 6 | 0.67 | 15.00
2012-04-02 03:00:00 | special | 0 | 0 | 0.00 | 0.00
2012-04-02 03:00:00 | standard | 0 | 0 | 0.00 | 0.00
2012-04-02 03:00:00 | super | 0 | 0 | 0.00 | 0.00
2012-04-02 04:00:00 | special | 1 | 0 | 0.00 | 0.00
2012-04-02 04:00:00 | standard | 1 | 1 | 1.00 | 10.00
2012-04-02 04:00:00 | super | 1 | 0 | 0.00 | 0.00
(21 rows)
source to share
Something like that?
select extract(hour from rd.time) time,
at.type,
count(at.value) over (partition by extract(hour from time)) as total,
count(at.value) over (partition by rd.type) as count,
avg(value) over (partition by rd.type) as avg,
from all_types at
left join reportdata rd on at.type = rd.type
group by time, at.type;
( all_types
is "a separate table containing all possible types")
source to share