Combining SQL Aggregates and Subquery Optimization

I am trying to get aggregated values over time periods from two relations (buys and uses) and combine them so that I can get the results in one report and also plot the ratio between them. I am using PostgreSQL. The required final report: dateTime, u.sum, b.sum, b.sum/u.sum.

The following query works, but scales very poorly with large table sizes.

SELECT b2.datetime AS dateTime, b2.sum AS BUY_VOLUME, u1.sum AS USE_VOLUME,
       CASE u1.sum
            WHEN 0 THEN 0
            ELSE (b2.sum / u1.sum)
       END AS buyToUseRatio
FROM (SELECT SUM(b.total / 100.0) AS sum,
             date_trunc('week', (b.datetime + INTERVAL '1 day')) - INTERVAL '1 day' AS datetime
      FROM buys AS b
      WHERE b.datetime > date_trunc('month', CURRENT_DATE) - INTERVAL '1 year'
      -- GROUP BY 2 groups by the truncated week; a bare "datetime" here
      -- would resolve to the raw input column and break the weekly rollup
      GROUP BY 2) AS b2
INNER JOIN (SELECT SUM(u.amount) / 100.0 AS sum,
                   date_trunc('week', (u.datetime + INTERVAL '1 day')) - INTERVAL '1 day' AS datetime
            FROM uses AS u
            WHERE u.datetime > date_trunc('month', CURRENT_DATE) - INTERVAL '1 year'
            GROUP BY 2) AS u1 ON b2.datetime = u1.datetime
ORDER BY b2.datetime ASC;

I was wondering if someone could help me by providing an alternative query that produces the same end result but executes faster.

I appreciate any help on this :-) My SQL is a bit rusty and I can't think of any other way to do this besides creating indexes. Thanks in advance.


2 answers


At least these indexes can help your query:

create index idx_buys_datetime on buys(datetime);
create index idx_uses_datetime on uses(datetime);
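
Once they exist, you can check whether the planner actually uses them, e.g. on one of the aggregating subqueries:

-- inspect the plan of the weekly aggregation on buys; with the index in
-- place, the datetime filter can use an index scan instead of a seq scan
explain (analyze, buffers)
select   sum(total / 100.0),
         date_trunc('week', datetime + interval '1 day') - interval '1 day'
from     buys
where    datetime > date_trunc('month', current_timestamp - interval '1 year')
group by 2;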


Your query seems fine. However, you could use a full join (instead of an inner join) to get all rows where at least one of your tables has data. You could even use generate_series() to always get a full year of results even if there is no data in either of your tables, but I'm not sure whether that is what you need. Also, some other things can be written more simply; your query might look like this:



select    dt, buy_volume, use_volume, buy_volume / nullif(use_volume, 0.0) buy_to_use_ratio
from      (select   sum(total / 100.0)  buy_volume, date_trunc('week', (datetime + interval '1 day')) - interval '1 day' dt
           from     buys
           where    datetime > date_trunc('month', current_timestamp - interval '1 year')
           group by 2) b
full join (select   sum(amount) / 100.0 use_volume, date_trunc('week', (datetime + interval '1 day')) - interval '1 day' dt
           from     uses
           where    datetime > date_trunc('month', current_timestamp - interval '1 year')
           group by 2) u using (dt)
order by  1


http://rextester.com/YVASV92568
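
For completeness, a sketch of the generate_series() variant mentioned above (untested, and assuming the same buys/uses tables): it generates one Sunday-based week per row for the past year and left-joins both aggregates to it, so weeks without data still show up (with nulls).

-- calendar of Sunday-based weeks covering the same range as the filter;
-- the "+ 1 day ... - 1 day" shift matches the trick used in the aggregates
select    dt,
          buy_volume,
          use_volume,
          buy_volume / nullif(use_volume, 0.0) as buy_to_use_ratio
from      generate_series(
              date_trunc('week', date_trunc('month', current_timestamp - interval '1 year') + interval '1 day') - interval '1 day',
              date_trunc('week', current_timestamp + interval '1 day') - interval '1 day',
              interval '1 week') as w(dt)
left join (select   sum(total / 100.0) as buy_volume,
                    date_trunc('week', datetime + interval '1 day') - interval '1 day' as dt
           from     buys
           where    datetime > date_trunc('month', current_timestamp - interval '1 year')
           group by 2) b using (dt)
left join (select   sum(amount) / 100.0 as use_volume,
                    date_trunc('week', datetime + interval '1 day') - interval '1 day' as dt
           from     uses
           where    datetime > date_trunc('month', current_timestamp - interval '1 year')
           group by 2) u using (dt)
order by  1;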



So the answer depends on how big your tables are, but if it were me, I would create one or two new summary ("rollup") tables based on your query and keep them updated (run a batch job once a day, or once an hour, that refreshes whatever data has recently changed).

Then you can query those summary tables instead, which will be much faster (see the sketch below).
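
If your data fits this pattern, one way to implement such a summary table in PostgreSQL is a materialized view; a minimal sketch, reusing the query from the other answer (the name weekly_buy_use is a placeholder):

-- precompute the weekly report once, then serve reads from the result
create materialized view weekly_buy_use as
select    dt,
          buy_volume,
          use_volume,
          buy_volume / nullif(use_volume, 0.0) as buy_to_use_ratio
from      (select   sum(total / 100.0) as buy_volume,
                    date_trunc('week', datetime + interval '1 day') - interval '1 day' as dt
           from     buys
           group by 2) b
full join (select   sum(amount) / 100.0 as use_volume,
                    date_trunc('week', datetime + interval '1 day') - interval '1 day' as dt
           from     uses
           group by 2) u using (dt);

-- run from a scheduled job (hourly or daily) to bring it up to date
refresh materialized view weekly_buy_use;

Note that refresh materialized view recomputes the whole result rather than only recently changed rows; for truly incremental updates you would maintain an ordinary table from the batch job instead.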



If, however, your tables are very small, then just keep doing what you are doing and play with the indexes until you get reasonable execution times.


