GROUP BY and aggregation on date range
SELECT IsConfirmed, IsNetConfirmed, d.FullDate FROM Final.FactApplication f
INNER JOIN final.DimOfferedDate d on f.OfferedDateKey= d.OfferedDateKey
WHERE d.CalendarYear in ('2013','2014','2015')
The above code returns the following sample data.
IsConfirmed IsNetConfirmed FullDate
----------------------------------------------
1 0 2013-01-04 00:00:00.000
1 1 2013-02-04 00:00:00.000
0 1 2013-03-04 00:00:00.000
1 0 2013-04-04 00:00:00.000
I want to sum the sum of both IsConfirmed and IsNetConfirmed for a day and a month for each year to end up with the following result. I need to aggregate for each year so that the 4/31/2012 sum includes data between 1/1 / 2012-4 / 31/2012.
Here's my code so far - but I can't wrap my head around all the groups. Help me please.
SELECT sum(IsConfirmed) AS ConfirmCount
,sum(IsNetConfirmed) AS NetConfirmCount
,year(d.fulldate) AS cyear
,month(d.fulldate) AS cmonth
,day(d.fulldate) AS cday
FROM final.FactApplicationHistory f
INNER JOIN final.DimOfferedDate d ON f.OfferedDateKey = d.OfferedDateKey
WHERE d.CalendarYear IN ('2013','2014','2015')
GROUP BY year(d.fulldate)
,month(d.fulldate)
,day(d.fulldate)
ORDER BY year(d.fulldate)
source to share
The amounts launched are complex. You cannot do them with just one group. There are several ways to do this. One way is to join the data table with you on all records that are less than or equal in value to the current record and sum the data points. Your query is especially complicated by the fact that you also want to fold the data. This is how you would do it with the join table method:
;with temp (IsConfirmed, IsNetConfirmed, FullDate) AS
(
SELECT IsConfirmed, IsNetConfirmed, FullDate
FROM final.FactApplicationHistory f
INNER JOIN final.DimOfferedDate d
ON f.OfferedDateKey = d.OfferedDateKey
WHERE d.CalendarYear IN ('2013','2014','2015')
)
, pivottable (cmonthday,cmonth,cday,ConfirmCount2013,NetConfirmCount2013,ConfirmCount2014,NetConfirmCount2014,ConfirmCount2015,NetConfirmCount2015) AS
(
SELECT
dateadd(day,cday,(DATEADD(month, cmonth, 0))) cmonthday,
cmonth,
cday,
sum(isnull([2013],0)) ConfirmCount2013,
sum(isnull([2016],0)) NetConfirmCount2013,
sum(isnull([2014],0)) ConfirmCount2014,
sum(isnull([2017],0)) NetConfirmCount2014,
sum(isnull([2015],0)) ConfirmCount2015,
sum(isnull([2018],0)) NetConfirmCount2015
FROM
(SELECT sum(IsConfirmed) AS ConfirmCount
,sum(IsNetConfirmed) AS NetConfirmCount
,year(d.FullDate) AS cyear
,year(d.FullDate)+3 AS cyear2
,month(d.FullDate) AS cmonth
,day(d.FullDate) AS cday
FROM #temp d
WHERE year(FullDate) IN ('2013','2014','2015')
GROUP BY year(d.FullDate)
,month(d.FullDate)
,day(d.FullDate)
) ps
PIVOT
(
SUM (ConfirmCount)
FOR cyear IN
( [2013],[2014],[2015])
) AS pvt
PIVOT
(
SUM (NetConfirmCount)
FOR cyear2 IN
( [2016],[2017],[2018])
) AS pvt
Group by cmonth,
cday
)
select
pivottable.cmonth,
pivottable.cday,
sum(RunningSums.ConfirmCount2013) ConfirmCount2013,
sum(RunningSums.NetConfirmCount2013) NetConfirmCount2013,
sum(RunningSums.ConfirmCount2014) ConfirmCount2014,
sum(RunningSums.NetConfirmCount2014) NetConfirmCount2014,
sum(RunningSums.ConfirmCount2015) ConfirmCount2015,
sum(RunningSums.NetConfirmCount2015) NetConfirmCount2015
from pivottable
join pivottable RunningSums
on RunningSums.cmonthday <= pivottable.cmonthday
group by pivottable.cmonth,pivottable.cday
order by pivottable.cmonth, pivottable.cday
I suppose this might even be the case where a cursor would actually be a good idea. You can create your output summary data into a table and then iterate over each record and update each value with the running sum. For a very large table with millions of records, this will probably be more efficient than my self join method.
source to share
First, you need to create every possible combination of month and date. For example, in 2013, you have dates from January 1 to January 5, and in 2014, you have dates from January 3 to January 6. You must have dates between January 1 and January 6 for 2013 and 2014. After getting all the dates, JOIN
with the original request that each new date has a value for IsConfirmed
and IsNetConfirmed
. From this you get the total. Finally, you want to PIVOT
complete the total to achieve the desired result.
Here's a dynamic approach using Crosstab. You can read this article for reference.
DECLARE @sql1 VARCHAR(4000) = '',
@sql2 VARCHAR(4000) = '',
@sql3 VARCHAR(4000) = ''
SELECT @sql1 =
';WITH SampleData AS( -- Replace this CTE with the original query
SELECT * FROM Data
)
,CrossDates AS( -- Generate date combinations
SELECT
YY, MM, DD,
FullDate = DATEADD(DAY, DD - 1, DATEADD(MONTH, MM - 1, DATEADD(YEAR, YY - 1900, 0)))
FROM (
SELECT DISTINCT
MM = MONTH(FullDate),
DD = DAY(FullDate)
FROM SampleData
)DM
CROSS JOIN(
SELECT DISTINCT YY = YEAR(FullDate) FROM SampleData
)Y
)
,CteAllDates AS( -- Assign value for each newly generated date
SELECT
c.*,
IsConfirmed = ISNULL(s.IsConfirmed, 0),
IsNetConfirmed = ISNULL(s.IsNetConfirmed, 0)
FROM CrossDates c
LEFT JOIN SampleData s
ON s.FullDate = c.FullDate
)
,RunningTotal AS( -- Compute running total
SELECT
YY = YEAR(FullDate),
MM = MONTH(FullDate),
DD = DAY(FullDate),
Confirm = SUM(CAST(IsConfirmed AS INT)) OVER(PARTITION BY YEAR(FullDate) ORDER BY MONTH(FullDate), DAY(FullDate)),
NetConfirm = SUM(CAST(IsNetConfirmed AS INT)) OVER(PARTITION BY YEAR(FullDate) ORDER BY MONTH(FullDate), DAY(FullDate))
FROM CteAllDates
)
SELECT
MM
, DD
'
SELECT @sql2 = @sql2 +
' , MAX(CASE WHEN YY = ' + CONVERT(VARCHAR(4), YY) + ' THEN Confirm ELSE 0 END) AS [' + CONVERT(VARCHAR(4), YY) + ' Confirm]' + CHAR(10) +
' , MAX(CASE WHEN YY = ' + CONVERT(VARCHAR(4), YY) + ' THEN NetConfirm ELSE 0 END) AS [' + CONVERT(VARCHAR(4), YY) + ' NetConfirm]' + CHAR(10)
FROM(
SELECT DISTINCT YY = YEAR(FullDate)
FROM(
SELECT * FROM Data -- Replace this with the original query
)d
)t
SELECT @sql3 =
'FROM RunningTotal
GROUP BY MM, DD'
PRINT(@sql1 + @sql2 + @sql3)
EXEC(@sql1 + @sql2 + @sql3)
Notes:
1. Replace with the SampleData
original query.
2. is SUM() OVER()
used to get the total.
3. Basically replace SELECT * FROM Data
with your oqriginal query.
Sample data
IsConfirmed IsNetConfirmed FullDate
----------- -------------- -----------------------
1 0 2013-04-01 00:00:00.000
1 1 2013-04-02 00:00:00.000
0 1 2013-04-03 00:00:00.000
1 0 2013-04-04 00:00:00.000
1 0 2014-04-01 00:00:00.000
1 1 2014-04-02 00:00:00.000
0 1 2014-04-03 00:00:00.000
1 0 2014-04-05 00:00:00.000
Result
|----|----|--------------|-----------------|--------------|-----------------|
| MM | DD | 2013 Confirm | 2013 NetConfirm | 2014 Confirm | 2014 NetConfirm |
|----|----|--------------|-----------------|--------------|-----------------|
| 4 | 1 | 1 | 0 | 1 | 0 |
| 4 | 2 | 2 | 1 | 2 | 1 |
| 4 | 3 | 2 | 2 | 2 | 2 |
| 4 | 4 | 3 | 2 | 2 | 2 |
| 4 | 5 | 3 | 2 | 3 | 2 |
source to share