GROUP BY and aggregation on date range

SELECT IsConfirmed, IsNetConfirmed, d.FullDate FROM Final.FactApplication f 
INNER JOIN final.DimOfferedDate d on f.OfferedDateKey= d.OfferedDateKey
WHERE d.CalendarYear in ('2013','2014','2015')

      

The above code returns the following sample data.

IsConfirmed IsNetConfirmed  FullDate
----------------------------------------------  
1           0               2013-01-04 00:00:00.000   
1           1               2013-02-04 00:00:00.000
0           1               2013-03-04 00:00:00.000
1           0               2013-04-04 00:00:00.000

      

I want to sum the sum of both IsConfirmed and IsNetConfirmed for a day and a month for each year to end up with the following result. I need to aggregate for each year so that the 4/31/2012 sum includes data between 1/1 / 2012-4 / 31/2012. enter image description here

Here's my code so far - but I can't wrap my head around all the groups. Help me please.

SELECT sum(IsConfirmed) AS ConfirmCount
    ,sum(IsNetConfirmed) AS NetConfirmCount
    ,year(d.fulldate) AS cyear
    ,month(d.fulldate) AS cmonth
    ,day(d.fulldate) AS cday
FROM final.FactApplicationHistory f
INNER JOIN final.DimOfferedDate d ON f.OfferedDateKey = d.OfferedDateKey
WHERE d.CalendarYear IN ('2013','2014','2015')
GROUP BY year(d.fulldate)
    ,month(d.fulldate)
    ,day(d.fulldate)
ORDER BY year(d.fulldate)

      

+3


source to share


2 answers


The amounts launched are complex. You cannot do them with just one group. There are several ways to do this. One way is to join the data table with you on all records that are less than or equal in value to the current record and sum the data points. Your query is especially complicated by the fact that you also want to fold the data. This is how you would do it with the join table method:

;with temp (IsConfirmed, IsNetConfirmed, FullDate) AS
(
SELECT IsConfirmed, IsNetConfirmed, FullDate
FROM final.FactApplicationHistory f
INNER JOIN final.DimOfferedDate d 
ON f.OfferedDateKey = d.OfferedDateKey
WHERE d.CalendarYear IN ('2013','2014','2015')
)
, pivottable (cmonthday,cmonth,cday,ConfirmCount2013,NetConfirmCount2013,ConfirmCount2014,NetConfirmCount2014,ConfirmCount2015,NetConfirmCount2015) AS
(
    SELECT
        dateadd(day,cday,(DATEADD(month, cmonth, 0))) cmonthday,
        cmonth,
        cday,
        sum(isnull([2013],0)) ConfirmCount2013, 
        sum(isnull([2016],0)) NetConfirmCount2013, 
        sum(isnull([2014],0)) ConfirmCount2014, 
        sum(isnull([2017],0)) NetConfirmCount2014, 
        sum(isnull([2015],0)) ConfirmCount2015, 
        sum(isnull([2018],0)) NetConfirmCount2015
    FROM 
        (SELECT sum(IsConfirmed) AS ConfirmCount
            ,sum(IsNetConfirmed) AS NetConfirmCount
            ,year(d.FullDate) AS cyear
            ,year(d.FullDate)+3 AS cyear2
            ,month(d.FullDate) AS cmonth
            ,day(d.FullDate) AS cday
        FROM #temp d
        WHERE year(FullDate) IN ('2013','2014','2015')
        GROUP BY year(d.FullDate)
            ,month(d.FullDate)
            ,day(d.FullDate)
        ) ps
    PIVOT
    (
    SUM (ConfirmCount)
    FOR cyear IN
    ( [2013],[2014],[2015])
    ) AS pvt
    PIVOT
    (
    SUM (NetConfirmCount)
    FOR cyear2 IN
    ( [2016],[2017],[2018])
    ) AS pvt
    Group by cmonth,
        cday
)
select
    pivottable.cmonth,
    pivottable.cday,
    sum(RunningSums.ConfirmCount2013) ConfirmCount2013, 
    sum(RunningSums.NetConfirmCount2013) NetConfirmCount2013, 
    sum(RunningSums.ConfirmCount2014) ConfirmCount2014, 
    sum(RunningSums.NetConfirmCount2014) NetConfirmCount2014, 
    sum(RunningSums.ConfirmCount2015) ConfirmCount2015, 
    sum(RunningSums.NetConfirmCount2015) NetConfirmCount2015
from pivottable
join pivottable RunningSums
on RunningSums.cmonthday <= pivottable.cmonthday
group by pivottable.cmonth,pivottable.cday
order by pivottable.cmonth, pivottable.cday

      



I suppose this might even be the case where a cursor would actually be a good idea. You can create your output summary data into a table and then iterate over each record and update each value with the running sum. For a very large table with millions of records, this will probably be more efficient than my self join method.

+1


source


First, you need to create every possible combination of month and date. For example, in 2013, you have dates from January 1 to January 5, and in 2014, you have dates from January 3 to January 6. You must have dates between January 1 and January 6 for 2013 and 2014. After getting all the dates, JOIN

with the original request that each new date has a value for IsConfirmed

and IsNetConfirmed

. From this you get the total. Finally, you want to PIVOT

complete the total to achieve the desired result.

Here's a dynamic approach using Crosstab. You can read this article for reference.

SQL Fiddle

DECLARE @sql1 VARCHAR(4000) = '',
        @sql2 VARCHAR(4000) = '',
        @sql3 VARCHAR(4000) = ''

SELECT @sql1 =
';WITH SampleData AS( -- Replace this CTE with the original query
    SELECT * FROM Data
)
,CrossDates AS( -- Generate date combinations
    SELECT
        YY, MM, DD,
        FullDate = DATEADD(DAY, DD - 1, DATEADD(MONTH, MM - 1, DATEADD(YEAR, YY - 1900, 0)))        
    FROM (
        SELECT DISTINCT
            MM = MONTH(FullDate),
            DD = DAY(FullDate)
        FROM SampleData
    )DM
    CROSS JOIN(
        SELECT DISTINCT YY = YEAR(FullDate) FROM SampleData
    )Y
)
,CteAllDates AS( -- Assign value for each newly generated date
    SELECT
        c.*,
        IsConfirmed = ISNULL(s.IsConfirmed, 0),
        IsNetConfirmed = ISNULL(s.IsNetConfirmed, 0)
    FROM CrossDates c
    LEFT JOIN SampleData s
        ON s.FullDate = c.FullDate
)
,RunningTotal AS( -- Compute running total
    SELECT
        YY = YEAR(FullDate),
        MM = MONTH(FullDate),
        DD = DAY(FullDate),
        Confirm = SUM(CAST(IsConfirmed AS INT)) OVER(PARTITION BY YEAR(FullDate) ORDER BY MONTH(FullDate), DAY(FullDate)),
        NetConfirm = SUM(CAST(IsNetConfirmed AS INT)) OVER(PARTITION BY YEAR(FullDate) ORDER BY MONTH(FullDate), DAY(FullDate))
    FROM CteAllDates
)
SELECT
      MM
    , DD
'

SELECT @sql2 = @sql2 +
'   , MAX(CASE WHEN YY = ' + CONVERT(VARCHAR(4), YY) + ' THEN Confirm ELSE 0 END) AS [' + CONVERT(VARCHAR(4), YY) + ' Confirm]' + CHAR(10) +
'   , MAX(CASE WHEN YY = ' + CONVERT(VARCHAR(4), YY) + ' THEN NetConfirm ELSE 0 END) AS [' + CONVERT(VARCHAR(4), YY) + ' NetConfirm]' + CHAR(10)
FROM(
    SELECT DISTINCT YY = YEAR(FullDate)
    FROM(
        SELECT * FROM Data -- Replace this with the original query
    )d
)t

SELECT @sql3 = 
'FROM RunningTotal
GROUP BY MM, DD'

PRINT(@sql1 + @sql2 + @sql3)
EXEC(@sql1 + @sql2 + @sql3)

      


Notes:

1. Replace with the SampleData

original query.



2. is SUM() OVER()

used to get the total.

3. Basically replace SELECT * FROM Data

with your oqriginal query.


Sample data

IsConfirmed IsNetConfirmed FullDate
----------- -------------- -----------------------
1           0              2013-04-01 00:00:00.000
1           1              2013-04-02 00:00:00.000
0           1              2013-04-03 00:00:00.000
1           0              2013-04-04 00:00:00.000
1           0              2014-04-01 00:00:00.000
1           1              2014-04-02 00:00:00.000
0           1              2014-04-03 00:00:00.000
1           0              2014-04-05 00:00:00.000

      

Result

|----|----|--------------|-----------------|--------------|-----------------|
| MM | DD | 2013 Confirm | 2013 NetConfirm | 2014 Confirm | 2014 NetConfirm |
|----|----|--------------|-----------------|--------------|-----------------|
|  4 |  1 |            1 |               0 |            1 |               0 |
|  4 |  2 |            2 |               1 |            2 |               1 |
|  4 |  3 |            2 |               2 |            2 |               2 |
|  4 |  4 |            3 |               2 |            2 |               2 |
|  4 |  5 |            3 |               2 |            3 |               2 |

      

+1


source







All Articles