Select information from a row based on the maximum value of a specific column in a group?

I have this data


--------------------------------------------------------------------------------------------------------------------------------------------------
Id | Date_Opera | Emitter  | EmitterIBAN                   | Receiver | ReceiverIBAN                          | Address                  | Value
--------------------------------------------------------------------------------------------------------------------------------------------------
1  | 2017-07-07 | Ernst    | HR53 8827 2118 4692 8207 5    | Kimbra   | CH20 1042 6T0N MDTG JT47 U            | 3256 Arrowood Point 0002 | 121.72
2  | 2017-09-27 | Keene    | SK81 1004 7484 7505 6308 9259 | Torrance | RO23 ZWTR OJKK VAU9 T5P4 2GDY         | 35197 Green Ridge Way    | 82.52
3  | 2017-10-17 | Ernst    | HR53 8827 2118 4692 8207 5    | Kimbra   | CH20 1042 6T0N MDTG JT47 U            | 3256 Arrowood Point 0048 | 51.81
4  | 2017-05-01 | Korie    | ME43 9833 9830 7367 4239 60   | Roy      | IL69 9686 1536 8102 2219 165          | 5 Swallow Alley          | 88.01
5  | 2017-11-17 | Ernst    | HR53 8827 2118 4692 8207 5    | Kimbra   | CH20 1042 6T0N MDTG JT47 U            | 3256 Arrowood Point 0001 | 133.99
6  | 2017-10-10 | Charmine | BG92 TOXX 8380 785I JKRQ JS   | Sarette  | MU67 RYRU 9293 5875 6859 7111 075X HR | 8 Sage Place             | 36.30
7  | 2017-07-18 | Ernst    | HR53 8827 2118 4692 8207 5    | Kimbra   | CH20 1042 6T0N MDTG JT47 U            | 3256 Arrowood Point 0004 | 186.99

      

I would like to get the result below:

  • Count the operations for each EmitterIBAN and ReceiverIBAN pair.
  • Sum the values for each EmitterIBAN and ReceiverIBAN pair.
  • Group regardless of the address, which may differ within a pair, taking the maximum address value.

-----------------------------------------------------------------------------------------------------------------------------------------------------
Count | Date_Opera | Emitter  | EmitterIBAN                   | Receiver | ReceiverIBAN                          | Address                  | SumValue
-----------------------------------------------------------------------------------------------------------------------------------------------------
4     | 2017-11-17 | Ernst    | HR53 8827 2118 4692 8207 5    | Kimbra   | CH20 1042 6T0N MDTG JT47 U            | 3256 Arrowood Point 0048 | 494.51
1     | 2017-09-27 | Keene    | SK81 1004 7484 7505 6308 9259 | Torrance | RO23 ZWTR OJKK VAU9 T5P4 2GDY         | 35197 Green Ridge Way    | 82.52
1     | 2017-05-01 | Korie    | ME43 9833 9830 7367 4239 60   | Roy      | IL69 9686 1536 8102 2219 165          | 5 Swallow Alley          | 88.01
1     | 2017-10-10 | Charmine | BG92 TOXX 8380 785I JKRQ JS   | Sarette  | MU67 RYRU 9293 5875 6859 7111 075X HR | 8 Sage Place             | 36.30

      

So, to get this result I am using this query

Select  count(1) as NumberOperation, 
        MAX(Emitter) as EmitterName, 
        EmitterIban, 
        MAX(Receiver) as ReceiverName, 
        ReceiverIban,
        MAX(ReceiverAddress) as ReceiverAddress,
        SUM([Value]) as SumValues
FROM TableEsperadoceTransaction
Group By EmitterIban,
         ReceiverIban

      

But now, instead of taking the maximum address as in the previous example, I would like to take the address from the record with the latest operation date. My results should look like this:


-----------------------------------------------------------------------------------------------------------------------------------------------------
Count | Date_Opera | Emitter  | EmitterIBAN                   | Receiver | ReceiverIBAN                          | Address                  | SumValue
-----------------------------------------------------------------------------------------------------------------------------------------------------
4     | 2017-11-17 | Ernst    | HR53 8827 2118 4692 8207 5    | Kimbra   | CH20 1042 6T0N MDTG JT47 U            | 3256 Arrowood Point 0001 | 494.51
1     | 2017-09-27 | Keene    | SK81 1004 7484 7505 6308 9259 | Torrance | RO23 ZWTR OJKK VAU9 T5P4 2GDY         | 35197 Green Ridge Way    | 82.52
1     | 2017-05-01 | Korie    | ME43 9833 9830 7367 4239 60   | Roy      | IL69 9686 1536 8102 2219 165          | 5 Swallow Alley          | 88.01
1     | 2017-10-10 | Charmine | BG92 TOXX 8380 785I JKRQ JS   | Sarette  | MU67 RYRU 9293 5875 6859 7111 075X HR | 8 Sage Place             | 36.30

      

So my question is: how can I write a query like this?

PS: I have 240 million records

Edit: these 3 columns are indexed:

  • Date_Operation
  • EmitterIban
  • ReceiverIban
+3



3 answers


You can try something like this:

Select  count(1) as NumberOperation, 
        MAX(t.Emitter) as EmitterName, 
        t.EmitterIban, 
        MAX(t.Receiver) as ReceiverName, 
        t.ReceiverIban,
        -- Address of the most recent operation for this Emitter/Receiver pair
        (SELECT TOP 1 x.ReceiverAddress 
         FROM TableEsperadoceTransaction AS x 
         WHERE x.EmitterIban=t.EmitterIban AND x.ReceiverIban=t.ReceiverIban
         ORDER BY x.Date_Opera DESC) as ReceiverAddress,
        SUM(t.[Value]) as SumValues
FROM TableEsperadoceTransaction AS t
Group By t.EmitterIban,
         t.ReceiverIban;

      

I've replaced your MAX(ReceiverAddress) with a sub-select that picks the top-most address for the same EmitterIban and ReceiverIban pair, ordered by Date_Opera descending.

Btw: an index on the date column will help here ...



UPDATE: this might be faster ...

Select  count(1) as NumberOperation, 
        MAX(t.Emitter) as EmitterName, 
        t.EmitterIban, 
        MAX(t.Receiver) as ReceiverName, 
        t.ReceiverIban,
        -- MAX(t.Date_Opera) is the group's latest date (an outer-reference aggregate)
        (SELECT TOP 1 x.ReceiverAddress 
         FROM TableEsperadoceTransaction AS x 
         WHERE x.EmitterIban=t.EmitterIban 
           AND x.ReceiverIban=t.ReceiverIban
           AND x.Date_Opera=MAX(t.Date_Opera)) as ReceiverAddress,
        SUM(t.[Value]) as SumValues
FROM TableEsperadoceTransaction AS t
Group By t.EmitterIban,
         t.ReceiverIban;

      

The GROUP BY lets you use MAX(t.Date_Opera) directly inside the sub-select. With a three-column index, you should get your address value quickly.
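
As a rough sketch of the three-column index mentioned above, something like the following could support the correlated lookup. The index name is made up, the column names are the ones from the question's query and table header, and the INCLUDE column is an assumption that the sub-select should be answered from the index alone.

```sql
-- Hypothetical index name; column names are assumed to match the question.
-- Pair columns lead so each Emitter/Receiver pair is one contiguous range,
-- and the date is stored descending so the latest row comes first.
CREATE NONCLUSTERED INDEX IX_Transaction_Pair_Date
    ON TableEsperadoceTransaction (EmitterIban, ReceiverIban, Date_Opera DESC)
    INCLUDE (ReceiverAddress);  -- assumption: avoids a lookup into the base table
```

With the pair columns leading, the TOP 1 sub-select can seek to the pair and read a single index row. On 240 million records the index build itself is expensive, so it is worth trying on a copy of the table first.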

+3



I think you should use a window function (SQL 2012+):



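Select  count(1) as NumberOperation, 
        MAX(t.Emitter) as EmitterName, 
        t.EmitterIban, 
        MAX(t.Receiver) as ReceiverName, 
        t.ReceiverIban,
        MAX(t.LatestAddress) as ReceiverAddress,
        SUM(t.[Value]) as SumValues
FROM (
        -- Window functions are evaluated after GROUP BY, so tag every row
        -- with the address of its pair's latest operation first.
        SELECT *,
               FIRST_VALUE(ReceiverAddress) OVER (PARTITION BY EmitterIban, ReceiverIban
                                                  ORDER BY Date_Opera DESC) AS LatestAddress
        FROM TableEsperadoceTransaction
     ) AS t
Group By t.EmitterIban,
         t.ReceiverIban;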

      

+2



I used row_number() in a CTE and self-joined it to compute the aggregates:

with CTE as
(
select t1.*, row_number() over(partition by EmitterIban, ReceiverIban order by Date_Opera desc)  as rn
from TableEsperadoceTransaction t1
)

select a1.EmitterIban,a1.emitter as EName, 
       a1.ReceiverIban, a1.receiver as RName,
       a1.ReceiverAddress,
       max(a2.rn) as NumberOperation,
       sum(a2.value) as SumValues
from CTE a1
inner join CTE a2
on a1.EmitterIban = a2.EmitterIban
and a1.ReceiverIban = a2.ReceiverIban
where a1.rn = 1
group by a1.EmitterIban,a1.emitter, 
         a1.ReceiverIban, a1.receiver,
         a1.ReceiverAddress
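
A possible variant of the same row_number() idea that avoids the self-join: rank the rows once, then pick the rn = 1 address with a conditional aggregate. This is only a sketch and has not been run against the real table; the column names are taken from the question, and rows that tie on Date_Opera are ranked in an arbitrary order.

```sql
WITH Ranked AS
(
    -- rn = 1 marks the latest operation of each Emitter/Receiver pair
    SELECT Emitter, EmitterIban, Receiver, ReceiverIban, ReceiverAddress, [Value],
           ROW_NUMBER() OVER (PARTITION BY EmitterIban, ReceiverIban
                              ORDER BY Date_Opera DESC) AS rn
    FROM TableEsperadoceTransaction
)
SELECT COUNT(*)                                       AS NumberOperation,
       MAX(Emitter)                                   AS EmitterName,
       EmitterIban,
       MAX(Receiver)                                  AS ReceiverName,
       ReceiverIban,
       -- Only the rn = 1 row yields a non-NULL value, so MAX returns its address
       MAX(CASE WHEN rn = 1 THEN ReceiverAddress END) AS ReceiverAddress,
       SUM([Value])                                   AS SumValues
FROM Ranked
GROUP BY EmitterIban, ReceiverIban;
```

The table is read once instead of twice, which may matter at this row count, though the window sort still has a cost unless an index already orders the rows by pair and date.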

      

+1

