Select information from a row based on the maximum value of a specific column in a group?
I have this data
------------------------------------------------------------------------------------------
Id | Date_Opera | Emitter | EmitterIBAN | Receiver | ReceiverIBAN | Address | Value
------------------------------------------------------------------------------------------
1 | 2017-07-07 | Ernst | HR53 8827 2118 4692 8207 5 | Kimbra | CH20 1042 6T0N MDTG JT47 U | 3256 Arrowood Point 0002 | 121.72
2 | 2017-09-27 | Keene | SK81 1004 7484 7505 6308 9259 | Torrance | RO23 ZWTR OJKK VAU9 T5P4 2GDY | 35197 Green Ridge Way | 82.52
3 | 2017-10-17 | Ernst | HR53 8827 2118 4692 8207 5 | Kimbra | CH20 1042 6T0N MDTG JT47 U | 3256 Arrowood Point 0048 | 51.81
4 | 2017-05-01 | Korie | ME43 9833 9830 7367 4239 60 | Roy | IL69 9686 1536 8102 2219 165 | 5 Swallow Alley | 88.01
5 | 2017-11-17 | Ernst | HR53 8827 2118 4692 8207 5 | Kimbra | CH20 1042 6T0N MDTG JT47 U | 3256 Arrowood Point 0001 | 133.99
6 | 2017-10-10 | Charmine | BG92 TOXX 8380 785I JKRQ JS | Sarette | MU67 RYRU 9293 5875 6859 7111 075X HR | 8 Sage Place | 36.30
7 | 2017-07-18 | Ernst | HR53 8827 2118 4692 8207 5 | Kimbra | CH20 1042 6T0N MDTG JT47 U | 3256 Arrowood Point 0004 | 186.99
I would like to get the result below:
- Count the number of operations for each EmitterIBAN and ReceiverIBAN pair.
- Calculate the sum of the values for each EmitterIBAN and ReceiverIBAN pair.
- Since the addresses within a pair may differ, take the maximum address value.
------------------------------------------------------------------------------------------
Count | Date_Opera | Emitter | EmitterIBAN | Receiver | ReceiverIBAN | Address | SumValue
------------------------------------------------------------------------------------------
4 | 2017-11-17 | Ernst | HR53 8827 2118 4692 8207 5 | Kimbra | CH20 1042 6T0N MDTG JT47 U | 3256 Arrowood Point 0048 | 494.51
1 | 2017-09-27 | Keene | SK81 1004 7484 7505 6308 9259 | Torrance | RO23 ZWTR OJKK VAU9 T5P4 2GDY | 35197 Green Ridge Way | 82.52
1 | 2017-05-01 | Korie | ME43 9833 9830 7367 4239 60 | Roy | IL69 9686 1536 8102 2219 165 | 5 Swallow Alley | 88.01
1 | 2017-10-10 | Charmine | BG92 TOXX 8380 785I JKRQ JS | Sarette | MU67 RYRU 9293 5875 6859 7111 075X HR | 8 Sage Place | 36.30
So, to get this result I am using this query
Select count(1) as NumberOperation,
       MAX(Emitter) as EmitterName,
       EmitterIban,
       MAX(Receiver) as ReceiverName,
       ReceiverIban,
       MAX(ReceiverAddress) as ReceiverAddress,
       SUM([Value]) as SumValues
FROM TableEsperadoceTransaction
Group By EmitterIban,
         ReceiverIban
But now, instead of taking the maximum address as in the previous example, I would like to take the address from the record with the latest operation date. Here is what the result should look like:
------------------------------------------------------------------------------------------
Count | Date_Opera | Emitter | EmitterIBAN | Receiver | ReceiverIBAN | Address | SumValue
------------------------------------------------------------------------------------------
4 | 2017-11-17 | Ernst | HR53 8827 2118 4692 8207 5 | Kimbra | CH20 1042 6T0N MDTG JT47 U | 3256 Arrowood Point 0001 | 494.51
1 | 2017-09-27 | Keene | SK81 1004 7484 7505 6308 9259 | Torrance | RO23 ZWTR OJKK VAU9 T5P4 2GDY | 35197 Green Ridge Way | 82.52
1 | 2017-05-01 | Korie | ME43 9833 9830 7367 4239 60 | Roy | IL69 9686 1536 8102 2219 165 | 5 Swallow Alley | 88.01
1 | 2017-10-10 | Charmine | BG92 TOXX 8380 785I JKRQ JS | Sarette | MU67 RYRU 9293 5875 6859 7111 075X HR | 8 Sage Place | 36.30
So my question is: how can I write a query like this?
PS: I have 240 million records.
Edit: I have indexes on these 3 columns:
- Date_Operation
- EmitterIban
- ReceiverIban
You can try something like this:
Select count(1) as NumberOperation,
       MAX(t.Emitter) as EmitterName,
       t.EmitterIban,
       MAX(t.Receiver) as ReceiverName,
       t.ReceiverIban,
       -- address of the most recent operation for this Emitter/Receiver pair
       (SELECT TOP 1 x.ReceiverAddress
        FROM TableEsperadoceTransaction AS x
        WHERE x.EmitterIban=t.EmitterIban AND x.ReceiverIban=t.ReceiverIban
        ORDER BY x.Date_Opera DESC) as ReceiverAddress,
       SUM(t.[Value]) as SumValues
FROM TableEsperadoceTransaction AS t
Group By t.EmitterIban,
         t.ReceiverIban;
I've replaced your MAX(ReceiverAddress) with a correlated sub-select that picks the top-most address, ordered by Date_Opera descending, using the same EmitterIban and ReceiverIban as the current group.
Btw: an index on the date column will help here.
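To check the logic of the sub-select on the sample rows from the question, here is a minimal sketch using Python's built-in sqlite3 module. It is not SQL Server: the table name Tx and the helper latest_address_per_pair are made up for illustration, and SQLite spells TOP 1 as LIMIT 1, so treat it only as a quick sanity check of the query shape.

```python
import sqlite3

# Sample rows copied from the question:
# (Id, Date_Opera, Emitter, EmitterIban, Receiver, ReceiverIban, ReceiverAddress, Value)
ROWS = [
    (1, "2017-07-07", "Ernst", "HR53 8827 2118 4692 8207 5", "Kimbra",
     "CH20 1042 6T0N MDTG JT47 U", "3256 Arrowood Point 0002", 121.72),
    (2, "2017-09-27", "Keene", "SK81 1004 7484 7505 6308 9259", "Torrance",
     "RO23 ZWTR OJKK VAU9 T5P4 2GDY", "35197 Green Ridge Way", 82.52),
    (3, "2017-10-17", "Ernst", "HR53 8827 2118 4692 8207 5", "Kimbra",
     "CH20 1042 6T0N MDTG JT47 U", "3256 Arrowood Point 0048", 51.81),
    (4, "2017-05-01", "Korie", "ME43 9833 9830 7367 4239 60", "Roy",
     "IL69 9686 1536 8102 2219 165", "5 Swallow Alley", 88.01),
    (5, "2017-11-17", "Ernst", "HR53 8827 2118 4692 8207 5", "Kimbra",
     "CH20 1042 6T0N MDTG JT47 U", "3256 Arrowood Point 0001", 133.99),
    (6, "2017-10-10", "Charmine", "BG92 TOXX 8380 785I JKRQ JS", "Sarette",
     "MU67 RYRU 9293 5875 6859 7111 075X HR", "8 Sage Place", 36.30),
    (7, "2017-07-18", "Ernst", "HR53 8827 2118 4692 8207 5", "Kimbra",
     "CH20 1042 6T0N MDTG JT47 U", "3256 Arrowood Point 0004", 186.99),
]

# Same shape as the T-SQL above; LIMIT 1 is SQLite's equivalent of TOP 1.
QUERY = """
SELECT COUNT(1)       AS NumberOperation,
       MAX(t.Emitter) AS EmitterName,
       t.EmitterIban,
       MAX(t.Receiver) AS ReceiverName,
       t.ReceiverIban,
       (SELECT x.ReceiverAddress
          FROM Tx AS x
         WHERE x.EmitterIban = t.EmitterIban
           AND x.ReceiverIban = t.ReceiverIban
         ORDER BY x.Date_Opera DESC
         LIMIT 1)     AS ReceiverAddress,
       SUM(t.Value)   AS SumValues
  FROM Tx AS t
 GROUP BY t.EmitterIban, t.ReceiverIban
"""


def latest_address_per_pair():
    """Load the sample rows into an in-memory table and run the query."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Tx (Id INTEGER, Date_Opera TEXT, Emitter TEXT, "
        "EmitterIban TEXT, Receiver TEXT, ReceiverIban TEXT, "
        "ReceiverAddress TEXT, Value REAL)"
    )
    conn.executemany("INSERT INTO Tx VALUES (?, ?, ?, ?, ?, ?, ?, ?)", ROWS)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


for row in latest_address_per_pair():
    print(row)
```

On this data the Ernst/Kimbra pair has four operations, and its latest one is Id 5 (2017-11-17), so the address picked for that pair is "3256 Arrowood Point 0001".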
UPDATE: this might be faster:
Select count(1) as NumberOperation,
       MAX(t.Emitter) as EmitterName,
       t.EmitterIban,
       MAX(t.Receiver) as ReceiverName,
       t.ReceiverIban,
       (SELECT TOP 1 x.ReceiverAddress
        FROM TableEsperadoceTransaction AS x
        WHERE x.EmitterIban=t.EmitterIban
          AND x.ReceiverIban=t.ReceiverIban
          AND x.Date_Opera=MAX(t.Date_Opera)) as ReceiverAddress,
       SUM(t.[Value]) as SumValues
FROM TableEsperadoceTransaction AS t
Group By t.EmitterIban,
         t.ReceiverIban;
The GROUP BY lets you use MAX(t.Date_Opera) directly inside the sub-select. With a three-column index, you should get your address value quickly. Note that if several rows of a pair share the latest date, TOP 1 without ORDER BY returns an arbitrary one of them.
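I think you should use a window function (SQL Server 2012+). ReceiverAddress and Date_Opera are not part of the GROUP BY, so FIRST_VALUE has to be evaluated in a derived table first and then carried through the aggregation:
Select count(1) as NumberOperation,
       MAX(t.Emitter) as EmitterName,
       t.EmitterIban,
       MAX(t.Receiver) as ReceiverName,
       t.ReceiverIban,
       -- LatestAddress is constant within a pair, so MAX just carries it through
       MAX(t.LatestAddress) as ReceiverAddress,
       SUM(t.[Value]) as SumValues
FROM (SELECT *,
             FIRST_VALUE(ReceiverAddress) OVER (PARTITION BY EmitterIban, ReceiverIban
                                                ORDER BY Date_Opera DESC) AS LatestAddress
      FROM TableEsperadoceTransaction) AS t
Group By t.EmitterIban,
         t.ReceiverIban;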
I used row_number() in a CTE and self-joined it to compute the aggregates:
with CTE as
(
    select t1.*,
           row_number() over(partition by EmitterIban, ReceiverIban order by Date_Opera desc) as rn
    from TableEsperadoceTransaction t1
)
select a1.EmitterIban, a1.emitter as EName,
       a1.ReceiverIban, a1.receiver as RName,
       a1.ReceiverAddress,
       max(a2.rn) as NumberOperation, -- the highest row number in a pair equals its row count
       sum(a2.value) as SumValues
from CTE a1
inner join CTE a2
    on a1.EmitterIban = a2.EmitterIban
    and a1.ReceiverIban = a2.ReceiverIban
where a1.rn = 1 -- a1 is the latest row of each pair
group by a1.EmitterIban, a1.emitter,
         a1.ReceiverIban, a1.receiver,
         a1.ReceiverAddress
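A possible variant of the row_number() idea, not the query above, is to skip the self-join and pick the rn = 1 address with a conditional aggregate in a single pass over the CTE. The sketch below tries this on a few of the sample rows using Python's built-in sqlite3 module. It assumes an SQLite build of 3.25 or newer (needed for window functions), and the names Tx, ranked and aggregate_with_row_number are invented for the example.

```python
import sqlite3

# A subset of the question's rows, trimmed to the columns the query needs:
# (Date_Opera, EmitterIban, ReceiverIban, ReceiverAddress, Value)
ROWS = [
    ("2017-07-07", "HR53 8827 2118 4692 8207 5", "CH20 1042 6T0N MDTG JT47 U",
     "3256 Arrowood Point 0002", 121.72),
    ("2017-09-27", "SK81 1004 7484 7505 6308 9259", "RO23 ZWTR OJKK VAU9 T5P4 2GDY",
     "35197 Green Ridge Way", 82.52),
    ("2017-10-17", "HR53 8827 2118 4692 8207 5", "CH20 1042 6T0N MDTG JT47 U",
     "3256 Arrowood Point 0048", 51.81),
    ("2017-11-17", "HR53 8827 2118 4692 8207 5", "CH20 1042 6T0N MDTG JT47 U",
     "3256 Arrowood Point 0001", 133.99),
    ("2017-07-18", "HR53 8827 2118 4692 8207 5", "CH20 1042 6T0N MDTG JT47 U",
     "3256 Arrowood Point 0004", 186.99),
]

# rn = 1 marks the latest row of each pair; the CASE expression is NULL for
# every other row, so MAX returns exactly that row's address.
QUERY = """
WITH ranked AS (
    SELECT t.*,
           ROW_NUMBER() OVER (PARTITION BY EmitterIban, ReceiverIban
                              ORDER BY Date_Opera DESC) AS rn
      FROM Tx AS t
)
SELECT COUNT(*) AS NumberOperation,
       EmitterIban,
       ReceiverIban,
       MAX(CASE WHEN rn = 1 THEN ReceiverAddress END) AS ReceiverAddress,
       SUM(Value) AS SumValues
  FROM ranked
 GROUP BY EmitterIban, ReceiverIban
"""


def aggregate_with_row_number():
    """Run the single-pass variant against an in-memory copy of the rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Tx (Date_Opera TEXT, EmitterIban TEXT, "
        "ReceiverIban TEXT, ReceiverAddress TEXT, Value REAL)"
    )
    conn.executemany("INSERT INTO Tx VALUES (?, ?, ?, ?, ?)", ROWS)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


for row in aggregate_with_row_number():
    print(row)
```

Whether this is actually faster than the self-join on 240 million rows would need to be measured on the real table and indexes.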