Group with "holes" in the middle
Given the following table:
create table #Track (id int identity, vehicle int, station varchar(50), pieces int, distance int)
insert into #Track (vehicle, station, pieces, distance)
values
(1, 'A', 10, 0), (1, 'B', 10, 50), (1, 'C', 11, 23), (1, 'D', 11, 40), (1, 'E', 10, 5)
This is the result I want (note the change in the shape field between C and D):
vehicle station_from station_to pieces distance_all
1 A B 10 50
1 C D 11 63
1 E E 10 5
If I make this request:
select A.vehicle,
T1.station station_from,
T2.station station_to,
A.pieces,
A.distance_all
from (
select vehicle,
min(id) min_id,
max(id) max_id,
pieces,
sum(distance) distance_all
from #Track
group
by vehicle,
pieces
) A join #Track T1 on A.min_id = T1.id
join #Track T2 on A.max_id = T2.id
I got the wrong result (distance_all is correct, but the station from and to is not. It seems that vehicle 1 was going from A to E and then from C to D:
vehicle station_from station_to pieces distance_all
1 A E 10 55
1 C D 11 63
How to achieve the desired result without using a cursor (tables are quite large, several million records)
source to share
I think this is what you are trying to do. Processing consecutive lines with the same values ββfor a vehicle as one group and obtaining the minimum and maximum station, the sum of the distance for this group. Use lag
to get the value of the previous row and reset the group (in the current amount) when it differs from the current row. After that, it's just a grouping operation.
select distinct vehicle
,first_value(station) over(partition by vehicle,grp order by id) as station_from
,first_value(station) over(partition by vehicle,grp order by id desc) as station_to
,pieces
,sum(distance) over(partition by vehicle,grp) as distance_all
from (select t.* ,sum(case when prev_pieces=pieces then 0 else 1 end) over(partition by vehicle order by id) as grp
from (select t.*,lag(pieces) over(partition by vehicle order by id) as prev_pieces
from Track t
) t
) t
source to share
This is a variant of the "gaps and islands" problem. In your case, you can solve this line number difference problem:
select vehicle,
max(case when seqnum_grp = 1 then station end) as station_from,
max(case when seqnum_grp_desc = 1 then station end) as station_to,
pieces,
sum(pieces) as pieces_all
from (select t.*,
row_number() over (partition by vehicle, pieces, (seqnum - seqnum_p) order by id) as seqnum_grp,
row_number() over (partition by vehicle, pieces, (seqnum - seqnum_p) order by id desc) as seqnum_grp_desc
from (select t.*,
row_number() over (partition by vehicle order by id) as seqnum,
row_number() over (partition by vehicle, pieces order by id) as seqnum_p
from #Track t
) t
) t
group by vehicle, pieces, (seqnum - seqnum_p);
To understand how this works, you need to understand why the difference in line numbers identifies groups. To do this, you just need to run the innermost subquery and look at the results.
This is a little more difficult than most of these problems because you want the first and last stations along the way. Hence, there is an additional subquery.
source to share
you can request as below:
Select vehicle, min(Station) as Station_From, Max(station) Station_To, pieces, sum(distance) as Distance_all
from (
Select *, [Bucket] =Row_number() over(order by id) - Row_number() over(partition by pieces order by id)
from #Track
) a
group by vehicle, pieces, [Bucket]
Output:
+---------+--------------+------------+--------+--------------+
| vehicle | Station_From | Station_To | pieces | Distance_all |
+---------+--------------+------------+--------+--------------+
| 1 | A | B | 10 | 50 |
| 1 | E | E | 10 | 5 |
| 1 | C | D | 11 | 63 |
+---------+--------------+------------+--------+--------------+
source to share