How to calculate time between events in pandas
Original question
I am stuck with the following problem. I am trying to work out when, and for how long, a vehicle is at the factory. I have an Excel sheet that stores all events, which are either delivery routes or service events. The end goal is a data frame that lists each vehicle registration number with its arrival time at the factory and the time spent there (including maintenance). For those interested, this is because I ultimately want to be able to plan non-critical vehicle maintenance activities.
An example of my data frame would be:
   Registration  RoutID  Date       Dep Loc  Arr Loc  Dep Time  Arr Time  Days
0  XC66          A58     20/May/17  Home     Loc A    10:54     21:56     0
1  XC66          A59     21/May/17  Loc A    Home     00:12     10:36     0
2  XC66          A345    21/May/17  Home     Loc B    12:41     19:16     0
3  XC66          A346    21/May/17  Loc B    Loc C    20:50     03:49     1
4  XC66          A347    22/May/17  Loc C    Home     06:10     07:40     0
5  XC66          #M1     22/May/17  Home     Home     10:51     13:00     0
I created a script that processes all dates and times into proper datetime columns for the arrival and departure times. Maintenance periods are the rows where "Dep Loc" = Home and "Arr Loc" = Home, and the following code is used to select those rows:
df_home = df[df["Dep Loc"].isin(["Home"])]
df_home = df_home[df_home["Arr Loc"].isin(["Home"])]
From here, I can easily subtract dates to create a duration column.
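For readers who want to reproduce this step, here is a minimal sketch of the preprocessing and the maintenance durations on the sample rows above. The column names "Dep DT" and "Arr DT" are hypothetical, and the sketch assumes that "Days" is the number of days by which the arrival falls after the departure date (the question does not say so explicitly). Parsing "May" with %b also assumes an English locale.

```python
import pandas as pd

# Sample rows copied from the example table in the question.
df = pd.DataFrame({
    "Registration": ["XC66"] * 6,
    "RoutID": ["A58", "A59", "A345", "A346", "A347", "#M1"],
    "Date": ["20/May/17", "21/May/17", "21/May/17",
             "21/May/17", "22/May/17", "22/May/17"],
    "Dep Loc": ["Home", "Loc A", "Home", "Loc B", "Loc C", "Home"],
    "Arr Loc": ["Loc A", "Home", "Loc B", "Loc C", "Home", "Home"],
    "Dep Time": ["10:54", "00:12", "12:41", "20:50", "06:10", "10:51"],
    "Arr Time": ["21:56", "10:36", "19:16", "03:49", "07:40", "13:00"],
    "Days": [0, 0, 0, 1, 0, 0],
})

# Combine the date and time strings into full timestamps.
# "Dep DT" / "Arr DT" are hypothetical column names.
fmt = "%d/%b/%y %H:%M"
df["Dep DT"] = pd.to_datetime(df["Date"] + " " + df["Dep Time"], format=fmt)
# Assumption: "Days" is the day offset of the arrival relative to "Date".
df["Arr DT"] = (
    pd.to_datetime(df["Date"] + " " + df["Arr Time"], format=fmt)
    + pd.to_timedelta(df["Days"], unit="D")
)

# Maintenance rows depart from Home and arrive at Home.
is_maintenance = (df["Dep Loc"] == "Home") & (df["Arr Loc"] == "Home")
df_home = df[is_maintenance].copy()
df_home["Duration"] = df_home["Arr DT"] - df_home["Dep DT"]
```

The two chained isin() filters and the single boolean mask select the same rows; the mask is used here only because it is easier to reuse later.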
So far so good. However, I am stuck on computing the durations for the other stays at Home. There may be intermediate stops between two visits, so .shift() does not work: the number of rows to shift is not constant.
I tried searching for this question, but I could only find shift-based solutions or answers that compute the duration within an event, not the time between events.
Any guidance in the right direction would be appreciated!
Attempted solution
I had been stuck on this for a while, but shortly after posting the question I tried the following:
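# Assumes "Arr Time" and "Dep Time" already hold full datetimes (see above).
arrival_times = df["Arr Time"]
departure_times = df["Dep Time"]
durations = {}
for idx, loc in enumerate(df["Arr Loc"]):
    if loc == "Home":
        # Positions of later rows that depart from Home.
        later = (idx2 for idx2, obj in enumerate(df["Dep Loc"]) if obj == "Home" and idx2 > idx)
        idx_next = next(later, None)  # None if the vehicle never leaves Home again
        if idx_next is not None:
            # Time at Home = next departure from Home minus this arrival.
            durations[idx] = departure_times.iloc[idx_next] - arrival_times.iloc[idx]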
Here I use next() on a generator to find the next row where Home is the departure location (i.e. the time the car leaves the base). I then take the difference between the two timestamps to get the time spent at Home.
It works for a small dataset, but not for the entire dataset.
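One possible way around the row-by-row scan is sketched below. It is an untested idea and not a confirmed answer: for each row it looks up the next departure from Home for the same vehicle using a grouped shift followed by a backfill. The frame is assumed to be already preprocessed, with hypothetical datetime columns "Dep DT" and "Arr DT". Unlike the loop, this sketch drops the Home-to-Home maintenance rows first, on the assumption that a stay at Home should include any maintenance carried out during it.

```python
import pandas as pd

# Hypothetical pre-processed frame built from the sample rows in the question.
df = pd.DataFrame({
    "Registration": ["XC66"] * 6,
    "Dep Loc": ["Home", "Loc A", "Home", "Loc B", "Loc C", "Home"],
    "Arr Loc": ["Loc A", "Home", "Loc B", "Loc C", "Home", "Home"],
    "Dep DT": pd.to_datetime([
        "2017-05-20 10:54", "2017-05-21 00:12", "2017-05-21 12:41",
        "2017-05-21 20:50", "2017-05-22 06:10", "2017-05-22 10:51"]),
    "Arr DT": pd.to_datetime([
        "2017-05-20 21:56", "2017-05-21 10:36", "2017-05-21 19:16",
        "2017-05-22 03:49", "2017-05-22 07:40", "2017-05-22 13:00"]),
})

# Assumption: maintenance (Home -> Home) happens during a stay, so drop those
# rows and keep only real routes, ordered in time per vehicle.
is_maintenance = (df["Dep Loc"] == "Home") & (df["Arr Loc"] == "Home")
routes = df[~is_maintenance].sort_values(["Registration", "Dep DT"]).copy()

# Departure time on rows that leave Home, NaT on all other rows.
home_dep = routes["Dep DT"].where(routes["Dep Loc"] == "Home")

# Per vehicle: look one row ahead, then backfill, so that every row carries
# the first departure from Home that occurs on a later row.
routes["Next Home Dep"] = home_dep.groupby(routes["Registration"]).shift(-1)
routes["Next Home Dep"] = routes.groupby("Registration")["Next Home Dep"].bfill()

# Keep the arrivals at Home; the stay ends at the next departure from Home.
stays = routes.loc[routes["Arr Loc"] == "Home",
                   ["Registration", "Arr DT", "Next Home Dep"]].copy()
stays["Time at Home"] = stays["Next Home Dep"] - stays["Arr DT"]
```

The last arrival of a vehicle has no later departure, so its "Time at Home" is left as NaT rather than raising an error. If maintenance should instead end a stay, as in the loop, the is_maintenance filter can be skipped.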