How to calculate time between events in pandas

Original question

I am stuck on the following problem. I am trying to work out when, and for how long, a vehicle is at the factory. I have an Excel sheet that stores all events, which are either delivery routes or service events. The end goal is a data frame that lists the vehicle registration number with each arrival at the factory and the time spent there (including maintenance). For those interested, this is because I ultimately want to be able to plan non-critical vehicle maintenance activities.

An example of my data frame:

  Registration RoutID       Date Dep Loc Arr Loc Dep Time Arr Time  Days
0         XC66    A58  20/May/17    Home   Loc A    10:54    21:56     0
1         XC66    A59  21/May/17   Loc A    Home    00:12    10:36     0
2         XC66   A345  21/May/17    Home   Loc B    12:41    19:16     0
3         XC66   A346  21/May/17   Loc B   Loc C    20:50    03:49     1
4         XC66   A347  22/May/17   Loc C    Home    06:10    07:40     0
5         XC66    #M1  22/May/17    Home    Home    10:51    13:00     0

      

I wrote a script that processes all dates and times into proper datetime columns for the arrival and departure times. Maintenance periods have "Dep Loc" = Home and "Arr Loc" = Home, so the following code selects the corresponding rows:

df_home = df[df["Dep Loc"].isin(["Home"])]
df_home = df_home[df_home["Arr Loc"].isin(["Home"])]

      

From here, I can easily subtract dates to create a duration column.
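A minimal sketch of these two steps on the sample rows above. The column names "Dep DT" and "Arr DT" are hypothetical, and the sketch assumes that "Date" is the departure date and that "Days" is the number of days to add to get the arrival date:

```python
import pandas as pd

# Sample rows copied from the example data frame
df = pd.DataFrame({
    "Registration": ["XC66"] * 6,
    "RoutID": ["A58", "A59", "A345", "A346", "A347", "#M1"],
    "Date": ["20/May/17", "21/May/17", "21/May/17",
             "21/May/17", "22/May/17", "22/May/17"],
    "Dep Loc": ["Home", "Loc A", "Home", "Loc B", "Loc C", "Home"],
    "Arr Loc": ["Loc A", "Home", "Loc B", "Loc C", "Home", "Home"],
    "Dep Time": ["10:54", "00:12", "12:41", "20:50", "06:10", "10:51"],
    "Arr Time": ["21:56", "10:36", "19:16", "03:49", "07:40", "13:00"],
    "Days": [0, 0, 0, 1, 0, 0],
})

# Assumption: "Date" is the departure date; "Days" is the day offset of the arrival.
# "%b" expects English month abbreviations such as "May".
fmt = "%d/%b/%y %H:%M"
df["Dep DT"] = pd.to_datetime(df["Date"] + " " + df["Dep Time"], format=fmt)
df["Arr DT"] = (
    pd.to_datetime(df["Date"] + " " + df["Arr Time"], format=fmt)
    + pd.to_timedelta(df["Days"], unit="D")
)

# Maintenance rows depart from Home and arrive at Home
is_maintenance = (df["Dep Loc"] == "Home") & (df["Arr Loc"] == "Home")
df_home = df[is_maintenance].copy()
df_home["Duration"] = df_home["Arr DT"] - df_home["Dep DT"]
```

With the sample data, row 3 (A346) arrives on 22 May at 03:49, and the single maintenance row (#M1) gets a duration of 2 hours 9 minutes.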

So far so good. However, I am stuck on computing the time at the factory in the other cases. There may be intermediate stops, so the .shift() function does not work, since the number of rows to shift is not constant.

I tried searching for this problem, but I could only find shift-based solutions or answers that deal with the time within an event, not the time between events.

Any guidance in the right direction would be appreciated!

Regards

Attempted solution

I had been stuck on this for a while, but shortly after posting the question I tried this solution:

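# Assumes "Arr Time" and "Dep Time" already hold full datetimes and that
# df has the default integer index (0, 1, 2, ...).
arrival_times = df["Arr Time"]
departure_times = df["Dep Time"]
durations = {}  # row index of the arrival at Home -> time spent at Home

for idx, loc in enumerate(df["Arr Loc"]):
    if loc == "Home":
        # Indices of later rows that depart from Home
        later = (idx2 for idx2, obj in enumerate(df["Dep Loc"]) if obj == "Home" and idx2 > idx)
        idx_next = next(later, None)
        if idx_next is None:
            continue  # no later departure from Home (e.g. the last row)

        # Time at Home: next departure minus this arrival
        durations[idx] = departure_times[idx_next] - arrival_times[idx]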

      

Here I use the next() function to find the next occurrence of Home as the departure location (i.e. the time the car leaves the base). I then subtract the two datetimes to get the time difference.

It works for a small dataset, but not for the entire dataset.
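One possible way to express the same idea without a Python-level loop, and per vehicle, is sketched below. This is only a sketch under assumptions, not a tested solution for the full dataset: the function name time_at_home and the datetime columns "Dep DT" and "Arr DT" are hypothetical, Home to Home maintenance rows are treated as part of the stay, and a stay with no later departure is left as NaT.

```python
import pandas as pd

def time_at_home(df):
    """Return one row per arrival at Home with the time until the next departure."""
    df = df.sort_values(["Registration", "Dep DT"]).copy()
    reg = df["Registration"]

    # A stay starts when a vehicle arrives at Home from elsewhere and ends
    # when it next leaves Home for another location (maintenance is skipped).
    arrives = (df["Arr Loc"] == "Home") & (df["Dep Loc"] != "Home")
    leaves = (df["Dep Loc"] == "Home") & (df["Arr Loc"] != "Home")

    # Departure time on rows that leave Home, NaT elsewhere
    leave_time = df["Dep DT"].where(leaves)
    # For each row: the first leave time on a later row of the same vehicle
    shifted = leave_time.groupby(reg).shift(-1)
    next_leave = shifted.groupby(reg).bfill()

    stays = df.loc[arrives, ["Registration", "Arr DT"]].rename(
        columns={"Arr DT": "Arrival"}
    )
    stays["Next Departure"] = next_leave[arrives]
    stays["Time at Home"] = stays["Next Departure"] - stays["Arrival"]
    return stays.reset_index(drop=True)

# Sample rows from the question, with datetimes already combined
df = pd.DataFrame({
    "Registration": ["XC66"] * 6,
    "Dep Loc": ["Home", "Loc A", "Home", "Loc B", "Loc C", "Home"],
    "Arr Loc": ["Loc A", "Home", "Loc B", "Loc C", "Home", "Home"],
    "Dep DT": pd.to_datetime([
        "2017-05-20 10:54", "2017-05-21 00:12", "2017-05-21 12:41",
        "2017-05-21 20:50", "2017-05-22 06:10", "2017-05-22 10:51",
    ]),
    "Arr DT": pd.to_datetime([
        "2017-05-20 21:56", "2017-05-21 10:36", "2017-05-21 19:16",
        "2017-05-22 03:49", "2017-05-22 07:40", "2017-05-22 13:00",
    ]),
})

stays = time_at_home(df)
```

On the sample rows this yields two stays: the arrival on 21 May at 10:36 is followed by a departure at 12:41 (2 hours 5 minutes at Home), and the arrival on 22 May at 07:40 has no later departure, so its duration stays NaT.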
