Create a new column based on an existing time column in a dataframe

I need to create a shift column based on an existing time column.

For example, I have a dataframe df1 with details:

   time
0  10:30
1  13:50
2  19:20
3  14:10

      

I need a dataframe that looks like this with a shift:

  • from 8:30 to 12:30 = shift 1
  • from 12:30 to 20:30 = shift 2
  • from 20:30 to 8:30 = shift 3

   time shift
0  10:30    1
1  13:50    2
2  19:20    2
3  23:10    3

      



3 answers


The following uses a shifts dictionary to determine the shift associated with a given time:

import pandas as pd


df = pd.DataFrame({'time': ['00:00','08:29', '08:30', '08:31', '12:29', '12:30', '12:31', '20:29', '20:30', '20:31', '23:59', '10:30', '13:50', '19:20', '14:10', '23:10']})

# Convert the time column into datetime objects
df.time = pd.to_datetime(df.time).dt.time

# Set up a shifts dictionary
shifts = {('8:30', '12:30'): 1 , ('12:30', '20:30'): 2, ('20:30', '8:30'): 3}

# Convert the keys to datetime objects
shifts = {tuple(map(pd.to_datetime, k)):v for k,v in shifts.items()}

# Extend the end of an interval by one day if it falls before the start (the shift crosses midnight)
shifts = {(k if k[0].time() < k[1].time() else (k[0],k[1]+pd.to_timedelta('1day'))):v for k,v in shifts.items()}

# Determine shift
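def get_shift(time):
    # inclusive='left' needs pandas 1.4+; older versions spelled it closed='left'
    matches = [k for k in shifts if time in pd.date_range(*k, freq='min', inclusive='left').time]
    # Fall back to 'No Shift' when no interval contains the time
    return shifts[matches[0]] if matches else 'No Shift'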

# Use .apply on the time column to get the shift column
df['shift'] = df.time.apply(get_shift)

print(df)

      



Outputs:

#         time  shift
# 0   00:00:00      3
# 1   08:29:00      3
# 2   08:30:00      1
# 3   08:31:00      1
# 4   12:29:00      1
# 5   12:30:00      2
# 6   12:31:00      2
# 7   20:29:00      2
# 8   20:30:00      3
# 9   20:31:00      3
# 10  23:59:00      3
# 11  10:30:00      1
# 12  13:50:00      2
# 13  19:20:00      2
# 14  14:10:00      2
# 15  23:10:00      3
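
The midnight wrap-around, handled above by extending the end of the 20:30 to 8:30 interval by one day, can also be checked directly on datetime.time values without building a date range. This is only a minimal sketch: the names SHIFTS, in_interval and get_shift_direct are hypothetical and not part of the answer above, and intervals are assumed to be closed on the left and open on the right.

```python
import datetime

# Same three shifts as above, keyed by (start, end) as datetime.time objects
SHIFTS = {
    (datetime.time(8, 30), datetime.time(12, 30)): 1,
    (datetime.time(12, 30), datetime.time(20, 30)): 2,
    (datetime.time(20, 30), datetime.time(8, 30)): 3,
}

def in_interval(t, start, end):
    """Return True if start <= t < end; start > end means the interval crosses midnight."""
    if start < end:
        return start <= t < end
    return t >= start or t < end

def get_shift_direct(t):
    """Return the shift whose interval contains t, or 'No Shift' if none does."""
    for (start, end), shift in SHIFTS.items():
        if in_interval(t, start, end):
            return shift
    return 'No Shift'
```

With a column of datetime.time objects it could be applied as df.time.apply(get_shift_direct).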

      



You can accomplish this with apply, using a function that creates the shift column.

import datetime

def check_shift(row):
    # Look the value up by column label; positional row[0] is deprecated in recent pandas
    shift_time = row['time']
    if datetime.time(8, 30) <= shift_time <= datetime.time(12, 30):
        return 1
    elif datetime.time(12, 30) < shift_time <= datetime.time(20, 30):
        return 2
    else:
        return 3

df['shift'] = df.apply(check_shift, axis='columns')

      

This produces the following DataFrame:

       time  shift
0  10:30:00      1
1  13:50:00      2
2  19:20:00      2
3  14:10:00      2

      

If we change the last time to 23:10 (as in your sample output), we get the following:



       time  shift
0  10:30:00      1
1  13:50:00      2
2  19:20:00      2
3  23:10:00      3

      


An important note: I converted the time column from strings to actual time objects:

df['time'] = pd.to_datetime(df['time'], format="%H:%M").dt.time
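
Putting the conversion and the shift function together, here is a minimal end-to-end sketch. It assumes the four sample times from the question's expected output, and it applies a hypothetical helper shift_for to the time column itself rather than row by row. The boundaries are the same as in check_shift, so 12:30 counts as shift 1 and 20:30 as shift 2.

```python
import datetime
import pandas as pd

# Sample times taken from the question's expected output
df = pd.DataFrame({'time': ['10:30', '13:50', '19:20', '23:10']})

# Convert the strings to datetime.time objects so they compare chronologically
df['time'] = pd.to_datetime(df['time'], format="%H:%M").dt.time

def shift_for(t):
    """Map a datetime.time to a shift number using the same bounds as check_shift."""
    if datetime.time(8, 30) <= t <= datetime.time(12, 30):
        return 1
    elif datetime.time(12, 30) < t <= datetime.time(20, 30):
        return 2
    return 3

# Series.apply passes each time value directly, so no row indexing is needed
df['shift'] = df['time'].apply(shift_for)
```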

      



Assuming we have the following DF:

In [380]: df
Out[380]:
     time
0   00:00
1   08:29
2   08:30
3   08:31
4   12:29
5   12:30
6   12:31
7   20:29
8   20:30
9   20:31
10  23:59

In [381]: df.dtypes
Out[381]:
time    object
dtype: object

      

Consider this solution:

In [382]: bins = [-1, 830, 1230, 2030, 2400]
     ...: labels = [0,1,2,3]
     ...: df['shift'] = pd.cut(df.time.str.replace(':','').astype(int),
     ...:                      bins=bins, labels=labels, right=False)
     ...: df.loc[df['shift']==0, 'shift'] = 3
     ...:

In [383]: df
Out[383]:
     time shift
0   00:00     3
1   08:29     3
2   08:30     1
3   08:31     1
4   12:29     1
5   12:30     2
6   12:31     2
7   20:29     2
8   20:30     3
9   20:31     3
10  23:59     3

      

Explanation:

  • first we convert time to a numeric value, for example 08:29 becomes 829, 12:31 becomes 1231, etc.
  • now we can cut them into 4 bins (shifts): [0,1,2,3]

    NOTE: labels must be unique, so we could not specify [3,1,2,3]

  • finally, we have to change the 0 to 3, because the interval 20:30 - 08:30 had to be split in two: 00:00 - 08:30 and 20:30 - 23:59:59
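
The three steps above can be sketched one at a time. This is only an illustration under a few assumptions: the intermediate names as_int and codes are hypothetical, the sample times are a subset of the ones shown earlier, and the final step converts the categorical result to plain integers before folding 0 into 3.

```python
import pandas as pd

df = pd.DataFrame({'time': ['00:00', '08:29', '08:30', '12:29',
                            '12:30', '20:29', '20:30', '23:59']})

# Step 1: drop the colon and cast to int, e.g. '08:29' becomes 829
as_int = df['time'].str.replace(':', '', regex=False).astype(int)

# Step 2: cut into four left-closed bins with unique placeholder labels
bins = [-1, 830, 1230, 2030, 2400]
codes = pd.cut(as_int, bins=bins, labels=[0, 1, 2, 3], right=False)

# Step 3: fold the placeholder 0 (00:00 - 08:30) into shift 3
df['shift'] = codes.astype(int).replace({0: 3})
```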
