Aggregation of time to a specific slot

I have a dataframe df

that has several weeks of 2

minute data:

df.dtypes

time_stamp      datetime64[ns]
Day_name                object 
x                     int64
y                   int64

      

df.head

time_stamp             Day_name   x   y  
0 2017-05-17 14:28:35  Wednesday  100 200   
1 2017-05-17 14:30:32  Wednesday  300 400

      

I want to aggregate the metrics x

and y

and find their average for an average 15 minute period. I originally had an epoch mark, but I converted it to the datetime

one shown above.

time_stamp             Day_name   x   y        15_min_slot
    0 2017-05-17 14:28:35  Wednesday  100 200  14:15
    1 2017-05-17 14:30:32  Wednesday  300 400  14:30

      

How should I do it?

I can find an hour in:

df['hour'] = df['time_stamp'].dt.hour

df['minute'] = df['time_stamp'].dt.minute

Eventually I will do the following:

output = df.groupby(['15_min_slot'],as_index=False)['x'].mean()

+3


source to share


3 answers


You can use Grouper

in conjunction with an argument freq

, ie:



df.groupby(pd.Grouper(key=df['time_stamp'], freq='15T')).mean()

      

+3


source


Use resample

:

First create a datetimeindex for your dataframe

df = df.set_index('time_stamp')
df.index = pd.to_datetime(df.index,format='%Y-%m-%d %H:%M:%S')

      

Then use resample

with 15T

and mean

:



df.resample('15T').mean()

      

Output:

                       x    y
time_stamp                   
2017-05-17 14:15:00  100  200
2017-05-17 14:30:00  300  400

      

+1


source


You can use df.resample()

as in

df.resample('15T').mean()

      

A list of options can be found here: http://pandas.pydata.org/pandas-docs/stable/timeseries.html#offset-aliases

0


source







All Articles