Aggregation of time to a specific slot
I have a dataframe df
that has several weeks of 2
minute data:
df.dtypes
time_stamp datetime64[ns]
Day_name object
x int64
y int64
df.head
time_stamp Day_name x y
0 2017-05-17 14:28:35 Wednesday 100 200
1 2017-05-17 14:30:32 Wednesday 300 400
I want to aggregate the metrics x
and y
and find their average for an average 15 minute period. I originally had an epoch mark, but I converted it to the datetime
one shown above.
time_stamp Day_name x y 15_min_slot
0 2017-05-17 14:28:35 Wednesday 100 200 14:15
1 2017-05-17 14:30:32 Wednesday 300 400 14:30
How should I do it?
I can find an hour in:
df['hour'] = df['time_stamp'].dt.hour
df['minute'] = df['time_stamp'].dt.minute
Eventually I will do the following:
output = df.groupby(['15_min_slot'],as_index=False)['x'].mean()
source to share
Use resample
:
First create a datetimeindex for your dataframe
df = df.set_index('time_stamp')
df.index = pd.to_datetime(df.index,format='%Y-%m-%d %H:%M:%S')
Then use resample
with 15T
and mean
:
df.resample('15T').mean()
Output:
x y
time_stamp
2017-05-17 14:15:00 100 200
2017-05-17 14:30:00 300 400
source to share
You can use df.resample()
as in
df.resample('15T').mean()
A list of options can be found here: http://pandas.pydata.org/pandas-docs/stable/timeseries.html#offset-aliases
source to share