Pandas read csv dataframe data lines from a specific date and time
I have a data file from a CSV file like this:
TimeStamp
0 12/7/2017 8:00
1 12/7/2017 7:00
2 12/7/2017 6:00
3 12/7/2017 5:00
4 12/7/2017 4:00
5 12/7/2017 3:00
6 12/7/2017 2:00
7 12/7/2017 1:00
8 12/7/2017 0:00
9 11/7/2017 23:00
10 11/7/2017 22:00
...
9996 3/12/2015 6:00
9997 3/12/2015 5:00
9998 3/12/2015 4:00
9999 3/12/2015 3:00
Name: TimeStamp, Length: 10000, dtype: object
I am trying to use Pandas
to read data from a specific date and time, for example from 11/7/2017 8:00
to 12/7/2017 8:00
.
I tried using methods Boolean mask
, DatetimeIndex
and .Between
, and it was reading data from this range both from 2016 and 2015. Here are my codes,
import pandas as pd
eurusd = pd.read_csv('fxhistoricaldata_EURUSD_hour.csv')
eurusd = eurusd[(eurusd['TimeStamp'] >= '11/7/2017 8:00') &
(eurusd['TimeStamp'] <= '12/7/2017 8:00')]
print(eurusd['TimeStamp'])
or using .between,
eurusd = eurusd[eurusd['TimeStamp'].between('11/7/2017 8:00', '12/7/2017 8:00')]
Results as such,
2 12/7/2017 6:00
3 12/7/2017 5:00
4 12/7/2017 4:00
5 12/7/2017 3:00
6 12/7/2017 2:00
7 12/7/2017 1:00
8 12/7/2017 0:00
23 11/7/2017 9:00
24 11/7/2017 8:00
513 12/6/2017 23:00
514 12/6/2017 22:00
515 12/6/2017 21:00
516 12/6/2017 20:00
517 12/6/2017 19:00
518 12/6/2017 18:00
519 12/6/2017 17:00
520 12/6/2017 16:00
521 12/6/2017 15:00
522 12/6/2017 14:00
523 12/6/2017 13:00
524 12/6/2017 12:00
525 12/6/2017 11:00
...
8827 12/2/2016 5:00
8828 12/2/2016 4:00
8829 12/2/2016 3:00
Name: TimeStamp, Length: 305, dtype: object
Can anyone help me fix my problem or is there any function that can help me accomplish my task? Any help is greatly appreciated!
source to share
You need to make sure that eurusd['TimeStamp']
is a series with a dtype datetime64[ns]
:
eurusd['TimeStamp'] = pd.to_datetime(eurusd['TimeStamp'])
Create a boolean mask:
mask = (eurusd['TimeStamp'] > start_date) & (eurusd['TimeStamp'] <= end_date)
Now reinstall (or whatever you want to do with the output):
eurusd = eurusd.loc[mask]
source to share
You can do it with the datetime and dateutil libraries:
from dateutil import parser
from datetime import datetime
df['TimeStamp'] = df['TimeStamp'].apply(lambda x: parser.parse(x))
df[(df['TimeStamp'] < datetime(2017, 12, 7)) & (df['TimeStamp']> datetime(2015, 3, 12))] #or whichever dates you're looking for
source to share