Time series from a list of dates in Python
I have a list of dates (ListA); each entry records the date on which an event occurred. How do I build a time series from this list in Python 3? The sequence of dates would be on the X axis and the frequency of each date on the Y axis.
ListA = ['2016-04-05', '2016-04-05', '2016-04-07', '2016-09-10', '2016-03-05', '2016-07-11', '2017-01-01']
Desired output:
[2016-04-05, 2], [2016-04-06, 0], [2016-04-07, 1],
[2016-04-08, 0], ..., [2017-01-01, 1]
Required output format:
[[Date, Frequency], ...]
I have code that generates the date range:
Date = pd.date_range('2016-04-05', '2017-01-01', freq='D')
print(Date)
Which gives:
[2016-04-05, 2016-04-06, 2016-04-07, ....,]
I need something like the code below that walks through the date range above and gets the frequency for each date.
for item in ListA:
    if item>=Date[0] and item<Date[1]:
        print(ListA.count(item))
Using Counter from the collections module, this is very straightforward:
Code:
dates = [
'2016-04-05',
'2016-04-05',
'2016-04-07',
'2016-09-10',
'2016-03-05',
'2016-07-11',
'2017-01-01'
]
from collections import Counter
counts = Counter(dates)
print(sorted(counts.items()))
Results:
[('2016-03-05', 1), ('2016-04-05', 2),
('2016-04-07', 1), ('2016-07-11', 1),
('2016-09-10', 1), ('2017-01-01', 1)]
Build a list over a pandas.DatetimeIndex:
Building a list of lists over a range of dates is easy enough, because a Counter returns 0 when indexed with a key it has never counted.
import pandas as pd
# pandas date range (named separately so it does not overwrite the `dates` list above)
date_range = pd.date_range('2016-04-05', '2017-01-01', freq='D')
# counter for the dates we need counted, keyed by Timestamp
counts = Counter(pd.to_datetime(dates))
# build a list using a list comprehension of counts at all dates in range
date_occurrence_sequence = [[d, counts[d]] for d in date_range]
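If pandas is not needed, the same idea can be sketched with the standard library alone. This is only an illustration, not part of the original answer: the names `events`, `start`, `end` and `series` are made up for the example, and the dates are kept as ISO strings so the result matches the [[Date, Frequency], ...] format asked for in the question.

```python
from collections import Counter
from datetime import date, timedelta

# Hypothetical name for the question's list of date strings.
events = [
    '2016-04-05', '2016-04-05', '2016-04-07', '2016-09-10',
    '2016-03-05', '2016-07-11', '2017-01-01',
]

# A Counter returns 0 for keys it has never seen, so empty days need no special case.
counts = Counter(events)

start = date(2016, 4, 5)
end = date(2017, 1, 1)
n_days = (end - start).days + 1  # inclusive of both endpoints

# One [date, frequency] pair per calendar day in the range.
series = [
    [day.isoformat(), counts[day.isoformat()]]
    for day in (start + timedelta(days=i) for i in range(n_days))
]

print(series[:4])
# [['2016-04-05', 2], ['2016-04-06', 0], ['2016-04-07', 1], ['2016-04-08', 0]]
```

Note that '2016-03-05' falls before the start of the range, so it is counted but never appears in the output.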
Add per day:
And since you seem to be using pandas, let's put the counts into a dataframe indexed per day.
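from collections import Counter
import pandas as pd
# `dates` must be the original list of date strings from the first snippet
index = pd.date_range('2016-04-05', '2017-01-01', freq='D')
# start with a zero count for every day in the range
df = pd.DataFrame([0] * len(index), index=index)
# overwrite the zeros with the actual counts wherever a date occurs
df.update(pd.DataFrame.from_dict(Counter(pd.to_datetime(dates)), 'index'))
print(df.head())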
Results:
0
2016-04-05 2.0
2016-04-06 0.0
2016-04-07 1.0
2016-04-08 0.0
2016-04-09 0.0
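As an alternative, the per-day counts can probably be obtained more directly with value_counts and reindex, which keeps the counts as integers. This is a sketch rather than the answer's own method, and the names `events` and `daily` are made up for the example.

```python
import pandas as pd

# Hypothetical name for the question's list of date strings.
events = [
    '2016-04-05', '2016-04-05', '2016-04-07', '2016-09-10',
    '2016-03-05', '2016-07-11', '2017-01-01',
]

index = pd.date_range('2016-04-05', '2017-01-01', freq='D')

# Count occurrences per date, then align the counts to every day in the range.
# Days without events get 0; dates outside the range are dropped.
daily = pd.Series(pd.to_datetime(events)).value_counts().reindex(index, fill_value=0)

print(daily.head())
```

From here, daily.plot() would draw the dates on the X axis and the frequency on the Y axis, assuming matplotlib is installed.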