Python pandas gives wrong weekday index for DatetimeIndex

I have time series data and want to calculate the average number of rows per weekday (Monday, Tuesday, ...). My data looks like this:

timestamp       maxCapacity
Mon Aug  4 14:47:00 EDT 2014    6741
Mon Aug  4 14:48:01 EDT 2014    6741

      

To do this, I first index the data frame by timestamp, then create a new column holding the weekday taken from that index. However, the new column contains incorrect weekday numbers.

Here is the code to reproduce the problem.

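import os
import csv

import wget
import pandas
from dateutil import parser

url = 'https://www.dropbox.com/s/kbti3i8uzy82hw6/maxCapacity?dl=1'
dataFile = 'maxCapacitySample'
# Download the sample file only if it is not already present
if not os.path.exists(dataFile):
    wget.download(url, out=dataFile)

# Parse each timestamp string (e.g. "Mon Aug  4 14:47:00 EDT 2014") with dateutil
parse = lambda x: parser.parse(x)

# Read the tab-separated file and index it by the parsed timestamp
tdata = pandas.read_csv(dataFile,
                        parse_dates={"Datetime":['timestamp',]},
                        index_col='Datetime',
                        keep_date_col=False,
                        date_parser=parse,
                        dialect=csv.excel_tab)

# weekday is 0 for Monday through 6 for Sunday
tdata['weekday'] = tdata.index.weekday
print(tdata.head())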

      

Output

                       maxCapacity  weekday
Datetime
2014-08-04 14:40:00-04:00         6741        0
2014-08-04 14:47:00-04:00         6741        3
2014-08-04 14:48:01-04:00         6741        3
2014-08-04 14:49:00-04:00         6741        3
2014-08-04 14:50:00-04:00         6741        3

      

The problem is that rows from the same date (August 4th, a Monday) are assigned both weekday 0 and weekday 3. What am I doing wrong?

1 answer


I worked around it by rebuilding the datetime index from its underlying values before taking the weekday:

tdata['weekday'] = pandas.to_datetime(tdata.index.values).weekday

      



Resulting DataFrame:

                           maxCapacity  weekday
Datetime
2014-08-04 14:40:00-04:00         6741        0
2014-08-04 14:47:00-04:00         6741        0
2014-08-04 14:48:01-04:00         6741        0
2014-08-04 14:49:00-04:00         6741        0
2014-08-04 14:50:00-04:00         6741        0
2014-08-04 14:51:00-04:00         6741        0
2014-08-04 14:52:00-04:00         6741        0
2014-08-04 14:53:00-04:00         6741        0
2014-08-04 14:54:00-04:00         6741        0
2014-08-04 14:55:00-04:00         6741        0
...                                ...      ...
2014-08-20 09:37:00-04:00         6652        2
2014-08-20 09:38:00-04:00         6654        2
2014-08-20 09:39:00-04:00         6651        2
2014-08-20 09:40:00-04:00         6642        2
2014-08-20 09:41:00-04:00         6648        2
2014-08-20 09:42:00-04:00         6654        2
2014-08-20 09:43:00-04:00         6646        2
2014-08-20 09:44:00-04:00         6659        2
2014-08-20 09:45:00-04:00         6650        2
2014-08-20 09:46:00-04:00         6655        2

[6589 rows x 2 columns]
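
One caveat with this workaround, plus a possible way to finish the original task (average number of rows per weekday). In current pandas, `index.values` on a timezone-aware index gives UTC `datetime64` values, so the workaround computes the weekday in UTC; for rows close to local midnight that can differ from the local weekday. The following is only a minimal sketch: the five-row sample, the `America/New_York` zone and all variable names are made up for illustration and are not taken from the question's file, and the pandas version from the original post may behave differently.

```python
import pandas as pd

# Hypothetical sample (not the question's data): UTC timestamps,
# one of which is still Sunday evening in New York.
raw_utc = [
    "2014-08-04 03:30:00",  # Sunday 23:30 local time
    "2014-08-04 18:47:00",  # Monday 14:47 local time
    "2014-08-04 18:48:01",  # Monday
    "2014-08-05 18:47:00",  # Tuesday
    "2014-08-11 18:47:00",  # Monday, one week later
]
index = pd.to_datetime(raw_utc).tz_localize("UTC").tz_convert("America/New_York")
sample = pd.DataFrame({"maxCapacity": [6741, 6741, 6741, 6650, 6652]}, index=index)

# Weekday in local time (Monday=0 ... Sunday=6)
sample["weekday"] = sample.index.weekday

# The workaround from the answer: .values drops the timezone,
# so this is the weekday of the UTC timestamp.
utc_weekday = pd.to_datetime(sample.index.values).weekday

# Average number of rows per weekday:
# 1) count the rows on each local calendar date,
# 2) average those daily counts over all dates sharing a weekday.
per_row = pd.DataFrame({
    "weekday": sample.index.weekday,
    "date": sample.index.normalize(),  # local midnight of each row's date
})
rows_per_day = per_row.groupby(["weekday", "date"]).size()
avg_rows_per_weekday = rows_per_day.groupby(level="weekday").mean()

print(sample)
print(avg_rows_per_weekday)
```

In this sample the first row gets local weekday 6 (Sunday) but UTC weekday 0 (Monday), which is the kind of mismatch to watch for. Grouping by date first, rather than counting rows per weekday directly, is what makes the result an average per day instead of a total.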

      
