Pandas raises ValueError when converting DatetimeIndex

I am converting all ISO-8601 formatted values to Unix timestamps. For some inexplicable reason, this line

a_col = pd.DatetimeIndex(a_col).astype(np.int64)/10**6

      

raises an error

ValueError: Unable to convert 0 2001-06-29
... (abbreviated column output)
Name: DateCol, dtype: datetime64[ns] to datetime dtype

This is very strange, because I ensured that each value is in datetime.datetime format, as you can see here:

if a_col.dtypes is (np.dtype('object') or np.dtype('O')):
      a_col = a_col.apply(lambda x: x if isinstance(x, datetime.datetime) else epoch)
a_col = pd.DatetimeIndex(a_col).astype(np.int64)/10**6

      

Here, epoch is a datetime.datetime object.

When I check the dtype of the column that raises the error, it is "object", which is exactly what my condition tests for. Is there something I am missing?


1 answer


Assuming your timezone is US/Eastern (based on your dataset) and that your DataFrame is named df, try this:

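from time import mktime
import pytz

# Localize each naive datetime to US/Eastern, convert it to UTC,
# then turn the UTC time tuple into seconds with mktime.
# Note: mktime interprets the tuple as local time, so the result
# depends on the timezone of the machine running this code.
df['Job Start Date'] = \
    df['Job Start Date'].apply(lambda x: mktime(pytz.timezone('US/Eastern').localize(x)
                                         .astimezone(pytz.UTC).timetuple()))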

>>> df['Job Start Date'].head()
0     993816000
1    1080824400
2    1052913600
3    1080824400
4    1075467600
Name: Job Start Date, dtype: float64

      



First you need to make your "naive" datetime values timezone-aware (in US/Eastern) and then convert them to UTC. Finally, pass the new UTC datetime object as a timetuple to the mktime function from the time module.
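
The same localize, convert, and count-seconds steps can also be sketched with pandas alone, which keeps the arithmetic in UTC so the result does not depend on the local timezone of the machine. This is only a sketch under assumptions: the helper name to_epoch_seconds and the sample dates are hypothetical, and the column is assumed to hold naive datetimes recorded in US/Eastern with no ambiguous or nonexistent wall-clock times (tz_localize raises on those by default).

```python
import pandas as pd

def to_epoch_seconds(naive, tz="US/Eastern"):
    """Convert a Series of naive datetimes in `tz` to Unix seconds (UTC)."""
    # Attach the source timezone, then express the same instants in UTC.
    utc = pd.to_datetime(naive).dt.tz_localize(tz).dt.tz_convert("UTC")
    # Whole seconds elapsed since the Unix epoch.
    epoch = pd.Timestamp("1970-01-01", tz="UTC")
    return (utc - epoch) // pd.Timedelta(seconds=1)

# Hypothetical sample data standing in for df['Job Start Date'].
dates = pd.Series(pd.to_datetime(["2001-06-29", "2004-04-01"]))
print(to_epoch_seconds(dates).tolist())
```

For the DataFrame in the question this would be used as df['Job Start Date'] = to_epoch_seconds(df['Job Start Date']), and the result is an integer column rather than float64.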
