Pandas dataframe column handling with mixed date formats

I imported a CSV file with mixed date formats - some date strings recognized by read_csv, plus some in Excel's serial datetime format (e.g. 41866.321).

After the import, the column's dtype is reported as object (since it holds mixed types), and the dates in both formats are stored as strings.

I would like to use the to_datetime method to convert the recognized string date formats to datetime in the dataframe column, leaving the unrecognized Excel-serial strings as they are so I can isolate and fix them offline. But I can't get it to work without iterating row by row, which is too slow.

Does anyone have a smarter way to solve this?

Update: after digging some more, I found this solution, using errors='coerce' to force the type conversion of the column and then identifying the resulting null values, which I can cross-reference against the original file. But if there is a better way to do it (like converting the unrecognized timestamps in place), please let me know.

df1['DateTime'] = pd.to_datetime(df1['Time_Date'], errors='coerce')
nulls = df1['Time_Date'][df1['DateTime'].isnull()]





1 answer


After digging a little more, I found this solution: use errors='coerce' to force the type conversion of the column, then identify the resulting null values, which I can cross-reference against the original file. But if there is a better way to do this (like converting the unrecognized timestamps in place), please let me know.



df1['DateTime'] = pd.to_datetime(df1['Time_Date'], errors='coerce')
nulls = df1['Time_Date'][df1['DateTime'].isnull()]
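If you do want the Excel serial values converted in place rather than fixed offline, a two-pass approach works: coerce the recognized strings first, then interpret whatever failed as Excel day numbers via to_datetime's unit/origin parameters. A minimal sketch, assuming the 'Time_Date'/'DateTime' column names from the question, sample values invented for illustration, and the Windows 1900 date system (Excel day 0 = 1899-12-30):

```python
import pandas as pd

# Hypothetical sample data: a recognized date string, an Excel serial, junk.
df1 = pd.DataFrame({'Time_Date': ['2014-08-15 07:42:00', '41866.321', 'not a date']})

# Pass 1: convert strings pandas recognizes; failures become NaT.
df1['DateTime'] = pd.to_datetime(df1['Time_Date'], errors='coerce')

# Pass 2: treat the remaining values as Excel serial day numbers.
# origin='1899-12-30' is day 0 in Excel's Windows 1900 date system.
mask = df1['DateTime'].isnull()
serials = pd.to_numeric(df1.loc[mask, 'Time_Date'], errors='coerce')
df1.loc[mask, 'DateTime'] = pd.to_datetime(serials, unit='D', origin='1899-12-30')

# Anything still NaT (e.g. 'not a date') genuinely failed both passes
# and can be cross-referenced against the original file.
```

Note that the second pass only touches rows the first pass could not parse, so recognized dates are never reinterpreted as serial numbers.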

      









