Adding timezone to time in pandas dataframe
I have a column of times in seconds since the epoch. The times are in UTC, but pandas doesn't know that, and I would like to add this information.
df_data['transaction_creation_date']
0 1484161304
1 1489489785
2 1489161124
3 1488904824
4 1484908677
5 1485942900
6 1490854506
7 1485895432
8 1485975392
9 1489266328
10 1488525196
11 1490363033
12 1490617794
13 1486560642
14 1487170224
15 1484923852
So, I am doing something like this:
df_times = pd.DatetimeIndex(pd.to_datetime(df_data['transaction_creation_date'], unit='s'))
df_times = df_times.tz_localize(pytz.utc)
When I print the timestamps stored in df_times, I get:
print(df_times.strftime('%s'))
['1484157704' '1489486185' '1489157524' ..., '1490684098' '1490284646'
'1489602636']
So my UTC time in row 0 is 1484161304, and after I added the timezone information it changed to 1484157704.
My time zone is "Europe/Warsaw", and the difference between my time zone and UTC is 3600 seconds, which matches 1484161304 - 1484157704 = 3600.
So it looks like pandas treated my UTC times as "Europe/Warsaw" and shifted them back one hour to make them UTC, which messed up my data.
How do I set the UTC time zone for my time to prevent this from happening?
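The 3600-second shift can be reproduced with the standard library alone. This is only a sketch of one plausible explanation: it assumes that `%s` is passed through to the platform's C library, which reads the wall-clock fields as local time and ignores the attached timezone. It also hard-codes a fixed UTC+1 offset in place of the real Europe/Warsaw rules.

```python
from datetime import datetime, timedelta, timezone

epoch_seconds = 1484161304  # row 0 from the question

# The true instant, as a timezone-aware UTC datetime
utc_dt = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)

# Assumption: Europe/Warsaw is at UTC+1 (CET) in January 2017
cet = timezone(timedelta(hours=1))

# Keep the same wall-clock fields but relabel them as CET,
# which is what a local-time interpretation would effectively do
misread = utc_dt.replace(tzinfo=cet)

print(int(utc_dt.timestamp()))   # the original value
print(int(misread.timestamp()))  # one hour (3600 s) earlier
```

If this is what happens, the stored timestamps are still correct and only the `%s` rendering is off.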
I was unable to reproduce your results, but I used a slightly different method to display the resulting timestamps. Instead of the %s format code, which is platform-specific and not well supported, I calculated the number of seconds since the UTC epoch directly:
Code:
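# Timezone-aware reference point: midnight UTC on 1970-01-01
utc_at_epoch = pytz.utc.localize(dt.datetime(1970, 1, 1))
for t in df_times.tz_localize(pytz.utc):
    # Elapsed seconds between each timestamp and the epoch
    print(int((t - utc_at_epoch).total_seconds()))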
Test code:
import pandas as pd
import datetime as dt
import pytz
df_data = pd.DataFrame([
    1484161304,
    1489489785,
    1489161124,
], columns=['transaction_creation_date'])
print(df_data)
# Parse the epoch seconds into naive datetimes
df_times = pd.DatetimeIndex(pd.to_datetime(
    df_data['transaction_creation_date'], unit='s'))
# Mark the naive datetimes as UTC, then convert back to epoch seconds
utc_at_epoch = pytz.utc.localize(dt.datetime(1970, 1, 1))
for t in df_times.tz_localize(pytz.utc):
    print(int((t - utc_at_epoch).total_seconds()))
Results:
transaction_creation_date
0 1484161304
1 1489489785
2 1489161124
1484161304
1489489785
1489161124
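As a possible shortcut, pandas can attach UTC while parsing, and epoch seconds can be recovered without `strftime`. The sketch below assumes a reasonably recent pandas; `to_epoch_seconds` is a hypothetical helper name introduced only for illustration.

```python
import pandas as pd

df_data = pd.DataFrame(
    [1484161304, 1489489785, 1489161124],
    columns=['transaction_creation_date'],
)

# utc=True makes the parsed values timezone-aware (UTC) in one step
utc_times = pd.to_datetime(
    df_data['transaction_creation_date'], unit='s', utc=True)

# Converting to another zone changes the displayed wall-clock time,
# not the underlying instant
warsaw_times = utc_times.dt.tz_convert('Europe/Warsaw')

EPOCH = pd.Timestamp('1970-01-01', tz='UTC')


def to_epoch_seconds(times):
    """Hypothetical helper: whole seconds since the UTC epoch
    for a timezone-aware Series."""
    return (times.dt.tz_convert('UTC') - EPOCH) // pd.Timedelta('1s')


print(to_epoch_seconds(utc_times).tolist())
print(to_epoch_seconds(warsaw_times).tolist())
```

Both calls should print the same seconds as the input column, since the time zone only affects how each instant is displayed.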