Timezone offset sign changed by Python dateutil?

Does anyone know why Python's dateutil reverses the sign of the GMT offset when it parses a datetime string?

Apparently this is well-known behavior, not only of dateutil but of other parsing functions as well. But it yields the wrong datetime unless a preprocessing hack is applied:

from dateutil import parser

jsDT = 'Fri Jan 02 2015 03:04:05.678910 GMT-0800'
python_datetime = parser.parse(jsDT)
print(python_datetime)
# prints: 2015-01-02 03:04:05.678910+08:00

jsDT = 'Fri Jan 02 2015 03:04:05.678910 GMT-0800'
# Hack: swap the sign of the offset before handing the string to dateutil
if '-' in jsDT:
    jsDT = jsDT.replace('-','+')
elif '+' in jsDT:
    jsDT = jsDT.replace('+','-')
python_datetime = parser.parse(jsDT)
print(python_datetime)
# prints: 2015-01-02 03:04:05.678910-08:00

      

+3




2 answers


It seems dateutil uses POSIX-style signs here. This is not specific to Python; other software does this too. From the tz database:

# We use POSIX-style signs in the Zone names and the output abbreviations,
# even though this is the opposite of what many people expect.
# POSIX has positive signs west of Greenwich, but many people expect
# positive signs east of Greenwich.  For example, TZ='Etc/GMT+4' uses
# the abbreviation "GMT+4" and corresponds to 4 hours behind UT
# (i.e. west of Greenwich) even though many people would expect it to
# mean 4 hours ahead of UT (i.e. east of Greenwich).

      

The tz database is used almost everywhere.

Example:

$ TZ=Etc/GMT-8 date +%z
+0800

      

You are probably expecting a different time zone:

>>> from datetime import datetime
>>> import pytz
>>> pytz.timezone('America/Los_Angeles').localize(datetime(2015, 1, 2, 3, 4, 5, 678910), is_dst=None).strftime('%Y-%m-%d %H:%M:%S.%f %Z%z')
'2015-01-02 03:04:05.678910 PST-0800'

      

Note: PST rather than GMT.

However, dateutil uses POSIX-style signs even for the PST abbreviation:



>>> from dateutil.parser import parse
>>> str(parse('2015-01-02 03:04:05.678910 PST-0800'))
'2015-01-02 03:04:05.678910+08:00'

      

datetime.strptime() in Python 3 interprets it "correctly":

$ TZ=America/Los_Angeles python3                                               
...
>>> from datetime import datetime
>>> str(datetime.strptime('2015-01-02 03:04:05.678910 PST-0800', '%Y-%m-%d %H:%M:%S.%f %Z%z'))
'2015-01-02 03:04:05.678910-08:00'

      

Pay attention to the sign.
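For the JavaScript-style string from the question, the same standard-library approach might look like the sketch below. It assumes the input always carries the literal text GMT before the offset and uses English day and month names (the default C locale); the helper name parse_js_datetime is made up for illustration.

```python
from datetime import datetime, timedelta

# Hypothetical helper, not part of dateutil or the standard library.
# "GMT" is matched as literal text; %z then reads "-0800" as 8 hours
# behind UTC, i.e. without the POSIX sign reversal.
JS_FORMAT = '%a %b %d %Y %H:%M:%S.%f GMT%z'

def parse_js_datetime(text):
    return datetime.strptime(text, JS_FORMAT)

dt = parse_js_datetime('Fri Jan 02 2015 03:04:05.678910 GMT-0800')
print(dt)  # 2015-01-02 03:04:05.678910-08:00
```

Because the format string is explicit, no sign-swapping preprocessing is needed, at the cost of rejecting inputs that do not match the format exactly.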

Despite the confusion over POSIX-style signs, dateutil's behavior is unlikely to change. See the dateutil bug '"GMT+1" is parsed as "GMT-1"' and @Lennart Regebro's reply:

Parsing GMT+1 this way is actually part of the POSIX specification. It is therefore a feature, not a bug.

See how the TZ environment variable is defined in the POSIX specification; glibc uses a similar definition.
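The POSIX sign convention for TZ can be observed from Python itself. This is a minimal sketch, assuming a Unix system (time.tzset() is not available on Windows); the variable names are only illustrative.

```python
import os
import time

# POSIX-style value: "GMT+4" means local time + 4 hours = GMT,
# i.e. 4 hours *west* of Greenwich.
os.environ['TZ'] = 'GMT+4'
time.tzset()  # Unix only: re-read TZ from the environment

seconds_west = time.timezone  # seconds west of UTC (non-DST)
offset_text = time.strftime('%z', time.localtime(0))

print(seconds_west)  # 14400
print(offset_text)   # -0400
```

The %z output uses the sign most people expect (negative west of Greenwich), while the TZ value that produced it uses the opposite sign.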

It is not clear why dateutil uses the POSIX TZ syntax to interpret timezone information in a time string. The syntax is not exactly the same; for example, the POSIX syntax requires a colon in the UTC offset, hh[:mm[:ss]], which is missing from your input.

+4




This is from the source code of dateutil.parser.parse:

Check for something like GMT+3, or BRST+3. Notice that it doesn't mean "I am 3 hours after GMT", but "my time +3 is GMT". If found, we reverse the logic so that timezone parsing code will get it right.



And one more comment:

With something like GMT+3, the timezone is not GMT.
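To make the "my time +3 is GMT" reading concrete, here is a small sketch of the sign reversal. It is not dateutil's code; posix_style_offset is a hypothetical helper that handles only whole-hour offsets such as GMT+3.

```python
import re
from datetime import timedelta, timezone

# Hypothetical helper for illustration; this is not dateutil's code.
def posix_style_offset(text):
    """Read 'GMT+3' the POSIX way: local time + 3 hours is GMT."""
    match = re.fullmatch(r'([A-Za-z]{3,})([+-])(\d{1,2})', text)
    if match is None:
        raise ValueError('unsupported zone string: %r' % text)
    sign = 1 if match.group(2) == '+' else -1
    hours = int(match.group(3))
    # Reverse the sign: a positive POSIX offset lies west of Greenwich.
    return timezone(timedelta(hours=-sign * hours))

print(posix_style_offset('GMT+3'))   # UTC-03:00
print(posix_style_offset('BRST-8'))  # UTC+08:00
```

The zone name itself is discarded, which mirrors the second comment: once an offset is attached, the result is no longer GMT.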

+1

