Converting str to bytes in Python 3.3

How can I get b'\xe3\x81\x82' from '\xe3\x81\x82'?

Ultimately, I want u'\u3042', which is the Japanese letter "あ".

b'\xe3\x81\x82'.decode('utf-8')

gives u'\u3042', but

'\xe3\x81\x82'.decode('utf-8')

raises the following error:

AttributeError: 'str' object has no attribute 'decode'

This is because b'\xe3\x81\x82' is a bytes object and '\xe3\x81\x82' is a str.

I have a database with data like '\xe3\x81\x82'.



1 answer


If you have bytes masquerading as Unicode code points, encode to Latin-1 first, then decode as UTF-8:

'\xe3\x81\x82'.encode('latin1').decode('utf-8')



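Latin-1 (ISO-8859-1) maps the Unicode code points U+0000 through U+00FF one-to-one to the byte values 0x00 through 0xFF, so encoding to it recovers the original bytes: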

>>> '\xe3\x81\x82'.encode('latin1').decode('utf-8')
'あ'
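
Since the question mentions a whole database of such values, here is a minimal sketch of how the same trick could be wrapped in a function. The name fix_mojibake and its real_encoding parameter are made up for illustration. It assumes every affected value is UTF-8 stored one byte per code point, and it returns the input unchanged when that assumption does not hold:

```python
def fix_mojibake(text, real_encoding="utf-8"):
    """Recover text whose encoded bytes were stored as individual code points.

    Hypothetical helper for illustration; assumes the data is really
    `real_encoding` bytes that ended up in a str one byte per character.
    """
    try:
        # Latin-1 turns code points U+0000..U+00FF back into bytes 0x00..0xFF.
        raw = text.encode("latin-1")
        # Reinterpret those bytes using the encoding they were written in.
        return raw.decode(real_encoding)
    except (UnicodeEncodeError, UnicodeDecodeError):
        # UnicodeEncodeError: the string has code points above U+00FF,
        # so it is not bytes in disguise.
        # UnicodeDecodeError: the bytes are not valid in real_encoding.
        # In both cases, leave the value as it is.
        return text


# The broken value from the question is repaired.
print(fix_mojibake('\xe3\x81\x82') == '\u3042')  # True
# A string that is already correct passes through untouched.
print(fix_mojibake('\u3042') == '\u3042')  # True
```

Be careful with the fallback: plain ASCII strings round-trip unchanged, but a string made of Latin-1 characters that happens to form valid UTF-8 would be rewritten too, so it is worth checking a sample of the data before running this over everything.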

      
