Converting str to bytes in Python 3.3

Question

How can I get b'\xe3\x81\x82' from '\xe3\x81\x82'?

Ultimately, I want u'\u3042', which is the Japanese character "あ".

b'\xe3\x81\x82'.decode('utf-8')

returns u'\u3042',

but

'\xe3\x81\x82'.decode('utf-8')

raises the following error:

AttributeError: 'str' object has no attribute 'decode'

because b'\xe3\x81\x82' is a bytes object and '\xe3\x81\x82' is a str.

My database contains data like '\xe3\x81\x82'.


papico, 1 Dec 2014 at 12:32

1 answer

Martijn Pieters · Answer 1 · 2014-12-01T12:36:37+0000

If you have bytes masquerading as Unicode code points, encode to Latin-1 first, then decode as UTF-8:

'\xe3\x81\x82'.encode('latin1').decode('utf-8')

Latin-1 (ISO-8859-1) maps the Unicode code points U+0000 through U+00FF one-to-one to the bytes 0x00 through 0xFF:

>>> '\xe3\x81\x82'.encode('latin1').decode('utf-8')
'あ'
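
For repeated use, the same round trip could be wrapped in a small helper. This is only a sketch: the name recover_utf8 is made up for illustration, and it assumes every character in the input is in the range U+0000 to U+00FF and that the underlying bytes are valid UTF-8.

```python
def recover_utf8(text: str) -> str:
    """Reinterpret a str whose code points are really UTF-8 byte values.

    recover_utf8 is a hypothetical helper name, not part of any library.
    """
    # Latin-1 turns each code point 0-255 back into the byte with the same value.
    raw = text.encode('latin1')
    # Decode the recovered bytes as the UTF-8 they were meant to be.
    return raw.decode('utf-8')


# Latin-1 round-trips every possible byte value unchanged.
all_bytes = bytes(range(256))
assert all_bytes.decode('latin1').encode('latin1') == all_bytes

result = recover_utf8('\xe3\x81\x82')
assert result == '\u3042'  # the character "あ"
```

If the input already contains real characters above U+00FF (for example 'あ' itself), the encode step raises UnicodeEncodeError, and if the recovered bytes are not valid UTF-8 the decode step raises UnicodeDecodeError, so it may be worth catching both when processing rows from a database.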