32-bit Unicode in Python

Python has the escape sequence \u for writing Unicode characters. However, it is limited to 16-bit Unicode values:

>>> '\u1020'
'ဠ'

      

Whereas 32-bit values do not work:

>>> '\u00001000'
'\x001000'

      

This is obviously wrong. The Python documentation says:

The \u0020 escape sequence indicates the insertion of a Unicode character with ordinal value 0x0020 (space character) at this position.



1 answer


The Python Unicode HOWTO clearly mentions using '\U' to write 32-bit Unicode escape sequences.

>>> "\u0394"                          # Using a 16-bit hex value
'Δ'
>>> "\U00000394"                      # Using a 32-bit hex value
'Δ'

      



In your case:

>>> '\U00001000'
'က'
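
To see why the lowercase form behaves the way it does, here is a small sketch, assuming Python 3 (where string literals are Unicode). The variable names are only illustrative. As far as I can tell, \u always consumes exactly four hex digits and \U exactly eight, so the extra digits after \u0000 are kept as ordinary text:

```python
# "\u" consumes exactly four hex digits, so the rest stays literal text.
short = "\u00001000"
assert len(short) == 5                       # NUL character followed by "1000"
assert short[0] == "\x00" and short[1:] == "1000"

# "\U" consumes exactly eight hex digits and yields a single character.
full = "\U00001000"
assert len(full) == 1
assert full == "\u1000" == chr(0x1000)

# Code points above U+FFFF cannot be written with "\u"; use "\U" or chr().
astral = "\U0001F600"
assert len(astral) == 1
assert ord(astral) == 0x1F600
```

If the code point is only known at run time, chr() is probably the simpler choice, since escape sequences are resolved when the literal is parsed.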

      
