Unicode escape won't work with some characters

I have a program where I want to use some Unicode characters, such as μ and a subscript p. When I do this

print u"\xb5"

      

it works fine, but when I do this,

print u"\u209A"

      

I am getting this error message:

Traceback (most recent call last):
  File "C:/Users/tech/Desktop/Circuit Design Tool/Test 2.py", line 1, in <module>
    print u"\u209A"
  File "C:\Python27\lib\encodings\cp1252.py", line 12, in encode
    return codecs.charmap_encode(input,errors,encoding_table)
UnicodeEncodeError: 'charmap' codec can't encode character u'\u209a' in position 0: character maps to <undefined>

      

Why is this happening? Are these correct Unicode escape sequences?



3 answers


The Windows console simply does not support Unicode for applications that use the I/O functions of the C standard library (as Python does).

In principle you can, as other comments suggest, change the code page to 65001 (and set the PYTHONIOENCODING environment variable to utf-8 to match). In practice, there are some long-standing bugs in the console host's support for this code page, so when you try to use it you can get doubled output or hangs. It is usually unusable.



The reliable way to get Unicode in and out of the Windows console (well, as reliable as it gets: the user still has to select a TrueType font to be able to see the characters) is to call the Win32 WriteConsoleW / ReadConsoleW functions directly instead of relying on the C stdlib. If you really need to do this, the win_unicode_console package does it for you.

(Generally the easier option is to ditch the Windows console and use some other environment, such as an IDE.)
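As a rough illustration of the win_unicode_console route, the sketch below enables the package only when it can. It assumes the package has been installed separately (for example with pip) and that its `enable()` entry point is used; the helper name `enable_unicode_console` is made up for this example.

```python
import sys


def enable_unicode_console():
    """Try to route console I/O through the wide-character Win32 API.

    Returns True if win_unicode_console was enabled, False otherwise.
    (Hypothetical helper; only the package itself comes from the answer.)
    """
    if sys.platform != "win32":
        return False  # Nothing to do outside Windows.
    try:
        import win_unicode_console  # third-party, assumed to be installed
    except ImportError:
        return False
    win_unicode_console.enable()
    return True


if enable_unicode_console():
    # Only attempt the problematic character once the console is patched.
    print(u"\u209A")
```

Whether the character is actually visible still depends on the console font, as noted above.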



This happens because your console's default encoding is cp1252, which cannot encode that character. You need an encoding that can represent it, e.g. utf-8.

Since my terminal's default encoding is utf-8, it prints correctly:

>>> print u"\u209A"

      

But if I use the cp1252 encoding, it throws an error like the one you got:



>>> u"\u209A".encode('cp1252')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.4/encodings/cp1252.py", line 12, in encode
    return codecs.charmap_encode(input,errors,encoding_table)
UnicodeEncodeError: 'charmap' codec can't encode character '\u209a' in position 0: character maps to <undefined>
>>> 
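The difference between the two characters from the question can be checked without touching the console at all. The following is a minimal sketch; `can_encode` is a hypothetical helper name, and the code is written to run on both Python 2.7 and Python 3.

```python
from __future__ import print_function


def can_encode(text, encoding):
    """Return True if every character in `text` is representable in `encoding`."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


# Compare the micro sign and the subscript p under both encodings.
for label, char in [("micro sign", u"\xb5"), ("subscript p", u"\u209A")]:
    for encoding in ("cp1252", "utf-8"):
        print(label, encoding, can_encode(char, encoding))
```

The micro sign has a slot in cp1252 and the subscript p does not, which is why only the second print statement in the question fails.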

      

You can change the console code page to UTF-8 using the following command on Windows:

chcp 65001

      

Alternatively, you can change it through the console's settings. See this question for more information: Unicode characters in the Windows command line - how?
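If changing the code page is not an option, one fallback is to degrade gracefully instead of crashing. This is not something the answers here propose; it is a sketch using the standard `backslashreplace` error handler, with a hypothetical helper named `console_safe`.

```python
from __future__ import print_function
import sys


def console_safe(text, encoding=None):
    """Return `text` with characters the target encoding lacks shown as escapes."""
    # Fall back to ASCII when stdout reports no encoding (e.g. piped output).
    encoding = encoding or getattr(sys.stdout, "encoding", None) or "ascii"
    # Unencodable characters become literal \uXXXX escapes instead of raising.
    return text.encode(encoding, "backslashreplace").decode(encoding)


# Uses whatever encoding the current stdout reports, so this never raises.
print(console_safe(u"\u209A"))
```

On a cp1252 console this prints the literal text \u209a rather than the subscript, so the output is less pretty but the program keeps running.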



To set the Windows command line to display UTF-8 strings, use the chcp command (for UTF-8, run chcp 65001):

chcp 65001

      

For other encodings and their respective code pages (cp), check here.
