Python decoding with errors="replace"
Using Python 2.7, I fetch some HTML from a site as byte strings and immediately decode it to unicode. Since I need to find out later where the decode errors occurred, I thought it would be better to use errors="replace" to avoid exceptions from non-ASCII characters:
linkname = curlinkname.decode("utf-8", errors="replace")
In most cases, this replaces the problem character with a placeholder. However, when I run the code, I still get an exception from this line for one specific character (ū):
UnicodeEncodeError: 'charmap' codec can't encode character u'\u016b' in position 1: character maps to <undefined>
What's happening?
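One thing worth noting is that the exception is a UnicodeEncodeError from the 'charmap' codec, not a UnicodeDecodeError. Here is a minimal sketch that reproduces the same error, assuming the decoded text is later encoded implicitly (for example by print on a console that uses cp1252). The byte string and the choice of cp1252 are assumptions for illustration, not taken from the real page:

```python
# -*- coding: utf-8 -*-
# Hypothetical input: UTF-8 bytes for U+016B, a space, then one invalid byte.
raw = b"\xc5\xab \xff"

# Decoding with errors="replace" turns the invalid byte into U+FFFD
# and does not raise.
text = raw.decode("utf-8", errors="replace")

# Encoding is a separate step. cp1252 (an assumed console encoding) has no
# mapping for U+016B, so a strict encode raises UnicodeEncodeError.
try:
    text.encode("cp1252")
    encode_failed = False
except UnicodeEncodeError:
    encode_failed = True

# Passing errors="replace" to the encode step as well avoids the exception;
# characters the target encoding cannot represent become "?".
console_bytes = text.encode("cp1252", errors="replace")
```

If this is what is going on, the errors="replace" argument on decode() is working as intended, and the failure would come from whatever encodes the result afterwards.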