Python: MySQLdb library encoding problem
I have a MySQL database with the charset set to utf8:
...
PRIMARY KEY (`username`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 |
...
I connect to the database from Python with MySQLdb:
conn = MySQLdb.connect(host="localhost",
                       passwd="12345",
                       db="db",
                       charset="utf8",
                       use_unicode=True)
When I run a query, the result looks as if it had been decoded as "windows-1254". Sample code:
curr = conn.cursor(MySQLdb.cursors.DictCursor)
select_query = 'SELECT * FROM users'
curr.execute(select_query)
for ret in curr.fetchall():
    username = ret["username"]
    print "repr-username; ", repr(username)
    print "username; ", username.encode("utf-8")
...
Output:
repr-username; u'\xc5\u0178\xc3\xbckr\xc3\xbc\xc3\xa7a\xc4\u0178l\xc3\xbcli'
username; ÅŸÃ¼krÃ¼Ã§aÄŸlÃ¼li
When I encode the username with "windows-1254" instead, it prints correctly:
...
print "repr-username; ", repr(username)
print "username; ", username.encode("windows-1254")
...
Output:
repr-username; u'\xc5\u0178\xc3\xbckr\xc3\xbc\xc3\xa7a\xc4\u0178l\xc3\xbcli'
username; şükrüçağlüli
When I try to use some other characters like the Cyrillic alphabet, the decoding changes dynamically. How can I prevent this?
1 answer
I think the values are being encoded incorrectly when they are INSERTed into the database.
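To illustrate what that kind of mis-encoding looks like, here is a minimal sketch using only the standard codecs. It assumes the UTF-8 bytes of the name (şükrüçağlüli) were at some point read as windows-1254 (Python codec cp1254), which is what the question reports; I cannot tell from the output where in the insert path that happened. The variable names are only for illustration.

```python
# Sketch only: reproduce the garbled value from the question with plain codecs.
# Assumption: correct UTF-8 bytes were misread as windows-1254 (cp1254).

# The intended name, written with escapes so the file stays ASCII-only.
original = u"\u015f\xfckr\xfc\xe7a\u011fl\xfcli"

# Step 1: the client produces correct UTF-8 bytes.
utf8_bytes = original.encode("utf-8")

# Step 2: something treats those bytes as cp1254 text.
mojibake = utf8_bytes.decode("cp1254")

# The repr shown in the question.
from_question = u"\xc5\u0178\xc3\xbckr\xc3\xbc\xc3\xa7a\xc4\u0178l\xc3\xbcli"
print(mojibake == from_question)

# Reversing step 2 and then step 1 gives the intended text back, which is
# why encoding with windows-1254 appeared to "work" in the question.
repaired = mojibake.encode("cp1254").decode("utf-8")
print(repaired == original)
```

If both comparisons print True, the stored value has most likely been through one round of this mis-decoding rather than being damaged at SELECT time.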
I recommend python-ftfy (from https://github.com/LuminosoInsight/python-ftfy ), which helped me solve a similar problem:
import ftfy
username = u'\xc5\u0178\xc3\xbckr\xc3\xbc\xc3\xa7a\xc4\u0178l\xc3\xbcli'
print ftfy.fix_text(username) # outputs şükrüçağlüli
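If installing ftfy is not an option, a rough standard-library equivalent for this one case might look like the sketch below. It assumes the damage is exactly one round of UTF-8 read as cp1254; undo_mojibake is a hypothetical helper name, not part of MySQLdb or ftfy, and it is far less thorough than ftfy.

```python
# Sketch only: undo one round of "UTF-8 bytes read as cp1254".
# undo_mojibake is a hypothetical helper, not a MySQLdb or ftfy function.

def undo_mojibake(text, wrong_codec="cp1254"):
    """Return the repaired text, or the input unchanged if it does not
    look like UTF-8 that was mis-decoded with wrong_codec."""
    try:
        return text.encode(wrong_codec).decode("utf-8")
    except UnicodeError:
        # Covers both UnicodeEncodeError (characters that do not exist in
        # wrong_codec, e.g. Cyrillic) and UnicodeDecodeError (the bytes are
        # not valid UTF-8, so the text was probably fine to begin with).
        return text


garbled = u"\xc5\u0178\xc3\xbckr\xc3\xbc\xc3\xa7a\xc4\u0178l\xc3\xbcli"
fixed = undo_mojibake(garbled)

# Already-correct text is returned untouched.
already_ok = undo_mojibake(u"\u015f\xfckr\xfc")
cyrillic = undo_mojibake(u"\u043f\u0440\u0438\u0432\u0435\u0442")
```

Either way this only repairs values after they are read; the rows in the table stay garbled until they are rewritten with the correct encoding.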