Python: MySQLdb library encoding problem

I have a MySQL database whose tables use the utf8 charset:

...
  PRIMARY KEY  (`username`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8
...

      

I connect to the database from Python with MySQLdb:

import MySQLdb
import MySQLdb.cursors

# Ask the driver for a utf8 connection and unicode result values.
conn = MySQLdb.connect(host="localhost",
                       passwd="12345",
                       db="db",
                       charset="utf8",
                       use_unicode=True)

      

When I run a query, the result comes back as if the bytes had been decoded as "windows-1254". Example:

curr = conn.cursor(MySQLdb.cursors.DictCursor)
select_query = 'SELECT * FROM users'
curr.execute(select_query)

for ret in curr.fetchall():
    username = ret["username"]
    print "repr-username; ", repr(username)
    print "username; ", username.encode("utf-8")
...

      

Output:

repr-username;  u'\xc5\u0178\xc3\xbckr\xc3\xbc\xc3\xa7a\xc4\u0178l\xc3\xbcli'
username;  ÅŸÃ¼krÃ¼Ã§aÄŸlÃ¼li

      

When I encode the username as "windows-1254" instead, it prints correctly:

...
print "repr-username; ", repr(username)
print "username; ", username.encode("windows-1254")
...

      

Output:

repr-username;  u'\xc5\u0178\xc3\xbckr\xc3\xbc\xc3\xa7a\xc4\u0178l\xc3\xbcli'
username;  şükrüçağlüli

      

When the data contains other characters, such as Cyrillic, the encoding that works changes from case to case. How can I prevent this?

1 answer


I think the values were encoded incorrectly when they were INSERTed into the database, so the stored text is already corrupted before your SELECT reads it.
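To illustrate what that corruption probably looks like, here is a minimal sketch that reproduces it without a database. It assumes the inserting client sent UTF-8 bytes that were then interpreted as windows-1254 (cp1254); that is a guess based on the repr in the question, not something the question confirms. The variable names are only for illustration.

```python
# Illustrative only: reproduces the mojibake at the byte level, no database involved.
original = u'\u015f\xfckr\xfc\xe7a\u011fl\xfcli'   # the intended username

utf8_bytes = original.encode('utf-8')    # what the inserting client sent
# Assumption: those bytes were read as windows-1254 on the way in.
stored = utf8_bytes.decode('cp1254')     # what ends up in the column

# Compare with the repr shown in the question.
print(stored == u'\xc5\u0178\xc3\xbckr\xc3\xbc\xc3\xa7a\xc4\u0178l\xc3\xbcli')
```

If the comparison prints True, the stored value is the same double-decoded string as in the question, which would explain why encoding it back to windows-1254 happens to yield valid UTF-8 bytes.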

I recommend python-ftfy (https://github.com/LuminosoInsight/python-ftfy), which helped me solve a similar problem:



import ftfy

username = u'\xc5\u0178\xc3\xbckr\xc3\xbc\xc3\xa7a\xc4\u0178l\xc3\xbcli'
print ftfy.fix_text(username) # outputs şükrüçağlüli
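If installing ftfy is not an option, the same repair can be sketched with the standard library alone. The helper below, fix_mojibake, is hypothetical (it is not part of MySQLdb or ftfy) and assumes the text was UTF-8 that got mis-decoded as one known code page, cp1254 by default. It has none of ftfy's heuristics, so treat it as a rough fallback.

```python
def fix_mojibake(text, wrong_encoding='cp1254'):
    """Undo UTF-8 text that was mis-decoded as `wrong_encoding`.

    Hypothetical helper. Returns the input unchanged when the round
    trip fails, i.e. when the text does not look like this kind of
    mojibake.
    """
    try:
        return text.encode(wrong_encoding).decode('utf-8')
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text

broken = u'\xc5\u0178\xc3\xbckr\xc3\xbc\xc3\xa7a\xc4\u0178l\xc3\xbcli'
fixed = fix_mojibake(broken)
print(fixed == u'\u015f\xfckr\xfc\xe7a\u011fl\xfcli')

# Correctly stored Cyrillic cannot be encoded as cp1254, so it is left alone.
cyrillic = u'\u043f\u0440\u0438\u0432\u0435\u0442'
print(fix_mojibake(cyrillic) == cyrillic)
```

Both comparisons should print True. If some rows were corrupted through a different code page, wrong_encoding would have to change for those rows, which is where a heuristic tool such as ftfy is more convenient.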

      
