Django upload and parse CSV file with correct encoding

I am trying to upload and process a CSV file in my Django project, but I am getting an encoding error. The CSV file was generated on a Mac with Excel.

# Python 2, inside the Django view that handles the upload
reader = csv.reader(request.FILES['file'].read().splitlines(), delimiter=";")
if withheader:
    reader.next()  # skip the header row

data = [[field.decode('utf-8') for field in row] for row in reader]

      

With this code I get an error: http://puu.sh/1VmXc

If I use the latin-1 decoder, I get a different "error".

data = [[field.decode('latin-1') for field in row] for row in reader]

      

The result is: v¾gmontere, but it should be: vægmontere

Does anyone know what to do? I've tried a lot!



1 answer


  1. The Python 2 csv module comes with a lot of Unicode hassle. Try unicodecsv instead, or use Python 3.
  2. Excel on Mac exports CSV with a broken encoding. Don't use it; use something like LibreOffice instead, which has a much better CSV export with options.
  3. When handling user-supplied files: either make sure the files are consistently encoded as UTF-8 and only decode them as UTF-8 (recommended), or use an encoding detection library such as chardet.
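A minimal Python 3 sketch of points 1 and 3, using only the standard library. It decodes the whole upload once before handing it to csv, trying UTF-8 first and then a fallback. The function name parse_csv_bytes and the fallback list are hypothetical. Choosing mac_roman as the fallback is an assumption based on the symptom in the question: the byte 0xBE is "¾" in Latin-1 but "æ" in Mac Roman.

```python
import csv
import io


def parse_csv_bytes(raw, encodings=("utf-8-sig", "mac_roman"),
                    delimiter=";", skip_header=False):
    """Decode raw CSV bytes with the first candidate encoding that works,
    then parse the text into a list of rows (lists of str).

    The candidate encodings are an assumption, not something the file
    itself tells us; adjust them to the files you actually receive.
    """
    for encoding in encodings:
        try:
            text = raw.decode(encoding)
            break
        except UnicodeDecodeError:
            continue
    else:
        raise ValueError("none of the candidate encodings could decode the file")

    # newline="" leaves line endings untouched so csv can handle them itself
    reader = csv.reader(io.StringIO(text, newline=""), delimiter=delimiter)
    if skip_header:
        next(reader, None)  # skip the header row if there is one
    return list(reader)
```

In the view from the question this could be called as parse_csv_bytes(request.FILES['file'].read(), skip_header=withheader). Note that mac_roman maps every byte to some character, so the fallback never raises; it can silently produce wrong text for files in yet another encoding, which is why enforcing UTF-8 is the safer option.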

