Using the csv module to read ASCII delimited text?

You may or may not know ASCII delimited text, which has the pleasant advantage of using non-keyboard characters to separate fields and lines.

Writing this is pretty simple:

import csv

with open('ascii_delim.adt', 'w') as f:
    writer = csv.writer(f, delimiter=chr(31), lineterminator=chr(30))
    writer.writerow(('Sir Lancelot of Camelot', 'To seek the Holy Grail', 'blue'))
    writer.writerow(('Sir Galahad of Camelot', 'I seek the Grail', 'blue... no yellow!'))

      

However, when reading, lineterminator does nothing, and if I try:

open('ascii_delim.adt', newline=chr(30))

it throws ValueError: illegal newline value:

So how can I read my ASCII delimited file? Am I reduced to doing line.split(chr(30))?

+3




4 answers


You can do it by effectively translating the file's end-of-line characters into the newline characters that csv.reader is hard-coded to recognize:

import csv

with open('ascii_delim.adt', 'w') as f:
    writer = csv.writer(f, delimiter=chr(31), lineterminator=chr(30))
    writer.writerow(('Sir Lancelot of Camelot', 'To seek the Holy Grail', 'blue'))
    writer.writerow(('Sir Galahad of Camelot', 'I seek the Grail', 'blue... no yellow!'))

def readlines(f, newline='\n'):
    while True:
        line = []
        while True:
            ch = f.read(1)
            if ch == '':  # end of file?
                return
            elif ch == newline:  # end of line?
                line.append('\n')
                break
            line.append(ch)
        yield ''.join(line)

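# Text mode, so f.read(1) returns str and '' marks end of file (Python 3);
# newline='' stops open() from translating any '\r' or '\n' in the data.
with open('ascii_delim.adt', newline='') as f:
    reader = csv.reader(readlines(f, newline=chr(30)), delimiter=chr(31))
    for row in reader:
        print(row)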

      



Output:

['Sir Lancelot of Camelot', 'To seek the Holy Grail', 'blue']
['Sir Galahad of Camelot', 'I seek the Grail', 'blue... no yellow!']
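If reading one character at a time turns out to be too slow, the same idea can be sketched with chunked reads. This is only an illustration, not part of the csv module: the names records and chunk_size are made up here, and it assumes the file is opened in text mode and that no field contains a raw '\r' or '\n'.

```python
import csv
import io

def records(f, terminator, chunk_size=4096):
    """Yield terminator-separated records from text file f (hypothetical helper)."""
    buf = ''
    while True:
        chunk = f.read(chunk_size)
        if not chunk:  # end of file
            break
        buf += chunk
        # Everything before the last terminator is complete; keep the rest buffered.
        *complete, buf = buf.split(terminator)
        yield from complete
    if buf:  # final record that has no trailing terminator
        yield buf

# Small in-memory demonstration; a real file would be opened with newline=''.
sample = io.StringIO('a' + chr(31) + 'b' + chr(30) + 'c' + chr(31) + 'd' + chr(30))
rows = list(csv.reader(records(sample, chr(30)), delimiter=chr(31)))
```

Because csv.reader accepts any iterable of strings, the generator can be passed to it directly, just like readlines above.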

      

+4




The documentation says:

The reader is hard-coded to recognise either '\r' or '\n' as end-of-line, and ignores lineterminator. This behavior may change in the future.

Thus, the csv module cannot, on its own, read CSV files that use custom line terminators.
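A quick way to see this for yourself is the small experiment below. It is just a sketch using made-up sample data held in an io.StringIO: the reader accepts the lineterminator argument but still treats the whole string as a single line.

```python
import csv
import io

# Two intended records, each ended by chr(30), with fields split by chr(31).
data = 'a' + chr(31) + 'b' + chr(30) + 'c' + chr(31) + 'd' + chr(30)

# lineterminator is accepted here but has no effect on reading.
reader = csv.reader(io.StringIO(data), delimiter=chr(31), lineterminator=chr(30))
rows = list(reader)
print(len(rows))  # a single row: chr(30) stays inside the field values
```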

+2




Hey, I've been struggling with a similar problem all day. I wrote a function heavily inspired by @martineau's answer that should solve it for you. My function is slower, but it can parse files delimited by any kind of string. Hope this helps!

import csv

def custom_CSV_reader(csv_file, row_delimiter, col_delimiter):
    """Return the rows of csv_file as lists of field strings.

    row_delimiter ends each row and col_delimiter separates the fields
    within a row; either one may be longer than a single character.
    """
    result = []
    row = []
    field = ''
    # Text mode so f.read(1) returns str (Python 3); newline='' leaves
    # any '\r' or '\n' in the data untouched.
    with open(csv_file, newline='') as f:
        while True:
            ch = f.read(1)
            if ch == '':  # end of file?
                break
            field += ch
            if field.endswith(col_delimiter):  # end of field?
                row.append(field[:-len(col_delimiter)])
                field = ''
            elif field.endswith(row_delimiter):  # end of row?
                row.append(field[:-len(row_delimiter)])
                result.append(row)
                row = []
                field = ''
    if field or row:  # keep a final row that has no trailing delimiter
        row.append(field)
        result.append(row)
    return result

      

0




Per the docs for open:

newline controls how universal newlines mode works (it only applies to text mode). It can be None, '', '\n', '\r', and '\r\n'.

so open won't handle your file. Per the csv docs:

Note: The reader is hard-coded to recognise either '\r' or '\n' as end-of-line, and ignores lineterminator.

so that won't do it either. I also looked to see whether str.splitlines was configurable, but it uses a fixed set of line boundaries.

"Am I reduced to doing line.split(chr(30))?"

It looks that way, sorry!
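For what it's worth, that fallback can still hand the field splitting to csv.reader, since the reader accepts any iterable of strings. A minimal sketch, assuming the file fits in memory and no field contains a raw '\r' or '\n'; read_adt, RS and US are names invented for this example:

```python
import csv

RS = chr(30)  # record separator: ends each row
US = chr(31)  # unit separator: sits between fields

def read_adt(path):
    """Return every row of an ASCII delimited file (hypothetical helper)."""
    with open(path, newline='') as f:
        data = f.read()
    # Skip the empty string left after the final record separator.
    records = [rec for rec in data.split(RS) if rec]
    return list(csv.reader(records, delimiter=US))
```

Calling read_adt('ascii_delim.adt') on the file written in the question should give back the two rows as lists of strings. Note that the filter also drops any record that is completely empty.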

-1

