Read a huge .txt file with Python

I have a problem reading a huge text file with Python. I need to read all ~500M lines of a 33 GB .txt file one by one, but for some unknown reason my script stops at line 7446633 without raising any error. The script is as follows:

# Count the lines of the file by iterating over it one line at a time.
f = open("file.txt", "r")
i = 0
for line in f:
    i = i + 1
print(i)  # number of lines actually read
f.close()

      

I tried the script on multiple machines and with both 32-bit and 64-bit versions of Python, but with no luck.

Does anyone know what the problem is?



1 answer


Try using the "with" statement, which also closes the file automatically when the block ends.

with open("file.txt") as input_file:
    for line in input_file:
        process_line(line)

      



You might also think about processing the lines in parallel using Celery or something similar.

Later edit: if that doesn't work, try opening the file and reading the lines in batches, using a range to select each batch of lines.
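
A minimal sketch of the batch idea, assuming itertools.islice is an acceptable way to take a fixed number of lines at a time. The names read_in_batches and count_lines and the default batch size are hypothetical, chosen only for illustration. Nothing here is confirmed to fix the early stop described in the question; it only shows one way to read in batches without loading the whole file into memory.

```python
from itertools import islice


def read_in_batches(path, batch_size=100000):
    """Yield lists of at most batch_size lines from the file at path.

    Hypothetical helper: the name and the default batch size are assumptions.
    """
    with open(path) as input_file:
        while True:
            # islice pulls at most batch_size lines from the open file.
            batch = list(islice(input_file, batch_size))
            if not batch:
                break  # an empty batch means the file is exhausted
            yield batch


def count_lines(path, batch_size=100000):
    """Count the lines of a file by summing the sizes of its batches."""
    total = 0
    for batch in read_in_batches(path, batch_size):
        total += len(batch)
    return total
```

Calling count_lines("file.txt") should give the same count as the loop in the question. If the two counts differ, that at least narrows down where the reading stops.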
