Reading a huge .txt file with Python
I have a problem reading a huge .txt file with Python. I need to read all ~500 million lines of a 33 GB .txt file one by one, but for some unknown reason my script stops at line 7446633 without raising any error. The script is as follows:
file = open("file.txt", "r")
i = 0
for line in file:
    i = i + 1
print i
file.close()
I tried the script on multiple machines, with both 32-bit and 64-bit versions of Python, but with no luck.
Does anyone know what the problem is?
1 answer
Try using the with statement:
with open("file.txt") as input_file:
    for line in input_file:
        process_line(line)
Also, you might consider processing the lines in parallel using Celery or something similar.
Later edit: if that doesn't work, try opening the file and reading the lines in batches.
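One way to read in batches, assuming the same file.txt, is itertools.islice, which pulls a fixed number of lines per call (the small stand-in file here is only so the sketch is runnable):

```python
from itertools import islice

# write a small stand-in file so the sketch is runnable
with open("file.txt", "w") as f:
    for i in range(10):
        f.write("line %d\n" % i)

batch_size = 4
line_count = 0
with open("file.txt") as input_file:
    while True:
        # islice reads at most batch_size lines without loading the rest
        batch = list(islice(input_file, batch_size))
        if not batch:
            break
        # process the whole batch here
        line_count += len(batch)
```

Reading in batches like this can also help narrow down where the script stalls, since you can log progress after each batch.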