Continuous analysis of CSV files that are updated by another process

I have a set of CSV files that are updated periodically, for example file1.csv, file2.csv and file3.csv.

The update process appends data after the last line of each CSV file.

It is possible to read the data from a CSV file and store it in an array or a collection (deque).

Is there a way to collect data from the CSV file as it is updated?



2 answers

You can use a Python package called Watchdog.

This example recursively watches the current directory for file system changes and logs them to the console:

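import logging
import time
from watchdog.observers import Observer
from watchdog.events import LoggingEventHandler

if __name__ == "__main__":
    # LoggingEventHandler logs at INFO level, so enable that level
    logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(message)s')
    event_handler = LoggingEventHandler()
    observer = Observer()
    observer.schedule(event_handler, path='.', recursive=True)
    observer.start()
    try:
        # Keep the main thread alive while the observer thread works
        while True:
            time.sleep(1)
    except KeyboardInterrupt:
        observer.stop()
    observer.join()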


You can use this in conjunction with Ignacio's answer: use file_pointer.tell() to get the current position in the file, then seek() to that position next time and read the rest of the file. For example:

# First time
with open('current.csv', 'r') as f:
    data = f.readlines()
    last_pos = f.tell()  # remember where we stopped reading

# Second time
with open('current.csv', 'r') as f:
    f.seek(last_pos)  # skip the part that was already read
    new_data = f.readlines()
    last_pos = f.tell()
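The same tell()/seek() idea can be wrapped in a small helper that also parses the new lines as CSV. This is only a sketch: the name read_new_rows is made up for illustration, and it assumes the files are UTF-8 and that no quoted field contains an embedded newline. It works on bytes so the offset is a plain byte count, and it stops at the last newline so a half-written final line is picked up on the next call instead.

```python
import csv
import io


def read_new_rows(path, last_pos=0):
    """Return (rows, new_pos) for complete CSV lines found after last_pos.

    Hypothetical helper, not part of any library.
    """
    # Binary mode keeps offsets as simple byte counts.
    with open(path, "rb") as f:
        f.seek(last_pos)
        chunk = f.read()

    # Consume only up to the last newline; a partially written
    # final line is left in the file for the next call.
    end = chunk.rfind(b"\n") + 1
    if end == 0:
        return [], last_pos

    text = chunk[:end].decode("utf-8")  # assumption: UTF-8 encoded files
    rows = list(csv.reader(io.StringIO(text, newline="")))
    return rows, last_pos + end
```

Each call returns the rows that are new since the previous call together with the updated offset, so the rows can be appended to a list or a collections.deque with extend() and the offset kept for the next call.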




Compare the current file size with the current offset in the file. If the size is larger, read the new data.
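A rough sketch of that size check for several files could look like the following. The function name poll_once and the offsets/buffers dictionaries are illustrative choices, not from the answer. It assumes the writer appends whole UTF-8 lines in one go, and it treats a file that got smaller as truncated and rereads it from the start.

```python
import os
from collections import deque


def poll_once(paths, offsets, buffers):
    """Check each file once and append any new lines to its deque.

    offsets maps path -> bytes already read; buffers maps path -> deque.
    Both are updated in place. Hypothetical helper for illustration.
    """
    for path in paths:
        size = os.path.getsize(path)
        offset = offsets.get(path, 0)
        if size < offset:
            # File shrank (truncated or replaced): start again from the top.
            offset = 0
        if size > offset:
            # Size is larger than the saved offset, so there is new data.
            with open(path, "rb") as f:
                f.seek(offset)
                data = f.read(size - offset)
            lines = data.decode("utf-8").splitlines()
            buffers.setdefault(path, deque()).extend(lines)
            offset = size
        offsets[path] = offset
```

Calling poll_once(['file1.csv', 'file2.csv', 'file3.csv'], offsets, buffers) in a loop with time.sleep() between calls gives a simple polling reader; passing a maxlen when creating the deques would keep only the most recent lines.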


