Continuous analysis of CSV files that are updated by another process
I have a number of CSV files that are updated periodically, say file1.csv, file2.csv, and file3.csv. The update process appends data as a new last line of each CSV file.

Is it possible to read the data from a CSV file, keep it up to date, and store it in an array or a collection (deque)? In other words, is there a way to collect data from a CSV file as it is updated?
You can use a Python package called Watchdog.
This example recursively watches the current directory for file system changes and logs them to the console:
import logging
import time

from watchdog.observers import Observer
from watchdog.events import LoggingEventHandler

if __name__ == "__main__":
    # Without this, LoggingEventHandler's INFO-level messages are not printed.
    logging.basicConfig(level=logging.INFO,
                        format='%(asctime)s - %(message)s')
    event_handler = LoggingEventHandler()
    observer = Observer()
    observer.schedule(event_handler, path='.', recursive=True)
    observer.start()
    try:
        while True:
            time.sleep(1)
    except KeyboardInterrupt:
        observer.stop()
    observer.join()
You can use this in conjunction with Ignacio's answer: call tell() to record the current position in the file, then seek() to that position the next time you open it and read only what was added since. For example:
# First time: read everything and remember where we stopped.
with open('current.csv', 'r') as f:
    data = f.readlines()
    last_pos = f.tell()

# Second time: skip to the saved position and read only the new lines.
with open('current.csv', 'r') as f:
    f.seek(last_pos)
    new_data = f.readlines()
    last_pos = f.tell()
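
Putting the two pieces together, here is a minimal sketch that watches the current directory with Watchdog and appends newly written rows to a deque, which is what the question asks for. It assumes the updating process appends whole lines to files ending in .csv; the names TailHandler, rows, and positions are illustrative, not part of either library.

import csv
import time
from collections import deque

from watchdog.observers import Observer
from watchdog.events import FileSystemEventHandler

rows = deque(maxlen=1000)   # collected rows; oldest are dropped when full
positions = {}              # last read offset for each watched file

class TailHandler(FileSystemEventHandler):
    def on_modified(self, event):
        # Only react to changes to CSV files, not directories.
        if event.is_directory or not event.src_path.endswith('.csv'):
            return
        # Seek to where we stopped last time and parse only the new lines.
        last_pos = positions.get(event.src_path, 0)
        with open(event.src_path, 'r', newline='') as f:
            f.seek(last_pos)
            for row in csv.reader(f):
                rows.append(row)
            positions[event.src_path] = f.tell()

if __name__ == "__main__":
    observer = Observer()
    observer.schedule(TailHandler(), path='.', recursive=False)
    observer.start()
    try:
        while True:
            time.sleep(1)
    except KeyboardInterrupt:
        observer.stop()
    observer.join()

One caveat: if the writer is mid-append when the event fires, the last line read may be incomplete, so a more careful version would check that the read data ends with a newline before parsing it.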