How can I read .ARC files from a Heritrix crawler using Python?
I looked at the Heritrix documentation website and they listed a Python .ARC file reader. However, when I clicked on it, it was not found. http://crawler.archive.org/articles/developer_manual/arcs.html
Does anyone else know of any Heritrix ARC reader that uses Python?
(I asked this question before, but closed it due to inaccuracy)
1 answer
A quick Google search turns up this: http://archive-access.cvs.sourceforge.net/viewvc/archive-access/archive-access/projects/hedaern/
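If that link goes stale as well, the ARC format is simple enough to read by hand. Here is a minimal sketch using only the standard library. It assumes the version 1 ARC layout (a header line of five space-separated fields: URL, IP address, archive date, content type, length, followed by that many bytes of content and a blank separator line), and that compressed files are plain gzip. The names `open_arc` and `iter_arc_records` are made up for this example and are not part of Heritrix or any existing reader.

```python
import gzip


def open_arc(path):
    """Open an ARC file for binary reading, transparently handling gzip.

    Assumption: compressed ARC files are ordinary (possibly multi-member)
    gzip streams, which Python's gzip module reads as one continuous stream.
    """
    with open(path, "rb") as probe:
        magic = probe.read(2)
    if magic == b"\x1f\x8b":  # gzip magic number
        return gzip.open(path, "rb")
    return open(path, "rb")


def iter_arc_records(stream):
    """Yield (header, body) pairs from a binary stream of ARC v1 records.

    Assumed header layout (v1): URL IP-address Archive-date Content-type Length
    The first record is normally the 'filedesc://' record describing the file.
    """
    while True:
        line = stream.readline()
        if not line:
            return  # end of file
        if not line.strip():
            continue  # blank line separating two records
        # Split from the right so a stray space in the URL does not break parsing.
        fields = line.decode("latin-1").rstrip("\r\n").rsplit(" ", 4)
        if len(fields) != 5:
            raise ValueError("unexpected ARC header line: %r" % line)
        url, ip, date, mime, length = fields
        header = {
            "url": url,
            "ip": ip,
            "date": date,
            "mime": mime,
            "length": int(length),
        }
        body = stream.read(header["length"])
        yield header, body


# Usage (the file name is a placeholder):
#
#     with open_arc("crawl.arc.gz") as stream:
#         for header, body in iter_arc_records(stream):
#             print(header["url"], header["mime"], len(body))
```

For HTTP captures the body normally starts with the raw response headers, so you would still need to split those off before getting at the page itself. Treat this as a starting point and check it against a real file from your crawl.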