How can I parse a gzip file in Python and then append to it, even if it doesn't exist yet?

Let's say you have a file that contains multiple values, one per line. You want to open this file, parse these values, do some processing, and later append new values to this file. If the file does not exist, it must be created, and the new values added to it after processing.

The usual way to do it would be something like this:

with open(path, 'a+') as data:
  data.seek(0)  # 'a+' may start at the end of the file; rewind before reading
  for line in data:
    values.update(line.strip())

  for line in magic_processing():
    print(line, file=data)

      

However, doing the same with gzip.open instead of open fails with the following message:

AttributeError: 'GzipFile' object has no attribute 'extrastart'

      

This seems to be due to this bug, which leads me to believe there must be a nicer or more standard way to do this.



1 answer


When you use open(file, 'a+'), the mode a+ means that you are appending as well:

  • a+

    Open for reading and appending (writing at the end of the file). The file is created if it does not exist. The initial file position for reading is at the beginning of the file, but output is always appended to the end of the file (on some Unix systems, regardless of the current seek position).
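As a rough illustration of the a+ behaviour described above, here is a minimal sketch for a plain (uncompressed) text file. The function name read_then_append is hypothetical. In CPython 3 the position usually starts at the end of the file in this mode, so the sketch rewinds explicitly before reading, which should work either way.

```python
def read_then_append(path, new_lines):
    """Return the lines already in `path`, then append `new_lines` to it.

    Hypothetical helper for illustration; the file is created if missing.
    """
    with open(path, "a+") as data:
        data.seek(0)  # rewind: the position may start at the end in 'a+' mode
        existing = [line.strip() for line in data]
        for line in new_lines:
            # In append mode, writes land at the end of the file.
            print(line, file=data)
    return existing
```

Calling it twice on the same path should return an empty list the first time and the previously written lines the second time.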


But you are trying to update lines that might be in the middle of the file, so I would suggest using r+ or w+ instead. Note that r+ requires the file to already exist, and w+ truncates it on open:

with open(path, 'r+') as data:
  for line in data:
    values.update(line.strip())

  for line in magic_processing():
    print(line, file=data)
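For the gzip case in the question, one possible workaround is to split the work into two passes, since the gzip module documents separate read, write, and append modes but no combined read/append mode. The sketch below is only an illustration under that assumption; the function name read_then_append_gzip is hypothetical. It reads the file in 'rt' mode if it exists, then reopens it in 'at' mode, which creates the file when it is missing.

```python
import gzip
import os


def read_then_append_gzip(path, new_lines):
    """Return the lines already in the gzip file, then append `new_lines`.

    Hypothetical helper for illustration, not a standard API.
    """
    existing = []
    if os.path.exists(path):
        # First pass: read the existing content as text.
        with gzip.open(path, "rt") as data:
            existing = [line.strip() for line in data]

    # Second pass: append. This adds a new gzip member to the file,
    # which the gzip module reads back as one continuous stream.
    with gzip.open(path, "at") as data:
        for line in new_lines:
            print(line, file=data)
    return existing
```

The trade-off is that the file is opened twice, and each append adds a separate compressed member, so many small appends may compress less well than a single rewrite.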

      
