How to remove parts of a file in Python?

I have a file named a.txt that looks like this:

I am the first line
I am the second. There may be more lines here.

I am below a blank line.
I am the line.
There are more lines here.

Now I want to delete the content above the empty line (including the empty line itself). How can I do this in a Pythonic way?

0




5 answers


Basically, you cannot delete stuff from the beginning of the file, so you have to write to a new file.

I think pythonic looks like this:

# get an iterator over the lines in the file:
with open("input.txt", 'rt') as lines:
    # while the line is not empty drop it
    for line in lines:
        if not line.strip():
            break

    # now lines is at the point after the first paragraph
    # so write out everything from here
    with open("output.txt", 'wt') as out:
        out.writelines(lines)

      

Here are some simpler versions of this, without the "with" statement, for older Python versions:



lines = open("input.txt", 'rt')
for line in lines:
    if not line.strip():
        break
out = open("output.txt", 'wt')
out.writelines(lines)
# without "with", close both files explicitly
out.close()
lines.close()

      

and a very straightforward version that just splits the file at the first empty line:

# first, read everything from the old file
text = open("input.txt", 'rt').read()

# split it at the first empty line ("\n\n")
first, rest = text.split('\n\n',1)

# make a new file and write the rest
open("output.txt", 'wt').write(rest)

      

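Note that this can be quite fragile. For example, Windows files often use \r\n as the line ending, so if the text is read without newline translation an empty line is \r\n\r\n and the split misses it. The unpacking also raises a ValueError if the file contains no empty line at all. But often you know that the file format uses only one kind of line ending, so this may be good enough.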
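
If the line endings are not known in advance, one possible way around the fragility is to split with a regular expression instead of a literal '\n\n'. This is only a sketch: the names BLANK_LINE and text_after_first_blank_line are made up for illustration, and returning an empty string when there is no blank line is an assumption about what you would want, not something the question specifies.

```python
import re

# A blank line here means: a line break, optional spaces/tabs, another line break.
# Both "\n" and "\r\n" line endings are accepted.
BLANK_LINE = re.compile(r"\r?\n[ \t]*\r?\n")


def text_after_first_blank_line(text):
    """Return everything after the first blank line, or "" if there is none."""
    parts = BLANK_LINE.split(text, maxsplit=1)
    # Assumption: with no blank line there is nothing "after" it, so return "".
    return parts[1] if len(parts) == 2 else ""
```

Like the split('\n\n', 1) version, this does not treat a blank line at the very start of the text as a separator, because the pattern needs a line break in front of it.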

+3




Naive approach, iterating over the file line by line from top to bottom:



#!/usr/bin/env python

with open("4692065.txt", 'r') as src, open("4692065.cut.txt", "w") as dest:
    keep = False
    for line in src:
        if keep: dest.write(line)
        if line.strip() == '': keep = True
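
If the trimmed content should end up under the original file name, the same loop could write to a temporary file that then replaces the original. The following is a rough sketch under that assumption; drop_first_paragraph is a hypothetical helper name, and the new file will not necessarily keep the permissions of the old one.

```python
import os
import tempfile


def drop_first_paragraph(path):
    """Rewrite `path` so it holds only the lines after its first blank line."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file next to the original so that os.replace
    # does not have to move data across filesystems.
    fd, tmp_path = tempfile.mkstemp(dir=directory, text=True)
    try:
        with os.fdopen(fd, "w") as dest, open(path, "r") as src:
            keep = False
            for line in src:
                if keep:
                    dest.write(line)
                elif not line.strip():
                    # first blank line found: keep everything after it
                    keep = True
        # Both files are closed here; swap the new file in for the old one.
        os.replace(tmp_path, path)
    except BaseException:
        # Something went wrong: leave the original alone, discard the temp file.
        os.remove(tmp_path)
        raise
```

Writing to a separate file first means the original is only touched once the new content is complete.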

      

+2




The fileinput module (from the standard library) is handy for this kind of thing. It sets everything up so you can act as if you were editing a file "in place":

import fileinput
import sys

fileobj=iter(fileinput.input(['a.txt'], inplace=True))
# iterate through the file until you find an empty line.
for line in fileobj:
    if not line.strip():
        break
# Iterators (like `fileobj`) pick up where they left off. 
# Starting a new for-loop saves you one `if` statement and boolean variable.
for line in fileobj:
    sys.stdout.write(line)
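
Since this rewrites a.txt itself, it may be worth keeping a copy of the original. As a sketch (assuming Python 3, where fileinput.input can be used as a context manager), the backup argument asks fileinput to keep the old content under a suffixed name; the function name and the ".bak" suffix are just illustrative choices.

```python
import fileinput
import sys


def drop_first_paragraph_inplace(path, backup_suffix=".bak"):
    # With inplace=True, fileinput moves the original to a backup file and
    # redirects sys.stdout into a new file with the original name.
    # Passing backup=... keeps that backup instead of deleting it at the end.
    with fileinput.input(files=[path], inplace=True, backup=backup_suffix) as lines:
        # skip everything up to and including the first blank line
        for line in lines:
            if not line.strip():
                break
        # the iterator resumes after the blank line; copy the rest
        for line in lines:
            sys.stdout.write(line)
```

After calling drop_first_paragraph_inplace("a.txt"), the original content should still be available in a.txt.bak.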

      

+1




Any idea how big the file will be?

You can read the file into memory:

f = open('your_file', 'r')
lines = f.readlines()

      

which reads the whole file and stores its lines in a list.

Then close the file and open it again with 'w':

f.close()
f = open('your_file', 'w')
for line in lines:
    if your_if_here:
        f.write(line)

      

This overwrites the current file, and you can choose which lines from the list to write back. It is probably not a good idea if the file gets large, since the entire file must fit in memory, but it does mean there is no need to create a second file for the output.
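
For the question as asked, the your_if_here placeholder could be replaced by slicing the list at the first blank line. This is a sketch only: the function name is hypothetical, and leaving the file untouched when it has no blank line is an assumption on my part.

```python
def drop_first_paragraph_in_memory(path):
    """Keep only the lines after the first blank line; return True if rewritten."""
    with open(path, "r") as f:
        lines = f.readlines()

    # Index of the first blank (whitespace-only) line, or None if there is none.
    first_blank = next(
        (i for i, line in enumerate(lines) if not line.strip()), None
    )
    if first_blank is None:
        # Assumption: no blank line means there is nothing to cut.
        return False

    # Reopening with "w" truncates the file, then the kept lines are written back.
    with open(path, "w") as f:
        f.writelines(lines[first_blank + 1:])
    return True
```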

0




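import sys
from itertools import dropwhile, islice

def content_after_emptyline(file_object):
    # drop the lines before the first empty line, then skip the empty line itself
    return islice(dropwhile(lambda line: line.strip(), file_object), 1, None)

with open("filename") as f:
    for line in content_after_emptyline(f):
        # each line already ends with a newline, so write it unchanged
        sys.stdout.write(line)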
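
To see what the two itertools calls contribute, here is a small demonstration on in-memory text, using io.StringIO as a stand-in for a real file (the sample strings are made up). It also shows that input without any empty line yields nothing.

```python
import io
from itertools import dropwhile, islice


def content_after_emptyline(file_object):
    # dropwhile skips the leading non-blank lines; islice(..., 1, None)
    # then skips the blank line itself.
    return islice(dropwhile(lambda line: line.strip(), file_object), 1, None)


sample = io.StringIO("first\nsecond\n\nthird\nfourth\n")
print(list(content_after_emptyline(sample)))
# ['third\n', 'fourth\n']

no_blank = io.StringIO("first\nsecond\n")
print(list(content_after_emptyline(no_blank)))
# []
```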

      

0








