Python: using the re module to parse an imported text file

def regexread():
    import re

    result = ''
    savefileagain = open('sliceeverfile3.txt','w')

    #text=open('emeverslicefile4.txt','r')
    text='09,11,14,34,44,10,11,  27886637,    0\n561, Tue, 5,Feb,2013, 06,25,31,40,45,06,07,  19070109,    0\n560, Fri, 1,Feb,2013, 05,21,34,37,38,01,06,  13063500,    0\n559, Tue,29,Jan,2013,'

    pattern='\d\d,\d\d,\d\d,\d\d,\d\d,\d\d,\d\d'
    #with open('emeverslicefile4.txt') as text:     
    f = re.findall(pattern,text)

    for item in f:
        print(item)

    savefileagain.write(item)
    #savefileagain.close()

      

The above function, as written, parses the text and prints sets of seven numbers. I have three problems.

  • First, when I use the "read" file, which contains exactly the same text as text = '09, ... etc, I get "TypeError: expected string or buffer", which I cannot solve even after reading some of the related posts.
  • Second, when I try to write the results to the "write" file, nothing is written.
  • Third, I am not sure how to get the same output in the file that I get with the print statement, which is three lines of seven numbers each; that is the result I want.

This is the first time I've used a regex, so please be gentle!

+3




2 answers


This should do the trick. Check the comments for an explanation of what I'm doing here =) Good luck.



import re
filename = 'sliceeverfile3.txt'
# Raw string so the backslashes reach the regex engine unchanged
pattern  = r'\d\d,\d\d,\d\d,\d\d,\d\d,\d\d,\d\d'
new_file = []

# Make sure file gets closed after being iterated
with open(filename, 'r') as f:
    # Read the file contents and generate a list with each line
    lines = f.readlines()

# Iterate each line
for line in lines:

    # Regex applied to each line
    match = re.search(pattern, line)
    if match:
        # Make sure to add \n to display correctly when we write it back
        new_line = match.group() + '\n'
        # print() already adds a newline, so print the bare match
        print(match.group())
        new_file.append(new_line)

# Opening with 'w' truncates the file, so this overwrites the original contents
with open(filename, 'w') as f:
    # actually write the lines
    f.writelines(new_file)
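
For comparison, here is a minimal sketch that stays closer to the function in the question and touches all three problems. It assumes the input and output are separate files; the function name extract_number_sets and its parameters are made up for illustration. The TypeError most likely comes from handing the file object itself to re.findall, which expects a string, so the sketch reads the contents first. The write happens inside the loop, and each match gets its own '\n' so the file looks like the printed output.

```python
import re

# Same pattern as in the question, written as a raw string.
PATTERN = r'\d\d,\d\d,\d\d,\d\d,\d\d,\d\d,\d\d'


def extract_number_sets(in_path, out_path):
    """Copy every seven-number group from in_path to out_path, one per line.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    # re.findall needs a string, so read the file's contents first;
    # passing the open file object itself raises a TypeError.
    with open(in_path, 'r') as src:
        text = src.read()

    matches = re.findall(PATTERN, text)

    # The with block closes (and flushes) the output file on exit.
    with open(out_path, 'w') as dst:
        for item in matches:
            # Write inside the loop, adding the newline that print() adds.
            dst.write(item + '\n')

    return matches
```

It could be called as extract_number_sets('emeverslicefile4.txt', 'sliceeverfile3.txt'), using the two file names from the question.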

      

+9




You seem to be on the right track ...

You will want to iterate over the file (see "How to iterate over a file in python") and apply the regex to each line. That link should really answer all three questions once you realize that the write call sits outside the loop, so the "item" it writes is at most the last match rather than each one.
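
A rough sketch of that line-by-line approach, assuming separate input and output files and at most one seven-number group per line (re.search returns only the first match on a line). The function name copy_matches_by_line and its parameters are hypothetical.

```python
import re

# Compile once, since the same pattern is applied to every line.
PATTERN = re.compile(r'\d\d,\d\d,\d\d,\d\d,\d\d,\d\d,\d\d')


def copy_matches_by_line(in_path, out_path):
    """Stream in_path line by line and write each match to out_path.

    Hypothetical helper: returns the number of matches written.
    """
    count = 0
    # Both files are closed automatically when the with block ends.
    with open(in_path, 'r') as src, open(out_path, 'w') as dst:
        for line in src:
            match = PATTERN.search(line)
            if match:
                # The write is inside the loop, so every match is saved.
                dst.write(match.group() + '\n')
                count += 1
    return count
```

Iterating over the file object avoids loading the whole file into memory, which only matters for large inputs; for a small file, reading everything at once works just as well.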

0

