Python loop exception

Hey, I'm new to Python and I need help. I wrote the following code:

try:
  it = iter(cmLines)
  line=it.next()
  while (line):
    if ("INFERNAL1/a" in line) or ("HMMER3/f" in line) :
      title = line
      line = it.next()
      if word2(line) in namesList:  # if the second word in the line is in the list
        output.write(title)
        output.write(line)
        line = it.next()
        while ("//" not in line):
          output.write(line)
          line = it.next()
        output.write(line)
    line = it.next()
except Exception as e:
  print "Loop exited because:"
  print type(e)
  print "at " + line
finally:
  output.close()

      

  • When the loop ends, it always throws an exception that notifies that the loop has stopped. It didn't end prematurely though. How do you stop this?

  • Is there a better way to write my code? Something more stylish. I have a large file with a lot of information and I am trying to catch only the information I need. Each piece of information has the format:

    Infernal1/a ...
    Name someSpecificName
    ...
    ...
    ...
    ...
    // 
    
          

Thanks



5 answers


RocketDonkey's answer is spot-on. Because of the way you are iterating, there is no easy way to do this with a for loop, so you need to handle StopIteration explicitly.

However, if you rethink the problem a bit, there are other ways to get around this. For example, a trivial state machine:

try:
    state = 0
    for line in cmLines:
        if state == 0:
            if "INFERNAL1/a" in line or "HMMER3/f" in line:
                title = line
                state = 1
        elif state == 1:
            if word2(line) in namesList:
                output.write(title)
                output.write(line)
                state = 2
            else:
                state = 0
        elif state == 2:
            output.write(line)
            if '//' in line:
                state = 0
except Exception as e:
    print "Loop exited because:"
    print type(e)
    print "at " + line
finally:
    output.close()

      

Alternatively, you can write a generator function that delegates to subgenerators (via yield from foo() if you're on Python 3.3 or later, or via for x in foo(): yield x if not), or try various other possibilities, especially if you rethink your problem at a higher level.
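As a rough sketch of the generator idea (not a drop-in replacement), the outer generator below scans for headers and hands the rest of each wanted block to a subgenerator. The names wanted_lines, read_block and second_word are made up for this example; second_word stands in for word2, whose definition is not shown in the question.

```python
HEADERS = ("INFERNAL1/a", "HMMER3/f")


def second_word(line):
    """Hypothetical stand-in for the question's word2()."""
    parts = line.split()
    return parts[1] if len(parts) > 1 else None


def read_block(lines, first):
    """Subgenerator: yield `first`, then lines up to and including the '//' line."""
    yield first
    for line in lines:
        yield line
        if "//" in line:
            return


def wanted_lines(lines, names):
    """Yield every line of each block whose name (second word) is in `names`."""
    it = iter(lines)
    for title in it:
        if not any(header in title for header in HEADERS):
            continue
        name_line = next(it, None)  # a default avoids StopIteration at end of input
        if name_line is None:
            return
        if second_word(name_line) in names:
            yield title
            # On Python 3.3+ this loop could be: yield from read_block(it, name_line)
            for line in read_block(it, name_line):
                yield line
```

With this in place, the writing side would shrink to a plain for loop over wanted_lines(cmLines, namesList) that calls output.write(line).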

This may not be what you want to do here, but it is usually worth at least asking "Can I turn this while loop and two explicit next calls into a for loop?", even if the answer turns out to be "No, not without making things less readable."

As a side note, you can probably simplify things by replacing the try/finally with a with statement. Instead of this:



output = open('foo', 'w')
try:
    blah blah
finally:
    output.close()

      

You can simply do this:

with open('foo', 'w') as output:
    blah blah

      

Or, if output is not a normal file, you can still replace the last four lines with:

import contextlib

with contextlib.closing(output):
    blah blah
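To make "not a normal file" concrete, here is a small sketch. LineCollector is a hypothetical class invented for this example; the only thing contextlib.closing needs from it is a close() method.

```python
import contextlib


class LineCollector(object):
    """Hypothetical non-file output: it only offers write() and close()."""

    def __init__(self):
        self.lines = []
        self.closed = False

    def write(self, text):
        self.lines.append(text)

    def close(self):
        self.closed = True


output = LineCollector()
with contextlib.closing(output):
    output.write("Name someSpecificName\n")
# close() has been called by this point, even if the body had raised.
```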

      



When you call line = it.next() and there is nothing left, it raises StopIteration:

>>> l = [1, 2, 3]
>>> i = iter(l)
>>> i.next()
1
>>> i.next()
2
>>> i.next()
3
>>> i.next()
Traceback (most recent call last):
  File "<ipython-input-6-e590fe0d22f8>", line 1, in <module>
    i.next()
StopIteration

      



This will happen in your code every time because you call it at the end of your block, so the exception is raised before the loop has a chance to go back around and find that line is empty. As a band-aid fix, you can do something like this, where you catch the StopIteration exception and pass out of it (since it indicates that the iteration is done):

# Your code...
except StopIteration:
    pass
except Exception as e:
  print "Loop exited because:"
  print type(e)
  print "at " + line
finally:
  output.close()

      



I like Parser Combinators as they lead to a much more declarative programming style.

For example, with the Parcon library:

from string import letters, digits
from parcon import (Word, Except, Exact, OneOrMore,
                    CharNotIn, Literal, End, concat)

alphanum = letters + digits

UntilNewline = Exact(OneOrMore(CharNotIn('\n')) + '\n')[concat]
Heading1 = Word(alphanum + '/')
Heading2 = Word(alphanum + '.')
Name = 'Name' + UntilNewline
Line = Except(UntilNewline, Literal('//'))
Lines = OneOrMore(Line)
Block = Heading1['hleft'] + Heading2['hright'] + Name['name'] + Lines['lines'] + '//'
Blocks = OneOrMore(Block[dict]) + End()

      

And then, using Alex Martelli's Bunch class:

class Bunch(object):
    def __init__(self, **kwds):
        self.__dict__.update(kwds)

names = 'John', 'Jane'
for block in Blocks.parse_string(config):
    b = Bunch(**block)
    if b.name in names and b.hleft.upper() in ("INFERNAL1/A", "HMMER3/F"):
        print ' '.join((b.hleft, b.hright))
        print 'Name', b.name
        print '\n'.join(b.lines)

      

Given this file:

Infernal1/a ...
Name John
...
...
...
...
//
SomeHeader/a ...
Name Jane
...
...
...
...
//
HMMER3/f ...
Name Jane
...
...
...
...
//
Infernal1/a ...
Name Billy Bob
...
...
...
...
//

      

The result is:

Infernal1/a ...
Name John
...
...
...
...
HMMER3/f ...
Name Jane
...
...
...
...

      



1 / Exception handling

To avoid handling the StopIteration exception, you should look at the Pythonic way of handling sequences (as Abartern mentioned):

it = iter(cmLines)
for line in it:
    pass  # process the line here

      

2 / Capturing the information

Alternatively, you can try to capture your information pattern with regular expressions. You know the exact expression for the first line. Then you want to capture the name and compare it against a list of valid names. Finally, you look for the next //. You can build a regex that spans these lines and use a group to capture the name you want to check:

(...)

Matches whatever regular expression is inside the parentheses, and indicates the start and end of a group; the contents of a group can be retrieved after a match has been performed, and can be matched later in the string with the \number special sequence, described below. To match the literals '(' or ')', use \( or \), or enclose them inside a character class: [(], [)].

Here is an example of using regex groups, taken from the Python docs:

>>> import re
>>> m = re.match(r"(\w+) (\w+)", "Isaac Newton, physicist")
>>> m.group(0)       # The entire match
'Isaac Newton'
>>> m.group(1)       # The first parenthesized subgroup.
'Isaac'
>>> m.group(2)       # The second parenthesized subgroup.
'Newton'
>>> m.group(1, 2)    # Multiple arguments give us a tuple.
('Isaac', 'Newton')
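Applied to the block format from the question, the group idea might look roughly like the sketch below. It rests on assumptions that are not in the question: the whole file has been read into one string, each header starts its line, and the terminating // starts its line. BLOCK and wanted_blocks are names made up for this example.

```python
import re

# One match per block; group 1 captures the second word of the block's second line.
BLOCK = re.compile(
    r"^(?:INFERNAL1/a|HMMER3/f)[^\n]*\n"  # header line
    r"\S+[ \t]+(\S+)[^\n]*\n"             # name line: first word, then the captured name
    r".*?"                                # body lines, as few as possible
    r"^//[^\n]*\n?",                      # terminator line
    re.MULTILINE | re.DOTALL,
)


def wanted_blocks(text, names):
    """Return the full text of every block whose captured name is in `names`."""
    return [m.group(0) for m in BLOCK.finditer(text) if m.group(1) in names]
```

Each returned string could then be passed to output.write as-is. For a very large file, reading everything into memory first may not be acceptable, in which case a line-by-line approach is probably the better fit.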

      

More on regex.

Link

Iterator next() raising an exception in Python: https://softwareengineering.stackexchange.com/questions/112463/why-do-iterators-in-python-raise-an-exception



You can explicitly ignore StopIteration:

try:
    # parse file
    it = iter(cmLines)
    for line in it:
        # here `line = next(it)` might raise StopIteration
        pass
except StopIteration:
    pass
except Exception as e:
    # handle exception
    pass

      

Or call line = next(it, None) and check for None.
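A minimal sketch of that second option, keeping the shape of the loop from the question: every it.next() becomes next(it, None), and the loops stop on None instead of on an exception. copy_wanted and second_word are hypothetical names for this example; second_word stands in for word2, which the question does not show.

```python
def second_word(line):
    """Hypothetical stand-in for the question's word2()."""
    parts = line.split()
    return parts[1] if len(parts) > 1 else None


def copy_wanted(lines, names, write):
    """Pass each wanted block to `write`, line by line, without raising StopIteration."""
    it = iter(lines)
    line = next(it, None)
    while line is not None:
        if "INFERNAL1/a" in line or "HMMER3/f" in line:
            title = line
            line = next(it, None)
            if line is not None and second_word(line) in names:
                write(title)
                write(line)
                line = next(it, None)
                while line is not None and "//" not in line:
                    write(line)
                    line = next(it, None)
                if line is not None:
                    write(line)  # the '//' terminator line
        line = next(it, None)  # stays None once the input is exhausted
```

In the question's setting the call would be something like copy_wanted(cmLines, namesList, output.write).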

To separate concerns, you can split the code into two parts:

  • Split input into records:
from collections import deque
from itertools import chain, dropwhile, takewhile

def getrecords(lines):
    it = iter(lines)
    headers = "INFERNAL1/a", "HMMER3/f"
    while True:
        it = chain([next(it)], it) # force StopIteration at the end
        it = dropwhile(lambda line: not line.startswith(headers), it)
        record = takewhile(lambda line: not line.startswith("//"), it)
        yield record
        consume(record) # make sure each record is read to the end

def consume(iterable):
    deque(iterable, maxlen=0)

      

  • Output the records that interest you:
from contextlib import closing
from itertools import chain

with closing(output):
    for record in getrecords(cmLines):
        title, line = next(record, ""), next(record, "")
        if word2(line) in namesList:
            for line in chain([title, line], record):
                output.write(line)
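One caveat if this is run on Python 3.7 or later: under PEP 479, a StopIteration that escapes from next(it) inside a generator is turned into a RuntimeError, so a generator that relies on it to finish would end with an error instead. The variant below is only a sketch of one way around that, letting a for loop detect the end of input. It keeps the same assumptions as above: headers start their line, and the // line ends a record without being part of it.

```python
from itertools import chain, takewhile

HEADERS = ("INFERNAL1/a", "HMMER3/f")


def getrecords(lines):
    """Yield one lazy iterator per record; a record starts at a header line."""
    it = iter(lines)
    for line in it:  # the for loop ends cleanly when the input runs out
        if line.startswith(HEADERS):
            body = takewhile(lambda l: not l.startswith("//"), it)
            record = chain([line], body)
            yield record
            for _ in record:  # drain whatever the caller left unread
                pass
```

Each record has to be consumed before asking for the next one, since all records share the same underlying iterator.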

      
