Ctrl-C ends my script, but it doesn't get caught in the KeyboardInterrupt exception

I have a Python script containing a large loop that reads a file and does some processing (I use several packages such as urllib2, httplib2 and BeautifulSoup).

It looks like this:

try:
    with open(fileName, 'r') as file :
        for i, line in enumerate(file):
            try:
                # a lot of code
                # ....
                # ....
            except urllib2.HTTPError:
                print "\n >>> HTTPError"
            # a lot of other exceptions
            # ....
            except (KeyboardInterrupt, SystemExit):
                print "Process manually stopped"
                raise
            except Exception, e:
                print(repr(e))
except (KeyboardInterrupt, SystemExit):
    print "Process manually stopped"
    # some stuff

      

The problem is that the program stops when I press Ctrl-C, but neither of my two KeyboardInterrupt handlers is reached, although I'm pretty sure execution is inside the loop at that moment (and at the very least inside the big try/except).

How is this possible? At first I thought one of the packages I use was not handling exceptions correctly (for example, by using a bare "except:"), but if that were the case, my script would not stop. The script REALLY does stop, though, so the interrupt should be caught by at least one of my two except clauses, right?

Where am I going wrong?

Thanks in advance!

EDIT:

After adding a finally: clause after the try-except and printing the traceback in both try-except blocks, it usually shows None when I press Ctrl-C. Once, however, I managed to get the following (it seems to come from urllib2, but I don't know whether this is why I can't catch KeyboardInterrupt):

Traceback (most recent call last):
File "/home/darcot/code/Crawler/crawler.py", line 294, in get_articles_from_file
  content = Extractor(extractor='ArticleExtractor', url=url).getText()
File "/usr/local/lib/python2.7/site-packages/boilerpipe/extract/__init__.py", line 36, in __init__
  connection  = urllib2.urlopen(request)
File "/usr/local/lib/python2.7/urllib2.py", line 126, in urlopen
  return _opener.open(url, data, timeout)
File "/usr/local/lib/python2.7/urllib2.py", line 391, in open
  response = self._open(req, data)
File "/usr/local/lib/python2.7/urllib2.py", line 409, in _open
  '_open', req)
File "/usr/local/lib/python2.7/urllib2.py", line 369, in _call_chain
  result = func(*args)
File "/usr/local/lib/python2.7/urllib2.py", line 1173, in http_open
  return self.do_open(httplib.HTTPConnection, req)
File "/usr/local/lib/python2.7/urllib2.py", line 1148, in do_open
  raise URLError(err)
URLError: <urlopen error [Errno 4] Interrupted system call>

      

2 answers


I already suggested in my comments to the question that this problem is most likely caused by a section of code that is not shown in the question. However, the exact code should not matter, as Python should normally raise a KeyboardInterrupt exception when Python code is interrupted by Ctrl-C.
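To make the normal behaviour concrete, here is a minimal sketch, assuming a POSIX system and a plain interpreter with no native libraries loaded. The function name sigint_raises_keyboard_interrupt is made up for illustration. It installs Python's stock handler (signal.default_int_handler), sends SIGINT to its own process with os.kill, and reports whether the signal arrived as a KeyboardInterrupt. It is written to run on Python 3 and should also run on the Python 2.7 used in the question.

```python
import os
import signal
import time


def sigint_raises_keyboard_interrupt():
    """Send SIGINT to this process and report whether it arrives as KeyboardInterrupt."""
    # Install Python's stock SIGINT handler explicitly so the check does not
    # depend on how the process was started.
    previous = signal.signal(signal.SIGINT, signal.default_int_handler)
    try:
        os.kill(os.getpid(), signal.SIGINT)  # the same signal a terminal sends on Ctrl-C
        time.sleep(1)  # give the interpreter a chance to run the handler
        return False
    except KeyboardInterrupt:
        return True
    finally:
        # Restore whatever handler was active before (None means it was not
        # installed from Python and cannot be passed back to signal.signal).
        if previous is not None:
            signal.signal(signal.SIGINT, previous)


print(sigint_raises_keyboard_interrupt())
```

If a check like this reports that the exception arrives in a clean interpreter but Ctrl-C still kills the real script silently, that points towards something in the script's imports changing how the signal is handled.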

You mentioned in the comments that you are using the boilerpipe Python package. This package uses JPype to create a language binding to Java. I can reproduce your problem with the following Python program:

from boilerpipe.extract import Extractor
import time

try:
  for i in range(10):
    time.sleep(1)

except KeyboardInterrupt:
  print "Keyboard Interrupt Exception"

      

If you interrupt this program with Ctrl-C, no exception is raised. The program seems to terminate immediately, without giving the Python interpreter a chance to raise the exception. When the boilerpipe import is removed, the problem goes away.

A debugging session with gdb shows that a large number of threads are started when boilerpipe is imported:

gdb --args python boilerpipe_test.py
[...]
(gdb) run
Starting program: /home/fabian/Experimente/pykeyinterrupt/bin/python boilerpipe_test.py
warning: Could not load shared library symbols for linux-vdso.so.1.
Do you need "set solib-search-path" or "set sysroot"?
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/libthread_db.so.1".
[New Thread 0x7fffef62b700 (LWP 3840)]
[New Thread 0x7fffef52a700 (LWP 3841)]
[New Thread 0x7fffef429700 (LWP 3842)]
[New Thread 0x7fffef328700 (LWP 3843)]
[New Thread 0x7fffed99a700 (LWP 3844)]
[New Thread 0x7fffed899700 (LWP 3845)]
[New Thread 0x7fffed798700 (LWP 3846)]
[New Thread 0x7fffed697700 (LWP 3847)]
[New Thread 0x7fffed596700 (LWP 3848)]
[New Thread 0x7fffed495700 (LWP 3849)]
[New Thread 0x7fffed394700 (LWP 3850)]
[New Thread 0x7fffed293700 (LWP 3851)]
[New Thread 0x7fffed192700 (LWP 3852)]

      



The gdb session without the boilerpipe import:

gdb --args python boilerpipe_test.py
[...]
(gdb) r
Starting program: /home/fabian/Experimente/pykeyinterrupt/bin/python boilerpipe_test.py
warning: Could not load shared library symbols for linux-vdso.so.1.
Do you need "set solib-search-path" or "set sysroot"?
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/libthread_db.so.1".
^C
Program received signal SIGINT, Interrupt.
0x00007ffff7529533 in __select_nocancel () from /usr/lib/libc.so.6
(gdb) signal 2
Continuing with signal SIGINT.
Keyboard Interrupt Exception
[Inferior 1 (process 3904) exited normally]

      

So my guess is that your Ctrl-C signal is being handled in a different thread, or that JPype is doing other odd things that break the handling of Ctrl-C.

EDIT: As a possible workaround, you can register a signal handler for the SIGINT signal that the process receives when you press Ctrl-C. The signal handler is invoked even when boilerpipe and JPype are imported. This way you are notified when the user presses Ctrl-C, and you can handle the event at one central point in your program. You can terminate the script in this handler if you like. If you don't, the script continues where it left off once the handler function returns. See the example below:

from boilerpipe.extract import Extractor
import time
import signal
import sys

def interrupt_handler(signum, frame):
    print "Signal handler!!!"
    # Terminate the process here: installing this handler replaces the default Ctrl-C behaviour.
    sys.exit(-2)

signal.signal(signal.SIGINT, interrupt_handler)

try:
    for i in range(10):
        time.sleep(1)
#    your_url = "http://www.zeit.de"
#    extractor = Extractor(extractor='ArticleExtractor', url=your_url)
except KeyboardInterrupt:
    print "Keyboard Interrupt Exception" 

      



You are most likely pressing Ctrl-C while your script is executing code outside the try block, so the interrupt is not caught there.
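If that is the cause, one way to rule it out is to wrap the whole entry point so that no code runs outside a try block. This is a minimal sketch: run_guarded and main are hypothetical names, and main is an empty placeholder for the script's setup and loop. It would not help with the boilerpipe/JPype behaviour described in the other answer, where no KeyboardInterrupt is raised at all.

```python
def run_guarded(main):
    """Run main() and turn a Ctrl-C anywhere inside it into an exit code."""
    try:
        main()
    except KeyboardInterrupt:
        print("Process manually stopped")
        return 130  # 128 + SIGINT (2), the conventional shell exit status
    return 0


def main():
    # Hypothetical placeholder for the crawler's setup and main loop.
    pass


# Typical use at the bottom of the script:
#     import sys
#     sys.exit(run_guarded(main))
```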


