How do I determine if a thread has died and then restart it?

I have an application that starts a series of threads. Sometimes one of these threads dies (usually due to a network problem). How can I correctly detect a thread crashing and restart only that thread? Here's some sample code:

import random
import threading
import time

class MyThread(threading.Thread):
    def __init__(self, pass_value):
        super(MyThread, self).__init__()
        self.running = False
        self.value = pass_value

    def run(self):
        self.running = True

        while self.running:
            time.sleep(0.25)

            rand = random.randint(0,10)
            print threading.current_thread().name, rand, self.value
            if rand == 4:
                raise ValueError('Returned 4!')


if __name__ == '__main__':
    group1 = []
    group2 = []
    for g in range(4):
        group1.append(MyThread(g))
        group2.append(MyThread(g+20))


    for m in group1:
        m.start()

    print "Now start second wave..."

    for p in group2:
        p.start()

      

In this example, I start 4 threads, then I start 4 more threads. Each thread randomly generates an int between 0 and 10. If this int equals 4, it raises an exception. Note that I do not join the threads. I want both the group1 and group2 lists of threads to be running. I found that if I join the threads, it waits for the thread to terminate. My thread is supposed to be a daemon process, so it should rarely (if ever) hit the ValueError exception that this example code raises, and it should run continuously. If I join it, the next set of threads never starts.

How can I detect that a specific thread has died and restart only that one thread?

I tried running the following loop right after my for p in group2 loop.

while True:
    # Create a copy of our groups to iterate over, 
    # so that we can delete dead threads if needed
    for m in group1[:]:
        if not m.isAlive():
            group1.remove(m)
            group1.append(MyThread(1))

    for m in group2[:]:
        if not m.isAlive():
            group2.remove(m)
            group2.append(MyThread(500))

    time.sleep(5.0)

      

I took this method from this question.

The problem with this is that isAlive() always seems to return True, because the threads are never restarted.

Edit

Would it be more appropriate to use multiprocessing in this situation? I found this tutorial. Is it more appropriate to have separate processes if I need to restart a process? It seems like restarting the thread is difficult.

It was pointed out in the comments that I should check is_active() for a thread. I don't see that in the documentation, but I do see isAlive, which is what I am currently using. As I mentioned above, it returns True, so I can never see that the thread has died.



2 answers


You can put a try/except around the place where you expect it to crash (if it could be anywhere, you can wrap the entire run function) and keep an indicator variable that holds the thread's status.

So something like the following:

class MyThread(threading.Thread):
    def __init__(self, pass_value):
        super(MyThread, self).__init__()
        self.running = False
        self.value = pass_value
        self.RUNNING = 0
        self.FINISHED_OK  = 1
        self.STOPPED = 2
        self.CRASHED = 3
        self.status = self.STOPPED

    def run(self):
        self.running = True    
        self.status = self.RUNNING


        while self.running:
            time.sleep(0.25)

            rand = random.randint(0,10)
            print threading.current_thread().name, rand, self.value

            try:
                if rand == 4:
                    raise ValueError('Returned 4!')
            except Exception:
                # Record the crash and leave the loop so the thread really ends
                self.status = self.CRASHED
                self.running = False

      



Then you can use your loop:

while True:
    # Create a copy of our groups to iterate over, 
    # so that we can delete dead threads if needed
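    for m in group1[:]:
        if m.status == m.CRASHED:
            value = m.value
            group1.remove(m)
            replacement = MyThread(value)
            replacement.start()  # a new Thread does nothing until it is started
            group1.append(replacement)

    for m in group2[:]:
        if m.status == m.CRASHED:
            value = m.value
            group2.remove(m)
            replacement = MyThread(value)
            replacement.start()
            group2.append(replacement)

    # Sleep inside the while loop so the check runs every 5 seconds
    time.sleep(5.0)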
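
For the case where the crash could happen anywhere, here is a minimal sketch of the same status-flag idea with the try/except wrapped around the whole body of run(). The names WatchedThread, work, fail and restart_crashed are made up for this illustration and are not part of the code above; the fail argument only exists to force a crash in the demo.

```python
import threading
import time


class WatchedThread(threading.Thread):
    """Hypothetical worker that records how run() ended."""
    RUNNING, FINISHED_OK, STOPPED, CRASHED = range(4)

    def __init__(self, value, fail=False):
        super(WatchedThread, self).__init__()
        self.daemon = True
        self.value = value
        self.fail = fail  # illustration only: forces a simulated crash
        self.status = self.STOPPED

    def work(self):
        # Stand-in for the real job (for example, the network code).
        time.sleep(0.05)
        if self.fail:
            raise ValueError('simulated failure')

    def run(self):
        self.status = self.RUNNING
        try:
            self.work()
        except Exception:
            # Any exception from work() ends the thread and is recorded here.
            self.status = self.CRASHED
        else:
            self.status = self.FINISHED_OK


def restart_crashed(group):
    """Replace each crashed thread in group with a started copy.

    Returns the number of threads that were replaced.
    """
    restarted = 0
    for i, t in enumerate(group):
        if t.status == t.CRASHED:
            replacement = WatchedThread(t.value)
            replacement.start()
            group[i] = replacement
            restarted += 1
    return restarted
```

A Thread object cannot be started twice, so "restarting" always means creating a new object with the same arguments and calling start() on it. Calling restart_crashed(group1) and restart_crashed(group2) from the monitoring loop would then replace only the threads that crashed.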

      



I had a similar problem and came across this question. I found that join takes a timeout argument, and that is_alive returns False after joining a thread that has ended. So my check for each thread is:

def check_thread_alive(thr):
    # A zero timeout makes join return immediately instead of blocking
    thr.join(timeout=0.0)
    return thr.is_alive()

      



This detects the death of the thread for me.
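
As a rough sketch of how that check could drive a restart, assuming each worker can be rebuilt by a zero-argument factory: restart_dead, make_thread, failing_worker and long_worker are hypothetical names used only for this illustration.

```python
import threading


def check_thread_alive(thr):
    """Return True while thr is still running (thr must have been started)."""
    thr.join(timeout=0.0)  # zero timeout: returns immediately
    return thr.is_alive()


def failing_worker():
    # Stand-in for a worker that dies, e.g. on a network error.
    raise ValueError('simulated failure')


def long_worker(stop_event):
    # Stand-in for a healthy worker: runs until asked to stop.
    stop_event.wait()


def restart_dead(threads, make_thread):
    """Replace dead threads in place with new, started ones.

    make_thread is a zero-argument factory returning an unstarted Thread.
    Returns the list of indices that were restarted.
    """
    restarted = []
    for i, thr in enumerate(threads):
        if not check_thread_alive(thr):
            threads[i] = make_thread()
            threads[i].start()
            restarted.append(i)
    return restarted
```

Only pass threads that have already been started: join raises RuntimeError on a thread that was never started. The uncaught exception in failing_worker prints a traceback to stderr, but it ends only that thread, not the program.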
