How do I determine if a thread has died and then restart it?
I have an application that starts a series of threads. Sometimes one of these threads dies (usually due to a network problem). How can I correctly detect a thread crashing and restart only that thread? Here's some sample code:
    import random
    import threading
    import time

    class MyThread(threading.Thread):
        def __init__(self, pass_value):
            super(MyThread, self).__init__()
            self.running = False
            self.value = pass_value

        def run(self):
            self.running = True
            while self.running:
                time.sleep(0.25)
                rand = random.randint(0,10)
                print threading.current_thread().name, rand, self.value
                if rand == 4:
                    raise ValueError('Returned 4!')

    if __name__ == '__main__':
        group1 = []
        group2 = []
        for g in range(4):
            group1.append(MyThread(g))
            group2.append(MyThread(g+20))

        for m in group1:
            m.start()

        print "Now start second wave..."

        for p in group2:
            p.start()
In this example, I start 4 threads, then I start 4 more threads. Each thread randomly generates an integer between 0 and 10. If that integer is 4, it raises an exception. Please note that I am not joining the threads. I want both lists of threads to be running at the same time. I found that if I join the threads, the program waits for each thread to end. My threads are supposed to run as daemons, so they should rarely (if ever) hit the ValueError exception shown in this example code, and they should run continuously. If I join them, the next set of threads never starts.
How can I detect that a specific thread has died and restart only that thread?
I tried running the following loop right after my for p in group2 loop:
    while True:
        # Create a copy of our groups to iterate over,
        # so that we can delete dead threads if needed
        for m in group1[:]:
            if not m.isAlive():
                group1.remove(m)
                group1.append(MyThread(1))

        for m in group2[:]:
            if not m.isAlive():
                group2.remove(m)
                group2.append(MyThread(500))

        time.sleep(5.0)
I took this method from this question.
The problem with this is that isAlive() always seems to return True, because the threads are never restarted.
Would it be more appropriate to use multiprocessing in this situation? I found this tutorial. Is it more appropriate to have separate processes if I need to restart a process? It seems like restarting the thread is difficult.
It was pointed out in the comments that I should check is_alive() on the thread. I don't see this in the documentation, but I do see isAlive(), which is what I am currently using. As I mentioned above, it returns True, so I never see that the thread has died.
You could put a try/except around the code where you expect the crash (if it could happen anywhere, wrap the whole body of the run function) and keep an indicator variable that records the thread's status.
So something like the following:
    class MyThread(threading.Thread):
        # Possible values of self.status
        RUNNING = 0
        FINISHED_OK = 1
        STOPPED = 2
        CRASHED = 3

        def __init__(self, pass_value):
            super(MyThread, self).__init__()
            self.running = False
            self.value = pass_value
            self.status = self.STOPPED

        def run(self):
            self.running = True
            self.status = self.RUNNING
            try:
                while self.running:
                    time.sleep(0.25)
                    rand = random.randint(0, 10)
                    # This form of print works on both Python 2 and Python 3
                    print("%s %s %s" % (threading.current_thread().name, rand, self.value))
                    if rand == 4:
                        raise ValueError('Returned 4!')
            except Exception:
                # Record the failure; leaving run() is what ends the thread
                self.status = self.CRASHED
            else:
                self.status = self.FINISHED_OK
Then you can use your loop:
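    while True:
        # Iterate over copies so the lists can be modified inside the loop
        for m in group1[:]:
            if m.status == m.CRASHED:
                value = m.value
                group1.remove(m)
                replacement = MyThread(value)
                replacement.start()  # a new thread does nothing until it is started
                group1.append(replacement)

        for m in group2[:]:
            if m.status == m.CRASHED:
                value = m.value
                group2.remove(m)
                replacement = MyThread(value)
                replacement.start()
                group2.append(replacement)

        time.sleep(5.0)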
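For anyone on Python 3, here is a minimal sketch of the same status-flag idea, with a single monitoring pass pulled out into a function so it can be called from any loop. The names Worker, fail, and restart_crashed are made up for illustration and are not part of the code above; the fail flag only exists to force a crash on demand.

```python
import threading


class Worker(threading.Thread):
    """Hypothetical worker that records a crash instead of dying silently."""

    def __init__(self, value, fail=False):
        super().__init__(daemon=True)
        self.value = value
        self.fail = fail        # illustration only: forces a crash in run()
        self.crashed = False

    def run(self):
        try:
            if self.fail:
                raise ValueError("simulated network problem")
        except Exception:
            # Remember the failure; returning from run() ends the thread
            self.crashed = True


def restart_crashed(group):
    """Do one monitoring pass over group.

    Every crashed worker is replaced, in the same list slot, by a freshly
    started worker with the same value. Returns the number of restarts.
    """
    restarted = 0
    for i, worker in enumerate(group):
        if worker.crashed:
            replacement = Worker(worker.value)
            replacement.start()
            group[i] = replacement
            restarted += 1
    return restarted
```

Calling restart_crashed(group1) and restart_crashed(group2) every few seconds would play the role of the while True loop above. Replacing by index avoids having to copy the list before modifying it.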
I had a similar problem and came across this question. I found that join takes a timeout argument, and that is_alive returns False after joining a thread that has ended. So my check for each thread is:
    def check_thread_alive(thr):
        thr.join(timeout=0.0)  # returns immediately, whether or not thr has ended
        return thr.is_alive()
This detects dead threads for me.
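A small Python 3 sketch for trying this out, assuming the check is given an optional timeout parameter (my addition). The crash and wait_forever functions are hypothetical stand-ins for a worker that dies from an unhandled error and one that keeps running.

```python
import threading


def check_thread_alive(thr, timeout=0.0):
    """Return True if thr is still running after waiting up to timeout seconds."""
    thr.join(timeout=timeout)  # returns early if thr has already ended
    return thr.is_alive()


def crash():
    # Stand-in for a worker that dies from an unhandled error
    raise ValueError("simulated failure")


def wait_forever(stop_event):
    # Stand-in for a healthy worker; runs until stop_event is set
    stop_event.wait()
```

Starting a thread with target=crash and then calling check_thread_alive on it with a generous timeout should give False, while a thread running wait_forever keeps reporting True until its event is set. As far as I can tell, is_alive() already returns False once run() has returned or raised, so the join mostly gives a thread that is about to finish a moment to do so.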