Using Python threading to restart a script that keeps crashing

I have been struggling for years with the PublicationSaver() class I wrote, which has a method (not shown here) that loads XML documents as strings and then passes each loaded row to self.savePublication(self, publish, myDirPath).

Every time I used it, it crashed after about 25,000 lines, and it logs the last line it crashed on. I was able to parse that line separately, so I guess the problem isn't bad XML.

I asked about this here but got no answer.

I have researched this a lot, and it seems I am not the only one having this problem: here

So, since I really need to get this task done, I had this idea: can I wrap everything in a Thread started from main, so that when the lxml parse throws an exception I catch it, send the result to main, kill the thread, and start it again?

#threading
import os
import Queue
from time import sleep

result_q = Queue.Queue()

# Create the thread and start() it -- calling run() directly would execute
# the work in the *main* thread and block, which is why "Hello !!!" never
# appeared. (Assumes the dir paths can be passed to the constructor.)
xmlSplitter = XmlSplitter_Thread(toSplit_DirPath, target_DirPath, result_q=result_q)
xmlSplitter.start()

print "Hello !!!\n"

toSplitDirEmptyB = False

while not toSplitDirEmptyB:

    splitterAlive = True
    while splitterAlive:
        sleep(120)
        splitterAlive = result_q.get()

    xmlSplitter.join()
    print "*** KILLED XmlSplitter_Thread !!! ***\n"

    if not os.listdir(toSplit_DirPath):
        toSplitDirEmptyB = True
    else:
        # A Thread object cannot be restarted once it has finished;
        # create a fresh one and start() it again.
        xmlSplitter = XmlSplitter_Thread(toSplit_DirPath, target_DirPath,
                                         result_q=result_q)
        xmlSplitter.start()
      



Is this a valid approach? When I run the code above, it doesn't work: I never see "Hello !!!" printed, and the xmlSplitter just keeps going even when it starts crashing (there is an except clause that keeps it going).
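Since XmlSplitter_Thread is not shown, here is a minimal sketch of what such a worker could look like for this restart pattern to function: it catches the parse exception itself and reports on result_q (True while alive, False when it dies), so that main's result_q.get() actually unblocks. The class body and the split_one method are hypothetical, not the asker's real code; the failure is simulated.

```python
import threading
try:
    import queue            # Python 3
except ImportError:
    import Queue as queue   # Python 2

class XmlSplitter_Thread(threading.Thread):
    """Hypothetical worker: splits files one by one, reports status on a queue."""
    def __init__(self, result_q, files):
        threading.Thread.__init__(self)
        self.result_q = result_q
        self.files = files

    def run(self):
        for path in self.files:
            try:
                self.split_one(path)       # placeholder for the real lxml work
                self.result_q.put(True)    # still alive
            except Exception:
                self.result_q.put(False)   # tell main we crashed, then stop
                return

    def split_one(self, path):
        # Simulated parse failure standing in for the real lxml call
        raise ValueError("simulated parse failure on %s" % path)

result_q = queue.Queue()
t = XmlSplitter_Thread(result_q, ["bad.xml"])
t.start()
alive = result_q.get()   # blocks until the worker reports; False here
t.join()                 # safe: the worker has already exited
```

The key point is that the worker must always put something on the queue before exiting; otherwise the main loop blocks forever on result_q.get() and "Hello !!!" appears but the loop never advances.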



1 answer


Probably the thread has died and your main loop is blocking on the join method; take a look here. Split the XML into chunks and parse each chunk to avoid memory errors.
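One common way to do that chunked parsing is iterparse, which streams the document element by element instead of loading it all into memory; clearing each element after processing keeps memory usage flat. This sketch uses the stdlib ElementTree (lxml.etree exposes the same iterparse API); the count_records helper and the "pub" tag are made up for illustration.

```python
import io
import xml.etree.ElementTree as ET  # lxml.etree offers the same iterparse API

def count_records(source, tag):
    """Stream-parse `source`, counting elements named `tag` without
    holding the whole tree in memory."""
    count = 0
    for event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == tag:
            count += 1
            elem.clear()  # free the element's children once processed
    return count

# Small in-memory stand-in for a large XML file on disk
xml_data = io.BytesIO(b"<pubs><pub>a</pub><pub>b</pub></pubs>")
n = count_records(xml_data, "pub")
```

With this approach a single bad record can be skipped inside the loop with a try/except around the per-element work, instead of killing and restarting the whole thread.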









