Python subprocess: read returncode sometimes differs from returned code
I have a Python script that calls another Python script via subprocess.Popen. I know the called script always returns 10, which means it failed.
The problem is that the caller only reads 10 about 75% of the time. The other 25% it reads 0 and mistakes the called program's failure for success. Same command, same environment, seemingly random occurrences.
Environment: Python 2.7.10, Red Hat Linux 6.4. The code presented here is a (very) simplified version, but I can still reproduce the problem with it.
This is the called script, constant_return.py:
#!/usr/bin/env python2.7
# -*- coding: utf-8 -*-
"""
Simplified called code
"""
import sys
if __name__ == "__main__":
    sys.exit(10)
This is the calling script:
#!/usr/bin/env python2.7
# -*- coding: utf-8 -*-
"""
Simplified version of the calling code
"""
try:
    import sys
    import subprocess
    import threading
except Exception, eImp:
    print "Error while loading Python library : %s" % eImp
    sys.exit(100)

class BizarreProcessing(object):
    """
    Simplified caller class
    """
    def __init__(self):
        """
        Classic initialization
        """
        object.__init__(self)

    def logPipe(self, isStdOut_, process_):
        """
        Simplified log handler
        """
        try:
            if isStdOut_:
                output = process_.stdout
                logfile = open("./log_out.txt", "wb")
            else:
                output = process_.stderr
                logfile = open("./log_err.txt", "wb")
            # Read pipe content as long as the process is running
            while (process_.poll() == None):
                text = output.readline()
                if (text != '' and text.strip() != ''):
                    logfile.write(text)
            # When the process is finished, there might still be lines remaining in the pipe
            output.readlines()
            for oneline in output.readlines():
                if (oneline != None and oneline.strip() != ''):
                    logfile.write(text)
        finally:
            logfile.close()

    def startProcessing(self):
        """
        Launch process
        """
        # Simplified command line definition
        command = "/absolute/path/to/file/constant_return.py"
        # Execute command in a new process
        process = subprocess.Popen(command, shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
        # Launch a thread to gather the called program's stdout and stderr
        # (this avoids a deadlock when a pipe fills up)
        stdoutThread = threading.Thread(target=self.logPipe, args=(True, process))
        stdoutThread.start()
        stderrThread = threading.Thread(target=self.logPipe, args=(False, process))
        stderrThread.start()
        # Wait for the end of the process and get the process result
        stdoutThread.join()
        stderrThread.join()
        result = process.wait()
        print("returned code: " + str(result))
        # Send it back to the caller
        return result

#
# Main
#
if __name__ == "__main__":
    # Execute caller code
    processingInstance = BizarreProcessing()
    aResult = processingInstance.startProcessing()
    # Return the code
    sys.exit(aResult)
This is the bash loop I use to run the calling script repeatedly:
for res in {1..100}
do
/path/to/caller/script.py
echo $? >> /tmp/returncodelist.txt
done
It seems to have something to do with the way I read the called program's output, because when I create the subprocess with process = subprocess.Popen(command, shell=True, stdout=sys.stdout, stderr=sys.stderr)
and remove all the thread logic, it reads the correct return code every time (but the output is no longer logged the way I want...)
Any idea what I did wrong?
Many thanks for your help
logPipe is checking whether the process is alive to determine whether there is more data to read. That is not correct - you should be checking whether the pipe has reached EOF, by looking for a zero-length read or by using output.readlines(). The I/O pipes can outlive the process.
This simplifies logPipe significantly; change it as shown below:
def logPipe(self, isStdOut_, process_):
    """
    Simplified log handler
    """
    try:
        if isStdOut_:
            output = process_.stdout
            logfile = open("./log_out.txt", "wb")
        else:
            output = process_.stderr
            logfile = open("./log_err.txt", "wb")
        # Read the pipe until EOF, regardless of process state
        with output:
            for text in output:
                if text.strip():  # skip empty lines
                    logfile.write(text)
    finally:
        logfile.close()
Second, don't join your logging threads until after process.wait(), for the same reason - the I/O pipes can outlive the process.
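A minimal standalone sketch of that ordering (EOF-driven readers, a single wait(), joins afterwards). The drain helper and file names are illustrative, and a throwaway child stands in for the real command:

```python
import subprocess
import sys
import threading

def drain(pipe, path):
    """Copy non-blank lines from the pipe to a log file, reading until EOF."""
    with open(path, "wb") as logfile:
        for line in iter(pipe.readline, b""):  # EOF, not poll(), ends the loop
            if line.strip():
                logfile.write(line)

# Throwaway child standing in for constant_return.py
child_code = "import sys; sys.stdout.write('hello\\n'); sys.exit(10)"
process = subprocess.Popen([sys.executable, "-c", child_code],
                           stdout=subprocess.PIPE, stderr=subprocess.PIPE)

readers = [threading.Thread(target=drain, args=(process.stdout, "log_out.txt")),
           threading.Thread(target=drain, args=(process.stderr, "log_err.txt"))]
for t in readers:
    t.start()

result = process.wait()  # the only call that waits on the child
for t in readers:        # join the readers only after wait()
    t.join()
print(result)
```

With this structure only one thread ever reaps the child, so there is no race on the exit status.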
My first guess at what is happening under the covers was that a SIGPIPE is being emitted and mishandled somewhere - possibly misinterpreted as a process-termination condition. This can happen when a pipe is closed on one end or the other without being flushed. SIGPIPE can be a nuisance in larger applications; it may be that the Python library swallows it or does something naive with it.
edit: As @Blackjack points out, SIGPIPE is automatically blocked by Python, so that rules SIGPIPE out. Second theory, then - from the documentation for Popen.poll():
Check if child process has terminated. Set and return the returncode attribute.
If you strace the caller (e.g. strace -f -o strace.log ./caller.py), poll() appears to be implemented via wait4(WNOHANG). You have two threads polling with WNOHANG and one thread doing a blocking wait, but only one of those calls will actually return with the process exit code. If there is no locking in the implementation of subprocess.poll(), there is likely a race to assign process.returncode, and a potential failure to do so correctly. Limiting your Popen.wait()/poll() calls to a single thread should be a good way to avoid this. See man waitpid.
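For reference, poll() behaves exactly as documented when only one thread uses it. A small single-threaded sketch, with a stand-in child instead of the real command:

```python
import subprocess
import sys
import time

# Child that exits with code 10, standing in for constant_return.py
process = subprocess.Popen([sys.executable, "-c", "import sys; sys.exit(10)"])

# poll() returns None while the child runs; once the child has exited,
# it sets and returns the returncode attribute
while process.poll() is None:
    time.sleep(0.01)
print(process.returncode)
```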
edit: As an aside, if you can hold all of your stdout/stderr data in memory, subprocess.communicate() is much easier to use and does not require the logPipe method or background threads at all.
https://docs.python.org/2/library/subprocess.html#subprocess.Popen.communicate
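A sketch of that approach, again with a throwaway child in place of the real command; the log file names mirror the question's:

```python
import subprocess
import sys

# Stand-in child that writes to both pipes and exits with code 10
child_code = ("import sys; sys.stdout.write('to stdout\\n'); "
              "sys.stderr.write('to stderr\\n'); sys.exit(10)")
process = subprocess.Popen([sys.executable, "-c", child_code],
                           stdout=subprocess.PIPE, stderr=subprocess.PIPE)

# communicate() drains both pipes and reaps the child in one call,
# so there is only ever one waiter and no race on returncode
stdout_data, stderr_data = process.communicate()
with open("log_out.txt", "wb") as f:
    f.write(stdout_data)
with open("log_err.txt", "wb") as f:
    f.write(stderr_data)
print(process.returncode)
```

The trade-off is that the whole output is buffered in memory, which is fine here but may not suit long-running, chatty children.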