Trying to understand multiprocessing with main in Python

Using the code below, I am getting weird output:

import sys
from multiprocessing import Process
import time
from time import strftime

now = time.time()
print time.strftime("%Y%m%d %H:%M:%S", time.localtime(now))

fr= [1,2,3]
for row in fr:
    print 3

print 1

def worker():
    print 'worker line'
    time.sleep(1)
    sys.exit(1)

def main():
    print 'start worker'
    Process(target=worker, args=()).start()
    print 'main line'

if __name__ == "__main__":
    start_time = time.time()
    main()
    end_time = time.time()
    duration = end_time - start_time
    print "Duration: %s" % duration

      

Output:

20120324 20:35:53
3
3
3
1
start worker
main line
Duration: 0.0
20120324 20:35:53
3
3
3
1
worker line 

      

I thought I would get this:

20120324 20:35:53
3
3
3
1
start worker
worker line
main line
Duration: 1.0

      

Why does this part run twice? I am using Python 2.7 on 64-bit Windows:

20120324 20:35:53
3
3
3
1
worker line 

      

2 answers


The problem is basically that multiprocessing is really designed to run on a POSIX system, one with the fork(2) syscall. On those operating systems, a process can split in two: the child clones the state of the parent, both resume running at the same place, and the child now has a new process ID. In that situation, multiprocessing can arrange some mechanism to ship state from the parent to the child as needed, with the certainty that the child already has most of the Python state it needs.

Windows doesn't have fork().

And so multiprocessing has to pick up the slack. This basically involves launching a completely new Python interpreter running a multiprocessing child script. Almost immediately, the parent will ask the child to use something that is in the parent's state, and so the child has to recreate that state from scratch by importing your script into the child.

So, whatever happens during import in your script will happen twice, once in the parent and again in the child, as it recreates the python environment it needs to serve the parent.
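
One common way to avoid the duplicate output is to move everything that should run only once under the if __name__ == '__main__' guard. The following is a minimal sketch adapted from the question's script, not a definitive fix: the print calls use parentheses so that it runs on both Python 2.7 and Python 3, and dropping sys.exit(1) from worker is a simplification made here.

```python
import time
from multiprocessing import Process


def worker():
    # Runs in the child process.
    print('worker line')
    time.sleep(1)


def main():
    print('start worker')
    Process(target=worker).start()
    print('main line')


if __name__ == '__main__':
    # Only the parent executes this block. When the child re-imports the
    # script on Windows, __name__ is not '__main__', so nothing here repeats.
    now = time.time()
    print(time.strftime('%Y%m%d %H:%M:%S', time.localtime(now)))
    for row in [1, 2, 3]:
        print(3)
    print(1)
    main()
```

With this layout, the child's re-import of the script only defines worker and main, so the timestamp and the 3, 3, 3, 1 lines should be printed once.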


This is what I get when I run your code on Linux using Python 2.7.3:

20120324 23:05:49
3
3
3
1
start worker
main line
Duration: 0.0045280456543
worker line

      

I don't know why yours runs twice, but I can tell you why it doesn't report the expected duration or print in the "correct" order.



When you start a process with multiprocessing, the startup is asynchronous. That is, the .start() function returns immediately in the parent process, so that the parent can continue to run and do other things (for example, start more processes) while the child process does its own work in the background. If you want to block the parent process until the child process exits, you should use the .join() function. For example:

def main():
    print 'start worker'
    p = Process(target=worker, args=())
    p.start()
    p.join()
    print 'main line'
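
To see the effect on the measured duration, here is a small sketch that times the parent with and without join(). The helper timed_run and its wait flag are hypothetical names introduced for illustration, not part of the question's code, and the print calls use parentheses so that the sketch runs on both Python 2.7 and Python 3.

```python
import time
from multiprocessing import Process


def worker():
    # Stand-in for real work: the child just sleeps for one second.
    time.sleep(1)


def timed_run(wait):
    """Start the worker and return how long the parent was busy, in seconds."""
    start = time.time()
    p = Process(target=worker)
    p.start()          # returns immediately; the child runs in the background
    if wait:
        p.join()       # block the parent until the child has exited
    elapsed = time.time() - start
    p.join()           # always reap the child before returning
    return elapsed


if __name__ == '__main__':
    print('without join: %.2f s' % timed_run(False))
    print('with join:    %.2f s' % timed_run(True))
```

On a typical run the first figure should be a small fraction of a second and the second about one second, which is the duration the question expected.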

      
