Running an entire Twisted application in multiple processes

I am aware of the limitations of Twisted for multiprocessing applications, but my question is different. I am not trying to start a server or client using multiple processes. I already have a working application that takes several directories and performs some operations on them. I want to split the work into chunks by creating one process, running the same application, for each subdirectory. I can do this by running the application multiple times from the shell and passing in a different subdirectory as an argument each time.

Basically I have something like:

from multiprocessing import Pool
...
p = Pool(num_procs)
work_chunks = [work_chunk] * len(configs)
p.map(run_work_chunk, zip(work_chunks, configs))
p.close()
p.join()

      

Where:

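from os import getpid

def run_work_chunk((work_chunk, config)):
    # Python 2 syntax: tuple parameter unpacking and the print statement
    from twisted.internet import reactor
    d = work_chunk.configure(config)

    # Run the chunk once configuration succeeds; log any failure
    d.addCallback(lambda _: work_chunk.run())
    d.addErrback(handleLog)
    print "pid=", getpid(), "reactor=", id(reactor)
    reactor.run()
    return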

class WorkChunk(object):
    ...
    def run(self):
        # do stuff
        ...
        reactor.stop()

      

Let's say num_procs equals 2; then the output looks something like this:

pid= 2 reactor= 140612692700304
pid= 6 reactor= 140612692700304

And the output from the workers handling the other chunks never appears.

The problem is that calling reactor.stop() stops all reactors, because every process uses the same reactor. I thought that creating a new process copied the entire stack, but in this case it seems to copy only the reactor reference, so all processes use the same reactor object.

Is there a way to instantiate a different reactor object for each process? (as if it were really a completely different process and not a child process)



1 answer


Is there a way to instantiate a different reactor object for each process? (as if it were really a completely different process and not a child process)

If you really mean process, the best way is to run the code multiple times (and/or use fork/exec to create new processes from your initial process).



There is no magic for managing multiple reactors. It is done the same way you would run multiple programs in any other context.
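As a rough illustration of launching the program once per subdirectory, here is a minimal sketch. It assumes Python 3 and that the application can be started as a script that takes a single subdirectory argument; the names run_one and run_all and the num_procs default are hypothetical and not taken from the question. Each child is a fresh interpreter, so it imports and runs its own reactor, and reactor.stop() in one child cannot affect the others.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_one(script, subdir):
    """Start a fresh Python interpreter for one subdirectory and wait for it.

    `script` is assumed to be the application's entry point, accepting the
    subdirectory as its only command-line argument (an assumption).
    """
    return subprocess.call([sys.executable, script, subdir])


def run_all(script, subdirs, num_procs=2):
    """Run `script` once per subdirectory, at most `num_procs` at a time.

    The threads only block on their child process; the real work happens
    in the children. Returns the exit codes in the same order as `subdirs`.
    """
    with ThreadPoolExecutor(max_workers=num_procs) as pool:
        return list(pool.map(lambda subdir: run_one(script, subdir), subdirs))
```

Calling run_all with the path to the application script and a list of subdirectories returns one exit code per subdirectory, so a nonzero entry points at the chunk that failed.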
