Python logging with multiprocessing on Windows
I have a fairly large Python project that currently runs on Linux, and I'm trying to migrate it to Windows. I have reduced the code to a complete example that can be run to illustrate my problem. I have two classes, Parent and Child. The parent is initialized first, creates the logger, and spawns the child to do the work:
import logging
import logging.config
import multiprocessing

class Parent(object):
    def __init__(self, logconfig):
        logging.config.dictConfig(logconfig)
        self.logger = logging.getLogger(__name__)

    def spawnChild(self):
        self.logger.info('One')
        c = Child(self.logger)
        c.start()

class Child(multiprocessing.Process):
    def __init__(self, logger):
        multiprocessing.Process.__init__(self)
        self.logger = logger

    def run(self):
        self.logger.info('Two')

if __name__ == '__main__':
    p = Parent({
        'version': 1,
        "handlers": {
            "console": {
                "class": "logging.StreamHandler",
                "stream": "ext://sys.stdout"
            },
        },
        "root": {
            "level": "DEBUG",
            "handlers": [
                "console",
            ]
        }
    })
    p.spawnChild()
On Linux (specifically Ubuntu 12.04) I get the following (expected) output:
user@ubuntu:~$ python test.py
One
Two
But on Windows (specifically Windows 7), it fails with a pickling error:
C:\>python test.py
<snip>
pickle.PicklingError: Can't pickle <type 'thread.lock'>: it's not found as thread.lock
The problem comes down to Windows lacking a true fork, so objects have to be pickled when they are sent between processes. But the logger cannot be pickled. I've tried using __getstate__ and __setstate__ in Child to avoid pickling it and refer to it by name instead:
    def __getstate__(self):
        d = self.__dict__.copy()
        if 'logger' in d.keys():
            d['logger'] = d['logger'].name
        return d

    def __setstate__(self, d):
        if 'logger' in d.keys():
            d['logger'] = logging.getLogger(d['logger'])
        self.__dict__.update(d)
This works on Linux just as it did before, and Windows no longer fails with the PicklingError. However, the only output comes from the parent:
C:\>python test.py
One
C:\>
It appears that the child is unable to use the logger, even though there is no complaint such as 'No handlers could be found for logger "__main__"' or any other error message. I've looked around, and there are ways I could completely restructure how I do logging in my program, but that is definitely a last resort. I'm hoping that I've just missed something obvious and that the wisdom of the crowd can point it out to me.
In most cases, Logger objects are not picklable, because they use unpicklable threading.Lock and/or file objects internally. Your workaround does avoid pickling the Logger, but it ends up creating a completely different Logger in the child process, one that merely has the same name as the Logger in the parent; the effects of the logging.config call you made are lost. To get the behavior you want, you need to recreate the logger in the child process and call logging.config.dictConfig again:
class Parent(object):
    def __init__(self, logconfig):
        self.logconfig = logconfig
        logging.config.dictConfig(logconfig)
        self.logger = logging.getLogger(__name__)

    def spawnChild(self):
        self.logger.info('One')
        c = Child(self.logconfig)
        c.start()

class Child(multiprocessing.Process):
    def __init__(self, logconfig):
        multiprocessing.Process.__init__(self)
        self.logconfig = logconfig

    def run(self):
        # Recreate the logger in the child
        logging.config.dictConfig(self.logconfig)
        self.logger = logging.getLogger(__name__)
        self.logger.info('Two')
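Passing the configuration works because a plain dict of strings pickles without trouble, whereas a lock does not. The following is a minimal sketch of that difference, assuming Python 3 (where the failure typically surfaces as a TypeError rather than the Python 2 PicklingError shown in the question). The names LOGCONFIG and can_pickle are illustrative only and are not part of the original code.

```python
import pickle
import threading

# Same shape as the config dict from the question (illustrative copy).
LOGCONFIG = {
    "version": 1,
    "handlers": {
        "console": {
            "class": "logging.StreamHandler",
            "stream": "ext://sys.stdout",
        },
    },
    "root": {"level": "DEBUG", "handlers": ["console"]},
}


def can_pickle(obj):
    """Return True if pickle.dumps(obj) succeeds, False if it raises."""
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, TypeError):
        return False
    return True


# The config dict holds only strings, ints, lists and dicts.
config_ok = can_pickle(LOGCONFIG)
# Lock objects, which logging handlers hold internally, refuse to pickle.
lock_ok = can_pickle(threading.Lock())

print("config picklable:", config_ok)
print("lock picklable:", lock_ok)
```

Anything handed to a Process on Windows has to pass this kind of check, so sending the configuration and rebuilding the logger on the other side avoids the problem altogether.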
Or, if you want to keep using __getstate__ / __setstate__:
class Parent(object):
    def __init__(self, logconfig):
        logging.config.dictConfig(logconfig)
        self.logger = logging.getLogger(__name__)
        self.logconfig = logconfig

    def spawnChild(self):
        self.logger.info('One')
        c = Child(self.logger, self.logconfig)
        c.start()

class Child(multiprocessing.Process):
    def __init__(self, logger, logconfig):
        multiprocessing.Process.__init__(self)
        self.logger = logger
        self.logconfig = logconfig

    def run(self):
        self.logger.info('Two')

    def __getstate__(self):
        d = self.__dict__.copy()
        if 'logger' in d:
            d['logger'] = d['logger'].name
        return d

    def __setstate__(self, d):
        if 'logger' in d:
            logging.config.dictConfig(d['logconfig'])
            d['logger'] = logging.getLogger(d['logger'])
        self.__dict__.update(d)
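The pickling round trip can be tried in a single process, without multiprocessing, to see what actually travels to the child. This is only a sketch under a few assumptions: Worker is a hypothetical plain class standing in for Child (a Process object is not meant to be pickled by hand), the logger name "demo.worker" is made up, and "disable_existing_loggers": False is added to the config because the logger here already exists when dictConfig runs, which is not the case in a freshly started child process.

```python
import logging
import logging.config
import pickle

LOGCONFIG = {
    "version": 1,
    # Added for this single-process demo only: keep the already
    # created "demo.worker" logger enabled when dictConfig runs.
    "disable_existing_loggers": False,
    "handlers": {
        "console": {
            "class": "logging.StreamHandler",
            "stream": "ext://sys.stdout",
        },
    },
    "root": {"level": "DEBUG", "handlers": ["console"]},
}


class Worker(object):
    """Hypothetical stand-in for Child, without the Process base class."""

    def __init__(self, logger, logconfig):
        self.logger = logger
        self.logconfig = logconfig

    def __getstate__(self):
        # Ship the logger's name instead of the logger itself.
        d = self.__dict__.copy()
        d['logger'] = d['logger'].name
        return d

    def __setstate__(self, d):
        # Re-apply the configuration, then look the logger up by name.
        logging.config.dictConfig(d['logconfig'])
        d['logger'] = logging.getLogger(d['logger'])
        self.__dict__.update(d)


original = Worker(logging.getLogger("demo.worker"), LOGCONFIG)
state = original.__getstate__()                  # what gets pickled
restored = pickle.loads(pickle.dumps(original))  # what the child sees
restored.logger.info("Two")
```

The pickled state carries only the logger's name and the config dict, so __setstate__ is the place where the receiving side has to apply the configuration before it looks the logger up.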