Python logging with multiprocessing on Windows

I have a fairly large Python project that currently runs on Linux, and I'm trying to migrate it to Windows. I have reduced the code to a complete, runnable example that illustrates my problem. There are two classes, Parent and Child. The parent is initialized first, creates the logger, and spawns the child to do the work:

import logging
import logging.config
import multiprocessing

class Parent( object ):
    def __init__(self, logconfig):
        logging.config.dictConfig(logconfig)
        self.logger = logging.getLogger(__name__)

    def spawnChild(self):
        self.logger.info('One')
        c = Child(self.logger)
        c.start()

class Child(multiprocessing.Process):
    def __init__(self, logger):
        multiprocessing.Process.__init__(self)
        self.logger = logger

    def run(self):
        self.logger.info('Two')

if __name__ == '__main__':
    p = Parent({
            'version':1, 
            "handlers": {
                "console": {
                    "class": "logging.StreamHandler",
                    "stream": "ext://sys.stdout"
                },
            },
            "root": {
                "level": "DEBUG",
                "handlers": [
                    "console",
                    ]
                }
            }
        )
    p.spawnChild()

      

On Linux (specifically Ubuntu 12.04) I get the following (expected) output:

user@ubuntu:~$ python test.py 
One
Two

      

But on Windows (specifically Windows 7), it fails with a pickling error:

C:\>python test.py
<snip>
pickle.PicklingError: Can't pickle <type 'thread.lock'>: it's not found as thread.lock

      

The problem comes down to Windows not having a true fork, so objects have to be pickled when they are sent between processes. But the logger can't be pickled. I've tried using __getstate__ and __setstate__ to avoid pickling it, and to refer to it by name in Child:

def __getstate__(self):
    d = self.__dict__.copy()
    if 'logger' in d.keys():
        d['logger'] = d['logger'].name
    return d

def __setstate__(self, d):
    if 'logger' in d.keys():
        d['logger'] = logging.getLogger(d['logger'])
    self.__dict__.update(d)

      

This works the same on Linux as it did before, and Windows no longer fails with the PicklingError. However, my output now only comes from the parent:

C:\>python test.py
One

C:\>

      

It looks like the child is unable to use the logger, even though there is no message complaining 'No handlers could be found for logger "__main__"' or any other error message. I've looked around, and there are ways I could completely restructure how I log in my program, but that is definitely a last resort. I'm hoping that I've just missed something obvious and that the wisdom of the crowd can point it out to me.

1 answer


In most cases, Logger objects are not picklable because they use unpicklable threading.Lock and/or file objects internally. Your attempted workaround does avoid pickling the Logger, but it ends up creating a completely different Logger in the child process, one that merely has the same name as the Logger in the parent; the effects of the logging.config call you made are lost. To get the behavior you want, you will need to recreate the logger in the child process and call logging.config.dictConfig again:

class Parent( object ):
    def __init__(self, logconfig):
        self.logconfig = logconfig
        logging.config.dictConfig(logconfig)
        self.logger = logging.getLogger(__name__)

    def spawnChild(self):
        self.logger.info('One')
        c = Child(self.logconfig)
        c.start()

class Child(multiprocessing.Process):
    def __init__(self, logconfig):
        multiprocessing.Process.__init__(self)
        self.logconfig = logconfig

    def run(self):
        # Recreate the logger in the child
        logging.config.dictConfig(self.logconfig)
        self.logger = logging.getLogger(__name__)

        self.logger.info('Two')
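
For readers on Python 3, the same idea can be sketched with an explicit "spawn" start method, which is the one Windows uses. This is only an illustrative sketch, not part of the original answer: LOGCONFIG, root_handler_count, and child_main are hypothetical names, and multiprocessing.get_context assumes Python 3.4 or later. Only the plain config dict is handed to the child, which then configures logging for itself.

```python
import logging
import logging.config
import multiprocessing

# Hypothetical config for illustration; it mirrors the one in the question.
LOGCONFIG = {
    "version": 1,
    "handlers": {
        "console": {"class": "logging.StreamHandler", "stream": "ext://sys.stdout"},
    },
    "root": {"level": "DEBUG", "handlers": ["console"]},
}


def root_handler_count():
    """Return how many handlers are attached to the root logger in this process."""
    return len(logging.getLogger().handlers)


def child_main(logconfig):
    # Show what the child starts with, then configure logging inside the child.
    print("handlers before dictConfig:", root_handler_count())
    logging.config.dictConfig(logconfig)
    logging.getLogger(__name__).info("Two")


if __name__ == "__main__":
    logging.config.dictConfig(LOGCONFIG)
    logging.getLogger(__name__).info("One")

    # "spawn" starts a fresh interpreter, as on Windows, instead of forking.
    ctx = multiprocessing.get_context("spawn")
    child = ctx.Process(target=child_main, args=(LOGCONFIG,))
    child.start()
    child.join()
```

Under "spawn" the child should report that the root logger has no handlers before it calls dictConfig, which is why a same-named logger alone produces no output there.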

      



Or, if you want to keep using __getstate__ / __setstate__:

class Parent( object ):
    def __init__(self, logconfig):
        logging.config.dictConfig(logconfig)
        self.logger = logging.getLogger(__name__)
        self.logconfig = logconfig

    def spawnChild(self):
        self.logger.info('One')
        c = Child(self.logger, self.logconfig)
        c.start()

class Child(multiprocessing.Process):
    def __init__(self, logger, logconfig):
        multiprocessing.Process.__init__(self)
        self.logger = logger
        self.logconfig = logconfig

    def run(self):
        self.logger.info('Two')

    def __getstate__(self):
        d = self.__dict__.copy()
        if 'logger' in d:
            d['logger'] = d['logger'].name
        return d

    def __setstate__(self, d):
        if 'logger' in d:
            logging.config.dictConfig(d['logconfig'])
            d['logger'] = logging.getLogger(d['logger'])
        self.__dict__.update(d)
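
As a rough check of why this split works, the following sketch (written for Python 3, where the failure surfaces as a TypeError rather than the PicklingError shown in the question) tries to pickle each piece separately. The helper can_pickle and the sample state dict are hypothetical and only for illustration.

```python
import logging
import pickle
import threading


def can_pickle(obj):
    """Return True if pickle.dumps accepts obj, False if it raises."""
    try:
        pickle.dumps(obj)
    except Exception:  # PicklingError on Python 2, TypeError on Python 3
        return False
    return True


# A handler owns a lock and a stream, neither of which can be pickled.
handler = logging.StreamHandler()

# Roughly what __getstate__ above hands to pickle: a logger name and a config dict.
state = {"logger": "__main__", "logconfig": {"version": 1}}

print("lock:", can_pickle(threading.Lock()))
print("handler:", can_pickle(handler))
print("state:", can_pickle(state))
```

The lock and the handler should be rejected, while the name-plus-config state goes through, so __setstate__ has everything it needs to rebuild the logging setup on the other side.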

      
