Starmap modifies a parameter before passing it?

I am seeing strange behavior when using multiprocessing.Pool.starmap. The minimal code needed to reproduce it is below:

from multiprocessing import Pool

# Ignore the fact that this class is useless as-is, it has more code but it wasn't relevant to the bug
class Coordinate(tuple) :                                                                          

    def __new__(cls, *args):                                                                   
        return tuple.__new__(cls, args)                                                        

#Essentially just stores two coordinates
class Move :                                                     

    def __init__(self, oldPos, newPos) :      
        self.oldPos = oldPos                  
        self.newPos = newPos                  

    def __str__(self) :      
        return 'Old pos : ' + str(self.oldPos) + ' -- New pos : ' + str(self.newPos)

#Dummy function to show the problem
def funcThatNeedsTwoParams(move, otherParam) :
    print(move)             
    # Second param ignored, no problem there

p = Pool(2)  
moveOne = Move(Coordinate(0, 2), Coordinate(0, 1))
moveTwo = Move(Coordinate(2, 1), Coordinate(3, 0))
moveThree = Move(Coordinate(22345, -12400), Coordinate(153, 2357))
# The numbers are irrelevant, no effect on whether problem shows up or not

moves = [moveOne, moveTwo, moveThree]
paramsForStarmap = [[move, 'other param'] for move in moves]

print(paramsForStarmap)
#Output : 
#[[<__main__.Move object at 0x1023d4438>, 'other param'], [<__main__.Move object at 0x1023d4470>, 'other param'], [<__main__.Move object at 0x1023d44a8>
for move in [params[0] for params in paramsForStarmap] :
    print(move)
#Output : 
#Old pos : (0, 2) -- New pos : (0, 1)
#Old pos : (2, 1) -- New pos : (3, 0)
#Old pos : (22345, -12400) -- New pos : (153, 2357)
p.starmap(funcThatNeedsTwoParams, paramsForStarmap)
#Output :
#Old pos : ((0, 2),) -- New pos : ((0, 1),)
#Old pos : ((22345, -12400),) -- New pos : ((153, 2357),)
#Old pos : ((2, 1),) -- New pos : ((3, 0),)

      

Basically, I have a list of parameter pairs, something like this: [[move, otherParam], [move, otherParam], ...]. I print the first parameter of every pair to show that the moves are valid before starmap is called. Then I call starmap on the pool I created earlier, passing it those parameter pairs. Inexplicably, each coordinate inside a move becomes a tuple of the form ((coordinate),) instead of (coordinate).

I can't figure out why starmap would change the properties of the object passed to it, any help would be greatly appreciated, thanks.


1 answer


Interesting. The problem is not specific to starmap; it happens with all Pool functions (apply, map, etc.). And as it turns out, the problem is not related to multiprocessing at all. It happens when you pickle and unpickle an instance of the Coordinate class, which is what Pool does to send arguments to the worker processes:

>>> import pickle
>>> c = Coordinate(0,2)
>>> print(c)
(0, 2)
>>> str(pickle.loads(pickle.dumps(c)))
'((0, 2),)'
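To see where the extra level of nesting comes from, here is a minimal sketch that mimics, without actually pickling, what the default mechanism does for a tuple subclass. It assumes the default pickle protocol of Python 3, which rebuilds the object by calling __new__ with the arguments returned by __getnewargs__. The names new_args and rebuilt are only for illustration.

```python
class Coordinate(tuple):
    def __new__(cls, *args):
        return tuple.__new__(cls, args)

c = Coordinate(0, 2)

# tuple supplies these arguments for rebuilding the object:
# a 1-tuple holding the original contents as a plain tuple.
new_args = c.__getnewargs__()
print(new_args)   # ((0, 2),)

# Rebuilding calls __new__ with those arguments. Because __new__
# collects them with *args, the plain tuple (0, 2) arrives as one
# positional argument and gets wrapped a second time.
rebuilt = Coordinate.__new__(Coordinate, *new_args)
print(rebuilt)    # ((0, 2),)
```

In other words, the data is not corrupted by the pool; the custom __new__ signature simply does not match what the default rebuilding step passes in.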

      

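Pickling a tuple subclass is not as straightforward as it seems. By default, unpickling calls the class's __new__ with the original tuple as a single argument, and because Coordinate.__new__ collects its arguments with *args, that tuple gets wrapped a second time. You can fix this by defining a __reduce__ method that customizes the pickling process: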

class Coordinate(tuple):
    def __new__(cls, *args):
        return tuple.__new__(cls, args)

    def __reduce__(self):
        return (self.__class__, tuple(self))

      



Now it pickles correctly:

>>> c = Coordinate(0,2)
>>> pickle.loads(pickle.dumps(c))
(0, 2)

      

And your example code works great.
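As a rough sketch of why the __reduce__ approach works: pickle stores the callable and the argument tuple that __reduce__ returns, and on loading it calls the callable with those arguments unpacked. The snippet below performs that step by hand, without pickling; the names factory, args and rebuilt are only for illustration.

```python
class Coordinate(tuple):
    def __new__(cls, *args):
        return tuple.__new__(cls, args)

    def __reduce__(self):
        # (callable, arguments): the loader calls callable(*arguments)
        return (self.__class__, tuple(self))

c = Coordinate(0, 2)

factory, args = c.__reduce__()
print(args)       # (0, 2)

# Equivalent to Coordinate(0, 2), so the values are passed as
# separate positional arguments and no extra nesting appears.
rebuilt = factory(*args)
print(rebuilt)    # (0, 2)
```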
