Python: OOP overhead?

I was working on a real-time application and noticed that some OOP design patterns create incredible overhead in Python (tested with 2.7.5).

Why do simple dictionary value accessors take almost 5 times as long when the dictionary is encapsulated by another object?

For example, by running the code below, I got:

Dict Access: 0.167706012726
Attribute Access: 0.191128969193
Method Wrapper Access: 0.711422920227
Property Wrapper Access: 0.932291030884

      

Executable code:

class Wrapper(object):
    def __init__(self, data):
        self._data = data

    @property
    def id(self):
        return self._data['id']

    @property
    def name(self):
        return self._data['name']

    @property
    def score(self):
        return self._data['score']


class MethodWrapper(object):
    def __init__(self, data):
        self._data = data

    def id(self):
        return self._data['id']

    def name(self):
        return self._data['name']

    def score(self):
        return self._data['score']


class Raw(object):
    def __init__(self, id, name, score):
        self.id = id
        self.name = name
        self.score = score


data = {'id': 1234, 'name': 'john', 'score': 90}
wp = Wrapper(data)
mwp = MethodWrapper(data)
obj = Raw(data['id'], data['name'], data['score'])


def dict_access():
    for _ in xrange(100):
        uid = data['id']
        name = data['name']
        score = data['score']


def method_wrapper_access():
    for _ in xrange(100):
        uid = mwp.id()
        name = mwp.name()
        score = mwp.score()


def property_wrapper_access():
    for _ in xrange(100):
        uid = wp.id
        name = wp.name
        score = wp.score


def object_access():
    for _ in xrange(100):
        uid = obj.id
        name = obj.name
        score = obj.score


import timeit
print 'Dict Access:', timeit.timeit("dict_access()", setup="from __main__ import dict_access", number=10000)
print 'Attribute Access:', timeit.timeit("object_access()", setup="from __main__ import object_access", number=10000)
print 'Method Wrapper Access:', timeit.timeit("method_wrapper_access()", setup="from __main__ import method_wrapper_access", number=10000)
print 'Property Wrapper Access:', timeit.timeit("property_wrapper_access()", setup="from __main__ import property_wrapper_access", number=10000)

      



1 answer


This has to do with the dynamic lookups that the Python interpreter (CPython) performs to dispatch all your calls, indexing, etc. Dynamic lookups give the language a lot of flexibility, but at a cost in execution time. When you use the Method Wrapper, at least the following happens:

  • look up mwp.id (it is a method, but it is also just an object assigned to an attribute, and it has to be looked up like any other)

  • call mwp.id()

  • inside the method, look up self._data

  • look up __getitem__ in self._data

  • call __getitem__ (this will at least be a C function, but you still had to go through all those dynamic lookups to get here)

By comparison, the "Dict Access" test case only needs to look up __getitem__ and then call it.
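The steps above can be spelled out by hand. The following is only a conceptual sketch: it mimics the sequence of lookups with ordinary Python calls and is not how CPython implements them internally. The class is a trimmed copy of MethodWrapper from the question; the names bound_id, inner, getitem, via_wrapper, via_steps and direct are made up for illustration.

```python
class MethodWrapper(object):
    def __init__(self, data):
        self._data = data

    def id(self):
        return self._data['id']


data = {'id': 1234, 'name': 'john', 'score': 90}
mwp = MethodWrapper(data)

# Steps 1-2: look up the attribute 'id' on mwp (found on the class and
# bound to the instance), then call the resulting bound method.
bound_id = getattr(mwp, 'id')
via_wrapper = bound_id()

# Steps 3-5, written out as they happen inside the method body:
inner = getattr(mwp, '_data')        # look up self._data
getitem = type(inner).__getitem__    # look up __getitem__ on the dict type
via_steps = getitem(inner, 'id')     # call it

# The plain dict access only needs the equivalent of the last two steps.
direct = data['id']
```

All three variables end up holding the same value; the difference is only in how many lookups and calls were needed to get there.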



As Matteo Italia points out in a comment, this is implementation-specific. In the Python ecosystem, you now also have PyPy (uses a JIT and runtime optimizations), Cython (compiles to C, with optional static type annotations, etc.), Nuitka (compiles to C++, and is meant to accept the code as-is) and several other implementations.

One way to optimize these lookups in pure Python with CPython is to get direct references to the objects and assign them to local variables outside of loops, and then use the local variables inside the loops. This is an optimization that can come at the cost of cluttering the code and/or breaking encapsulation.
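As a rough sketch of that idea applied to the question's MethodWrapper: the two "hoisted" functions below are hypothetical names, not part of the original code, and range is used instead of xrange so the sketch runs on both Python 2 and 3. How much time this saves depends on the interpreter and version, so measure before relying on it.

```python
import timeit


class MethodWrapper(object):
    def __init__(self, data):
        self._data = data

    def id(self):
        return self._data['id']

    def name(self):
        return self._data['name']

    def score(self):
        return self._data['score']


data = {'id': 1234, 'name': 'john', 'score': 90}
mwp = MethodWrapper(data)


def method_wrapper_access():
    # Baseline: every iteration looks up mwp (a global) and the method on it.
    for _ in range(100):
        uid = mwp.id()
        name = mwp.name()
        score = mwp.score()
    return uid, name, score


def hoisted_method_access():
    # Look up the bound methods once and keep them in local variables.
    get_id, get_name, get_score = mwp.id, mwp.name, mwp.score
    for _ in range(100):
        uid = get_id()
        name = get_name()
        score = get_score()
    return uid, name, score


def hoisted_dict_access():
    # Goes further by reaching into the private dict: this breaks
    # encapsulation and assumes _data is not rebound during the loop.
    d = mwp._data
    for _ in range(100):
        uid = d['id']
        name = d['name']
        score = d['score']
    return uid, name, score


if __name__ == '__main__':
    for func in (method_wrapper_access, hoisted_method_access, hoisted_dict_access):
        print('%s: %f' % (func.__name__, timeit.timeit(func, number=1000)))
```

The first hoisted variant keeps the wrapper's interface intact and only saves the repeated attribute lookups; the second also skips the method calls, at the price of depending on the wrapper's internals.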
