How to achieve data locality using polymorphism?

(The title may be "less than optimal".)

Let's assume the code is like this:

class Foo {/*stuff*/};
class Bar1 : Foo {/*stuff*/};
class Bar2 : Foo {/*stuff*/};

std::vector<Foo*> foos;

// Populate 'foos' with Foo, Bar1 and Bar2 objects

// Iterate through foos
for(Foo* foo : foos) foo->doSomething();

      

Basically, it foos

is a vector with object pointers Foo

. However, scrolling through this vector can cause cache misses. A theoretical remedy would be to store actual objects instead of pointers, but this is not allowed in C ++ (no polymorphism with arrays).

That said: How can you improve data location (and minimize cache misses) when a lot of polymorphic objects are required?

This is interesting to me since everyone tells me that cache hits / skips are of great importance in performance-critical software and therefore pointers like the above code example should be avoided. However, this essentially means discarding polymorphism.

+3


source to share


1 answer


I think it often happens that you have to sacrifice efficiency in order to use polymorphism, but in this case, perhaps you can maintain separate vectors Bar1

and Bar2

. You can think of them as "pool" Bar1

and Bar2

.

And then fill the vector of object Foo

pointers with pointers to objects in the pools Bar1

and Bar2

:

template<typename Bar>
void populateFoos(std::vector<Foo*>& foos, std::vector<Bar>& bars) {
    for (auto& bar : bars)
        foos.emplace_back(&bar);
}

std::vector<Bar1> bar1s;
std::vector<Bar2> bar2s;

std::vector<Foo*> foos;

// Populate Bar1s
bar1s.emplace_back();
bar1s.emplace_back();

// Populate Bar2s
bar2s.emplace_back();

// Populate 'foos' with Bar1 and Bar2 objects
populateFoos(foos, bar1s);
populateFoos(foos, bar2s);

// Iterate through foos
for(auto foo : foos) 
    foo->doSomething(); 

      



You need to be careful that you do not invalidate the object pointers by Foo

reallocating the pools Bar1

and Bar2

.

+1


source







All Articles