How to achieve data locality using polymorphism?
(The title may be "less than optimal".)
Let's assume the code is like this:
class Foo {/*stuff*/};
class Bar1 : Foo {/*stuff*/};
class Bar2 : Foo {/*stuff*/};
std::vector<Foo*> foos;
// Populate 'foos' with Foo, Bar1 and Bar2 objects
// Iterate through foos
for(Foo* foo : foos) foo->doSomething();
Basically, it foos
is a vector with object pointers Foo
. However, scrolling through this vector can cause cache misses. A theoretical remedy would be to store actual objects instead of pointers, but this is not allowed in C ++ (no polymorphism with arrays).
That said: How can you improve data location (and minimize cache misses) when a lot of polymorphic objects are required?
This is interesting to me since everyone tells me that cache hits / skips are of great importance in performance-critical software and therefore pointers like the above code example should be avoided. However, this essentially means discarding polymorphism.
source to share
I think it often happens that you have to sacrifice efficiency in order to use polymorphism, but in this case, perhaps you can maintain separate vectors Bar1
and Bar2
. You can think of them as "pool" Bar1
and Bar2
.
And then fill the vector of object Foo
pointers with pointers to objects in the pools Bar1
and Bar2
:
template<typename Bar>
void populateFoos(std::vector<Foo*>& foos, std::vector<Bar>& bars) {
for (auto& bar : bars)
foos.emplace_back(&bar);
}
std::vector<Bar1> bar1s;
std::vector<Bar2> bar2s;
std::vector<Foo*> foos;
// Populate Bar1s
bar1s.emplace_back();
bar1s.emplace_back();
// Populate Bar2s
bar2s.emplace_back();
// Populate 'foos' with Bar1 and Bar2 objects
populateFoos(foos, bar1s);
populateFoos(foos, bar2s);
// Iterate through foos
for(auto foo : foos)
foo->doSomething();
You need to be careful that you do not invalidate the object pointers by Foo
reallocating the pools Bar1
and Bar2
.
source to share