Std :: sort does a lot of unnecessary swaps
I have the following code where I am trying a lexicographic comparison of two simple vectors
class ExpensiveClass {
public:
static int increment;
std::vector<int> bigVector;
ExpensiveClass() {
for (int j = 0; j < 1 + increment; ++j)
bigVector.emplace_back(j);
++increment;
}
ExpensiveClass(const ExpensiveClass& other) {
std::cout << "Expensive copy constructor called!" << std::endl;
this->bigVector = other.bigVector;
}
};
int ExpensiveClass::increment = 0;
bool sortFunc(const ExpensiveClass& a, const ExpensiveClass& b) {
bool ret = a.bigVector < b.bigVector;
if (ret == false)
std::cout << "Need to swap" << std::endl;
return ret;
}
int main()
{
std::vector<ExpensiveClass> vectorOfVectors;
vectorOfVectors.reserve(100);
for (int i = 0; i < 100; ++i)
vectorOfVectors.emplace_back();
std::cout << "Start sorting.." << std::endl;
std::sort(vectorOfVectors.begin(), vectorOfVectors.end(), sortFunc);
}
and I would like to know why std::sort
does a lot of swaps even on a sorted vector?
source to share
I think we are falling into the dark corners of the STL. I played around with your code and found that yours sortFunc()
doesn't always return true
. This is because the passed arguments are out of order. The algorithm std::sort
selects items from yours vectorOfVectors
that are out of order, i.e. The index a
inside is vectorOfVectors
not guaranteed to be higher than the index b
. But I don't know how to go deeper into this, so I created separate questions: https://stackoverflow.com/questions/30539898/counter-intuitive-behavior-of-stdsort
Anyway, my point is that you made some assumptions about the behavior std::sort
, however this is implementation dependent.
source to share
Many sorting algorithms that perform well on average compare the elements of the input containers for different reasons, such as to randomize across different inputs to create a good average time complexity, or to rotate swaps around an average threshold, etc.
There are several other algorithms (such as mergesort and heapsort) that start by copying elements and creating a helper structure; for these, you can count on every element of the array copied at least once. These algorithms provide guaranteed asymptotically optimal worst-case performance at the expense of average average performance.
stl :: sort does not guarantee which sorting algorithm it will use. In fact, some STL implementations use two different algorithms: start with an algorithm with good average performance (worst case), and if the algorithm doesn't complete fast enough, stl :: sort switches to a slower average case algorithm with guaranteed asymptotically optimal worst case performance ...
On the other hand, the STL gives you several options for sorting your container: besides stl :: sort (which should work well), you have stl :: stable_sort (which preserves the relative order of equivalent elements) stl :: partial_sort, etc.
I think that in your case, you might prefer stl :: stable_sort. This ensures that there are no swaps between items that are already in the correct order. I doubt stl :: stable_sort actually minimizes the number of swaps, but less should be replaced than stl :: sort.
source to share