Why are there wide runtime variations when running UnionFind on Windows and Mac OS?
I recently took a course on data structures, where I learned that QuickUnion performs better than QuickFind when connecting two elements. But when I test the same code with GCC on Windows instead of on Mac OS X, I get a completely different result, and I don't know why. Here is the QuickFind code:
#ifndef INC_03_QUICK_UNION_UNIONFIND1_H
#define INC_03_QUICK_UNION_UNIONFIND1_H
#include <cassert>
using namespace std;
namespace UF1 {
    class UnionFind {
    private:
        int *id;
        int count;
    public:
        UnionFind(int n) {
            count = n;
            id = new int[n];
            for (int i = 0; i < n; i++)
                id[i] = i;
        }
        ~UnionFind() {
            delete[] id;
        }
        int find(int p) {
            assert(p >= 0 && p < count);
            return id[p];
        }
        bool isConnected(int p, int q) {
            return find(p) == find(q);
        }
        void unionElements(int p, int q) {
            int pID = find(p);
            int qID = find(q);
            if (pID == qID)
                return;
            for (int i = 0; i < count; i++)
                if (id[i] == pID)
                    id[i] = qID;
        }
    };
}
#endif //INC_03_QUICK_UNION_UNIONFIND1_H
And QuickUnion:
#ifndef INC_03_QUICK_UNION_UNIONFIND2_H
#define INC_03_QUICK_UNION_UNIONFIND2_H
#include <cassert>
using namespace std;
namespace UF2{
    class UnionFind{
    private:
        int* parent;
        int count;
    public:
        UnionFind(int count){
            parent = new int[count];
            this->count = count;
            for( int i = 0 ; i < count ; i ++ )
                parent[i] = i;
        }
        ~UnionFind(){
            delete[] parent;
        }
        int find(int p){
            assert( p >= 0 && p < count );
            while( p != parent[p] )
                p = parent[p];
            return p;
        }
        bool isConnected( int p , int q ){
            return find(p) == find(q);
        }
        void unionElements(int p, int q){
            int pRoot = find(p);
            int qRoot = find(q);
            if( pRoot == qRoot )
                return;
            parent[pRoot] = qRoot;
        }
    };
}
#endif //INC_03_QUICK_UNION_UNIONFIND2_H
Then UnionFindTestHelper, a namespace with helper functions for testing the two data structures:
#ifndef INC_03_QUICK_UNION_UNIONFINDTESTHELPER_H
#define INC_03_QUICK_UNION_UNIONFINDTESTHELPER_H
#include <iostream>
#include <cstdlib>   // rand, srand
#include <ctime>     // clock, clock_t, time
#include "UnionFind1.h"
#include "UnionFind2.h"
using namespace std;
namespace UnionFindTestHelper{
    void testUF1( int n ){
        srand( time(NULL) );
        UF1::UnionFind uf(n);   // direct initialization, no temporary to copy
        clock_t startTime = clock();
        for( int i = 0 ; i < n ; i ++ ){
            int a = rand()%n;
            int b = rand()%n;
            uf.unionElements(a,b);
        }
        for(int i = 0 ; i < n ; i ++ ){
            int a = rand()%n;
            int b = rand()%n;
            uf.isConnected(a,b);
        }
        clock_t endTime = clock();
        cout<<"UF1, "<<2*n<<" ops, "<<double(endTime-startTime)/CLOCKS_PER_SEC<<" s"<<endl;
    }
    void testUF2( int n ){
        srand( time(NULL) );
        UF2::UnionFind uf(n);   // direct initialization, no temporary to copy
        clock_t startTime = clock();
        for( int i = 0 ; i < n ; i ++ ){
            int a = rand()%n;
            int b = rand()%n;
            uf.unionElements(a,b);
        }
        for(int i = 0 ; i < n ; i ++ ){
            int a = rand()%n;
            int b = rand()%n;
            uf.isConnected(a,b);
        }
        clock_t endTime = clock();
        cout<<"UF2, "<<2*n<<" ops, "<<double(endTime-startTime)/CLOCKS_PER_SEC<<" s"<<endl;
    }
}
#endif //INC_03_QUICK_UNION_UNIONFINDTESTHELPER_H
Finally main.cpp:
#include <iostream>
#include "UnionFindTestHelper.h"
using namespace std;
int main() {
    int n = 100000;
    UnionFindTestHelper::testUF1(n);
    UnionFindTestHelper::testUF2(n);
    return 0;
}
My teacher's test showed that QuickUnion takes about half the time of QuickFind, but when I tested on Windows 10 x64, the two running times were almost the same. I don't know whether I made a mistake or whether the difference comes from the operating system.
First, you wrote:
I found out that QuickUnion performance is better than QuickFind when connecting two items. But when I test the same code ...
However, your test program does not measure only the cost of connecting elements: it times the union operations and the find operations (through isConnected) together.
Second, here are the orders of growth for N elements for QuickFind and QuickUnion:
QuickFind:
- find: O(1)
- union: O(N)
QuickUnion:
- find: O(tree height)
- union: O(tree height)
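To make the O(1) versus O(N) difference for QuickFind concrete, here is a minimal sketch that counts reads of the id array. CountingQuickFind and its accesses counter are hypothetical names added for illustration; the logic otherwise follows UF1::UnionFind from the question.

```cpp
#include <cassert>
#include <vector>

// Hypothetical instrumented QuickFind: same logic as UF1::UnionFind,
// plus a counter of how many id[] entries are read.
struct CountingQuickFind {
    std::vector<int> id;
    long long accesses = 0;

    explicit CountingQuickFind(int n) : id(n) {
        for (int i = 0; i < n; i++)
            id[i] = i;
    }

    // One array read: O(1).
    int find(int p) {
        assert(p >= 0 && p < static_cast<int>(id.size()));
        accesses++;
        return id[p];
    }

    // Two finds plus a scan of the whole array: O(N).
    void unionElements(int p, int q) {
        int pID = find(p);
        int qID = find(q);
        if (pID == qID)
            return;
        for (int i = 0; i < static_cast<int>(id.size()); i++) {
            accesses++;
            if (id[i] == pID)
                id[i] = qID;
        }
    }
};
```

With 1000 elements, a single unionElements(0, 1) performs 1002 reads (two finds plus the full scan), while each later find adds exactly one.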
QuickUnion is not always faster than QuickFind with your test program.
- QuickFind is efficient when you do many more find operations than union operations.
- QuickUnion can be more efficient when you do more union operations than find operations.
Finally, QuickUnion performs poorly when its trees are tall, because every find and every union has to walk from a node up to its root.
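As a rough illustration of how tall the trees can get, the sketch below connects the same neighbouring pairs in two argument orders and measures the resulting depth. DepthQuickUnion, depth, chainDepth and starDepth are hypothetical names introduced here; the union rule (attach the root of p under the root of q) is the one used by UF2::UnionFind in the question.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical QuickUnion with a depth() helper added for illustration.
class DepthQuickUnion {
    std::vector<int> parent;
public:
    explicit DepthQuickUnion(int n) : parent(n) {
        for (int i = 0; i < n; i++)
            parent[i] = i;
    }
    int find(int p) const {
        assert(p >= 0 && p < static_cast<int>(parent.size()));
        while (p != parent[p])
            p = parent[p];
        return p;
    }
    void unionElements(int p, int q) {
        int pRoot = find(p);
        int qRoot = find(q);
        if (pRoot != qRoot)
            parent[pRoot] = qRoot;   // root of p goes under root of q
    }
    // Number of parent links followed from p to its root.
    int depth(int p) const {
        int steps = 0;
        while (p != parent[p]) {
            p = parent[p];
            steps++;
        }
        return steps;
    }
};

// union(0,1), union(1,2), ...: each old root is hung under a fresh
// element, producing one long chain. Returns the depth of element 0.
int chainDepth(int n) {
    DepthQuickUnion uf(n);
    for (int i = 0; i + 1 < n; i++)
        uf.unionElements(i, i + 1);
    return uf.depth(0);
}

// Same pairs with the arguments swapped: every element ends up
// directly under element 0. Returns the maximum depth over all elements.
int starDepth(int n) {
    DepthQuickUnion uf(n);
    for (int i = 0; i + 1 < n; i++)
        uf.unionElements(i + 1, i);
    int maxDepth = 0;
    for (int i = 0; i < n; i++)
        maxDepth = std::max(maxDepth, uf.depth(i));
    return maxDepth;
}
```

For n = 1000, chainDepth returns 999 and starDepth returns 1, even though both calls connect the same pairs of elements. The sequence of union arguments alone decides whether find is cheap or expensive.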
In your test program, the height of the trees depends on the values returned by rand(). This explains why your results vary from one system to another. To make the test reproducible, rewrite it so that it does not depend on rand().
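One possible way to do this, sketched below, is to swap rand() for std::mt19937 with a fixed seed and to feed both structures exactly the same pairs. The raw output sequence of std::mt19937 for a given seed is fixed by the C++ standard, whereas rand() differs between C libraries; in particular, RAND_MAX is only guaranteed to be at least 32767, so on an implementation where it is that small, rand()%n with n = 100000 never produces the larger indices. makePairs and timeOps are hypothetical helper names, and UF stands for any class with the interface used in the question (a constructor taking n, unionElements, isConnected).

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <random>
#include <utility>
#include <vector>

// Generates `count` pairs of indices in [0, n) from a fixed seed.
// Using gen() % n keeps the result identical across standard libraries
// (at the price of a slight modulo bias).
std::vector<std::pair<int, int>> makePairs(int n, int count, std::uint32_t seed) {
    assert(n > 0 && count >= 0);
    std::mt19937 gen(seed);
    std::vector<std::pair<int, int>> pairs;
    pairs.reserve(count);
    for (int i = 0; i < count; i++) {
        int a = static_cast<int>(gen() % n);
        int b = static_cast<int>(gen() % n);
        pairs.emplace_back(a, b);
    }
    return pairs;
}

// Times the given unions followed by the given queries on a fresh UF.
// connectedCount receives the number of connected query pairs, so the
// isConnected calls have an observable result and are not optimized away.
template <typename UF>
double timeOps(int n,
               const std::vector<std::pair<int, int>>& unions,
               const std::vector<std::pair<int, int>>& queries,
               int& connectedCount) {
    UF uf(n);
    connectedCount = 0;
    auto start = std::chrono::steady_clock::now();
    for (const auto& pr : unions)
        uf.unionElements(pr.first, pr.second);
    for (const auto& pr : queries)
        if (uf.isConnected(pr.first, pr.second))
            connectedCount++;
    auto end = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(end - start).count();
}
```

With the headers from the question included, the two structures could then be compared on identical input, for example by building unions and queries once with makePairs and calling timeOps<UF1::UnionFind> and timeOps<UF2::UnionFind> with them. Both calls should report the same connectedCount, which also serves as a quick correctness check.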