Binary search optimization in C?
I am writing a key lookup where I have a map between a key and a record number. The map is sorted by key. Is there a better way to do this than what I have, for speed?
#include <string.h>  /* for strncmp */

typedef struct
{
    char key[MAX_KEY_LEN];  /* MAX_KEY_LEN is assumed to be defined elsewhere */
    int rec;
} KeyRecPair;

typedef struct
{
    KeyRecPair *map;  /* sorted by key */
    int numRecs;
} KeyRecMap;
int GetRecFromKey(char *key, KeyRecMap *theMap)
{
    int cmpValue, bottom = 0;
    int half = theMap->numRecs / 2;
    int top = theMap->numRecs - 1;

    while (bottom <= top)  /* <=, so the last remaining candidate is still examined */
    {
        cmpValue = strncmp(key, theMap->map[half].key, MAX_KEY_LEN);
        if (cmpValue > 0)
        {
            /* top stays */
            bottom = half + 1;
            half = bottom + (top - bottom) / 2;
            continue;
        }
        if (cmpValue < 0)
        {
            /* bottom stays */
            top = half - 1;
            half = bottom + (top - bottom) / 2;
            continue;
        }
        return theMap->map[half].rec;
    }
    return 0;  /* not found */
}
---
A good chunk of your time will be spent in strncmp.
I suggest forcing it to be inlined, or rewriting the comparison inline, to avoid the function-call overhead.
If you are feeling brave, it may be possible to unroll the loop once or twice and see a performance boost.
If the key is actually a fixed-length char array, you could make the length a multiple of 4 and compare 4 bytes at a time with unsigned int comparisons instead of 1 byte at a time, as in the sketch below.
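A minimal sketch of that idea, with hypothetical names (CompareKeys32, nwords) and the assumption that both keys are NUL-padded to a multiple of 4 bytes. Note the words have to be assembled in big-endian order so that the numeric comparison matches strncmp's byte-by-byte ordering:

#include <stdint.h>

/* Compare two fixed-length keys one 32-bit word at a time.
   Assumes both keys are padded to nwords * 4 bytes. */
static int CompareKeys32(const unsigned char *a, const unsigned char *b, int nwords)
{
    int i;
    for (i = 0; i < nwords; i++)
    {
        /* Build big-endian words so numeric order equals byte order. */
        uint32_t wa = ((uint32_t)a[0] << 24) | ((uint32_t)a[1] << 16)
                    | ((uint32_t)a[2] << 8)  |  (uint32_t)a[3];
        uint32_t wb = ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16)
                    | ((uint32_t)b[2] << 8)  |  (uint32_t)b[3];
        if (wa != wb)
            return (wa < wb) ? -1 : 1;
        a += 4;
        b += 4;
    }
    return 0;
}

If you only need equality rather than ordering, the byte swap can be skipped entirely, since equal words compare equal in any byte order; a good compiler may also collapse the shift pattern into a single load plus byte swap.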
If you don't have a profiler, get one. Profilers make it easy to see the relative costs of different implementations.
Another option is to choose a different way to organize your data. Check out AVL trees for inspiration, or some sort of hashing, as others have mentioned, might be a viable option.
---
Instead of using a binary search to find an item, a hash map might be more appropriate, since it has O(1) lookup characteristics, although a naive implementation can get slow under a load of collisions. However, this article describes a way to build a hash tree that has O(log(n)/log(32)) access times, which is usually superior to conventional hash map implementations (a fixed array plus linked-list buckets).
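For comparison, a minimal sketch of the conventional chained approach (fixed array plus linked lists), reusing the question's KeyRecPair and MAX_KEY_LEN. HASH_BUCKETS and the djb2-style hash are illustrative choices, and the code that builds the table is omitted:

#include <string.h>

#define HASH_BUCKETS 1024  /* power of two, so we can mask instead of mod */

typedef struct HashNode
{
    KeyRecPair pair;
    struct HashNode *next;  /* collision chain */
} HashNode;

static HashNode *buckets[HASH_BUCKETS];

/* djb2-style string hash, masked down to a bucket index. */
static unsigned HashKey(const char *key)
{
    unsigned h = 5381;
    while (*key)
        h = h * 33 + (unsigned char)*key++;
    return h & (HASH_BUCKETS - 1);
}

/* Expected O(1) lookup: hash once, then walk a (hopefully short) chain. */
int GetRecFromKeyHashed(const char *key)
{
    HashNode *node;
    for (node = buckets[HashKey(key)]; node != NULL; node = node->next)
        if (strncmp(key, node->pair.key, MAX_KEY_LEN) == 0)
            return node->pair.rec;
    return 0;  /* not found */
}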
---
Is there a reason to optimize this? Did you run the program with a profiler and determine that the lookup takes a significant portion of the total execution time? Are you just curious how fast you can get it? (Either of those is, in my opinion, a good reason.) If you're just optimizing it arbitrarily, don't.
Also, don't forget to measure. It's hard to tell which of two versions of a function is faster on a modern system (it was easier on my Z80); the number of cache misses may matter more than the number of mispredicted branches.
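Even without a profiler, a crude wall-clock comparison is easy to set up. A sketch using the question's types; TimeLookups, testKeys, and the pass count are hypothetical:

#include <stdio.h>
#include <time.h>

/* Time repeated lookups with clock(); a real profiler gives better data. */
void TimeLookups(KeyRecMap *theMap, char **testKeys, int numKeys)
{
    volatile int sink = 0;  /* keeps the calls from being optimized away */
    clock_t start = clock();
    int i, pass;

    for (pass = 0; pass < 1000; pass++)
        for (i = 0; i < numKeys; i++)
            sink += GetRecFromKey(testKeys[i], theMap);

    printf("%.3f s\n", (double)(clock() - start) / CLOCKS_PER_SEC);
    (void)sink;
}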
---
The only potential optimization I can think of is to use something like the golden ratio when calculating half, instead of dividing the remaining range into two halves with an equal number of elements, i.e.
if (cmpValue > 0)
{
    /* top stays */
    bottom = half + 1;
    half = bottom + (top - bottom) * 3 / 5;
    continue;
}
if (cmpValue < 0)
{
    /* bottom stays */
    top = half - 1;
    half = bottom + (top - bottom) * 2 / 5;
    continue;
}
---
While I would expect a decent optimizer to do this for you, I would put theMap->map in a local variable so it has a chance of living in a register instead of being dereferenced on every access. Again, I'd expect the optimizer to do it anyway, so it's worth checking the assembly output.
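A sketch of that change, identical to the question's function except that the array pointer is hoisted into a local (the name GetRecFromKeyLocal is just to keep the two versions apart):

int GetRecFromKeyLocal(char *key, KeyRecMap *theMap)
{
    KeyRecPair *map = theMap->map;  /* dereference once, reuse in the loop */
    int cmpValue, bottom = 0;
    int half = theMap->numRecs / 2;
    int top = theMap->numRecs - 1;

    while (bottom <= top)
    {
        cmpValue = strncmp(key, map[half].key, MAX_KEY_LEN);
        if (cmpValue > 0)
        {
            bottom = half + 1;
            half = bottom + (top - bottom) / 2;
            continue;
        }
        if (cmpValue < 0)
        {
            top = half - 1;
            half = bottom + (top - bottom) / 2;
            continue;
        }
        return map[half].rec;
    }
    return 0;
}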
I looked at the code Visual Studio 2008 generates in a release build, and it does a pretty good job. For example, the comparison code looks like this:
; 30 : if (cmpValue > 0)
test eax, eax
jle SHORT $LN11@GetRecFrom
; 31 : {
; omitted inner block for > case.
$LN11@GetRecFrom:
; 37 : if (cmpValue < 0)
jge SHORT $LN2@GetRecFrom
Basically, it chains from one conditional branch to the next without re-testing cmpValue. Nice touch.
There is a little benefit to making theMap->map a local, but it's tiny. If MAX_KEY_LEN is not a multiple of 4 and the structures are not padded, you should make sure to put the int first in your structure so it stays aligned.
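To illustrate the layout point (MAX_KEY_LEN of 7 is just a hypothetical odd length):

/* If padding is disabled and the key comes first, rec lands at offset 7
   and is misaligned in every element of the array. With the int first,
   it sits at offset 0 and stays aligned; the char array doesn't care. */
#define MAX_KEY_LEN 7

typedef struct
{
    int rec;                /* offset 0: always 4-byte aligned */
    char key[MAX_KEY_LEN];  /* offset 4 */
} KeyRecPair;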