Searching for an algorithm to find n free bits in a bitset

Background: I am writing a physical memory allocator that allocates 4KiB chunks. It uses a bitset to mark which 4KiB chunks of memory have been used. I don't have a standard C library available to me.

Question: I'm looking for an algorithm that finds n consecutive unset bits in the smallest gap that can hold them, so that the largest gaps are left untouched for future allocations.

Example: Let's say a bitset contains the following bits:

 0010 0000 0111 0001 1100 0011


If I want to set 4 bits, the algorithm should return bit # 18.



4 answers


I think you can do it in 2 passes:

Pass # 1:
Scan the array, noting the number of consecutive zero bits and the beginning of those bits.

In your example, the scan will create:

2 bits, starting at 0
6 bits, starting at 3
3 bits, starting at 12
4 bits, starting at 18

Pass # 2:
Scan the data from pass # 1, finding the target value (4) or the smallest value greater than the target.

Both passes seem trivial to write in C, and it should be a general solution that works in all cases.

After you get this working, there are also some optimizations available, so in certain cases you might not need to run pass # 2 at all.
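To make the two passes concrete, here is a hedged C sketch; the names (scan_runs, best_fit) and the LSB-first byte-array bitset layout are my own assumptions, not from the question:

```c
#include <limits.h>   /* INT_MAX */

struct run { int start, len; };   /* one run of consecutive 0-bits */

/* Pass 1: record every run of consecutive 0-bits.  Returns the run count. */
static int scan_runs(const unsigned char *bits, int nbits, struct run *out)
{
    int count = 0, start = -1;
    for (int i = 0; i <= nbits; ++i) {
        /* Treat the end of the bitset as a set bit to close the last run. */
        int set = (i == nbits) || ((bits[i / 8] >> (i % 8)) & 1);
        if (!set && start < 0) {
            start = i;                   /* a zero run begins */
        } else if (set && start >= 0) {
            out[count].start = start;    /* a zero run ends */
            out[count].len = i - start;
            ++count;
            start = -1;
        }
    }
    return count;
}

/* Pass 2: best fit - the shortest run with len >= n.  Returns the start
   bit of that run, or -1 if no run is long enough. */
static int best_fit(const struct run *runs, int count, int n)
{
    int best = -1, best_len = INT_MAX;
    for (int i = 0; i < count; ++i)
        if (runs[i].len >= n && runs[i].len < best_len) {
            best = runs[i].start;
            best_len = runs[i].len;
        }
    return best;
}
```

On the question's example bitset, pass 1 produces the runs (0,2), (3,6), (12,3), (18,4), and best_fit with n = 4 returns 18.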



As I mentioned in my comment, things get much more complicated once you try to handle coalescing of freed blocks, but the problem as you ask it now is fairly easy to solve with a red-black tree, which I'll call an RBTree below.

There are thousands of RBTree implementations, so you can pick one for your language. Here I'll describe only my allocation path, in Python-like pseudocode. (As I said, freeing memory is a different problem.)

RBTree:

key: number of consecutive 0s.

value: first position of the consecutive 0s.

So in your case you would initialize it like:

rbt=new RBTree()
rbt.insert(2, 0)
rbt.insert(6, 3)
rbt.insert(3, 12)
rbt.insert(4, 18)


Forgive me if I'm wrong.

When you want to allocate a chunk of memory:

func alloc(num_of_chunks):
    # find the minimal key-value pair satisfying: chunk.key >= num_of_chunks
    chunk=rbt.find_ceil(num_of_chunks)
    if chunk is Nil: raise NotFound
    ret=chunk.value
    # the gap found may be bigger than required:
    # re-insert the remainder at its new position
    if chunk.key>num_of_chunks:
        rbt.insert(chunk.key-num_of_chunks, chunk.value+num_of_chunks)
    return ret


The re-insert of the leftover gap keeps the tree up to date.
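For illustration, here is a C sketch of the same alloc logic; the linear scan over a (length, start) table stands in for the RBTree's find_ceil, and all names here are my own, not from any real RBTree library:

```c
struct gap { unsigned len, start; };   /* key = len, value = start */

/* Emulates find_ceil plus the re-insert of the leftover, using an array.
   Returns 1 on success (writing the start chunk to *out_start), 0 if no
   gap is big enough. */
static int alloc_chunks(struct gap *gaps, int *count, unsigned num_of_chunks,
                        unsigned *out_start)
{
    int best = -1;
    for (int i = 0; i < *count; ++i)          /* stand-in for find_ceil */
        if (gaps[i].len >= num_of_chunks &&
            (best < 0 || gaps[i].len < gaps[best].len))
            best = i;
    if (best < 0) return 0;                   /* NotFound */
    *out_start = gaps[best].start;
    if (gaps[best].len > num_of_chunks) {     /* shrink the leftover gap */
        gaps[best].start += num_of_chunks;
        gaps[best].len -= num_of_chunks;
    } else {                                  /* exact fit: drop the entry */
        gaps[best] = gaps[--*count];
    }
    return 1;
}
```

A real implementation would replace the O(n) scan with the tree's O(lg n) ceiling lookup, but the bookkeeping is identical.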

The advantages of using an RBTree:

  • Fast. The linear search I suggested in a comment is O(n), but with an RBTree it shrinks to O(lg n), which means it is more scalable.

  • Easy to maintain. There are thousands of well-implemented libraries that suit your requirements.

  • If you use the RBTree to keep track of block positions (i.e. the value field in my code), you have a good record of the allocated memory and a good starting point when you later tackle coalescing.


Update

This answer may look as if it depends on heap allocation, which would create a chicken-and-egg problem, but it actually does not.

If you know a chunk is not allocated, it is not being used. Therefore the RBTree data can be stored inside the unused chunks themselves, which means the metadata lives in the memory space it manages. So in C, a node for a problem like this could be:

struct node {
    int length;                 // key: run length in chunks
    struct node *left, *right;
    // a real red-black tree node also carries a color bit
};


placed in the first bytes of the free chunk.

So you only need to remember the root:

struct node *root;
// your code should operate on rootp, since rotation on RBTree may have root changed.
// all interfaces related should all receive struct node ** type.
struct node **rootp = &root;


"But wait, what about the value? You haven't defined a field to store the block address."

Actually, I have: since the node is stored inside the chunk itself, the address of the struct node is the address of the block.

This way you can avoid heap allocation. It may seem I still haven't answered how to find a suitable bit sequence, but I think this way you can manage your memory allocation better.
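A minimal sketch of the node-in-chunk idea; CHUNK_SIZE and make_node are hypothetical names of mine, and a real red-black node would also carry a color bit:

```c
#include <stddef.h>   /* NULL */

#define CHUNK_SIZE 4096u   /* assumption: 4 KiB chunks, as in the question */

struct node {
    unsigned length;              /* key: run length in chunks */
    struct node *left, *right;
};

/* Turn the first bytes of a free chunk into a tree node; the node's own
   address doubles as the block address, so no separate value field is
   needed. */
static struct node *make_node(void *chunk_base, unsigned length)
{
    struct node *n = (struct node *) chunk_base;
    n->length = length;
    n->left = n->right = NULL;
    return n;
}
```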



#include <limits.h>   /* INT_MAX */

/* Best-fit scan over one word of the bitset; returns the start bit of the
   smallest run of at least req zero bits, or -1 if none exists. */
int find_free(unsigned alloca_mem, int req)
{
    int min_len = INT_MAX;
    int start = 0;
    int min_len_index = -1;
    int i;
    for (i = 0; i <= NUM_BITS; ++i) {
        /* Treat the end of the word as a set bit to close the last run. */
        if (i == NUM_BITS || (alloca_mem & (1u << i))) {
            int len = i - start;               /* length of the zero run */
            if (len >= req && len < min_len) {
                min_len_index = start;         /* record where the run starts */
                min_len = len;
            }
            start = i + 1;
        }
    }
    return min_len_index;
}




Call the machine word size w (often w = 32 or 64) and the total number of 4 KiB memory pages m. If there are g gaps between the blocks already allocated, and there is some free block of length n that fits completely within a machine word (and you are happy with the smallest such block, rather than one that crosses the boundary between two words), then it can be found in O(m/w + g) time using the "rippling carry" trick of addition:

  • Given a bitmap word representing w 4 KiB pages, invert all of its bits. Now any block of k consecutive free pages is a block of k consecutive 1-bits.
  • If you add an integer whose only 1-bit is at the LSB of that block, the carries will ripple through it, leaving all k bits of the block at 0 and the next bit above at 1.
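The ripple step can be seen in isolation on the question's example word; ripple_gap is a hypothetical helper of mine, not part of the answer's code below:

```c
/* Consume the lowest gap of 1-bits in y (whose lowest bit is probe, a
   power of two) and return a mask of the gap's bits; *edge_out receives
   the first bit above the gap (0 if the gap reaches the top of the word). */
static unsigned ripple_gap(unsigned y, unsigned probe, unsigned *edge_out)
{
    unsigned rippled = y + probe;          /* carries wipe the gap's 1-bits */
    unsigned edge = rippled & -rippled;    /* lowest bit above the gap */
    *edge_out = edge;
    return edge - probe;                   /* every bit in the gap is 1 */
}
```

Note that edge - probe still yields the correct mask when the addition overflows to 0, thanks to unsigned wraparound.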

Using this alone, it is not too hard to find a gap that contains n bits in O(m/w + g) time, without using expensive or "exotic" instructions like multiply, divide, or bit-scan. But in order to find the smallest such block, you seem to need either a bit scan or a division. Fortunately, we can limit the number of times these instructions are needed to at most w-1 in total over the entire bitmap, since that is the largest number of times the best gap seen so far can be tightened. (Instead of using division, you could use code like the final loop below to find the bit index in O(log w) time with simple instructions; this might be faster, but the point is moot since so few divisions are required.) We still need O(g) multiplications, which may not be ideal, but multiplication is fast on modern CPUs.

#include <limits.h>   /* CHAR_BIT, UINT_MAX */

enum { m = /* HOW MANY BLOCKS */,
       w = sizeof (unsigned) * CHAR_BIT };
unsigned bits[m / w];    /* m must be a compile-time constant in C */

unsigned findSmallestBlock(unsigned n) {
    if (n > w) return UINT_MAX;    // Can only find within-word blocks.
    unsigned bestI = UINT_MAX;    // Index within bits[]
    unsigned bestLsb = 0;  // Has a 1-bit in the LSB of the gap; 0 = "no solution"
    unsigned bestShifted = ~0;    // The gap bits, "flush right".
    for (int i = 0; i < sizeof bits / sizeof bits[0]; ++i) {
        // Find the shortest gap in bits[i], if one exists.
        // First, handle the all-zero word as an edge case: with no best
        // recorded yet, the "tightest fit" comparison below can never
        // accept a full-word gap, so record it directly.
        if (bestLsb == 0 && bits[i] == 0) {
            bestI = i;
            bestLsb = 1;
            bestShifted = ~0u;
            if (n == w) break;    // Perfectly snug
        }
        unsigned y = ~bits[i];
        unsigned probe = y & -y;    // Isolate the LSB of the lowest gap
        while (probe << (n - 1)) {    // Left-shifting it too far => 0 => stop.
            y += probe;    // Ripple!
            unsigned edge = y & -y;    // Isolate the new LSB.  Overflow to 0 OK.
            unsigned gap = edge - probe;    // Every bit in the gap is 1.
            // Is the gap big enough?
            if (probe << (n - 1) <= gap) {
                // The gap is big enough.  Is it the tightest fit so far?
                // Doing this with bit-scan operations is easy; without them we can
                // use integer multiplication and division, but we want to keep the
                // divisions to a minimum.  Needs an edge case to be handled above.
                if (gap < bestShifted * probe) {
                    // Found a new, shorter gap.
                    // The division shifts the entire block right so that it starts
                // at bit 0.  It's expensive, but happens at most w-1 times.
                    bestI = i;
                    bestLsb = probe;
                    bestShifted = gap / probe;
                    // Is it perfectly snug?  If so, we can stop now.
                    if (probe << (n - 1) > gap >> 1) goto done;    // Yes, "goto".
                }
            }
            y -= edge;            // Clear the *new* LSB; ignore intervening bits
            probe = edge << 1;    // Again, skip over all intervening bits
        }
    }

done:
    if (bestLsb == 0) return UINT_MAX;    // Didn't find anything

    // Find the index of the 1-bit in bestLsb in O(log w) time.
    unsigned pos = bestI * w;
    unsigned z = w >> 1;    // The number of bits we will try to shift right
    while (bestLsb > 1) {
        unsigned x = bestLsb >> z;
        if (x) {
            bestLsb = x;
            pos += z;
        }
        z >>= 1;
    }

    return pos;
}
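The closing loop's binary search for the bit position can be lifted into a standalone helper; here is a sketch assuming x is a power of two (bit_index is my name for it, not from the code above):

```c
#include <limits.h>   /* CHAR_BIT */

/* Index of the single set bit in x (x must be a power of two), found in
   O(log w) halving steps without a bit-scan instruction. */
static unsigned bit_index(unsigned x)
{
    unsigned pos = 0;
    for (unsigned z = (unsigned) (sizeof x * CHAR_BIT) >> 1; z > 0; z >>= 1) {
        unsigned shifted = x >> z;
        if (shifted) {      /* the bit lies in the upper half: shift it down */
            x = shifted;
            pos += z;
        }
    }
    return pos;
}
```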


A word about time complexity: O(m/w + g) matches the O(m) time that some other solutions suggest when g is about the same size as m, which can happen: for example, if every second 4 KiB page is allocated, then g = m/2. But if g is small, either because few pages are allocated, or because almost all of them are, or because they were allocated and freed in a pattern that leaves only a few gaps, then the time complexity is closer to O(m/w).

One more thing: without changing the time complexity, this code can be adapted to find blocks that span word boundaries, by tracking the trailing gap in each word and doing a one-off test at the start of the next word to see whether extending that trailing gap would satisfy the request. It is also possible to search for blocks larger than w (observe that any gap satisfying a query for n > kw pages must contain at least k full words of zeros in bits[]; exploiting this can significantly reduce the constant factor when searching for large gaps). Both extensions require more bookkeeping.
