Initialize dynamic array in O (1) times in C / C ++

Is there a way to initialize an entire dynamic array in O (1) times? Is there something similar bool a[10] = {false}

in the case of a static array?

+3


source to share


4 answers


No, no, not in C ++, not in c, not in any language or processor that I know of. But it can be done with 1 line of code.

in C: for char array:

char a[N];
memset(a, '1', N);  // only works for data 1 byte long

      



in C ++

std::vector<T> v;
std::fill(v.begin(), v.end(), value);

      

+4


source


For a dynamic array, each element you want to set must be individually considered by the CPU, so the time complexity is O (N).

Or that?

But actually it is not. This is also not how processors work.

Big-oh notation refers to the asymptotic behavior of the system as the number of elements increases to infinity. That is: we have to take it with a grain of salt when we apply it to a small number of elements!

Processors also behave differently for small or large items because they have different cache levels that imply different access latencies. And you see all this time in the development of high-performance algorithms. See, for example, this page which describes how to optimize matrix multiplication: The Locking to Maintain Performance section is all about reordering computations so that they stay in the processor's cache.

Now let's talk about your question more specifically.

Consider below a handy graph of latency numbers each of which a computer scientist should know ( source ).

Delay numbers

The main point here is that a reference to main memory (random) costs about 100 ns, while references to L1 cache are 0.5 ns and to L2 cache costs 7 ns. Due to the effects of caching, reading sequentially from RAM gives a significant speed boost.

This means that, all other things being equal, you can read 200 numbers from L1 or 14 numbers from L2 in the time it takes to read one number from RAM.

And that brings us to value models.

Let's consider the initialization of two dynamic arrays:

std::vector<int> a(20,1);
std::vector<int> a(21,1);

      

From the above, we expect the extra item to take on a value of ~ 0.5 ns (since the entire array fits into the cache), whereas storing the array in memory takes 100 ns. That is, the marginal cost of adding an additional element is negligible. In fact, the marginal cost of adding even many elements will be negligible (how much depends on the processor).

Let's apply these ideas. Consider these two statements.

int m = array1[20];
std::vector<int> a(900,1);

      

If you think about it in terms of O (1) and O (N), you might think of something silly, like "the second statement will take 900 times longer than the first." A more complex thought you may have is that "the latent coefficients of the second O (N) operator can be small, so it is difficult to know which will be faster."



With some knowledge of caching, you might instead say to yourself, "for these small values ​​of N, asymptotic analysis does not work." You might think that "these statements might take the same time" or "the second statement might be faster than the first one due to caching effects."

Experiment

We can use a simple experiment to demonstrate this.

The following program initializes arrays of different lengths:

#include <vector>
#include <iostream>
#include <chrono>
#include <string>

int main(int argc, char **argv){
  const int len = std::stoi(std::string(argv[1]));
  auto begin = std::chrono::high_resolution_clock::now();
  for(int i=0;i<10000;i++){
    std::vector<int> a(len,1);
    (void)a;
  }
  auto end = std::chrono::high_resolution_clock::now();
  std::cout << std::chrono::duration_cast<std::chrono::nanoseconds>(end-begin).count() << std::endl; //Nanoseconds
}

      

The program runs many initializations of a dynamic array to average over fluctuations due to other programs running on the machine.

We run this program many times:

for n in {1..100}; do ./a.out 1; done | awk '{print "1 "$1}' | xclip

      

To deal with differences in program startup and machine memory state due to other running programs.

On my Intel (R) Core i5 processor with a 2.67GHz M480 processor with 32K L1 cache, L2000K cache and 3072K L3 cache, the result is as follows:

Sublinear initialization

Note that the increase in time for small arrays is sublinear (combined with the later behavior on large arrays) after about 1000 elements. This is not O (N) behavior. After that, adding 10 times the number of items results in approximately 10 times the time. This behavior is O (N).

Trying the same test on my Intel (R) Xeon (R) E5-2680 v3 @ 2.50GHz processor with 32K L1 cache, 256K L2 cache and 30720K L3 cache gave very similar results:

Sublinear initialization

Bottom line

Array initialization is in O (N) since it grows with the number of elements. However, in practice, the cost does not increase linearly due to caching. Therefore, the "O (1)" command can take the same time as the "O (N)" command, depending on the value of N.

+6


source


It is not possible to initialize an array in less than O (n) time, by the very definition of big-O. The CPU needs to fill the array of elements with n

n

zeros.

When you declare an array bool a[10] = {false}

in a static or global context, the CPU still spends O (n) time filling it with zeros, but this happens at program load time. Likewise, declaring this array in an automatic context (local array variable) results in processor cycles being consumed when entering the function in which the array is declared.

Finally, you can initialize a dynamically allocated array using this syntax:

int *array = new int[ARRAY_SIZE](); // Note the parentheses

      

Again, this is done O (n) times after allocation.

+2


source


define: tells the compiler that there is an array type char and size 100, but has not yet been created The compiler takes note of but does not reserve memory char a [100];

initialize: Tells ompiler that an array exists and reserves 100 bytes of space for the char array b [100] = {}; or char b [100] = {0}; the value of each element is undefined, not initialized, but available for use.

If you want to set the initial value of the array, you need to write it manually char c [100] = {}; for (; i <100; i ++) {c [i] = 0; }

So the array creation and space reservation is 1 (0); Setting the value takes longer

0


source







All Articles