Can XorShift return zero?

I have been reading about the xorshift PRNG, in particular the document here

The author claims that

The number is in the range [1, 2 ** 64]. Note that it will never be 0.

Looking at the code, that makes sense:

uint64_t x;
uint64_t next(void) {
   x ^= x >> 12; // a
   x ^= x << 25; // b
   x ^= x >> 27; // c
   return x * UINT64_C(2685821657736338717);
}

      

If x is equal to zero, then every subsequent number will also be zero. But doesn't that make the generator less useful? The usual usage pattern would be something like min + rand() % (max - min), or truncating 64 bits to 32 bits if you only need an int. If 0 is never returned, that can be a serious problem. Also, the bits are not 0 or 1 with equal probability: since the value 0 is missing, zero bits are slightly less likely. I can't find any mention of this on Wikipedia, so am I missing something?

So what is a good / suitable way to generate random, uniformly distributed numbers in a given range from xorshift64*?

+3


2 answers


Short answer: No, it cannot return zero.

According to Numerical Recipes, "it produces a full period 2^64-1 [...], the missing value is zero."



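The bottom line is that the shift amounts were carefully chosen to produce a very long sequence (here a full-period one that leaves out only zero), so you can be sure that every nonzero state is visited. Zero is indeed a fixed point of this generator, so the state space splits into two cycles: one containing only zero, and one containing all the other numbers. Because the final multiplier is odd, the multiplication maps 64-bit values one-to-one and sends 0 to 0, so the returned value is also never zero as long as the state is nonzero.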
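To make the fixed-point argument concrete, here is a minimal sketch. It reuses the shifts and the multiplier from the question, but passes the state explicitly; the names xorshift_step, scramble and sees_zero are made up for this illustration. Checking a finite number of outputs obviously proves nothing by itself, it only illustrates the claim.

```c
#include <stdint.h>

/* One state update with the same shifts as in the question.
   The state is passed in explicitly instead of living in a global. */
static uint64_t xorshift_step(uint64_t x) {
    x ^= x >> 12;
    x ^= x << 25;
    x ^= x >> 27;
    return x;
}

/* Output scrambler from the question: multiplying by an odd constant
   is a one-to-one map on 64-bit values and sends 0 to 0. */
static uint64_t scramble(uint64_t x) {
    return x * UINT64_C(2685821657736338717);
}

/* Returns 1 if any of the first n outputs produced from `seed` is zero,
   0 otherwise. (Hypothetical helper, for illustration only.) */
static int sees_zero(uint64_t seed, unsigned long n) {
    uint64_t x = seed;
    for (unsigned long i = 0; i < n; i++) {
        x = xorshift_step(x);
        if (scramble(x) == 0) {
            return 1;
        }
    }
    return 0;
}
```

With this, xorshift_step(0) is 0 (the all-zero cycle), while sees_zero(1, 100000) is 0: a nonzero seed stays on the other cycle.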

Thus, IMO, for a sufficiently small range max - min it is enough to use (next() - 1) % (max - min) + min, or even to drop the subtraction entirely, since the modulo operation can still return zero. If a better-quality uniform distribution is required, the "normal" method should be used, with next() as a base generator with range [1, 2^64).
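The answer does not say what the "normal" method is. One common choice is rejection sampling, sketched below under that assumption. The seed value and the name next_in_range are arbitrary choices for this example, not part of the original code.

```c
#include <stdint.h>

/* Generator from the question; the seed is an arbitrary nonzero value. */
static uint64_t x = UINT64_C(0x9E3779B97F4A7C15);

static uint64_t next(void) {
    x ^= x >> 12;
    x ^= x << 25;
    x ^= x >> 27;
    return x * UINT64_C(2685821657736338717);
}

/* Integer in [min, max); requires min < max.
   next() - 1 takes 2^64 - 1 distinct values, 0 .. 2^64 - 2.
   Values at or above the largest multiple of `range` are rejected,
   so every residue modulo `range` is hit equally often. */
static uint64_t next_in_range(uint64_t min, uint64_t max) {
    uint64_t range = max - min;
    uint64_t count = UINT64_MAX;              /* how many values next() - 1 can take */
    uint64_t limit = count - count % range;   /* largest multiple of range <= count */
    uint64_t r;
    do {
        r = next() - 1;
    } while (r >= limit);
    return min + r % range;
}
```

For small ranges the loop almost never repeats, and even in the worst case fewer than half of the draws are rejected. For example, next_in_range(10, 20) returns a value from 10 to 19.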

+2



I'm pretty sure there is an x for which the xorshift operation returns 0.

Proof:

First, we have the following equations:

a = x ^ (x >> 12);
b = a ^ (a << 25);
c = b ^ (b >> 27);

      

Substituting them:

b = (x ^ x >> 12) ^ ((x ^ x >> 12) << 25);

c = b ^ (b >> 27) = ((x ^ x >> 12) ^ ((x ^ x >> 12) << 25)) ^ (((x ^ x >> 12) ^ ((x ^ x >> 12) << 25)) >> 27);

      

As you can see, although c is a complicated expression, it is completely linear: it is built only from XORs and shifts.

This means that you can express each bit of c as a purely logical (XOR) expression of the bits of x.

Thus, you can just build a system of equations for the bits c0, c1, c2, ... like so:

(Note: the coefficients are just examples, I have not calculated them, but this is how it would look):

c0 = x1 ^ !x32 ^ x47 ...
c1 = x23 ^ x45 ^ !x61 ...
...
c63 = !x13 ^ ...

      



At this point you have 64 equations in 64 unknowns. You can solve the system with Gaussian elimination, and you always get exactly one solution.

The exception would be a coefficient matrix whose determinant is zero. For xorshift this does not actually happen, because each individual step (x ^= x >> k or x ^= x << k) can be undone, so the whole update can be undone too.

Even if it did happen, it would mean that you lose information at each iteration, i.e. you could not reach all 2^64 possible values of x, only some of them.

Now consider the case where the coefficient matrix is nonsingular. Then the 2^64 possible values of x map to 2^64 possible values of c, and they are all different.

This way you can get zero.

Addendum: you actually get zero only from zero ... sorry, the proof is more useful for showing that this is not as simple as it seems at first. The important part is that you can express the bits of c as a Boolean function of the bits of x.
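As a small sketch of why the map from x to c is one-to-one (and therefore why only x = 0 gives c = 0): each xorshift step can be undone by repeating it with doubled shift amounts until the shift reaches 64 bits. The function names below are invented for this illustration.

```c
#include <stdint.h>

/* Forward update: the three steps a, b, c from the question. */
static uint64_t step(uint64_t x) {
    x ^= x >> 12;
    x ^= x << 25;
    x ^= x >> 27;
    return x;
}

/* Undo y = x ^ (x >> k) for 0 < k < 64.
   Applying the same step with shifts k, 2k, 4k, ... cancels the
   extra terms; shifts of 64 or more contribute nothing. */
static uint64_t undo_right(uint64_t y, unsigned k) {
    for (unsigned s = k; s < 64; s *= 2) {
        y ^= y >> s;
    }
    return y;
}

/* Undo y = x ^ (x << k) for 0 < k < 64, same idea with left shifts. */
static uint64_t undo_left(uint64_t y, unsigned k) {
    for (unsigned s = k; s < 64; s *= 2) {
        y ^= y << s;
    }
    return y;
}

/* Inverse of step(): undo the three steps in reverse order. */
static uint64_t unstep(uint64_t c) {
    c = undo_right(c, 27);
    c = undo_left(c, 25);
    c = undo_right(c, 12);
    return c;
}
```

Since unstep(step(x)) == x for every x, no two inputs share an output, and because step(0) == 0, zero is the only input that produces zero.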


There is another problem with the random number generator: even if you modify the function in some way so that it no longer has this problem (for example, by adding 1 on each iteration):

  • You still can't guarantee that it won't end up in a short cycle for some value of x. What if there is a cycle of length 5 through the value 345234523452345? Can you rule that out for every possible initial value? I can't.

  • In fact, with a truly pseudo-random iteration function, your system will most likely start repeating after about 2^32 iterations. There is an almost trivial combinatorial reason for this, but "unfortunately this margin is too small to contain it" ;-)
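The 2^32 figure can be tried out at a smaller scale. The sketch below iterates a made-up, random-looking update on a 16-bit state and counts the steps until a state repeats. The mixing function toy_update is purely a stand-in chosen for this example (it is not xorshift and not taken from any library); for a random function on N states the count is expected to be on the order of sqrt(N).

```c
#include <stdint.h>
#include <string.h>

#define STATE_BITS 16
#define STATE_COUNT (1u << STATE_BITS)

/* Hypothetical "random-looking" update on a 16-bit state:
   a small integer mixer truncated to 16 bits. It is generally
   not one-to-one, unlike the xorshift update. */
static uint32_t toy_update(uint32_t s, uint32_t key) {
    uint32_t h = s ^ key;
    h *= 0x9E3779B1u;
    h ^= h >> 15;
    h *= 0x85EBCA6Bu;
    h ^= h >> 13;
    return h & (STATE_COUNT - 1);
}

/* Number of iterations, starting from `seed`, until some state is
   reached for the second time. At most STATE_COUNT by pigeonhole. */
static unsigned steps_until_repeat(uint32_t seed, uint32_t key) {
    static unsigned char seen[STATE_COUNT];
    memset(seen, 0, sizeof seen);
    uint32_t s = seed & (STATE_COUNT - 1);
    unsigned steps = 0;
    while (!seen[s]) {
        seen[s] = 1;
        s = toy_update(s, key);
        steps++;
    }
    return steps;
}
```

Calling steps_until_repeat with different seed and key values gives a feel for how far the result usually stays below the 65536 possible states; scaling the same square-root behaviour up to a 64-bit state gives the roughly 2^32 mentioned above.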

So:

  • If a cycle length of 2^32 is OK for your PRNG, then use a tried and tested iteration function published somewhere on the net.
  • If it is not, increase the state size to at least 128 bits (2^128 possible states). This will give a cycle length of roughly 2^64, which is not that bad.
  • If you still want 64-bit output, use a 128-bit internal state, but return (x>>64) ^ (x&(2^64-1)) (i.e. the XOR of the top and bottom halves of the internal state x).
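In C, where a 128-bit integer type is not portable, the folding step from the last point could be sketched with the state split into two 64-bit halves. The struct and function names are assumptions made for this example, and the 128-bit update function itself is left out because the answer does not specify one.

```c
#include <stdint.h>

/* Hypothetical 128-bit internal state, stored as two 64-bit halves. */
struct state128 {
    uint64_t hi;  /* corresponds to x >> 64 */
    uint64_t lo;  /* corresponds to x & (2^64 - 1) */
};

/* Fold the state to a 64-bit output: (x >> 64) ^ (x & (2^64 - 1)). */
static uint64_t fold128(struct state128 s) {
    return s.hi ^ s.lo;
}
```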
-1

