Non-repeating stream of pseudo-random numbers with 'clumping'

Question

Non-repeating stream of pseudo-random numbers with 'clumping'

I am looking for a method to generate a pseudo-random stream with a somewhat odd property - I need clusters of adjacent numbers.

The tricky part is that I can maintain a limited number of states no matter how large the range is. There are algorithms that give a sequence of results with minimal state (linear congruence?)

Clumping means there is a higher likelihood that the next number will be close rather than far.

An example of a desired sequence (mod 10): 1 3 9 8 2 7 5 6 4
I suspect this will be more obvious with a lot of flow, but hard to get there manually.

Update:
I don't understand why this is not possible, but yes, I am looking for how Welbog summed it up:

Not repetitive
No tracking
"Clumped"

+2

language-agnostic math algorithm random

OldCodeOrder 15 Sep '09 at 16:05

source to share

10 replies

Pete kirkham · Answer 1 · 2009-09-15T16:44:52+0000

Cascade multiple LFSRs with periods less than you need, concatenating them to produce a result such as the fastest changing register drives the least significant values. So, if you have L1 with period 3, L2 with period 15, and L3 with some longer period, N = L1 (n) + 3 * L2 (n / 3) + 45 * L3 (n / 45). This will obviously generate 3 thickening values and then jump over and total 3 more thickening values. Use something other than multiplication (for example, mixing some bits of higher period registers) or different periods so that the spread of the clusters is wider than the period of the first register. It won't be particularly smoothly random, but it will be lumpy and non-repetitive.

Gregg lind · Answer 2 · 2009-10-07T00:12:54+0000

~~For the record, I am engaging in "non-repetitive, non-random, non-lethal combination tracking,"~~ and I hope some simple, though experiments will shed some light. This is not formal proof. Perhaps someone will save him.

So, I can generate a sequence that has some randomness:

Given x_i, x_ (i + 1) ~ U (x_i, r), where r> x_i.

For example:

if x_i = 6, x_ (i + 1) is a random selection from (6 + epsilon, some_other_real> 6). This guarantees immutability, but at the cost that the distribution increases monotonically.

Without some condition (such as monotony) inherent in the sequence of generated numbers, how else can you guarantee stateless uniqueness?

Edit : So, after investigating RBarryYoung's claim to " Linear Congruential Generators " (nondifferentials ... that's what RBY meant) and obviously I was wrong! These sequences exist and by necessity, any PRNG, whose next number depends only on the current number and some global unchanging state, cannot have repetitions during the cycle (after some initial period of its recording).

mjv · Answer 3 · 2009-09-15T16:11:06+0000

By defining "compilation functions" in terms of the probability distribution of its size and the probability distribution of its range, you can then use simple basic random generators and create sequences.

mob · Answer 4 · 2009-09-15T16:13:42+0000

Perhaps you can create a random sequence and then do some strategic element replacement to get the property you want.

For example, if you find 3 values a, b, c in a sequence, such that a> b and a> c, then with some probability you can swap elements a and b or elements a and c.

EDIT in response to comment:

Yes, you can have a buffer in the stream of whatever size you like. Swap rules can be deterministic or based on another known reproducible pseudo-random sequence.

Mark Rushakoff · Answer 5 · 2009-09-15T16:14:06+0000

One way to get lumpy numbers is to use the normal distribution.

You run a random list with your "initial" random value, then generate a random number with the mean of the previous random value and constant variance, and repeat as needed. The overall variance of the entire list of random numbers should be approximately constant, but the "running average" of your numbers will drift randomly without any particular bias.

>>> r = [1]
>>> for x in range(20):
    r.append(random.normalvariate(r[-1], 1))
>>> r
[1, 0.84583267252801408, 0.18585962715584259, 0.063850022580489857, 1.2892164299497422, 
0.019381814281494991, 0.16043424295472472, 0.78446377124854461, 0.064401889591144235, 
0.91845494342245126, 0.20196939102054179, -1.6521524237203531, -1.5373703928440983, 
-2.1442902977248215, 0.27655425357702956, 0.44417440706703393, 1.3128647361934616, 
2.7402744740729705, 5.1420432435119352, 5.9326297626477125, 5.1547981880261782]

I know it's hard to tell by looking at the numbers, but you can see that the numbers are piling up a bit - 5X at the end and 0.X on the second line.

If you only want integers, you can just use a very large mean and variance, and truncate / divide to get an integer output. Regular distributions are by definition a continuous distribution, which means that all real numbers are potential outputs - this is not limited to integers.

Here's a quick scatter plot in Excel of 200 numbers (starting at 0, constant variance 1):

scatter data http://img178.imageshack.us/img178/8677/48855312.png

Ah, I just read that you want non-repeating numbers. Don't guarantee this under normal distribution, so you may have to consider some of the other approaches others have talked about.

John hyland · Answer 6 · 2009-09-15T16:16:20+0000

I am not aware of an existing algorithm that would do this, but it seems difficult to roll your own (depending on how stringent the "limited state" requirement is). For example:

RANGE = (1..1000)
CLUMP_ODDS = .5
CLUMP_DIST = 10

last = rand(RANGE)
while still_want_numbers
  if rand(CLUMP_ODDS)   # clump!
    next = last + rand(CLUMP_DIST) - (CLUMP_DIST / 2)  # do some boundary checking here
  else   # don't clump!
    next = rand(RANGE)
  end
  print next
  last = next
end

It's a little rudimentary, but would something like this satisfy your needs?

Charles bretana · Answer 7 · 2009-09-15T16:17:11+0000

How about (psuedo code)

// clumpiness static in that value retained between calls
static float clumpiness = 0.0f; // from 0 to 1.0        
method getNextvalue(int lastValue)
   float r = rand();  // float from 0 to 1

   int change = MAXCHANGE * (r - 0.5) * (1 - clumpiness); 

   clumpiness += 0.1 * rand() ;
   if (clumpiness >= 1.0) clumpiness -= 1.0;
   // -----------------------------------------
   return Round(lastValue + change);

Daniel Brückner · Answer 8 · 2009-09-15T16:29:38+0000

In the range [0, 10], the following should give an even distribution. random()

gives a (pseudo) random number r

c 0 <= r < 1

.

x(n + 1) = (x(n) + 5 * (2 * random() - 1)) mod 10

You can get the behavior you want by delinearizing random()

- for example, it random()^k

will skew towards small numbers for k > 1

. A possible function might be as follows, but you will have to try some exponents to find the distribution you want. And keep the exponent odd if you use the following function ...;)

x(n + 1) = (x(n) + 5 * (2 * random() - 1)^3) mod 10

jp · Answer 9 · 2009-09-16T19:46:00+0000

Is there a sequence like 0, 94, 5, 1, 3, 4, 14, 8, 10, 9, 11, 6, 12, 7, 16, 15, 17, 19, 22, 21, 20, 13, 18, 25, 24, 26, 29, 28, 31, 23, 36, 27, 42, 41, 30, 33, 34, 37, 35, 32, 39, 47, 44, 46, 40, 38, 50, 43, 45, 48, 52, 49, 55, 54, 57, 56, 64, 51, 60, 53, 59, 62, 61, 69, 68, 63, 58, 65, 71, 70, 66, 73, 67, 72, 79, 74, 81, 77, 76, 75, 78, 83, 82, 85, 80, 87, 84, 90, 89, 86, 96, 93, 98, 88, 92, 99, 95, 97, 2, 91 (mod. 100) look good?

This is the result of a small ruby program (explained below):

#!/usr/bin/env ruby

require 'digest/md5'

$seed = 'Kind of a password'
$n = 100 # size of sequence
$k = 10  # mixing factor (higher means less clumping)
def pseudo_random_bit(k, n)
  Digest::MD5.hexdigest($seed + "#{k}|#{n}")[-1] & 1
end

def sequence(x)
  h = $n/2
  $k.times do |k|
    # maybe exchange 1st with 2nd, 3rd with 4th, etc
    x ^= pseudo_random_bit(k, x >> 1) if x < 2*h
    # maybe exchange 1st with last
    if [0, $n-1].include? x
      x ^= ($n-1)*pseudo_random_bit(k, 2*h)
    end
    # move 1st to end
    x = (x - 1) % $n
    # maybe exchange 1st with 2nd, 3rd with 4th, etc
    # (corresponds to 2nd with 3rd, 4th with 5th, etc)
    x ^= pseudo_random_bit(k, h+(x >> 1)) if x < 2*(($n-1)/2)
    # move 1st to front
    x = (x + 1) % $n
  end
  x
end

puts (0..99).map {|x| sequence(x)}.join(', ')

The idea is to start with the sequence 0..n-1 and break the order by going k times over the sequence (more passes means less condensation). Each pass first considers pairs of numbers at positions 0 and 1, 2 and 3, 4 and 5, etc. (Common: 2i and 2i + 1) and flips a coin for each pair. Heads (= 1) means swapping numbers in a pair, tails (= 0) means they do not swap them. Then the same is done for pairs in positions 1 and 2, 3 and 4, etc. (General: 2i + 1 and 2i + 2). Since you mentioned that your sequence is mod 10, I have additionally swapped positions 0 and n-1 if the coin for that pair dictates so.

One number x can be displayed modulo n after k goes to any number in the segment [xk, x + k] and is approximately binomially distributed around x. Pairs (x, x + 1) of numbers do not change independently.

As a pseudo-random generator, I only used the last of the 128 output bits of the MD5 hash function, pick whichever function you want. Thanks to clumping, one will not get a "safe" (= unpredictable) random sequence.

andora · Answer 10 · 2010-07-30T09:58:12+0000

Perhaps you can combine 2 or more LCGs in a similar way as described for the LSFRs described here. Binding the least significant LCG to its seed, for a full cycle, increases the next LCG. You only need to store the seed for each LCG. Then you can weigh each part and add the parts together. To avoid repetition in the "compensated" part of LstSig, you can randomly move the LCG every complete cycle.

Non-repeating stream of pseudo-random numbers with 'clumping'

More articles: