Post Reply 
Exact Discrete pseudo-Random Sampling
04-08-2020, 03:33 AM
Post: #1
Exact Discrete pseudo-Random Sampling
There was an article (Daniel Lemire perhaps) recently on the inexactness of discrete sampling with pseudorandom number generators. (I'll use sample a single die toss.) He pointed out that a PRNG is usually based on a power of 2 but 6 is not such a power. THis leads to a sampling error in the the numbers from 1-6 are not sampled equally. For example, using an 8-bit generator (too short for real use but good for pedagogy) with an Linear Congruential Generator, X(n+1)=181*X(n)+181 mod 256, one has 42 samples of 1-6 and 4 more numbers leading to an error of 4/256. One suggested correction is to throw away numbers that are too long. Another (and probably good enough in practice) is to use a really large M (X=AX+b mod M is the recursion) like 64 or 128 or 256. There's still a bias but it's too small to matter.

I developed another method for my HP-50 dice game. Analogously to the Sophie-Germain-Safe Prime stuff, I use a prime modulus M=6*P+1, where P is another prime. The 6 is the size of the sample space. It could be replaced by whatever is useful in another problem. The point is that it's not hard (but it is expensive) to generate large M. One can use M=79=6*78+1 to see how the method works.

Next one has to find a multiplier A which is a primitive root of M. It's fairly easy on the HP50 or with a big number math package. This ensures the the LCG X=A*X mod M has cycle M-1 (=6*P). The output (as an integer) goes through P sets of each of 1 - 6.

To ensure small correlation, A/M as a continued fraction (a1,a2,a3...) should have each partial quotient Ai and the sum of partial quotients small. This takes a search. I did a short search: M=824633721547 and A=319872179753.

At this point I found that the HP50 has bugs with integers bigger than 2^40 (or something near there.) SORT does not work on large integers. FLOOR can fail; trying 'P/Q'
M
*
FLOOR
fails for large M.
However,
P/Q FXND SWAP
M SWAP
* / FLOOR
works.

If there's interest, I'll post my primitive root and quadratic residue and continued fraction codes later.
Find all posts by this user
Quote this message in a reply
04-08-2020, 05:42 AM
Post: #2
RE: Exact Discrete pseudo-Random Sampling
Don't reinvent the wheel. Look up Lemire's delightful: Fast Random Integer Generation in an Interval.

I very much doubt that there is a more efficient method.

Pauli
Find all posts by this user
Quote this message in a reply
04-08-2020, 12:34 PM
Post: #3
RE: Exact Discrete pseudo-Random Sampling
(04-08-2020 03:33 AM)ttw Wrote:  To ensure small correlation, A/M as a continued fraction (a1,a2,a3...) should have each partial quotient Ai and the sum of partial quotients small. This takes a search. I did a short search: M=824633721547 and A=319872179753.

What does that mean by "small" ?

BTW, your searched A is *not* a primitive root modulo M

>>> A=319872179753
>>> M=824633721547
>>> P = M//3                    # order(A,M) = (M-1)/3 = 274877907182
>>> pow(A,P,M)
1
Find all posts by this user
Quote this message in a reply
04-08-2020, 12:53 PM (This post was last modified: 04-08-2020 12:54 PM by ttw.)
Post: #4
RE: Exact Discrete pseudo-Random Sampling
Thanks. Perhaps my Proot stuff needs checking. (Or maybe my copying from the calculator.)

The correlation in a LCG is bounded by the sum of the partial quotients of A/M. (There's another bound, not as tight, based on the largest quotient.) So going for a small quotients is part of the process.
Find all posts by this user
Quote this message in a reply
04-08-2020, 08:56 PM (This post was last modified: 04-08-2020 09:22 PM by Albert Chan.)
Post: #5
RE: Exact Discrete pseudo-Random Sampling
For random number with small range (say, dice games), bit-mask is simple, and fast.
see O'Neill's benchmark, Efficiently Generating a Number in a Range

lua> mt19937 = require 'mt19937' -- Mersenne Twister PRNG
lua> rand = mt19937.new(os.clock())
lua> dice = ('%010o'):format(rand(0, 2^30-1))
lua> dice                                      -- upto 10 valid dice throws !
3746210703
lua> dice = dice:gsub('[07]', '')      -- remove invalid throws
lua> dice
346213
Find all posts by this user
Quote this message in a reply
04-08-2020, 11:57 PM
Post: #6
RE: Exact Discrete pseudo-Random Sampling
(04-08-2020 05:42 AM)Paul Dale Wrote:  Don't reinvent the wheel. Look up Lemire's delightful: Fast Random Integer Generation in an Interval.

Amazingly, Lemire's sample rejection trick work in any base. Big Grin

Example, say random r = [0, 10), but we wanted random range [0, 7), without bias.
Code:
r      0  1  2  3  4  5  6  7  8  9   m=10, s=7
sr     0  7 14 21 28 35 42 49 56 63
sr//m  0  0  1  2  2  3  4  4  5  6
sr %m  0  7  4  1  8  5  2  9  6  3
reject x        x        x            // reject cases = m%s = 10%7 = 3
sr//m     0  1     2  3     4  5  6   // remain cases to return

Note: we could also reject high cases (sr%m = 7,8,9), and still remove the bias.

Now, try the same, but with random r = [0, 12)
Code:
r      0  1  2  3  4  5  6  7  8  9 10 11   m=12, s=7
sr     0  7 14 21 28 35 42 49 56 63 70 77
sr//m  0  0  1  1  2  2  3  4  4  5  5  6
sr %m  0  7  2  9  4 11  6  1  8  3 10  5
reject x     x     x        x     x         // reject cases = m%s = 12%7 = 5
sr//m     0     1     2  3     4     5  6   // remain cases to return

Again, reject high cases (sr%m = 7,8,9,10,11) also remove the bias.
Find all posts by this user
Quote this message in a reply
04-09-2020, 12:54 AM
Post: #7
RE: Exact Discrete pseudo-Random Sampling
(04-08-2020 05:42 AM)Paul Dale Wrote:  Don't reinvent the wheel. Look up Lemire's delightful: Fast Random Integer Generation in an Interval.

I very much doubt that there is a more efficient method.

Pauli

How well would algorithm 5 (using full-width multiply, bit shifts and masking) work on a HP 50g though? I can see that algorithm 4 (the Java algorithm) would be useful.

— Ian Abbott
Find all posts by this user
Quote this message in a reply
04-09-2020, 04:53 AM
Post: #8
RE: Exact Discrete pseudo-Random Sampling
(04-09-2020 12:54 AM)ijabbott Wrote:  How well would algorithm 5 (using full-width multiply, bit shifts and masking) work on a HP 50g though? I can see that algorithm 4 (the Java algorithm) would be useful.

In machine code, very well. Its 32 bit ARM processor could handle this fairly efficiently.


Pauli
Find all posts by this user
Quote this message in a reply
Post Reply 




User(s) browsing this thread: 1 Guest(s)