Random Thoughts

Random Thoughts

I've worked for a long time with Monte Carlo based simulations, and other simulation types that require a lot of random numbers. In these situations there is always the discussion of which pseudo-random number generation algorithm is the best. These are pseudo-random number generators because they rely on a deterministic algorithm that will provide the same sequence of numbers for a given starting state. Random numbers/characters are useful for many things -

  • Monte Carlo simulations. Is your result real or due to the idiosyncrasy of your random number generator? Like the infamous IBM RANDU, and see this review for analysis of some popular algorithms.
  • Generating strong passwords. Since many people use password saving systems (e.g. passwords.google.com) passwords don't need to be terribly memorable and can be things like L"\wG)H%i+3?u?i77 which is very hard to guess.
  • Selecting contest winners
  • Strong cryptography

The popular web site random.org is famous for it's 'true random numbers' - however they don't tell you exactly how it works. You can find the code for the method described here at github if you read to the end.

My thought is - why only use one algorithm, when you can use "all of them"? I don't need to make yet another random number generation algorithm when I can combine all of the known high-quality ones.

(credit: Allie Brosh)

Every random number generator has strengths and weaknesses. See this review for a summary. I created "MultiRandom" to address the weakness of a single random number generator by using a set of random number algorithms, and randomly selecting which algorithm will generate the next number. This way any single weakness will be diluted by the use of multiple methods. In addition the "entropy pool" of state that is used the generate the numbers can be any arbitrary size, thus increasing the possible number of sequences and thus the period.

This list of generators at the right is an example from the current implementation. It shows the currently implemented algorithms and features multiple instances of each (each instance maintains its own state). The MersenneTwister is included, as is use of AES encryption, and SHA-512 message digests to create random numbers. In addition some 'classic' 64 bit methods such as linear congruential generators (Random64) , XORShift, and Multiply with carry are implemented. The concept allows other methods to be easily implemented as well. They need only to extend the Java Random class. So it is easy to add other high-quality algorithms. At the minimum you just implement the method "protected int next(int bits)"

For some example output, random numbers in the RAND book format, and some strings for passwords are shown at the right.


These are generated with a deterministic algorithm, even though the starting seed can be very long.

So to improve the randomness, I created a system that gathers random bytes from the internet in the form of weather reports - weather maps and barometric reports that change in relatively unpredictable ways. The system periodically reads these sources, applies a SHA-512 hash and uses the resulting bytes to add to the random seeds of the algorithm so that it can approach a truly random generation. That is, even with the same starting seed it will produce unique number sequences as the weather changes across the United states. One could even localize the system to weight local weather over global weather to insure that your random sequence is different than that in other locations. It does have the disadvantage that you have to have an internet connection to gather external entropy - but there has to be some external source beyond the algorithm to get true randomness.

The list at the left shows the sources used to update the bytes. They also include Google news so that fast-breaking news will reseed the randomness. I will include more global maps when I can identify sources to read the information directly without javascript between the url and fetching the image/data. I have experimented with traffic webcams and other sources, but find them less reliably updated than government weather maps.

So far the algorithm seems to pass the dieharder random number tests developed by George Marsiglia.

You can find the code (GPL3) at this github repository.

Bill ∩ Ross

Owner at Phobrain.com

6 年

How about picking a limited random number of these, then performing limited random operations on them?

回复

要查看或添加评论,请登录

Matthew Clark的更多文章

社区洞察

其他会员也浏览了