Anyone who considers algorithmic methods for creating random numbers is, of course, in a state of sin. For, as has been pointed out several times, there is no such thing as a random number– there are only methods to produce random numbers, and a strict arithmetic procedure of course is not such a method. – JohnVonNeumann
A pseudo-random string is one that has been produced by some deterministic means (as by an algorithm) but possesses many of the same statistical properties as a random string.
There are essentially two categories of algorithms used to produce pseudo-random strings of bits.
Heuristic pseudo random number generators (PRNG’s):
These are algorithms designed to produce pseudo random strings of bits. The output should be, but often isn’t, indistinguishable from random output. Much about the design of such generators can be found in TheArtOfComputerProgramming (DonKnuth is known for creating several). Examples include:
- MersenneTwister
- LinearCongruentialPseudoRandomNumberGenerators (one of the most popular kind, unfortunately)
- MultiplyWithCarry generators
- LinearFeedbackShiftRegister
Cryptographically secure (or strong) pseudo random number generators (CSPRNG’s):**
These are generators designed with the same goal as above, but with the additional property that given some arbitrarily long (but obviously less than the period) sequence of output from the generator, it should be computationally infeasible to determine the next bit with greater certainty than 1/2. Information about those can be found in the book FoundationsOfCryptography (vol.1), by OdedGoldreich. Examples include:
- IndirectionShiftAccumulateAddCount (ISAAC)
- BlumBlumShub
-
DevUrandom
- Linux supports two pseudo-devices to produce pseudorandom numbers, /dev/random and /dev/urandom.
- The /dev/random device produces results based on as much “true” randomness (from outside world events) as is requested, and pauses until it has collected enough such, which can potentially require unbounded time (in practice: it can be surprisingly slow sometimes).
- The /dev/urandom device will always return “immediately”; if the kernel has no “truly” random data collected at the moment, or a smaller amount than requested, it will do its best to combine whatever amount of “true” randomness it does happen to have together with deterministic pseudorandom number generators in a way intended to maximize the “true” randomness of the delivered output. The result will, on average, be significantly less cryptographically secure than use of /dev/random, but is sometimes more useful nonetheless since it returns a result in a bounded (fast) amount of time.
- The scare-quotes above in “true” randomness are used advisedly to avoid inline nitpicks which have already been addressed elsewhere.
- any good BlockCipher, StreamCipher, or HashFunction in counter mode
PseudoRandomNumberGenerators are not random, but (the good ones, at least) are random enough for most practical use. They use a deterministic function that outputs numbers based on the previous state of the generator. They usually require a seed, some number to initialize (or become) the internal state of the generator. If the same seed is used, the same output should be generated. A common (but flawed) method of seeding PseudoRandomNumberGenerators is to use the system time. This method is normally fine for simple games and simulations, but destroys any security gained by using a CSPRNG, because the seed can be more easily figured out. The best seeding methods use a TrueRandom and secure process.
It is utterly meaningless to speak of the “goodness” of a PRNG outside of the context of a particular application. What constitutes a “good” PRNG for strong encryption may be totally different from what constitutes a “good” PRNG for a Monte Carlo simulation in physics.
…because there are an unbounded number of different possible ways of judging “goodness”, yet it is formally impossible to satisfy all possible such measures.
Which is part of the “essence of the subject” previously mentioned to be inadequately addressed on this page.
Knuth provides several StatisticalTestsForRandomness in Vol. 2, and tries to formally define ‘good.’ There are definitely properties good prng’s possess. The necessity of various properties in various applications doesn’t negate their existence. Examples: Statistically random sequence (which is something all prng’s aim for and is, in many ways, measurable) Fast (speed is something an encryption algorithm might sacrifice more than, say, the next property. But it’s no less useful) Cryptographically secure (as defined way up above, and it isn’t always necessary)
Anyway, in these (and, I’m sure, other) dimensions the ‘goodness’ of a prng can be discussed. In fact, one might say it can be discussed in the context of, say, speed, outside of various applications.
One measure of the “goodness” of a PseudoRandomNumberGenerator is how many numbers in its sequence need to be observed before one can guess the next number.
Linear feedback shift registers. With a N-bit LFSR, once you know N consecutive bits, every following bit can be completely determined.
With a linear congruential generators, once you know 5 or so consecutive (32-bit ?) output numbers, you can usually guess the next value. There are algorithms to determine the internal state of LCPRNG’s from only three output values.
With the MersenneTwister, you need to know 623 consecutive 32-bit output numbers before you can guess the next value.
For CSPRNGs such as BlumBlumShub, it’s practically impossible to guess the next number.
- More carefully phrased, with current knowledge, there is no practicable way to guess future values with accuracy better than chance. It has not been proven that it is literally impossible to do so. This can be regarded as a theoretical nitpick, since most people care about current pragma, not about theoretical breakthroughs that might happen next year that turn everything on its ear. BlumBlumShub is widely regarded as a very strong source given current state of the art. (It is not casually used for random number generators in general simply because each iteration is significantly slower than the more common weaker PRNGs.)
To produce evenly distributed integers in a range (assuming your PRNG is also evenly distributed) one must always throw away bits (unless the range is [0, 2^N), because then you can just construct an N-bit value). One of the best methods I’ve seen is in JavaLanguage’s Random class:
If the internal state of a PRNG is N bits, then it will cycle and repeat the same sequence over and over again, with a cycle length of at most 2^N steps, generating at most 2^N unique numbers (often much less). If you use a PRNG that has 16 bits of internal state, but you are trying to generate 32 bit numbers (or worse, floating point numbers) (perhaps by simply concatenating consecutive 16 bit values), there are certain 32 bit numbers that will never occur. But why would you use a 16 bit PRNG, when you could use a 32 bit PRNG – or better yet, the MersenneTwister with 19937 bits of internal state?
the rest of the page still needs work
This comment appears to be meaningless. PRNGs are nothing but a source of (pseudo-) random bits. Secondly, what else would one do with data, random or not, but to “aggregate into the size and format that you need”? Clarify or delete.
Did my edits help make things less meaningless ? – DavidCary
No. I’m sorry, but no, definitely not. I appreciate that you gave it the good old college try, as they say, but no, it’s still, I’m sorry to say, slipshod, no clear improvement. For instance, “Practically all practical PRNGs generate every number in their range with roughly equal probability”. That is false as stated, and misses the essence of the subject, even though it’s kind of sort of almost kind of pragmatically almost true, except not.
Worse yet, “However, it is typically very easy to guess the next number in a sequence if you know some of the previous numbers”, this is only true of “bad” PRNGs. It is definitely false for PRNGs such as BlumBlumShub. corrected
It is true that older popular PRNGs tended to have that fault; the MersenneTwister was created to foil the extremely poor 6-dimensional behavior of most linear congruential PRNGs, which was actually visually obvious when plotted in nothing more complicated than a difference equation phase space. But that’s (hopefully) ancient history.
The previous guy seemed to take my advice to study the masters like Knuth, and then rewrite, rather than shooting from the hip, and although admittedly I didn’t study his changes for hours, he did seem to have done precisely that with his rewrite. His second draft looked very much as if he had reviewed primary sources. You’re still shooting from the hip. I don’t mean to be mean, I’m just saying that it’s good to tighten things up by reviewing masterful sources, such as Knuth, rather than just doing a second edit from one’s own thoughts.
Again, I’m not trying to give you a hard time, it’s just that this is a deep subject, and “obvious” statements are frequently misleading, partially true, or outright false. I’d be happy to clarify here and there, but of course I can’t serve as a primary source. – Doug
That … misses the essence of the subject
Yes, I’m no expert on cryptography.
So, what is the essence of the subject?
– DavidCary