
Comments and Discussions




Hi,
Very nice and useful article.
The RNG in your code:
private static uint GetUint()
{
    m_z = 36969 * (m_z & 65535) + (m_z >> 16);
    m_w = 18000 * (m_w & 65535) + (m_w >> 16);
    return (m_z << 16) + m_w;
}
is not quite the same as the RNG that Marsaglia proposed:
#define znew ((z=36969*(z&65535)+(z>>16))<<16)
#define wnew ((w=18000*(w&65535)+(w>>16))&65535)
#define MWC (znew+wnew)
because Marsaglia's code includes &65535 at the end of the definition of wnew.
So your GetUint() code mixes 16 bits of z and w into the upper 16 bits of the return value, whereas Marsaglia's code strictly appends two 16-bit values without overlapping them.
This may have a positive, negative, or no effect on the quality of the numbers for a given purpose, but in any case we should be clear that it is not the same code Marsaglia was evaluating; however, you did do some independent evaluation of your own code.
One point about both pieces of code is that the lower 16 bits returned depend only on w.
So if we use either piece of code, we have to be careful not to assume that w can be predictable (for example, if an attacker can predict w but not z, we would be wrong to assume that the algorithm would still "protect" us in some way by covering up the predictability of w: the lower 16 bits returned from each call would be completely predictable).
For the same reason, a piece of code that wanted 8 random bits should probably choose them from the top 16 bits returned.
I wonder if it might be "better" in some way to return
(w ^ ( ((z&65535)<<16) | (z>>16) ) ) [assumes z is unsigned so no sign extension]
so that no matter which subset of bits you chose, they would include contributions from both z and w?





You are correct. Thank you for reading so carefully. I've submitted a revised version of the article and code to match Marsaglia.
Update: The current version of the article now reflects the correction.
modified on Monday, February 22, 2010 1:53 PM





return (m_z << 16) + (m_w & 65535);
The above line is not portable if m_z may be greater than 65535. I would suggest ((m_z & 65535) << 16) + (m_w & 65535).
I also find myself somewhat dubious about the fact that the upper 16 bits and lower 16 bits of the result are independent. It would seem many applications would be likely to use only the upper or lower 16 bits. I'm not sure what form of blending would be best, but I would think some form of mixing would be helpful.
Incidentally, I wonder whether simple random number generators might be improved by simply xor'ing the output with a counter indicating how many times the generator has been invoked (perhaps one that counts by some large number, mod 2^31). Such a counter by itself would be a crummy generator, but if it's orthogonal to whatever other generator is being used, it shouldn't make things any worse and might make them better (if the other generator's period isn't a power of two, such a counter would extend the period of the generator quite nicely).






See also this newsgroup post from around the same time (Random numbers for C: Improvements. Posted: Jan 15, 1999 11:41 AM), in which Marsaglia explains the difference between the two algorithms:
> In my earlier work, done in Fortran, I had implemented
> two 16-bit multiply-with-carry generators, say z and w,
> as 32-bit integers, with the carry in the top 16 bits,
> the output in the bottom 16. They were combined by
> (z<<16)+w. (In Fortran, ishft(z,16)+w.) Such a
> combination seemed to pass all tests. In the above
> mentioned post, I used (z<<16)+(w&65535), and that
> does not pass all tests. So (z<<16)+w seems
> preferable; it is used below, providing a MWC that
> seems to pass all tests.






Thanks. I have submitted a revised version of the article and code incorporating the correction you suggest.





Hi John:
I like your article; it's well written and very practical.
Good Job! 5*
Thanks and regards,
Alex





Good article. I will use this the next time I need one.





I like small, simple code very much! Especially for such an artistic purpose as this. And it's even better that the author gives a test suite.
However, there are several kinds of uses for pseudo-random numbers, for instance Monte Carlo simulation, noise generation, and cryptography.
Generally I would also expect an autocorrelation test. You take a series of generated values and autocorrelate them. You should get a peak at offset zero and a more or less constant low-level result at all other offsets. In that case you can use the generator for simulation.
To produce white noise you also have to show that the spectrum meets your requirements.
If you look at LFSR generators, they pass the autocorrelation test, produce white noise, and are great PRNGs frequently used for simulation and audio or even spread-spectrum noise generation. However, in crypto applications they are easily crackable! They were still in use by the Russians in the eighties and were read using Sinclair ZX81 computers running a 4 MHz 8-bit CPU. Maybe they shouldn't have been used...
Anyway, I like the article and will dig into it more deeply. But be warned that quality tests for PRNGs should be application-specific.
Learndy

Airspace V  international hangar flying!
http://www.airspacev.com/ggadgets for tools & toys





Thanks for your note. I agree that generators need to be tested for autocorrelation etc. The test I provided is more of a demo than a thorough test. I don't want to imply that the test included with the project is enough. I'm leaning on George Marsaglia having tested his algorithm with his DIEHARD suite of tests.
The test I included would probably catch an error in my implementation of Marsaglia's algorithm, but not a flaw in his algorithm itself.





Hi,
you wrote that the interval 0 <= u <= 2^32 gets transformed to (0, 1). The transformation is done by multiplying by 1 / (2^32 + 1).
But since u can be 0, the output interval is actually [0, 1): 0 is included while 1 is excluded.
Kind regards
Gü





Good catch! You are correct. The code should have computed (u+1)/(2 + 2^32) rather than u/(1 + 2^32).





Thus the code line returning the result in the GetUniform method has to be changed to
return (u + 1) * 2.328306435454494e-10;
Note: the number differs slightly from the "old" number because (2^32 + 1) is just a little bit smaller than (2^32 + 2) -> the 2^32 part dominates.





I've revised the code and the article. The new version was posted 6 Oct 2008.
modified on Monday, October 6, 2008 3:23 PM





Hello Mr. Cook.
I like your article !
Do you have any knowledge of designing RNG algorithms?
I mean all the math theory, etc.
If you do, I would like to see some of the theory, in this or another article.
Perhaps this site isn't the best place to write math articles, but I think some programmers interested in math may find this useful.
Thanks.
kostas KEL





For an introduction to the mathematical theory, you might want to start with volume 2 of Donald Knuth's Art of Computer Programming, Seminumerical Algorithms. Also, you may be interested in another CodeProject article I wrote, "Pitfalls in Random Number Generation"[^].





Thanks !
kostas KEL






Good observation. This has been corrected as of 6 October 2008.





you can just do this:
Random ran = new Random();
double d = ran.NextDouble();
and you will get a double between 0 and 1. Then you can do whatever you want with it, so why spend time trying to create one?





For fun, maybe? Besides that, it's probably better to turn to System.Security.Cryptography for production code.





just showed how easy it is to get a random number





One reason for a custom generator is transparency. If you're trying to reproduce a problem and stepping through the code with a debugger, everything is right there in your code under your control.
Sometimes it's helpful to compare results of code written in different environments. If they all use their own library's RNG, the results aren't comparable. But, for example, you could use this code from unmanaged C++ and from C# and produce identical sequences if you start from identical seeds.
modified on Friday, April 11, 2008 10:08 PM





If you want one random number, fine, but if you need hundreds or thousands that are truly random, then it's a problem. I am trying to write a statistics app and I can't seem to get random numbers.





For one reason, the bug that was recently discovered in the random number generator used by Vista and everything that came before.
"Why don't you tie a kerosene-soaked rag around your ankles so the ants won't climb up and eat your candy ass..." - Dale Earnhardt, 1997
"...the staggering layers of obscenity in your statement make it a work of art on so many levels." - Jason Jystad, 10/26/2001





Random in C# sucks!
If my program runs 100 times in 1 millisecond, the random values are all the same.





I'll stick with what's built-in.
This is one of many areas where I know Microsoft can do better than I can.





PIEBALDconsult wrote: This is one of many areas where I know Microsoft can do better than I can
This may be true, but others can do significantly better. You probably don't need really random numbers, but if you do, the inbuilt functions like rand() are not up to the task.
Peter
"Until the invention of the computer, the machine gun was the device that enabled humans to make the most mistakes in the smallest amount of time."





I use System.Security.Cryptography.RNGCryptoServiceProvider





That's fine for crypto work, but what about statistical simulations (Monte Carlo etc.)? There you need very fast random number generation with good statistical properties. The built-in Random class looks similar to rand() from C. I haven't made any serious use of random numbers in .NET, but it is possible that this simple generator could be useful.
Peter
"Until the invention of the computer, the machine gun was the device that enabled humans to make the most mistakes in the smallest amount of time."





This PRNG is only usable for Monte Carlo simulations. And I think it is better than rand() for that purpose, because it passes the DIEHARD tests, while the built-in rand() function doesn't!
Where are the problems in cryptography?
1. It is only 32 bits wide - no problem for a brute force attack; you need at least 128 bits, and 160 or 256 bits is better.
2. It is guessable. If you have a number, you can guess the previous number, because the algorithm does not have the strength of a cryptographic hash function like SHA or even MD5. If the output has to be salted and hashed to mask it, it loses its statistical distribution because of the hash function. In that case you can use a salted hash function to produce a PRN sequence like this:
salt = something truly random
number = 0; // Doesn't matter, because of the salt!
number = CryptoHash( number xor salt )
number = CryptoHash( number xor salt )
(replace CryptoHash with your preferred hash function, like SHA)
Because cryptographic hash functions are slow and often don't produce a perfect statistical distribution, they are not very good for Monte Carlo simulations.






You are right: no computer algorithm will produce 'truly' random numbers.
So the goal is to create a sequence of numbers that appears to be random for whatever purpose we want to use it for.
In cryptography, random means unpredictable. Given some numbers from the sequence, it is impossible to guess the next number.
In simulations, random means that the sequence is not unusual. The DIEHARD tests[^] basically run a large number of simulations and call out very improbable results.






There are dozens of RNGs around. Each RNG is suitable in some situations but not in others. The two big classes are simulations and cryptography, and they are nearly mutually exclusive.
Crypto RNGs have a specific purpose: to produce randomness that is not predictable from previous values. For example, given enough data points, it is possible to predict the next number in a Mersenne Twister sequence. So, even though its randomness is great, its predictability makes it a no-no for cryptographic applications.
Conversely, crypto RNGs tend to be slow, and may show some correlations that would be unacceptable in a simulation.
The built-in .NET RNG is decent, and will do for most people's uses. It is based on Knuth's 3rd RNG algorithm from The Art of Computer Programming, Vol. 2. But for serious simulations, it is not adequate.





I think there are two reasons for building your own RNG:
1. As the author said: "The statistical (and not only) quality of the built-in generator might not be adequate for some tasks." E.g. cryptography. Anyone who knows the starting point (seed) of your RNG can figure out the whole sequence, no matter how long it may be! But if the algorithm is not a "common" or "well known" algorithm, then they have no chance of finding anything!
2. That way you can balance statistical quality against complexity (time delay) yourself, the way you want.
However, most apps don't need all this effort. And as you say, MS usually makes things work great!
But don't underestimate this article!
Of course, I always prefer creating random numbers using natural, non-mathematical processes (some people even use radioactive sources to build RNGs! Yes, it's a crazy world, isn't it?) but there is always an easier way!
kostas KEL
modified on Monday, August 25, 2008 11:48 AM





KEL3 wrote: But if the algorithm is not a "common" or "well known" algorithm
Security through obscurity?
KEL3 wrote: can figure out all the sequence
Yes, but you shouldn't use all the sequence. I always skip a few on each call so "they" would also need to know when each call was made.





PIEBALDconsult wrote:
"Security through obscurity?"
Well, that's the way I like to do things! Perhaps it's the best way too, when it's applicable.
I mean, I'm not a professional cryptographer, but if I ever need to encrypt something, the less others know about my encryption method, the happier I'll be...!
While professional cryptographers try to find the best "open to the public" algorithm that would be unbreakable without the key, and then fail to prove it mathematically when they think they have found it, I would rather do things the good old-fashioned way! Maybe that won't work for commercial applications, but it works great for me!
PIEBALDconsult wrote:
"Yes, but you shouldn't use all the sequence.
I always skip a few on each call so "they" would also need to know when each call was made."
That's a nice idea! I just tend to make things as hard as possible, and that's not good enough for my peculiar character!
But even if you don't share my thoughts on all this (don't worry, that happens to me all the time), there is the second argument (for creating your own RNG) I mentioned, which is the most persuasive. There are also the author's arguments.
Well, if this work is needed and it won't take you long to complete, why not give it a try... The more a person does, the more knowledge he gains!
kostas KEL





PIEBALDconsult wrote: Security through obscurity?
Doesn't hurt.
Look at it this way: if the algorithm turns out to be weak, a good guess at implementation details can greatly reduce the time for a successful attack, reducing your window of opportunity to fix that leak.
Besides, didn't Microsoft botch a hash function in .NET 1?
The article does it right: implement a generator, and run it through relevant statistical tests. The only problem I have is that the guy who provided the generator algorithm has an eerily similar name to the guy providing the test suite.





If you set a seed from a system time you risk having no random numbers but a set of predefined numbers.
Online poker sites had this problem initially because they used to seed their card shuffling algorithm with a system time based seed which was exploited. Players would reset their clock and "know" the locations of all the cards in the deck.
I am not an expert in Random Number Theory, just thought I would point it out.
Robert Kozak
www.nowcom.com





This doesn't matter if you are doing Monte Carlo simulations. For cryptography you need a good seed with enough entropy. Microsoft claims to use various operating-system-internal counters (performance data, running apps, mouse, keyboard, different oscillator frequencies, and the network card's MAC) to produce a seed for the crypto API RNG. You can probably salt this with your own entropy (app start time, mouse movement, whatever), but I think it is enough for Joe Average cryptography.
In high-security environments (e.g. a certification authority) you probably need a real random number device. Even devices based on radioactive decay don't pass all DIEHARD tests (George Marsaglia).
BTW: Poker websites are cryptography and Monte Carlo simulation together. They need to be both secure and truly random.





What if you use it for something other than Monte Carlo simulations?





If you set the seed from the system time, you only get pseudo-random numbers. This is almost OK for scientific simulations (Monte Carlo). But such random number sequences are easily guessable.
If you need random numbers (even pseudo-random ones), you must seed the generator with an unguessable start value. System time is bad, because it has a granularity of 10 milliseconds.
Assume that the particular service (a poker website) is restarted within one year: then you can get 12460 different start values using the system time. That is the same as using a 13-bit cryptographic key. Even a three-letter password has more entropy in it.
You can use the system time if it doesn't matter that the generated random numbers are guessable.








Type: Article
Licence: Public Domain
First Posted: 11 Apr 2008
Views: 301,684
Downloads: 10,501
Bookmarked: 153 times

