Add your own alternative version
Stats
405K views 14.7K downloads 168 bookmarked
Posted
11 Apr 2008

Comments and Discussions




Great one!
Using the MWC for a while now and loving it  your article is great in explaining it and provides some excellent source code with the different distributions. Have my 5!





http://www.bobwheeler.com/statistics/Password/MarsagliaPost.txt
Above post has the code:
#define znew ((z=36969*(z&65535)+(z>>16))<<16)
#define wnew ((w=18000*(w&65535)+(w>>16))&65535)
#define MWC (znew+wnew)
and your code:
private static uint GetUint()
{
m_z = 36969 * (m_z & 65535) + (m_z >> 16);
m_w = 18000 * (m_w & 65535) + (m_w >> 16);
return (m_z << 16) + m_w;
}
You are missing the &65535 andoperation? Is this on purpose or by mistake?
Additional links: Wikipages has the two versions, suggesting that probably the version without &65535 is the right one.
modified 13Aug12 16:22pm.





I've noticed this too, the correct code would be this:
m_z = 36969 * (m_z & 65535) + (m_z >> 16);
m_w = 18000 * (m_w & 65535) + (m_w >> 16);
return ((m_z << 16) + (m_w & 65535));





But in this archive Marsaglia himself states:
"In my earlier work, done in Fortran, I had implemented
two 16bit multiplywithcarry generators, say z and w,
as 32bit integers, with the carry in the top 16 bits,
the output in the bottom 16. They were combined by
(z<<16)+w. (In Fortran, ishft(z,16)+w.) Such a
combination seemed to pass all tests. In the above
mentioned post, I used (z<<16)+(w&65525), and that
does not pass all tests. So (z<<16)+w seems
preferable; it is used below, providing a MWC that
seems to pass all tests."
http://mathforum.org/kb/message.jspa?messageID=1524861[^]





Sorry, my mistake, the second ")" changes everything. This comment is completely wrong:
Member 8551910 wrote: +(w>>16))&65535) That seems kind of dumb. (w>>16) shifts the leftmost 16 bits of a 32 bit number into the rightmost 16 bits. "And"ing that with 65535 is "and"ing it with 16 ones in the rightmost 16 bits, so it remains unchanged. The bug looks to me to be in your referenced link.
modified 29Jun14 5:36am.






I converted SimpleRNG to VB. The first time through GetUint() I get an Overflow exception from this statement:
m_z = 36969 * (m_z & 65535) + (m_z >> 16)
When I step through the code in C# it works just fine.
The environment is VS2008 and .NET 3.5
Any ideas?





There is a significant and functional difference between And and & which the converter I used doesn't seem to handle very well.
I don't seem to have checked the output very well, either.
Sorry to have bothered you.





Just did a quick port to C++  have a little project in mind, and this might be just the thing, thanks.
I know I should google it, just wondering if the KS test result here looks reasonable  I was scratching my head a bit on the inits of K_plus and K_minus and their output:
Testing the random number distributionusing the KolmogorovSmirnov (KS) test.
K+ statistic: 0.405208
K statistic: 0.596502
Acceptable interval: [ 0.00591058, 2.03115 ]
K+ max at 391 0.379186
K max at 539 0.557863
KS test passed
Testing gamma
Expected mean: 20 computed mean: 20.0289
Expected variance: 40 computed variance: 39.8337
Testing normal
Expected mean: 2 computed mean: 1.9709
Expected variance: 25 computed variance: 24.9908
Testing Student t
Expected mean: 0 computed mean: 0.000852398
Expected variance: 1.5 computed variance: 1.49278
Testing Weibull
Expected mean: 2.65868 computed mean: 2.66205
Expected variance: 1.93142 computed variance: 1.93547
Testing Beta
Expected mean: 0.777778 computed mean: 0.777622
Expected variance: 0.017284 computed variance: 0.0172395
Happy to pass along what I have  thanks again, nice stuff (er, though a tad over my head )
Cheers
Tim





The KS test looks mysterious, but as long as the K and K+ values are inside the "acceptable interval," there's no reason to be suspicious. They may occasionally fall outside the interval, but unless that keeps happening with several seeds then you're probably OK. In this case 0.405208 and 0.596502 fit inside, larger than 0.00591058 and smaller than 2.03115.





Thanks  just did a test with numReps = 10000000  surprisingly quick, KS test reported in well under a minute on a P4 at 2.8Ghz (dual core I think, though not sure how much that helps), and that's most probably the std::list push_back operations taking up the time [edit: and the sort]. (I used a bit of STL in the test harness, leaving the SimpleRNG class to the basic C runtime for the math ops etc.)
Still might be a bit of tweaking for the exception code  not formatting the params yet.
The little project involves some bitmap manipulation, so speed is good.
Thanks again,
Tim





You should explain about bad seeds for this generator.
The most basic one is: if you initialise either w or z component of the generator with 0, it will be stuck on 0 (making that component of the generator useless).
But also if you initialise the w component with 0x464FFFFF, 0x8C9FFFFE, or 0xD2EFFFFD, it will get "stuck" forever on a same value output. Similarly for a value of 0x9068FFFF for the z component.





Bravo for the comment!
Just because you can do it in 3 lines of code, and result looks ok to eye, doesn't mean the algorithm is good...
I would never use SimpleRNG in any securitybased code as its next values are possibly predictable and even the bad seeds can destroy the main purpose: generating random values.
For any other case, I really don't see why I would use any "lightweight random generator".





Hi, just a note to say that it seems the single parameter constructor: Public Shared Sub SetSeed(u As UInteger) is proving to result in the more "usefully random" results for me when using the GetUniform() function.
When calling the seed constructor with dual parameters, it seems the second value has only a very small affect on the variation of the result.
SetSeed(1, 1); GetUniform = 0.564106363773296;
SetSeed(2, 1); GetUniform = 0.56411055472488;
SetSeed(3, 1); GetUniform = 0.564114745676464;
SetSeed(4, 1); GetUniform = 0.564118936628048;
SetSeed(5, 1); GetUniform = 0.564123127579632;
SetSeed(1, 2); GetUniform = 0.12820853682784;
SetSeed(2, 2); GetUniform = 0.128212727779423;
SetSeed(3, 2); GetUniform = 0.128216918731007;
SetSeed(4, 2); GetUniform = 0.128221109682591;
SetSeed(5, 2); GetUniform = 0.128225300634175;
SetSeed(1, 3); GetUniform = 0.692310709416722;
SetSeed(2, 3); GetUniform = 0.692314900368305;
SetSeed(3, 3); GetUniform = 0.692319091319889;
SetSeed(4, 3); GetUniform = 0.692323282271473;
SetSeed(5, 3); GetUniform = 0.692327473223057;
I was thinking that any unique combination of parameters would result in a significantly random output. Am I misunderstanding the intention of the second param?
Thanks in advance for help





If you run these sequences further, not just generating one sample, I believe you'll see them diverge.





Hello, thanks for the article and code; my apologies, I am new to pseudo random number generators and Mathematics is a weak point of mine, but I have a question:
I would like to know up front the point the sequence repeats, I believe this is called the period, so that I can "wrap" around some dynamically generated game content.
Is this even possible to determine?





I think the period is on the order of 2^64, so you're unlikely to ever see it. If you generate 10^9 random numbers a second, you'll run out in a few centuries.






The period of the w component is 2^(29.1). The period of the z component is 2^(30.2). So the period of the whole generator is actually approx 2^(59.3).
modified 18Jun12 0:10am.





As long as it has to be. Links to resources if you wanna dig deeper and very easy to use. Excellent!





It has been extremely helpful for my project. Quite surpisingly default implementation of .Net was producing always the same result.
With the use of this alogorithm my program (used for self learning) is running perfectly fine.
Thanks & Regards,
Vikas Kapoor
Technical Specialist,
Zensar Technologies Limited,
Pune.






Agin this is good stuff. This code looks like it might be very useful for something different than before. A small walkthrough on how one would create an additional distribution using this framework would be really nice though. Especially one that notes potential pitfalls. (Also though I know it is far too much to ask, it would be greatly appreciated if you could share your testing code. )
You got a 5 from me on this article some time back. Big kudos for improving an article over a course of years.
Ken
modified on Saturday, January 8, 2011 11:16 AM






I think there is a bug in GetUniform method which is not caught by the tests. What happens if u = 2^32? The return statement return (u + 1) * 2.328306435454494e10; increments u by one, making u+1 == 0, thus the method return 0, I suppose (haven't tested it, though). A simple fix would be: return (u + 1.0) * 2.328306435454494e10; so that the increment operation is using doubles, not ints.





Yes, 1.0 would be better than 1. Thanks for reading carefully and finding that. As it stands, the code will return 0 when u = 2^32  1. Of course that's a rare event, but I should change the code.
Update (8 January 2011): The article and the source code have been modified to reflect this change.
modified on Saturday, January 8, 2011 10:02 AM





Excellent article, but I was wondering why all the methods are static. IMHO, this cause problems if you have a multithreaded application and you want to have more than one, yet reproducible, pseudorandom series.
Problem 1: If you have static methods and variables, you need locking or some other synchronization method. Nonstatic version allows RNG instance for each thread so they are safe to use without synchronization.
Problem 2: Even if you used synchronization with static methods, the pseudo random number series are not deterministic and reproducible because the RNG is used in nondeterministic times from different threads. There are several applications where you want to reproduce exactly the same pseudo random number for a given seed.





Hi John,
Excellent article though way out of my league as far as math is concerned.
The problem I am trying to solve is to generate a random positive integer y calling the GetRandom() function x number of times using VB.net. The problem I am running into is that since the computers are so fast these days, if my x is 25 (e.g), then it generates the same number 25 times.
I don't know if it makes sense to use your solution. Might be an overkill.
Do you or any of your readers have any suggestions besides using System.Threading.thread.pause(2) so I will (hopefully) have fresh seed? Or if not, any way to port your c# library into VB.NET so I can just use it directly without compiling yours into a dll?
Thanks for your help
Kuntal





You're only meant to seed the generator once, at the start. Then you can call the GetUint() function many times, and get a different value each time.
modified 18Jun12 0:10am.





I found this to generate pretty random numbers. It's nothing fancy and I'm not sure of the statistical distribution if produces, but it works well for me.
Denny
static double GenerateRandomNumber()
{
byte[] random_bytes = new byte[4];
new RNGCryptoServiceProvider().GetBytes(random_bytes);
int random_int = BitConverter.ToInt32(random_bytes, 0);
Random random = new Random(random_int);
return random.NextDouble();
}





While trying to figure out how to generate random number for a Poisson distribution, I came across your contribution. Thank you putting together the well written article+code+test harness. So rare these days!
BTW: I am still trying to figure out how to create Poisson RNs. Any suggestions (or code snippet) would be much appreciated. Cheers!






Thank you so much. This rates a 10!





Hi John:
The GetPoisson function works nicely for me (as I have small Lamda values). But the LargePoisson function won't compile as it is looking for the LogFactorial(n) function. For the moment I've implemented the "simple/inefficient version" (just to get it to compile), but as I was intrigued by your comment "A better approach would be to use a log gamma function if one is available", I looked around your lovely clean adfree site for the LogFactorial code but couldn't find it. My math is a tad rusty (have been out of school for four decades now!), so I don't know if I can write my own; if you could point me to the better code that would be marvelous.
Cheers!
VVX





I have standalone code[^] for several functions on my web site, but not for gamma and log gamma. The standalone code collection is for welldocumented little pieces of code you can easily copy and paste. Also, I want to write them from scratch so there are no license issues, not even open source licenses.
I haven't included gamma and log gamma because their implementations are more complicated. Some day I may add these functions.





Thanks. Look forward to seeing your gamma functions.






Thanks John. Will check it out soon. Cheers!





Nice and succinct. Better if in VB





First I very much appreciate mathematical articles. Second I would really appreciate some more explanation on why GetUint works. Not obvious is definitely an understatement. Can this be extended if one needed random numbers from 1 to n where n could be gargantuan. In particular I want to generate random rational numbers and I would like to have potentially thousands or more digits in the numerator and denominator. If you have any Ideas as to how to make a good random number generator for Rational Numbers  .Net 4.0 version , it would be greatly appreciated.
Ken





If you want to understand how random number generators work, you could start with Knuth's Seminumerical Algorithms, (TAOCP volume 2).
As far as generating large random numbers, assume for a minute we had a random number generator that produced numbers 0, 1, 2, ..., 9. You could generate a random number with k digits by computing an array a[] of random digits and computing a[0] + 10*a[1] + 100*a[2] + ... + 10^(k1)*a[k1].
Now instead of working in base 10, work in base 2^32 since GetUint produces numbers between 0 and 2^32  1. For example, you could generate a 128bit random number by generating four 32bit unsigned integers a, b, c, and d and returning a + b*2^32 + c*2^64 + d*2^96.






Hi,
Very nice and useful article.
The RNG in your code:
private static uint GetUint()
{
m_z = 36969 * (m_z & 65535) + (m_z >> 16);
m_w = 18000 * (m_w & 65535) + (m_w >> 16);
return (m_z << 16) + m_w;
}
is not quite the same as the RNG that Marsaglia proposed:
#define znew ((z=36969*(z&65535)+(z>>16))<<16)
#define wnew ((w=18000*(w&65535)+(w>>16))&65535)
#define MWC (znew+wnew)
because Marsaglia's code includes &65535 at the end of the definition of wnew.
So your GetUint() code will mix 16 bits of z and w in the upper 16 bits of the return value, whereas Marsaglia's code strictly appends two 16 bit values without overlapping them.
This may have a positive, negative, or no effect on the quality of the numbers for a given purpose, but anyway we should be clear that it's not the same code that Marsaglia was evaluating. However you have some amount of evaluation you did independently on your code.
One point about both pieces of code is that the lower 16 bits returned depend only on w.
So if we use either piece of code, we have to be careful not to assume that w can be predictable (for example, if an attacker can predict w but not z, we would be wrong to assume that the algorithm would still "protect" us in some way by covering up the predictability of w: the lower 16 bits returned from each call would be completely predictable).
For the same reason, a piece of code that wanted 8 random bits should probably choose them from the top 16 bits returned.
I wonder if it might be "better" in some way to return
(w ^ ( ((z&65535)<<16)  (z>>16) ) ) [assumes z is unsigned so no sign extension]
so that no matter which subset of bits you chose, they would include contributions from z and w ?





You are correct. Thank you for reading so carefully. I've submitted a revised version of the article and code to match Marsaglia.
Update: The current version of the article now reflects the update.modified on Monday, February 22, 2010 1:53 PM





return (m_z << 16) + (m_w & 65535);
The above line is not portable if m_z may be greater than 65535. I would suggest ((m_z & 65535) << 16) + (m_w & 65535).
I also find myself somewhat dubious about the fact that the upper 16 bits and lower 16 bits of the result are independent. It would seem many applications would be likely to use only the upper or lower 16 bits. I'm not sure what form of blending would be best, but I would think some form of mixing would be helpful.
Incidentally, I wonder whether simple random number generators might be improved by simply xor'ing the output with a counter indicating how many times the generator has been invoked (perhaps one that counts by some large number, mod 2^31). Such a counter by itself would be a crummy generator, but if it's orthogonal to whatever other generator is being used, it shouldn't make things any worse and might make them better (if the other generator's period isn't a power of two, such a counter would extend the period of the generator quite nicely).






See also this newsgroup post from around the same time (Random numbers for C: Improvements. Posted: Jan 15, 1999 11:41 AM), in which Marsaglia explains the difference between the two algorithms:
> In my earlier work, done in Fortran, I had implemented
> two 16bit multiplywithcarry generators, say z and w,
> as 32bit integers, with the carry in the top 16 bits,
> the output in the bottom 16. They were combined by
> (z<<16)+w. (In Fortran, ishft(z,16)+w.) Such a
> combination seemed to pass all tests. In the above
> mentioned post, I used (z<<16)+(w&65525), and that
> does not pass all tests. So (z<<16)+w seems
> preferable; it is used below, providing a MWC that
> seems to pass all tests.






Thanks. I have submitted a revised version of the article and code incorporating the correction you suggest.







General News Suggestion Question Bug Answer Joke Praise Rant Admin Use Ctrl+Left/Right to switch messages, Ctrl+Up/Down to switch threads, Ctrl+Shift+Left/Right to switch pages.

