In this article, you will find a fast generator for random variables, namely normally and exponentially distributed ones. It is based on the work of George Marsaglia and Wai Wan Tsang.
Random Variable Generator
This article presents a fast generator for random variables drawn from the normal and exponential distributions. The algorithm is due to George Marsaglia and Wai Wan Tsang. Here is the abstract of their paper:
We provide a new version of our ziggurat method for generating a random variable from a given decreasing density. It is faster and simpler than the original, and will produce, for example, normal or exponential variates at the rate of 15 million per second with a C version on a 400MHz PC. It uses two tables, integers k_i and reals w_i. Some 99% of the time, the required x is produced by: Generate a random 32-bit integer j and let i be the index formed from the rightmost 8 bits of j. If j < k_i return x = j * w_i.
They did 99.9% of the work; I just encapsulated their C code in a class wrapper.
To illustrate the generator, a histogram template class, THistogram, is provided.
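To make the abstract more concrete, here is a minimal sketch of the ziggurat idea for the exponential distribution. It is not the code shipped with this article: ZigguratExp is a hypothetical name, std::mt19937 is swapped in for the authors' own integer generator, and the two starting constants are the ones I recall from the C listing published with the paper, so check them against the paper before relying on them.

```cpp
#include <cmath>
#include <cstdint>
#include <random>

// Sketch of the ziggurat method for the exponential density exp(-x),
// using 256 layers. ZigguratExp is a hypothetical name for illustration.
class ZigguratExp {
public:
    explicit ZigguratExp(std::uint32_t seed) : rng_(seed) {
        const double m2 = 4294967296.0;           // 2^32
        const double ve = 3.949659822581572e-3;   // assumed: area of each layer
        double de = kTail;                        // right edge of the base layer
        double te = de;
        const double q = ve / std::exp(-de);
        ke_[0] = static_cast<std::uint32_t>((de / q) * m2);
        ke_[1] = 0;
        we_[0] = q / m2;
        we_[255] = de / m2;
        fe_[0] = 1.0;
        fe_[255] = std::exp(-de);
        // Walk from the outermost layer inwards, solving for each layer edge.
        for (int i = 254; i >= 1; --i) {
            de = -std::log(ve / de + std::exp(-de));
            ke_[i + 1] = static_cast<std::uint32_t>((de / te) * m2);
            te = de;
            fe_[i] = std::exp(-de);
            we_[i] = de / m2;
        }
    }

    double Next() {
        for (;;) {
            const std::uint32_t j = static_cast<std::uint32_t>(rng_());
            const std::uint32_t i = j & 255u;     // rightmost 8 bits pick the layer
            if (j < ke_[i]) return j * we_[i];    // fast path: one compare, one multiply
            if (i == 0) return kTail - std::log(Uniform());   // sample from the tail
            const double x = j * we_[i];
            // Wedge region: accept x if a uniform height falls under the density.
            if (fe_[i] + Uniform() * (fe_[i - 1] - fe_[i]) < std::exp(-x)) return x;
        }
    }

private:
    static constexpr double kTail = 7.69711747013104972;  // assumed: start of the tail

    // Uniform value strictly inside (0, 1), so that log() is always finite.
    double Uniform() { return (rng_() + 0.5) / 4294967296.0; }

    std::mt19937 rng_;
    std::uint32_t ke_[256];
    double we_[256];
    double fe_[256];
};
```

The point of the method is visible in Next(): most calls leave through the first return, and the logarithm and exponential are only evaluated in the rare tail and wedge cases.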
A Little Bit of Mathematical Background
The normal distribution holds an honored role in probability and statistics, mostly because of the central limit theorem, one of the fundamental theorems that form a bridge between the two subjects.
The normal distribution is also called the Gaussian distribution, in honor of Carl Friedrich Gauss, who was among the first to use the distribution.
There are plenty of tutorials and demo applets on normal distributions on the web so I won't go into mathematical details here. A very nice web site on this topic can be found at Virtual Laboratories in Probability and Statistics.
The algorithms used to compute the mean, variance and other statistical properties are taken from Numerical Recipes.
Using the Generator
CRandomGenerator provides two types of random variables: normal and exponential. These are computed by two static methods, RNOR and REXP:
- Standard normal variable:
float var = CRandomGenerator::RNOR();
You can also modify the mean and standard deviation of the generator:
float var = CRandomGenerator::RNOR(fMean, fSdev);
- Exponential variable:
float var = CRandomGenerator::REXP();
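The two-argument form of RNOR above presumably shifts and scales a standard normal variate. The sketch below illustrates that relationship under this assumption. StandardNormal and ScaledNormal are hypothetical names, and std::normal_distribution only stands in for the ziggurat generator so that the example compiles on its own.

```cpp
#include <random>

// Hypothetical stand-in for a standard normal source such as RNOR().
// std::normal_distribution is used only to keep the sketch self-contained.
inline float StandardNormal(std::mt19937& rng) {
    std::normal_distribution<float> dist(0.0f, 1.0f);
    return dist(rng);
}

// Assumed relationship: a normal variate with mean fMean and standard
// deviation fSdev is a standard normal variate scaled by fSdev and
// shifted by fMean.
inline float ScaledNormal(std::mt19937& rng, float fMean, float fSdev) {
    return fMean + fSdev * StandardNormal(rng);
}
```

With this relationship, a fast standard normal generator is enough to cover every choice of mean and standard deviation.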
Initializing the Generator
Like any random generator, it has to be initialized with a "seed". Classically, one uses the current time to seed the generator. This is done implicitly in the default constructor of CRandomGenerator, so all you have to do is build one and only one CRandomGenerator object in your application thread before using the generator. Doing that in your CWinApp::InitInstance function is a good idea.
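As an illustration of time-based seeding, here is a small sketch built on the standard library. SeededGenerator and TimeSeed are hypothetical names, not part of CRandomGenerator, and the explicit-seed constructor is an addition of mine that is handy when a run has to be reproducible.

```cpp
#include <chrono>
#include <cstdint>
#include <random>

// Hypothetical helper: derive a 32-bit seed from the current time.
inline std::uint32_t TimeSeed() {
    const auto ticks =
        std::chrono::system_clock::now().time_since_epoch().count();
    return static_cast<std::uint32_t>(ticks);
}

// Hypothetical stand-in for a generator object that is built once and reused.
class SeededGenerator {
public:
    // Default construction seeds implicitly from the clock.
    SeededGenerator() : rng_(TimeSeed()) {}

    // An explicit seed gives the same sequence on every run.
    explicit SeededGenerator(std::uint32_t seed) : rng_(seed) {}

    std::uint32_t Next() { return static_cast<std::uint32_t>(rng_()); }

private:
    std::mt19937 rng_;
};
```

Building several time-seeded objects in quick succession can hand them the same seed, which is one reason to create a single generator object and reuse it.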
Using the Histogram
THistogram is a templated histogram class.
T, input data type: can be
TOut, result data type: can be
double, by default set to
The histogram is characterized by:
- a region, defined by a minimum and maximum spectrum value (see
- a step size (see
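The region and step size together decide which bin a value falls into. The following sketch shows that bookkeeping. SimpleHistogram is a hypothetical, stripped-down class and not the THistogram implementation, and ignoring out-of-range values is a choice made for this example only.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical minimal histogram: a region [min, max) cut into equal steps.
class SimpleHistogram {
public:
    SimpleHistogram(double min, double max, std::size_t bins)
        : min_(min), max_(max), step_((max - min) / bins), counts_(bins, 0) {}

    // Adds one value; values outside [min, max) (and NaN) are ignored here.
    void Update(double x) {
        if (!(x >= min_ && x < max_)) return;
        const std::size_t i = static_cast<std::size_t>((x - min_) / step_);
        if (i < counts_.size()) ++counts_[i];   // guards against rounding at the edge
    }

    double Step() const { return step_; }
    double LeftEdge(std::size_t i) const { return min_ + i * step_; }
    const std::vector<unsigned>& Counts() const { return counts_; }

private:
    double min_, max_, step_;
    std::vector<unsigned> counts_;
};
```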
Computing the Histogram
You can feed the histogram with data in several ways:
- Feeding a vector of data:
vector< float > vData;
THistogram< float > histo(101);
histo.Compute( vData , true );
- Updating with a vector of data:
vector< float > vData;
histo.Update( vData );
- Updating with a single data entry:
histo.Update( fData );
- Computing the statistical characteristics of a data set: you can compute the moments of a data set (mean, standard deviation, variance, etc.) by using the static method GetMoments:
float fMean, fAdev, fSdev, fVar, fSkew,fKurt;
THistogram<float,float>::GetMoments(vData, fAdev, fSdev,fVar, fSkew, fKurt);
This method is used in the demo to compare the normal distribution generated by CRandomGenerator and the theoretical probability density function.
- Get the histogram results by using
- Get a normalized histogram (such that its area is 1) by calling GetNormalizedHistogram.
- Get the coordinates of the regions by using, for instance, GetLeftContainers.
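For readers who want to see what the moment computation in the list above involves, here is a sketch using the usual textbook definitions (sample variance with an n - 1 divisor, skewness and kurtosis normalized by the standard deviation). Moments and ComputeMoments are hypothetical names, and I have not checked that GetMoments uses exactly the same conventions.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical result record for the sketch below.
struct Moments {
    double mean, adev, sdev, var, skew, kurt;
};

// Computes the mean, average deviation, standard deviation, variance,
// skewness and excess kurtosis of a data set. Returns false when there are
// fewer than two values or when the variance is zero, because skewness and
// kurtosis are undefined in that case.
inline bool ComputeMoments(const std::vector<double>& data, Moments& m) {
    m = Moments{};
    const std::size_t n = data.size();
    if (n < 2) return false;

    double sum = 0.0;
    for (double x : data) sum += x;
    m.mean = sum / n;

    double absSum = 0.0, s2 = 0.0, s3 = 0.0, s4 = 0.0;
    for (double x : data) {
        const double d = x - m.mean;
        absSum += std::fabs(d);
        s2 += d * d;
        s3 += d * d * d;
        s4 += d * d * d * d;
    }
    m.adev = absSum / n;
    m.var = s2 / (n - 1);
    m.sdev = std::sqrt(m.var);
    if (m.var == 0.0) return false;

    m.skew = s3 / (n * m.var * m.sdev);
    m.kurt = s4 / (n * m.var * m.var) - 3.0;
    return true;
}
```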
The code to generate the plots looks like this:
vector < double > vDistribution ( histo.GetNormalizedHistogram() );
vector < double > vLeftPositions ( histo.GetLeftContainers() );
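A normalized histogram can be compared directly with a probability density function because both have unit area. The sketch below shows one way to do the normalization and gives the two theoretical densities involved. NormalizeCounts, NormalPdf and ExponentialPdf are hypothetical helper names, not members of THistogram.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Scales raw bin counts so that the sum of density[i] * step equals 1.
// Returns all zeros if there is no data or the step is not positive.
inline std::vector<double> NormalizeCounts(const std::vector<unsigned>& counts,
                                           double step) {
    double total = 0.0;
    for (unsigned c : counts) total += c;

    std::vector<double> density(counts.size(), 0.0);
    if (total <= 0.0 || step <= 0.0) return density;

    for (std::size_t i = 0; i < counts.size(); ++i) {
        density[i] = counts[i] / (total * step);
    }
    return density;
}

// Density of the normal distribution with the given mean and standard deviation.
inline double NormalPdf(double x, double mean, double sdev) {
    const double kPi = 3.14159265358979323846;
    const double z = (x - mean) / sdev;
    return std::exp(-0.5 * z * z) / (sdev * std::sqrt(2.0 * kPi));
}

// Density of the exponential distribution with unit rate.
inline double ExponentialPdf(double x) {
    return x < 0.0 ? 0.0 : std::exp(-x);
}
```

Evaluating the density at each bin position gives the theoretical curve to plot against the normalized histogram.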
The demo application shows the normal and exponential distributions. The red curve represents the histogram of the random values and the blue curve represents the theoretical probability density function.
There are two projects in the demo application:
HistogramDemo uses the Plot Graphic Library for visualization. There is another article on the PGL here. You will need the GDI+ binaries to make the demo application work! (gdiplus.dll)
HistogramMatlab uses the Matlab engine for visualization. You will need Matlab installed to make the demo application work!
Note that the source code is documented using Doxygen syntax.
- 26th November, 2002
- Added theoretical distribution in
- Added histogram area computation
- Fixed normalized histogram, added
- 13th September, 2002
- Added a new demo project using Matlab
- 11th September, 2002
- Fixed demo project and added binary in demo
Jonathan de Halleux is a Civil Engineer in Applied Mathematics. He finished his PhD in 2004 in the rainy country of Belgium. After 2 years in the Common Language Runtime (i.e. .net), he is now working at Microsoft Research on Pex (http://research.microsoft.com/pex).