Click here to Skip to main content
Click here to Skip to main content

A Normal / Exponential Random Generator and Histogram class

By , 2 Dec 2002
 

Sample Image - zigurat.png

Introduction

Random Variable Generator

This article present a fast generator for Random Variable, namely normal and exponential distributions. The algorithm is due to George Marsaglia and Wai Wan Tsang in [1]. Here's the abstract of their paper:

We provide a new version of our ziggurat method for generating a random variable from a given decreasing density. It is faster and simpler than the original, and will produce, for example, normal or exponential variates at the rate of 15 million per second with a C version on a 400MHz PC. It uses two tables, integers ki and reals wi. Some 99% of the time, the required x is produced by: Generate a random 32-bit integer j and let i be the index formed from the rightmost 8 bits of j. If j < ki return x = j . wi.

They did 99,9% of the work, I just encapsulated their C code in a class wrapper.

Histogram

In order to illustrate the generator, an Histogram template class, THistogram is provided.

A little bit of mathematical background

The normal distribution holds an honored role in probability and statistics, mostly because of the central limit theorem, one of the fundamental theorems that forms a bridge between the two subjects.

The normal distribution is also called the Gaussian distribution, in honor of Carl Friedrich Gauss, who was among the first to use the distribution.

There are plenty tutorials and demo applets on normal distributions on the web so I won't go into mathematical details here. A very nice web site on this topic can be found at Virtual Laboratories in Probability and Statistics

The algorithm used to compute the mean, variance and other statistical properties are taken from the numerical recipies (see [2]).

Using the generator

The CRandomGenerator provides two types of random variables: normal and exponential. These are computed by 2 static functions, RNOR and REXP:

  • standard normal variable:
    float var = CRandomGenerator::RNOR();
    	
    You can also modify the mean and standard deviation of the generator:
    float fMean=...;
    float fSdev=...;
    float var = CRandomGenerator::RNOR(fMean, fSdev);
    	
  • exponential variable:
    float var = CRandomGenerator::REXP();
    	

Initializing the generator

Like any random generator, it has to be initialized with a "seed". Classically, one uses the current time to seed the generator. This is done implicitly in the default constructor of CRandomGenerator, so what you have to do is to build one and only one CRandomGenerator object in your application thread before using the generator. Doing that in your CWinApp::InitInstance function is a good idea.

CWinApp::InitInstance
{
   ...
   // build a CRandomGenerator to seed the generator
   CRandomGenerator randn;
}

Using the histogram

THistogram is a templated histogram class.

Template parameters

  • T, input data type: can be float or double,
  • TOut, result data type: can be float or double, by default set to double

Characteristics

The histogram is characterized by

  • a region, defined by a minimum and maximum spectrum value (see Get/SetMinSpectrum,Get/SetMaxSpectrum)
  • a step size (see GetStep)

Histogram diagram

Computing the histogram

You can feed the histogram with data by different manners:

  • Feeding a vector of data,
    vector< float > vData;
    ...  // computing the data
    	 
    // Creating an histogram with 101 regions
    THistogram< float > histo(101);
    // Computing the histogram, min and max values are automatically computed... 
    histo.Compute( vData , true /* compute min, max */);
    
  • Updating with a vector of data,
    vector< float > vData;
    ... 
    // updating the histogram 
    histo.Update( vData );
    
  • Updating with a single data entry,
    float fData;
    ... 
    // updating the histogram 
    histo.Update( fData );
    
  • Computing the statistical characteristics of a data set: you can compute the moments of a data sets (mean, standard deviation, variance, etc.) by using the static method GetMoments:
    float fMean, fAdev, fSdev,  fVar, fSkew,fKurt;
    // Computing succesively the mean, absolute mean, standard deviation, variance, skewness and kurtosis
    THistogram<float,float>::GetMoments(vData, fAdev, fSdev,fVar, fSkew, fKurt);
    
    This method is used in the demo to compare the normal distribution generated by CRandomGenerator and the theoretical probability density function.

Retrieving results

  • Get the histogram results by using GetHistogram.
  • Get a normalized histogram (such that it's area is 1) by calling GetNormalizedHistogram,
  • Get the coordinates of the regions can be accessed by using GetLeftContainers and GetCenterContainers.

    The code to generate the plots looks like this:

    vector < double > vDistribution ( histo.GetNormalizedHistogram() );
    vector < double > vLeftPositions ( histo.GetLeftContainers() );
    CPGLLine2D* pLine;
    ...
    pLine->SetDatas(vLeftPositions, vDistributions);
    

Demo application

The demo application shows the normal and exponential distribution. The red curve represents the histogram of the random values and the blue curve represent the theoretical probability density function.

There are two projects in the demo application:

  • HistogramDemo uses the Plot Graphic Library for visualization. There is another article on the PGL here. You will need the GDI+ binaries to make the demo application work ! (gdiplus.dll)
  • HistogramMatlab uses the Matlab engine for visualization. You will need Matlab installed to make the demo application work !

Note that the source code is documented using Doxygen syntax.

References

Update History

Nov,26,2002
Added theoretical distribution in CRandomGenerator,
Added histogram area computation,
Fixed normalized histogram, added .
Sep,13,2002 Added a new demo project using Matlab.
Sep,11,2002 Fixed demo project and added binary in demo.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.

A list of licenses authors might use can be found here

About the Author

Jonathan de Halleux
Engineer
United States United States
Member
Jonathan de Halleux is Civil Engineer in Applied Mathematics. He finished his PhD in 2004 in the rainy country of Belgium. After 2 years in the Common Language Runtime (i.e. .net), he is now working at Microsoft Research on Pex (http://research.microsoft.com/pex).

Sign Up to vote   Poor Excellent
Add a reason or comment to your vote: x
Votes of 3 or less require a comment

Comments and Discussions

 
You must Sign In to use this message board.
Search this forum  
    Spacing  Noise  Layout  Per page   
GeneralBug?membertypke23 May '11 - 3:27 
I think there is a bug in line 277 of Histogram.h:
 
Old code:
 
template <class T, class TOut>
std::vector<TOut> THistogram<T,TOut>::GetCenterContainers() const
{
	std::vector<TOut> vX( m_vCounters.size());
 
	for (UINT i = 0;i<m_vCounters.size(); i++)
		vX[i]= m_tMin + (i+0.5)*m_dStep;
 
	return vX;
}
 
New code:
 
template <class T, class TOut>
std::vector<TOut> THistogram<T,TOut>::GetCenterContainers() const
{
	std::vector<TOut> vX( m_vCounters.size());
 
	for (UINT i = 0;i<m_vCounters.size(); i++)
		vX[i]= m_tMin + (i-0.5)*m_dStep;
 
	return vX;
}
 
I used the following test data:
 
	int nColDataSize = 940;
	DWORD pData[940];
	DWORD sum = 0;
	for (r = 0; r < 10; r++)
	{
		// 10
		pData[r] = 11;
		sum += pData[r];
	}
	for (r = 10; r < 30; r++)
	{
		// 20
		pData[r] = 12;
		sum += pData[r];
	}
	for (r = 30; r < 70; r++)
	{
		// 40
		pData[r] = 13;
		sum += pData[r];
	}
	for (r = 70; r < 150; r++)
	{
		// 80
		pData[r] = 14;
		sum += pData[r];
	}
	for (r = 150; r < 310; r++)
	{
		// 160
		pData[r] = 15;
		sum += pData[r];
	}
	for (r = 310; r < 630; r++)
	{
		// 320
		pData[r] = 16;
		sum += pData[r];
	}
	for (r = 630; r < 790; r++)
	{
		// 160
		pData[r] = 17;
		sum += pData[r];
	}
	for (r = 790; r < 870; r++)
	{
		// 80
		pData[r] = 18;
		sum += pData[r];
	}
	for (r = 870; r < 910; r++)
	{
		// 40
		pData[r] = 19;
		sum += pData[r];
	}
	for (r = 910; r < 930; r++)
	{
		// 20
		pData[r] = 20;
		sum += pData[r];
	}
	for (r = 930; r < 940; r++)
	{
		// 10
		pData[r] = 21;
		sum += pData[r];
	}
 
It generates 11 symmetrical bins around a value of 16 (11 to 21), with each bin doubling its count on its way towards the center value and then dividing its count when exceeding the center value.
 
Mean value of such distribution is 16.
 
When you plot the whole thing using code like
 
	THistogram<double, double> histo(11);
 
	// Create vectors
	std::vector<double> vDataX(nColDataSize);
	std::vector<double> vDataY(nColDataSize);
	for (r = 0; r < nColDataSize; r++)
	{
		vDataX[r] = (double)(r);
		vDataY[r] = (double)(pData[r]);
	}
 
	histo.Compute(vDataY);
 
	CPGLGraph* pGraph = new CPGLGraph;
	CPGLLine2D* pLine = new CPGLLine2D;
	pLine->SetInterpolationType(PGL_INTERPOLATION_STEP);
	pLine->SetLineWidth(2);
 
	// putting result in vector
	std::vector<double> vX(histo.GetCenterContainers());
	std::vector<double> vY(histo.GetHistogramD());
	pLine->SetDatas(vX, vY);
 
	//Add the line to the graph (note that an object can be added to only one graph): 
	pGraph->AddObject(pLine);
 
	//Make PGL scale the plot (automatically) 
	pGraph->ZoomAll();
 
	//Create a dialog box and display the plot: 
	CPGLGraphBitDlg graphdlg( ((CMainFrame*)AfxGetMainWnd())->GetActiveFrame()->GetActiveView(), pGraph);
	graphdlg.DoModal();
 
it centers the histogram maximum at an x-value of 17, not 16.
 
To correct that bug, use the new code.
 
Apart from that, PGL only plots 10 y-values, although I have a histogram size of 11. This must be due to the interpolation style, I assume, but I am not sure of that. So the right-most bin is always missing in the plot...?
 
Anyways, PGL is great work, it saves CodeProject community members a lot of time.
Best regards,
 
Volker

QuestionCan open in microsoft vc10 or vc08memberMember 76545647 Feb '11 - 23:42 
Hi.
I have downloaded the source code but it does not compile.
Generalmultivariate normalmemberhamidreza_buddy21 Jul '08 - 19:10 
how should i generate d dimensional samples given a Mu and a Covariance matrix?
QuestionProblems implementing in VC++6memberPaul S Ganney10 Aug '07 - 7:00 
Nice piece of work which I'd very much like to use. I don;t need the histograms, just the normal random variable generator.
 
My first problem was PGL_PI, but I #defined it and all was OK.
It's now giving me lots of errors like
c:\program files\microsoft visual studio\vc98\include\new(35) : error C2061: syntax error : identifier 'THIS_FILE'
 
which shouldn't be a problem as they're the standard files - do I need to add another library to use this code?
 
Thanks
 
Paul.
Generalsimpler algorithmmemberandrewgs738 Oct '05 - 4:29 
I have these simple algorithms which i used to create a guaranteed random
number from a normal or exponential distribution:
 
A. NORMAL DISTRIBUTION
 
z = inverse_ncdf(random()); // where inverse_ncdf is the function similar
// to 'normsinv()' of Excel (or OOO Calc) and
// random() is the function which generates a
// random number from 0 to 1.
 
return z * sd + mean; // where sd is your desired standard deviation and
// mean is your desired average.
 
B. EXPONENTIAL DISTRIBUTION
 
return -mean * log(1-random()); // where mean is your desired average,
// log is the NATURAL LOGARITHM function
// (similar with LN() function of excel)
// and random() is the function which
// generates a random number from 0 to 1.
 

I hope this helps!!!
 
Alex Andrew Mosqueda
I am your friend
GeneralAnother one Histogram Template.memberrbid14 Sep '04 - 6:31 
Hello Again,
 
I found another few catch22 on the Histogram Template...
 
First, you start with a single sample, ComputeStep() may have a divide error. (the provided vector size is 1)... sounds ridiculous, but sometimes you may want to add samples (via Update()) as you get them....
 
Calling Compute() with a bComputeMinMax = false, and you get an under/over sized sample in the input vector, the function will return as soon the first under/over sised sample is encontered. I guess it will be better to invert the condition, and instead of returning on the first non-legal value, allow inserting all legal values (also in Update()). Of course, in this case, you need to set the max/min spectrum before calling Compute()...
 
Today, also if you call Update() before Compute(), without calling first Set max/min spectrum, you also get wrong stuff.
 
And to finalize, great template. I learned a lot from it. THANKS A LOT as well.
 
--- Ricky
 
 "Beware of bugs in the above code; I have only proved it correct, 
  but not tried it." -- Donald Knuth

GeneralRandom Generator object should always be GLOBALmemberAlex Chirokov13 Jul '04 - 14:09 
Nice work!
One thing is that: application should only have one Random Generator object(one instance of the class), and this object should be global(to be available to every one) ! That is why it is better to implement it as a global function not a class! If you use two objects of random class in your applicaton they will generate correlated sequences and your results will not be trully random. Another good random generator can be taken from www.nr.com
 
be good
GeneralRe: Random Generator object should always be GLOBALsussAnonymous13 Jul '04 - 20:25 
Just make it a singleton.
Jonathan
GeneralRe: Random Generator object should always be GLOBALmemberAlex Chirokov14 Jul '04 - 1:46 
Yes you right it should be singleton! (always)
 
Btw have you tested your generator? (using DIEHARD for example). I am doing right now monte-carlo simulation, and 90% of execution time I generate random numbers (by nr routine, for which I wrote "singleton" wraperSmile | :) ). So if i can generate them faster it would speed-up simulation process dramatically, but the problem is I need very "Random" numbers (high quality randomness). I am going to try testing your code Smile | :) )
Thank you for the nice class!
 
Alex-
GeneralNon-membersmemberJoeSox23 Dec '02 - 7:44 
>You will need Matlab installed to make the demo application work !
 
So I find the link to download and read....
"Non-members
• To request a trial, you may complete this form to have a sales representative contact you. "
 
NO THANKS.
 
Can't you put an easier demo for us non-members and non PHd's???Laugh | :laugh: Confused | :confused: Unsure | :~ Dead | X| Cry | :(( Sleepy | :zzz: OMG | :OMG: WTF | :WTF:
 
Joe
 


General General    News News    Suggestion Suggestion    Question Question    Bug Bug    Answer Answer    Joke Joke    Rant Rant    Admin Admin   

Permalink | Advertise | Privacy | Mobile
Web03 | 2.6.130516.1 | Last Updated 3 Dec 2002
Article Copyright 2002 by Jonathan de Halleux
Everything else Copyright © CodeProject, 1999-2013
Terms of Use
Layout: fixed | fluid