Click here to Skip to main content
12,547,922 members (46,114 online)
Click here to Skip to main content
Add your own
alternative version


26 bookmarked

A High Performance Monte Carlo Integration Simulation Framework

, 2 Sep 2014 CPOL
Rate this:
Please Sign up or sign in to vote.
A high performance monte carlo simulation framework for financial derivative pricing


In this article, I coded a high performance Monte Carlo Simulation framework. To achieve high performance, I use both parallel processes and parallel multithreading in each process.  MPI (message passing interface ) protocol is used to coordinate multiple process ( openmpi implementation of MPI is used in this article ), and C++ 11 threading library is used to implement multithreading. Since I come from financial industry, this Monte Carlo Framework is designed to price financial derivatives. I use static inheritance, in particular, curiously recurring template pattern to make the framework flexible enough to price any type of derivative with different kinds of payoff, while keeping high performance, and staying away from virtual function cost if it were to implement as run time dynamic inheritance.

Multi Process and Multi Threading

Monte carlo simulation is very good candidate do parallel computing, since all paths are independent to each other, and we need run large number of path in order to converge. In order to achieve high performance, I use two levels of parallels. The first level is on parallel process. MPI interface provide communication between a set of process. First, I divide simulation paths of computation into different processes, then I continue to divide the paths in each process into multiple threads. After all threads done ( using thread join to wait for all threads), then I combine those results as the final results for this process.  After done for all processes, then I combine those results coming out of processes by using MPI reduce operation to get final result of Monte Carlo simulation.

The code for MPI and initilization of monte carlo simulation as follows:

int main(int argc, char** argv)
    std::chrono::duration<double> elapsed_seconds_0;
    auto start_0 = std::chrono::high_resolution_clock::now();
    /* MPI Init*/
    int proc_id = 0;
    int num_procs = 1;
    int namelen;
    float result_each_proc;
    float result_sum_global;
    char proc_name[MPI_MAX_PROCESSOR_NAME];
    MPI_Init( &argc, &argv );
    MPI_Comm_size( MPI_COMM_WORLD, &num_procs );
    MPI_Comm_rank( MPI_COMM_WORLD, &proc_id );
    MPI_Get_processor_name( pro_name, &namelen );
    int num_path_per_process = SIM_NUM_PATH / num_procs;
    Instrument Stock(100, 0.5, 0.01);   
    //EuropeanCallOptionPayOff CallPayOff(100);
    //DerivativeInstrument<EuropeanCallOptionPayOff> DerivedInstr(0.5, &Stock, &CallPayOff);
    //MonteCarlo<EuropeanCallOptionPayOff> MonteCalc(STEP_NUM, num_path_per_process, &DerivedInstr,NUM_THREAD);
   // EuropeanPutOptionPayOff PutPayOff(100);
   // DerivativeInstrument<EuropeanPutOptionPayOff> DerivedInstr(0.5, &Stock, &PutPayOff);
   // MonteCarlo<EuropeanPutOptionPayOff> MonteCalc(STEP_NUM, num_path_per_process, &DerivedInstr,NUM_THREAD);   
   // EuropeanBarrierDownOutCallOptionPayOff BarrierOutPayOff( 100, 100);
   // DerivativeInstrument<EuropeanBarrierDownOutCallOptionPayOff> DerivedInstr(0.5, &Stock, &BarrierOutPayOff);
   // MonteCarlo<EuropeanBarrierDownOutCallOptionPayOff> MonteCalc(STEP_NUM, num_path_per_process, &DerivedInstr,NUM_THREAD);   
    //EuropeanBarrierDownInCallOptionPayOff BarrierInPayOff( 100, 100);
    //DerivativeInstrument<EuropeanBarrierDownInCallOptionPayOff> DerivedInstr(0.5, &Stock, &BarrierInPayOff);
    //MonteCarlo<EuropeanBarrierDownInCallOptionPayOff> MonteCalc(STEP_NUM, num_path_per_process, &DerivedInstr,NUM_THREAD);   
    AsianCallOptionPayOff AsianPayOff( 100 );
    DerivativeInstrument<AsianCallOptionPayOff> DerivedInstr(0.5, &Stock, &AsianPayOff);
    MonteCarlo<AsianCallOptionPayOff> MonteCalc(STEP_NUM, num_path_per_process, &DerivedInstr,NUM_THREAD);   
    std::chrono::duration<double> elapsed_seconds;
    auto start = std::chrono::high_resolution_clock::now();
    elapsed_seconds_0 = start - start_0;
    double dRes = MonteCalc.CalcMultiThreadPlainMC();
    MPI_Allreduce ( &result_each_proc, &result_sum_global, 1, MPI_FLOAT, MPI_SUM, MPI_COMM_WORLd );
    float final_result = result_sum_global / num_procs;
    auto end = std::chrono::high_resolution_clock::now();
    elapsed_seconds = end - start;
    if ( 0 == proc_id )
        std::cout << "MPI Initial" << num_procs << " processes spend" << elapsed_seconds_0.count() << "second" << std::endl;
        std::cout << SIM_NUM_PATH << "number of " << STEP_NUM << "steps of simulation takes " << elapsed_seconds.count()<<"second" << std::endl;
    return 0;

The following table shows time spent for using different number of process and thread in a 2.2GHz 32 core machine.

Thread/Process        1         2         5        10         20         32
           1    149.176     74.785     29.726      14.916      10.351    10.136
           2     74.698     37.351     15.561      10.419      8.538      7.591
           5     29.803     15.040       9.500      7.609      7.530      6.926
           10     11.967      9.035      7.298      7.008      6.850      6.757
           16     12.470      9.454      7.151      6.822      6.880      6.758

From the performance table, we can conclude that adding either the number of process or the number of thread can increase the performance until reaching a certain limit. The host machine is 32 core machine, and we can not reach the limit by just using 32 process, and adding thread definitely improve the performance. In addition, the time to initialize MPI multi process increase as the number of the process increase.

The following table shows time spent for using different number of process and thread in a 2.2GHz 32 core machine.

Process                        1          2                 5          10      20         32
Time to
initialize processes         0.049        0.052       0.063       0.165     0.855        1.973

Monte Carlo Simulation Framework

Monte Carlo Simulation are a broad class of computational algorithms that rely on repeated random sampling to obtain numerial results; typically one runs simulations many times over in order to obtain the distribution of unknown probabilistic entity. Monte Carlo Simulation can be used in numerical integration to calculate expected average, and this framework is designed to calculate expected average of financial derivative prices. The code example in this article use a struct to the stochastic underlying price the as the state variable. Users of this framework certainly can add more variable in this structure, in order to expand to multi-dimension with multiple random process with certain correlation variance. I use vector of State to define a path, and vector of paths to define all paths coming out of Monte Carlo Simulation.

struct State
    State( float price ) : m_price ( price ) {}
    float m_price;
typedef std::vector< State > Path;

The stochastic underlying process is coded here to follow the geometric Brownian motion with a risk free interest rate drift term and a stochastic term. The stochastic term dictate that the percentage change from previous state price comes from a random sample from a normal distribution with mean zero and the variance being underlying price variance multiply the time change.

        for (int num = 0; num < lSimNum; num++)
            pVecPath = &(m_vecPaths[pathindex+num]);
            (*pVecPath)[0].m_price = InitialPrice;
            S = InitialPrice;
            fPathSum = 0;
            for ( int i = 1; i < m_iSteps; i++ )
                bPathValid = true;
                number = distribution(generator);
                volterm = voldt* number;
                S = S * exp(drift + volterm);
                (*pVecPath)[i].m_price = (*pVecPath)[i-1].m_price * exp( drift + volterm );               

Different kinds of financial derivative have different types of payoff. The payoff types I code here are path independent option like vanilla call or put option, path check barrier like barrier option, and average underlying price along the path. like asian option. I use static inheritance, in particular, curiously recurring template pattern to make the framework flexible enough to price any type of derivative with different kinds of payoff, while keeping high performance, and staying away from virtual function cost if it were to implement as run time dynamic inheritance.

The actual implementation of the base payoff class is as follows:

template < class Derived>
class PayOff
    float CalcPayOff(float dUnderPrice)
        return static_cast <Derived *> (this)->CalcPayOff(dUnderPrice);
    bool CheckPathBarrier( float fPrice )
        return static_cast < Derived * >(this)->CheckPathBarrier( fPrice );
    void CalcSumPath( float fPrice, float * fRet )
        return static_cast< Derived * > (this)->CalcSumPath( fPrice, fRet );
    PayOffType m_type;

The follwing code shows how a  European call option inheritate a payoff base class.

class EuropeanCallOptionPayOff : public PayOff <EuropeanCallOptionPayOff >
    EuropeanCallOptionPayOff(float Strike) : m_strike(Strike)
        m_type = PathIndependent;
    float CalcPayOff(float dPrice)
        if (dPrice > m_strike)
            return dPrice - m_strike;
            return 0;
    float m_strike;

Derivative class is a template class with payoff as the template argument. The code is as follows:

template<typename DerivatedPayOff>
class DerivativeInstrument
    DerivativeInstrument(float Time, Instrument * pIns, PayOff<DerivatedPayOff> * pPO) : dTimeToMature(Time), pInstrument(pIns), pPayOff(pPO)
    float dTimeToMature;
    Instrument * pInstrument;
    PayOff<DerivatedPayOff> * pPayOff;

Points of Interest

When you download code, you can download openmpi head and library, and set up them correctly in make file, in order to make the code work. Let me know if you have any questions


08/31/2014,  initial publication

9/01/2014, Add code into this article

10/122014, Modify paper and code in order to generalize this framework


This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


About the Author

Software Developer (Senior)
United States United States
I have been developing low latency high throughput applications, services and platforms in financial trading industry since 2004, mostly in C++, some in and Java.

You may also be interested in...


Comments and Discussions

GeneralMy vote of 4 Pin
CatchExAs14-Oct-14 1:37
professionalCatchExAs14-Oct-14 1:37 
GeneralRe: My vote of 4 Pin
PengHeProfessor18-Dec-14 13:50
professionalPengHeProfessor18-Dec-14 13:50 
GeneralRe: My vote of 4 Pin
CatchExAs19-Dec-14 1:36
professionalCatchExAs19-Dec-14 1:36 
GeneralRe: My vote of 4 Pin
PengHeProfessor27-Dec-14 4:57
professionalPengHeProfessor27-Dec-14 4:57 
QuestionNot a Framework Pin
Stefan_Lang8-Sep-14 23:22
memberStefan_Lang8-Sep-14 23:22 
AnswerRe: Not a Framework Pin
PengHeProfessor9-Sep-14 17:39
professionalPengHeProfessor9-Sep-14 17:39 
GeneralRe: Not a Framework Pin
Stefan_Lang9-Sep-14 20:31
memberStefan_Lang9-Sep-14 20:31 
QuestionCannot unzip source Pin
geoyar8-Sep-14 10:40
membergeoyar8-Sep-14 10:40 
AnswerRe: Cannot unzip source Pin
Stefan_Lang8-Sep-14 23:24
memberStefan_Lang8-Sep-14 23:24 
GeneralRe: Cannot unzip source Pin
geoyar9-Sep-14 10:40
membergeoyar9-Sep-14 10:40 
GeneralRe: Cannot unzip source Pin
Stefan_Lang9-Sep-14 20:48
memberStefan_Lang9-Sep-14 20:48 

General General    News News    Suggestion Suggestion    Question Question    Bug Bug    Answer Answer    Joke Joke    Praise Praise    Rant Rant    Admin Admin   

Use Ctrl+Left/Right to switch messages, Ctrl+Up/Down to switch threads, Ctrl+Shift+Left/Right to switch pages.

| Advertise | Privacy | Terms of Use | Mobile
Web02 | 2.8.161021.1 | Last Updated 2 Sep 2014
Article Copyright 2014 by PengHeProfessor
Everything else Copyright © CodeProject, 1999-2016
Layout: fixed | fluid