A Time-series Forecasting Library in C#

1 Jul 2010
Class of functions that accept time-series data and return forecast values and error analysis, with allowance made for holdout set testing and n-period extension.

Introduction

This code provides a basic set of functions which accept an array of time-series values (the demonstration form parses these from a comma-delimited string), the number of periods into the future to extend a forecast, and a number of periods to include in a "holdout set" for additional testing (e.g. verifying forecasted values against observed occurrences without prior knowledge of the actuals).

Background

Time-series forecasting methods use only historical information to produce estimates of future values. These techniques assume the data's past pattern will continue in the future, and they include specific measures of error which help users understand how accurate the forecast has been. Some techniques seek to identify underlying patterns such as trend or seasonal adjustments; others attempt to be self-correcting by including calculations about past periods' error in future forecasts.

The code included here addresses several of the most common time-series forecasting techniques, including naive, simple moving average, weighted moving average, exponential smoothing, and adaptive rate smoothing.

In the naive approach, the current period's value is used as the forecast for the upcoming period.
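
As a minimal sketch of that rule (not the library's own naive() function, whose source is not shown here), the hypothetical NaiveSketch.Forecast helper below shifts the series forward by one period. Seeding the first forecast with the first observed value is an assumption borrowed from how the simpleMovingAverage listing initializes its first row.

```csharp
using System;

// Hypothetical helper for illustration only; not part of the article's TimeSeries class.
public static class NaiveSketch
{
    // Returns one-step-ahead forecasts where forecast[t] = values[t - 1].
    // The result has one extra slot for the first unobserved period.
    public static decimal[] Forecast(decimal[] values)
    {
        if (values == null || values.Length == 0)
            throw new ArgumentException("At least one value is required.", nameof(values));

        decimal[] forecast = new decimal[values.Length + 1];
        forecast[0] = values[0]; // assumption: seed the first row with its own value
        for (int t = 1; t <= values.Length; t++)
            forecast[t] = values[t - 1];
        return forecast;
    }
}
```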

In a simple moving average, the prior n values are averaged together with equal weight to produce a value for the upcoming period.
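
The arithmetic can be sketched on a plain array, independent of the library's table-based implementation. The SmaSketch class and its NextForecast method are hypothetical names; averaging over fewer than n values when the series is short is an assumption made to keep the sketch total.

```csharp
using System;

// Hypothetical helper for illustration only; not part of the article's TimeSeries class.
public static class SmaSketch
{
    // Averages the last `periods` observed values with equal weight.
    public static decimal NextForecast(decimal[] values, int periods)
    {
        if (values == null || values.Length == 0)
            throw new ArgumentException("At least one value is required.", nameof(values));
        if (periods < 1)
            throw new ArgumentOutOfRangeException(nameof(periods));

        // Assumption: if fewer than `periods` values exist, average what is available.
        int n = Math.Min(periods, values.Length);
        decimal sum = 0m;
        for (int i = values.Length - n; i < values.Length; i++)
            sum += values[i];
        return sum / n;
    }
}
```

For example, with values 10, 12, 14, 16 and n = 3, the forecast is (12 + 14 + 16) / 3 = 14.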

In a weighted moving average, a weight is applied to each of the n prior values: each value is multiplied by its weight, and the products are summed to produce a value for the upcoming period. By applying different weights (which sum to 1.0), past periods can be given different emphasis. If the user wishes to place more weight upon earlier periods, they might use weights of 0.5, 0.3, and 0.2 (listed from earliest to most recent). If the user wishes to place more weight on more recent periods, they might reverse the order of those parameters.
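
A short sketch of the weighted sum follows. WmaSketch.NextForecast is a hypothetical helper, and it assumes the weights are supplied from the earliest period to the most recent, which is the reading under which 0.5, 0.3, 0.2 emphasizes earlier periods. The library's weightedMovingAverage() may order its weights differently.

```csharp
using System;

// Hypothetical helper for illustration only; not part of the article's TimeSeries class.
public static class WmaSketch
{
    // weights[0] applies to the earliest of the last n values,
    // weights[n - 1] to the most recent (an assumed ordering).
    public static decimal NextForecast(decimal[] values, decimal[] weights)
    {
        if (values == null || weights == null || weights.Length == 0)
            throw new ArgumentException("Values and weights are required.");
        if (weights.Length > values.Length)
            throw new ArgumentException("More weights than values.");

        decimal weightSum = 0m;
        foreach (decimal w in weights) weightSum += w;
        if (Math.Abs(weightSum - 1m) > 0.0001m)
            throw new ArgumentException("Weights must sum to 1.0.");

        int offset = values.Length - weights.Length;
        decimal forecast = 0m;
        for (int i = 0; i < weights.Length; i++)
            forecast += weights[i] * values[offset + i];
        return forecast;
    }
}
```

With values 100, 110, 120, the weights 0.2, 0.3, 0.5 give 20 + 33 + 60 = 113, while 0.5, 0.3, 0.2 give 50 + 33 + 24 = 107.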

Exponential smoothing is a version of the weighted moving average which gives recent values more weight than earlier values. However, unlike the weighted moving average, it requires only three inputs: the prior period's forecast, the current period's value, and a smoothing factor (alpha), with a value between 0 and 1.0.
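
The three inputs combine in the standard recurrence F(t+1) = alpha * D(t) + (1 - alpha) * F(t). The sketch below applies it to a plain array; ExpSmoothingSketch is a hypothetical name, and seeding the first forecast with the first observed value is an assumption rather than a documented behavior of the library's exponentialSmoothing().

```csharp
using System;

// Hypothetical helper for illustration only; not part of the article's TimeSeries class.
public static class ExpSmoothingSketch
{
    // Returns forecasts for every observed period plus one period ahead.
    public static decimal[] Forecast(decimal[] values, decimal alpha)
    {
        if (values == null || values.Length == 0)
            throw new ArgumentException("At least one value is required.", nameof(values));
        if (alpha < 0m || alpha > 1m)
            throw new ArgumentOutOfRangeException(nameof(alpha), "Alpha must be between 0 and 1.");

        decimal[] forecast = new decimal[values.Length + 1];
        forecast[0] = values[0]; // assumption: seed with the first observation
        for (int t = 0; t < values.Length; t++)
        {
            // New forecast = alpha * current value + (1 - alpha) * prior forecast
            forecast[t + 1] = alpha * values[t] + (1 - alpha) * forecast[t];
        }
        return forecast;
    }
}
```

With values 10, 20, 30 and alpha = 0.5, the forecasts are 10, 10, 15, and 22.5 for the next period. A larger alpha tracks recent values more closely, while a smaller one smooths more heavily.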

Adaptive rate smoothing extends the exponential smoothing technique by recalculating the alpha smoothing parameter for each period's forecast from a Tracking Signal. The technique has been shown to respond much more quickly to step changes while retaining the ability to filter out random noise. For a full discussion of this technique, see Trigg and Leach, 1967 (reference below).
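
One common formulation of this idea sets alpha to the absolute ratio of a smoothed error to a smoothed absolute error. The sketch below follows that formulation and should be read as an illustration under stated assumptions, not as the library's adaptiveRateSmoothing(). The class name, the beta parameter (the smoothing constant for the two error series), and initialAlpha (used until a nonzero error appears) are all hypothetical.

```csharp
using System;

// Hypothetical helper for illustration only; not part of the article's TimeSeries class.
public static class AdaptiveRateSketch
{
    public static decimal[] Forecast(decimal[] values, decimal beta, decimal initialAlpha)
    {
        if (values == null || values.Length == 0)
            throw new ArgumentException("At least one value is required.", nameof(values));

        decimal[] forecast = new decimal[values.Length + 1];
        forecast[0] = values[0];       // assumption: seed with the first observation
        decimal smoothedError = 0m;    // exponentially smoothed signed error
        decimal smoothedAbsError = 0m; // exponentially smoothed absolute error
        decimal alpha = initialAlpha;

        for (int t = 0; t < values.Length; t++)
        {
            decimal error = values[t] - forecast[t];
            smoothedError = beta * error + (1 - beta) * smoothedError;
            smoothedAbsError = beta * Math.Abs(error) + (1 - beta) * smoothedAbsError;

            // Tracking signal as the new alpha; it always lies between 0 and 1.
            // Keep the previous alpha while no error has been observed yet.
            if (smoothedAbsError != 0m)
                alpha = Math.Abs(smoothedError / smoothedAbsError);

            forecast[t + 1] = alpha * values[t] + (1 - alpha) * forecast[t];
        }
        return forecast;
    }
}
```

With values 10, 10, 20, beta = 0.2 and initialAlpha = 0.3, the first two errors are zero, so the forecast stays at 10. The step to 20 drives the tracking signal to 1, and the next forecast jumps straight to 20, which is the fast response to step changes described above.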

Using the Code

The code provided consists of a class which contains the analysis functions themselves and a demonstration form, which takes comma-separated values and displays them in a grid with an associated forecast and measures of instance-specific error. Five buttons provide samples for how the function libraries are to be called and pass parameters which, though reasonable, should be modified to suit the user's purpose. In addition to the tested set and error, the grid displays forecasted values for a number of periods into the future. Finally, several specific measures of error are calculated and displayed in labels. Users of these functions should have a solid grasp of what the individual error measures indicate in order to properly interpret both these functions and the results.

Some terminology I have used in the functions which may not be immediately obvious to the reader:

  • Extension - An integer number of periods in the future into which the function should attempt to produce forecasts. The further into the future (higher the number) we attempt to forecast, the higher the probability of error.
  • Holdout Set - An integer number of periods to withhold from the testable set of observed values (from the end). The functions calculate forecasts for these values without looking at the observed values until after the forecast is generated. In this way, forecasts for a number of periods may be verified against observed values without the inconvenience of having to wait for future periods to occur.
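
To make the two terms concrete, the sketch below labels each row of a result set as part of the testable set, the holdout set, or the extension. It follows the definitions above (the last Holdout observed values are withheld, and Extension rows follow the observed data). The PartitionSketch class and its Classify method are hypothetical and do not appear in the library.

```csharp
using System;

// Hypothetical helper for illustration only; not part of the article's TimeSeries class.
public static class PartitionSketch
{
    // Returns one label per row: observed rows first, then extension rows.
    public static string[] Classify(int observedCount, int holdout, int extension)
    {
        if (observedCount < 0 || extension < 0 || holdout < 0 || holdout > observedCount)
            throw new ArgumentOutOfRangeException();

        string[] labels = new string[observedCount + extension];
        for (int i = 0; i < labels.Length; i++)
        {
            if (i >= observedCount)
                labels[i] = "Extension";   // no observed value exists yet
            else if (i >= observedCount - holdout)
                labels[i] = "Holdout";     // observed, but withheld for verification
            else
                labels[i] = "Test";        // observed and used directly
        }
        return labels;
    }
}
```

For 5 observed values with a holdout of 2 and an extension of 3, rows 0 to 2 are "Test", rows 3 and 4 are "Holdout", and rows 5 to 7 are "Extension".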

Generally, these functions can be called by passing in a decimal array of time-series data (examples provided), an integer extension, and an integer holdout. Other parameters are documented in the code.

C#
// Pass an array of time-series data {1,2,3} and get back a table of forecasts and error.
// The arguments after the array are Extension = 5, Periods = 3, and Holdout = 0.
ForecastTable dt = TimeSeries.simpleMovingAverage(new decimal[] {1, 2, 3}, 5, 3, 0);
grdResults.DataSource = dt;

The simpleMovingAverage result (like the other functions in this class) is calculated according to a well-known formula, which is included in the comments.

C#
//
//Simple Moving Average
//
//            ( Dt + D(t-1) + D(t-2) + ... + D(t-n+1) )
//  F(t+1) =  -----------------------------------------
//                              n
public static ForecastTable simpleMovingAverage(decimal[] values, 
	int Extension, int Periods, int Holdout)
{
    ForecastTable dt = new ForecastTable();
    for (Int32 i = 0; i < values.Length + Extension; i++)
    {
        //Insert a row for each value in set
        DataRow row = dt.NewRow();
        dt.Rows.Add(row);
        row.BeginEdit();
        //assign its sequence number
        row["Instance"] = i;
        if (i < values.Length)
        {//processing values which actually occurred
            row["Value"] = values[i];
        }
        //Indicate if this is a holdout row (one of the last Holdout observed values)
        row["Holdout"] = (i >= (values.Length - Holdout)) && (i < values.Length);
        if (i == 0)
        {//Initialize first row with its own value
            row["Forecast"] = values[i];
        }
        else if (i < values.Length - Holdout)
        {//processing values which actually occurred, but not in holdout set
            decimal avg = 0;
            DataRow[] rows = dt.Select("Instance>=" + (i - Periods).ToString() + 
		" AND Instance < " + i.ToString(), "Instance");
            foreach (DataRow priorRow in rows)
            {
                avg += (Decimal)priorRow["Value"];
            }
            avg /= rows.Length;
            row["Forecast"] = avg;
        }
        else
        {//must be in the holdout set or the extension
            decimal avg = 0;
            //get the Periods-prior rows and calculate an average actual value
            DataRow[] rows = dt.Select("Instance>=" + (i - Periods).ToString() + 
		" AND Instance < " + i.ToString(), "Instance");
            foreach (DataRow priorRow in rows)
            {
                if ((Int32)priorRow["Instance"] < values.Length)
                {//in the test or holdout set
                    avg += (Decimal)priorRow["Value"];
                }
                else
                {//extension, use forecast since we don't have an actual value
                    avg += (Decimal)priorRow["Forecast"];
                }
            }
            avg /= rows.Length;
            //set the forecasted value
            row["Forecast"] = avg;
        }
        row.EndEdit();
    }
    dt.AcceptChanges();
    return dt;
}

Each of the forecasting functions (naive(), simpleMovingAverage(), weightedMovingAverage(), exponentialSmoothing(), and adaptiveRateSmoothing()) works in a similar manner: it initializes early rows with default values (since prior data is not yet available), then calculates a forecast for each value in the testable set. Finally, holdout and extension values are calculated.

Error analysis is accomplished through the implementation of several measures: MeanSignedError(), MeanAbsoluteError(), MeanPercentError(), MeanAbsolutePercentError(), TrackingSignal(), MeanSquaredError(), CumulativeSignedError(), and CumulativeAbsoluteError().
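
Several of these measures can be sketched from their usual textbook definitions on plain arrays. The method names below mirror the article's for readability, but the array-based signatures, the ErrorMeasuresSketch class, and the sign convention (error = actual minus forecast) are assumptions; the library's versions operate on a ForecastTable and may differ in detail.

```csharp
using System;

// Hypothetical helper for illustration only; not part of the article's TimeSeries class.
public static class ErrorMeasuresSketch
{
    // error[t] = actual[t] - forecast[t] (assumed sign convention)
    private static decimal[] Errors(decimal[] actual, decimal[] forecast)
    {
        if (actual == null || forecast == null
            || actual.Length != forecast.Length || actual.Length == 0)
            throw new ArgumentException("Arrays must be non-empty and of equal length.");
        decimal[] errors = new decimal[actual.Length];
        for (int t = 0; t < actual.Length; t++)
            errors[t] = actual[t] - forecast[t];
        return errors;
    }

    public static decimal CumulativeSignedError(decimal[] actual, decimal[] forecast)
    {
        decimal sum = 0m;
        foreach (decimal e in Errors(actual, forecast)) sum += e;
        return sum;
    }

    public static decimal MeanAbsoluteError(decimal[] actual, decimal[] forecast)
    {
        decimal sum = 0m;
        foreach (decimal e in Errors(actual, forecast)) sum += Math.Abs(e);
        return sum / actual.Length;
    }

    public static decimal MeanSquaredError(decimal[] actual, decimal[] forecast)
    {
        decimal sum = 0m;
        foreach (decimal e in Errors(actual, forecast)) sum += e * e;
        return sum / actual.Length;
    }

    // Tracking signal = cumulative signed error / mean absolute error.
    // Returns 0 when every forecast was exact (an assumed convention).
    public static decimal TrackingSignal(decimal[] actual, decimal[] forecast)
    {
        decimal mae = MeanAbsoluteError(actual, forecast);
        return mae == 0m ? 0m : CumulativeSignedError(actual, forecast) / mae;
    }
}
```

With actual values 100, 110, 120 and forecasts 90, 115, 120, the errors are 10, -5, and 0: the cumulative signed error is 5, the mean absolute error is 5, and the tracking signal is 1. A signed measure near zero alongside a large absolute measure suggests errors that cancel out rather than a forecast that is accurate.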

One of the most useful measures of forecast error is the MeanAbsolutePercentError (MAPE). This value is calculated by summing the absolute value of the percent error for each period's forecast and dividing by the number of periods tested.
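
As a worked illustration of that calculation on plain arrays, the hypothetical MapeSketch helper below expresses the result as a fraction (0.1 means 10%). Both the fractional scale and the choice to skip periods whose actual value is zero, where percent error is undefined, are assumptions.

```csharp
using System;

// Hypothetical helper for illustration only; not part of the article's TimeSeries class.
public static class MapeSketch
{
    public static decimal MeanAbsolutePercentError(decimal[] actual, decimal[] forecast)
    {
        if (actual == null || forecast == null || actual.Length != forecast.Length)
            throw new ArgumentException("Arrays must be of equal length.");

        decimal sum = 0m;
        int count = 0;
        for (int t = 0; t < actual.Length; t++)
        {
            if (actual[t] == 0m) continue; // percent error is undefined for a zero actual
            sum += Math.Abs((actual[t] - forecast[t]) / actual[t]);
            count++;
        }
        if (count == 0)
            throw new InvalidOperationException("No periods with a nonzero actual value.");
        return sum / count;
    }
}
```

With actual values 100 and 200 and forecasts 90 and 220, the absolute percent errors are 10/100 = 0.1 and 20/200 = 0.1, so the MAPE is 0.1.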

C#
//MeanAbsolutePercentError = Sum( |PercentError| ) / n
public static decimal MeanAbsolutePercentError
    (ForecastTable dt, bool Holdout, int IgnoreInitial)
{
    //Only rows with a computed error whose Instance is greater than IgnoreInitial
    string Filter = "AbsolutePercentError Is Not Null AND Instance > "
        + IgnoreInitial.ToString();
    //Optionally restrict the measure to the holdout set
    if (Holdout)
        Filter += " AND Holdout=True";
    //No qualifying rows: return 1 rather than averaging an empty set
    if (dt.Select(Filter).Length == 0)
        return 1;
    return (Decimal)dt.Compute("AVG(AbsolutePercentError)", Filter);
}

References

  1. Krajewski and Ritzman, Operations Management: Processes and Value Chains, 7th edition (2004), Pearson Prentice Hall, Upper Saddle River, NJ, pp. 535-581
  2. Trigg and Leach, "Exponential Smoothing with an Adaptive Response Rate", Operational Research, Vol. 18, No. 1 (Mar. 1967), pp. 53-59

History

This code has not been thoroughly tested and may contain bugs. It is intended to be instructive about forecasting techniques and should not be relied upon for actual forecasts used in decision-making. It has not been optimized for performance or efficiency and the code as it is written is not particularly elegant. My goal was to make it as easy to read and understand as possible, so that others can create their own functions which implement the concepts of time-series forecasting. That said, I do welcome constructive feedback if you see a bug or some glaring omission, or perhaps you feel I could have explained something more clearly.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Software Developer
United States
I learned my first programming language--Apple Basic--in 1989. Over the years, I've done projects in C/C++, Pascal, FoxPro, 4D, AS/400, dBase, perl/CGI, Access, MSSQL, VB5-6/COM+, Classic ASP, uncounted legions of Windows, Mac, and *nix scripting languages and now C# ASP.NET/MVC/Razor/jQuery.

I started getting paid to do this stuff in 1993 as the admin for a 100 node network and began writing one-off apps for the company in my spare time. By 2002, I was developing and managing enterprise software projects full time.

I sat through so many Microsoft classes that they should offer me an honorary MCSE, MCSD, and a bunch of other letters, plus name something on campus after me. I took an undergraduate degree in Management & Business Information Systems, earned an MBA, and I hold a PMP credential, though trying to bring projects to successful completion (as opposed to tracking processes into perpetuity) using the PMBOK is like trying to get to a nice restaurant in a big city by reading a book about its architecture. But, I digress...

I've worked in the building materials industry supporting wholesale trading since 1993. I maintain an unhealthy level of interest in exchange-based and cash forward trading, derivatives, simulation, forecasting, project management, and other quantitative analysis topics (e.g. queue theory, optimal inventory policy, etc.). Most recently, I finished a Systems Science certificate in Computer Modeling & Simulation.

Comments and Discussions

 
Re: Needs more
Kerry Cakebread, 1-Jul-10 6:14
Mark - thanks for the feedback. I've modified the article to include one of the forecasting functions (Simple Moving Average) as well as one of the error measurement functions (Mean Absolute Percent Error). I think the article would become unmanageably large if I included the source for each function with discussion of each. Hopefully now, with the inclusion of examples, the article remains concise, allowing the reader to explore more about the individual functions by reading the comments in the downloadable code.