Get Control of Your Service System: A Practical Introduction to Queueing Theory

Kerry Cakebread

4.44/5 (9 votes)

Jul 3, 2010

CPOL

6 min read

25941

414

Provides background and overview of queueing theory and a class library which implements measurement functions

Download demo - 12.8 KB

Introduction

Anyone who has spent time at a doctor's office or been to the bank has spent time waiting in line. In fact, waiting lines are an intricate part of everyday life. To some extent, they reflect the egalitarianism of modern society. Customers stand in line to purchase goods, callers wait for "the next available representative", inventory stacks up waiting to be processed or sold, and internet browsers wait for web servers and routers to process their requests.

When resources are too expensive to be individualized, one reasonable approach is to take turns. For the most part, however, we only notice the lines in front of a resource when they are excessively long or when they inconvenience us by wasting our time ("Your call is important to us so please remain on the line") or money (in the form of excessive inventory or personnel costs, for example).

Many organizations have actively taken charge of their service systems and deliberately manage their queues to provide an optimal level of service, one which meets or exceeds customer expectations but without incurring unnecessary costs (e.g., overstaffing). Even despite variations in the number of arrivals and time-spent-per-customer in a system, it is possible to calculate measures which characterize customer experiences waiting for service and help a system manager minimize investment in labor or capital equipment. Organizations that analyze these measures are often able to react faster to changes. Also, they have reasonable information available about the likely effects of service-level decisions, for example, when budget constraints cause top management to suggest a targeted (i.e., reduced) level of service. Likewise, they have more accurate expectations under rapid growth conditions.

One of the most useful properties of queue modeling is how it gracefully scales from small (sub)systems to larger ones that contain many smaller systems. It is also noteworthy that a framework exists for analyzing networks of queues, each with its own properties, which form a larger system.

You will see in this article a model of how customers queue can be a tremendously important tool to help eliminate waste and provide customer value. Additionally, I will provide and discuss sample source code for a utility application and a .NET class you can use to do much of the analysis useful in optimizing a basic queueing system.

Background

The origins of queueing theory lie in a paper published by Erlang in 1909, though a common notation for problems of this type was not available until David Kendall's work in 1953. In 1961, further insight into the mathematical analysis was codified as "Little's Law" through the work of John Little at MIT.

The theory was developed originally to deal with the problem of telephone operator availability when a telephone customer wished to place a call. Although it took only a few seconds for an operator to make a connection, new requests often arrived at the switchboard faster than they could be served, resulting in a failed dialing attempt. As a result, telephone networks needed more and more operators on duty to service the volume. However, with the advent of queueing systems and analysis, the switchboard was able to place callers in a queue and fulfill their dialing request as soon as an operator was available. The telephone company could then decide on an appropriate level of service and staff its switchboards accordingly. The same dialing paradigm is still in use today, although the circuits are digital and the queues virtual.

However, despite the apparent maturity of these models and their wide use in operations and supply chain management, modeling these queues is unfamiliar to many in the IT profession.

A queueing system is characterized by its input population (a waiting line of customers), the service facility (employees, machines, etc.), and a priority rule that specifies the order in which customers are to be served (FIFO, LIFO, time-slicing, etc.). Queueing analysis uses a particular form of state equations known as Markov chains to model the system in each state. Incoming traffic to these systems is modeled via a Poisson distribution, subject to Erlang's queueing theory assumptions:

Arrivals and departures are random and independent events
Probabilities within the system do not change
All incoming traffic can be routed to any other server within the network
Congestion is cleared as soon as servers are free

Of course, in reality, these assumptions (and others, implicit in the models) do not always hold 100% true, but the differences between real world and theory can often be set aside safely because they are not statistically significant. In the cases that do not fall under the scope of mathematical queueing theory, alternative means of analysis (such as simulation) have been developed.

It is convenient within queueing theory to work with probability distributions which exhibit a memory-less property, so that the mathematics involved can be simplified. Hence, queueing models are frequently developed as Poisson processes through the use of exponential distribution.

Typically, a queue is described by two letters and a number which specifies a distribution for arrival times, a distribution for the service process, and the number of servers in the system, though other codes may be appended to specify additional characteristics. The most common types of arrival and service distributions are Markovian (M, which implies an exponential distribution and a Poisson process), Deterministic (D, implying constant arrivals or service times), or General (G). Thus, a queue having Markovian arrival times, Deterministic service times, and a single server would be described by the abbreviation M/D/1.

Using the code

Included with this article is a sample source code for a .NET project that will do much of the analysis useful in optimizing a basic queueing system. The queueing library could be called by a helpdesk activity report, included in a decision-support application, or simply used with the accompanying demo interface to experiment.

The class supports both single and multiple server models (determined by the servers parameter passed to each method). It also supports a finite-source model, in which there is a known quantity of customers expected to enter the system (rare). The functions determine whether to use finite-source models based on the numberOfCustomers parameter passed to most methods.

The public methods included in the Queue class are:

avgSystemUtilization() - the average utilization of the system (how busy it is)
probNCustomersInSystem() - the probability that N customers are in the system at any given time
probZeroCustomersInSystem() - the probability that the system is empty at any given time
avgCustomersInSystem() - the average number of customers in the system at any given time
avgCustomerInLine() - the average number of customers waiting in line (not yet served) at any given time
avgTimeInSystem() - the average time it takes for a customer to get through the system (including service time)
avgTimeInLine() - the average time it takes to get to the front of the line

These methods can be called, for example, as follows:

Double avgCustsInLine = Queue.avgCustomersInLine(arrivalRate, 
                        serviceRate, servers, numberOfCustomers);

This call invokes the avgCustomersInLine() function, shown here:

public static double avgCustomersInLine(double arrivalRate, 
       double serviceRate, int servers, int numberOfCustomers)
{
    if (servers <= 0)
        throw new Exception("You must pass in a positive number of servers");
    //terminate with an error if queue is infeasible
    if (arrivalRate >= (servers * serviceRate))
    {
        throw new Exception("Infeasible queue");
    }
    if (numberOfCustomers > 0)
    {//finite source model
        double probZeroInSys = probZeroCustomersInSystem(arrivalRate, 
                               serviceRate, servers, numberOfCustomers);
        return numberOfCustomers - (((arrivalRate + serviceRate) / 
                                      arrivalRate) * (1 - probZeroInSys));
    }
    else
    {
        double avgUtilization = avgSystemUtilization(arrivalRate, 
                                serviceRate, servers, numberOfCustomers);
        if (servers == 1)
        {//single server model
            double avgCustInSys = avgCustomersInSystem(arrivalRate, 
                                  serviceRate, servers, numberOfCustomers);
            return avgUtilization * avgCustInSys;
        }
        else
        {//multiple server model
            double probZeroInSys = probZeroCustomersInSystem(arrivalRate, 
                                   serviceRate, servers, numberOfCustomers);
            return (probZeroInSys * Math.Pow((arrivalRate / serviceRate), 
                    servers) * avgUtilization) /
                (fact(servers) * Math.Pow(1 - avgUtilization, 2));
        }
    }
}

Selected references and further reading

Cooper, Robert B. Introduction to Queueing Theory, 2^nd ed. New York: Elsevier-North Holland, 1980.
Little, J.D.C. "A Proof for the Queueing Formula: L=lambda*W." Operations Research, vol. 9, (1961), pp 383-387.
Krajewski and Ritzman, Operations Management Processes and Value Chains 7^th edition (2004), Pearson Prentice Hall, Upper Saddle River, NJ, pp 535-581
Moore, P.M. Queues, Inventories and Maintenance. New York: John Wiley & Sons, 1958.
Saaty, T.L. Elements of Queueing Theory with Applications. New York: McGraw-Hill, 1961.