The Option Pattern

Andrew Shapira

Feb 11, 2007

CPOL

This article is the first in a series about a ubiquitous design pattern - the Option pattern. This pattern is used in most nontrivial programs.

Download source code - 16.0 KB

Introduction

This article is the first in a series about a ubiquitous design pattern called the Option Pattern. In its plain form, termed the simple option pattern, the pattern's distinguishing feature is the manipulation of some set whose cardinality is either 0 or 1. Our article emphasizes two themes:

differentiating the abstract Option pattern from its implementation, and
using the Option pattern explicitly rather than implicitly.

Let's look at an example. Suppose that a C# type MyDictionary (a hash table) has lookup methods, and another type DictionaryTester initiates lookups and prints results. Traditional techniques for achieving this functionality use the option pattern only implicitly, e.g., with null:

class MyDictionary<K,V> {
  ...
  public V GetValueOrDefault(K key)  { ... }
  public bool TryGetValue(K key, out V value) { ... }
}

class DictionaryTester {
  public void Go(
      MyDictionary<double,string> dictionary
    , double key
  )
  {
    string value = dictionary.GetValueOrDefault(key);
    if (value == null) {
      Console.WriteLine("'{0}' is not in the dictionary", key);
    } else {
      Console.WriteLine("'{0}' maps to '{1}'", key, value);
    }
  }
}

The above setup has many problems:

The interfaces of MyDictionary's methods are more complicated than we'd like. Many programmers will require a substantial amount of time to understand what these methods do.
The TryGetValue method returns a system of data (two variables), but TryGetValue has no way to enforce the system's semantics after returning the system to the client. In using two variables, MyDictionary defines an abstract data type without informing the compiler. This has many undesirable consequences. The worst is that clients may accidentally refer to value when TryGetValue returns false. Also, if a client wants to communicate the system to other methods, the client has only bad options - either do the irritating and redundant work of encapsulating the system, or pass two variables around in a cumbersome and error prone way which may not clearly indicate the original semantics. There are also the usual problems that come with a lack of type safety. Some of these problems are also present with the value returned by the GetValueOrDefault method, because in client code, there can be no explicit identification of what a null (default) value means; different pointers that may or may not be null may intermingle and mask the intended semantics.
Using the TryGetValue method is cumbersome. Calling it requires two statements - one for each returned value. In addition, the out construct confuses many programmers.
One cannot, in general, use the GetValueOrDefault method with value types, because GetValueOrDefault indicates that a dictionary entry is not present by using a special value that might be a valid data value. Similarly, if the dictionary may contain null values, then the GetValueOrDefault method cannot be used in general.
For lookups, one may have to call different MyDictionary methods depending on whether one is retrieving value types, reference types, or instances of reference types that are guaranteed to be non-null. Having just one method would be better.

Let's see what might be gained from using the Option pattern explicitly, say, with our type Option<T>:

class MyDictionary<K,V> {
  ...
  public Option<V> TryGetValue(K key)  { ... }
}

class DictionaryTester {
  public void Go(
      MyDictionary<double,string> dictionary
    , double key
  )
  {
    Option<string> entry = dictionary.TryGetValue(key);
    if (entry.IsPresent) {
      Console.WriteLine("'{0}' maps to '{1}'", key, entry.Value);
    } else {
      Console.WriteLine("'{0}' is not in the dictionary", key);
      // The following line would throw an exception.
      // Console.WriteLine("'{0}' maps to '{1}'", key, entry.Value);
    }
  }
}

Here, Option<T> is our generic type which allows clients to explicitly use the Option pattern. The Go method checks entry.IsPresent to see whether the entry option is present, i.e., whether the dictionary contains an entry for key. When entry.IsPresent is true, and only then, entry contains a value (data) and the value can be accessed through entry.Value. Access is safe; the Option<T> type throws an exception if entry.Value is accessed when entry.IsPresent is false. The explicit use of the Option pattern in this second version is simple, safe, clear, and concise. It has virtually none of the problems of the first version, which used the Option pattern only implicitly.
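To make the second version concrete, here is a minimal sketch of how MyDictionary.TryGetValue might be written. It is not the article's implementation: the Option<T> shown is a cut-down stand-in so the sketch compiles on its own, and both the Add method and the use of System.Collections.Generic.Dictionary<K,V> for storage are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;

// Minimal stand-in for the article's Option<T>, included only so that the
// sketch compiles on its own.
public struct Option<T>
{
    readonly bool _valueIsPresent;
    readonly T _value;

    public Option(T value)
    {
        if (value == null) { throw new ArgumentNullException("value"); }
        _valueIsPresent = true;
        _value = value;
    }

    public bool IsPresent { get { return _valueIsPresent; } }

    public T Value
    {
        get {
            if (!_valueIsPresent)
              { throw new InvalidOperationException("Value is not present."); }
            return _value;
        }
    }
}

// Assumption: storage is delegated to Dictionary<K,V>; Add is hypothetical.
public class MyDictionary<K, V>
{
    readonly Dictionary<K, V> _entries = new Dictionary<K, V>();

    public void Add(K key, V value) { _entries.Add(key, value); }

    // One lookup method serves value types and reference types alike.
    public Option<V> TryGetValue(K key)
    {
        V value;
        if (_entries.TryGetValue(key, out value))
          { return new Option<V>(value); }   // present
        return new Option<V>();              // not present
    }
}
```

With this shape, a MyDictionary<string,int> can store 0 as an ordinary value, because the not present state no longer borrows a value from int. Note that this sketch would throw if a null value had been stored, since the stand-in Option<T> rejects null.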

The first version uses two of the most historically popular techniques for using the Option pattern:

using a special value such as null or -1 to indicate the value not present state, and
using two variables, with one being a boolean-valued variable defining the system state as either value present or value not present, and the other variable holding the value when the state is value present.
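
Both techniques appear in the .NET base class library itself. The sketch below is illustrative only: the class and method names are hypothetical, while String.IndexOf and Dictionary<K,V>.TryGetValue are the real library calls that use a special value and two variables, respectively.

```csharp
using System;
using System.Collections.Generic;

static class ImplicitOptionTechniques
{
    // Technique 1: a special value. String.IndexOf returns -1 when the
    // character does not occur, so -1 stands for "not present".
    public static string DescribeIndex(string text, char c)
    {
        int index = text.IndexOf(c);
        if (index == -1) { return "not present"; }
        return "present at " + index;
    }

    // Technique 2: two variables. The bool result carries the state, and
    // the out variable carries the value when the state is "present".
    public static string DescribeLookup(
        Dictionary<string, int> table, string key)
    {
        int value;
        bool found = table.TryGetValue(key, out value);
        if (!found) { return "not present"; }
        return "present: " + value;
    }
}
```

In both methods nothing stops a caller from using index or value without first checking the state; the compiler sees only an int.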

Other important techniques for using the Option pattern include using exceptions, and C#'s System.Nullable type. All of these techniques are critically evaluated in [Sha07b].

Many programmers use these techniques on a daily basis. The Option pattern is used in almost every programming language, and applies over a wide range of abstractions, e.g., from assembly languages to functional languages. Yet, almost incredibly, the mainstream programming community has known the Option pattern not as an abstract pattern, but rather, only in terms of implementation mechanisms such as null that are better hidden by a programming language feature abstraction or a data type abstraction. In this series of articles, we show that differentiating the abstract pattern from its implementation is a powerful tool.

This differentiation provides a unifying framework that gives insight into code related to options, including code that uses exceptions [Sha07d]. It also points to the intimate relationship between options, pointers that cannot be null, and types whose instances are pointed to only by pointers that cannot be null [Sha07c]. These ideas, too, may be applied to situations without null, such as enumerations with special values like None. They also give rise to the concept of explicit use of the Option pattern, a concept made precise in [Sha07c]. Explicit use's superiority over most implicit uses becomes evident from this article and our survey in [Sha07b].

Finally, and perhaps most importantly, recognizing that a particular code is using an abstract pattern facilitates realizing that some better way may be possible. In lieu of language features for such a better way, we introduce our C# Option types and show how they greatly improve on popular techniques such as using null to stand for the not present state.

We have been amazed at what turns up when converting existing code to use the Option types. These conversions have unearthed bugs almost completely mechanically, as well as revealing previously un-comprehended complexity. Sometimes, conversions have also made it clear that the responsibility of a type or method should be different than previously thought. We have also found that explicitly using the Option pattern clarifies our thinking about language features like null pointers, exceptions, and C#'s System.Nullable type, about design patterns like Try-Get, and about special values such as special constants and enumeration elements like None.

Our Option<T> type, and in general, data structures for options, as well as data structures for n-tuples, are examples of small containers - structures that contain at most a pre-specified number of elements. In [Sha07e], we show that the idea of small containers gives rise to language extensions that have wide applicability - far beyond just to small containers.

The remainder of this article is organized as follows. The next section presents the core of our C# Option<T> type. Examples illustrate Option<T>'s use. This section also shows how a general purpose tool such as Option<T> can allow clients to use the abstract option pattern explicitly. Performance characteristics of Option<T> are also examined. The following section reviews the simple option pattern as an abstract pattern, and discusses the support that various programming languages do or do not provide for explicitly using options. Next is a discussion of the annotated option pattern variant, including a description of the variant, an example showing this pattern being used in UNIX's well known fopen library call, and our Option<V,A> data type that allows clients to use annotated options explicitly. This is followed by a short presentation of different views of options. We then give some practical recommendations about using options. These recommendations include a list of two dozen or so situations where one might find good opportunities for converting code to explicitly use options. We conclude with a summary.

While some of this article uses C# terminology, much of it requires no knowledge of C#.

The Core of the Simple Option Type Option<T>

In this section, we review the core of our C# Option<T> type, visit a few examples of its use, and discuss Option<T>'s performance characteristics.

Description

Our Option<T> type is designed to facilitate explicit use of the simple option pattern. It embodies a philosophy of safety, conciseness, and explicitness. This philosophy is distinctly different from that associated with historically popular techniques for using options. As we discuss elsewhere, the Option<T> type isn't the first of its kind. It is true, however, that generic types for the general-purpose explicit use of options have been relatively rare in mainstream programming.

At its core, the Option<T> type is simple and short:

[SerializableAttribute]
public struct Option<T>
  : IEnumerable<T>, IEquatable<T>
{
  bool _valueIsPresent;
  T _value;

  public Option(T value)
  {
    if (value == null)
      { throw new ArgumentNullException(); }
    _valueIsPresent = true;
    _value = value;
  }

  public bool IsPresent
    { get { return _valueIsPresent; } }

  public T Value
  {
    get {
      if (! _valueIsPresent)
        { throw new InvalidOperationException("Value is not present."); }
      return _value;
    }
  }
}

The Option<T> type resides in the ZRiver.Collections namespace, reflecting the small container view of options.

The Option<T> type is immutable and hence thread safe, and it implements the IEnumerable<T> and IEquatable<T> interfaces. One version of Option<T> also implemented the ICollection<T> interface, thereby allowing clients to treat Option<T> instances as first class collections. This functionality was removed when it became clear that immutability is more important.

The Option<T> type also implements all that is necessary for an interface not shown here, called the IOptionCore<T> interface. More information about IOptionCore<T> is available in the accompanying source code. At least in the current version, this interface's definition is guarded by #if false because IOptionCore<T>'s usefulness outside this series of articles is uncertain.

A given Option<T> instance is always in one of two states: present or not present. Clients construct options that are, respectively, in these states by using Option<T>'s one-argument constructor or C#'s default parameterless struct constructor. Clients use the IsPresent property to determine whether the instance is in the present or not present state. When IsPresent returns true, clients can retrieve a value by calling the Value property's get accessor. When IsPresent returns false, and if a client calls Value's get accessor, the Option<T> type throws an exception.

Demonstration Example

Below is a small program showing the basic use of the Option<T> type. Comments explain the expected output.

using System;

using ZRiver.Collections;

class OptionTestMain
{
  public static void Main()
  {
    Option<int> a = new Option<int>();
    test(a);                // prints "caught exception"
    a = new Option<int>(3);
    test(a);                // prints "3"
    a = Option.NotPresent;  // same as new Option<int>()
    test(a);                // prints "caught exception"
    a = 4;                  // same as new Option<int>(4)
    test(a);                // prints "4"
  }

  static void test(Option<int> a)
  {
    try {
      int v = a.Value;
      Console.WriteLine(v);
    } catch {
      Console.WriteLine("caught exception");
    }
  }
}

The Option.NotPresent property resides in a non-generic type named Option. Because C# distinguishes identically named types that have different numbers of generic arguments, the following types can coexist in one namespace:

Option<T> implements the simple option pattern, and is the subject of this section.
Option<V,A> implements the annotated option pattern, and is discussed later.
Option includes Option.NotPresent, and provides other services for the Option<T> and Option<V,A> types.

The non-generic Option type has generic Min<T> and Max<T> methods that can be used with any type T that implements the IComparable<T> interface. Below is an example of using Option.Min<int> to find the minimum value (when it is defined) of three options.

Option<int> min3(
    Option<int> a
  , Option<int> b
  , Option<int> c
)
{
  Option<int> m = Option.Min<int>(a, b);
  return Option.Min<int>(m, c);
}

The Option<T> and Option<V,A> types also have devices for interoperating with other techniques for using the Option pattern, such as using two variables, or special values like null.
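
The interoperation devices themselves are not shown in this article, so the following is only a sketch of what such helpers might look like. The names OptionInterop, FromReference, and ToTwoVariables are hypothetical and are not taken from the ZRiver.Collections source; the Option<T> shown is a minimal stand-in.

```csharp
using System;

// Minimal stand-in for the article's Option<T>.
public struct Option<T>
{
    readonly bool _valueIsPresent;
    readonly T _value;

    public Option(T value)
    {
        if (value == null) { throw new ArgumentNullException("value"); }
        _valueIsPresent = true;
        _value = value;
    }

    public bool IsPresent { get { return _valueIsPresent; } }

    public T Value
    {
        get {
            if (!_valueIsPresent)
              { throw new InvalidOperationException("Value is not present."); }
            return _value;
        }
    }
}

// Hypothetical helpers; not the article's actual API.
public static class OptionInterop
{
    // From the "null means not present" convention to an explicit option.
    public static Option<T> FromReference<T>(T valueOrNull) where T : class
    {
        return valueOrNull == null
            ? new Option<T>()
            : new Option<T>(valueOrNull);
    }

    // From an explicit option back to the two-variable technique.
    public static bool ToTwoVariables<T>(Option<T> option, out T value)
    {
        value = option.IsPresent ? option.Value : default(T);
        return option.IsPresent;
    }
}
```

Helpers of this kind let code at the boundary of an older API convert once, so that the rest of the program deals only with explicit options.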

Command Line Processing Example

Let's look at another example of how using the Option<T> type compares to some typical ways of using options. Consider a hypothetical utility called Scour that is designed to be used from the command line as follows:

scour -inputfile f [-logfile g]

We can write a ScourArguments type that parses for this design and presents the parsed arguments to the rest of the program. This type can present the optional logfile argument to the rest of the program as some kind of instance of the abstract option pattern. Without Option<T>, the ScourArguments type might look like the following:

class ScourArguments
{
  public ScourArguments(string[] args)
    { .. set _logFile, _haveLogFile }

  public bool HaveLogFile 
    { get { return _haveLogFile; } }

  public string LogFile  {
    get {
      if (! _haveLogFile) {
        throw new InvalidOperationException(
          "Attempted to retrieve log file when none was present."
        );
      }
      return _logFile;
    }
  }

  readonly string _logFile;
  readonly bool _haveLogFile;
}

This setup uses the two variable technique for the Option pattern. While it does have the advantage of prohibiting access to a value when the system is in the not present state, it's awfully verbose and error prone.

We might instead use the null means not present semantics internally in the ScourArguments type, and hide the details from the rest of the program:

class ScourArguments
{
  public ScourArguments(string[] args)
    { .. set _logFile }

  public bool HaveLogFile
    { get { return _logFile != null; } }

  public string LogFile  {
    get {
      if (_logFile == null) {
        throw new InvalidOperationException(
          "Attempted to retrieve log file when none was present."
        );
      }
      return _logFile;
    }
  }

  readonly string _logFile;
}

This, too, is verbose and error prone. We could try exposing the null means not present semantics to the rest of the program:

class ScourArguments
{
  public ScourArguments(string[] args)
    { .. set _logFile }

  // return null if no log file was specified
  public string LogFile
    { get { return _logFile; } }

  readonly string _logFile;
}

While less verbose than its predecessors, the above setup suffers the usual problems with the null means not present semantics that are described elsewhere in this series of articles.

Explicitly using the Option pattern is safe, concise, and clear:

class ScourArguments
{
  public ScourArguments(string[] args)
    { .. set _logFile }

  public Option<string> LogFile
    { get { return _logFile; } }

  readonly Option<string> _logFile;
}

The System.Nullable type cannot be used here because we are using the string type, a reference type.

Optional Method Arguments Example

Suppose that a method accepts several values, each of which may independently be considered to be present or not present. With the Option<T> type, this is easy:

void m(Option<int> a, Option<bool> b, Option<string> c, Option<string> d)

Here, the Option<T> type allows using a, b, c, and d in any of 2^4 = 16 possible combinations. This technique works for any group of value types and reference types. Accomplishing this is problematic with the plain C# argument machinery.

Contained Pointers are Pure

In [Sha07c], we define a given variable, object, or other piece of data v as being pure in a given programming environment if the environment enforces the invariant that v is never null. (Ideally, the environment would have no notion of null.) The Option<T> implementation follows a philosophy of preferring pure pointers over impure pointers. All pointers exposed to clients by a given Option<T> instance are pure. The Option<T> type enforces this by doing a check during construction:

public Option(T value)
{
  if (value == null)
    { throw new ArgumentNullException(); }
  _valueIsPresent = true;
  _value = value;
}

A peculiarity of C# is that the check value == null compiles even when value's type T is a value type. Let's see why this works. For a concrete value type such as int, C#'s static compiler sees that value cannot be directly compared to null, and seeks a conversion from value to a type that can be compared to null. The compiler finds such a conversion: from the value type T to the nullable type T?. It performs this conversion, yielding the expression w == null, where w = new Nullable<T>(value). The compiler then rewrites w == null as ! w.HasValue, which is false. Inside generic code such as Option<T>'s constructor, where T is an unconstrained type parameter, the language likewise permits the comparison with null, and the comparison evaluates to false whenever T is a non-nullable value type at run time.
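
The behavior is easy to check with a small generic method. This is a sketch for experimentation; NullCheckDemo and IsNull are hypothetical names, not part of the Option<T> source.

```csharp
using System;

static class NullCheckDemo
{
    // Compiles for any T, with no constraint on T.
    // For a non-nullable value type such as int, the result is always false.
    public static bool IsNull<T>(T value)
    {
        return value == null;
    }
}
```

Calling NullCheckDemo.IsNull(3) yields false, while NullCheckDemo.IsNull<string>(null) yields true, so a single constructor check serves both kinds of type.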

Without Option<T>'s check for null during construction, client code like the following would be common:

void print(Option<string> option)
{
  if (option.IsPresent) {
    if (option.Value == null) {
      Console.WriteLine("option is null");
    } else {
      Console.WriteLine("option is not null ('{0}')", option.Value);
    }
  } else {
    Console.WriteLine("option is not present");
  }
}

Because the Option<T> type returns only pure pointers, the client code is simplified:

void print(Option<string> option)
{
  if (option.IsPresent) {
    Console.WriteLine("option is '{0}'", option.Value);
  } else {
    Console.WriteLine("option is not present");
  }
}

Instead of storing null values in an option, we can nest options instead:

Option<Option<T>>

Nesting is discussed further in the section of [Sha07b] about the null means not present semantics.
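
As an illustration of nesting, the sketch below distinguishes "nothing is known" from "known to have no value". The middle-name scenario and the names NestedOptionDemo and Describe are invented for this example; the Option<T> shown is a minimal stand-in for the article's type.

```csharp
using System;

// Minimal stand-in for the article's Option<T>.
public struct Option<T>
{
    readonly bool _valueIsPresent;
    readonly T _value;

    public Option(T value)
    {
        if (value == null) { throw new ArgumentNullException("value"); }
        _valueIsPresent = true;
        _value = value;
    }

    public bool IsPresent { get { return _valueIsPresent; } }

    public T Value
    {
        get {
            if (!_valueIsPresent)
              { throw new InvalidOperationException("Value is not present."); }
            return _value;
        }
    }
}

static class NestedOptionDemo
{
    // Outer option: is the person known at all?
    // Inner option: does the known person have a middle name?
    public static string Describe(Option<Option<string>> middleName)
    {
        if (!middleName.IsPresent) { return "person unknown"; }
        Option<string> inner = middleName.Value;
        return inner.IsPresent
            ? "middle name is " + inner.Value
            : "no middle name";
    }
}
```

The outer constructor accepts a not present inner option because the inner option is a struct and therefore never null.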

Speed

The Option<T> type is a struct, and therefore instances of Option<T> are passed by value, typically on the stack and without a separate heap allocation. This should be very fast - just slightly slower than using null, roughly the same speed as using an auxiliary boolean variable to hold status information, and much faster than throwing exceptions. In languages that directly support explicitly using options, and possibly even in some that don't, using options with reference types should be as fast as using impure pointers, and possibly faster, since the compiler, runtime, or application might end up initiating fewer tests for null.

It would be nice if, when T is a value type, the static compiler or the JIT compiler optimized away Option<T>'s tests for null. This optimization seems not to occur in .NET 2.0. Thus, in .NET 2.0 at least, constructing Option<T> instances incurs overhead to test for null even when T is a value type. This overhead is small, but eliminating it would still be nice.

Storage

When T is a value type, each instance a of Option<T> will use storage to store an instance of T in a's _value field. This storage is allocated even when a is in the not present state. Using this storage could be undesirable if T is large. A workaround is to box instances of T that are contained in Option<T> instances. This might be best done by creating a type specifically for boxing instances of T, e.g., BoxedT; then, Option<BoxedT> can be used with much better type safety than Option<Object>. Or, a future BoxingOption<T> type could perform the boxing automatically.
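
A dedicated boxing type of the kind described above might look like the following sketch. Matrix4 and BoxedMatrix4 are hypothetical names chosen for illustration; any large value type could take Matrix4's place.

```csharp
using System;

// A deliberately large value type (16 doubles, 128 bytes of data).
public struct Matrix4
{
    public double M11, M12, M13, M14;
    public double M21, M22, M23, M24;
    public double M31, M32, M33, M34;
    public double M41, M42, M43, M44;
}

// An immutable reference-type wrapper dedicated to Matrix4. Unlike a cast
// to Object, the wrapper keeps the content's static type.
public sealed class BoxedMatrix4
{
    readonly Matrix4 _content;

    public BoxedMatrix4(Matrix4 content) { _content = content; }

    public Matrix4 Content { get { return _content; } }
}
```

An Option<BoxedMatrix4> then stores only a reference and a flag, so an instance in the not present state no longer reserves room for a whole Matrix4. The cost is one heap allocation for each present value.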

Runtimes that support explicitly using options can represent the not present state for pointers by using the same representation that runtimes currently use for null. This avoids using storage for a boolean indicator.

The Simple Option Pattern is an Abstract Code Pattern

We have seen a general purpose generic type designed to enable clients to explicitly use the abstract option pattern, some examples of using the type, and its performance characteristics. Let's now examine what exactly the abstract option pattern is. Ubiquitous in programming, the simple option is defined as follows.

A simple option, at any given time, is in either the not present state or the present state. When an option is in the present state, the option also has an associated value. Clients should be able to easily ascertain the state of the option, and to retrieve the associated value when the option is in the present state. A program attempting to retrieve a value when the option is in the not present state results in a near-fatal or fatal error, e.g., throwing an exception.

There are many variants, such as not generating a fatal error upon attempted access to a value when no value is present. We will refer to all of these abstract patterns as the option pattern.

An option is said to be present if the option is in the present state. Technically, this definition creates an ambiguity because the term "present" could refer to either:

a data structure being in the present state, or
the existence of the data structure itself.

In practice, we have found this ambiguity to be less important than having the natural way of talking about option instances that comes from referring to options as being "present". With appropriately named options, this terminology is intuitive, e.g., argument.IsPresent or webQueryResponse.IsPresent.

We again emphasize the distinction between:

(1) the option pattern, which is abstract, and
(2) techniques, concrete data types, or collections of utility code that allow programmers to use the option pattern in their code.

To review, instances of (2) include:

our Option<T> type;
using null, -1, or some other special value to mean not present;
using two variables - one to store the primary value, the other to store a boolean value indicating whether the system's state is present or not present;
built-in programming language features;
exceptions.

The Option pattern provides a unifying framework. We have found that familiarity with the abstract pattern helps to eliminate extraneous details, focus on the essentials of particular coding problems, recognize non-obvious occurrences of the option pattern, immediately understand pitfalls, and form ideas about how to improve code. We have argued and will continue to argue in the remainder of this article and in [Sha07b] that explicit use of the abstract option pattern can make code simpler, clearer, and safer, and is superior to most techniques currently used for options.

Let us briefly look at the historical view of the Option pattern in mainstream programming. As we have discussed, the Option pattern is rarely recognized in mainstream programming as an abstract pattern. Several design patterns books including the "Gang of Four Book" [GHJV95] have no obvious reference to the Option pattern. Nor do programming languages or libraries typically provide a type like Option<T>.

The most widely used languages that directly support options during normal operations appear to be functional languages such as Haskell and OCaml. While these languages have significant user bases, these are much smaller than those of C, C++, C#, and Java - four languages that account for a large part of the mainstream programming community.

Within the larger community, recognition of the abstract pattern has almost always been in the context of the pattern's implementation rather than its abstract form or function. In particular, discussion often includes the term "null", as in the database term "nullable", C#'s System.Nullable type, and the idea that a null value means not present. This is undesirable because usually the term "null" is most directly associated with null pointers, which are an implementation mechanism and not an abstract pattern.

C++, C#, and Java have limited support for options, through exceptions. Code that throws and catches exceptions uses the Option pattern only implicitly, if at all. Still, exceptions can sometimes be a good alternative to explicitly using options.

C does not support explicitly using options. The closest thing it has is null. C++ has C-style pointers as well as "C++ references" (these are never null), but C++ does not support explicitly using options during normal operations. C#'s Nullable type and the associated language constructs support options, but Nullable cannot be used with reference types, has unsafe behavior involving default values, propagates the use of null values, and includes "null" in its name and thus has an unfortunate suggestion of emphasizing implementation mechanisms (null) over function.

The Annotated Option Pattern

In a common generalization of the simple option pattern called the annotated option pattern, there is a set of cardinality 1, and the element's type is drawn from a pre-specified collection of two data types. This is essentially a tagged union with two possible types.

Like the simple option pattern, the annotated option pattern is an abstract pattern that may be used or implemented in a variety of ways. In the next subsection, we elaborate on this a bit. We then look at an example of using annotated options with our type Option<V,A>. The following subsection presents the core of our annotated option type Option<V,A>.

The Annotated Option Pattern is an Abstract Code Pattern

One way to view the annotated option pattern is as an abstract pattern similar to the simple option pattern that adds a feature whereby if the option is not present, clients are given an annotation that explains why. Such an annotation may, for example, be an enumeration element or a string.

Annotated option states differ from the simple option states. The annotated option states are value present and annotation present. Thus, at all times, either a value is present or an annotation is present, and never are a value and annotation present simultaneously.

Examples of Annotated Options

Annotated options, while much less common than simple options, are still quite common. They arise frequently in parsing. Another typical use is with a method that may return a handle (a pointer), and, when a handle cannot be returned, places explanatory information in an auxiliary variable.

For example, UNIX C implementations have the fopen library call for opening a file:

FILE *fopen(const char *path, const char *mode);

The return conventions of fopen are common - return a non-null handle when successful, and when unsuccessful, return null and as a side effect, set a global variable called errno that provides information about the failure. This setup uses the annotated option pattern.

In [Ric06, p420], annotated options appear in another guise in the Win32 API and in COM:

"... most Win32 functions return a BOOL set to FALSE to indicate that something went wrong, and then it's up to the caller to call GetLastError to find the reason for the failure. On the other hand, COM methods return an HRESULT with the high bit set to 1 to indicate failure. The remaining bits represent a value that helps you determine the cause of the failure."

Let's continue looking at fopen, and see how fopen's interface might be modified to explicitly use annotated options. The germane features of fopen's setup are expressed in C# as follows:

class File {
  public static File Open(string path, string mode) { ... }
  // errno is global in C, so it is modeled here as a static field
  public static int errno;
}

We could restructure the interface to explicitly use an annotated option, using our Option<V,A> type:

class File
{
  public enum OpenFailure {
      FileDoesNotExist
    , FileIsLocked
    , PathIsInvalid
    , PermissionDenied
    , ...
  }
  public static Option<File,OpenFailure> Open(string path, string mode)
  { ... }
}

This new interface might then be used as follows:

class TestFile
{
  public static void Main()
  {
    Option<File,File.OpenFailure> handle = 
             File.Open("myfile","rw");
    if (handle.IsPresent) {
      File f = handle.Value;
      f.Write("hello");
      f.Close();
      Console.WriteLine("Wrote to file.");
    } else {
      Console.WriteLine("Could not open: {0}.", handle.Annotation);
    }
  }
}

The example uses our Option<V,A> type. The Option<V,A> type's IsPresent property returns true if the instance is in the value present state. The Option<V,A> type throws an exception when it receives a request to access a value when the state is annotation present, or to access an annotation when the state is value present.

In the example above, when Main's call to File.Open yields an opened file, the File handle is available through handle.Value. When the file was not opened, a reason for the failure is given in handle.Annotation.
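
Parsing, mentioned earlier as a frequent source of annotated options, follows the same shape. The sketch below is an illustration under stated assumptions: PortParser and its messages are invented for this example, and the Option<V,A> shown is a minimal stand-in for the article's type.

```csharp
using System;

// Minimal stand-in for the article's Option<V,A>.
public class Option<V, A>
{
    readonly V _value;
    readonly A _annotation;
    readonly bool _valueIsPresent;

    public Option(V value)
    {
        if (value == null) { throw new ArgumentNullException("value"); }
        _value = value;
        _annotation = default(A);
        _valueIsPresent = true;
    }

    public Option(A annotation)
    {
        if (annotation == null)
          { throw new ArgumentNullException("annotation"); }
        _value = default(V);
        _annotation = annotation;
        _valueIsPresent = false;
    }

    public bool IsPresent { get { return _valueIsPresent; } }

    public V Value
    {
        get {
            if (_valueIsPresent) { return _value; }
            throw new InvalidOperationException("Value is not present.");
        }
    }

    public A Annotation
    {
        get {
            if (_valueIsPresent)
              { throw new InvalidOperationException("Annotation is not present."); }
            return _annotation;
        }
    }
}

// Hypothetical parser: a port number, or an explanation of the failure.
static class PortParser
{
    public static Option<int, string> Parse(string text)
    {
        int port;
        if (!int.TryParse(text, out port))
          { return new Option<int, string>("not a number"); }
        if (port < 1 || port > 65535)
          { return new Option<int, string>("out of range"); }
        return new Option<int, string>(port);
    }
}
```

A caller checks IsPresent once and then reads either Value or Annotation, exactly as in the File.Open example.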

The Core of the Annotated Option Type Option<V,A>

Our implementation of the annotated option pattern is through a C# class which takes two generic type arguments:

[SerializableAttribute]
public class Option<V,A>
  { ... }

Here, V gives the type of the value, and A gives the type of the annotation. Like Option<T> instances, all Option<V,A> instances are immutable. Instances of Option<V,A> are created through the following constructors:

public class Option<V,A>
{
  ...

  public Option(V value)
  {
    if (value == null)
      { throw new ArgumentNullException(); }
    _value          = value;
    _annotation     = default(A);
    _valueIsPresent = true;
  }

  public Option(A annotation)
  {
    if (annotation == null)
      { throw new ArgumentNullException(); }
    _value          = default(V);
    _annotation     = annotation;
    _valueIsPresent = false;
  }

  //// private fields

  V _value;
  A _annotation;
  bool _valueIsPresent;
}

These constructors illustrate the symmetry between the value and the annotation. This symmetry is valuable. Understanding symmetry and other mathematical views of the Option pattern, such as "zero or one", "one or the other", and "one value two types" facilitates recognizing non-obvious occurrences of the Option pattern.

The Option<V,A> type was designed to maintain symmetry. Doing so while never returning null pointers requires that either Option<V,A> be a class, or that Option<V,A> be a struct and that instances initialized with C#'s default parameterless struct constructor be in a third abstract state, e.g., error. The original implementation of Option<V,A> was a struct. We later converted Option<V,A> into a class so as to make a third state unnecessary. That Option<V,A> is a class while Option<T> is a struct is a potential source of confusion and error. In most code, though, the difference is unimportant, whereas it is valuable to simplify the type by avoiding a third state, and to make the implementation closely follow the abstract pattern.

The following core Option<V,A> members safely retrieve state, values, and annotations:

public class Option<V,A>
{
  ...

  public bool IsPresent
    { get { return _valueIsPresent; } }

  public V Value
  {
    get {
      if (_valueIsPresent)
        { return _value; }
      throw new InvalidOperationException(String.Format(
        "Value is not present (annotation is {0}).", _annotation
      ));
    }
  }

  public A Annotation
  {
    get {
      if (_valueIsPresent) {
        throw new InvalidOperationException(String.Format(
          "Annotation is not present (value is {0}).", _value
        ));
      }
      return _annotation;
    }
  }
}
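
As a rough illustration of how the two constructors and the three accessors fit together, the sketch below assembles the fragments shown above into one compilable class and exercises both states. The members elided by "..." in the article are left out, and the AnnotatedOptionDemo class and its Describe helper are hypothetical names introduced only for this example.

```csharp
using System;

// Assembled from the fragments above; members elided by "..." are omitted.
public class Option<V, A>
{
    public Option(V value)
    {
        if (value == null) { throw new ArgumentNullException("value"); }
        _value = value;
        _annotation = default(A);
        _valueIsPresent = true;
    }

    public Option(A annotation)
    {
        if (annotation == null) { throw new ArgumentNullException("annotation"); }
        _value = default(V);
        _annotation = annotation;
        _valueIsPresent = false;
    }

    public bool IsPresent { get { return _valueIsPresent; } }

    public V Value
    {
        get
        {
            if (_valueIsPresent) { return _value; }
            throw new InvalidOperationException(String.Format(
                "Value is not present (annotation is {0}).", _annotation));
        }
    }

    public A Annotation
    {
        get
        {
            if (_valueIsPresent)
            {
                throw new InvalidOperationException(String.Format(
                    "Annotation is not present (value is {0}).", _value));
            }
            return _annotation;
        }
    }

    V _value;
    A _annotation;
    bool _valueIsPresent;
}

// Hypothetical demo class, not part of the article's attachment.
public static class AnnotatedOptionDemo
{
    // Reads whichever side of the option is populated.
    public static string Describe(Option<int, string> option)
    {
        return option.IsPresent
            ? "value: " + option.Value
            : "annotation: " + option.Annotation;
    }

    public static void Main()
    {
        Console.WriteLine(Describe(new Option<int, string>(42)));
        Console.WriteLine(Describe(new Option<int, string>("key not found")));
        try
        {
            // Asking for the wrong side fails loudly instead of returning a default.
            int unused = new Option<int, string>("key not found").Value;
        }
        catch (InvalidOperationException e)
        {
            Console.WriteLine(e.Message);
        }
    }
}
```

Note that the example uses different types for V and A. When both are the same type, the two constructors have identical signatures and a call cannot choose between them.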

Views of the Option Patterns

In this section, we consider several views of simple and annotated options.

One view of a simple option is as a tagged union with two possible types, one being a dummy type corresponding to the not present state. Annotated options may be viewed similarly, with one of the tagged union types being for the option value, and the other being for the annotation. That functional languages such as Haskell and OCaml support tagged unions is a partial explanation for why these languages support explicitly using options.

An object oriented approach views options as using polymorphism over two types. In the simple option pattern, the type of the value, along with a pseudo-type for the not present state, are both seen as deriving from a shared base class. In the annotated option pattern, the value type and the annotation type are viewed as deriving from a shared base class.
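
The object oriented view can be approximated in C# as follows. This is only a sketch: all type and member names are hypothetical, and the value is wrapped in a derived class rather than deriving from the base class itself, since C# cannot retrofit a base class onto an existing type.

```csharp
using System;

// Hypothetical names throughout; this is not the article's Option<T> type.
public abstract class OptionBase<T>
{
    // Each state answers in its own way; callers never test a flag.
    public abstract T ValueOr(T fallback);
}

// The "present" state carries the value.
public sealed class PresentOption<T> : OptionBase<T>
{
    private readonly T _value;
    public PresentOption(T value) { _value = value; }
    public override T ValueOr(T fallback) { return _value; }
}

// The pseudo-type for the "not present" state carries no data.
public sealed class NotPresentOption<T> : OptionBase<T>
{
    public override T ValueOr(T fallback) { return fallback; }
}

public static class PolymorphicViewDemo
{
    public static void Main()
    {
        OptionBase<string> found = new PresentOption<string>("alpha");
        OptionBase<string> missing = new NotPresentOption<string>();
        Console.WriteLine(found.ValueOr("(none)"));   // alpha
        Console.WriteLine(missing.ValueOr("(none)")); // (none)
    }
}
```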

Options may also be viewed as special cases of small containers - containers designed to contain at most a pre-specified number of elements. In this slot- or set-oriented view, simple options contain 0 or 1 elements, and annotated options contain one element whose type is one of two pre-specified types.
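
To make the container view concrete, here is a minimal sketch of a simple option written as a collection that holds zero or one elements. The AtMostOne<T> type is hypothetical and is not part of the code attached to this article.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical type: a container that holds at most one element.
public sealed class AtMostOne<T> : IEnumerable<T>
{
    private readonly T _item;
    private readonly bool _hasItem;

    public AtMostOne() { _hasItem = false; }
    public AtMostOne(T item) { _item = item; _hasItem = true; }

    public int Count { get { return _hasItem ? 1 : 0; } }

    // Yields the single element when there is one, and nothing otherwise.
    public IEnumerator<T> GetEnumerator()
    {
        if (_hasItem) { yield return _item; }
    }

    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }
}

public static class ContainerViewDemo
{
    public static void Main()
    {
        AtMostOne<string> full = new AtMostOne<string>("alpha");
        AtMostOne<string> empty = new AtMostOne<string>();

        // The loop body runs once for 'full' and not at all for 'empty'.
        foreach (string s in full) { Console.WriteLine(s); }
        foreach (string s in empty) { Console.WriteLine(s); }

        Console.WriteLine(full.Count + empty.Count);
    }
}
```

With this view, "is the value present?" becomes "is the container non-empty?", and clients can use foreach instead of an explicit test.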

Recommendations for Explicitly Using Options

In this section, we give some guidance for using an explicit option pattern implementation such as our Option<T> type.

Sometimes Exceptions are Better

Using exceptions is sometimes better than using options. This is discussed in detail in [Sha07d].

How to Decide Between Simple and Annotated Options

The following rule of thumb can help to decide between simple and annotated options:

Use an annotated option when the client needs to differentiate two or more ways in which a value can fail to exist. Otherwise, use a simple option.
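
As an example of the rule, consider parsing a port number, where the caller wants to report why parsing failed. The sketch below is illustrative only: the PortError enumeration and the PortParser class are hypothetical, and the Option<V,A> class is a compact stand-in for the article's type that keeps only the members used here.

```csharp
using System;

// Hypothetical reasons for failure; the client can tell them apart.
public enum PortError { Missing, NotANumber, OutOfRange }

// Compact stand-in for the article's Option<V,A>; only the members used here.
public class Option<V, A>
{
    public Option(V value) { _value = value; _valueIsPresent = true; }
    public Option(A annotation) { _annotation = annotation; _valueIsPresent = false; }

    public bool IsPresent { get { return _valueIsPresent; } }

    public V Value
    {
        get
        {
            if (!_valueIsPresent)
                { throw new InvalidOperationException("Value is not present."); }
            return _value;
        }
    }

    public A Annotation
    {
        get
        {
            if (_valueIsPresent)
                { throw new InvalidOperationException("Annotation is not present."); }
            return _annotation;
        }
    }

    V _value;
    A _annotation;
    bool _valueIsPresent;
}

public static class PortParser
{
    // Returns the port, or an annotation saying which check failed.
    public static Option<int, PortError> ParsePort(string text)
    {
        int port;
        if (String.IsNullOrEmpty(text))
            { return new Option<int, PortError>(PortError.Missing); }
        if (!Int32.TryParse(text, out port))
            { return new Option<int, PortError>(PortError.NotANumber); }
        if (port < 1 || port > 65535)
            { return new Option<int, PortError>(PortError.OutOfRange); }
        return new Option<int, PortError>(port);
    }

    public static void Main()
    {
        foreach (string text in new string[] { "8080", "", "http", "70000" })
        {
            Option<int, PortError> port = ParsePort(text);
            Console.WriteLine(port.IsPresent
                ? "port " + port.Value
                : "no port: " + port.Annotation);
        }
    }
}
```

If the client only needed to know whether a usable port was supplied, the three failure cases would collapse into one, and a simple Option<int> would be the better fit.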

Avoid Options When Only One Value is Possible

Using a plain boolean variable is usually better than using an Option<bool> instance whose Value property always returns true when the option is present. In general, when only one value is possible, alternatives may be superior to options. On the other hand, three-value logic may reasonably be implemented using Option<bool>.
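
For instance, a three-valued "and" might be sketched as below, with the not present state standing for "unknown". The Option<T> struct here is a minimal stand-in for the article's type, on the assumption that a default-initialized instance is in the not present state. The ThreeValued class and its members are hypothetical.

```csharp
using System;

// Minimal stand-in for the article's Option<T>; only the members used here.
// Assumption: default(Option<T>) represents the "not present" state.
public struct Option<T>
{
    private readonly T _value;
    private readonly bool _isPresent;

    public Option(T value) { _value = value; _isPresent = true; }

    public bool IsPresent { get { return _isPresent; } }

    public T Value
    {
        get
        {
            if (!_isPresent)
                { throw new InvalidOperationException("Value is not present."); }
            return _value;
        }
    }
}

// Hypothetical helper for three-valued logic: true, false, or unknown.
public static class ThreeValued
{
    public static readonly Option<bool> Unknown = default(Option<bool>);

    // A definite false on either side decides the result; otherwise an
    // unknown operand makes the result unknown.
    public static Option<bool> And(Option<bool> a, Option<bool> b)
    {
        if (a.IsPresent && !a.Value) { return new Option<bool>(false); }
        if (b.IsPresent && !b.Value) { return new Option<bool>(false); }
        if (a.IsPresent && b.IsPresent) { return new Option<bool>(true); }
        return Unknown;
    }

    public static string Show(Option<bool> o)
    {
        return o.IsPresent ? o.Value.ToString() : "Unknown";
    }

    public static void Main()
    {
        Option<bool> yes = new Option<bool>(true);
        Option<bool> no = new Option<bool>(false);
        Console.WriteLine(Show(And(yes, yes)));
        Console.WriteLine(Show(And(no, Unknown)));
        Console.WriteLine(Show(And(yes, Unknown)));
    }
}
```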

Avoid Using Options to Indicate that a Collection is Empty

Wrapping a collection in an option may add complexity unnecessarily. The forms that such wrapping may take are as varied as the forms of the Option pattern, e.g., Option<ICollection>, using null to indicate that an Array container is empty, using an auxiliary variable for the same purpose, and so on. The Option pattern leads to a unifying guideline that applies to all mechanisms for using the Option pattern:

Avoid using options to indicate that a collection is empty. [GE]

Suppose, for example, that a program collects zero or more file names from the command line. Within this program, it is probably simplest for a command line class to present the arguments to the rest of the program as a list. Clients can then do things like the following:

foreach (string fileName in commandLineArguments.FileNames)
{ ... }

Wrapping the list with an option adds complexity unnecessarily:

Option<string[]> fileNameCollection
  = commandLineArguments.OptionalFileNameCollection;
if (fileNameCollection.IsPresent) {
  foreach (string fileName in fileNameCollection.Value)
    { ... }
}

Or, worse:

string[] fileNameCollection
  = commandLineArguments.OptionalFileNameCollection;
if (fileNameCollection != null) {
  foreach (string fileName in fileNameCollection)
    { ... }
}

Microsoft has a guideline that is essentially the following: indicate that an array is empty by using a zero-length array instead of using null [Ric06, p302]. That this guideline is a special case of the above option-oriented guideline [GE] highlights the abstract option pattern's role as a unifier of disparate option pattern techniques.
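
A command line class that follows guideline [GE] might look like the sketch below. The CommandLineArguments class name and the rule that arguments beginning with "-" are switches rather than file names are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical command line class; FileNames is never null.
public class CommandLineArguments
{
    private readonly string[] _fileNames;

    public CommandLineArguments(string[] args)
    {
        List<string> names = new List<string>();
        foreach (string arg in args)
        {
            // Assumption: anything starting with "-" is a switch, not a file name.
            if (!arg.StartsWith("-", StringComparison.Ordinal)) { names.Add(arg); }
        }
        // Zero-length when no file names were given, rather than null.
        _fileNames = names.ToArray();
    }

    public string[] FileNames { get { return _fileNames; } }
}

public static class EmptyCollectionDemo
{
    public static void Main()
    {
        CommandLineArguments none = new CommandLineArguments(new string[] { "-v" });

        // No null test and no option test: the loop simply runs zero times.
        foreach (string fileName in none.FileNames)
            { Console.WriteLine(fileName); }

        Console.WriteLine(none.FileNames.Length);
    }
}
```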

Don't Wrap Exception Objects in Option Containers

One of the main benefits of exceptions is that they provide detailed information about the reason for and location of a failure. Examples include a stack trace, identification of a method argument that was in error, an exception's object type, and exception-specific information such as descriptive strings that the application embeds in exception instances.

Annotations in the annotated option pattern likewise contain information about reasons for failure. It is natural to think of using annotations to refer to unthrown exceptions. The idea is to use exception types simply as containers of information. Unthrown exceptions would be created using new as usual, but never passed to throw.

This is probably a bad idea. It violates several guidelines in [CA06] about using exceptions, most closely the guideline, "Do not have public members that return exceptions as the return value or an out parameter" (p187). As [CA06] gives no reason for this guideline, it may be instructive to examine some reasons for it.

The main justification is probably that code presented with an exception e may reasonably assume that e was thrown. Such code can break when applied to unthrown exceptions, because some exception properties return a non-null value only after the exception has been thrown. For example, the e.StackTrace property returns null when e has not been thrown. An exception handler presented with e might dereference e.StackTrace while printing it, and itself throw a NullReferenceException. To avoid these scenarios, we should strive to prevent unthrown exceptions from leaking out from local uses.
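
The difference between thrown and unthrown exceptions is easy to observe. The following small program is a sketch with hypothetical class and method names. It compares the StackTrace property of an exception that was only constructed with that of one that was thrown and caught.

```csharp
using System;

public static class UnthrownExceptionDemo
{
    public static bool HasTrace(Exception e)
    {
        return e.StackTrace != null;
    }

    // Throws an exception and hands back the caught instance.
    public static Exception ThrowAndCatch()
    {
        try { throw new InvalidOperationException("thrown"); }
        catch (Exception e) { return e; }
    }

    public static void Main()
    {
        Exception unthrown = new InvalidOperationException("never thrown");
        Exception thrown = ThrowAndCatch();

        Console.WriteLine("unthrown has trace: " + HasTrace(unthrown));
        Console.WriteLine("thrown has trace: " + HasTrace(thrown));

        // Code that assumes a trace is always available fails on the
        // unthrown exception.
        try
        {
            int length = unthrown.StackTrace.Length;
            Console.WriteLine(length);
        }
        catch (NullReferenceException)
        {
            Console.WriteLine("handler failed with NullReferenceException");
        }
    }
}
```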

Wrapping Interfaces in Option<T> Containers

When T is an interface, some care is required when considering the use of Option<T>. One problem is the impossibility of assigning to such options in the usual way:

void f()
{
  Option<IAsyncResult> result;

  ...

  // This won't compile: it would require a user-defined
  // conversion from IAsyncResult to Option<IAsyncResult>, and
  // C# doesn't allow user-defined conversions to or from interfaces.
  result = someDelegate.BeginInvoke(...);

  // This works fine, but is a bit cumbersome.
  result = new Option<IAsyncResult>(
    someDelegate.BeginInvoke(...)
  );
}

Clients may expect to be able to do such assignments, and become confused when they can't.
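
One way to soften the inconvenience is a generic factory method, which lets the compiler infer the type argument. The sketch below is only a suggestion: the non-generic Option class and its From method are hypothetical and are not part of the code attached to this article, and the Option<T> struct is a minimal stand-in for the article's type.

```csharp
using System;

// Minimal stand-in for the article's Option<T>; only the members used here.
public struct Option<T>
{
    private readonly T _value;
    private readonly bool _isPresent;

    public Option(T value) { _value = value; _isPresent = true; }

    public bool IsPresent { get { return _isPresent; } }

    public T Value
    {
        get
        {
            if (!_isPresent)
                { throw new InvalidOperationException("Value is not present."); }
            return _value;
        }
    }

    // Declaring this operator is legal, but the compiler does not apply it
    // when the source expression has an interface type.
    public static implicit operator Option<T>(T value)
    {
        return new Option<T>(value);
    }
}

// Hypothetical helper, not part of the article's library.
public static class Option
{
    // T is inferred from the argument's static type, so callers need not
    // repeat the interface name.
    public static Option<T> From<T>(T value) { return new Option<T>(value); }
}

public static class InterfaceOptionDemo
{
    public static void Main()
    {
        Option<int> number = 7;           // the usual assignment works for int
        IComparable comparable = 5;

        // Option<IComparable> viaConversion = comparable;  // rejected by the compiler

        Option<IComparable> viaConstructor = new Option<IComparable>(comparable);
        Option<IComparable> viaFactory = Option.From(comparable);

        Console.WriteLine(number.Value);
        Console.WriteLine(viaConstructor.Value.CompareTo(5) == 0);
        Console.WriteLine(viaFactory.Value.CompareTo(5) == 0);
    }
}
```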

Where to Find Conversion Opportunities

Here are some things to search for in C# code in order to find opportunities for converting it to explicitly use options:

Code that is a client of code that was recently converted to explicitly use options.
The string null (a case-insensitive search for null will also find conversion opportunities with uses of System.Nullable).
The string enum; look for enumerations with special members like None and Unknown that seem different from other enumeration elements.
Constant values, including const, readonly, static, and literal values.
Awkward or confusing initialization.
Data types with instances that are created in, or may become in, a state where using members is not permitted, e.g., through closing, disposal, or lazy initialization.
Peculiar interfaces or argument patterns.
Code that throws, catches, or otherwise involves exceptions.
Code that returns, receives, or otherwise involves error codes, warnings, or notifications.
Code that deals with boundary conditions or special cases.
Variable pairs or triples that are passed around together as arguments or that seem otherwise related.
Code that processes data that it did not create, especially command line arguments, configuration data, or databases.
Code that searches, looks things up, or uses hash tables.
Catch-all code, especially default cases in switch statements.
Code that represents one entity in two ways.
Methods that return handles.
Code that parses things.
Enumerations with only one element (try to remove these in any case).
Placeholders or sentinels, such as special linked list nodes.
Data structures that use pseudo-pointers, e.g., integers.
Collections that have at most one element.
Methods that take a variable number of arguments.
null-like instances of objects that are intended not to be dereferenced and to throw an exception when dereferenced.
Uses of the "null object" design pattern.
Definitions or uses of methods not covered above that return disguised instances of the Option pattern, e.g., methods that use the TryGet pattern.
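
As an example of the last item, a TryGet-style lookup can be wrapped so that clients receive a single option instead of a bool and an out parameter. This is a minimal sketch: the DictionaryOptions class and its Lookup method are hypothetical, and the Option<T> struct is a stand-in for the article's type, on the assumption that a default-initialized instance is in the not present state.

```csharp
using System;
using System.Collections.Generic;

// Minimal stand-in for the article's Option<T>; only the members used here.
// Assumption: default(Option<T>) represents the "not present" state.
public struct Option<T>
{
    private readonly T _value;
    private readonly bool _isPresent;

    public Option(T value) { _value = value; _isPresent = true; }

    public bool IsPresent { get { return _isPresent; } }

    public T Value
    {
        get
        {
            if (!_isPresent)
                { throw new InvalidOperationException("Value is not present."); }
            return _value;
        }
    }
}

// Hypothetical helper that hides the TryGet pattern behind an option.
public static class DictionaryOptions
{
    public static Option<V> Lookup<K, V>(IDictionary<K, V> dictionary, K key)
    {
        V value;
        return dictionary.TryGetValue(key, out value)
            ? new Option<V>(value)
            : default(Option<V>);
    }
}

public static class TryGetConversionDemo
{
    public static void Main()
    {
        Dictionary<double, string> dictionary = new Dictionary<double, string>();
        dictionary[1.5] = "one and a half";

        foreach (double key in new double[] { 1.5, 2.0 })
        {
            Option<string> value = DictionaryOptions.Lookup(dictionary, key);
            if (value.IsPresent) {
                Console.WriteLine("'{0}' maps to '{1}'", key, value.Value);
            } else {
                Console.WriteLine("'{0}' is not in the dictionary", key);
            }
        }
    }
}
```

The client can no longer read the looked-up value without first going through the option, which removes the risk of using value after TryGetValue has returned false.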

Conclusion

In this article, we have defined the abstract option pattern and distinguished between it and concrete ways of using it in code. This contributes towards a productive, unifying view of widely used coding patterns and practices. The option pattern's unifying role is also highlighted in the other articles in this series. Our extensive review [Sha07b] of common option techniques demonstrates the option pattern's ubiquity in mainstream programming.

We introduced our simple and annotated Option types, and included code for them as an attachment to this article. This series of articles should make it clear that wise use of an Option type or some other explicit option implementation offers many advantages over implicit techniques such as the "null means not present" semantics, special values, two variables, System.Nullable, and, sometimes, exceptions.

Acknowledgement

Marc Clifton and Úlfar Erlingsson made some helpful suggestions. Thanks.

References

[CA06] Krzysztof Cwalina and Brad Abrams. Framework Design Guidelines: Conventions, Idioms, and Patterns for Reusable .NET Libraries. Addison-Wesley, 2006.
[GHJV95] Erich Gamma, Richard Helm, Ralph Johnson, and John Vlissides. Design Patterns: Elements of Reusable Object-Oriented Software. Addison-Wesley, 1995.
[Ric06] Jeffrey Richter. CLR via C#. Microsoft Press, 2006.
[Sha07b] Andrew Shapira. Historically Popular Techniques for Using the Option Pattern. In preparation.
[Sha07c] Andrew Shapira. Explicit Option Use, Pure Pointers, and Impure Pointers. In preparation.
[Sha07d] Andrew Shapira. Exceptions with the Succeed View, and Options with the Option View. In preparation.
[Sha07e] Andrew Shapira. Small Containers and Associated Mainstream Language Extensions. In preparation.