Provable APIs
Five ways to structure an API to ensure that people use it correctly.
Introduction
Tool vendors like Microsoft are not the only ones who publish APIs. When we create layered software, each layer has an API that is consumed by the next one up. To ensure the quality of our software, we should try to create provable APIs. These are interfaces that guide the caller to the correct usage patterns. They help the compiler help us to verify the correctness of our code.
An unhelpful API throws exceptions whenever we get something wrong. These kinds of APIs can cause stress and lead to bugs that are difficult to correct. There is a right way to call them, but there is also a wrong way. The wrong way still compiles, but it contains bugs nonetheless.
Some language features and patterns can help us prove the correctness of code. Parameters, callbacks, factories, and constructors are not just language constructs. They are axioms that can be used to prove interesting things. For example, they can enforce rules like these:
- You must set a property before calling this method
- You must check a condition before calling this method
- You must call this method after setting properties
- You cannot change this property after calling a method
- You must dispose this object
You must set a property before calling this method
A ShoppingService uses a Transaction to perform some basic operations. For example:
public class Transaction
{
}

public class ShoppingService
{
    // Must be set before AddToCart is called.
    public Transaction Transaction { get; set; }

    public void AddToCart(int cartId, int itemId, int quantity)
    {
    }
}

public static void Right()
{
    ShoppingService shoppingService = new ShoppingService();
    shoppingService.Transaction = new Transaction();
    shoppingService.AddToCart(1, 2, 3);
}

public static void Wrong()
{
    ShoppingService shoppingService = new ShoppingService();
    // Transaction was never set, so this call fails at run time.
    shoppingService.AddToCart(1, 2, 3);
}
ShoppingService has a Transaction property that must be set before AddToCart is called. If you forget to set it, the method throws an exception. This API is unhelpful. If the method instead takes the transaction as a parameter, the compiler enforces the rule.
public class ShoppingService
{
    public void AddToCart(Transaction transaction, int cartId, int itemId, int quantity)
    {
    }
}

public static void Right()
{
    ShoppingService shoppingService = new ShoppingService();
    shoppingService.AddToCart(new Transaction(), 1, 2, 3);
}
In this version of the code, we’ve turned the Transaction property into a method parameter. The right way of calling the method compiles. The wrong way does not.
You must check a condition before calling this method
Now let’s look at the interface for a cache. You can Add an item, Get an item, or check to see if the cache already Contains an item. There is a right way to use this API, and a couple of wrong ways.
using System;

public class Cache<TKey, TItem>
{
    // Storage is elided in this listing; only the contract matters here.
    public bool Contains(TKey key)
    {
        return false;
    }

    // Contract: throws if the key is already in the cache.
    public void Add(TKey key, TItem item)
    {
        if (Contains(key))
            throw new ApplicationException();
    }

    // Contract: throws if the key is not in the cache.
    public TItem Get(TKey key)
    {
        if (!Contains(key))
            throw new ApplicationException();
        return default(TItem);
    }
}

public static void Right()
{
    Cache<int, string> cache = new Cache<int, string>();
    int key = 42;
    string value;
    if (cache.Contains(key))
    {
        value = cache.Get(key);
    }
    else
    {
        value = LoadValue(key);
        cache.Add(key, value);
    }
}

public static void Wrong1()
{
    Cache<int, string> cache = new Cache<int, string>();
    int key = 42;
    string value;
    // Get throws for a missing key; it never returns null.
    value = cache.Get(key);
    if (value == null)
    {
        value = LoadValue(key);
        cache.Add(key, value);
    }
}

public static void Wrong2()
{
    Cache<int, string> cache = new Cache<int, string>();
    int key = 42;
    string value;
    value = LoadValue(key);
    // Add throws if the key is already cached.
    cache.Add(key, value);
}

private static string LoadValue(int key)
{
    return "the value";
}
The right way is to check the condition first. If the item is not there, load it and add it. If the item is already there, get it.
But you might be confused. Maybe you think you need to get it first, and if Get returns null you know it’s not there. That is not the contract of this class, but it is impossible to see that from the public API alone. Get will throw an exception.
You might also make the mistake of trying to add an item to the cache without first checking to see if it is there. This could be a copy/paste bug, or perhaps your code took a path that you didn’t anticipate. This is going to throw an exception, too.
Let’s refactor this code by pulling the right usage pattern into the Cache itself. Since we need to do some work right in the middle, we’ll provide a callback.
using System;

public class Cache<TKey, TItem>
{
    public bool Contains(TKey key)
    {
        return false;
    }

    // The check-then-get-or-add sequence now lives inside the class.
    public TItem GetValue(TKey key, Func<TKey, TItem> fetchValue)
    {
        TItem value;
        if (Contains(key))
        {
            value = Get(key);
        }
        else
        {
            // Cache miss: ask the caller to produce the value, then store it.
            value = fetchValue(key);
            Add(key, value);
        }
        return value;
    }

    private void Add(TKey key, TItem item)
    {
        if (Contains(key))
            throw new ApplicationException();
    }

    private TItem Get(TKey key)
    {
        if (!Contains(key))
            throw new ApplicationException();
        return default(TItem);
    }
}

public static void Right()
{
    Cache<int, string> cache = new Cache<int, string>();
    int key = 42;
    string value;
    value = cache.GetValue(key, k => LoadValue(k));
}
After moving this code into the Cache class, we can make the Add and Get methods private. This makes it impossible to use the Cache incorrectly.
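The listings above leave the storage out, so here is a minimal runnable sketch of the same callback-based design. It assumes a Dictionary as the backing store, which the original does not specify, and it makes no attempt at thread safety.

```csharp
using System;
using System.Collections.Generic;

// Sketch only: the Dictionary backing store is an assumption,
// and this version is not safe for concurrent callers.
public class Cache<TKey, TItem>
{
    private readonly Dictionary<TKey, TItem> _items = new Dictionary<TKey, TItem>();

    public bool Contains(TKey key)
    {
        return _items.ContainsKey(key);
    }

    // The only way to read the cache: look up, fetch on a miss, store, return.
    public TItem GetValue(TKey key, Func<TKey, TItem> fetchValue)
    {
        TItem value;
        if (!_items.TryGetValue(key, out value))
        {
            value = fetchValue(key);
            _items.Add(key, value);
        }
        return value;
    }
}
```

With this sketch, calling GetValue twice for the same key invokes the callback only on the first call. The second call returns the stored value.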
You must call this method after setting properties
It’s a good idea to have business objects that perform validation. It lets you respond to the user, and it prevents bad data from getting into the database. But what if you forget to call the Validate method?
using System;
using System.Text.RegularExpressions;

public class Customer
{
    // \A and \z anchor the pattern to the whole string.
    private static readonly Regex ValidPhoneNumber =
        new Regex(@"\A\([0-9]{3}\) [0-9]{3}-[0-9]{4}\z");

    public string Name { get; set; }
    public string PhoneNumber { get; set; }

    public bool Validate()
    {
        // A missing phone number is invalid rather than a NullReferenceException.
        return PhoneNumber != null && ValidPhoneNumber.IsMatch(PhoneNumber);
    }
}

public static void Right()
{
    Customer customer = new Customer()
    {
        Name = "Michael L Perry",
        PhoneNumber = "(214) 555-7909"
    };
    if (!customer.Validate())
        throw new ApplicationException();
}

public static void Wrong()
{
    // Validate is never called, so the malformed number slips through.
    Customer customer = new Customer()
    {
        Name = "Michael L Perry",
        PhoneNumber = "555-7909"
    };
}
Nothing about this API forces you to call Validate. And if you don’t, bad data can get through. The problem is that the PhoneNumber is a string – a very permissive type. We can make it a more restrictive type and use a factory method to enforce validation.
using System;
using System.Text.RegularExpressions;

public class PhoneNumber
{
    // \A and \z anchor the pattern to the whole string.
    private static readonly Regex ValidPhoneNumber =
        new Regex(@"\A\([0-9]{3}\) [0-9]{3}-[0-9]{4}\z");

    private readonly string _value;

    // Private constructor: Parse is the only way to obtain an instance.
    private PhoneNumber(string value)
    {
        _value = value;
    }

    public string Value
    {
        get { return _value; }
    }

    public static PhoneNumber Parse(string value)
    {
        if (value == null || !ValidPhoneNumber.IsMatch(value))
            throw new ApplicationException();
        return new PhoneNumber(value);
    }
}

public class Customer
{
    public string Name { get; set; }
    public PhoneNumber PhoneNumber { get; set; }
}

public static void Right()
{
    Customer customer = new Customer()
    {
        Name = "Michael L Perry",
        PhoneNumber = PhoneNumber.Parse("(214) 555-7909")
    };
}
Now we are forced to validate the string in order to get a PhoneNumber object. We can still provide feedback on user input, since that’s the time at which we will be parsing the string. But now we can’t forget.
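If the parsing step is where user feedback happens, a non-throwing variant can be convenient. The sketch below adds a hypothetical TryParse method, modeled on the familiar int.TryParse convention. It is not part of the original listing, and the whole-string anchors in the pattern are likewise an assumption about the intended format.

```csharp
using System;
using System.Text.RegularExpressions;

public class PhoneNumber
{
    // \A and \z anchor the pattern to the whole string (assumed intent).
    private static readonly Regex ValidPhoneNumber =
        new Regex(@"\A\([0-9]{3}\) [0-9]{3}-[0-9]{4}\z");

    private readonly string _value;

    // Still private: callers cannot bypass validation.
    private PhoneNumber(string value)
    {
        _value = value;
    }

    public string Value
    {
        get { return _value; }
    }

    // Hypothetical addition: reports failure with a bool instead of an exception.
    public static bool TryParse(string value, out PhoneNumber phoneNumber)
    {
        if (value != null && ValidPhoneNumber.IsMatch(value))
        {
            phoneNumber = new PhoneNumber(value);
            return true;
        }
        phoneNumber = null;
        return false;
    }
}
```

A form handler could call TryParse and show a validation message when it returns false. The guarantee is unchanged: the only way to hold a PhoneNumber is to have passed validation.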
You cannot change this property after calling a method
The .NET Connection class requires that you provide a connection string before you access any data. And it also prevents you from changing the connection string after you connect. These rules are fine. The problem is that they are enforced by a state machine behind an unhelpful API that throws exceptions if you get it wrong.
using System;

public class Connection
{
    private string _connectionString;
    private bool _connected = false;

    public string ConnectionString
    {
        get
        {
            return _connectionString;
        }
        set
        {
            // Rule: the connection string cannot change while connected.
            if (_connected)
                throw new ApplicationException();
            _connectionString = value;
        }
    }

    public void Connect()
    {
        // Rule: a connection string must be set before connecting.
        if (String.IsNullOrEmpty(_connectionString))
            throw new ApplicationException();
        _connected = true;
    }

    public void Disconnect()
    {
        _connected = false;
    }
}

public static void Right()
{
    Connection connection = new Connection();
    connection.ConnectionString = "DataSource=//MyMachine";
    connection.Connect();
    connection.Disconnect();
}

public static void Wrong1()
{
    Connection connection = new Connection();
    // No connection string was set, so Connect throws.
    connection.Connect();
    connection.Disconnect();
}

public static void Wrong2()
{
    Connection connection = new Connection();
    connection.ConnectionString = "DataSource=//MyMachine";
    connection.Connect();
    // Already connected, so the setter throws.
    connection.ConnectionString = "DataSource=//HisMachine";
    connection.Disconnect();
}
If we were to make the connection string a constructor parameter instead of a property, we wouldn’t be able to change it.
public class Connection
{
    // readonly: can only be assigned in the constructor.
    private readonly string _connectionString;

    public Connection(string connectionString)
    {
        _connectionString = connectionString;
    }

    // No setter, so the value cannot change after construction.
    public string ConnectionString
    {
        get { return _connectionString; }
    }

    public void Connect()
    {
    }

    public void Disconnect()
    {
    }
}

public static void Right()
{
    Connection connection = new Connection("DataSource=//MyMachine");
    connection.Connect();
    connection.Disconnect();
}
The .NET Connection class has a constructor that takes a connection string. But it also has a constructor that does not. The overloaded constructor and modifiable property make it possible to do the wrong thing. Rip them out and let the compiler enforce correctness for you.
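The compiler can prove that a string was supplied and that it never changes, but it cannot prove that the string is non-empty. One possible way to cover that gap, shown here as a sketch and not as part of the original design, is to check the argument once in the constructor. The guard and its message are assumptions.

```csharp
using System;

public class Connection
{
    // readonly: the compiler rejects any assignment outside the constructor.
    private readonly string _connectionString;

    public Connection(string connectionString)
    {
        // Assumed guard: fail at construction, the earliest possible point,
        // rather than later inside Connect().
        if (String.IsNullOrEmpty(connectionString))
            throw new ArgumentException("A connection string is required.", "connectionString");
        _connectionString = connectionString;
    }

    public string ConnectionString
    {
        get { return _connectionString; }
    }
}
```

This still throws at run time for an empty string, but only in one place, and every Connection that exists is known to hold a usable value.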
You must dispose this object
Let’s go back to the ShoppingService. There’s still a problem with the code. It’s possible to leak database transactions if you forget to dispose them.
using System;

public class Transaction : IDisposable
{
    public void Dispose()
    {
    }
}

public class ShoppingService
{
    public void AddToCart(Transaction transaction, int cartId, int itemId, int quantity)
    {
    }
}

public static void Right()
{
    ShoppingService shoppingService = new ShoppingService();
    // The using block disposes the transaction when it exits.
    using (Transaction transaction = new Transaction())
    {
        shoppingService.AddToCart(transaction, 1, 2, 3);
    }
}

public static void Wrong()
{
    ShoppingService shoppingService = new ShoppingService();
    // The transaction is never disposed.
    shoppingService.AddToCart(new Transaction(), 1, 2, 3);
}
The compiler doesn’t require you to dispose an object that implements IDisposable. It doesn’t even issue a warning. Some refactoring tools and static analysis tools look for these problems, but we can refactor the API to enforce it at the compiler level. We’ll use a combination of a factory and a callback to take that responsibility away from the caller.
using System;

public class TransactionFactory
{
    private readonly Func<Transaction> _factoryMethod;

    public TransactionFactory(Func<Transaction> factoryMethod)
    {
        _factoryMethod = factoryMethod;
    }

    // Creates a transaction, hands it to the callback, and always disposes it.
    public void Do(Action<Transaction> action)
    {
        using (var transaction = _factoryMethod())
        {
            action(transaction);
        }
    }
}

public static void Right(TransactionFactory transactionFactory)
{
    ShoppingService shoppingService = new ShoppingService();
    transactionFactory.Do(transaction =>
    {
        shoppingService.AddToCart(transaction, 1, 2, 3);
    });
}
The caller receives a TransactionFactory rather than creating a Transaction directly. The factory doesn’t just ensure that the Transaction is created properly; it also ensures that it is disposed of properly.
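To see that guarantee in action, the sketch below repeats the factory with one hypothetical addition: an IsDisposed flag on Transaction, which exists only so that disposal can be observed. Because Do wraps the callback in a using block, the transaction is disposed even when the callback throws.

```csharp
using System;

public class Transaction : IDisposable
{
    // Hypothetical flag, added only so this sketch can observe disposal.
    public bool IsDisposed { get; private set; }

    public void Dispose()
    {
        IsDisposed = true;
    }
}

public class TransactionFactory
{
    private readonly Func<Transaction> _factoryMethod;

    public TransactionFactory(Func<Transaction> factoryMethod)
    {
        _factoryMethod = factoryMethod;
    }

    public void Do(Action<Transaction> action)
    {
        // using compiles to try/finally, so Dispose runs on success and on failure.
        using (var transaction = _factoryMethod())
        {
            action(transaction);
        }
    }
}
```

One limit is worth keeping in mind: a callback can still copy the transaction into an outer variable and touch it after Do returns. The pattern protects against forgetting to dispose, not against deliberately holding on to the object.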
It doesn’t require any special tools to prove that an API is properly used. All it takes is a little forethought to turn an unhelpful API that buzzes and throws exceptions into a helpful, provable API.
For a more in-depth look at techniques for writing provable code, please see Michael's course on Pluralsight.