There are lots of texts on object design and programming, but very few on how to use objects and why to use them in a particular way. This paper briefly reviews the benefits of object-oriented programming, then suggests a specific strategy for maximizing those benefits. It illustrates the strategy by showing its use in the Factory patterns at the heart of object-oriented design.
The Problem of Change
Software projects begin with simple, clear visions. But as a project progresses, it turns into a swamp, and that giant sucking sound you hear is the sound of a minor change cascading through the project. Even the smallest change breaks objects located in distant parts of the application, and fixing those bugs breaks other objects in other parts of the application. Past some point of critical mass, the application becomes stratified and calcified. It is so brittle that the smallest change can cause the entire structure to come crashing to the ground, like a glass house in a hurricane. Or, in this case, a light breeze.
The entire point of object design is to localize change, to make an application easier to build and maintain. In traditional procedure-oriented design, the application is a single monolithic object with dozens of attributes and operations. These operations interact with the application's data, and with each other, without restriction. We can localize effects to some extent by encapsulating operations and variables in subroutines, but what we have, in effect, is a large roomful of people, milling about and handing messages to each other.
It's as if we are the president of a large company that is not divided into departments. When we deal with any part of the application, we are, in effect, dealing with all of it. The term 'spaghetti code' was invented to describe this phenomenon.
Organizing an Application
A good design will 'departmentalize' our application. Attributes and operations can be grouped into responsibilities, and responsibilities can be assigned to objects, the way that they are assigned to employees. And objects can be assigned to modules, the way that employees are assigned to departments. Our application begins to take on a structure similar to the organization chart of a business.
So far, we haven't described anything that could not be accomplished in a good procedure-oriented design. Where object design begins to really differ is in how it lets us encapsulate attributes and operations. We can seal them off and compile them separately. By making certain attributes and operations public (visible to other parts of the application) and others private (not visible to other parts), we can limit the access of the other parts of the application. They can use the modules only in the ways allowed by the public attributes and operations, which form the module's interface. The rest of the module's attributes and operations, its internal workings, are hidden from the code using the module.
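As a minimal sketch of this idea in Java, consider the class below. The name QuoteCache and all of its members are hypothetical, chosen only for illustration. The public operations form the interface; the private attribute and helper are internal workings that calling code cannot reach.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical module: only its public operations are visible to callers.
class QuoteCache {
    // Private attribute: internal storage, hidden from the rest of the application.
    private final Map<String, Double> prices = new HashMap<>();

    // Public operation: part of the module's interface.
    public void store(String symbol, double price) {
        prices.put(normalize(symbol), price);
    }

    // Public operation: part of the module's interface.
    public boolean contains(String symbol) {
        return prices.containsKey(normalize(symbol));
    }

    // Private operation: an internal detail that can change without
    // affecting any code that uses the class.
    private String normalize(String symbol) {
        return symbol.trim().toUpperCase(Locale.ROOT);
    }
}
```

Code elsewhere in the application can call store and contains, but an attempt to touch prices or normalize from outside the class is rejected by the compiler.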
So, the first thing we want to do is organize our application so as to localize change. Consider the following objects:
We have divided our monolithic, procedure-based application into an object structure, and we have assigned appropriate responsibilities to each object. The responsibilities are implemented by attributes and operations within each object, and each object provides an interface that allows the other objects to use the services it provides.
Under this arrangement, we can address each object more or less separately. If a change is required in one of Object C's operations, we open that object and make changes as needed. If we have designed our objects well, the impact of changes will be more or less localized to Object C. Instead of having to deal with the entire company, as it were, we are able to deal with a single department.
The Problem of Dependency
You may have noticed the use of some 'weasel words' in the preceding description. Specifically, the phrase 'more or less' turns up at key places. That's because the picture is not quite as rosy as we have painted it.
To use Object C, Object B must contain a reference to it. The reference is established when the application is compiled. That means that if Object C is recompiled, then Object B must be recompiled, too. And, since Object A has a reference to Object B, it must be recompiled, too. So, a change to Object C isn't entirely localized to that object. We have provided some localization, but minor changes still have global effects.
As a practical matter, there are more serious ramifications, as well. When we change Object C, we may very well change its interface, those attributes and operations it exposes to the rest of the application. If we do so, we will break Object B—it will try to use Object C with the old interface, causing a crash. So, when we change Object C, we may well have to change Object B as well. And that change may force a change to Object A. We may have reduced the problem of cascading changes, but we haven't quite solved it.
Changes and Extensions
There are two types of changes that we make to a design—bug fixes, and extensions. So far, we have been talking about bug fixes. An extension is a change in the requirements of the application. For example, let's say we are designing an application to retrieve stock quotes. We have designed an object (Object C) that will call a web page, retrieve the quotes for a specified stock from that page, and return the result to Object B. We have tested and compiled, and everything works well. Time for lunch.
After lunch, the project manager informs us that the application is very popular. In fact, it's so popular that users want to be able to use it to retrieve bond quotes, as well. There goes dinner—we will certainly be working late tonight.
Take a moment to consider what we have been asked to do:
- There is nothing wrong with the original program—we are being asked to extend the original program, rather than fix it.
- There is a certain commonality between what we have (a stock quote) and what we need (a bond quote). Both items are securities quotes.
- What needs to be done (i.e., loading a web page, getting a quote, and returning it) isn't changing. The change involves how it is to be done (different web page, different format for the quote).
- If users want to add bonds today, they are likely to want to add something else (like pork bellies) tomorrow. We can expect further extensions later.
Obviously, we're going to have to modify Object C, and its changes are going to require retesting and recompiling the entire application. Is there any way we can make the changes so that we don't have to repeat this exercise every time someone wants to add a new type of quote?
There is a way to do that, and its key is the last phrase of the last sentence. What we have, and what we are adding, are both types of quotes. And what we anticipate having to add in the future are additional types of quotes. We can make the application very extensible if we can give it some type of socket that any type of quote can be plugged into. And that's where object design departs completely from traditional procedure-oriented design.
Object-oriented programming languages provide a feature called inheritance. If we have several types of anything, we can factor out the common elements and define them in a base class. A class is simply a template we use to create an object. It's the same as a metal-stamping press in an automobile factory. Just as the stamping-press can churn out car hoods all day long, a class can be used to churn out instances of a particular object.
A base class specifies the attributes and operations a particular type of object must have. In our example, let's create a base class called SecurityQuote. Once we have created a base class, we can derive other classes from it. For example, we can derive a StockQuote class, a BondQuote class, and later, a CommoditiesQuote class. The classes that we derive from our base class will inherit its attributes and operations. Here is how an inheritance relationship is typically shown:
We have renamed Object C 'SecurityQuote', and we have derived three new classes from it: StockQuote, BondQuote, and CommoditiesQuote.
One of the rules of object programming is that a derived class must implement the attributes and operations of its base class. The implementation may be inherited from the base class, or it may be set out in the derived class. The base class typically implements any attributes and operations (such as, say, connecting to the Internet) that would be the same in all classes. The base class specifies, but does not implement, attributes and operations that are implemented differently in each derived class. For example, all of the derived classes need to fetch a quote. But since the types of quotes are different, the mechanics of quote fetching will vary for each derived class. So, the base class will define 'FetchQuote' as an operation, but each class will implement the operation separately.
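The arrangement just described might be sketched in Java roughly as follows. This is an illustration under assumptions, not the paper's actual code: the method is spelled fetchQuote to follow Java naming conventions, the connect helper and the returned strings are invented placeholders, and no real web access takes place.

```java
// Base class: implements what is common, specifies what varies.
abstract class SecurityQuote {
    // Shared implementation, inherited by every derived class.
    // A placeholder for real work such as connecting to the Internet.
    protected String connect(String source) {
        return "connected to " + source;
    }

    // Specified here, but implemented separately in each derived class.
    public abstract String fetchQuote(String symbol);
}

class StockQuote extends SecurityQuote {
    @Override
    public String fetchQuote(String symbol) {
        return connect("stock page") + ": stock quote for " + symbol;
    }
}

class BondQuote extends SecurityQuote {
    @Override
    public String fetchQuote(String symbol) {
        return connect("bond page") + ": bond quote for " + symbol;
    }
}
```

Code that holds a SecurityQuote reference can call fetchQuote without knowing which derived class it is talking to.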
This structure separates what needs to be done from how it needs to be done. When we extend an object, we generally do not change what needs to be done, we only change how it is to be done. In our example, when we add a new type of quote, we still perform the same operations—we connect to a web page, fetch a quote, and return it to the application. But how we perform those operations differs, in a manner appropriate to the new type of quote we are adding.
Stated a bit differently, a change in what we do involves a change in the interface of an object, while a change in how we do it only affects its implementation. This distinction has important ramifications for object design.
One of these ramifications concerns a feature of inheritance that we haven't discussed yet: When I compile a derived class, I don't have to recompile its base class. The base class, and any other class that depends on it, are insulated from the change. And that gives us two key advantages:
- If I add a new derived class, I don't have to retest or recompile the existing classes. In our example, if users do request commodities quotes next week, then all I have to do is derive a new CommoditiesQuote class, as shown in the diagram above. I only need to test and compile the CommoditiesQuote class. All of the other classes are untouched.
- If I change an existing derived class, I only have to retest and recompile that class. In our example, if users discover a bug in the bond quote section of the application, then only the BondQuote class is affected. Once it is fixed, I recompile only that class. The rest of the application is unaffected.
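To make the first advantage concrete, here is a small Java sketch. The QuoteReport client and the returned strings are hypothetical. Everything above the last class stands for code that already exists; adding commodities quotes means writing only the final class.

```java
// Existing code, unchanged when a new quote type is added.
abstract class SecurityQuote {
    public abstract String fetchQuote(String symbol);
}

class BondQuote extends SecurityQuote {
    @Override
    public String fetchQuote(String symbol) {
        return "bond quote for " + symbol;
    }
}

// Existing client, written against the base class only.
class QuoteReport {
    static String line(SecurityQuote quote, String symbol) {
        return "[" + quote.fetchQuote(symbol) + "]";
    }
}

// The only new code: a class derived for the new requirement.
class CommoditiesQuote extends SecurityQuote {
    @Override
    public String fetchQuote(String symbol) {
        return "commodities quote for " + symbol;
    }
}
```

QuoteReport.line accepts the new CommoditiesQuote object without any modification, because it depends only on SecurityQuote.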
In short, I have truly isolated the impact of changes to specific objects. I have eliminated the cascading effects that suck me down into the swamp. It's the buffering effect of an inheritance relationship that makes inheritance such a powerful technique. It gives me the ability to set up firewalls within my design to localize changes and improve the extensibility of the application.
This buffering capability makes inheritance attractive even in situations where I expect to derive only a single class. If it turns out that I am wrong, and need to add additional types, it is a simple matter to derive new classes for the types that I need.
The buffering power of inheritance stems from the way that it separates interface from implementation, the what from the how. Using that capability is a key strategy involved in many of the software patterns that are used regularly to solve design problems.
For example, I have two objects, Object B and Object C. Object B is pretty stable; I know what it needs to do and how it needs to do it. It needs the services of Object C to do its job, and I can specify which services it will need from Object C. The services that Object B needs from Object C, and how those services are specified, are probably not going to change. But how Object C will deliver those services is a different matter.
I know what services Object C will deliver, but I don't have a clue yet how it's going to deliver them. There's going to be a fair amount of trial and error involved in figuring out Object C's internal workings; its implementation of its services. Clearly, if Object B calls Object C directly, I'm going to be doing a lot of recompiling of Object B and any other object that depends on Object B.
In light of this, it would be wise to create a base class that specifies the services that Object B requires (which we will call Interface C), then derive another class to implement the services (Object C). The base class won't contain any implementation code; all of that is going to be in the derived class. In fact, the base class will only contain a specification of the interface for Object C's services. The resulting design looks like this:
Interface C is what is referred to as an abstract class. It contains no implementation code, only interface specification. All of the attributes and operations it specifies are implemented in Object C.
Note how effectively we have separated interface from implementation. So long as what needs to be done does not change, Object B is insulated from changes to Object C. We can recompile Object C a dozen times as we figure out how it is to deliver its services, and we will never have to recompile, retest or change Object B. We have bound Object B to an abstract interface, rather than to another concrete object.
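A rough Java rendering of this design is shown below. It assumes a Java interface as the stand-in for the abstract class, since an interface is Java's usual way to express a specification with no implementation code. The operation name performService and the returned strings are invented for the example.

```java
// Specification only: what Object B needs, with no implementation code.
interface InterfaceC {
    String performService(String request);
}

// Object B is bound to the abstract interface, never to ObjectC itself.
class ObjectB {
    private final InterfaceC service;

    ObjectB(InterfaceC service) {
        this.service = service;
    }

    String doWork(String request) {
        return "B received: " + service.performService(request);
    }
}

// The volatile implementation. It can be rewritten freely as long as
// the interface stays the same.
class ObjectC implements InterfaceC {
    @Override
    public String performService(String request) {
        return "C handled " + request;
    }
}
```

ObjectB never mentions ObjectC by name. Any object that satisfies InterfaceC can be handed to its constructor, which is what keeps ObjectB stable while ObjectC is reworked.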
There is a final aspect of abstract interfaces that is worth noting. Even though Object C is derived from Interface C, that interface's specifications are not driven by the needs of Object C. Instead, those specifications are driven by the needs of Object B, which define the services that Object C must provide. Thus, Interface C is more closely related to Object B than to Object C, even though Object C is derived from it.
Abstract interfaces form the foundation of many of the patterns used in object design. A design pattern is simply a group of classes used in a specific way to solve a design problem. The Factory patterns (Factory Method and Abstract Factory) illustrate how abstraction is used to localize changes and increase the overall flexibility of an application.
In the preceding discussion, we have assumed that objects simply appear, like Venus rising from the foam. That's not the case, of course. Objects have to be created, or instantiated. So let's consider the simple case of instantiating a new object from a class. Object B needs to use another object, Object C. So, object B creates a new object from Class C. The code will look something like this:
ObjectC = new ClassC();
After this statement is executed, Object B will hold a reference to Object C:
As the diagram indicates, we have created a dependency between Object B and Object C. If Object C changes, we will have to retest and recompile Object B and any other object that depends on Object B. So, the simple act of creating an object starts us down the road that leads to the swamp.
In many cases, there is more to creating an object than simply instantiating it using the new keyword. Attributes must be set and operations must be called, in order to initialize the object. As a result, the process of creating an object can produce a number of couplings between the creator and the object. Any change to the object can run afoul of these couplings, breaking the creator.
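The following Java sketch illustrates how these couplings accumulate. The setSource and open members are hypothetical, made up only to stand for typical initialization steps.

```java
// Hypothetical concrete class with a multi-step setup.
class ClassC {
    private String source;
    private boolean ready;

    void setSource(String source) {
        this.source = source;
    }

    void open() {
        ready = (source != null);
    }

    boolean isReady() {
        return ready;
    }
}

class ObjectB {
    // Each statement below is a separate coupling to ClassC: its name,
    // its constructor, its setter, and its initialization call.
    ClassC createC() {
        ClassC objectC = new ClassC();
        objectC.setSource("quotes page");
        objectC.open();
        return objectC;
    }
}
```

Renaming setSource, adding a constructor argument, or changing the order in which the setup calls must be made would each force a change to ObjectB.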
What we would like to be able to do is create an object and hold a reference to that object, without depending on it. We saw above how we can use an abstract class as a buffer between an object and another object to which it holds a reference. The Factory Method pattern uses a similar approach to instantiate an object without creating a dependency upon it.
The Factory Method Pattern
The Factory Method pattern uses an abstract class called a 'factory' to instantiate new objects. In its simplest form, the pattern looks like this:
The diagram is similar to the previous one, except that we have added two classes on the left, Abstract Factory and Concrete Factory. Since the Abstract Factory class is an abstract interface, Object B doesn't have to know anything about the process of creating Object C. And since it holds a reference to Interface C, it doesn't even need to know anything about Object C. It is insulated from changes to Object C, so long as those changes do not change the abstract interfaces to which Object B is bound.
To illustrate the pattern, let's go back to our previous Security Quote example. We can use the Factory Method Pattern to create new stock and bond quote objects as follows:
Object B is bound to the Quote Factory and Security Quote interfaces, rather than the concrete implementations of those classes. Therefore, either Stock Quote or Bond Quote, or both, or even the way they are created, could change without affecting Object B. We have insulated Object B from changes in the security quote objects and how they are created.
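One possible reading of this arrangement, sketched in Java, appears below. Several details are assumptions rather than something the paper specifies: the kind parameter used to choose between stock and bond quotes, the class name ConcreteQuoteFactory, and the placeholder strings returned by the quote classes.

```java
// Abstract product.
interface SecurityQuote {
    String fetchQuote(String symbol);
}

// Abstract factory: the only creation interface Object B ever sees.
interface QuoteFactory {
    SecurityQuote createQuote(String kind);
}

class StockQuote implements SecurityQuote {
    @Override
    public String fetchQuote(String symbol) {
        return "stock quote for " + symbol;
    }
}

class BondQuote implements SecurityQuote {
    @Override
    public String fetchQuote(String symbol) {
        return "bond quote for " + symbol;
    }
}

// Concrete factory: the single place that names the concrete classes.
class ConcreteQuoteFactory implements QuoteFactory {
    @Override
    public SecurityQuote createQuote(String kind) {
        switch (kind) {
            case "stock":
                return new StockQuote();
            case "bond":
                return new BondQuote();
            default:
                throw new IllegalArgumentException("Unknown quote kind: " + kind);
        }
    }
}

// Client: depends on QuoteFactory and SecurityQuote, nothing else.
class ObjectB {
    private final QuoteFactory factory;

    ObjectB(QuoteFactory factory) {
        this.factory = factory;
    }

    String quoteFor(String kind, String symbol) {
        SecurityQuote quote = factory.createQuote(kind);
        return quote.fetchQuote(symbol);
    }
}
```

In this sketch, the names StockQuote and BondQuote appear only inside the concrete factory, so changes to those classes or to their setup do not reach ObjectB.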
Another Variation of the Factory Pattern
In the preceding example, we assumed that we used a factory to create several different types of objects (security quotes) derived from the same abstract product. We can just as easily use a Factory Pattern to create objects derived from different abstract products:
In this variation of the pattern, a single factory can create two different objects that conform to different interfaces. This variation shows how a factory can centralize object creation for a subsystem or an entire application. Any client object that needs to create and use another object (the product) can simply go to the factory and get the object needed. The factory insulates the client from changes to the product or how it is created, and it can provide this insulation across objects derived from very different abstract interfaces.
Let's extend our previous security quote example to illustrate this variation of the Factory pattern. Let's say that our application needs to provide not only security quotes, but Holdings objects as well. The Holdings objects will know how many shares of a particular stock we own, when we bought it, and so on.
A factory to generate both Stock Quote objects and Holdings objects will follow the pattern shown in the preceding diagram. The factory looks like this:
When Object B needs either a new Holdings object or a new Quote object, it requests one from the factory, using the Abstract Factory interface. The Securities Factory object creates the requested object as a derivative of the appropriate abstract product and returns a reference to the new object to Object B. Object B is completely oblivious to the details of the factory and the new object, since it uses only abstract interfaces. As a result, it is insulated from changes to either the concrete factory or any of the concrete products.
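A hedged Java sketch of such a factory follows. The method names, the StockHoldings class, and the share count for the made-up symbol ACME are all illustrative assumptions; a real Holdings object would also carry purchase dates and similar data.

```java
// Two unrelated abstract products.
interface Quote {
    String fetchQuote(String symbol);
}

interface Holdings {
    int sharesOwned(String symbol);
}

// Abstract factory: one creation operation per abstract product.
interface AbstractFactory {
    Quote createQuote();
    Holdings createHoldings();
}

class StockQuote implements Quote {
    @Override
    public String fetchQuote(String symbol) {
        return "stock quote for " + symbol;
    }
}

class StockHoldings implements Holdings {
    @Override
    public int sharesOwned(String symbol) {
        // Made-up placeholder data for the example.
        return "ACME".equals(symbol) ? 100 : 0;
    }
}

// Concrete factory: creates both kinds of product.
class SecuritiesFactory implements AbstractFactory {
    @Override
    public Quote createQuote() {
        return new StockQuote();
    }

    @Override
    public Holdings createHoldings() {
        return new StockHoldings();
    }
}

// Client: sees only the three abstract interfaces.
class ObjectB {
    private final AbstractFactory factory;

    ObjectB(AbstractFactory factory) {
        this.factory = factory;
    }

    String summary(String symbol) {
        Quote quote = factory.createQuote();
        Holdings holdings = factory.createHoldings();
        return quote.fetchQuote(symbol) + ", shares owned: " + holdings.sharesOwned(symbol);
    }
}
```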
Note that the Factory pattern insulates the client from implementation changes (changes in how things are done) to the factory and its products. Any changes to the interface (what needs to be done) of either the factory or any of its products will cascade through to Object B. In other words, the Factory pattern assumes stable interfaces with volatile implementations. If the interface of the factory or its products is not stable, then the Factory pattern, like any pattern based on abstract interface buffers, will not provide any meaningful benefits.
The Abstract Factory Pattern
Now let's put the two variations of the Factory pattern together. And while we are at it, let's add a new twist. Note that in the examples we have looked at so far, we have derived a single concrete factory class from our abstract factory interface. There is no reason we cannot derive multiple concrete factories from the same abstract interface:
For example, let's assume we need to provide stock holdings and stock quotes for American securities and for foreign securities. Both factories will be derived from the same Abstract Factory class:
Turning to the product side, our application will use the same Holdings interface and Quotes interface we used in the last example:
Now, let's put the two sides together, to see the entire pattern:
We have created a very flexible factory that resembles a metal-stamping machine. In an automobile factory, a stamping machine can be switched from stamping car doors to stamping trunk lids by changing the dies that are mounted on the machine. In the same way, we can change our factory from creating US Securities objects to Foreign Securities objects by changing the concrete factory that we 'plug in' to the Abstract Factory 'socket'. By changing the factory, we change not just a single product, but the entire family of products created by the factory.
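The complete pattern might look something like the Java sketch below. All class and method names beyond Abstract Factory, Holdings, and Quote are assumptions made for the example, and the product classes return placeholder strings instead of real data.

```java
// Abstract products.
interface Quote {
    String fetchQuote(String symbol);
}

interface Holdings {
    String describe(String symbol);
}

// The 'socket' that any concrete factory can be plugged into.
interface AbstractFactory {
    Quote createQuote();
    Holdings createHoldings();
}

// Product family for US securities.
class UsQuote implements Quote {
    @Override
    public String fetchQuote(String symbol) { return "US quote for " + symbol; }
}

class UsHoldings implements Holdings {
    @Override
    public String describe(String symbol) { return "US holdings of " + symbol; }
}

// Product family for foreign securities.
class ForeignQuote implements Quote {
    @Override
    public String fetchQuote(String symbol) { return "foreign quote for " + symbol; }
}

class ForeignHoldings implements Holdings {
    @Override
    public String describe(String symbol) { return "foreign holdings of " + symbol; }
}

// One concrete factory per family.
class UsSecuritiesFactory implements AbstractFactory {
    @Override
    public Quote createQuote() { return new UsQuote(); }

    @Override
    public Holdings createHoldings() { return new UsHoldings(); }
}

class ForeignSecuritiesFactory implements AbstractFactory {
    @Override
    public Quote createQuote() { return new ForeignQuote(); }

    @Override
    public Holdings createHoldings() { return new ForeignHoldings(); }
}

// Client: the family it receives depends only on the factory plugged in.
class ObjectB {
    private final AbstractFactory factory;

    ObjectB(AbstractFactory factory) {
        this.factory = factory;
    }

    String report(String symbol) {
        return factory.createQuote().fetchQuote(symbol)
                + "; " + factory.createHoldings().describe(symbol);
    }
}
```

Constructing ObjectB with a ForeignSecuritiesFactory instead of a UsSecuritiesFactory switches both products at once, without touching ObjectB.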
The Abstract Factory pattern is one of the most complex of the 'Gang of Four' patterns, and one of the most difficult to understand. But it relies on the same technique of abstract interface buffering that we have seen in all of the examples presented in this paper. That technique insulates client objects from the objects that they use. By reducing the coupling between these objects, abstract-interface-buffering increases the flexibility of an application, by localizing the effects of changes to volatile objects, and by providing 'sockets' by means of which key objects in the application can be extended to provide additional functionality.
 The Factory pattern is one of the patterns contained in Gamma et al., Design Patterns (Addison-Wesley, 1995). A variation of this pattern is contained in Martin, Agile Software Development (Prentice Hall, 2002). These texts are the primary source references for this paper.
 The triangle at the top of the line points to the base class, and branches of the line run to all derived classes. This triangle arrowhead is only used to specify inheritance relationships.
 The fact that InterfaceC's name is italicized indicates that it is an abstract class. That is a standard notation.