
Refactoring Without Starting Over

5 Jul 2007

Scope of the problem

Imagine this: for some odd reason, you end up with a big pile of spaghetti code (also known as legacy code) and a feature request to radically extend its functionality. How do you go about this? There are at least three obvious approaches to the problem:

  1. Deny the request.
  2. Hack the legacy code to cope with the request.
  3. Refactor the legacy code to meet new coding standards.

Cases 1 and 2 have an immediate short-term effect. In case 1, no money is earned and the customer might be lost for good. In case 2, the money is safe, but further development and maintenance will become awfully painful over time. Case 3 is the ideal solution, and would be the choice of most developers and the worst-case scenario of most CEOs: it has a radical short-term economic impact, and could drag the development process on for ages.

However, there is also a fourth option:

  4. Turn the legacy "interface" into a combination of Facade and Adapter patterns.

This option leaves most of the legacy code base intact and only introduces a lightweight abstraction layer. This may sound fuzzy at the moment, but in the remainder of this article, I will show how to refactor legacy code without rewriting it.

Going back to case 2, where we would change the code base in place to suit the new requirements, changes are introduced both at the location of the new feature code and at the location of all the calling applications. This is a cumbersome solution, and is likely to introduce bugs in portions of the program that used to work flawlessly. The method described in this article, by contrast, implements new features without tampering with the legacy interface. This means that all calling applications remain unmodified, while gaining access to the new feature code hidden behind a facade.

Design Patterns

First, what is the Facade pattern? Wikipedia gives the following definition:

In computer programming, a facade is an object that provides a simplified interface to a larger body of code, such as a class library. A facade can:

  • Make a software library easier to use and understand, since the facade has convenient methods for common tasks.
  • Make code that uses the library more readable, for the same reason.
  • Reduce dependencies of outside code on the inner workings of a library, since most code uses the facade, thus allowing more flexibility in developing the system.
  • Wrap a poorly designed collection of APIs with a single well-designed API.
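As a minimal sketch of the idea in plain C, the facade below offers one convenient call for a common task and hides two lower-level helpers behind it. All names here (`legacy_skip_leading_blanks`, `legacy_upper_in_place`, `text_normalize`) are hypothetical and only serve the illustration.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical legacy helper: returns the number of leading blanks in s. */
static size_t legacy_skip_leading_blanks(const char *s)
{
  size_t i = 0;
  while (s[i] != '\0' && isspace((unsigned char)s[i]))
    i++;
  return i;
}

/* Hypothetical legacy helper: uppercases the first n bytes of s. */
static void legacy_upper_in_place(char *s, size_t n)
{
  for (size_t i = 0; i < n; i++)
    s[i] = (char)toupper((unsigned char)s[i]);
}

/* Facade: one call for the common task "trim left, then uppercase".
   Returns the length of the normalized string. */
size_t text_normalize(char *s)
{
  size_t skip = legacy_skip_leading_blanks(s);
  size_t len = strlen(s + skip);

  memmove(s, s + skip, len + 1);  /* +1 copies the terminator as well */
  legacy_upper_in_place(s, len);
  return len;
}
```

Callers only see `text_normalize`; the helpers behind it can be replaced or reordered without touching the calling code.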

Second, what is the Adapter pattern? Again, Wikipedia tells us:

In computer programming, the Adapter design pattern (sometimes referred to as the wrapper pattern, or simply a wrapper) 'adapts' one interface for a class into one that a client expects. An adapter allows classes to work together that normally could not because of incompatible interfaces, by wrapping its own interface around that of an already existing class.
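In C, an adapter can be as small as one function that translates arguments and return types. The sketch below assumes a hypothetical existing function, `legacy_count_char`, that wants an explicit buffer length, and a client that expects to pass a NUL-terminated string and receive a `size_t`.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical existing function: works on a buffer with an explicit length. */
static int legacy_count_char(const char *buf, int len, char wanted)
{
  int hits = 0;
  for (int i = 0; i < len; i++)
    if (buf[i] == wanted)
      hits++;
  return hits;
}

/* Interface the client expects: NUL-terminated input, size_t result.
   The adapter converts the arguments and the result, then delegates. */
size_t count_char(const char *text, char wanted)
{
  return (size_t)legacy_count_char(text, (int)strlen(text), wanted);
}
```

The legacy function is not modified; only the thin wrapper knows about both calling conventions.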

These two structural patterns are by definition generic, and can be applied to any code. They are most easily applied while designing and implementing a component, not while adding features to an existing one. Many programmers have to deal with source code written before design patterns became common, so no patterns have been intentionally applied. Introducing or identifying such patterns often requires a costly rewrite or a major refactoring of the code base. In the following section, we will discuss possible ways of refactoring at minimum cost.


Introducing the patterns

The first step in preparing the legacy code base for the new feature is to identify all feasible entry points. Looking at a legacy code base, there are two basic ways in which clients interact with the code:

1. Multiple clients, one entry point


Figure 1 (multiple clients, one entry point) demonstrates the simplest scenario: a code base with a (more or less) well-defined interface. The interface may consist of a range of free functions, or be centralized in a common class. In both cases, the code structure already holds a variant of the Facade pattern and is ready for modifications.

2. Multiple clients and multiple entry points

Figure 2 (multiple clients and multiple entry points) shows how a range of clients may interact with a shared component through many entry points. This is the difficult scenario, and the following tasks must be carried out:

  1. Determine the entry points. The linker can help here: remove the legacy object files from the linker inputs, and the unresolved symbols it reports are the entry points the clients use.
  2. Decide between the following solutions:
    a. If the entry points in the legacy code are close enough, move them to a common location (perhaps even a common class).
    b. If the gap between the entry points is too large, determine the possible side effects of modifying the underlying code for the entry points, and isolate the separate interfaces.

A primitive example of case 2.a could be a set of free functions for string operations whose implementation is spread across the code base. Moving the interface and implementation to a common location gives the entry points a common interface. It also makes it possible to modify the underlying code in a central place while keeping the interface intact.

An example of case 2.b could be a set of free functions for string operations and a set of free functions for database access. These are logically too far apart, and would ideally be split into two separate interfaces.

Having identified the entry points and interfaces to the legacy code, we should reconsider the interfaces and possibly update them. It makes perfect sense to introduce incremental "face lifts" in the source code, i.e., to refactor the interfaces once in a while to keep them in sync with their usage. In the example of case 2.b, it may not be possible to separate the two chunks, in which case an adapter might come in handy.

Adding the new feature

Having the legacy code and its interfaces prepared for the new feature, we will now have a look at how the feature could be introduced.

The above figure shows a UML diagram of the expected structure. The Common Interface is the interface entry point introduced in the previous section of this article. To abstract the code beneath this point, we introduce an adapter that interprets the common interface and forwards requests to the underlying implementation. The adapter holds references to, or instances of, the legacy code and the new feature code. The mechanism for alternating between the legacy code and the new feature code is placed in the adapter. It may be necessary to extend the existing data structures to record the origin of a value, i.e., whether it originates from legacy code or from new feature code.

Code before:

C
#include <stddef.h>  /* size_t */

typedef struct
{
  char *text;
  size_t length;
} data_t;

void str_analyze(data_t *data)
{
  /* Put the original analysis code here. */
  (void)data;
}

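Code after:

C
#include <stddef.h>  /* size_t */

/* Origin tags used by the dispatch below; the concrete values are an assumption. */
enum { LEGACY, FEATURE };

typedef struct
{
  char *text;
  size_t length;
  unsigned char origin;  /* LEGACY or FEATURE */
} data_t;

void str_analyze_old(data_t *data);  /* the original str_analyze body, renamed */
void str_analyze_new(data_t *data);  /* the new feature code */

void str_analyze(data_t *data)
{
  switch (data->origin)
  {
  case LEGACY:
    str_analyze_old(data);
    break;
  case FEATURE:
    str_analyze_new(data);
    break;
  default:
    break;  /* unknown origin: do nothing */
  }
}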

The previous code samples illustrate how the alternating adapter could be added to handle legacy code alongside new feature code. Note that the data structure has been extended with an origin variable, and that the function retains its original interface. Here, the function "str_analyze" acts as an adapter, since it translates incoming requests, but also as a facade, since it is in charge of delegating the work.
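One possible variation, sketched here under assumptions, replaces the switch with a table of function pointers so that the adapter literally holds references to both implementations. The bodies of `str_analyze_old` and `str_analyze_new` are hypothetical stand-ins (the legacy one stores the raw length, the new one ignores whitespace), and `ORIGIN_COUNT` and `analyzers` are names invented for this sketch.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

enum { LEGACY = 0, FEATURE = 1, ORIGIN_COUNT = 2 };

typedef struct
{
  char *text;
  size_t length;
  unsigned char origin;
} data_t;

/* Hypothetical legacy analysis: length is the raw byte count. */
static void str_analyze_old(data_t *data)
{
  data->length = strlen(data->text);
}

/* Hypothetical new feature: length counts non-whitespace bytes only. */
static void str_analyze_new(data_t *data)
{
  size_t visible = 0;
  for (const char *p = data->text; *p != '\0'; p++)
    if (!isspace((unsigned char)*p))
      visible++;
  data->length = visible;
}

typedef void (*analyze_fn)(data_t *);

/* The adapter's references to both implementations, indexed by origin. */
static const analyze_fn analyzers[ORIGIN_COUNT] = {
  [LEGACY] = str_analyze_old,
  [FEATURE] = str_analyze_new,
};

/* Same signature as before; an unknown origin leaves the data untouched. */
void str_analyze(data_t *data)
{
  if (data == NULL || data->text == NULL || data->origin >= ORIGIN_COUNT)
    return;
  analyzers[data->origin](data);
}
```

Adding a third implementation then means adding one enum value and one table entry, while callers of `str_analyze` stay unchanged.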

What if ...

What if not all of the legacy code should be updated for the newly added feature? Working with a large legacy code base, we are bound to have many generic functions, e.g., reading the contents of a file into a string, or similar common functions.

In the previous section, we introduced an adapter layer to handle incoming requests. Modifying this as shown in Figure 4 lets a request pass straight through to the legacy code while still keeping the code open for future implementations. This is illustrated in the code samples below.

Code before:

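C
#include <stddef.h>  /* size_t, NULL */

typedef struct
{
  char *text;
  size_t length;
} data_t;

data_t *str_readf(const char *filename)
{
  data_t *data = NULL;
  /* Put the original code here: read the file into a newly allocated data_t. */
  (void)filename;
  return data;
}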

Code after:

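C
#include <stddef.h>  /* size_t */

typedef struct
{
  char *text;
  size_t length;
  unsigned char origin;
} data_t;

/* The original str_readf body, renamed but otherwise untouched. */
data_t *str_readf_old(const char *filename);

data_t *str_readf(const char *filename)
{
  /* No new feature here yet: pass the request straight through. */
  return str_readf_old(filename);
}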
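One detail the pass-through leaves open is who sets the new origin field, since the untouched legacy reader does not know about it. A possible approach, sketched here as an assumption rather than as part of the original design, is to tag the result in the wrapper. The body of `str_readf_old` and the helper `str_free` are hypothetical stand-ins.

```c
#include <stdio.h>
#include <stdlib.h>

enum { LEGACY = 0, FEATURE = 1 };

typedef struct
{
  char *text;
  size_t length;
  unsigned char origin;
} data_t;

/* Stand-in for the untouched legacy reader; it never sets origin. */
static data_t *str_readf_old(const char *filename)
{
  FILE *fp = fopen(filename, "rb");
  if (fp == NULL)
    return NULL;

  size_t capacity = 64, length = 0;
  data_t *data = malloc(sizeof *data);
  char *text = (data != NULL) ? malloc(capacity) : NULL;
  int ch;

  while (text != NULL && (ch = fgetc(fp)) != EOF)
  {
    if (length + 1 >= capacity)  /* keep room for the terminator */
    {
      char *grown = realloc(text, capacity * 2);
      if (grown == NULL)
      {
        free(text);
        text = NULL;
        break;
      }
      text = grown;
      capacity *= 2;
    }
    text[length++] = (char)ch;
  }
  fclose(fp);

  if (text == NULL)  /* allocation failed somewhere above */
  {
    free(data);
    return NULL;
  }
  text[length] = '\0';
  data->text = text;
  data->length = length;
  return data;
}

/* Unchanged interface: delegate, then record where the value came from. */
data_t *str_readf(const char *filename)
{
  data_t *data = str_readf_old(filename);
  if (data != NULL)
    data->origin = LEGACY;
  return data;
}

/* Hypothetical cleanup helper for the sketch. */
void str_free(data_t *data)
{
  if (data != NULL)
  {
    free(data->text);
    free(data);
  }
}
```

With the tag in place, a value returned by `str_readf` can later be handed to an origin-aware function such as `str_analyze` and be routed to the legacy implementation.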

In closing

In short, this article provides a small example of how to extend existing code bases with new and shiny features. Working with commercial code, we are often met with the challenge of implementing a new feature in a very old and very messy legacy code base. The code was most likely written in an era without emphasis on design patterns and maintainability. Following the simple guidelines from this article, it should be possible to extend legacy code bases seamlessly, without tampering with existing functionality. The method described in this article addresses the following concerns:

  • Maintainability: Updating or modifying either the legacy code base or the new feature code is possible without tampering with the other.
  • Testability: Introducing the Adapter and Facade patterns imposes a layer of abstraction, making it possible to test the underlying code with unit tests.
  • Flexibility: The Facade pattern allows the developer to change the underlying code, infrastructure, etc., without changing the interface.
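The testability and flexibility points can be sketched together: one check written against the facade is run unchanged against two interchangeable implementations. Everything in this example (`str_word_count`, `str_use_new_implementation`, `check_word_count`, and both implementations) is hypothetical and chosen only for illustration.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical legacy implementation: counts blank-to-word transitions. */
static size_t word_count_old(const char *s)
{
  size_t count = 0;
  int in_word = 0;
  for (; *s != '\0'; s++)
  {
    if (isspace((unsigned char)*s))
      in_word = 0;
    else if (!in_word)
    {
      in_word = 1;
      count++;
    }
  }
  return count;
}

/* Hypothetical new implementation: skips blanks, then skips one word. */
static size_t word_count_new(const char *s)
{
  size_t count = 0;
  while (*s != '\0')
  {
    while (*s != '\0' && isspace((unsigned char)*s))
      s++;
    if (*s != '\0')
      count++;
    while (*s != '\0' && !isspace((unsigned char)*s))
      s++;
  }
  return count;
}

static int use_new_implementation = 0;

/* Switch that only the facade knows about. */
void str_use_new_implementation(int enabled)
{
  use_new_implementation = enabled;
}

/* Facade: the only function callers and tests see. */
size_t str_word_count(const char *s)
{
  return use_new_implementation ? word_count_new(s) : word_count_old(s);
}

/* One check against the interface; returns 1 if every expectation holds. */
int check_word_count(void)
{
  return str_word_count("") == 0
      && str_word_count("one") == 1
      && str_word_count("  two  words ") == 2;
}
```

Calling `check_word_count()` before and after `str_use_new_implementation(1)` exercises both code paths through the same interface, which is what makes swapping the underlying code a low-risk change.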

History

  • July 6th, 2007 - First revision uploaded to codeproject.com.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Web Developer
Denmark

Comments and Discussions

 
Good points
The Wizard of Doze, 6-Jul-07 23:19
Thanks! You obviously speak from your own experience. Refactoring often is conducted step by step. The Adapter pattern helps to gradually transform existing code without too much disruption for the rest of the application. I'd like to see more articles on refactoring!


