Implementing the Factory Pattern (Part 1 of 2 or 3)

Don Kackman


8 Sep 2008


The first part of a Factory: locating Assemblies and finding Types that match criteria.


Introduction

I started this article as a simple post about a library that has been useful to me, but after reading it over (and after a colleague observed that it assumed a lot of the reader), I decided to expand it into a series on the Factory pattern.

In my experience, any non-trivial development effort eventually gets to the point where defining linkage to all the various implementation types at compile time becomes unwieldy. Even in a well factored and loosely coupled design, compile time linking between interface and implementation at the consumer level can often require the entire system to be rebuilt (and therefore retested and deployed) in order to update even the smallest portion of it.

That's where the creational patterns come in. Of these, Factory and Abstract Factory can help answer the question "what is the set of specific Types that comprise the functionality of a system?"

Why a Factory?

Because Type coupling is the enemy of good (meaning flexible, resilient, and robust) design. Type coupling is expressed in code in many ways, direct usage and inheritance being the most obvious. As software grows and evolves, it is the coupling between Types that makes the system increasingly brittle and difficult to change and adapt to new requirements.

Structural Patterns help to manage coupling that arises via the inheritance, composition, and usage of and between Types. The Factory pattern, on the other hand, is meant to manage a less discussed source of coupling: what I'll call constructor coupling, for lack of a better term.

Take the following bit of code:

namespace MyNamespace
{
   class Consumer
   {
      // the call to 'new' binds this assembly to ThirdPartyNamespace at compile time
      SomeSharedNamespace.ISomeInterface instance = new ThirdPartyNamespace.SomeObject();
   }
}

Even though the code uses an interface from a shared namespace (a common way to manage usage coupling), MyNamespace is tightly bound to ThirdPartyNamespace via the call to the constructor on SomeObject. In concrete terms, the Assembly containing this code must reference the Assembly containing SomeObject. This implies that they must always be deployed together, and further, if the application wants to use FourthPartyNamespace.SomeObject (or even ThirdPartyNamespace.SomeOtherObject) instead, the entire system needs to be rebuilt, retested, and redeployed.

Now, of course, in this simple case, that doesn't pose that much of a problem, but real world systems can contain hundreds or even thousands of Types. Being able to build, test, and deploy the whole system can be a significant undertaking.

A more insidious side effect of direct references between Assemblies (or more specifically, VS.NET projects and Assemblies) is that they tend to lead to abstraction leaks. With the concrete class reference right there, it may just be more expedient to cast an interface pointer to an implementation class than it is to make sure that the object model adequately represents the requirements. The code starts to work around its own design, and eventually the design is little more than a fiction, with the system becoming increasingly fragile.

So the first step in decoupling object instantiation is to replace the direct creation of an implementation class with something like:

SomeSharedNamespace.ISomeInterface instance = Factory.CreateISomeInterface();

with an initial implementation of the CreateISomeInterface method that might look something like this:

switch (m_someConfigurationParameter)
{
   case "ThirdPartySystem": return new ThirdPartyNamespace.SomeObject();
   case "ThirdPartySystemEx":
        return new ThirdPartyNamespace.SomeOtherObject();
   case "FourthPartySystem": return new FourthPartyNamespace.SomeObject();
   // etc.
   default: throw new NotSupportedException(m_someConfigurationParameter);
}

Here, the consuming code is isolated from the implementation class completely, and managing references can be encapsulated entirely within the Factory implementation. Concrete classes can change, and only the Factory needs to be updated; the consuming assembly is not touched at all.

It's a step in the right direction, but still requires a central component of the system to be updated in order to change the set of components or implementation classes. To really get flexibility and isolation, it would be good to be able to add and remove components from the system without recompiling anything.
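As a rough illustration of where this is heading, the switch statement can be replaced by a lookup on a type name that is supplied at run time. The following is only a minimal sketch, not the library described in this article: IGreeter, EnglishGreeter, and LateBoundFactory are hypothetical names invented for the example, and the type name is assumed to arrive from some configuration source as an assembly-qualified name.

```csharp
using System;

// Hypothetical shared contract that consumers bind to.
public interface IGreeter
{
    string Greet(string name);
}

// Stands in for an implementation that would normally live in a separate assembly.
public class EnglishGreeter : IGreeter
{
    public string Greet(string name) { return "Hello, " + name; }
}

public static class LateBoundFactory
{
    // typeName is an assembly-qualified type name, e.g. read from a configuration file.
    public static IGreeter CreateGreeter(string typeName)
    {
        // throwOnError = true: an exception is raised if the type cannot be resolved
        Type type = Type.GetType(typeName, true);

        if (!typeof(IGreeter).IsAssignableFrom(type))
            throw new InvalidOperationException(
                type.FullName + " does not implement IGreeter.");

        // requires a public parameterless constructor on the implementation class
        return (IGreeter)Activator.CreateInstance(type);
    }
}
```

Calling LateBoundFactory.CreateGreeter(typeof(EnglishGreeter).AssemblyQualifiedName) yields an EnglishGreeter without the factory naming that class anywhere in its own code. The open problem, which the rest of this article addresses, is how the set of candidate type names gets discovered in the first place.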

This tutorial, and subsequent entries, will build from the bottom up an implementation of the Factory pattern (and if I'm ambitious, the Abstract Factory pattern). So if the end goal is to have a Factory that doesn't require recompilation to change, the first challenge is going to be locating the Assemblies that comprise the system and finding the Types that consumers are interested in instantiating.

Background

This first installment is intended to lay some groundwork and answer these three questions:

How does a consuming program locate the set of Types that are candidates for instantiation?
How can a library that provides a specific implementation add itself to that set without the consuming program being recompiled?
How can an implementation class describe itself to a consumer application? (This is actually a much larger question than I will cover at first, but the basis will be laid.)

Obviously, the first two questions are closely related. In fact, they are the same question, only looked at from opposite sides of an interface. All of the major Microsoft technologies that I've worked with over the past 20 years had some sort of answer to these questions, but for most of them the more pressing questions were how to define interoperability contracts and share heap space across library boundaries.

MFC had its extension DLLs, and ATL and VB6 had COM (I know, I know, MFC had COM as well, but don't make me recall the this pointer and the 20 macros in 5 different source files it took to turn a CObject into a COM object).

The .NET Framework, with its clear and shared definition of Type, its standardization of memory management, and its exposure of all aspects of binding via Reflection, makes late loading of implementation types a breeze, but it still falls to the application developer to answer those two original questions.

I've seen a number of approaches in the .NET realm. NDoc uses a naming convention for assembly files. It will look in any DLL with the suffix *.documenter.dll for an implementation of a specific set of interfaces. SharpDevelop (at least the last time I poked around in its code, which was a long time ago) uses an XML registration mechanism, wherein extension assemblies insert entries in a set of XML files that the application uses to locate and expose functionality to the user.

The Reflection only approach of NDoc is nice as it is quite easy to use. Agree on a naming convention, drop a file in an agreed upon location, and away you go. The downside is that assembly implementers have to be careful not to step on each other's toes: assembly file names and human readable type information have to be kept unique across all assemblies. It also means a bit less control at the application level over which Types are considered part of the system.

XML registration gives the application more control, but it makes deployment harder as the implementer has to know how to insert the correct information in the right place to become visible.

This library supports either approach, but with a slight nod to the Reflection design, as you will see below.

The third question is always application specific. A UI plug-in system may have things like a name, description, an icon, and a set of supported commands, while an extensible query engine might need to indicate what sorts of data objects a query implementation can operate on.

But again, since this decision is so application specific, this assembly defers that to the consumer. The sample code uses Attributes, albeit in a very simplified fashion, to expose the properties of candidate types to the consumer.

The Code

The Assembly has four instantiatable classes, one base class, and a helper attribute.

The Classes

TypeLoaderBase

This abstract base class has a small set of responsibilities that the concrete derived classes rely on:

Locate candidate Assemblies to search for specific Types
Expose properties that can be used to govern how Assemblies are found
Load Types from those Assemblies that meet specific search criteria
Contain the storage for all the Type objects that are found which match the search criteria

There are four properties that govern how Assemblies will be selected:

SearchLoadedAssemblies - true in order to look for Types in Assemblies already in AppDomain.CurrentDomain.
SearchDirectories - true in order to search directories on the file system for candidate Assemblies.
AssemblyFileNameFilter - (ignored if SearchDirectories is false) A file name mask used to limit the set of Assemblies to search. Defaults to *.dll, but can be further constrained in a system where many Assemblies may be located in the search location.
SearchDirectoriesRecursively - (ignored if SearchDirectories is false) true in order to recurse folders in the search location.

The declaration of TypeLoaderBase looks like:

public abstract class TypeLoaderBase<TBinding, TStorage, TKey> :
       IEnumerable<KeyValuePair<TKey, TStorage>>

TBinding is the Type (interface or class) that a consumer will use to bind to instantiated objects, TStorage is used by the derived classes to define how Type objects will be stored in the Dictionary, and TKey is the Type used to index that Dictionary. The implementation of IEnumerable<KeyValuePair<TKey, TStorage>> allows the Dictionary to be iterated over without allowing modification.
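The idea of exposing the Dictionary for iteration but not for modification can be sketched as follows. This is a simplified stand-in written for illustration, not the actual TypeLoaderBase: ReadOnlyTypeMap and SampleTypeMap are hypothetical names, and the storage type is fixed to Type rather than being a generic TStorage parameter.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical, simplified stand-in for the storage role of TypeLoaderBase.
public class ReadOnlyTypeMap<TKey> : IEnumerable<KeyValuePair<TKey, Type>>
{
    // The dictionary itself is never handed out to callers.
    private readonly Dictionary<TKey, Type> m_types = new Dictionary<TKey, Type>();

    // Only derived classes can add entries.
    protected void AddEntry(TKey key, Type type) { m_types.Add(key, type); }

    // Read-only indexer: there is no setter.
    public Type this[TKey key] { get { return m_types[key]; } }

    public IEnumerator<KeyValuePair<TKey, Type>> GetEnumerator()
    {
        return m_types.GetEnumerator();
    }

    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }
}

// Hypothetical derived class that fills the map in its constructor.
public class SampleTypeMap : ReadOnlyTypeMap<string>
{
    public SampleTypeMap()
    {
        AddEntry("text", typeof(string));
        AddEntry("number", typeof(int));
    }
}
```

A consumer can write foreach (KeyValuePair<string, Type> pair in new SampleTypeMap()) to walk the entries, but has no Add or Remove method to call.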

Finding Candidate Assemblies

private IEnumerable<Assembly> GetAssemblies(string path)
{
      IList<Assembly> list = new List<Assembly>();
      if (SearchLoadedAssemblies)
      {
          // look for any types in already loaded assemblies; don't
          // iterate over framework assemblies or CTS assemblies
          list = (from a in AppDomain.CurrentDomain.GetAssemblies()
                  where a.FullName.StartsWith("mscorlib,") == false &&
                   a.FullName.StartsWith("System,") == false &&
                   a.FullName.StartsWith("System.") == false &&
                   a.FullName.StartsWith("Microsoft.") == false &&
                   a.FullName.StartsWith("vshost,") == false
                  select a).ToList();
      }

      // if path is a file load it as an assembly
      if (File.Exists(path))
      {
          list.Add(Assembly.LoadFrom(path));
      }
      else if (SearchDirectories && Directory.Exists(path))
      {
          SearchOption searchOption = SearchDirectoriesRecursively 
                 ? SearchOption.AllDirectories : SearchOption.TopDirectoryOnly;
          foreach (string file in
             Directory.GetFiles(path, AssemblyFileNameFilter, searchOption))
              list.Add(Assembly.LoadFrom(file));
      }

      // Distinct eliminates duplicate entries for assemblies 
      // that are already loaded as well as in the search location
      return list.Distinct();
}

Once the set of Assemblies is determined, all of the Types in them are iterated over, and those which match are returned. In order to be selected, a Type must be a class, it must not be abstract, and it must either implement, be, or derive from the generic parameter TBinding.

Loading Types

protected IEnumerable<Type> GetTypes(string path, Func<Type, bool> match)
{
      return GetAssemblies(path).SelectMany(a => GetTypes(a, match));
}

private IEnumerable<Type> GetTypes(Assembly assembly, Func<Type, bool> match)
{
      try
      {
          // in order to be loaded the candidate type must be instantiatable
          // and it must either implement the TBinding interface 
          // (if TBinding is an interface type),
          // or inherit from or actually be TBinding (if TBinding is a class type)
          return from t in assembly.GetTypes()
                 where
                     t.IsClass && t.IsAbstract == false &&
                     ((typeof(TBinding).IsInterface &&
                     t.GetInterface(typeof(TBinding).Name) != null) ||
                     t.IsSubclassOf(typeof(TBinding)) || 
                     t == typeof(TBinding)) &&
                     t.GetCustomAttributes(typeof(IgnoreTypeAttribute), 
                                           false).Length == 0 && 
                     //IgnoreTypeAttribute can be used to suppress inclusion
                     match(t)
                 select t;
      }
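      catch (Exception)
      {
          // assembly.GetTypes() can throw (for example when a dependency is missing);
          // such an assembly simply contributes no candidate Types
          return new List<Type>();
      }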
}
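The selection rule (a concrete class that implements, is, or derives from TBinding) can also be expressed with Type.IsAssignableFrom, which compares by type identity rather than by the interface's simple name. The sketch below is an alternative formulation offered for comparison, not the code in the download; CandidateFilter and IsCandidate are hypothetical names.

```csharp
using System;

public static class CandidateFilter
{
    // True when 'candidate' is a concrete class that can be bound to as TBinding.
    public static bool IsCandidate<TBinding>(Type candidate)
    {
        return candidate.IsClass          // excludes interfaces and value types
            && !candidate.IsAbstract      // excludes abstract (and static) classes
            && typeof(TBinding).IsAssignableFrom(candidate); // implements, is, or derives from
    }
}
```

For instance, CandidateFilter.IsCandidate<IDisposable>(typeof(System.IO.MemoryStream)) is true, while the same check against the abstract System.IO.Stream is false.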

IgnoreTypeAttribute

This attribute can be applied to a Type in order to prevent it from being enumerated.
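A marker attribute of this kind could be declared and checked roughly as follows. The declaration here is an assumption (a plain attribute with no members); the actual IgnoreTypeAttribute in the download may differ. VisibleWidget, HiddenWidget, and IgnoreCheck are hypothetical names used only for the example.

```csharp
using System;

// Assumed shape: a marker attribute with no members.
[AttributeUsage(AttributeTargets.Class, Inherited = false)]
public sealed class IgnoreTypeAttribute : Attribute
{
}

public class VisibleWidget { }

// Marked so that a loader performing the check below would skip it.
[IgnoreType]
public class HiddenWidget { }

public static class IgnoreCheck
{
    // Same kind of test that GetTypes performs: any IgnoreTypeAttribute
    // applied directly to the type (inherit = false) excludes it.
    public static bool IsIgnored(Type t)
    {
        return t.GetCustomAttributes(typeof(IgnoreTypeAttribute), false).Length > 0;
    }
}
```

This is handy for test doubles or partially finished classes that live in a searched assembly but should not show up as candidates.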

TypeLoader

This class will store Types such that each key represents only one Type. A given Type can be present many times under different keys, but each key must be unique.

All of the TypeLoaderBase derived classes support a set of Load methods similar to the following:

public void Load(string path, 
                Func<Type, TKey> keySelector, 
                Func<Type, bool> match)
{
      foreach (Type t in GetTypes(path, match))
          AddEntry(keySelector(t), t);
}

path - (ignored if SearchDirectories is false) Can be a path to a directory, in which case, all Assembly files that match AssemblyFileNameFilter will be searched. Can also be a file path, in which case, just that Assembly will be searched.
keySelector - A caller supplied delegate method that will return the key for each matching Type. Typically, I use this to find application specific attributes on the Types that specify the appropriate key.
match - A caller supplied delegate method that can be used to further constrain the Types that will be loaded.

TypeLoader and TypeLoaderDictionary also have LoadMany methods, which allow each Type to have multiple keys. The difference from the Load methods is that the keySelector argument returns an enumeration of keys rather than a single value. The Type being inspected will be inserted once for each key returned from this delegate.

public void LoadMany(string path, 
               Func<Type, IEnumerable<TKey>> keySelector, 
               Func<Type, bool> match)
{
      foreach (Type t in GetTypes(path, match))
      {
          foreach (TKey key in keySelector(t))
              AddEntry(key, t);
      }
}
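A keySelector suitable for LoadMany might read several attributes off a Type and return one key per attribute. The sketch below shows only the delegate side; FileExtensionAttribute, JpegHandler, and KeySelectors are hypothetical names that are not part of the library.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical attribute; may be applied more than once to the same class.
[AttributeUsage(AttributeTargets.Class, AllowMultiple = true, Inherited = false)]
public sealed class FileExtensionAttribute : Attribute
{
    private readonly string m_extension;

    public FileExtensionAttribute(string extension) { m_extension = extension; }

    public string Extension { get { return m_extension; } }
}

[FileExtension(".jpg")]
[FileExtension(".jpeg")]
public class JpegHandler { }

public static class KeySelectors
{
    // Yields one key for each FileExtensionAttribute found on the type.
    // Reflection does not guarantee the order in which attributes are returned.
    public static IEnumerable<string> GetExtensions(Type t)
    {
        foreach (FileExtensionAttribute attr in
                     t.GetCustomAttributes(typeof(FileExtensionAttribute), false))
            yield return attr.Extension;
    }
}
```

Assuming a loader whose TKey is string, KeySelectors.GetExtensions could be passed as the keySelector argument, and JpegHandler would then be registered under both ".jpg" and ".jpeg".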

Each derived class also has a method or methods to create an instance (or list of instances in the case of List based implementations):

public TBinding CreateInstance(TKey key)
{
      try
      {
          return (TBinding)Activator.CreateInstance(this[key]);
      }
      catch (TargetInvocationException e)
      {
          Debug.Assert(false);
          throw e.InnerException;
      }
}
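The catch block exists because Activator.CreateInstance wraps any exception thrown by the constructor in a TargetInvocationException. One thing to be aware of is that throw e.InnerException resets the stack trace of the inner exception. On .NET Framework 4.5 or later (an assumption; the article itself targets 3.5), ExceptionDispatchInfo can rethrow it with the original trace intact. The following is a sketch of that variation, with hypothetical names FailingWidget and InstanceCreator:

```csharp
using System;
using System.Reflection;
using System.Runtime.ExceptionServices;

// Hypothetical class whose constructor always fails.
public class FailingWidget
{
    public FailingWidget() { throw new InvalidOperationException("constructor failed"); }
}

public static class InstanceCreator
{
    public static object Create(Type type)
    {
        try
        {
            return Activator.CreateInstance(type);
        }
        catch (TargetInvocationException e)
        {
            if (e.InnerException == null)
                throw;

            // Rethrows the constructor's own exception, keeping its original stack trace.
            ExceptionDispatchInfo.Capture(e.InnerException).Throw();
            throw; // not reached; keeps the compiler's return-path analysis satisfied
        }
    }
}
```

With this in place, a caller of InstanceCreator.Create(typeof(FailingWidget)) sees an InvalidOperationException rather than the reflection wrapper.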

TypeLoaderList, TypeLoaderDictionary, TypeLoaderDictionaryList

The remaining classes are all variations of TypeLoader but with different underlying storage.

The two Dictionary classes allow Types to be identified by a two part key: a namespace and a Type key. This can be useful for scenarios like XML nodes, where the item being bound to may require two identifiers to uniquely describe (a namespace and a local name in the case of XML).

The two List classes allow each key to return multiple Types in such cases where multiple Types may be applicable in the same context. For instance, if you have a shell extension and there are multiple classes that could operate on a specific file type, keying the type collection on the file type would give you back all types that understand that type.
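The List based storage can be pictured as a dictionary whose values are lists of Types. The class below is a minimal sketch of that shape only, under the assumption that entries are simply appended per key; TypeListMap is a hypothetical name and not one of the library's classes.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical illustration of "one key, many Types" storage.
public class TypeListMap<TKey>
{
    private readonly Dictionary<TKey, List<Type>> m_types =
        new Dictionary<TKey, List<Type>>();

    // Appends the type to the list for 'key', creating the list on first use.
    public void Add(TKey key, Type type)
    {
        List<Type> list;
        if (!m_types.TryGetValue(key, out list))
        {
            list = new List<Type>();
            m_types.Add(key, list);
        }
        list.Add(type);
    }

    // All types registered under 'key'; throws KeyNotFoundException for unknown keys.
    public IList<Type> this[TKey key] { get { return m_types[key]; } }
}
```

In the shell extension scenario, the key would be the file type, and indexing the map with ".txt" would return every handler class registered for text files.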

Using the Code

The accompanying code contains a sample application that demonstrates a few different (highly contrived) scenarios for using this library that will hopefully make the basic usage clear.

One sample usage will load different implementations of System.Windows.Forms.ToolStripRenderer using just the Type name as the key.

private TypeLoader<ToolStripRenderer, string> m_renderers 
               = new TypeLoader<ToolStripRenderer, string>();

m_renderers.AssemblyFileNameFilter = "*.renderers.dll";
m_renderers.Load(Path.GetDirectoryName(Application.ExecutablePath), 
                  delegate(Type t) { return t.Name; });

foreach (KeyValuePair<string, Type> pair in m_renderers)
{
    ToolStripItem item = new ToolStripMenuItem(pair.Key);
    item.Click += new EventHandler(item_Click);
    this.renderersToolStripMenuItem.DropDownItems.Add(item);
}

Another demonstrates using a custom attribute to supply the Type key.

private TypeLoader<MessageSource, string> m_messageSources 
             = new TypeLoader<MessageSource, string>();

m_messageSources.SearchDirectories = false;
m_messageSources.Load(GetMessageSourceName);

private static string GetMessageSourceName(Type t)
{
      foreach (MessageSourceAttribute attr in 
                   t.GetCustomAttributes(typeof(MessageSourceAttribute), false))
          return attr.Name; // first one wins for this simple demo

      return string.Empty;
}

And, the last demonstrates using the TypeLoaderDictionary class as well as loading types from a user defined folder or assembly.

private TypeLoaderDictionary<IColorProvider, string> m_colorProviders 
                     = new TypeLoaderDictionary<IColorProvider, string>();

m_colorProviders.Load(Path.GetDirectoryName(Application.ExecutablePath), 
                     GetColorProviderNamespace, GetColorProviderName);

foreach (KeyValuePair<string, IDictionary<string, Type>> pair in m_colorProviders)
    listBox3.Items.Add(pair.Key);

private static string GetColorProviderNamespace(Type t)
{
      ColorProviderAttribute[] attrs = 
             (ColorProviderAttribute[])t.GetCustomAttributes
                 (typeof(ColorProviderAttribute), true);

      if (attrs.Length > 0)
          return attrs[0].Category;

      return t.Namespace;
}

private static string GetColorProviderName(Type t)
{
      ColorProviderAttribute[] attrs = 
            (ColorProviderAttribute[])t.GetCustomAttributes
                (typeof(ColorProviderAttribute), true);

      if (attrs.Length > 0)
          return attrs[0].Name;

      return t.Name;
}

private void listBox3_SelectedIndexChanged(object sender, EventArgs e)
{
      listBox2.Items.Clear();
      if (listBox3.SelectedIndex > -1)
      {
          foreach (KeyValuePair<string, Type> pair in
                     m_colorProviders[listBox3.SelectedItem.ToString()])
              listBox2.Items.Add(pair.Key);
      }
}

private void button2_Click(object sender, EventArgs e)
{
      if (listBox3.SelectedItem != null && listBox2.SelectedItem != null)
      {
          IColorProvider provider = 
                m_colorProviders.CreateInstance(listBox3.SelectedItem.ToString(), 
                listBox2.SelectedItem.ToString());
          this.toolStripContainer1.ContentPanel.BackColor = 
                provider.GetColor("AppWorkspace");
          this.ForeColor = provider.GetColor("ControlText");
      }
}

A Note About Frameworks

LINQ is used in a number of places, as are generic delegates. Those are the only things about this code that couple it to the 3.5 framework. All of the LINQ code could be replaced with foreach iterations, and the generic delegates converted to declared delegates without too much effort. Doing this would make this usable on the 2.0 framework.
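To give a feel for what that conversion involves, here is a sketch of a type filter written without LINQ or Func<>. It is not taken from the download: TypeMatch, Net20Style, and FindTypes are hypothetical names, and the method takes its candidates as a parameter to keep the example self-contained.

```csharp
using System;
using System.Collections.Generic;

// Declared delegate standing in for Func<Type, bool>.
public delegate bool TypeMatch(Type t);

public static class Net20Style
{
    // foreach based equivalent of a "from t in ... where ... select t" query.
    public static IList<Type> FindTypes(IEnumerable<Type> candidates, TypeMatch match)
    {
        List<Type> result = new List<Type>();
        foreach (Type t in candidates)
        {
            if (t.IsClass && !t.IsAbstract && match(t))
                result.Add(t);
        }
        return result;
    }
}
```

A caller on C# 2.0 would pass an anonymous method, for example delegate(Type t) { return t.Name.EndsWith("Renderer"); }, in place of a lambda. Note that this version builds its result list eagerly, whereas a LINQ query is evaluated lazily as it is enumerated.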

History

9/9/2008 - Initial release.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)