Improving String.Format

17 Nov 2006 | CPOL | 6 min read
Creating an improved version of String.Format.

Overview class diagram

Introduction

Most .NET developers would agree that the standard String.Format() function is both highly functional and convenient. It does have some limitations, however. In this article, we highlight some of those drawbacks and present an improved Format() function that avoids them.

Motivation

I've spent most of my career as a software developer on various teams, writing bespoke (custom, in-house) software systems. These systems have ranged from timekeeping and inventory systems to data visualisation and data interfacing.

Because most of these systems were intended solely for in-house use, they tended to have all their user messages hard-coded. A recent system took a different approach, storing the messages outside the application in an XML file.

By storing these messages outside the code, we gained the ability to improve the wording of the messages at any time - when the application is in testing, or even after deployment to production. This has proved to be very worthwhile, allowing us to clarify and improve the language used in a very dynamic way.

One complication stemmed from our extensive use of context-sensitive messages. For example, when seeking confirmation of a delete action, we don't display a typical confirmation dialog like this:

[Image 2: a typical confirmation dialog]

Instead, we display a much more precise dialog that clearly identifies the item that is about to be deleted:

[Image 3: a confirmation dialog that names the item to be deleted]

To achieve such a level of customization, our messages must contain placeholders that get replaced as required.

The Problem with String.Format()

Originally, our messages used placeholders designed for use by the standard String.Format() function:

  • Do you want to delete the person "{0}"?
  • If you go ahead, all details of the person {0} will be permanently removed from the Inventory system.
  • Press Delete to go ahead with deleting {0}; press Cancel if you want to keep all the details of {0} unchanged.

This approach works, but we found a couple of significant drawbacks.

Firstly, the placeholders all look the same - {0} in one message looks the same as {0} in another, even though the information that will be substituted is different. Someone revising the messages needs a reference to identify what each placeholder represents. Any developer would naturally reference the code - but other people don't have that option, leaving them reliant on either a developer's memory or some form of documentation.

Secondly, the information available for placeholders is strictly limited - if a message needs to include new information (say, a person's username in addition to their real name), then a code change is required.
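The second drawback can be seen in a small sketch. The class name, the TryFormat helper and the sample data below are hypothetical and not part of the article's code; the only library behaviour relied on is that String.Format() throws a FormatException when a template refers to an argument index the caller did not supply.

```csharp
using System;

static class PositionalPlaceholderDemo
{
    // Formats the template, reporting a FormatException instead of throwing.
    public static string TryFormat(string template, params object[] args)
    {
        try
        {
            return String.Format(template, args);
        }
        catch (FormatException)
        {
            return "FormatException: template needs more arguments than the code supplies";
        }
    }

    static void Main()
    {
        string realName = "Robert Smith"; // hypothetical sample data

        // The original wording works with the single argument the code passes.
        Console.WriteLine(TryFormat("Do you want to delete the person \"{0}\"?", realName));

        // A reworded template that also wants a username fails until the code changes.
        Console.WriteLine(TryFormat("Do you want to delete {0} ({1})?", realName));
    }
}
```

The first call prints the completed message; the second prints the FormatException text, because nothing in the code supplies a value for {1}.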

A Better Formatter

To resolve these issues, I wrote a new formatter - one that builds on the capabilities of String.Format() in a new way.

The key difference is that the placeholders are named instead of numbered. Instead of each placeholder indexing into a list of parameters, each placeholder names a property from a single passed topic object.

For example, the messages shown earlier become:

  • Do you want to delete the person "{Name}"?
  • If you go ahead, all details of the person {Name} will be permanently removed from the Inventory system.
  • Press Delete to go ahead with deleting {Name}; press Cancel if you want to keep all the details of {Name} unchanged.

To complete the placeholders in each template, the code looks like this:

C#
// Load the message template
string template = MessageManager.LoadMessage("delete.prompt");

// Format the string for display
string message = StringUtils.Format(template, topicPerson);

How does this improve over the original String.Format()?

  • Every property on the supplied object is available for use.
  • No additional documentation is required, as we already have a fully documented object model available for reference.

Additionally, a simple extension to the formatter allows us to use a dot-separated "path" of property names to reference related objects within the business domain.

For example, without requiring any code changes at all, we can change the messages from our deletion example to:

  • Are you sure you want to delete {KnownAs} from {Department.Name}?
  • All the information about {Name} ({UserName}) will be permanently removed.
  • Press Delete to go ahead with deleting {KnownAs}; press Cancel if you want to keep all the details of {KnownAs} unchanged.

How it Works

There are two aspects of the formatter that are worth examining in depth - the use of regular expressions, and the way it extracts property values using reflection.

Regular Expressions

Instead of writing some moderately complex string parsing code to split up the template, a regular expression makes it simple to isolate the placeholders for substitution.

Here is the relevant section of code:

C#
// Create a Regex to extract our placeholders
Regex r = new Regex(@"\{(?<id>\w+(\.\w+)*)(?<param>:[^}]+)?\}");

// Use the Regex to find all the placeholders
MatchCollection mc = r.Matches(Template);

The regular expression (regex) itself is divided into three parts that work as follows.

  • The first part of the regex, (?<id>\w+(\.\w+)*), isolates the name of each placeholder. It matches one or more word characters (letters, digits or underscores), optionally followed by further dot-separated names, and captures the result in the group named id.
  • The second part of the regex, (?<param>:[^}]+)?, matches any optional format parameter. It matches a colon and then everything up to the closing brace, and captures the result (including the colon) in the group named param.
  • Finally, these two parts are wrapped in \{...\} so that they only match when enclosed in braces.
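To make the three parts concrete, here is a small sketch that runs the pattern over a sample template and prints what each named group captures. The class name and the sample template are hypothetical, and the group names are written in lower case (id and param) as in the description above; that casing is an assumption.

```csharp
using System;
using System.Text.RegularExpressions;

static class PlaceholderDemo
{
    // The placeholder pattern, with lower-case group names (an assumption).
    public static readonly Regex Placeholder =
        new Regex(@"\{(?<id>\w+(\.\w+)*)(?<param>:[^}]+)?\}");

    static void Main()
    {
        // Hypothetical template: a simple name, a property path,
        // and a name with a format parameter.
        string template =
            "Delete {KnownAs} from {Department.Name}? Joined {StartDate:yyyy-MM-dd}.";

        foreach (Match m in Placeholder.Matches(template))
        {
            Console.WriteLine("match='{0}' id='{1}' param='{2}'",
                m.Value, m.Groups["id"].Value, m.Groups["param"].Value);
        }
        // Prints:
        // match='{KnownAs}' id='KnownAs' param=''
        // match='{Department.Name}' id='Department.Name' param=''
        // match='{StartDate:yyyy-MM-dd}' id='StartDate' param=':yyyy-MM-dd'
    }
}
```

Note that the param group keeps its leading colon, which is convenient when the parameter is later spliced into a "{0...}" item for String.Format().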

Once the regular expression is created, we use it to find all the placeholders in the supplied template. For each match that is found, the named property is extracted from the supplied source object, and String.Format() is used to create a string. A StringBuilder is used to concatenate these results with portions of the original template string.
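The loop described above might look roughly like the following. This is a minimal sketch rather than the article's actual implementation: the class name NamedFormatSketch, the helper GetValue (a simplified, iterative stand-in for the property lookup) and the sample data are all hypothetical.

```csharp
using System;
using System.Reflection;
using System.Text;
using System.Text.RegularExpressions;

public static class NamedFormatSketch
{
    private static readonly Regex Placeholder =
        new Regex(@"\{(?<id>\w+(\.\w+)*)(?<param>:[^}]+)?\}");

    public static string Format(string template, object source)
    {
        StringBuilder result = new StringBuilder();
        int index = 0;

        foreach (Match m in Placeholder.Matches(template))
        {
            // Copy the literal text between the previous match and this one.
            result.Append(template, index, m.Index - index);

            // Rebuild a positional item such as "{0}" or "{0:D4}" so that
            // the standard formatting rules apply to the value.
            string item = "{0" + m.Groups["param"].Value + "}";
            result.AppendFormat(item, GetValue(source, m.Groups["id"].Value));

            index = m.Index + m.Length;
        }

        // Copy any literal text after the last placeholder.
        result.Append(template, index, template.Length - index);
        return result.ToString();
    }

    // Hypothetical helper: walks a dot-separated path of public properties.
    private static object GetValue(object source, string path)
    {
        object current = source;
        foreach (string name in path.Split('.'))
        {
            if (current == null)
                throw new InvalidOperationException("Null value before " + name);

            PropertyInfo p = current.GetType().GetProperty(name);
            if (p == null)
                throw new InvalidOperationException(
                    "Did not find value " + name + " on " + current);

            current = p.GetValue(current, null);
        }
        return current;
    }

    static void Main()
    {
        // Hypothetical sample data standing in for a business object.
        var person = new { KnownAs = "Bob", Visits = 7,
                           Department = new { Name = "Accounts" } };

        Console.WriteLine(Format(
            "Are you sure you want to delete {KnownAs} from {Department.Name}?", person));
        // Prints: Are you sure you want to delete Bob from Accounts?

        Console.WriteLine(Format("Visit number {Visits:D4}", person));
        // Prints: Visit number 0007
    }
}
```

Because each value is passed through AppendFormat(), any format parameter after the colon behaves exactly as it would in a numbered String.Format() placeholder.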

Hopefully, you can see how using a regular expression in this routine makes short work of the parsing problem. It's unfortunate that regular expressions aren't more widely used.

Reflection

A utility function is used to access the property value identified by the placeholder being processed.

C#
// Requires: using System; using System.Reflection;
// Declared as internal so it can be accessed by our test code
static internal Object ExtractPropertyValue(Object Source,
                                            string PropertyPath)
{
    if (Source == null)
    {
        // Guards against a null object part-way along a property path
        throw new ArgumentNullException("Source");
    }

    if (PropertyPath.Contains("."))
    {
        // Name references a sub property
        int index = PropertyPath.IndexOf(".");
        string subObjectName = PropertyPath.Substring(0, index);
        Object subObject = ExtractPropertyValue(Source, subObjectName);
        return ExtractPropertyValue(subObject,
               PropertyPath.Substring(index + 1));
    }

    // PropertyPath does not contain a "."
    // We are looking for something directly on this object

    Type t = Source.GetType();
    PropertyInfo p = t.GetProperty(PropertyPath);
    if (p != null)
    {
        // We found a public property, return its value
        return p.GetValue(Source, null);
    }

    throw new InvalidOperationException("Did not find value " +
                    PropertyPath + " on " + Source.ToString());
}

This utility function is a good example of a recursive function - a function that calls itself.

First, the routine checks to see if the property name passed contains a period - this is used as an indicator that the passed string is a property path instead of a simple property name. When a period is found, two recursive calls are made, the first to obtain the nested object and the second to process the rest of the path.

When no period is found in the passed property name, we use the Reflection API to obtain the appropriate property value.

If no property is found, we throw an exception. Note how the message included in the exception gives some useful details about why the exception was necessary.

Future Extensions

At the moment, only properties on the source object are used to replace the placeholders.

One suggestion that has been made is to check for a string-based indexer on the object and to use that to obtain a sub-object. This would work for accessing fields from data rows, values from name-value collections, some kinds of dictionary, and more.
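One way the suggested indexer lookup might be sketched is shown below. This is not part of the article's code: the class and method names are hypothetical, and the sketch assumes the indexer is exposed under the default property name "Item", which is what C# uses unless the type overrides it.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

public static class IndexerLookupSketch
{
    public static object ExtractValue(object source, string name)
    {
        if (source == null) throw new ArgumentNullException("source");
        Type t = source.GetType();

        // First choice: an ordinary public property, as in the article.
        PropertyInfo property = t.GetProperty(name);
        if (property != null && property.GetIndexParameters().Length == 0)
        {
            return property.GetValue(source, null);
        }

        // Fallback: a public indexer that takes a single string key.
        // Assumption: the indexer uses the default name "Item".
        PropertyInfo indexer = t.GetProperty("Item", new Type[] { typeof(string) });
        if (indexer != null)
        {
            return indexer.GetValue(source, new object[] { name });
        }

        throw new InvalidOperationException(
            "Did not find value " + name + " on " + source);
    }

    static void Main()
    {
        // Hypothetical sample data: a dictionary standing in for a data row.
        var row = new Dictionary<string, object>();
        row["KnownAs"] = "Bob";

        Console.WriteLine(ExtractValue(row, "KnownAs")); // found via the string indexer
        Console.WriteLine(ExtractValue(row, "Count"));   // ordinary property still wins
    }
}
```

Two consequences of this design are worth noting: a real property such as Count hides any key of the same name, and a missing key surfaces as whatever exception the indexer itself throws, wrapped by reflection in a TargetInvocationException.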

History

  • October 2006

    First draft of the article, reviewed by friends.

  • November 2006

    Rewritten article for submission to CodeProject.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Software Developer (Senior)
New Zealand
A confirmed geek from an early age, Bevan's first computer was a ZX81 on which he learnt to program using Sinclair Basic. This was followed by a Spectrum 48K, the school's Apple //e network, an Amstrad CPC 6128, a whole lot of early Macintoshes and several PCs (though never a BEBox).

A former hardware service tech for Apple, Bevan is happiest when hip-deep in the code. Well, he actually prefers spending time with his family, but CodeProject is a techie site and who would believe that?

Comments and Discussions

 
Other data source
snoopy001, 24-Nov-06 5:10
I find this article very useful and use the idea in a different way - I provide the data for the string as a Dictionary<string, object>. Here is the code:

static public string Format(string Template, Dictionary<string, object> data)
{
    // This regular expression should find named parameters of the form {id} or {id:parameter}
    Regex r = new Regex(@"\{(?<id>\w+(\.\w+)*)(?<param>:[^}]+)?\}");
    StringBuilder result = new StringBuilder();

    // Loop over each match, replacing the placeholders with the result
    int index = 0;
    foreach (Match m in r.Matches(Template))
    {
        // Append content from before this match
        if (index < m.Index)
            result.Append(Template.Substring(index, m.Index - index));

        // Get the details of this match
        Group id = m.Groups["id"];
        Group param = m.Groups["param"];

        // Create a template to pass to String.Format()
        string t = "{0" + param.ToString() + "}";

        // Get the value identified by id from the dictionary
        // (replaces ObjectUtilities.ExtractPropertyValue(Source, id.ToString()))
        Object value;
        if (!data.TryGetValue(id.ToString(), out value))
        {
            throw new InvalidOperationException("Did not find value " + id.ToString());
        }

        // Format the value and add it to our result
        result.AppendFormat(t, value);
        index = m.Index + m.Length;
    }

    // Copy across any content from after the last match
    if (index < Template.Length)
    {
        result.Append(Template.Substring(index));
    }

    return result.ToString();
}