Performance considerations for strings in C#

By Dr Herbie, 8 May 2005
 

Introduction

How you handle strings in your code can have surprising effects on performance. In this article, I shall look at two common issues that strings can produce: temporary string variables and string concatenation.

Background

There comes a time in every project when you have to start looking at coding standards. Using FxCop is a good place to start. My favourite set of FxCop rules is the 'Performance' set.

So there I was, checking my project against FxCop and seeing lots of issues with strings. I must admit something: I have always had problems with C#'s immutable strings. When I see myString.ToUpper(), I always forget that it won't change the contents of myString but will return a new string entirely (this is because strings are immutable in C#).

I proceeded to fix my code to remove FxCop's warnings and then I noticed something - my code was faster. I decided to investigate and ended up writing the test code that I present here.

Using the code

The test code is very simple. A console application calls four test methods. Each method performs a string processing routine 1000 times, so that the execution time is long enough for performance differences to show.

The four test methods are split into two groups of two. The first group compares two approaches to case-insensitive string comparison; the second compares two approaches to string concatenation.
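
The test harness itself is not listed in this article, and the figures quoted below come from a profiler (nprof). As a rough illustration only, a repeat-and-time loop of the kind described might look like the sketch below. The class and method names (TimingSketch, TimeIt) are hypothetical, and the use of System.Diagnostics.Stopwatch is an assumption rather than what the original test code did.

```csharp
using System;
using System.Diagnostics;

static class TimingSketch
{
    // Hypothetical helper: runs 'action' the given number of times and
    // returns the elapsed wall-clock time for the timed runs.
    public static TimeSpan TimeIt(Action action, int iterations)
    {
        action(); // one untimed warm-up call so JIT compilation is not measured

        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            action();
        }
        watch.Stop();

        return watch.Elapsed;
    }
}
```

A call such as TimingSketch.TimeIt(() => BadCompare("Hello", "HELLO"), 1000) would then time one of the routines shown below.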

String Comparison and Temporary String Creation

The first test routine is a bad case-insensitive string comparison. The routine for the comparison is:

static bool BadCompare(string stringA, string stringB) 
{
    return (stringA.ToUpper() == stringB.ToUpper());
}

For this code, FxCop shows the following advice:

"StringCompareTest.BadCompare(String, String):Boolean calls 
   String.op_Equality(String, String):Boolean after converting 'stack1', a local, 
   to upper or lowercase. If possible, eliminate the string creation and call the 
   overload of String.Compare that performs a case-insensitive comparison."

What this means is that each call to ToUpper() creates a temporary string which has to be allocated and later cleaned up by the garbage collector. This takes extra time and uses more memory. The String.Compare method is more efficient.

The second test routine uses String.Compare:

static bool GoodCompare(string stringA, string stringB)
{
    return (string.Compare(stringA, stringB, true, 
         System.Globalization.CultureInfo.CurrentCulture) == 0);
}

This method prevents the creation of unnecessary temporary strings.

According to nprof, the Good Comparison takes 1.69% of the total execution time of the code, while the Bad Comparison takes 5.50% of the total execution time.

So in this test the String.Compare method is over three times as fast as the ToUpper method. If you have code that performs a lot of string comparisons (especially in a loop) then using String.Compare can make a big difference.
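
If the comparison does not need culture-specific rules (an assumption that depends on your data), an ordinal case-insensitive comparison also avoids the temporary strings. The sketch below is not one of the routines measured here; the names CompareSketch and OrdinalCompare are hypothetical.

```csharp
using System;

static class CompareSketch
{
    // Hypothetical variant of GoodCompare: compares two strings ignoring case
    // using ordinal rules, without creating upper-cased copies.
    public static bool OrdinalCompare(string stringA, string stringB)
    {
        return string.Equals(stringA, stringB, StringComparison.OrdinalIgnoreCase);
    }
}
```

Ordinal and culture-aware comparisons can give different answers for some text (the Turkish dotted and dotless 'i' is the usual example), so the choice should follow what the strings represent.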

String Concatenation inside a loop

The final pair of test routines considers string concatenation within a loop.

The 'bad' test routine is as follows:

static string BadConcatenate(string[] items)
{
    string strRet = string.Empty;

    foreach(string item in items)
    {
        strRet += item;
    }

    return strRet;
}

When FxCop sees this code, it is so outraged that it even marks the broken rule in red! FxCop says the following:

"Change StringCompareTest.BadConcatenate(String[]):String to use StringBuilder 
  instead of String.Concat or +="

The 'good' test routine was written as follows:

static string GoodConcatenate(string[] items)
{
    System.Text.StringBuilder builder = new System.Text.StringBuilder();

    foreach(string item in items)
    {
        builder.Append(item);
    }

    return builder.ToString();
}

This is an almost archetypal example given for the use of the System.Text.StringBuilder class. The issue with the bad example is the creation of more temporary strings. Because strings are immutable, the concatenation operator (+=) actually creates a new string out of the two originals and then points the original variable at the new string.
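
The effect of += on an immutable string can be checked directly. The following is a minimal sketch with hypothetical names (ImmutabilitySketch, AppendCreatesNewString); it assumes the suffix is not empty.

```csharp
using System;

static class ImmutabilitySketch
{
    // Appends 'suffix' (assumed non-empty) with += and reports whether the
    // variable now refers to a different object than the original string.
    public static bool AppendCreatesNewString(string original, string suffix)
    {
        string text = original;  // 'text' and 'original' refer to the same object
        text += suffix;          // builds a new string; 'original' is unchanged
        return !object.ReferenceEquals(text, original);
    }
}
```

Inside a loop, each pass leaves the previous intermediate string behind as garbage, which is where the temporary strings come from.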

However, when we look at performance, according to nprof, we find that the 'Bad' concatenation takes 5.67% of the total execution time, while the 'Good' concatenation takes 22.09%. I'll run that by you again:

Using StringBuilder took almost four times longer than simple string concatenation!

Why?

The answer is partly in the design of the test; the concatenation routines only concatenate ten short strings. The StringBuilder class is a more complex class than a simple immutable string, so creating one StringBuilder is more expensive in performance than doing ten simple string concatenations.

I repeated the test with differing numbers of string concatenations, and found the following results:

Chart of concatenation method effect on relative performance.

Note: The values shown here are the % of the total execution time taken by the test routines. The 'Good Concatenation' test is not actually getting faster, but takes less relative time than the 'Bad Concatenation' routine.

So, it would seem that the StringBuilder class is only really faster if you are concatenating more than about 600 strings.
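
The crossover point is likely to vary with string length, runtime version and machine, so it is worth measuring for your own case. Below is a rough sketch of how the two routines could be timed for several item counts. All names (CrossoverSketch, Measure, Report) are hypothetical, the item counts and the repeat count of 100 are arbitrary choices, and Stopwatch timing is an assumption that will not reproduce the profiler percentages quoted above.

```csharp
using System;
using System.Diagnostics;
using System.Text;

static class CrossoverSketch
{
    // Same approach as BadConcatenate: repeated += on a string.
    public static string Concatenate(string[] items)
    {
        string result = string.Empty;
        foreach (string item in items)
        {
            result += item;
        }
        return result;
    }

    // Same approach as GoodConcatenate: a default-constructed StringBuilder.
    public static string Build(string[] items)
    {
        StringBuilder builder = new StringBuilder();
        foreach (string item in items)
        {
            builder.Append(item);
        }
        return builder.ToString();
    }

    // Runs 'routine' once untimed, then 'repeats' times, and returns the
    // elapsed Stopwatch ticks for the timed runs.
    public static long Measure(Func<string[], string> routine, string[] items, int repeats)
    {
        routine(items); // warm-up so JIT compilation is not measured
        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < repeats; i++)
        {
            routine(items);
        }
        watch.Stop();
        return watch.ElapsedTicks;
    }

    // Prints one line of timings per item count.
    public static void Report()
    {
        foreach (int count in new int[] { 10, 100, 600, 2000 })
        {
            string[] items = new string[count];
            for (int i = 0; i < count; i++)
            {
                items[i] = "item";
            }

            Console.WriteLine("{0,5} strings: += {1} ticks, StringBuilder {2} ticks",
                count, Measure(Concatenate, items, 100), Measure(Build, items, 100));
        }
    }
}
```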

Of course, the other reason for the use of the StringBuilder class is memory allocation. Using the CLRProfiler produced the following memory use timeline for concatenation of 100 simple strings:

Memory usage timeline.

The section marked 'A' shows the effect of the bad string concatenation routine on memory allocation and de-allocation. The maximum allocated memory is increasing rapidly and there is a high number of garbage collections occurring (roughly 215 collections for this section).

The section immediately following the 'A' section shows the memory profile for the good string concatenation routine. The maximum allocated memory is increasing less rapidly and there are far fewer garbage collections being made (roughly 60 collections for this section).

So using the StringBuilder class may not be faster in some cases, but it is kinder to the garbage collector.

Conclusions

Use the String.Compare method for case-insensitive string comparison. It's just faster. Nice and simple.

Use the StringBuilder class for speed increases only if you are concatenating more than about 600 strings within a loop. The caveat here is that the length of the strings you are manipulating may also affect the speed tradeoff, as may the effects on the garbage collector, so you should really perform your own tests for your specific code.

Points of Interest

I was surprised at what a difference using the correct string manipulation methods made to code in the real world (although we do perform a lot of string comparisons and concatenations in my current project).

FxCop's performance rules are a good starting point for finding potentially slow code which can direct you to some easy fixes to improve code performance. Both of the issues discussed here are marked by FxCop as 'non-breaking' which means that the changes should not break any code depending on the code changed. This should be a no-brainer: a non-breaking change for performance improvements should always be made.

History

  • April 2005 - First draft of the article.

License

This article, along with any associated source code and files, is licensed under The Creative Commons Attribution-ShareAlike 2.5 License

About the Author

Dr Herbie
Team Leader
United Kingdom
After ten years studying biology and evolution, I threw it all away and became a fully paid up professional programmer. Since 1990 I have done C/C++ Windows 3.1 API, Pascal, Win32, MFC, OOP, C#, etc, etc.
 
PS. In the picture, I'm the one in blue. On the right.
 

Comments and Discussions

Suggestion: Even faster string comparison! [modified] | Pavel Vladov | 11 Sep '12 - 1:48
If your application is not culture specific you can boost the performance of string comparisons even more by using the String.Compare method with an ordinal ignore-case string comparison. It is faster, as it does not take into account any culture-specific rules. For more details and performance tests of different string comparison methods, take a look at the following article:
 
C# Case Insensitive String Comparison

modified 11 Sep '12 - 8:05.

General: Bad Compare : FxCop Rule | Member 4030420 | 23 Dec '09 - 19:28
Which FxCop Rule catches code described by BadCompare Method and forces to use GoodCompare?

General: Re: Bad Compare : FxCop Rule | Dr Herbie | 24 Dec '09 - 5:52
Hi,
In VS2005 onwards 'FxCop' has evolved into the 'Code Analysis' tool which can be configured from the project's properties. The rule you would want is listed under 'performance' and is rule CA1807, described on MSDN here: http://msdn.microsoft.com/en-us/library/ms182266(VS.100).aspx
 
Dr Herbie
Remember, half the people out there have below average IQs.

General: Re: Bad Compare : FxCop Rule | Member 4030420 | 29 Dec '09 - 19:26
Thanks a lot :)

General: Memory Usage Additional Tests | Cumps | 16 Sep '07 - 7:51
I've done some additional tests into memory usage as well with various string concatenation methods.
 
Might be useful: http://blog.cumps.be/string-concatenation-vs-memory-allocation/

General: I like your article | Formy | 19 Jul '07 - 5:21
Found it interesting.
 
http://www.tbiro.com/Check-empty-string-performance.htm
 
Formy

General: Perhaps an incomplete conclusion | karl beyer | 19 Sep '06 - 5:19
Here's a test case from MSDN, which definitely favours StringBuilder for concatenating anything more than 25 strings:

http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dndotnet/html/vbnstrcatn.asp

Every situation IS different. Frequency of concatenation, number of concatenations and scope of handling are all criteria for the correct solution.
Your article is good, but the message that resonates is "don't use StringBuilder for concatenating fewer than 600 strings", and that is a horrible message to walk away with. The article should highlight that this is ONE test scenario, and that it has only made a case that in THIS scenario manual string concatenation has a performance edge. The message one should walk away with is: if in doubt, test it.

Question: Help with String Variable Comparison | mpham | 10 Aug '06 - 4:10
Hello Everyone,
 
I have a problem with the following code.
 
Here is what I want for the Final Result.
 
I have the following Variables:
 
StartSerialNumber: (Operator Input and this can be ABC1000 or 061000 both Alphanumeric and Numeric Serial Number)

StopSerialNumber: (Operator Input and this can be ABC1000 or 061000 both Alphanumeric and Numeric Serial Number)
 
I just want to validate that the StopSerialNumber is not Less than the StartSerialNumber
 
Please Help.
 
Compiler gave me the following Error:
Compiler Error Message: CS0019: Operator '<=' cannot be applied to operands of type 'string' and 'string'
 
========Code=========
private bool validInputs()
{
    bool v = !no;
    string sSerial = tStartSerialNum.Text.Trim();
    string eSerial = tStopSerialNum.Text.Trim();

    if (eSerial < sSerial)
    {
        StopSerialNumError.InnerHtml = "Invalid Stop Serial Number.";
        v = no;
    }
    return v;
}
 
Thanks
MvP

General: This article is not accurate | yang yu 1799999 | 1 Jan '06 - 23:56
First, performance of the String object has many levels and variables. In this case, you’ve only manipulated 2 of those many variables and based your conclusions on that. Let me explain why:

The basic principle of string concatenation is that a ‘String’ object is a fixed-length array of ‘Char’. For example, the string “apple” is a char array of length 5, with each element holding one character. So whenever two String objects are concatenated, a new string is created whose char array has the combined length of the two strings. Copying into the new array is O(n) in Big-O notation, meaning the loop runs once per character, so “apple” + “banana” takes 11 steps. Behind the scenes it is doing something like this:
 
public char[] concat(char[] str1, char[] str2)
{
    // Allocate a new array large enough for both inputs.
    char[] newChar = new char[str1.Length + str2.Length];

    // Copy the first string.
    for (int count = 0; count < str1.Length; count++)
        newChar[count] = str1[count];

    // Copy the second string after it, offsetting the destination index.
    for (int count = 0; count < str2.Length; count++)
        newChar[str1.Length + count] = str2[count];

    return newChar;
}
 
This is why it takes so long! By using StringBuilder, you can completely eliminate this process. Here is why this article is very inaccurate.

The article uses the StringBuilder object in its worst-case constructor. Do not ever use StringBuilder with no initial capacity. Like the String object, StringBuilder is an array of chars, which you can construct with an initial memory allocation, like a byte buffer. So if you know your final string will be 10000 characters long, then use:
StringBuilder sb = new StringBuilder(10000);
This will ensure it does not run with O(n) performance, and your results will be instant.

So StringBuilder should be used WHENEVER you know the rough length of your final string, which in a lot of cases is easy to find with a simple formula. (For example, if FirstName is a maximum of 10 characters long and 100 iterations are required, then 10*100 is a good initial capacity for the StringBuilder.)

Answer: Re: This article is not accurate | Dr Herbie | 4 Jan '06 - 1:48

I would point you to the message below where I re-ran my code with a preset string length.

As you point out:

yang yu 1799999 wrote:
The article uses the StringBuilder object in its worst-case constructor. Do not ever use StringBuilder with no initial capacity.

This is not documented properly in the standard MSDN library for StringBuilder, so there is a lot of code out there that uses the worst case scenario.
 
I would suggest you try performing a micro-benchmark of your own and look at the actual figures -- there's no reason why you couldn't even write another article and look at this in more depth.
I have intended to update this article for quite some time, but I get distracted by other things (like the new job I start next week, and selling my house, and ...)
 

 
Dr Herbie
Remember, half the people out there have below average IQs.

Last Updated 8 May 2005
Article Copyright 2005 by Dr Herbie