StringBuilder vs. String / Fast String Operations with .NET 2.0

Alois Kraus, 30 Mar 2007 CPOL
Comparison of String/StringBuilder functions. Efficient String handling.

Introduction

Strings are used so heavily in all programming languages that we rarely think about them. We simply use them and hope we are doing the right thing. Normally all goes well, but sometimes we need more performance, so we switch to StringBuilder, which is more efficient because it contains a mutable string buffer. .NET Strings are immutable, which is why a new string object is created every time we alter one (insert, append, remove, etc.).
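
To make the difference between an immutable String and a mutable StringBuilder concrete, here is a minimal sketch. The class and method names are illustrative only and are not part of the test code used for the charts.

```csharp
using System;
using System.Text;

static class ImmutabilityDemo
{
    // Returns true if String.Insert left the original string untouched
    // and handed back a different object.
    public static bool InsertLeavesOriginalIntact()
    {
        string original = "fox";
        string changed = original.Insert(0, "quick ");   // allocates a new string

        return original == "fox"
            && changed == "quick fox"
            && !Object.ReferenceEquals(original, changed);
    }

    // Returns true if StringBuilder.Insert modified the same instance in place.
    public static bool BuilderMutatesInPlace()
    {
        StringBuilder sb = new StringBuilder("fox");
        StringBuilder same = sb.Insert(0, "quick ");      // returns the same builder

        return Object.ReferenceEquals(sb, same)
            && sb.ToString() == "quick fox";
    }
}
```

Both methods return true: the String call produces a second object, while the StringBuilder call changes the buffer it already owns.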

That sounds reasonable, so why do we still use the .NET String class functions and not the faster StringBuilder? Because optimal performance is a tricky thing, and the first rule of the performance club is to measure it for yourself. Do not believe anybody (including me!) who tells you that this or that is faster in every case. It is very difficult to predict the performance of code in advance because so many variables influence the outcome. Looking at the generated MSIL code still does NOT tell you how fast the code will run. If you want to see why your function is slow or fast, you have to look at the compiled (JITed) x86 assembler code to get the full picture.

Greg Young wrote some very nice posts about what the JITer makes of your MSIL code on your CPU. In the following article I will show you the numbers for StringBuilder vs. String that I measured with .NET 2.0 on a P4 3.0 GHz with 1 GB RAM. Every test was performed 5 million times to get a stable value.
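
The measuring harness itself is not shown in this article. As a rough sketch of how such a repeated measurement could look, the following helper times a delegate with System.Diagnostics.Stopwatch. The names Timing, TestAction and Measure are assumptions made for this illustration, and the warm-up call is my own choice, not necessarily what was done for the charts.

```csharp
using System;
using System.Diagnostics;

static class Timing
{
    // Hypothetical delegate type for the code under test (not from the article).
    public delegate void TestAction();

    // Runs 'action' 'iterations' times and returns the elapsed wall-clock time.
    public static TimeSpan Measure(TestAction action, int iterations)
    {
        action();                       // warm-up call so JIT compilation is not timed

        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
            action();
        watch.Stop();

        return watch.Elapsed;
    }
}
```

A call such as Timing.Measure(delegate { "abc".Insert(0, "x"); }, 5000000) would then give one data point. Run it in a release build without a debugger attached, otherwise the numbers say little.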

Insert a String / Remove a character from one

I inserted the missing words at the beginning of the sentence "The quick brown fox jumps over the lazy dog" to find the break-even point between String.Insert and StringBuilder.Insert. To see how the removal of characters performs, I removed one character at a time from the beginning of the test sentence in a for loop. The results are shown in the diagram below.

Screenshot - StringInsert.JPG
// Test functions used for this chart
string StringRemove(string str, int Count)  
{
    for(int i=0;i<Count;i++)
        str = str.Remove(0, 1);

    return str;
}

string StringBuilderRemove(string str, int Count)
{
    StringBuilder sb = new StringBuilder(str);
    for(int i=0;i<Count;i++)    
        sb.Remove(0, 1);

    return sb.ToString();        
} 

string StringInsert(string str, string [] inserts)
{
    foreach (string insert in inserts)
        str = str.Insert(0, insert);


    return str;        
}

string StringBuilderInsert(string str, string [] inserts)
{          
    StringBuilder sb = new StringBuilder(str);
    foreach (string insert in inserts)
        sb.Insert(0, insert);

    return sb.ToString();      
}

We see here that StringBuilder is clearly the better choice if we have to alter the string. Insert and Remove operations are nearly always faster with StringBuilder. The removal of characters is especially fast with StringBuilder, where we gain nearly a factor of two.

Replace one String with another String

Things become more interesting when we replace anywhere from one to five words of our fox test sentence.

Screenshot - StringReplace.JPG
// Test functions used for this chart
string StringReplace(string str, 
                     List<KeyValuePair<string,string>> searchReplace)
{
    foreach (KeyValuePair<string, string> sreplace in searchReplace)
        str = str.Replace(sreplace.Key, sreplace.Value);

    return str;
}

string StringBuilderReplace(string str, 
                            List<KeyValuePair<string,string>> searchReplace)
{
    StringBuilder sb = new StringBuilder(str);
    foreach (KeyValuePair<string, string> sreplace in searchReplace)
        sb.Replace(sreplace.Key, sreplace.Value);

    return sb.ToString();
}

This is somewhat surprising. StringBuilder does not beat String.Replace even if we do many replaces. Our data shows a constant overhead of about 1 s that we pay when we use StringBuilder. The overhead is quite significant (30%) when we have only a few String.Replace calls to do.

String.Format

I checked when StringBuilder.AppendFormat becomes faster than String.Format whose results are appended with the "+" operator.

Screenshot - StringFormat.JPG
// Test functions used for this chart
string StringFormat(string format, int Count, params object[] para)
{
    string str = String.Empty;
    for(int i=0;i<Count;i++)
        str += String.Format(format, para);

    return str;
}

string StringBuilderFormat(string format, int Count, params object[] para )
{ 
    StringBuilder sb = new StringBuilder(); 
    for(int i=0;i<Count;i++) 
        sb.AppendFormat(format, para); 

    return sb.ToString(); 
}

StringBuilder is better when you have to format and concatenate a string more than five times. You can shift the break-even point even further if you recycle the StringBuilder instance.
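
Recycling a StringBuilder instance could look like the sketch below. The class name RecycledFormatter and the initial capacity of 256 are assumptions for illustration. Whether the reuse pays off depends on the runtime version and the string sizes, so measure it for yourself.

```csharp
using System;
using System.Text;

class RecycledFormatter
{
    // One builder kept alive between calls so that its buffer can be reused.
    private readonly StringBuilder _sb = new StringBuilder(256);

    public string FormatRepeated(string format, int count, params object[] para)
    {
        _sb.Length = 0;                 // reset the content without creating a new builder
        for (int i = 0; i < count; i++)
            _sb.AppendFormat(format, para);

        return _sb.ToString();
    }
}
```

Note that an instance of this class must not be shared between threads without locking, because every call writes into the same builder.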

String Concatenation

This is the most interesting test because we have several options here. We can concatenate strings with +, String.Concat, String.Join and StringBuilder.Append.

Screenshot - StringConcat.JPG
// Test functions used for this chart
string Add(params string[] strings)
{
    string ret = String.Empty;    
    foreach (string str in strings)
        ret += str;

    return ret;
}

string Concat(params string[] strings)
{
    return String.Concat(strings);
}

string StringBuilderAppend(params string[] strings)
{
    StringBuilder sb = new StringBuilder();
    foreach (string str in strings)
        sb.Append(str);

    return sb.ToString();
}

string Join(params string[] strings)
{
    return String.Join(String.Empty, strings);
}

And the winner for string concatenation is ... not StringBuilder but String.Join! After taking a deep look with Reflector, I found that String.Join has the most efficient algorithm implemented: it allocates the final buffer in the first pass and then memcopies each string into the just-allocated buffer. This is simply unbeatable. StringBuilder becomes better than the + operator above 7 strings, but this is not really code one would see very often.
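
The two-pass idea described above can be sketched in plain managed code as follows. This is not the framework's implementation, only an illustration of the technique: TwoPassConcat is a made-up name, and the sketch pays for one extra copy when it turns the char array into the result string.

```csharp
using System;

static class TwoPassConcat
{
    // Sketch of the two-pass idea: measure first, allocate once, then copy.
    public static string Join(params string[] strings)
    {
        // Pass 1: compute the final length.
        int total = 0;
        foreach (string s in strings)
            if (s != null)
                total += s.Length;

        // Single allocation of the destination buffer.
        char[] buffer = new char[total];

        // Pass 2: copy every string to its final position.
        int pos = 0;
        foreach (string s in strings)
        {
            if (s == null)
                continue;
            s.CopyTo(0, buffer, pos, s.Length);
            pos += s.Length;
        }

        // Extra copy into the result string (a cost of this managed sketch).
        return new string(buffer);
    }
}
```

The important point is that no intermediate strings are created: however many inputs there are, the sketch allocates one buffer and one result.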

Comparing Strings

String comparison is an often underestimated topic. To compare Unicode strings, your current locale settings have to be taken into account. Unicode characters with values greater than 65535 do not fit into the .NET Char type, which is 16 bits wide. These characters are quite common, especially in Asian countries, which complicates the matter even more (case-insensitive comparisons). The culture-aware comparison function of .NET 2.0 (I guess this is true for .NET 1.x also) is implemented in native code, which costs you a managed-to-unmanaged transition and back.
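
The remark about characters above 65535 can be checked with a few lines. The sketch below uses U+20000, a CJK ideograph outside the 16-bit range, purely as an example; the class and method names are illustrative.

```csharp
using System;

static class SurrogateDemo
{
    // Builds a string holding the single code point U+20000.
    public static string OneSupplementaryChar()
    {
        return Char.ConvertFromUtf32(0x20000);
    }

    // Counts code points rather than 16-bit Char units.
    public static int CountCodePoints(string s)
    {
        int count = 0;
        for (int i = 0; i < s.Length; i++)
        {
            if (Char.IsSurrogatePair(s, i))
                i++;                    // skip the low surrogate of the pair
            count++;
        }
        return count;
    }
}
```

The string returned by OneSupplementaryChar has a Length of 2 although it holds one character, which is exactly why code that walks a string Char by Char can go wrong.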

Screenshot - StringCompare.JPG
// Test functions used for this chart
int StringCompare(string str1, string str2) 
{
    return String.Compare(str1, str2, StringComparison.InvariantCulture);
}


int StringCompareOrdinal(string str1, string str2)
{
    return String.CompareOrdinal(str1, str2);
}

It is good that we compared the string comparison functions. A factor of 3 is really impressive and shows that localization comes with a cost that is not always negligible. Even the innocent-looking mode StringComparison.InvariantCulture goes into the same slow native function, which explains this big difference. When strings are interned, the comparison is much faster (by more than a factor of 30) because the CLR checks for reference equality.
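
Why interned strings allow such a shortcut can be seen in a small sketch. The strings are built at run time on purpose so that they start out as two separate objects; InternDemo and its methods are illustrative names.

```csharp
using System;

static class InternDemo
{
    // Builds "fox" at run time, so every call returns a new string object.
    public static string BuildAtRuntime()
    {
        return new string(new char[] { 'f', 'o', 'x' });
    }

    // Two equal strings built separately are different objects.
    public static bool DistinctBeforeInterning()
    {
        string a = BuildAtRuntime();
        string b = BuildAtRuntime();
        return a == b && !Object.ReferenceEquals(a, b);
    }

    // After interning, equal strings share one object, so a reference
    // check is enough to establish equality.
    public static bool SameAfterInterning()
    {
        string a = String.Intern(BuildAtRuntime());
        string b = String.Intern(BuildAtRuntime());
        return Object.ReferenceEquals(a, b);
    }
}
```

Both methods return true. Keep in mind that String.Intern itself has to look the string up in the intern pool, so interning only helps when the same strings are compared many times.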

To tell the truth, I was surprised by this result too, and for a long time I did not know the use of this strange CompareOrdinal function. String.CompareOrdinal does nothing but compare the strings char by char (16-bit, remember), which is done 100% in managed code. That allows the JITer to flex its optimizing muscles, as you can see. If somebody asks you what CompareOrdinal is good for, you now know the answer. You can (should) use this function on strings that are not visible to the outside world (users) and are therefore never localized. Only then is it safe to use this function. Remember: making a program fast but incorrect is easy; making it both correct and fast is hard. If you mainly deal with UI code, it is a good bet that you should forget this function very quickly.
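
The practical difference between the two comparison modes shows up as soon as upper and lower case are mixed. The wrapper names below are made up for this sketch; the two framework calls are the same ones used in the test functions above.

```csharp
using System;

static class CompareDemo
{
    // Ordinal: compares the raw 16-bit values, so 'a' (97) sorts after 'B' (66).
    public static int Ordinal(string x, string y)
    {
        return String.CompareOrdinal(x, y);
    }

    // Culture-aware (invariant culture): uses linguistic rules. With culture
    // data available, "a" is normally sorted before "B".
    public static int Linguistic(string x, string y)
    {
        return String.Compare(x, y, StringComparison.InvariantCulture);
    }
}
```

CompareDemo.Ordinal("a", "B") is positive, whereas a linguistic comparison of the same pair normally yields a negative value. For internal keys and identifiers this does not matter, but a list sorted ordinally would look wrong to a user.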

Conclusions

The following recommendations are valid for our small test strings (~30 chars) but should be applicable to bigger strings (100-500 chars) as well (measure for yourself!). I have seen many synthetic performance measurements that demonstrate the power of StringBuilder with strings that are 10 KB and bigger. This is the 1% case in real-world programs. Most strings will be significantly shorter. When you optimize a function and you can "feel" the construction cost of an additional object, you have to check very carefully whether you can afford the additional initialization cost of StringBuilder.

String Operation    Most Efficient
Insert              StringBuilder.Insert for > 2 insertion strings; String.Insert otherwise
Remove              StringBuilder is faster for > 2 characters to remove
Replace             String.Replace always
Format              String.Format for < 5 append + format operations; StringBuilder.AppendFormat for > 5 calls
Concatenation       + for 2 strings; String.Join for > 2 strings to concatenate

StringBuilder, the shiny performance saver, does not help in all cases and is, in some cases, slower than other functions. When you want good string concatenation performance, I strongly recommend String.Join, which does an incredible job.

Points of Interest

  • I did not tell you more about the String.Intern function. You need to know more about string interning only if you need to save memory at the cost of processing power.
  • If you want to see a good example of how you can speed up string formatting by a factor of 14 for fixed-length strings, have a look at my blog.
  • Did you notice that there is no String.Reverse in .NET? You would rarely need that function anyway. Greg did put up a little contest to find the fastest String.Reverse function. The functions presented there are fast but do not work correctly with surrogate Unicode characters (chars with a value > 65535). Making it fast and correct is not easy.
  • The test results obtained here are .NET Framework, machine and string length specific. Please do not simply look at the numbers and use this or that function without being certain that the results obtained here are applicable to your concrete problem.
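
Regarding the String.Reverse remark in the list above: a reversal that keeps surrogate pairs intact could be sketched like this. It is not one of the contest entries, StringReverser is a made-up name, and the sketch makes no claim about speed. It also ignores combining characters, which would need additional handling.

```csharp
using System;

static class StringReverser
{
    // Reverses a string while keeping each surrogate pair in its original
    // high/low order, so characters above U+FFFF survive the reversal.
    public static string Reverse(string s)
    {
        char[] result = new char[s.Length];
        int dest = s.Length;

        for (int i = 0; i < s.Length; i++)
        {
            if (Char.IsSurrogatePair(s, i))
            {
                dest -= 2;
                result[dest] = s[i];         // high surrogate
                result[dest + 1] = s[i + 1]; // low surrogate
                i++;                         // the pair consumed two Char units
            }
            else
            {
                result[--dest] = s[i];
            }
        }

        return new string(result);
    }
}
```

A naive Char-by-Char reversal would swap the two halves of each pair and produce an invalid string, which is the correctness problem mentioned above.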

History

  • 28.7.2006 Fixed download. Fine-tuned the coloring of the charts to make them more readable.
  • 27.7.2006 Updated String Comparison graph. Interned string comparison is much faster.
  • 27.7.2006 Fixed bug in String.Concat diagram. The numbers below 3 string concats were wrong. Thanks Greg for pointing this out.
  • 27.7.2006 Changed String.Format diagram to show the full picture of when StringBuilder outperforms String.Format and Concat.
  • 26.7.2006 Released v1.0 on CodeProject

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

About the Author

Alois Kraus
Web Developer
Germany
He works for a multinational company that is a hardware and software vendor of medical equipment. Currently he is located in Germany and enjoys living in general. During his search for programming best practices he was awarded the Patterns and Practices Champion Award by Microsoft. Although he finds pretty much everything interesting, he pays special attention to .NET software development, software architecture and nuclear physics. To complete the picture, he likes hiking in the mountains and collecting crystals.

Comments and Discussions

 
Re: Wrong Testing Method (KinStephen, 30-Mar-07 11:34)
Wouldn't this be even faster since no concatenation would be happening? C# would treat this as a single string. It's also cleaner to read and modify.

public int GetCustCount()
{
   string sql = @"select Count(*)
   from Customers
   where City='Milan'";

   SqlCommand cmd = new SqlCommand(sql);
   //etc
}
 
Stephen

Last Updated 30 Mar 2007
Article Copyright 2006 by Alois Kraus
Everything else Copyright © CodeProject, 1999-2014