
StringBuilder vs. String / Fast String Operations with .NET 2.0

30 Mar 2007, CPOL, 7 min read
Comparison of String/StringBuilder functions. Efficient string handling.

Introduction

Strings are so heavily used in all programming languages that we do not think about them very much. We simply use them and hope we are doing the right thing. Normally all goes well, but sometimes we need more performance, so we switch to StringBuilder, which is more efficient because it contains a mutable string buffer. .NET strings are immutable, which is why a new string object is created every time we alter one (insert, append, remove, etc.).
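A minimal sketch of the difference, assuming nothing beyond the standard String and StringBuilder classes (the class and method names are made up for illustration):

```csharp
using System;
using System.Text;

static class ImmutabilityDemo
{
    // Returns true when "changing" a string leaves the original object untouched.
    public static bool StringAppendCreatesNewObject()
    {
        string original = "fox";
        string alias = original;      // second reference to the same string object
        original += " jumps";         // allocates a new string; alias still refers to "fox"
        return alias == "fox" && !object.ReferenceEquals(original, alias);
    }

    // Returns true when StringBuilder changes its buffer in place.
    public static bool BuilderAppendMutatesInPlace()
    {
        StringBuilder sb = new StringBuilder("fox");
        StringBuilder alias = sb;     // same builder instance
        sb.Append(" jumps");          // modifies the shared buffer
        return alias.ToString() == "fox jumps";
    }
}
```

Both methods return true: the string variable ends up pointing at a new object, while both builder references see the appended text.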

That sounds reasonable, so why do we still use the .NET String class functions and not the faster StringBuilder? Because optimal performance is a tricky thing, and the first rule of the performance club is to measure it for yourself. Do not believe anybody (including me!) who tells you that this or that is faster in every case. It is very difficult to predict the performance of some code in advance because so many variables influence the outcome. Looking at the generated MSIL code still does NOT tell you how fast the code will perform. If you want to see why your function is slow or fast, you have to look at the compiled (JITed) x86 assembler code to get the full picture.

Greg Young wrote some very nice posts about what the JITer makes of your MSIL code on your CPU. In the following article I will show you the numbers for StringBuilder vs. String, which I measured with .NET 2.0 on a P4 3.0 GHz with 1 GB RAM. Every test was performed 5 million times to get a stable value.
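If you want to repeat such measurements yourself, a timing loop along the following lines is one way to do it. This is only a sketch and not the harness used for the charts; Bench, Measure and the Action0 delegate are hypothetical names, and only System.Diagnostics.Stopwatch comes from the framework.

```csharp
using System;
using System.Diagnostics;

static class Bench
{
    // Hypothetical parameterless delegate type for the code under test.
    public delegate void Action0();

    // Runs 'action' 'iterations' times and returns the elapsed milliseconds.
    public static long Measure(Action0 action, int iterations)
    {
        action();                              // warm-up call so JIT compilation is not timed
        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
            action();
        sw.Stop();
        return sw.ElapsedMilliseconds;
    }
}
```

A call could look like `long ms = Bench.Measure(delegate { "fox".Insert(0, "quick "); }, 5000000);`. The delegate invocation adds its own small cost to every iteration, so compare only numbers that were taken with the same harness.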

Insert a String / Remove a character from one

I inserted the missing words at the beginning of the sentence "The quick brown fox jumps over the lazy dog" to find the break-even point between String.Insert and StringBuilder.Insert. To see how the removal of characters performs, I removed one character at a time from the beginning of our test sentence in a for loop. The results are shown in the diagram below.

Screenshot - StringInsert.JPG
C#
// Test functions used for this chart
string StringRemove(string str, int Count)  
{
    for(int i=0;i<Count;i++)
        str = str.Remove(0, 1);

    return str;
}

string StringBuilderRemove(string str, int Count)
{
    StringBuilder sb = new StringBuilder(str);
    for(int i=0;i<Count;i++)    
        sb.Remove(0, 1);

    return sb.ToString();        
} 

string StringInsert(string str, string [] inserts)
{
    foreach (string insert in inserts)
        str = str.Insert(0, insert);


    return str;        
}

string StringBuilderInsert(string str, string [] inserts)
{          
    StringBuilder sb = new StringBuilder(str);
    foreach (string insert in inserts)
        sb.Insert(0, insert);

    return sb.ToString();      
}

We see here that StringBuilder is clearly the better choice if we have to alter the string. Insert and Remove operations are nearly always faster with StringBuilder. The removal of characters is especially fast with StringBuilder, where we gain nearly a factor of two.

Replace one String with another String

Things become more interesting when we replace anywhere from one to five words of our fox test sentence.

Screenshot - StringReplace.JPG
C#
// Test functions used for this chart
string StringReplace(string str, 
                     List<KeyValuePair<string,string>> searchReplace)
{
    foreach (KeyValuePair<string, string> sreplace in searchReplace)
        str = str.Replace(sreplace.Key, sreplace.Value);

    return str;
}

string StringBuilderReplace(string str, 
                            List<KeyValuePair<string,string>> searchReplace)
{
    StringBuilder sb = new StringBuilder(str);
    foreach (KeyValuePair<string, string> sreplace in searchReplace)
        sb.Replace(sreplace.Key, sreplace.Value);

    return sb.ToString();
}

This is somewhat surprising. StringBuilder does not beat String.Replace even if we do many replaces. Our data shows a constant overhead of about 1 s that we pay when we use StringBuilder. The overhead is quite significant (30%) when we have only a few replacements to do.

String.Format

I checked when StringBuilder.AppendFormat is better than String.Format whose results are appended with the "+" operator.

Screenshot - StringFormat.JPG
C#
// Functions used for this chart
string StringFormat(string format, int Count, params object[] para) 
{ 
    string str=String.Empty; 
    for(int i=0;i<Count;i++) 
        str += String.Format(format, para);

    return str;
} 

string StringBuilderFormat(string format, int Count, params object[] para )
{ 
    StringBuilder sb = new StringBuilder(); 
    for(int i=0;i<Count;i++) 
        sb.AppendFormat(format, para); 

    return sb.ToString(); 
}

StringBuilder is better when you have to format and concatenate a string more than five times. You can shift the break-even point even further if you recycle the StringBuilder instance.
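Recycling a StringBuilder could look roughly like this. It is a sketch under assumptions: the class name RecyclingFormatter and the initial capacity of 256 are invented for illustration, and whether the reuse pays off depends on the runtime version, so measure it.

```csharp
using System;
using System.Text;

sealed class RecyclingFormatter
{
    // One builder kept alive across calls instead of creating a new one each time.
    private readonly StringBuilder _sb = new StringBuilder(256);

    public string FormatRepeated(string format, int count, params object[] args)
    {
        _sb.Length = 0;                        // discard the old content before reuse
        for (int i = 0; i < count; i++)
            _sb.AppendFormat(format, args);
        return _sb.ToString();
    }
}
```

The instance holds state between calls, so it must not be shared between threads without synchronization.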

String Concatenation

This is the most interesting test because we have several options here. We can concatenate strings with +, String.Concat, String.Join and StringBuilder.Append.

Screenshot - StringConcat.JPG
C#
// Test functions used for this chart
string Add(params string[] strings)
{
    string ret = String.Empty;    
    foreach (string str in strings)
        ret += str;

    return ret;
}

string Concat(params string[] strings)
{
    return String.Concat(strings);
}

string StringBuilderAppend(params string[] strings)
{
    StringBuilder sb = new StringBuilder();
    foreach (string str in strings)
        sb.Append(str);

    return sb.ToString();
}

string Join(params string[] strings)
{
    return String.Join(String.Empty, strings);
}

And the winner for string concatenation is ... not StringBuilder but String.Join? After taking a deep look with Reflector, I found that String.Join has the most efficient algorithm implemented: it allocates the final buffer size in the first pass and then memcopies each string into the freshly allocated buffer. This is simply unbeatable. StringBuilder does become better than the + operator above 7 strings, but this is not really code one would see very often.
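The two-pass idea can be illustrated in plain managed code. This is not the framework's implementation, only a sketch of the technique; TwoPassConcat is a made-up name.

```csharp
using System;

static class TwoPassConcat
{
    // Pass 1 sums the lengths, pass 2 copies every string into one buffer.
    public static string Concat(params string[] parts)
    {
        int total = 0;
        foreach (string p in parts)
            if (p != null) total += p.Length;

        char[] buffer = new char[total];       // single allocation of the final size
        int pos = 0;
        foreach (string p in parts)
        {
            if (p == null) continue;           // treat null like an empty string
            p.CopyTo(0, buffer, pos, p.Length);
            pos += p.Length;
        }
        return new string(buffer);             // extra copy that internal framework code does not need
    }
}
```

The point is that no intermediate strings are created: the number of allocations stays constant no matter how many parts are passed in, whereas the + operator in a loop allocates a new string per step.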

Comparing Strings

An often underestimated topic is string comparison. To compare Unicode strings, your current locale settings have to be taken into account. Unicode characters with values greater than 65535 do not fit into the .NET Char type, which is 16 bits wide. These characters are quite common especially in Asian countries, which complicates the matter even more (case-insensitive comparisons). The culture-aware comparison function of .NET 2.0 (I guess this is true for .NET 1.x also) is implemented in native code, which costs you a transition from managed to unmanaged code and back.
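To make the 16-bit limitation concrete, here is a small sketch that counts Unicode code points instead of Char values. The class name and the sample character (U+1D11E, the musical G clef) are chosen for illustration only.

```csharp
using System;

static class SurrogateDemo
{
    // U+1D11E lies above U+FFFF and is stored as two 16-bit chars (a surrogate pair).
    public const string Clef = "\U0001D11E";

    // Counts Unicode code points rather than 16-bit chars.
    public static int CountCodePoints(string s)
    {
        int count = 0;
        for (int i = 0; i < s.Length; i++)
        {
            if (char.IsSurrogatePair(s, i))
                i++;                           // skip the low surrogate of the pair
            count++;
        }
        return count;
    }
}
```

SurrogateDemo.Clef.Length is 2 although the string contains a single character, while CountCodePoints reports 1 for it.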

Screenshot - StringCompare.JPG
C#
// Test functions used for this chart
int StringCompare(string str1, string str2) 
{
    return String.Compare(str1, str2, StringComparison.InvariantCulture);
}


int StringCompareOrdinal(string str1, string str2)
{
    return String.CompareOrdinal(str1, str2);
}

It is good that we compared the string comparison functions. A factor of 3 is really impressive and shows that localization comes with a cost that is not always negligible. Even the innocent-looking mode StringComparison.InvariantCulture goes into the same slow native function, which explains this big difference. When strings are interned, the comparison is much faster (by more than a factor of 30) because the CLR checks for reference equality.
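The reference-equality effect of interning can be seen with a few lines. This is a sketch with invented names; it only shows that interned strings with equal content are the same object, not how the CLR shortcuts the comparison internally.

```csharp
using System;
using System.Text;

static class InternDemo
{
    // Builds "fox" at run time so that it is a different object than the literal.
    private static string BuildAtRuntime()
    {
        return new StringBuilder().Append("f").Append("ox").ToString();
    }

    // Equal content, but two distinct objects.
    public static bool SameObjectBeforeIntern()
    {
        return object.ReferenceEquals("fox", BuildAtRuntime());
    }

    // String.Intern returns the one pooled instance for equal content.
    public static bool SameObjectAfterIntern()
    {
        return object.ReferenceEquals(string.Intern("fox"), string.Intern(BuildAtRuntime()));
    }
}
```

SameObjectBeforeIntern returns false and SameObjectAfterIntern returns true, so a comparison of two interned strings with equal content can succeed on the reference check alone.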

To tell the truth, I was surprised by this result too, and for a long time I did not know the use of this strange CompareOrdinal function. String.CompareOrdinal does nothing more than compare the strings char by char (16-bit, remember), which is done 100% in managed code. That allows the JITer to flex its optimizing muscles, as you can see. If somebody asks you what this CompareOrdinal is good for, you now know the answer. You can (and should) use this function on strings that are not visible to the outside world (users) and are therefore never localized. Only then is it safe to use this function. Remember: making a program fast but incorrect is easy; making it work correctly and quickly is hard. When you mainly deal with UI code, it is a good bet that you should forget this function very fast.
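Why ordinal comparison is not safe for user-visible text shows up as soon as upper and lower case are mixed. The wrapper class below is hypothetical; the two framework calls are the same ones used in the test functions above.

```csharp
using System;

static class OrdinalVsCulture
{
    // Ordinal: compares the raw 16-bit char values, so 'a' (97) sorts after 'B' (66).
    public static int Ordinal(string x, string y)
    {
        return String.CompareOrdinal(x, y);
    }

    // Culture-aware: applies the linguistic sort rules of the invariant culture.
    public static int Invariant(string x, string y)
    {
        return String.Compare(x, y, StringComparison.InvariantCulture);
    }
}
```

OrdinalVsCulture.Ordinal("a", "B") is positive, which is not the order a user expects in a sorted list. The culture-aware call normally sorts "a" before "B", but its result depends on the globalization data available on the machine, so check it on your target system.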

Conclusions

The following recommendations are valid for our small test strings (~30 chars) but should be applicable to bigger strings (100-500 chars) as well (measure for yourself!). I have seen many synthetic performance measurements that demonstrate the power of StringBuilder with strings that are 10 KB and bigger. This is the 1% case in real-world programs. Most strings will be significantly shorter. When you optimize a function and you can "feel" the construction costs of an additional object, you have to look very carefully at whether you can afford the additional initialization costs of StringBuilder.

String Operation | Most Efficient
Insert | StringBuilder.Insert for > 2 insertion strings; String.Insert otherwise
Remove | StringBuilder is faster for > 2 characters to remove
Replace | String.Replace always
Format | String.Format for < 5 append + format operations; StringBuilder.AppendFormat for > 5 calls
Concatenation | + for 2 strings; String.Join for > 2 strings to concatenate

The shiny, performance-saving StringBuilder does not help in all cases and is, in some cases, slower than other functions. When you want good string concatenation performance, I strongly recommend String.Join, which does an incredible job.

Points of Interest

  • I did not tell you more about the String.Intern function. You need to know more about string interning only if you need to save memory at the cost of processing power.
  • If you want to see a good example of how to speed up string formatting 14 times for fixed-length strings, have a look at my blog.
  • Did you notice that there is no String.Reverse in .NET? You would rarely need that function anyway. Greg put up a little contest to find the fastest String.Reverse function. The functions presented there are fast but do not work correctly with surrogate pairs (Unicode characters with a value > 65535). Making it fast and correct is not easy.
  • The test results obtained here are .NET Framework, machine and string length specific. Please do not simply look at the numbers and use this or that function without being certain that the results obtained here are applicable to your concrete problem.
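As an illustration of the surrogate problem mentioned for String.Reverse, here is one possible reverse function that keeps surrogate pairs intact. It is a sketch, not one of the contest entries and not tuned for speed; SafeReverse is a made-up name, and combining character sequences are still not handled.

```csharp
using System;
using System.Text;

static class SafeReverse
{
    // Reverses by code point: a high/low surrogate pair keeps its internal order.
    public static string Reverse(string s)
    {
        StringBuilder sb = new StringBuilder(s.Length);
        for (int i = s.Length - 1; i >= 0; i--)
        {
            if (i > 0 && char.IsLowSurrogate(s[i]) && char.IsHighSurrogate(s[i - 1]))
            {
                sb.Append(s[i - 1]);           // high surrogate first
                sb.Append(s[i]);               // then its low surrogate
                i--;                           // both chars are consumed
            }
            else
            {
                sb.Append(s[i]);
            }
        }
        return sb.ToString();
    }
}
```

A naive char-by-char reverse would swap the two halves of each pair and produce invalid UTF-16; this version pays for the correctness with two extra checks per character.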

History

  • 28.7.2006 Fixed download. Fine-tuned the coloring of the charts to make them more readable.
  • 27.7.2006 Updated String Comparison graph. Interned string comparison is much faster.
  • 27.7.2006 Fixed bug in String.Concat diagram. The numbers below 3 string concats were wrong. Thanks Greg for pointing this out.
  • 27.7.2006 Changed String.Format diagram to show the full picture up to the point where StringBuilder outperforms String.Format and Concat.
  • 26.7.2006 Released v1.0 on CodeProject

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Systems Engineer, Siemens
Germany
He works for a multinational company that is a hardware and software vendor of medical equipment. Currently he is located in Germany and enjoys living in general. Although he finds pretty much everything interesting, he pays special attention to .NET software development, software architecture and nuclear physics. To complete the picture, he likes hiking in the mountains and collecting crystals.

Comments and Discussions

 
It would be great to see an update of this article
BillWoodruff, 11-Dec-09 10:34
Hi Alois,

I've always appreciated this article.

Now that we are in the "brave new world" of Linq, anonymous methods, and lambdas, I would love to see a comparison of timings when the input is a long (4k or more) list of words that vary in length from short to very long.

thanks, Bill

"Many : not conversant with mathematical studies, imagine that because it [the Analytical Engine] is to give results in numerical notation, its processes must consequently be arithmetical, numerical, rather than algebraical and analytical. This is an error. The engine can arrange and combine numerical quantities as if they were letters or any other general symbols; and it fact it might bring out its results in algebraical notation, were provisions made accordingly." Ada, Countess Lovelace, 1844
