Introduction
Strings are so heavily used in all programming languages that we rarely think about them. We simply use them and hope we are doing the right thing. Normally all goes well, but sometimes we need more performance, so we switch to StringBuilder, which is more efficient because it contains a mutable string buffer. .NET strings are immutable, which is why a new string object is created every time we alter one (insert, append, remove, etc.).
That sounds reasonable, so why do we still use the .NET String class functions and not the faster StringBuilder? Because optimal performance is a tricky thing, and the first rule of the performance club is to measure it for yourself. Do not believe anybody (including me!) who tells you that this or that is faster in every case. It is very difficult to predict the performance of code in advance because so many variables influence the outcome. Looking at the generated MSIL code still does NOT tell you how fast the code will run. If you want to see why your function is so slow or fast, you have to look at the compiled (JIT-ed) x86 assembly code to get the full picture.
Greg Young wrote some very nice posts about what the JITer makes of your MSIL code on your CPU. In this article I will show you the numbers for StringBuilder vs String, which I measured with .NET 2.0 on a P4 3.0 GHz with 1 GB RAM. Every test was performed 5 million times to get a stable value.
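If you want to repeat such measurements yourself, a minimal timing harness could look like the sketch below. It is not the harness used for the numbers in this article; the names `Timing`, `Measure` and `TestAction` are made up for illustration, and only `System.Diagnostics.Stopwatch` is a real framework class.

```csharp
using System;
using System.Diagnostics;

// Hypothetical delegate type for the code under test.
public delegate void TestAction();

public static class Timing
{
    // Runs the action 'iterations' times and returns the elapsed milliseconds.
    public static long Measure(TestAction action, int iterations)
    {
        action(); // one warm-up call so JIT compilation is not part of the measurement

        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
            action();
        watch.Stop();

        return watch.ElapsedMilliseconds;
    }
}
```

A call such as `Timing.Measure(delegate { "The quick fox".Insert(0, "x"); }, 5000000)` would then time five million inserts. The delegate call itself adds some overhead, so compare only numbers obtained with the same harness.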
Insert a String / Remove a character from one
I inserted the missing words at the beginning of the sentence "The quick brown fox jumps over the lazy dog" to find the break-even point between String.Insert and StringBuilder.Insert. To see how the removal of characters performs, I removed one character at a time from the beginning of our test sentence in a for loop. The results are shown in the diagram below.

string StringRemove(string str, int Count)
{
    for (int i = 0; i < Count; i++)
        str = str.Remove(0, 1);
    return str;
}

string StringBuilderRemove(string str, int Count)
{
    StringBuilder sb = new StringBuilder(str);
    for (int i = 0; i < Count; i++)
        sb.Remove(0, 1);
    return sb.ToString();
}

string StringInsert(string str, string[] inserts)
{
    foreach (string insert in inserts)
        str = str.Insert(0, insert);
    return str;
}

string StringBuilderInsert(string str, string[] inserts)
{
    StringBuilder sb = new StringBuilder(str);
    foreach (string insert in inserts)
        sb.Insert(0, insert);
    return sb.ToString();
}
We see here that StringBuilder is clearly the better choice if we have to alter the string. Insert and Remove operations are nearly always faster with StringBuilder. The removal of characters is especially fast with StringBuilder, where we gain nearly a factor of two.
Replace one String with another String
Things become more interesting when we replace anywhere from one to five words of our fox test sentence.
string StringReplace(string str,
    List<KeyValuePair<string, string>> searchReplace)
{
    foreach (KeyValuePair<string, string> sreplace in searchReplace)
        str = str.Replace(sreplace.Key, sreplace.Value);
    return str;
}

string StringBuilderReplace(string str,
    List<KeyValuePair<string, string>> searchReplace)
{
    StringBuilder sb = new StringBuilder(str);
    foreach (KeyValuePair<string, string> sreplace in searchReplace)
        sb.Replace(sreplace.Key, sreplace.Value);
    return sb.ToString();
}
This is somewhat surprising. StringBuilder does not beat String.Replace even if we do many replacements. Our data shows a constant overhead of about 1 s that we pay whenever we use StringBuilder. The overhead is quite significant (30%) when we have only a few replacements to do.
String.Format
I checked when StringBuilder.AppendFormat is better than String.Format whose results are appended with the "+" operator.
string StringFormat(string format, int Count, params object[] para)
{
    string str = String.Empty;
    for (int i = 0; i < Count; i++)
        str += String.Format(format, para);
    return str;
}

string StringBuilderFormat(string format, int Count, params object[] para)
{
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < Count; i++)
        sb.AppendFormat(format, para);
    return sb.ToString();
}
StringBuilder is better when you have to format and concatenate a string more than five times. You can shift the break-even point even further if you recycle the StringBuilder instance.
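Recycling the StringBuilder could be done roughly as in the following sketch. The class name `RecyclingFormatter` and the initial capacity of 256 are assumptions chosen for illustration and were not part of the measured tests. Setting `Length` to 0 clears the content, and how much of the buffer is actually reused depends on the runtime version, so measure before relying on it.

```csharp
using System.Text;

// Hypothetical helper that keeps one StringBuilder for many format calls.
// Not thread-safe: one instance must not be shared between threads.
public class RecyclingFormatter
{
    private readonly StringBuilder sb = new StringBuilder(256);

    public string Format(string format, int count, params object[] para)
    {
        sb.Length = 0; // discard the previous content instead of creating a new builder
        for (int i = 0; i < count; i++)
            sb.AppendFormat(format, para);
        return sb.ToString();
    }
}
```

The same instance can then serve many calls, for example `formatter.Format("{0};", 3, "fox")` followed by further calls with other formats.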
String Concatenation
This is the most interesting test because we have several options here. We can concatenate strings with +, String.Concat, String.Join and StringBuilder.Append.
string Add(params string[] strings)
{
    string ret = String.Empty;
    foreach (string str in strings)
        ret += str;
    return ret;
}

string Concat(params string[] strings)
{
    return String.Concat(strings);
}

string StringBuilderAppend(params string[] strings)
{
    StringBuilder sb = new StringBuilder();
    foreach (string str in strings)
        sb.Append(str);
    return sb.ToString();
}

string Join(params string[] strings)
{
    return String.Join(String.Empty, strings);
}
And the winner for string concatenation is ... not StringBuilder but String.Join! After taking a deep look with Reflector, I found that String.Join has the most efficient algorithm: it determines the final buffer size in a first pass, allocates it once, and then copies (memcpy) each string into the just-allocated buffer. This is simply unbeatable. StringBuilder does become better than the + operator above 7 strings, but this is not really code one would see very often.
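The two-pass idea described above can be sketched in plain managed code as follows. This is not the framework's implementation, only an illustration of the technique; `TwoPassConcat` is a made-up name. Unlike the internal version, this sketch pays for one extra copy when the char array is turned into a string.

```csharp
using System;

public static class TwoPassConcat
{
    public static string Join(params string[] parts)
    {
        // Pass 1: add up all lengths so the final size is known in advance.
        int total = 0;
        foreach (string part in parts)
            if (part != null) total += part.Length;

        // Exactly one buffer allocation of the final size.
        char[] buffer = new char[total];

        // Pass 2: copy every string to its position in the buffer.
        int offset = 0;
        foreach (string part in parts)
        {
            if (part == null) continue; // null entries are treated as empty
            part.CopyTo(0, buffer, offset, part.Length);
            offset += part.Length;
        }

        return new string(buffer); // extra copy that this managed sketch cannot avoid
    }
}
```

The point of the design is that no intermediate strings are created and the buffer never has to grow, which is what repeated use of + or a StringBuilder with too small a capacity would cause.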
Comparing Strings
An often underestimated topic is string comparison. To compare Unicode strings, your current locale settings have to be taken into account. Unicode characters with values greater than 65535 do not fit into the .NET Char type, which is 16 bits wide. Especially in Asian countries these characters are quite common, which complicates the matter even more (for example for case-insensitive comparisons). The culture-aware comparison function of .NET 2.0 (I guess this is true for .NET 1.x as well) is implemented in native code, which costs you a transition from managed to unmanaged code and back.
int StringCompare(string str1, string str2)
{
    return String.Compare(str1, str2, StringComparison.InvariantCulture);
}

int StringCompareOrdinal(string str1, string str2)
{
    return String.CompareOrdinal(str1, str2);
}
It is good that we compared the string comparison functions. A factor of 3 is really impressive and shows that localization comes with a cost that is not always negligible. Even the innocent-looking mode StringComparison.InvariantCulture goes into the same slow native function, which explains the big difference. When strings are interned, the comparison is much faster (by more than a factor of 30) because the CLR only has to check for reference equality.
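Why interned strings can be compared by reference is illustrated by the following small sketch. `InternDemo` is a made-up name; `String.Intern` and `object.ReferenceEquals` are the real framework calls. It only shows the reference behavior and does not reproduce the measurement.

```csharp
using System;

public static class InternDemo
{
    public static bool SameReferenceAfterIntern()
    {
        // Created at run time, so these are two separate objects with equal content.
        string a = new string('x', 10);
        string b = new string('x', 10);
        bool distinctBefore = !object.ReferenceEquals(a, b);

        // String.Intern returns the one pooled instance for a given content.
        string ia = String.Intern(a);
        string ib = String.Intern(b);

        return distinctBefore && object.ReferenceEquals(ia, ib);
    }
}
```

Once both operands point to the same object, equality can be decided without looking at a single character.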
To tell the truth, I was surprised by this result as well, and for a long time I did not know the use of this strange CompareOrdinal function. String.CompareOrdinal does nothing else than compare the strings char by char (16-bit, remember), which is done 100% in managed code. That allows the JITer to flex its optimizing muscles, as you can see. If somebody asks you what CompareOrdinal is good for, you now know the answer. You can (and should) use this function on strings that are not visible to the outside world (users) and are therefore never localized. Only then is it safe to use this function. Remember: making a program fast but incorrect is easy; making it work correctly and quickly is hard. When you mainly deal with UI code, it is a good bet that you should forget this function very fast.
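That the two comparison modes are not interchangeable can be seen with a small example. The wrapper class `CompareDemo` is made up for illustration; the two framework calls are the same ones used in the test functions above.

```csharp
using System;

public static class CompareDemo
{
    // Ordinal: compares the numeric values of the 16-bit chars,
    // so 'B' (66) sorts before 'a' (97).
    public static int Ordinal(string x, string y)
    {
        return String.CompareOrdinal(x, y);
    }

    // Culture-aware: applies the linguistic sort rules of the invariant culture.
    public static int Linguistic(string x, string y)
    {
        return String.Compare(x, y, StringComparison.InvariantCulture);
    }
}
```

`CompareDemo.Ordinal("a", "B")` is positive because of the char values, whereas on a typical installation the linguistic comparison is expected to sort "a" before "B". A list sorted ordinally can therefore look wrong to a user, which is the reason to keep CompareOrdinal away from text that is displayed.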
Conclusions
The following recommendations are valid for our small test strings (~30 chars) but should be applicable to bigger strings (100-500 chars) as well (measure for yourself!). I have seen many synthetic performance measurements that demonstrate the power of StringBuilder with strings that are 10 KB and bigger. This is the 1% case in real-world programs. Most strings will be significantly shorter. When you optimize a function and you can "feel" the construction cost of an additional object, you have to check very carefully whether you can afford the additional initialization cost of StringBuilder.
| String Operation | Most Efficient |
|---|---|
| Insert | StringBuilder.Insert for > 2 insertion strings, String.Insert otherwise |
| Remove | StringBuilder for > 2 characters to remove |
| Replace | String.Replace always |
| Format | String.Format for < 5 Append + Format operations, StringBuilder.AppendFormat for > 5 calls |
| Concatenation | + for 2 strings, String.Join for > 2 strings to concatenate |
The shiny, performance-saving StringBuilder does not help in all cases and is, in some cases, slower than other functions. When you want good string concatenation performance, I strongly recommend String.Join, which does an incredible job.
Points of Interest
- I did not tell you more about the String.Intern function. You need to know more about string interning only if you want to save memory at the cost of processing power.
- If you want to see a good example of how you can speed up string formatting by a factor of 14 for fixed-length strings, have a look at my blog.
- Did you notice that there is no String.Reverse in .NET? You would rarely need that function anyway. Greg put up a little contest to find the fastest String.Reverse function. The functions presented there are fast but do not work correctly with surrogates (Unicode characters with a value > 65535). Making it fast and correct is not easy.
- The test results obtained here are .NET Framework, machine and string length specific. Please do not simply look at the numbers and use this or that function without being certain that the results obtained here are applicable to your concrete problem.
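To make the surrogate problem mentioned for String.Reverse above more concrete, here is a sketch of a reverse function that keeps surrogate pairs intact. It is not one of the contest entries and makes no claim to be the fastest; `SafeReverse` is a made-up name. It also does not handle combining characters, which would need a text-element-aware approach.

```csharp
using System;

public static class SafeReverse
{
    public static string Reverse(string input)
    {
        if (input == null) throw new ArgumentNullException("input");

        char[] result = new char[input.Length];
        int write = input.Length; // fill the result from the end towards the start

        for (int read = 0; read < input.Length; read++)
        {
            if (read + 1 < input.Length &&
                char.IsSurrogatePair(input[read], input[read + 1]))
            {
                // Copy high and low surrogate together, in their original order.
                write -= 2;
                result[write] = input[read];
                result[write + 1] = input[read + 1];
                read++; // skip the low surrogate that was just copied
            }
            else
            {
                result[--write] = input[read];
            }
        }
        return new string(result);
    }
}
```

A naive char-by-char reverse would swap the two halves of such a pair and produce an invalid string, while this version moves the pair as one unit.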
History
- 28.7.2006 Fixed download. Fine-tuned the coloring of the charts to make them more readable.
- 27.7.2006 Updated String Comparison graph. Interned string comparison is much faster.
- 27.7.2006 Fixed bug in String.Concat diagram. The numbers below 3 string concats were wrong. Thanks Greg for pointing this out.
- 27.7.2006 Changed String.Format diagram to show the full picture up to the point where StringBuilder outperforms String.Format and Concat.
- 26.7.2006 Released v1.0 on CodeProject