
Strings in .NET

Ernst Kuschke, 25 Apr 2004
Strings in .NET are special - this article shows why.

Introduction

A string is a sequence of characters. There are different types of characters, but that is a topic for a different article. I’m not covering the whole structure of the String type here, but just highlighting some of its “special” features!

Immutable and Interned

In .NET, strings are immutable. This means that, once a value is assigned to a String object, it can never be changed. That’s right – you can’t change a String’s value! Take a look at this code:

1  class Test
2  {
3      public static void Main()
4      {
5          string myString = "1234";
6          System.Console.WriteLine(myString);
7          myString += "5678";
8          System.Console.WriteLine(myString);
9      }
10 }

The output from this is:

1234
12345678

Though it seems as if we just changed the value of myString from “1234” to “12345678”, we really didn’t! Let’s step through the above code. In line 5, myString is set to point to a String object on the heap with a value of “1234”; because “1234” is a literal, that object is interned. In line 7, a new string is allocated on the heap, with value “12345678”, and myString now points to this new memory location. So you actually sit with two string objects on the heap, even though you’re only referencing one of them. The “1234” string is still interned, so it stays alive for as long as the application runs. An ordinary, non-interned string that is no longer referenced (such as an intermediate result of a concatenation) becomes eligible for garbage collection.
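One way to see that the original object is left untouched is to keep a second reference to it before appending. The following is a minimal sketch; the class and method names (ImmutabilityDemo, ValueSeenByAlias) are made up for illustration.

```csharp
using System;

class ImmutabilityDemo
{
    // Appends to one variable and reports what a second reference still sees.
    public static string ValueSeenByAlias()
    {
        string original = "1234";
        string alias = original;   // second reference to the same string object

        original += "5678";        // builds a new string; the old object is untouched

        return alias;
    }

    public static void Main()
    {
        Console.WriteLine(ValueSeenByAlias()); // 1234
    }
}
```

If strings were mutable, alias would have printed “12345678” as well.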

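If you now declare any number of string variables, all initialized with the literal “1234”, they would all point to the one interned instance. This ensures that literal strings use memory very efficiently. Strings that are built at run time are not interned automatically, as the section on references to interned strings shows.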

When you assign the value “1234” to a string, your string could thus be pointing to the same location as other, already existing strings. Now, imagine the chaos you could cause by changing the content of your string – you’d change the content of *ALL* other strings pointing to that location! This is an important reason for strings’ immutability.
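The sharing described above can be checked with object.ReferenceEquals, which compares references rather than contents. This is a small sketch with made-up names (LiteralSharingDemo and its methods). Note that it is the literals that share one instance; a string assembled at run time from characters is a separate object unless it is explicitly interned.

```csharp
using System;

class LiteralSharingDemo
{
    // Two variables initialized with the same literal refer to one object.
    public static bool LiteralsShareInstance()
    {
        string a = "1234";
        string b = "1234";
        return object.ReferenceEquals(a, b);
    }

    // A string built at run time has the same content but is a different object.
    public static bool RuntimeStringSharesInstance()
    {
        string literal = "1234";
        string built = new string(new char[] { '1', '2', '3', '4' });
        return object.ReferenceEquals(literal, built);
    }

    public static void Main()
    {
        Console.WriteLine(LiteralsShareInstance());       // True
        Console.WriteLine(RuntimeStringSharesInstance()); // False
    }
}
```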

Performance

The gain from interning strings is in memory use, and is quite obvious. When the same literal is used in a thousand places, you’d use only the memory space needed for one instance – the variables would all point to the same memory address. However, consider the following scenario:

class Test
{
    public static void Main()
    {
        string myString = "1";
        myString += "2";   // each += builds a new string object
        myString += "3";
        myString += "4";
        System.Console.WriteLine(myString);
    }
}

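We needed one string here with the value of “1234”, but in actual fact, myString has pointed to four different string objects on the heap! (“1”, “12”, “123” and “1234” - our variable myString points to the last one, and that is not even counting the literals “2”, “3” and “4”.) The intermediate strings “12” and “123” are created only to be thrown away. This seems like a bad situation, where the string’s behavior actually decreases performance!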

For this reason, there is the StringBuilder class (in the System.Text namespace). By using a StringBuilder, we can enhance the performance of the string concatenation in our previous listing as follows:

using System.Text;   // StringBuilder lives in System.Text

class Test
{
    public static void Main()
    {
        StringBuilder mySB = new StringBuilder(4);   // initial capacity: 4 characters
        mySB.Append("1");
        mySB.Append("2");
        mySB.Append("3");
        mySB.Append("4");
        System.Console.WriteLine(mySB.ToString());
    }
}

The StringBuilder constructor accepts a parameter to specify the initial capacity of its buffer, measured in characters. In our case, we chose 4, since we know that this would be the length of our string. This reserves room for that many characters, where you can chop and change your string to your heart’s content. If you append anything that would overrun the capacity, the StringBuilder’s buffer grows automatically. Note that it is better to choose a capacity that is slightly too big than to have your StringBuilder’s buffer grow often.
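The growth of the buffer can be observed through the Capacity and Length properties. The sketch below uses made-up names (CapacityDemo, AppendPastCapacity). How far the capacity grows is up to the runtime, so the exact value is printed without a stated expectation.

```csharp
using System;
using System.Text;

class CapacityDemo
{
    // Starts with room for 4 characters, then appends 8 in total.
    public static StringBuilder AppendPastCapacity()
    {
        StringBuilder sb = new StringBuilder(4);
        sb.Append("1234");   // fits the initial capacity
        sb.Append("5678");   // exceeds it, so the buffer has to grow
        return sb;
    }

    public static void Main()
    {
        StringBuilder sb = AppendPastCapacity();
        Console.WriteLine(sb.ToString());            // 12345678
        Console.WriteLine(sb.Length);                // 8
        Console.WriteLine(sb.Capacity >= sb.Length); // True
        Console.WriteLine(sb.Capacity);              // exact value depends on the runtime
    }
}
```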

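When many pieces are joined, appending to a StringBuilder outperforms repeated string concatenation by far. Every concatenation copies all the characters accumulated so far into a newly allocated string, while Append usually just writes into the existing buffer, so there is much less overhead in terms of allocating new objects and collecting the old ones.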
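A rough way to see the difference on your own machine is to time both approaches with System.Diagnostics.Stopwatch. This is only a sketch: the names (ConcatTiming, WithConcat, WithBuilder) and the iteration count are arbitrary choices, and the measured times will vary with hardware and runtime, so no figures are claimed here.

```csharp
using System;
using System.Diagnostics;
using System.Text;

class ConcatTiming
{
    // Builds the result with repeated concatenation: one new string per iteration.
    public static string WithConcat(int count)
    {
        string s = "";
        for (int i = 0; i < count; i++)
        {
            s += "x";
        }
        return s;
    }

    // Builds the same result in a single StringBuilder sized up front.
    public static string WithBuilder(int count)
    {
        StringBuilder sb = new StringBuilder(count);
        for (int i = 0; i < count; i++)
        {
            sb.Append("x");
        }
        return sb.ToString();
    }

    public static void Main()
    {
        const int count = 20000;   // arbitrary; raise it to widen the gap

        Stopwatch sw = Stopwatch.StartNew();
        string a = WithConcat(count);
        sw.Stop();
        Console.WriteLine("Concatenation: " + sw.ElapsedMilliseconds + " ms");

        sw = Stopwatch.StartNew();
        string b = WithBuilder(count);
        sw.Stop();
        Console.WriteLine("StringBuilder: " + sw.ElapsedMilliseconds + " ms");

        Console.WriteLine(a == b);   // True: both approaches build the same text
    }
}
```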

References to interned strings

Look at the following code:

1  class Test
2  {
3      public static void Main()
4      {
5          string firstString = "1234";
6          System.Text.StringBuilder sb = new System.Text.StringBuilder(4);
7          sb.Append("1234");
8          string secondString = sb.ToString();
9          string thirdString = string.Intern(sb.ToString());
10         System.Console.WriteLine((object)secondString == (object)firstString);
11         System.Console.WriteLine((object)thirdString == (object)firstString);
12     }
13 }

The output of this would be:

False
True

In line 6, the StringBuilder is allocated space on the heap – separate from the space of firstString. This makes perfect sense, since at this time the CLR does not know yet that the value of sb will be the same as that of firstString. So in line 8, instead of pointing to the interned firstString, secondString will point to a separate string object produced by the StringBuilder. The casts to object in lines 10 and 11 force a comparison of references; without them, == would compare the strings’ contents and print True both times. If you wish to make use of an interned string (if it exists), do it as in line 9.

The Intern method returns a reference to the interned string if it exists. If it does not exist, it adds the specified string to the intern pool, and returns a reference to this newly interned string.
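To look up an interned string without adding anything to the pool, there is also String.IsInterned, which returns the interned reference, or null if the value has not been interned. Below is a minimal sketch comparing the two calls; the names InternDemo and BuildAtRuntime are made up for illustration.

```csharp
using System;
using System.Text;

class InternDemo
{
    // Produces "1234" at run time, so the result is not the interned literal.
    public static string BuildAtRuntime()
    {
        return new StringBuilder().Append("12").Append("34").ToString();
    }

    public static void Main()
    {
        string literal = "1234";          // interned because it is a literal
        string built = BuildAtRuntime();  // same content, different object

        Console.WriteLine(object.ReferenceEquals(built, literal));                    // False
        Console.WriteLine(object.ReferenceEquals(string.Intern(built), literal));     // True
        Console.WriteLine(object.ReferenceEquals(string.IsInterned(built), literal)); // True
    }
}
```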

Final words

It isn’t necessary to use a StringBuilder for every concatenation. When you append just two (or a relatively small number of) strings together, simply concatenate them. If you have to concatenate in a loop with many iterations, use a StringBuilder. In my opinion, for readability, concatenate your strings, but if performance suffers noticeably, consider using a StringBuilder.

Once again, this article is published on my blog, and can be discussed over there.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.


About the Author

Ernst Kuschke
Web Developer
South Africa
Ernst develops software for the agricultural industry.
He moderates the SADeveloper INETA community, and gives talks on development-related stuff from time to time.
He is an MVP in Visual C#, but whilst he advocates the adoption of .NET, he is also skilled in non-Microsoft technologies.
Sometimes he chats about it.

Last Updated 26 Apr 2004
Article Copyright 2004 by Ernst Kuschke