
Introduction

Extension methods (a new feature of C# 3.0) are useful because they let you "add" methods to a class without modifying its source code. From the point of view of writing code and IntelliSense, such methods behave like member methods. This is very useful for built-in .NET classes or third-party libraries. Hundreds of articles have been written about this; the aim of this article is not to introduce extension methods, but to present a collection of the most useful extension methods for the System.String class.

This article presents a small library (one code file plus unit tests for that code). Some of the extension methods have been collected from various websites, and some were written by me. The unit tests are included for demonstration purposes.

Background

For those who don't know about extension methods, I suggest reading this nice article on Wikipedia.

Using the Code

Let me introduce the source code without much delay. The first method was written by David Hayden and checks if an email ID is in valid format.

/// <summary>
/// true, if is valid email address
/// from http://www.davidhayden.com/blog/dave/archive/2006/11/30/ExtensionMethodsCSharp.aspx
/// </summary>
/// <param name="s">email address to test</param>
/// <returns>true, if is valid email address</returns>
public static bool IsValidEmailAddress(this string s)
{
    return new Regex(@"^[\w-\.]+@([\w-]+\.)+[\w-]{2,6}$").IsMatch(s);
}

The counterpart test method is the following:

[TestMethod()]
public void IsValidEmailAddressTest()
{
    Assert.IsTrue("yellowdog@someemail.uk".IsValidEmailAddress());
    Assert.IsTrue("yellow.444@email4u.co.uk".IsValidEmailAddress());
    Assert.IsFalse("adfasdf".IsValidEmailAddress());
    Assert.IsFalse("asd@asdf".IsValidEmailAddress());
}

I have found a lot of URL validation functions, but not all of them seemed to be OK. This method is inspired by a regular expression written by bb, which seems to work fine.

/// <summary>
/// Checks if url is valid. 
/// from http://www.osix.net/modules/article/?id=586
/// and changed to match http://localhost
/// 
/// complete (not only http) url regex can be found 
/// at http://internet.ls-la.net/folklore/url-regexpr.html
/// </summary>
/// <param name="url">url to validate</param>
/// <returns>true, if the string matches the url pattern</returns>
public static bool IsValidUrl(this string url)
{
    string strRegex = "^(https?://)"
+ "?(([0-9a-z_!~*'().&=+$%-]+: )?[0-9a-z_!~*'().&=+$%-]+@)?" //user@
+ @"(([0-9]{1,3}\.){3}[0-9]{1,3}" // IP- 199.194.52.184
+ "|" // allows either IP or domain
+ @"([0-9a-z_!~*'()-]+\.)*" // tertiary domain(s)- www.
+ @"([0-9a-z][0-9a-z-]{0,61})?[0-9a-z]" // second level domain
+ @"(\.[a-z]{2,6})?)" // first level domain- .com or .museum is optional
+ "(:[0-9]{1,5})?" // port number- :80
+ "((/?)|" // a slash isn't required if there is no file name
+ "(/[0-9a-z_!~*'().;?:@&=+$,%#-]+)+/?)$";
    return new Regex(strRegex).IsMatch(url);
}

The counterpart test method is the following:

/// <summary>
///A test for IsValidUrl
///</summary>
[TestMethod()]
public void IsValidUrlTest()
{
    Assert.IsTrue("http://www.codeproject.com".IsValidUrl());
    Assert.IsTrue("https://www.codeproject.com/#some_anchor".IsValidUrl());
    Assert.IsTrue("https://localhost".IsValidUrl());
    Assert.IsTrue("http://www.abcde.nf.net/signs-banners.jpg".IsValidUrl());
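    // Parentheses are required: without them, IsValidUrl() would be
    // called on the second literal only, before the concatenation.
    Assert.IsTrue(("http://aa-bbbb.cc.bla.com:80800/test/" + 
                  "test/test.aspx?dd=dd&id=dki").IsValidUrl());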
    Assert.IsFalse("http:wwwcodeprojectcom".IsValidUrl());
    Assert.IsFalse("http://www.code project.com".IsValidUrl());
}

I have written a third method that tests whether a homepage supplied by the user actually exists:

/// <summary>
/// Check if url (http) is available.
/// </summary>
/// <param name="httpUrl">url to check</param>
/// <example>
/// string url = "www.codeproject.com";
/// if( !url.UrlAvailable())
///     ...codeproject is not available
/// </example>
/// <returns>true if available</returns>
public static bool UrlAvailable(this string httpUrl)
{
    // Prepend a scheme only when neither http:// nor https:// is present.
    if (!httpUrl.StartsWith("http://") && !httpUrl.StartsWith("https://"))
        httpUrl = "http://" + httpUrl;
    try
    {
        HttpWebRequest myRequest = (HttpWebRequest)WebRequest.Create(httpUrl);
        myRequest.Method = "GET";
        myRequest.ContentType = "application/x-www-form-urlencoded";
        // Dispose the response so the underlying connection is released.
        using (HttpWebResponse myHttpWebResponse = 
           (HttpWebResponse)myRequest.GetResponse())
        {
            return true;
        }
    }
    catch
    {
        return false;
    } 
}

The counterpart test method is the following:

[TestMethod()]
public void UrlAvailableTest()
{
    Assert.IsTrue("www.codeproject.com".UrlAvailable());
    Assert.IsFalse("www.asjdfalskdfjalskdf.com".UrlAvailable());
}

An example of reversing a string can be found on Wikipedia. This version, which avoids an explicit loop, looks better.

/// <summary>
/// Reverse the string
/// from http://en.wikipedia.org/wiki/Extension_method
/// </summary>
/// <param name="input"></param>
/// <returns></returns>
public static string Reverse(this string input)
{
    char[] chars = input.ToCharArray();
    Array.Reverse(chars);
    return new String(chars);
}

The counterpart test method is as follows:

[TestMethod()]
public void ReverseTest()
{
    string input = "yellow dog";
    string expected = "god wolley";
    string actual = input.Reverse();
    Assert.AreEqual(expected, actual);
}

Sometimes, you need to provide a preview of a long text. This can be done using this Reduce extension method:

/// <summary>
/// Reduce string to shorter preview which is optionally ended by some string (...).
/// </summary>
/// <param name="s">string to reduce</param>
/// <param name="count">Length of returned string including endings.</param>
/// <param name="endings">optional endings of reduced text</param>
/// <example>
/// string description = "This is very long description of something";
/// string preview = description.Reduce(20,"...");
/// produce -> "This is very long..."
/// </example>
/// <returns>the shortened string, or the original if it already fits</returns>
public static string Reduce(this string s, int count, string endings)
{
    // Treat a null ending as empty so the length checks below are safe.
    if (endings == null)
        endings = string.Empty;
    if (count < endings.Length)
        throw new ArgumentException("Failed to reduce to less than endings length.", "count");
    if (count >= s.Length)
        return s; //it's too short to reduce
    // Leave room for the ending within the requested total length.
    return s.Substring(0, count - endings.Length) + endings;
}

The counterpart test method is the following:

[TestMethod()]
public void ReduceTest()
{
    string input = "The quick brown fox jumps over the lazy dog";
    int count = 10; 
    string endings = "...";
    string expected = "The qui...";
    string actual = input.Reduce(count, endings);
    Assert.AreEqual(expected, actual);
}

Sometimes you need to parse a phone number or a price, and the user might have typed spaces into the string. Rather than forcing the user to retype the value, and to avoid duplicating test conditions, you can use the RemoveSpaces extension method when parsing numbers.

/// <summary>
/// Remove space characters; line ends are kept.
/// Useful when parsing user input such as a phone number or a
/// price: int.Parse("1 000 000".RemoveSpaces(),.....
/// </summary>
/// <param name="s">input string</param>
/// <returns>string without spaces</returns>
public static string RemoveSpaces(this string s)
{
    return s.Replace(" ", "");
}

The counterpart test method is the following:

[TestMethod()]
public void RemoveSpacesTest()
{
    string input = "yellow dog" + Environment.NewLine  + "black cat";
    string expected = "yellowdog" + Environment.NewLine + "blackcat";
    string actual = input.RemoveSpaces();
    Assert.AreEqual(expected, actual);
}
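RemoveSpaces only strips the plain space character. As a rough sketch of a broader variant, the hypothetical RemoveWhiteSpaceKeepLines below (the name and class are assumptions, not part of the article's library) also drops tabs and non-breaking spaces while still leaving line ends alone:

```csharp
using System.Text;

public static class WhiteSpaceSketch
{
    // Hypothetical helper: removes every whitespace character
    // except carriage return and line feed.
    public static string RemoveWhiteSpaceKeepLines(this string s)
    {
        StringBuilder sb = new StringBuilder(s.Length);
        foreach (char c in s)
        {
            // Keep line ends; drop all other whitespace (space, tab, NBSP, ...).
            if (c == '\r' || c == '\n' || !char.IsWhiteSpace(c))
                sb.Append(c);
        }
        return sb.ToString();
    }
}
```

This may matter for pasted input, where group separators are sometimes non-breaking spaces rather than ordinary ones.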

If you need to ensure that user input is a number and you want to be tolerant of the number format, use the IsNumber extension.

/// <summary>
/// true, if the string can be parsed as Double or Int32, respectively.
/// Spaces are not considered.
/// </summary>
/// <param name="s">input string</param>
/// <param name="floatpoint">true, if Double is considered,
/// otherwise Int32 is considered.</param>
/// <returns>true, if the string contains only digits or float-point</returns>
public static bool IsNumber(this string s, bool floatpoint)
{
    int i;
    double d;
    string withoutWhiteSpace = s.RemoveSpaces();
    if (floatpoint)
        return double.TryParse(withoutWhiteSpace, NumberStyles.Any,
            Thread.CurrentThread.CurrentUICulture , out d);
    else
        return int.TryParse(withoutWhiteSpace, out i);
}

The counterpart test method is the following:

[TestMethod()]
public void IsNumberTest()
{
    Thread.CurrentThread.CurrentUICulture = CultureInfo.InvariantCulture;

    Assert.IsTrue("12345".IsNumber(false));
    Assert.IsTrue("   12345".IsNumber(false));
    Assert.IsTrue("12.345".IsNumber(true));
    Assert.IsTrue("   12,345 ".IsNumber(true));
    Assert.IsTrue("12 345".IsNumber(false));
    Assert.IsFalse("tractor".IsNumber(true));
}

A more restrictive version of the IsNumber method is IsNumberOnly, which ensures that all characters are digits, optionally allowing a decimal separator (. or ,). This could also be done using LINQ via s.ToCharArray().Where(...).Count() == 0.

/// <summary>
/// true, if the string contains only digits or float-point.
/// Spaces are not considered.
/// </summary>
/// <param name="s">input string</param>
/// <param name="floatpoint">true, if float-point is considered</param>
/// <returns>true, if the string contains only digits or float-point</returns>
public static bool IsNumberOnly(this string s, bool floatpoint)
{
    s = s.Trim();
    if (s.Length == 0)
        return false;
    foreach (char c in s)
    {
        if (!char.IsDigit(c))
        {
            if (floatpoint && (c == '.' || c == ','))
                continue;
            return false;
        }
    }
    return true;
}

The counterpart test method is the following:

[TestMethod()]
public void IsNumberOnlyTest()
{
    Assert.IsTrue("12345".IsNumberOnly(false));
    Assert.IsTrue("   12345".IsNumberOnly(false));
    Assert.IsTrue("12.345".IsNumberOnly(true));
    Assert.IsTrue("   12,345 ".IsNumberOnly(true));
    Assert.IsFalse("12 345".IsNumberOnly(false));
    Assert.IsFalse("tractor".IsNumberOnly(true));
}
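The LINQ alternative mentioned for IsNumberOnly could look roughly like the following sketch. It uses All rather than Where(...).Count() == 0, which expresses the same check; the method name IsNumberOnlyLinq and its class are assumptions made for illustration.

```csharp
using System.Linq;

public static class NumberOnlyLinqSketch
{
    // Hypothetical LINQ-based variant of IsNumberOnly.
    public static bool IsNumberOnlyLinq(this string s, bool floatpoint)
    {
        s = s.Trim();
        // Every character must be a digit, or '.' / ',' when floatpoint is set.
        return s.Length > 0
            && s.All(c => char.IsDigit(c) || (floatpoint && (c == '.' || c == ',')));
    }
}
```

It is intended to give the same results as the loop version for the inputs used in IsNumberOnlyTest.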

Michael Kaplan describes a very useful method for removing diacritics (accents) from strings. It is useful when implementing URL rewriting, and you need to generate valid and readable URLs.

/// <summary>
/// Remove accent from strings 
/// </summary>
/// <example>
///  input:  "Příliš žluťoučký kůň úpěl ďábelské ódy."
///  result: "Prilis zlutoucky kun upel dabelske ody."
/// </example>
/// <param name="s"></param>
/// <remarks>found at http://stackoverflow.com/questions/249087/how-do-i-remove-diacritics-accents-from-a-string-in-net</remarks>
/// <returns>string without accents</returns>
public static string RemoveDiacritics(this string s)
{
    string stFormD = s.Normalize(NormalizationForm.FormD);
    StringBuilder sb = new StringBuilder();

    for (int ich = 0; ich < stFormD.Length; ich++)
    {
        UnicodeCategory uc = CharUnicodeInfo.GetUnicodeCategory(stFormD[ich]);
        if (uc != UnicodeCategory.NonSpacingMark)
        {
            sb.Append(stFormD[ich]);
        }
    }
    return (sb.ToString().Normalize(NormalizationForm.FormC));
}

The counterpart test method is the following:

/// <summary>
///A test for RemoveDiacritics
///</summary>
[TestMethod()]
public void RemoveDiacriticsTest()
{
    //contains all czech accents
    string input = "Příliš žluťoučký kůň úpěl ďábelské ódy.";
    string expected = "Prilis zlutoucky kun upel dabelske ody.";
    string actual = input.RemoveDiacritics();
    Assert.AreEqual(expected, actual);
}

When I was programming in PHP, Nl2Br was a very useful PHP function. This one was posted by DigiMortal.

/// <summary>
/// Replace \r\n or \n by <br />
/// from http://weblogs.asp.net/gunnarpeipman/archive/2007/11/18/c-extension-methods.aspx
/// </summary>
/// <param name="s"></param>
/// <returns></returns>
public static string Nl2Br(this string s)
{
    return s.Replace("\r\n", "<br />").Replace("\n", "<br />");
}

The counterpart test method is the following:

[TestMethod()]
public void Nl2BrTest()
{
    string input = "yellow dog" + Environment.NewLine + "black cat";
    string expected = "yellow dog<br />black cat";
    string actual = input.Nl2Br();
    Assert.AreEqual(expected, actual);
}

The MD5 function can be used in almost every application.

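// Shared provider, created on first use.
static MD5CryptoServiceProvider s_md5 = null;

/// <summary>
/// Compute the MD5 hash of the string as a lowercase hex string.
/// from http://weblogs.asp.net/gunnarpeipman/archive/2007/11/18/c-extension-methods.aspx
/// </summary>
/// <param name="s">string to hash</param>
/// <returns>MD5 hash as lowercase hexadecimal digits</returns>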
public static string MD5(this string s)
{
    if( s_md5 == null) //creating only when needed
        s_md5 = new MD5CryptoServiceProvider();
    Byte[] newdata = Encoding.Default.GetBytes(s);
    Byte[] encrypted = s_md5.ComputeHash(newdata);
    return BitConverter.ToString(encrypted).Replace("-", "").ToLower();
}

The counterpart test method is the following:

[TestMethod()]
public void MD5Test()
{
    string input = "The quick brown fox jumps over the lazy dog";
    string expected = "9e107d9d372bb6826bd81d3542a419d6";
    string actual = input.MD5();
    Assert.AreEqual(expected, actual);
}

Points of Interest

While writing this article, I have found an extensive library here. Unfortunately, some links don't work.

History

My vote of 4
Paw Jershauge
6:35 19 Jan '10
I see that some people in the discussion area don't understand the usage of extension methods.
But I like what you have done here, so I will give you 4 out of 5.

Keep up the good work.

With great code, comes great complexity, so keep it simple stupid...

My vote of 1
Frederic Sivignon
23:57 13 Sep '09
Good stuff, but putting those extensions on the string class is not a good approach. For example, strings have nothing to do with email addresses.
Re: My vote of 1
Paw Jershauge
6:29 19 Jan '10
Well, you couldn't be further away from understanding this.
The idea here is to extend the usage of the string.
It's just like having int.Parse("123"): here you could use an extension method to make a ToInt method on the string, which would simplify the code to look like this: "123".ToInt().

Now this does not mean that it's a good idea to use the method on "ABC".ToInt(); this would of course fail, just like int.Parse would.

By giving your vote of 1, that, to me, just indicates that you have no idea of what you are talking about.

I would recommend learning and knowing the subject before voting.

With great code, comes great complexity, so keep it simple stupid...
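The ToInt idea from the comment above might be sketched as follows. The class name is an assumption, and the method simply forwards to int.Parse, so it throws FormatException for input such as "ABC", as the comment points out.

```csharp
using System;

public static class ToIntSketch
{
    // Hypothetical helper: thin wrapper so that "123".ToInt() reads naturally.
    public static int ToInt(this string s)
    {
        // Same behavior as int.Parse, including its exceptions.
        return int.Parse(s);
    }
}
```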

My vote of 1
jachymko
16:37 23 May '09  
ignoring problems with the individual methods, having this kind of stuff in String extensions is the most stupid idea I've seen in a while
Re: My vote of 1
Paw Jershauge
6:31 19 Jan '10
Once again: I would recommend learning and knowing the subject before voting.
You obviously don't know what you are talking about.

With great code, comes great complexity, so keep it simple stupid...

Good article
Donsw
7:05 8 Feb '09
Good article. I will try some of these now that I know they are here.
My vote of 2
Rasqual Twilight
2:00 6 Dec '08  
many flaws in that approach
Re: My vote of 2
Paw Jershauge
6:32 19 Jan '10
Once again: I would recommend learning and knowing the subject before voting.
You obviously don't know what you are talking about.
But if you do... please post your point of view on the flaws to the author...

With great code, comes great complexity, so keep it simple stupid...

Static Regex.IsMatch() method
TobiasP
4:47 4 Dec '08
The Regex class has both static and instance overloads of the IsMatch() method. If the static IsMatch(string input, string pattern) overload is used in the article's extension methods, you avoid creating new temporary instances of Regex on a couple of occasions. The result of the code would not change.
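As a minimal sketch of that suggestion, the email check could call the static overload directly. The pattern is the one from the article; the method name IsValidEmailAddressStatic and the class are assumptions chosen so the sketch does not clash with the original method.

```csharp
using System.Text.RegularExpressions;

public static class StaticRegexSketch
{
    // Same pattern as the article's IsValidEmailAddress.
    private const string EmailPattern = @"^[\w-\.]+@([\w-]+\.)+[\w-]{2,6}$";

    // Hypothetical variant that uses the static Regex.IsMatch overload
    // instead of constructing a new Regex instance on every call.
    public static bool IsValidEmailAddressStatic(this string s)
    {
        return Regex.IsMatch(s, EmailPattern);
    }
}
```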
Bullet list of methods in the Introduction
HC72
23:43 25 Nov '08  
Would be nice. Makes it faster for the reader. Of course, if they were URLs to the actual section in the article it would be even nicer. Thanks.
About MD5
ichramm
2:39 25 Nov '08
I think that you are doing too much work in that function, and that is not good for performance.

Here is the code that I use to encrypt using MD5; I hope it helps.

// It is important for the provider to be static,
// because creating an instance of MD5CryptoServiceProvider takes some time,
// and you must ensure the best performance for your application.
static MD5CryptoServiceProvider s_md5 = new MD5CryptoServiceProvider();

string MD5(string data)
{
    Byte[] newdata = Encoding.Default.GetBytes(data);
    Byte[] encrypted = s_md5.ComputeHash(newdata);
    return BitConverter.ToString(encrypted).Replace("-", "").ToLower();
}

Note that you can change the encoding to use.

Saludos!!

____Juan

Re: About MD5
TobiasP
2:37 27 Nov '08
On the other hand, if this method is put into a library where the MD5 method might never be used, the MD5CryptoServiceProvider is created unnecessarily. Perhaps creation of s_md5 on first use is a better alternative?

Speaking of performance though: using the ToString() method of BitConverter requires one for-loop, Replace() probably at least one more, and ToLower() a third. I would assume that using a StringBuilder with a capacity given as an argument to the constructor and a foreach loop that appends each byte as a string (ToLower() is not needed, by the way: "x" should produce lowercase hexadecimal digits, "X" uppercase) would be more efficient, as it would only require one loop. However, it might not be of much importance as the strings are so small, and using BitConverter makes the code easier to read.
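The single-loop formatting described above might look like the sketch below. Several details are assumptions rather than the article's code: the name Md5Hex, the use of UTF-8 instead of Encoding.Default, and creating the hash object per call via MD5.Create() instead of sharing a static provider. The "x2" format (two lowercase hex digits per byte) is used so that every byte contributes exactly two characters.

```csharp
using System.Security.Cryptography;
using System.Text;

public static class Md5HexSketch
{
    // Hypothetical variant: formats the hash in one loop with a
    // pre-sized StringBuilder instead of BitConverter + Replace + ToLower.
    public static string Md5Hex(this string s)
    {
        // Assumption: UTF-8 input encoding (the article uses Encoding.Default).
        byte[] data = Encoding.UTF8.GetBytes(s);
        using (MD5 md5 = MD5.Create())
        {
            byte[] hash = md5.ComputeHash(data);
            StringBuilder sb = new StringBuilder(hash.Length * 2);
            foreach (byte b in hash)
                sb.Append(b.ToString("x2")); // two lowercase hex digits
            return sb.ToString();
        }
    }
}
```

For the input used in MD5Test, this should produce the same digest as the article's method, since that string is plain ASCII.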
Re: About MD5
Tomas Kubes
6:38 27 Nov '08
I have already made the correction; I hope it appears soon.
IsNumber is using InvariantCulture
x2develop.com
13:49 22 Nov '08  
The IsNumber method is using InvariantCulture, so it may fail with some culture specific numbers.
Re: IsNumber is using InvariantCulture
Tomas Kubes
16:40 22 Nov '08
Yes, but I think it is more likely that users in some specific cultures will be working with , and . together.
Re: IsNumber is using InvariantCulture
x2develop.com
23:44 22 Nov '08
That's not true. E.g. 1,23.45 is valid for English, but not for Czech. That's the simplest case.

Jiri {x2} Cincura

Re: IsNumber is using InvariantCulture
TobiasP
2:04 27 Nov '08
In my opinion any such interpretation method should use the current culture rather than the invariant culture, if not otherwise specified (as an additional argument). That is how parsing methods usually work in .NET. IsNumberOnly is even more flawed: it would consider, e.g., ",,," as a valid floating point number. It could use the current culture's NumberFormatInfo to look for the NumberDecimalSeparator instead, and ensure that at most one such separator occurs in the string, and that digits occur before and/or after, but using TryParse as in IsNumber seems much easier and more reliable anyway as long as we talk about a number that fits into the standard numerical types.
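A rough sketch of the stricter check described above is shown below. The name IsNumberOnlyStrict, its class, and the explicit NumberFormatInfo parameter are assumptions made for illustration; a caller wanting current-culture behavior would pass NumberFormatInfo.CurrentInfo.

```csharp
using System.Globalization;

public static class NumberOnlyStrictSketch
{
    // Hypothetical stricter variant: digits only, plus at most one
    // decimal separator taken from the supplied NumberFormatInfo.
    public static bool IsNumberOnlyStrict(this string s, bool floatpoint, NumberFormatInfo format)
    {
        if (s == null)
            return false;
        s = s.Trim();
        string separator = format.NumberDecimalSeparator;
        int digits = 0;
        int separators = 0;
        int i = 0;
        while (i < s.Length)
        {
            if (char.IsDigit(s[i]))
            {
                digits++;
                i++;
            }
            else if (floatpoint &&
                     string.CompareOrdinal(s, i, separator, 0, separator.Length) == 0)
            {
                // The separator may be longer than one character.
                separators++;
                i += separator.Length;
            }
            else
            {
                return false;
            }
        }
        // Require at least one digit and no more than one separator.
        return digits > 0 && separators <= 1;
    }
}
```

With this check, ",,," is rejected, and "12,345" is accepted only when the supplied format uses "," as its decimal separator.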
Re: IsNumber is using InvariantCulture
Tomas Kubes
6:22 27 Nov '08
I have already made the correction; I hope it appears soon.
....and why not?
Rob Philpott
6:06 18 Nov '08
Good stuff. Extension methods worry me slightly, being an old purist, but as you point out they have their place here.

Perhaps checking for valid URLs and email addresses would better be done in specialised classes or as ordinary methods rather than 'specialising' one of the most primitive types. You could find your IntelliSense getting very cluttered with this.

The two things I miss most in .NET regarding strings are the good old fashioned Left and Right functions which seem to have been replaced with the more precarious and fiddly Substring method. These could be easily implemented like this and I'm surprised you missed them.

Substring. Bah. Looks like a Javaism. Don't get me started on Java. Why the Hell do they .CompareTo rather than just use == ??

Regards,
Rob Philpott.
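The Left and Right functions mentioned in this comment could be sketched as extension methods roughly as follows. The class name and the choice to return the whole string when the requested length exceeds it are assumptions, not something the article specifies.

```csharp
using System;

public static class LeftRightSketch
{
    // Hypothetical helper: first 'length' characters, or the whole
    // string when it is shorter than that.
    public static string Left(this string s, int length)
    {
        if (s == null) throw new ArgumentNullException("s");
        if (length < 0) throw new ArgumentOutOfRangeException("length");
        return length >= s.Length ? s : s.Substring(0, length);
    }

    // Hypothetical helper: last 'length' characters, or the whole
    // string when it is shorter than that.
    public static string Right(this string s, int length)
    {
        if (s == null) throw new ArgumentNullException("s");
        if (length < 0) throw new ArgumentOutOfRangeException("length");
        return length >= s.Length ? s : s.Substring(s.Length - length);
    }
}
```

For example, "yellow dog".Left(6) yields "yellow" and "yellow dog".Right(3) yields "dog".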

Re: ....and why not?
Ramon Smits
23:58 18 Nov '08  
I agree..

This is NOT the correct way to validate string values.
Re: ....and why not?
Tomas Kubes
12:44 21 Nov '08  
And why not?
Url validation is invalid
megger83
5:40 18 Nov '08
I didn't investigate your code too much, but I found a bug in the URL validation. The part "(:[0-9]{1,4})?" of the regex is invalid, because it limits the port range to 9999, but the allowed port range on TCP is 0 to 65535. A quick fix would be to alter the part to "(:[0-9]{1,5})?", but this would also match a port of 99999, which is obviously invalid. So you have to write a more complex regex to accomplish a valid validation.

I also think creating a directory if it doesn't exist isn't a role of the string class but of the System.IO.Directory class.
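One way the port range could be restricted to 0..65535 is sketched below, first as a regex fragment and then as a simpler numeric check on an already extracted port string. All names here are assumptions for illustration, and the fragment has not been wired into the article's IsValidUrl pattern.

```csharp
using System.Globalization;
using System.Text.RegularExpressions;

public static class PortRangeSketch
{
    // Hypothetical regex fragment: each alternative covers one slice
    // of the range, from 65530-65535 down to 0-59999.
    public const string PortRange =
        "(6553[0-5]|655[0-2][0-9]|65[0-4][0-9]{2}|6[0-4][0-9]{3}|[1-5]?[0-9]{1,4})";

    // Checks a bare port string against the regex fragment.
    public static bool IsValidPortRegex(this string portText)
    {
        return Regex.IsMatch(portText, "^" + PortRange + "$");
    }

    // Alternative without a complex regex: parse digits, then compare.
    public static bool IsValidPort(this string portText)
    {
        int port;
        // NumberStyles.None accepts digits only (no sign, no spaces).
        return int.TryParse(portText, NumberStyles.None,
                            CultureInfo.InvariantCulture, out port)
            && port <= 65535;
    }
}
```

The numeric check is arguably easier to read and maintain; the regex form is only needed if the whole URL must be validated by a single pattern.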
Re: Url validation is invalid
Aiscrim
6:01 18 Nov '08  
megger83 wrote:
I also think, creating a directory if it doesn't exist, isn't a role of the string class but of the System.IO.Directory class.

And in fact, the System.IO.Directory already does it, so this extension method is completely useless.

From the online help for Directory.CreateDirectory, Remarks section:

"Any and all directories specified in path are created, unless they already exist or unless some part of path is invalid. The path parameter specifies a directory path, not a file path. If the directory already exists, this method does nothing."
Re: Url validation is invalid
Tomas Kubes
6:25 18 Nov '08
Hi, you are right. I was under the mistaken impression that I could call that function only if the directory doesn't exist. I am going to fix it. It is useless then.
Just one comment
Dmitri Nesteruk
5:20 18 Nov '08  
http://wwwcodeprojectcom is a valid URL. Imagine you have a server called WWWCODEPROJECTCOM on a LAN and you want people to access it.
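The point about dot-less host names can be illustrated with the framework's own parser. The sketch below is an alternative to the article's regex, not part of it; the method name IsWellFormedHttpUrl and its class are assumptions.

```csharp
using System;

public static class UriCheckSketch
{
    // Hypothetical helper: accepts any absolute http/https URI that
    // System.Uri can parse, including single-label hosts on a LAN.
    public static bool IsWellFormedHttpUrl(this string url)
    {
        Uri parsed;
        return Uri.TryCreate(url, UriKind.Absolute, out parsed)
            && (parsed.Scheme == Uri.UriSchemeHttp
                || parsed.Scheme == Uri.UriSchemeHttps);
    }
}
```

With this check, "http://wwwcodeprojectcom" is accepted just like "https://localhost", while a string without a scheme is rejected.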


Last Updated 27 Nov 2008 | Copyright © CodeProject, 1999-2010