Don't count spaces when counting words.
I also use a regular expression to count words; it returns the same number of words as MS Word. I wrap the regular expression in a string extension method to make it easy to use.
using System.Text.RegularExpressions;

public static class StringExtensions
{
    /// <summary>
    /// Word-count pattern: a word is any run of characters that are not
    /// whitespace, !, ?, ¡, ¿, a hyphen, or an en dash.
    /// </summary>
    private const string WordCountRegex = @"[^\s!?¡¿\-\–]+";

    /// <summary>
    /// Shared, compiled Regex instance, built once and reused on every call
    /// </summary>
    private static readonly Regex regexWordCounts =
        new Regex(WordCountRegex, RegexOptions.Compiled);

    /// <summary>
    /// Returns the number of words in a given <paramref name="sentence" />
    /// </summary>
    /// <param name="sentence">Text in which to count words</param>
    /// <returns>Number of words, or zero if the text is null or empty</returns>
    public static int WordCounts(this string sentence)
    {
        // Regex.Matches throws on null input, so guard explicitly
        // rather than swallowing every exception.
        if (string.IsNullOrEmpty(sentence))
        {
            return 0;
        }

        return regexWordCounts.Matches(sentence).Count;
    }
}
Taking the samples above, this would give the following:
string input = "The total number of words \t this sentence is 10.";
int wordCounts = input.WordCounts(); // Returns 9

input = "Mr O'Brien-Smith arrived at 8.30 and spent \t $1,000.99";
wordCounts = input.WordCounts(); // Returns 9 (the hyphen splits O'Brien-Smith into two words)
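To see why both samples come out at 9, it can help to print the tokens the pattern actually matches. The following is only a minimal sketch: the `WordCountDemo` class and its `Tokens` helper are hypothetical names introduced for illustration and are not part of the extension class above; only the pattern is taken from it.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class WordCountDemo
{
    // Same pattern as WordCountRegex above.
    private static readonly Regex WordPattern = new Regex(@"[^\s!?¡¿\-\–]+");

    // Hypothetical helper: returns the matched tokens instead of only the count.
    public static string[] Tokens(string text)
    {
        return WordPattern.Matches(text ?? string.Empty)
                          .Cast<Match>()
                          .Select(m => m.Value)
                          .ToArray();
    }

    public static void Main()
    {
        string second = "Mr O'Brien-Smith arrived at 8.30 and spent \t $1,000.99";

        // Prints one token per line: Mr, O'Brien, Smith, arrived, at,
        // 8.30, and, spent, $1,000.99 (nine tokens in total).
        foreach (string token in Tokens(second))
        {
            Console.WriteLine(token);
        }

        string first = "The total number of words \t this sentence is 10.";

        // Splitting on a single space keeps the lone tab as its own piece,
        // so it yields 10 pieces where the pattern finds 9 words.
        Console.WriteLine(first.Split(' ').Length);
        Console.WriteLine(Tokens(first).Length);
    }
}
```

The tab never shows up as a token because `\s` is in the excluded set, which is the point of not counting spaces. The hyphen is excluded too, so a hyphenated name counts as two words; drop `\-` from the pattern if you want it counted as one.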
Hope this helps.