Hi,


I want to know how many words there are in a text file.
Can you please help me?

If you are doing this in C#, read the file into a string or a buffer, apply a regular expression to locate the words in the file's content, and count the matches returned. This takes less code than looping through all the characters yourself.

using System;
using System.IO;
using System.Text.RegularExpressions;

public class CountWordsProgram
{
  // Counts every run of non-whitespace characters as one word.
  public static int CountWords(string completeText)
  {
    MatchCollection collection = Regex.Matches(completeText, @"\S+");
    return collection.Count;
  }

  public static void Main(string[] args)
  {
    Console.WriteLine(CountWords(File.ReadAllText(@"filename.txt")));
  }
}
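For comparison, the same idea can be sketched in C++ with std::regex. This is only an illustration: the function name count_words_regex is hypothetical, and a "word" is assumed to be any run of non-whitespace characters, as in the pattern above.

```cpp
#include <cstddef>
#include <iterator>
#include <regex>
#include <string>

// Hypothetical helper: counts maximal runs of non-whitespace characters.
std::size_t count_words_regex(const std::string& text)
{
    static const std::regex word(R"(\S+)");
    // sregex_iterator walks over every non-overlapping match in the text.
    auto first = std::sregex_iterator(text.begin(), text.end(), word);
    auto last = std::sregex_iterator();
    return static_cast<std::size_t>(std::distance(first, last));
}
```

Calling count_words_regex on the contents of the file (read into a std::string) gives the count.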
   
A naive example in C++:
#include <iostream>
#include <fstream>
using namespace std;

bool in_word_set(char c)
{ // here you should define the set of characters allowed in a word
  return  ( c >= 'A' && c <= 'Z' ) || (c>='a' && c<='z');
}

int main()
{
  ifstream ifs;
  ifs.open("foo.txt");
  char c;
  int words = 0;
  bool inside = false;
  while ( ifs.get(c).good())
  {
    if (inside) 
    {
      if ( ! in_word_set(c) )  inside = false;
    }
    else
    {
      if (  in_word_set(c) )
      {
        inside = true;
        words++;
      }
    }
  }
  ifs.close();
  cout << words << endl;
  return 0;
}
   
You can try to refine this crude, fast example:
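// Requires: using System; using System.IO;
private static int CountWords(string textFileName)
{
    string textFileContents = File.ReadAllText(textFileName);
    // '\r' is included so that blank lines in files with Windows line endings are not counted as words.
    char[] separators = new char[] { ' ', '\t', '\r', '\n' };
    return textFileContents.Split(separators, StringSplitOptions.RemoveEmptyEntries).Length;
}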
   
Please remember
that in some text files the words may be separated only by tabs, too :)
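One way to avoid listing the separators by hand, sketched in C++: formatted stream extraction with >> skips any whitespace (spaces, tabs, carriage returns, newlines) between tokens. The function name count_words_stream is hypothetical.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: counts whitespace-separated tokens read from a stream.
std::size_t count_words_stream(std::istream& in)
{
    std::size_t n = 0;
    std::string word;
    // operator>> skips leading whitespace and reads up to the next whitespace.
    while (in >> word)
        ++n;
    return n;
}
```

The same function works on a std::ifstream opened on the text file or on a std::istringstream built from a string.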
   
Read each character in the file. For each space, tab, or newline, increment the count by 1. If the words are separated by exactly one such character and the file ends with a newline, the final value of count is the number of words in the file. :)
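A minimal C++ sketch comparing a plain separator count with a count of word starts, which also handles repeated separators and a missing final newline. Both function names are hypothetical, and std::isspace is used as the separator test, which is slightly broader than space, tab and newline.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Counts whitespace characters, one increment per separator.
std::size_t count_separators(const std::string& text)
{
    std::size_t n = 0;
    for (unsigned char c : text)
        if (std::isspace(c)) ++n;
    return n;
}

// Counts transitions from whitespace (or start of text) to non-whitespace.
std::size_t count_word_starts(const std::string& text)
{
    std::size_t n = 0;
    bool inside = false; // true while the previous character was part of a word
    for (unsigned char c : text) {
        bool space = std::isspace(c) != 0;
        if (!space && !inside) ++n;
        inside = !space;
    }
    return n;
}
```

For "one two" the separator count is 1 while the word-start count is 2; for "one  two\n" (two spaces) the separator count is 3 while the word-start count is still 2.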
   
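using System;
using System.IO;
using System.Windows.Forms;

int intcount = 0; // running word count
string[] strfile = File.ReadAllLines(@"C:\Documents and Settings\Gratiff\Desktop\meeting.txt");
for (int i = 0; i < strfile.Length; i++)
{
    // Split the line on spaces; consecutive spaces produce empty entries.
    string[] strword = strfile[i].Split(' ');
    for (int j = 0; j < strword.Length; j++)
    {
        // Count only the non-empty entries.
        if (strword[j] != "")
        {
            intcount++;
        }
    }
}
MessageBox.Show(Convert.ToString(intcount));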
   

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)



