Posted 8 Apr 2000

String Tokenizer class

Zoly Farkas, 8 Apr 2000
A customizable string tokenizer.


Here is a customizable string tokenizer class. You attach it to a CString object and tokenize that string. The tokenizing is highly customizable: you can specify which characters make up words, which count as whitespace, how numbers are handled, and so on. If you are familiar with the StreamTokenizer class from Java, you already know how to use this class.

The tokenizer class is called CStringTokenizer. By default, it is initialized with standard parsing parameters, but you can reset them and customize the class to work the way you want. To use the tokenizer, the string you pass must be terminated with the FILE_EOF character. This is because the algorithm is written so that it can also be used for streams. (If you modify the class, you can adapt it to read from a stream instead of a string.) For example, a simple parse looks like this:

// The tokenizer expects the string to end with FILE_EOF
m_sSampleString += FILE_EOF;
CStringTokenizer tokenizer(m_sSampleString);
// Append each token's string value on its own line
while (TT_EOF != tokenizer.NextToken())
    m_sResultString += tokenizer.GetStrValue() + _T("\r\n");

// Remove the EOF character that was appended to the string
m_sSampleString = m_sSampleString.Left(m_sSampleString.GetLength() - 1);
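
Why a terminating character is needed may be easier to see in a stand-alone sketch of a sentinel-terminated scanner. The code below is not CStringTokenizer: it uses std::string instead of CString, and the names kEof and SplitWords are hypothetical. The value chosen for kEof is also an assumption, since the article does not state what FILE_EOF is defined as.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical sentinel; the real FILE_EOF is defined by the tokenizer's header.
const char kEof = '\x1A';

// Splits sentinel-terminated text into whitespace-separated words.
// The loop never looks at the string length: it stops only when it reads
// kEof, the same way a stream reader stops at end of input.
std::vector<std::string> SplitWords(const std::string& text)
{
    std::vector<std::string> words;
    std::string current;
    std::size_t pos = 0;
    for (;;) {
        const char ch = text[pos++];
        const bool atEnd = (ch == kEof);
        if (atEnd || std::isspace(static_cast<unsigned char>(ch))) {
            // A separator or the sentinel closes the word in progress.
            if (!current.empty()) {
                words.push_back(current);
                current.clear();
            }
            if (atEnd)
                break;
        } else {
            current += ch;
        }
    }
    return words;
}
```

In this sketch, calling SplitWords on a string that does not end with kEof would read past the end of the buffer, which is the kind of failure that appending the terminator first avoids.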

You need to include: StringTokenizer.h.

The class's public interface is:

  // Public functions
  // Constructor; specify the associated string
  CStringTokenizer(CString& string);
  virtual ~CStringTokenizer();    // Destructor
  double GetNumValue(); // Get the numeric value of the token
  void PascalComments(BOOL bFlag); // Enable/disable Pascal comments
  virtual CString GetStrValue(); // Get the string value of the token
  void QuoteChar(int ch); // Specify the quote char
  int LineNo(); // Get the current line number
  virtual void PushBack(); // Go back one token (can go back only once)
  virtual int NextToken(); // Get the next token
  void LowerCaseMode(BOOL bFlag); // Enable/disable lower-case mode (case sensitivity)
  void SlSlComments(BOOL bFlag); // Enable/disable // comments
  void SlStComments(BOOL bFlag); // Enable/disable /* */ comments
  void EolIsSignificant(BOOL bFlag); // Treat EOL as a token or not
  void ParseNumbers(); // Set up the parsing so that numbers are parsed
  // Character Type Setting Functions
  // Reset the syntax so that characters are assigned no special meaning
  void ResetSyntax();
  // Set the chars in the range as chars that can be used for words
  void WordChars(int cLow, int cHi);
  // Set the chars in the range as whitespace chars
  void WhiteSpaceChars(int cLow, int cHi);
  // Set the chars in the range as ordinary chars
  void OrdinaryChars(int cLow, int cHi);
  // Set the char as ordinary
  void OrdinaryChar(int ch);
  // Set the char as a char used for commenting
  void CommentChar(int ch);
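
The character type setting functions follow the model of Java's StreamTokenizer, where each character is assigned a class that decides how it is tokenized. The following is a minimal sketch of how such a per-character table could work. It is not the actual CStringTokenizer code: CharType, CharTable and Tokenize are hypothetical names, std::string stands in for CString, and numbers, quotes and comments are left out.

```cpp
#include <array>
#include <string>
#include <vector>

// Hypothetical character classes, loosely modelled on Java's StreamTokenizer.
enum class CharType { Ordinary, Word, Whitespace };

// A sketch of a per-character type table (one entry per 8-bit character).
class CharTable {
public:
    CharTable() { ResetSyntax(); }

    // Every character becomes ordinary, i.e. has no special meaning.
    void ResetSyntax() { m_types.fill(CharType::Ordinary); }

    void WordChars(int cLow, int cHi)       { SetRange(cLow, cHi, CharType::Word); }
    void WhiteSpaceChars(int cLow, int cHi) { SetRange(cLow, cHi, CharType::Whitespace); }
    void OrdinaryChars(int cLow, int cHi)   { SetRange(cLow, cHi, CharType::Ordinary); }
    void OrdinaryChar(int ch)               { SetRange(ch, ch, CharType::Ordinary); }

    CharType TypeOf(char ch) const { return m_types[static_cast<unsigned char>(ch)]; }

private:
    // Assigns one class to every character in [cLow, cHi], clamped to 0..255.
    void SetRange(int cLow, int cHi, CharType type) {
        if (cLow < 0) cLow = 0;
        if (cHi > 255) cHi = 255;
        for (int c = cLow; c <= cHi; ++c)
            m_types[static_cast<std::size_t>(c)] = type;
    }
    std::array<CharType, 256> m_types;
};

// Runs of word characters form one token, each ordinary character is a
// token of its own, and whitespace only separates tokens.
std::vector<std::string> Tokenize(const std::string& text, const CharTable& table)
{
    std::vector<std::string> tokens;
    std::string word;
    for (char ch : text) {
        const CharType type = table.TypeOf(ch);
        if (type == CharType::Word) {
            word += ch;
            continue;
        }
        if (!word.empty()) {
            tokens.push_back(word);
            word.clear();
        }
        if (type == CharType::Ordinary)
            tokens.push_back(std::string(1, ch));
    }
    if (!word.empty())
        tokens.push_back(word);
    return tokens;
}
```

With a table set up by WordChars('a', 'z'), WordChars('0', '9') and WhiteSpaceChars(0, ' '), this sketch splits "x1 = y+2" into the five tokens x1, =, y, + and 2. Whether a digit continues a word or starts a separate token depends entirely on the class assigned to it in the table.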


The class uses the specified string directly, not a copy of it.
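
What working on the string directly means in practice can be shown with a small sketch. It assumes the tokenizer stores the reference it receives in its constructor, which the article does not spell out. RefScanner is a hypothetical stand-in, and std::string replaces CString.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in that keeps a reference to the caller's string
// instead of copying it (an assumption about how the tokenizer behaves).
class RefScanner {
public:
    explicit RefScanner(std::string& text) : m_text(text) {}

    // Always reports the current length of the caller's string.
    std::size_t Length() const { return m_text.size(); }

private:
    std::string& m_text;  // refers to the caller's object; no copy is made
};
```

Under that assumption, any change made to the string after the scanner is constructed is visible to it, and the string has to stay alive for as long as the scanner is in use.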

If you find any bugs, please send me a report to:

That's it!


This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.


About the Author

Zoly Farkas
United States
No Biography provided

Comments and Discussions

General: Bug
Erik_G, 3-Nov-09 23:29

Question: Bug?
aslak, 23-Apr-03 23:28
The string "1word" is split into two tokens. That is not what I expected! Is it possible to avoid this?

General: Some strings missing _T macro
stormcoder, 28-Dec-01 12:55

General: Suggestions
Duncan Strand, 4-Sep-01 11:27

General: Re: Suggestions
Duncan Strand, 4-Sep-01 11:44

Question: How can use it to parse & evaluate a function?
Bulent Ozkir, 5-Oct-00 23:21

Answer: Re: How can use it to parse & evaluate a function?
Anonymous, 5-Feb-03 0:04

Last Updated 9 Apr 2000
Article Copyright 2000 by Zoly Farkas