String Tokenizer class

8 Apr 2000
A customizable string tokenizer.

Introduction

Here is a customizable string tokenizer class. You attach it to a CString object and then tokenize that string. The tokenizing is highly configurable: you can specify which characters form words, which count as whitespace, how numbers are handled, and so on. If you are familiar with the StreamTokenizer class from Java, you already know how to use this class.

The tokenizer class is called CStringTokenizer. By default it is initialized with standard parsing parameters, but you can reset these and customize the class to work the way you want. To use the tokenizer, the string you pass must be terminated with the FILE_EOF character, because the algorithm is written so that it can also be used for streams. (If you modify the class, you can adapt it to read from a stream instead of a string.) For example, a simple parsing loop looks like this:

// Append the terminating EOF character required by the tokenizer
m_sSampleString += FILE_EOF;
CStringTokenizer tokenizer(m_sSampleString);
while (TT_EOF != tokenizer.NextToken())
{
    // Collect each token on its own line
    m_sResultString += tokenizer.GetStrValue() + _T("\r\n");
}

// Remove the EOF character that was appended to the string
m_sSampleString = m_sSampleString.Left(m_sSampleString.GetLength() - 1);

You need to include: StringTokenizer.h.
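
To illustrate why the terminating character matters, here is a minimal sketch in standard C++ rather than MFC. It is not the CStringTokenizer implementation: the names kEof and SplitWords are hypothetical, and the sentinel value is an assumption, since the article does not show how FILE_EOF is defined. The scanning loop stops on the sentinel instead of on the string length, which is the same contract a character stream would offer.

```cpp
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical sentinel standing in for FILE_EOF; the real value comes from
// StringTokenizer.h and is not shown in the article.
const char kEof = '\x1A';

// Splits `text` into whitespace-separated words. The scanning loop stops on
// the sentinel, not on the string length, so the caller must append kEof.
std::vector<std::string> SplitWords(const std::string& text)
{
    // Enforce the contract up front so the loop below cannot run off the end.
    if (text.empty() || text.back() != kEof)
        throw std::invalid_argument("text must be terminated with kEof");

    std::vector<std::string> words;
    std::string current;
    for (std::size_t i = 0; ; ++i)
    {
        const char ch = text[i];
        const bool atEnd = (ch == kEof);
        if (atEnd || std::isspace(static_cast<unsigned char>(ch)))
        {
            // A separator or the sentinel closes the word in progress.
            if (!current.empty())
            {
                words.push_back(current);
                current.clear();
            }
            if (atEnd)
                break;
        }
        else
        {
            current += ch;
        }
    }
    return words;
}
```

Because the loop only looks at one character at a time and never asks for the total length, the same logic could be fed from a stream that yields the sentinel at end of input.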

The class's public interface is:

public:
  // Public functions
  
  // Constructor, specify the associated string
  CStringTokenizer(CString& string);
  virtual ~CStringTokenizer();    // Destructor
  double GetNumValue(); // Get the numeric value of the token
  void PascalComments(BOOL bFlag); // Enable / disable Pascal comments
  virtual CString GetStrValue(); // Get the string value of the token
  void QuoteChar(int ch); // Specify the quote char
  int LineNo(); // Get the current line number
  virtual void PushBack(); // Push back one token (can go back only once)
  virtual int NextToken(); // Get the next token
  void LowerCaseMode(BOOL bFlag); // Enable / disable case sensitivity
  void SlSlComments(BOOL bFlag); // Enable / disable // comments
  void SlStComments(BOOL bFlag); // Enable / disable /* comments
  void EolIsSignificant(BOOL bFlag); // Consider EOL as a token or not
  void ParseNumbers(); // Set up the parsing so that numbers are parsed
  
  // Character Type Setting Functions
  
  // Reset the syntax so that characters are assigned no special meanings
  void ResetSyntax(); 
  // Set the chars in the range as chars that can be used for words
  void WordChars(int cLow, int cHi); 
  // Set the chars in the range as whitespace chars
  void WhiteSpaceChars(int cLow, int cHi);
  // Set the chars in the range as ordinary chars  
  void OrdinaryChars(int cLow, int cHi); 
  // Set the char as ordinary
  void OrdinaryChar(int ch);
  // Set the char as a char used for commenting 
  void CommentChar(int ch); 
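
One common way to back character type setting functions like these is a lookup table with one entry per character code. The sketch below shows that idea in standard C++. It is an assumption about the general technique, not the actual internals of CStringTokenizer: the CharTable class, the CharKind enum, and KindOf are hypothetical names, and treating "no special meaning" as "ordinary" follows Java's StreamTokenizer rather than anything stated in the article.

```cpp
#include <array>
#include <cstddef>

// Hypothetical character categories; the real class may use different ones.
enum class CharKind { Ordinary, Word, WhiteSpace, Comment };

// Table-driven classifier whose setters mirror the names in the interface.
class CharTable
{
public:
    CharTable() { ResetSyntax(); }

    // Assumption: "no special meaning" is modelled as Ordinary.
    void ResetSyntax() { m_kinds.fill(CharKind::Ordinary); }
    void WordChars(int cLow, int cHi) { SetRange(cLow, cHi, CharKind::Word); }
    void WhiteSpaceChars(int cLow, int cHi) { SetRange(cLow, cHi, CharKind::WhiteSpace); }
    void OrdinaryChars(int cLow, int cHi) { SetRange(cLow, cHi, CharKind::Ordinary); }
    void OrdinaryChar(int ch) { SetRange(ch, ch, CharKind::Ordinary); }
    void CommentChar(int ch) { SetRange(ch, ch, CharKind::Comment); }

    // Characters outside the table are reported as Ordinary.
    CharKind KindOf(int ch) const
    {
        if (ch < 0 || ch > 255)
            return CharKind::Ordinary;
        return m_kinds[static_cast<std::size_t>(ch)];
    }

private:
    // Clamp the range to the table and assign the same kind to every entry.
    void SetRange(int cLow, int cHi, CharKind kind)
    {
        if (cLow < 0) cLow = 0;
        if (cHi > 255) cHi = 255;
        for (int c = cLow; c <= cHi; ++c)
            m_kinds[static_cast<std::size_t>(c)] = kind;
    }

    std::array<CharKind, 256> m_kinds;
};
```

With a table like this, later calls simply overwrite earlier ones, so calling ResetSyntax first and then WordChars, WhiteSpaceChars and CommentChar builds a syntax from scratch, while a single OrdinaryChar call can carve one character out of a word range.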

Notes

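The class works on the string you pass to the constructor directly, not on a copy of it. The string must therefore remain valid for as long as the tokenizer is in use, and the FILE_EOF character has to be appended to that same string.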
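
Since the caller's own string is modified to carry the terminator, the append and the later removal have to be kept in step. One possible way to do that is a small scope guard, sketched here in standard C++ with std::string instead of CString. EofGuard and kEofMarker are hypothetical names that are not part of the article's download, and the marker value is an assumption.

```cpp
#include <string>

// Hypothetical stand-in for FILE_EOF; the real value is not given in the article.
const char kEofMarker = '\x1A';

// Appends the marker on construction and strips it again on destruction,
// so the caller's string is restored even on an early return.
class EofGuard
{
public:
    explicit EofGuard(std::string& text) : m_text(text) { m_text += kEofMarker; }

    ~EofGuard()
    {
        // Only remove the character if it is still the marker we added.
        if (!m_text.empty() && m_text.back() == kEofMarker)
            m_text.pop_back();
    }

    // The guard refers to the caller's string, so copying it makes no sense.
    EofGuard(const EofGuard&) = delete;
    EofGuard& operator=(const EofGuard&) = delete;

private:
    std::string& m_text;
};
```

The guard would be declared just before the tokenizer is created, in the same scope, so that the string stays terminated for exactly as long as the tokenizer reads from it.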

If you find any bugs, please send a report to zolyfarkas@mail.usa.com.

That's it!

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.

About the Author

Zoly Farkas

United States

Comments and Discussions

 
Some strings missing _T macro (reader comment, Dec 2001)
In StringTokenizer.cpp, lines 619 and 622 are missing _T macros for Unicode. I noticed because I am working on CE software.
Last Updated 9 Apr 2000
Article Copyright 2000 by Zoly Farkas