Introduction
This article presents a customizable string tokenizer class. You attach it to a CString object and then read tokens from that string one at a time. The tokenizing rules are configurable: you can choose which characters make up words, which count as whitespace, whether numbers are parsed, and so on. If you are familiar with the StreamTokenizer class from Java, you already know how to use this class.
The tokenizer class is called CStringTokenizer. By default it is initialized with standard parsing parameters, but you can reset them and customize the class to work the way you want. The string you pass to the tokenizer has to be terminated with the FILE_EOF character, because the algorithm is written so that it can also work on streams. (If you modify the class, you can adapt it to read from a stream instead of a string.) For example, a simple parse looks like this:
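// Append the terminator the tokenizer expects.
m_sSampleString += FILE_EOF;
CStringTokenizer tokenizer(m_sSampleString);
// Collect every token on its own line until the end of the string is reached.
while (TT_EOF != tokenizer.NextToken())
{
    m_sResultString += tokenizer.GetStrValue() + "\r\n";
}
// Strip the terminator again, because the tokenizer worked on the original string.
m_sSampleString = m_sSampleString.Left(m_sSampleString.GetLength() - 1);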
You need to include: StringTokenizer.h.
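The append-then-trim pattern in the sample above is easy to get wrong if an early return skips the final Left() call. A minimal sketch of a scope guard that handles both steps could look like the following. It uses std::string as a stand-in for CString so that it compiles without MFC. ScopedSentinel and kSentinel are hypothetical names, and the value given to kSentinel is only a placeholder, since the article does not show how FILE_EOF is defined.

```cpp
#include <cassert>
#include <string>

// Placeholder only: the real FILE_EOF comes from StringTokenizer.h and its
// value is not shown in the article.
constexpr char kSentinel = '\x1A';

// Hypothetical helper: appends the terminator on construction and removes it
// again when the guard goes out of scope.
class ScopedSentinel {
public:
    explicit ScopedSentinel(std::string& s) : s_(s) { s_ += kSentinel; }

    ~ScopedSentinel() {
        // The string is expected to still end with the terminator here.
        assert(!s_.empty() && s_.back() == kSentinel);
        s_.pop_back();
    }

    // Copying the guard would remove the terminator twice, so forbid it.
    ScopedSentinel(const ScopedSentinel&) = delete;
    ScopedSentinel& operator=(const ScopedSentinel&) = delete;

private:
    std::string& s_;  // refers to the caller's string, not a copy
};
```

With such a guard, the tokenizer would be created and used inside the guard's scope, and the string returns to its original contents when the scope ends.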
The class's public interface is:
public:
    CStringTokenizer(CString& string);
    virtual ~CStringTokenizer();
    double GetNumValue();
    void PascalComments(BOOL bFlag);
    virtual CString GetStrValue();
    void QuoteChar(int ch);
    int LineNo();
    virtual void PushBack();
    virtual int NextToken();
    void LowerCaseMode(BOOL bFlag);
    void SlSlComments(BOOL bFlag);
    void SlStComments(BOOL bFlag);
    void EolIsSignificant(BOOL bFlag);
    void ParseNumbers();
    void ResetSyntax();
    void WordChars(int cLow, int cHi);
    void WhiteSpaceChars(int cLow, int cHi);
    void OrdinaryChars(int cLow, int cHi);
    void OrdinaryChar(int ch);
    void CommentChar(int ch);
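To make the roles of WordChars, WhiteSpaceChars and OrdinaryChar more concrete, here is a small sketch of the table-driven approach used by Java's StreamTokenizer, on which this class is modeled. It is not the CStringTokenizer source. MiniTokenizer, Next and AllTokens are hypothetical names, std::string replaces CString so the sketch builds without MFC, the sketch keeps its own copy of the text, and numbers, quotes and comments are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for illustration; not the article's CStringTokenizer.
class MiniTokenizer {
public:
    enum CharType : unsigned char { Ordinary = 0, Word = 1, Whitespace = 2 };

    explicit MiniTokenizer(const std::string& text) : text_(text) {
        // Assumed defaults: letters form words, 0..' ' is whitespace,
        // everything else is an ordinary (single-character) token.
        WordChars('a', 'z');
        WordChars('A', 'Z');
        WhiteSpaceChars(0, ' ');
    }

    void ResetSyntax() { for (auto& t : table_) t = Ordinary; }
    void WordChars(int lo, int hi) { SetRange(lo, hi, Word); }
    void WhiteSpaceChars(int lo, int hi) { SetRange(lo, hi, Whitespace); }
    void OrdinaryChar(int ch) { SetRange(ch, ch, Ordinary); }

    // Stores the next token in `out`; returns false at the end of the input.
    bool Next(std::string& out) {
        while (pos_ < text_.size() && Type(text_[pos_]) == Whitespace) ++pos_;
        if (pos_ >= text_.size()) return false;
        const std::size_t start = pos_;
        if (Type(text_[pos_]) == Word) {
            // A word is the longest run of word characters.
            while (pos_ < text_.size() && Type(text_[pos_]) == Word) ++pos_;
        } else {
            ++pos_;  // an ordinary character is a token by itself
        }
        out = text_.substr(start, pos_ - start);
        return true;
    }

private:
    CharType Type(char c) const {
        return table_[static_cast<unsigned char>(c)];
    }
    void SetRange(int lo, int hi, CharType t) {
        assert(0 <= lo && lo <= hi && hi < 256);
        for (int c = lo; c <= hi; ++c) table_[c] = t;
    }

    std::string text_;
    std::size_t pos_ = 0;
    CharType table_[256] = {};  // one entry per character code
};

// Convenience helper: drains a tokenizer into a vector.
std::vector<std::string> AllTokens(MiniTokenizer tok) {
    std::vector<std::string> out;
    std::string t;
    while (tok.Next(t)) out.push_back(t);
    return out;
}
```

With the default table, the input "foo_bar = baz" splits into foo, _, bar, = and baz. After a call to WordChars('_', '_') the same input yields foo_bar, = and baz, which shows how a single range call changes the token boundaries.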
Notes
The class works directly on the string you pass in, not on a copy of it, so the string has to stay valid for as long as the tokenizer is in use.
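The practical difference between working on the caller's string and working on a copy can be sketched as follows. RefView, CopyView and DemoAliasing are hypothetical names used only for this illustration, and std::string again stands in for CString. The sketch assumes the tokenizer keeps a reference to the string, which matches the CString& constructor parameter but is not confirmed by the article.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Holds a reference: later changes to the caller's string are visible.
class RefView {
public:
    explicit RefView(std::string& s) : text_(s) {}
    std::size_t Length() const { return text_.size(); }
private:
    std::string& text_;  // aliases the caller's string
};

// Holds a copy: a private snapshot taken at construction time.
class CopyView {
public:
    explicit CopyView(const std::string& s) : text_(s) {}
    std::size_t Length() const { return text_.size(); }
private:
    std::string text_;
};

// Small demonstration of the aliasing behaviour.
void DemoAliasing() {
    std::string s = "abc";
    RefView byRef(s);
    CopyView byCopy(s);
    s += "def";                    // modify the original afterwards
    assert(byRef.Length() == 6);   // the reference sees the change
    assert(byCopy.Length() == 3);  // the copy does not
}
```

This is also why the sample code can append FILE_EOF before tokenizing and has to remove it afterwards: the change is made to the caller's own string.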
If you find any bugs, please send a report to: zolyfarkas@mail.usa.com.
That's it!