
Simple string parsing in nested loops

14 Dec 2004
Fast string parsing in nested loops.

Introduction

Splitting a string into tokens is a common operation, and the C run-time library function strtok handles the simple cases quickly and with very little code:

#define DELIMITERS    " \r\n\t!@#$%^&*()_+-={}|\\:\"'?¿/.,<>’¡º×÷‘"

char string[] = "A string\tof ,,tokens\nand some  more tokens";
char* token = strtok(string, DELIMITERS);
while(token != NULL)
{    // While there are tokens in "string"
    // ...
    // do something with token
    // ...
    // Get the next token
    token = strtok(NULL, DELIMITERS);
}

Problems

This approach, however, has several problems:

  1. You can't find out which delimiter character ended a token, because strtok overwrites it with '\0' at the end of the token. The input string is modified as a result.
  2. You can't use strtok in nested loops, because it keeps its parsing position in a static variable, as the help note says:

    Note: Each function uses a static variable for parsing the string into tokens. If multiple or simultaneous calls are made to the same function, a high potential for data corruption and inaccurate results exists. Therefore, do not attempt to call the same function simultaneously for different strings, and be aware of calling one of these functions from within a loop where another routine may be called that uses the same function. However, calling this function simultaneously from multiple threads does not have undesirable effects.

  3. You can't split on a multi-character delimiter, that is, a sequence of characters that must appear together and in order (such as "\r\n").
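
The second problem is easy to reproduce. The following is a minimal sketch, not code from the article's download: the function name NestedStrtok and the ';' and ',' delimiters are made up for illustration. It tries to split rows on ';' and columns on ',' with two nested strtok loops.

```cpp
#include <cstring>
#include <string>
#include <vector>

// Hypothetical demo: nested strtok loops over rows (';') and columns (',').
// Returns every column token that the loops actually manage to visit.
std::vector<std::string> NestedStrtok(std::string text)
{
    std::vector<std::string> seen;
    char* row = std::strtok(&text[0], ";");
    while (row != NULL)
    {
        // This call replaces the position strtok saved for the outer loop.
        char* col = std::strtok(row, ",");
        while (col != NULL)
        {
            seen.push_back(col);
            col = std::strtok(NULL, ",");
        }
        // Resumes from the inner scan's saved position, which is already
        // exhausted, so the remaining rows are never returned.
        row = std::strtok(NULL, ";");
    }
    return seen;
}
```

Called with "a,b;c,d", the function returns only "a" and "b". The inner loop runs to the end of the first row, and the outer strtok(NULL, ";") then has nothing left to resume from, so the second row is silently skipped.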

Solution

So I built the class CStrTok to solve all of these problems, especially the second one. Instead of living in a static variable, the parsing state is encapsulated in a class, like this:

class CStrTok
{
public:
    CStrTok();
    ~CStrTok();
public:
    LPSTR m_lpszNext;
    char m_chDelimiter;
    // ... some attributes
public:
    LPSTR GetFirst(LPSTR lpsz, LPCSTR lpcszDelimiters);
    LPSTR GetNext(LPCSTR lpcszDelimiters);
    // ... some functions
};

The member m_lpszNext holds the position of the next token to be parsed, and m_chDelimiter holds the delimiter character that ended the current token, so that it can be restored on the next call to GetNext. Because each CStrTok object carries its own state, the class can be used in nested loops without any problems, one instance per loop level:
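
To make the idea concrete, here is a minimal sketch of a tokenizer that keeps its state in members. It is not the CStrTok source from the download: the names SimpleTok, Delimiter and NestedSplit are hypothetical, plain char* is used instead of LPSTR so that it compiles without the Windows headers, and putting the saved delimiter back at the start of the next call is an assumption about how such a class could behave.

```cpp
#include <cstring>
#include <string>
#include <vector>

// Hypothetical illustration of a per-object tokenizer (not the real CStrTok).
class SimpleTok
{
public:
    SimpleTok() : m_next(NULL), m_delimiter('\0') {}

    // Starts tokenizing "text"; the buffer is modified while scanning.
    char* GetFirst(char* text, const char* delimiters)
    {
        m_next = text;
        m_delimiter = '\0';
        return GetNext(delimiters);
    }

    // Returns the next token, or NULL when nothing is left.
    char* GetNext(const char* delimiters)
    {
        if (m_next == NULL)
            return NULL;
        // Assumption: put back the delimiter overwritten by the previous call.
        if (m_delimiter != '\0')
            *(m_next - 1) = m_delimiter;
        // Skip leading delimiters.
        char* token = m_next + std::strspn(m_next, delimiters);
        if (*token == '\0')
        {
            m_next = NULL;
            m_delimiter = '\0';
            return NULL;
        }
        // Find the end of the token and remember what ended it.
        char* end = token + std::strcspn(token, delimiters);
        m_delimiter = *end;              // '\0' if the token reaches the end
        if (*end != '\0')
        {
            *end = '\0';                 // terminate the token in place
            m_next = end + 1;
        }
        else
        {
            m_next = NULL;               // this was the last token
        }
        return token;
    }

    // Delimiter that ended the most recently returned token.
    char Delimiter() const { return m_delimiter; }

private:
    char* m_next;        // where the next scan starts
    char  m_delimiter;   // delimiter that ended the current token
};

// Nested use: one object per loop level, rows on ';' and columns on ','.
std::vector<std::string> NestedSplit(char* text)
{
    std::vector<std::string> tokens;
    SimpleTok rows, cols;
    for (char* row = rows.GetFirst(text, ";"); row != NULL; row = rows.GetNext(";"))
        for (char* col = cols.GetFirst(row, ","); col != NULL; col = cols.GetNext(","))
            tokens.push_back(col);
    return tokens;
}
```

With a writable buffer holding "a,b;c,d", NestedSplit visits "a", "b", "c" and "d" in order, because the two objects never touch each other's position. In this sketch the buffer also ends up unchanged once both loops have run to completion, since every overwritten delimiter is put back.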

CStrTok Usage

// code to parse tab delimited text files
CStrTok StrTok[3];
StrTok[0].m_bDelimitersInSequence = true; // for "\r\n"
// parse file buffer for rows and columns
char* pRow = StrTok[0].GetFirst(pFileBuffer, "\r\n");
while(pRow)
{
    // parse the row
    char* pCol = StrTok[1].GetFirst(pRow, "\t");
    while(pCol)
    {
        // parse the col
        char* pToken = StrTok[2].GetFirst(pCol, " ,;");
        while(pToken)
        {
            // ... using pToken
            pToken = StrTok[2].GetNext(" ,;");
        }
        // get next column
        pCol = StrTok[1].GetNext("\t");
    }
    // get next row
    pRow = StrTok[0].GetNext("\r\n");
}
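
The m_bDelimitersInSequence flag above makes "\r\n" count as a single delimiter rather than as a set of two characters. How CStrTok implements this is not shown in the article, so the following is only a sketch of the general technique, using strstr to search for the whole sequence. The function name SplitOnSequence is hypothetical, and unlike CStrTok it copies the tokens instead of splitting the buffer in place.

```cpp
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper: splits "text" wherever the whole "sequence" occurs.
// Empty tokens (two sequences back to back) are skipped, as strtok would do.
std::vector<std::string> SplitOnSequence(const std::string& text, const char* sequence)
{
    std::vector<std::string> tokens;
    const std::size_t length = std::strlen(sequence);
    if (length == 0)
    {
        // An empty sequence never matches; return the text as one token.
        if (!text.empty())
            tokens.push_back(text);
        return tokens;
    }
    const char* start = text.c_str();
    while (*start != '\0')
    {
        const char* hit = std::strstr(start, sequence);
        if (hit == NULL)
        {
            tokens.push_back(start);                     // last token
            break;
        }
        if (hit != start)
            tokens.push_back(std::string(start, hit));   // text before the match
        start = hit + length;                            // continue after the match
    }
    return tokens;
}
```

Splitting "a\rb\r\nc\n\r\nd" on "\r\n" gives three tokens: "a\rb", "c\n" and "d". The lone '\r' and the lone '\n' stay inside their tokens, which is the difference from passing "\r\n" to strtok as a set of delimiter characters.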

I think you will find it easy to use.

Source code files

StrTok.cpp, StrTok.h

Thanks to...

I owe a lot to my colleagues for helping me in implementing and testing this code. (JAK)

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL).


Written By
Software Developer (Senior)
Egypt
