
Simple string parsing in nested loops

Hatem Mostafa, 14 Dec 2004
Fast string parsing in nested loops.

Introduction

Parsing strings is a simple operation and can be done with the C function strtok, which is part of the C run-time library. It finds the tokens in a string quickly and simply, like this:

#define DELIMITERS    " \r\n\t!@#$%^&*()_+-={}|\\:\"'?¿/.,<>’¡º×÷‘"

char string[] = "A string\tof ,,tokens\nand some  more tokens";
char* token = strtok(string, DELIMITERS);
while(token != NULL)
{    // While there are tokens in "string"
    // ...
    // do something with the token
    // ...
    // Get the next token
    token = strtok(NULL, DELIMITERS);
}

Problems

This function, however, has several problems:

  1. You can't find out which delimiter character ended a token, because strtok overwrites that delimiter with '\0' at the end of the token. This also means the input string is modified.
  2. You can't use this function in nested loops, because strtok uses a static variable to hold its position between calls, as the help note says:

    Note: Each function uses a static variable for parsing the string into tokens. If multiple or simultaneous calls are made to the same function, a high potential for data corruption and inaccurate results exists. Therefore, do not attempt to call the same function simultaneously for different strings, and be aware of calling one of these functions from within a loop where another routine may be called that uses the same function. However, calling this function simultaneously from multiple threads does not have undesirable effects.

  3. You can't split on a sequence of delimiters, that is, a delimiter made of several characters that must appear together and in order (such as "\r\n").

Solution

So I built the class CStrTok to solve all of these problems, especially the second one, the use of a static variable: the state is simply encapsulated in a class, like this:

class CStrTok
{
public:
    CStrTok();
    ~CStrTok();
public:
    LPSTR m_lpszNext;
    char m_chDelimiter;
    // ... some attributes
public:
    LPSTR GetFirst(LPSTR lpsz, LPCSTR lpcszDelimiters);
    LPSTR GetNext(LPCSTR lpcszDelimiters);
    // ... some functions
};

The variable m_lpszNext holds the position of the next token to be parsed, and the variable m_chDelimiter holds the delimiter that ended the current token, to be returned after the next call of GetNext. Because this state belongs to each object rather than to a static variable, the class can be used in nested loops without any problems, as the CStrTok Usage section shows.

CStrTok Usage

// code to parse tab delimited text files
CStrTok StrTok[3];
StrTok[0].m_bDelimitersInSequence = true; // for "\r\n"
// parse file buffer for rows and columns
char* pRow = StrTok[0].GetFirst(pFileBuffer, "\r\n");
while(pRow)
{
    // parse the row
    char* pCol = StrTok[1].GetFirst(pRow, "\t");
    while(pCol)
    {
        // parse the col
        char* pToken = StrTok[2].GetFirst(pCol, " ,;");
        while(pToken)
        {
            // ... using pToken
            pToken = StrTok[2].GetNext(" ,;");
        }
        // get next column
        pCol = StrTok[1].GetNext("\t");
    }
    // get next row
    pRow = StrTok[0].GetNext("\r\n");
}

I think you will find it very easy to use.

Source code files

StrTok.cpp, StrTok.h

Thanks to...

I owe a lot to my colleagues for helping me in implementing and testing this code. (JAK)

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.

Article Copyright 2004 by Hatem Mostafa