
Find a Word in a String

14 Mar 2014 | CPOL
Finding a substring is trivial, but is it a word?

Introduction

The C standard library has no function that finds the first occurrence of a whole word in a string. strstr() matches any substring, so it can return a hit in the middle of a longer word, which becomes a code defect when the caller expects a complete word.
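
As a minimal illustration of the defect, the sketch below shows where strstr() lands when a longer word precedes the word being searched for. The helper name FirstMatchOffset and the sample string are assumptions made for this example; the tag names come from the author's reply in the comments.

```cpp
#include <cstring>

// Hypothetical helper for this illustration only: returns the offset of the
// first strstr() match in pcSearched, or -1 when there is no match.
static long FirstMatchOffset(const char *pcSearched, const char *pcToFind)
{
    const char *pc = std::strstr(pcSearched, pcToFind);
    return pc ? static_cast<long>(pc - pcSearched) : -1L;
}

// "FieldNameAbs" precedes "FieldName", so the match lands inside the longer tag:
//   FirstMatchOffset("FieldNameAbs=1;FieldName=2", "FieldName") returns 0,
//   although the whole word "FieldName" starts at offset 15.
```

A caller that then parses the value after the match reads the value of the wrong tag.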

Using the Code

Link with shlwapi.lib to get StrStrI():

In header

C++
#define IN 
#define OPTIONAL
#include <windows.h>
#include <shlwapi.h>

char * StrStrWord 
    ( IN char *pcSearched
    , IN const char *pcWordToFind
    , IN const size_t nSearchedSize
    , OPTIONAL IN int bUseCase = 1
    , OPTIONAL IN char *pcWordChars = NULL);   

In C++ module

C++
/*=======*=========*=========*=========*=========*=========*=========*=========*
* FUNCTION: StrStrWord
* -----------------------------------------------------------------------------
* \brief
*    Point to the word in the search string
*
* PARAMETERS:
*    pcSearched: IN. The string to search (must be NUL-terminated)
*    pcWordToFind: IN. The word to find
*    nSearchedSize: IN. The size in bytes of the searched string's buffer
*    bUseCase: IN. Nonzero for a case-sensitive search (default), zero for case-insensitive
*    pcWordChars: IN. A list of valid chars in a word, or NULL for [A-Za-z0-9_]
*
* RETURN VALUE: Pointer to location of word in the search string
*    or NULL if string not found
*--------+---------+---------+---------+---------+---------+---------+--------*/
char * StrStrWord
    (IN char *pcSearched
    , IN const char *pcWordToFind
    , IN const size_t nSearchedSize
    , OPTIONAL IN int bUseCase // = 1
    , OPTIONAL IN char *pcWordChars // = NULL
) {
    static const char acWord[] = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789_";
    if (!pcSearched || !pcWordToFind) 
        return NULL;
    size_t nLenSearchStr = strnlen(pcSearched, nSearchedSize);
    size_t nLenWordStr = strnlen(pcWordToFind, nSearchedSize);
    // An empty word, or a word longer than the searched text, cannot match.
    if (nLenWordStr == 0 || nLenSearchStr < nLenWordStr)
        return NULL;
    const char *pcCharsInWord;
    if (pcWordChars == NULL)
        pcCharsInWord = acWord;
    else
        pcCharsInWord = pcWordChars;
    if (bUseCase) {
        for (char *pc = strstr(pcSearched, pcWordToFind)
            ; pc
            ; pc = strstr(pc + 1, pcWordToFind))
        {
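            // strchr() also matches the terminating '\0', so treat end of string as a separator.
            const char chEnd = pc[nLenWordStr];
            const char *pcSepEnd = chEnd ? strchr(pcCharsInWord, chEnd) : NULL;
            if (!pcSepEnd) {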
                if (pc == pcSearched)
                    return pc;
                const char *pcSepStart = strchr(pcCharsInWord, pc[-1]);
                if (!pcSepStart)
                    return pc;
            }
        }
    }
    else {
        for (char *pc = StrStrI(pcSearched, pcWordToFind)
            ; pc
            ; pc = StrStrI(pc + 1, pcWordToFind))
        {
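            // strchr() also matches the terminating '\0', so treat end of string as a separator.
            const char chEnd = pc[nLenWordStr];
            const char *pcSepEnd = chEnd ? strchr(pcCharsInWord, chEnd) : NULL;
            if (!pcSepEnd) {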
                if (pc == pcSearched)
                    return pc;
                const char *pcSepStart = strchr(pcCharsInWord, pc[-1]);
                if (!pcSepStart)
                    return pc;
            }
        }
    }
    return NULL;
} // StrStrWord()  
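
The boundary check can also be sketched without shlwapi. The version below is a hypothetical, portable variant and not the article's StrStrWord(): the names FindWord and IsWordChar are assumptions, it is case-sensitive only, it always uses the default word-character set, and it relies on the searched string being NUL-terminated.

```cpp
#include <cstring>

// Hypothetical helper: true when ch is one of the characters allowed in a word.
static bool IsWordChar(char ch, const char *pcWordChars)
{
    // strchr() would also match the terminator, so reject '\0' explicitly.
    return ch != '\0' && std::strchr(pcWordChars, ch) != NULL;
}

// Hypothetical portable sketch: returns a pointer to the first whole-word
// occurrence of pcWordToFind in pcSearched, or NULL if there is none.
static const char *FindWord(const char *pcSearched, const char *pcWordToFind)
{
    static const char acWord[] =
        "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789_";
    if (!pcSearched || !pcWordToFind || !*pcWordToFind)
        return NULL;
    const std::size_t nLen = std::strlen(pcWordToFind);
    for (const char *pc = std::strstr(pcSearched, pcWordToFind); pc;
         pc = std::strstr(pc + 1, pcWordToFind))
    {
        // A match counts only if both neighbours are non-word characters
        // (or the match touches the start or end of the string).
        const bool bStartOk = (pc == pcSearched) || !IsWordChar(pc[-1], acWord);
        const bool bEndOk = !IsWordChar(pc[nLen], acWord);
        if (bStartOk && bEndOk)
            return pc;
    }
    return NULL;
}
```

With the string "FieldNameAbs=1;FieldName=2", searching for "FieldName" skips the hit inside "FieldNameAbs" and returns the occurrence that starts at offset 15.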

History

  • 14th March, 2014: Initial version

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Software Developer (Senior), Independent Software Developer
United States

Comments and Discussions

General: My vote of 5 (Volynsky Alex, 23-Mar-14 10:07)
General: My vote of 4 (Volynsky Alex, 23-Mar-14 10:07)
Question: It is a bug? (maomx, 16-Mar-14 15:19)
Answer: Re: It is a bug? (mef526, 16-Mar-14 22:27)
Question: [My vote of 1] Useless (.:floyd:., 15-Mar-14 11:53)
Answer: Re: [My vote of 1] Useless (mef526, 16-Mar-14 0:05)
Thank you for your criticism. My original code was using strstr() and a bug was revealed when the user added a tag to parse that started with the same characters as another tag.

For example, the tag I was searching for was "FieldName", but a new tag, "FieldNameAbs", was added that preceded it in the string. So strstr() found the wrong tag and the analysis was wrong.

I used this code to solve the problem but forgot to check the beginning of the word to verify strstr() didn't match in the middle of a word. I updated the sample to reflect the improvement you spotted.

The reason I didn't use a regex is that this is very old code and I wanted the solution to be as lightweight as possible.

The program uses ASCII characters, so I had no reason to support multi-byte characters.
