String Tokenizer Iterator Class

Daniel Andersson, 26 Jun 2002. Public Domain
A string tokenizer iterator class that works with std::string

Introduction

As a part of a larger project I had to write some basic string utility functions and classes. One of the things needed was a flexible way of splitting strings into separate tokens.

As is often the case in programming, there are several ways to handle a problem like this. After reviewing my options I decided that an iterator-based solution would be flexible enough for my needs.

Solutions that are not iterator-based often have the disadvantage of tying the user to a certain container type. With an iterator-based tokenizer the programmer is free to choose any type of container (or no container at all). Many STL containers, such as std::list and std::vector, offer constructors that populate the container from a pair of iterators. This feature makes the tokenizer very easy to use.

Example usage

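// Requires <vector>, <string>, <iostream>, <algorithm> and <iterator>,
// plus the string_token_iterator class listed under "The Code".
std::vector<std::string> s(string_token_iterator("one two three"),
                             string_token_iterator());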
std::copy(s.begin(),
          s.end(),
          std::ostream_iterator<std::string>(std::cout,"\n"));
// output:
// one
// two
// three

std::copy(string_token_iterator("one,two..,..three",",."),
          string_token_iterator(),
          std::ostream_iterator<std::string>(std::cout,"\n"));
// same output as above
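
Because the tokenizer is an ordinary input iterator, the same iterator pair can feed other containers or standard algorithms. The following is a minimal sketch rather than part of the original article: tokens_as_list and count_tokens are hypothetical helper names, and the class is a condensed copy of the one listed under "The Code". As an assumption made so that the sketch compiles on its own, the iterator typedefs are written out by hand instead of inheriting from std::iterator (deprecated since C++17), and the default constructor also sets separator to a null pointer.

```cpp
#include <cstddef>
#include <iterator>
#include <list>
#include <string>

// Condensed copy of the article's tokenizer so that this sketch is self-contained.
class string_token_iterator
{
public:
  // Written out by hand (assumption) instead of inheriting from std::iterator.
  typedef std::input_iterator_tag iterator_category;
  typedef std::string value_type;
  typedef std::ptrdiff_t difference_type;
  typedef const std::string * pointer;
  typedef std::string reference;

  string_token_iterator() : separator(0), str(0), start(0), end(0) {}
  string_token_iterator(const std::string & str_, const char * separator_ = " ")
    : separator(separator_), str(&str_), start(0), end(0)
  {
    find_next();
  }

  string_token_iterator & operator++() { find_next(); return *this; }
  string_token_iterator operator++(int)
  {
    string_token_iterator temp(*this);
    ++(*this);
    return temp;
  }
  std::string operator*() const { return std::string(*str, start, end - start); }
  bool operator==(const string_token_iterator & rhs) const
  {
    return rhs.str == str && rhs.start == start && rhs.end == end;
  }
  bool operator!=(const string_token_iterator & rhs) const { return !(rhs == *this); }

private:
  void find_next()
  {
    start = str->find_first_not_of(separator, end);
    if (start == std::string::npos)
    {
      start = end = 0;
      str = 0;
      return;
    }
    end = str->find_first_of(separator, start);
  }

  const char * separator;
  const std::string * str;
  std::string::size_type start;
  std::string::size_type end;
};

// Hypothetical helper: collect the tokens into a std::list instead of a std::vector.
std::list<std::string> tokens_as_list(const std::string & text, const char * separators)
{
  return std::list<std::string>(string_token_iterator(text, separators),
                                string_token_iterator());
}

// Hypothetical helper: count the tokens without storing them in any container.
std::size_t count_tokens(const std::string & text, const char * separators)
{
  return static_cast<std::size_t>(
      std::distance(string_token_iterator(text, separators), string_token_iterator()));
}
```

Both helpers take the text by const reference and finish iterating before they return, so the string is still alive while it is scanned. The iterator only stores a pointer to the string, so it should not be kept and used after a temporary string it was built from has been destroyed.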

The code has been tested with Visual C++.NET and GCC 3.

The Code

#include <string>
#include <iterator>

struct string_token_iterator 
  : public std::iterator<std::input_iterator_tag, std::string>
{
public:
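  // End-of-sequence iterator: a null str marks "no more tokens".
  // separator is set to null too, so copying this iterator never reads an uninitialized pointer.
  string_token_iterator() : separator(0), str(0), start(0), end(0) {}

  // Stores a pointer to str_, so the string must outlive the iterator and all of its copies.
  string_token_iterator(const std::string & str_, const char * separator_ = " ") :
    separator(separator_),
    str(&str_),
    start(0),
    end(0)
  {
    find_next();
  }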
  string_token_iterator(const string_token_iterator & rhs) :
    separator(rhs.separator),
    str(rhs.str),
    start(rhs.start),
    end(rhs.end)
  {
  }

  string_token_iterator & operator++()
  {
    find_next();
    return *this;
  }

  string_token_iterator operator++(int)
  {
    string_token_iterator temp(*this);
    ++(*this);
    return temp;
  }

  std::string operator*() const
  {
    return std::string(*str, start, end - start);
  }

  bool operator==(const string_token_iterator & rhs) const
  {
    return (rhs.str == str && rhs.start == start && rhs.end == end);
  }

  bool operator!=(const string_token_iterator & rhs) const
  {
    return !(rhs == *this);
  }

private:

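  // Advance to the next token; when none is left, reset to the end-of-sequence state.
  void find_next()
  {
    // Skip any separators that follow the previous token.
    start = str->find_first_not_of(separator, end);
    if(start == std::string::npos)
    {
      // No token left: become equal to a default-constructed iterator.
      start = end = 0;
      str = 0;
      return;
    }

    // The token ends at the next separator, or at npos if it is the last one.
    end = str->find_first_of(separator, start);
  }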

  const char * separator;
  const std::string * str;
  std::string::size_type start;
  std::string::size_type end;
};
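
The scanning logic in find_next() can also be read in isolation. The sketch below is an illustration only, not part of the original class: split_tokens is a hypothetical free function that runs the same two-step scan (skip separators, then find the end of the token) in a plain loop and collects the results in a std::vector.

```cpp
#include <string>
#include <vector>

// Hypothetical free function that mirrors the two-step scan used by find_next().
std::vector<std::string> split_tokens(const std::string & text,
                                      const char * separators = " ")
{
  std::vector<std::string> tokens;
  std::string::size_type end = 0;
  for (;;)
  {
    // Step 1: skip separators to find where the next token starts.
    std::string::size_type start = text.find_first_not_of(separators, end);
    if (start == std::string::npos)
      break;  // only separators (or nothing) left

    // Step 2: the token runs to the next separator, or to npos for the last token.
    end = text.find_first_of(separators, start);

    // substr clamps the length, so end == npos means "the rest of the string".
    tokens.push_back(text.substr(start, end - start));
  }
  return tokens;
}
```

Note that a run of consecutive separators is skipped as a whole, so this scan never produces empty tokens: "a,,b" split on "," gives two tokens, not three.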

License

This article, along with any associated source code and files, is licensed under a Public Domain dedication.


About the Author

Daniel Andersson
Web Developer
Sweden


Comments and Discussions

 
Re: Problems with vector (SaurweinAndreas, 15-Sep-03 18:16)
This seems to be a problem not with VC6 itself but with the STL implementation that ships with it. I'm using STLport, and it works nicely with the vector constructor.

Just in case you didn't notice: VC6's STL sucks.


Last Updated 27 Jun 2002
Article Copyright 2002 by Daniel Andersson