String Tokenizer Iterator Class

26 Jun 2002, Public Domain
A string tokenizer iterator class that works with std::string

Introduction

As a part of a larger project I had to write some basic string utility functions and classes. One of the things needed was a flexible way of splitting strings into separate tokens.

As is often the case in programming, there are several ways to handle a problem like this. After reviewing my options I decided that an iterator-based solution would be flexible enough for my needs.

Non-iterator-based solutions to this particular problem often have the disadvantage of tying the user to a certain container type. With an iterator-based tokenizer the programmer is free to choose any type of container (or no container at all). Many STL containers such as std::list and std::vector offer constructors that can populate the container from a pair of iterators. This feature makes it very easy to use the tokenizer.

Example usage

std::vector<std::string> s(string_token_iterator("one two three"),
                             string_token_iterator());
std::copy(s.begin(),
          s.end(),
          std::ostream_iterator<std::string>(std::cout,"\n"));
// output:
// one
// two
// three

std::copy(string_token_iterator("one,two..,..three",",."),
          string_token_iterator(),
          std::ostream_iterator<std::string>(std::cout,"\n"));
// same output as above

The code has been tested with Visual C++.NET and GCC 3.
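
One caveat applies when the iterator is given a name instead of being created inside a single statement: the constructor keeps a pointer to the std::string it receives, so that string has to outlive the iterator. Passing a string literal directly creates a temporary std::string that is destroyed at the end of the statement. That is fine for the one-statement examples above, but it leaves a named iterator pointing at a dead object. The following is a minimal sketch, not part of the original article. It restates the class in condensed form so that it compiles on its own (member typedefs stand in for the std::iterator base, which is deprecated since C++17), and it adds two hypothetical helpers, count_tokens and unique_tokens, that iterate over a string owned by the caller.

```cpp
#include <cstddef>
#include <iterator>
#include <set>
#include <string>

// Condensed restatement of the article's class so this sketch is
// self-contained. The typedefs replace the std::iterator base class.
class string_token_iterator
{
public:
    typedef std::input_iterator_tag iterator_category;
    typedef std::string             value_type;
    typedef std::ptrdiff_t          difference_type;
    typedef const std::string*      pointer;
    typedef std::string             reference;

    // End-of-sequence iterator.
    string_token_iterator() : separator(0), str(0), start(0), end(0) {}

    // Stores a pointer to s; s must stay alive while the iterator is used.
    string_token_iterator(const std::string& s, const char* sep = " ")
        : separator(sep), str(&s), start(0), end(0) { find_next(); }

    string_token_iterator& operator++() { find_next(); return *this; }

    std::string operator*() const
    { return std::string(*str, start, end - start); }

    bool operator==(const string_token_iterator& rhs) const
    { return rhs.str == str && rhs.start == start && rhs.end == end; }

    bool operator!=(const string_token_iterator& rhs) const
    { return !(rhs == *this); }

private:
    void find_next()
    {
        start = str->find_first_not_of(separator, end);
        if (start == std::string::npos) { start = end = 0; str = 0; return; }
        end = str->find_first_of(separator, start);
    }

    const char*            separator;
    const std::string*     str;
    std::string::size_type start;
    std::string::size_type end;
};

// Hypothetical helper: walks the tokens by hand, with no container involved.
// `text` is owned by the caller, so it outlives both iterators in the loop.
std::size_t count_tokens(const std::string& text, const char* sep = " ")
{
    std::size_t n = 0;
    for (string_token_iterator it(text, sep), last; it != last; ++it)
        ++n;
    return n;
}

// Hypothetical helper: fills a std::set directly from the iterator pair,
// which also removes duplicate tokens.
std::set<std::string> unique_tokens(const std::string& text,
                                    const char* sep = " ")
{
    return std::set<std::string>(string_token_iterator(text, sep),
                                 string_token_iterator());
}
```

Both helpers take the text by const reference, so the string they tokenize stays alive for the whole traversal. The same rule applies to hand-written loops: keep the text in a named std::string and construct the iterator from that variable.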

The Code

#include <string>
#include <iterator>

struct string_token_iterator 
  : public std::iterator<std::input_iterator_tag, std::string>
{
public:
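  // Default constructor: creates the end-of-sequence iterator.
  // separator is initialized too, so copying this iterator never reads
  // an indeterminate pointer.
  string_token_iterator() : separator(0), str(0), start(0), end(0) {}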
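  // Tokenizes str_, treating every character in separator_ as a delimiter.
  // Only a pointer to str_ is stored, so str_ must outlive the iterator.
  string_token_iterator(const std::string & str_, const char * separator_ = " ") :
    separator(separator_),
    str(&str_),
    start(0),
    end(0)
  {
    find_next();
  }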
  string_token_iterator(const string_token_iterator & rhs) :
    separator(rhs.separator),
    str(rhs.str),
    start(rhs.start),
    end(rhs.end)
  {
  }

  string_token_iterator & operator++()
  {
    find_next();
    return *this;
  }

  string_token_iterator operator++(int)
  {
    string_token_iterator temp(*this);
    ++(*this);
    return temp;
  }

  std::string operator*() const
  {
    return std::string(*str, start, end - start);
  }

  bool operator==(const string_token_iterator & rhs) const
  {
    return (rhs.str == str && rhs.start == start && rhs.end == end);
  }

  bool operator!=(const string_token_iterator & rhs) const
  {
    return !(rhs == *this);
  }

private:

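  // Moves start/end to the next token. When no token is left, the iterator
  // is reset to the same state as a default-constructed (end) iterator.
  void find_next(void)
  {
    // Skip any run of separator characters.
    start = str->find_first_not_of(separator, end);
    if(start == std::string::npos)
    {
      start = end = 0;
      str = 0;
      return;
    }

    // The token ends at the next separator (npos if it is the last token).
    end = str->find_first_of(separator, start);
  }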

  const char * separator;
  const std::string * str;
  std::string::size_type start;
  std::string::size_type end;
};
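
To see the scanning step in isolation, here is a minimal sketch that runs the same two calls as find_next() in a plain loop. The function split_tokens is a hypothetical helper written for illustration, not part of the class above, and it returns a std::vector only to keep the example short.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper (not part of the article's class): collects every
// token of `text` with the same two-step scan that find_next() performs.
std::vector<std::string> split_tokens(const std::string& text,
                                      const char* separators = " ")
{
    assert(separators != 0);  // precondition assumed by this sketch

    std::vector<std::string> tokens;
    std::string::size_type end = 0;
    for (;;) {
        // Step 1: skip any run of separator characters.
        const std::string::size_type start =
            text.find_first_not_of(separators, end);
        if (start == std::string::npos)
            break;  // nothing but separators left: no more tokens

        // Step 2: the token runs up to the next separator. For the last
        // token find_first_of returns npos and substr clamps the length.
        end = text.find_first_of(separators, start);
        tokens.push_back(text.substr(start, end - start));
    }
    return tokens;
}
```

Because step 1 skips whole runs of separators, leading, trailing, and repeated delimiters never produce empty tokens. This is why "one,two..,..three" with the separators ",." yields the same three tokens as "one two three".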

License

This article, along with any associated source code and files, is licensed under a Public Domain dedication.


Written By
Web Developer
Sweden

Comments and Discussions

 
Question: A little trouble with the basics?
mtwombley, 4-Jul-04 17:49
Hi I'm just getting back into C++, and my template knowledge is not very good.

I can't figure out why this doesn't work

    std::string test;

    // >> is just to mark the code in question
>>  string_token_iterator myIt("one two three");

    test = *myIt;


If I trace the code specifically at '>>' myIt doesn't contain a str with "one two three".
It looks to me as if the copy constructor didn't copy.

The thing about this that causes me extra confusion is that this does work

    std::string test;
    test = *(string_token_iterator("one two three"));


If I can't get the first method to work I can't see how to use the ++ operator.

Thanks for your help.

Mark Twombley
I never said I knew it all (and not much at all, I might add).