First Posted 26 Jun 2002

String Tokenizer Iterator Class

By Daniel Andersson | 26 Jun 2002 | Article
A string tokenizer iterator class that works with std::string

Introduction

As a part of a larger project I had to write some basic string utility functions and classes. One of the things needed was a flexible way of splitting strings into separate tokens.

As is often the case in programming, there are different ways to handle a problem like this. After reviewing my options I decided that an iterator-based solution would be flexible enough for my needs.

Non-iterator-based solutions to this particular problem often have the disadvantage of tying the user to a certain container type. With an iterator-based tokenizer the programmer is free to choose any type of container (or no container at all). Many STL containers such as std::list and std::vector offer constructors that can populate the container from a pair of iterators. This feature makes it very easy to use the tokenizer.
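To make the contrast concrete, here is a minimal sketch of the kind of non-iterator-based splitter the paragraph above describes. The name `split_to_vector` is hypothetical and not part of the article's code; the sketch only assumes the usual `find_first_not_of` / `find_first_of` scanning approach.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not from the article): the return type hard-codes
// std::vector<std::string>, so every caller gets a vector whether or not
// that is the container they actually want.
std::vector<std::string> split_to_vector(const std::string & text,
                                         const char * separators = " ")
{
  std::vector<std::string> tokens;

  // Skip leading separators to find the first token.
  std::string::size_type start = text.find_first_not_of(separators);
  while (start != std::string::npos)
  {
    // The token ends at the next separator, or at npos for the last token.
    std::string::size_type end = text.find_first_of(separators, start);

    // substr clamps the count, so end == npos means "up to the end".
    tokens.push_back(text.substr(start, end - start));

    // Searching from npos yields npos, which terminates the loop.
    start = text.find_first_not_of(separators, end);
  }
  return tokens;
}
```

A caller who wants the tokens in a std::list or std::set, or who only wants to count them, has to go through the intermediate vector first. That extra step is what the iterator-based design avoids.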

Example usage

    
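// Requires <algorithm>, <iostream>, <iterator>, <string> and <vector>,
// plus the string_token_iterator class listed under "The Code".
std::vector<std::string> s(string_token_iterator("one two three"),
                             string_token_iterator());
std::copy(s.begin(),
          s.end(),
          std::ostream_iterator<std::string>(std::cout,"\n"));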
// output:
// one
// two
// three

std::copy(string_token_iterator("one,two..,..three",",."),
          string_token_iterator(),
          std::ostream_iterator<std::string>(std::cout,"\n"));
// same output as above

The code has been tested with Visual C++ .NET and GCC 3.

The Code

#include <string>
#include <iterator>

struct string_token_iterator 
  : public std::iterator<std::input_iterator_tag, std::string>
{
public:
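  // End iterator. separator is initialized too, so that copying an end
  // iterator never reads an uninitialized pointer.
  string_token_iterator() : separator(0), str(0), start(0), end(0) {}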
  // Stores a pointer to str_ (no copy), so the string must outlive the iterator.
  string_token_iterator(const std::string & str_, const char * separator_ = " ") :
    separator(separator_),
    str(&str_),
    start(0),
    end(0)
  {
    find_next();
  }
  string_token_iterator(const string_token_iterator & rhs) :
    separator(rhs.separator),
    str(rhs.str),
    start(rhs.start),
    end(rhs.end)
  {
  }

  string_token_iterator & operator++()
  {
    find_next();
    return *this;
  }

  string_token_iterator operator++(int)
  {
    string_token_iterator temp(*this);
    ++(*this);
    return temp;
  }

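  // Returns a copy of the current token. When end is npos (last token),
  // the std::string constructor clamps the count to the end of the string.
  std::string operator*() const
  {
    return std::string(*str, start, end - start);
  }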

  bool operator==(const string_token_iterator & rhs) const
  {
    return (rhs.str == str && rhs.start == start && rhs.end == end);
  }

  bool operator!=(const string_token_iterator & rhs) const
  {
    return !(rhs == *this);
  }

private:

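  // Advance to the next token, or become the end iterator if none is left.
  void find_next(void)
  {
    // Skip any separators that precede the next token.
    start = str->find_first_not_of(separator, end);
    if(start == std::string::npos)
    {
      // Nothing left: reset the state so that this iterator compares
      // equal to a default-constructed one.
      start = end = 0;
      str = 0;
      return;
    }

    // The token runs up to the next separator (npos for the last token).
    end = str->find_first_of(separator, start);
  }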

  const char * separator;
  const std::string * str;
  std::string::size_type start;
  std::string::size_type end;
};
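One thing to be aware of on newer compilers: `std::iterator` has been deprecated since C++17. The following is a minimal sketch of the same idea with the iterator traits declared as member typedefs instead. The class name `token_iterator` and the parameter name `text` are hypothetical; the scanning logic is meant to mirror the class above, not replace it.

```cpp
#include <cstddef>
#include <iterator>
#include <string>

// Hypothetical variant (not the article's class): the five iterator traits
// are spelled out as member typedefs rather than inherited from std::iterator.
class token_iterator
{
public:
  typedef std::input_iterator_tag iterator_category;
  typedef std::string             value_type;
  typedef std::ptrdiff_t          difference_type;
  typedef const std::string *     pointer;
  typedef std::string             reference; // operator* returns by value

  // End iterator.
  token_iterator() : separator(0), str(0), start(0), end(0) {}

  // Keeps a pointer to text (no copy), so text must outlive the iterator.
  explicit token_iterator(const std::string & text,
                          const char * separator_ = " ")
    : separator(separator_), str(&text), start(0), end(0)
  {
    find_next();
  }

  token_iterator & operator++() { find_next(); return *this; }

  token_iterator operator++(int)
  {
    token_iterator temp(*this);
    find_next();
    return temp;
  }

  std::string operator*() const
  {
    // The count is clamped by std::string when end is npos.
    return std::string(*str, start, end - start);
  }

  bool operator==(const token_iterator & rhs) const
  {
    return str == rhs.str && start == rhs.start && end == rhs.end;
  }

  bool operator!=(const token_iterator & rhs) const { return !(*this == rhs); }

private:
  void find_next()
  {
    start = str->find_first_not_of(separator, end);
    if (start == std::string::npos)
    {
      // No more tokens: collapse into the end-iterator state.
      start = end = 0;
      str = 0;
      return;
    }
    end = str->find_first_of(separator, start);
  }

  const char * separator;
  const std::string * str;
  std::string::size_type start;
  std::string::size_type end;
};
```

It is used the same way as the original: construct one iterator from a string and a separator list, and compare it against a default-constructed iterator to detect the end. Because `operator*` returns by value, `reference` is declared as a plain `std::string`, which is acceptable for an input iterator.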

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.


About the Author

Daniel Andersson

Web Developer

Sweden



Last Updated 27 Jun 2002
Article Copyright 2002 by Daniel Andersson