#ifndef TOKENIZER_H
#define TOKENIZER_H

#include <istream>
#include <string>

using namespace std;

// CTokenizer is an abstract base class defining how tokenization should proceed.
class CTokenizer
{
public:
    // Constructor:
    // REQUIREMENTS:
    //   An istream successfully opened for reading. The stream is held by
    //   reference, so it must outlive this object.
    // PROMISES:
    //   The object will be ready to return tokens with GetNextToken() whenever
    //   HasMoreToken() returns true.
    CTokenizer(istream &inputStream) throw();

    // Destructor:
    // REQUIREMENTS:
    //   None.
    // PROMISES:
    //   None.
    virtual ~CTokenizer() throw();

    // GetNextToken():
    // REQUIREMENTS:
    //   HasMoreToken() must have returned true for this call to return an
    //   actual token; otherwise, it returns an empty string.
    // PROMISES:
    //   Returns the next token from the input stream.
    virtual string GetNextToken() throw() = 0;

    // HasMoreToken():
    // REQUIREMENTS:
    //   None.
    // PROMISES:
    //   If the return value is true, GetNextToken() will return a token;
    //   otherwise, no more tokens are available from the input stream.
    virtual bool HasMoreToken() throw() = 0;

protected:
    // Input stream the tokens are read from (not owned by the tokenizer).
    istream &m_stream;
};

#endif
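
The interface above only declares the contract, so a small concrete subclass may help show how HasMoreToken() and GetNextToken() are meant to work together. The sketch below is an illustration, not code from the article: `CWhitespaceTokenizer` and `CollectTokens` are hypothetical names, the base class is repeated in reduced form (without the `throw()` specifications) so that the block compiles on its own, and it assumes a C++11 compiler for `override`.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Reduced copy of the CTokenizer interface so this sketch is self-contained.
class CTokenizer
{
public:
    explicit CTokenizer(std::istream &inputStream) : m_stream(inputStream) {}
    virtual ~CTokenizer() {}
    virtual std::string GetNextToken() = 0;
    virtual bool HasMoreToken() = 0;

protected:
    std::istream &m_stream;
};

// Hypothetical subclass: a token is any run of non-whitespace characters.
class CWhitespaceTokenizer : public CTokenizer
{
public:
    explicit CWhitespaceTokenizer(std::istream &inputStream)
        : CTokenizer(inputStream) {}

    // Skips leading whitespace, then reports whether a character remains.
    bool HasMoreToken() override
    {
        m_stream >> std::ws;
        return m_stream.good() &&
               m_stream.peek() != std::char_traits<char>::eof();
    }

    // Returns the next token, or an empty string when none is left,
    // matching the contract documented in the header.
    std::string GetNextToken() override
    {
        std::string token;
        if (HasMoreToken())
            m_stream >> token;
        return token;
    }
};

// Hypothetical helper: drains a tokenizer over `text` into a vector.
std::vector<std::string> CollectTokens(const std::string &text)
{
    std::istringstream input(text);
    CWhitespaceTokenizer tokenizer(input);
    std::vector<std::string> tokens;
    while (tokenizer.HasMoreToken())
        tokens.push_back(tokenizer.GetNextToken());
    return tokens;
}
```

Callers are expected to loop on HasMoreToken() and call GetNextToken() only while it returns true, as `CollectTokens` does. Because the base class stores the stream by reference, the `std::istringstream` has to stay alive for as long as the tokenizer is in use.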
Philippe Roy has been a key contributor throughout a career of more than 20 years at many high-profile companies, including Nuance Communications, IBM (ViaVoice and ProductManager), and VoiceBox Technologies, to name just a few. He is creative and proficient in OO coding and design, knowledgeable about the intellectual-property world (he holds many patents), trilingual, and passionate about being part of a team that creates great solutions.
Oh yes, I almost forgot to mention, he has a special thing for speech recognition and natural language processing... The magic of first seeing a computer transform something as chaotic as sound and natural language into intelligible and useful output has never left him.