Click here to Skip to main content
15,883,647 members
Articles / Programming Languages / C++

Boolean Text Search Queries and their Processing

Rate me:
Please Sign up or sign in to vote.
4.83/5 (12 votes)
20 May 2011CPOL7 min read 45.6K   1.2K   46  
This article describes the development of the library for performing text search based on Boolean search queries.
//
// ACMachine.hpp
//

#ifndef AHO_CORASICK_MACHINE_HPP_INCLUDED
#define AHO_CORASICK_MACHINE_HPP_INCLUDED

#include <string>

#include <queue>
#include <hash_map>
#include <algorithm>
#include <set>


class Expression;
struct ACPMMachineState;

typedef std::pair< std::wstring, Expression* > Match;
typedef std::set< Match > MatchList;

typedef stdext::hash_map< wchar_t, ACPMMachineState* > StatesHash;
typedef std::queue< ACPMMachineState* > StatesQueue;


struct ACPMMachineState {
    ACPMMachineState();
    ACPMMachineState( const ACPMMachineState* parent, wchar_t character );

	/* Character, associated with this state*/
    wchar_t character_;

	/* Child states */
    StatesHash transitions_;

	/* For acceptance states - contains patterns, associated with this state */
    MatchList output_;

	/* Parent state */
    const ACPMMachineState* parent_;

	/* Failure transition state */
    ACPMMachineState* failure_;
};

class ACPMMachine {
public:
    ACPMMachine();
	~ACPMMachine();

	/* Adds a string to the prefix tree */
    void addPattern( const std::wstring& pattern, Expression* associatedObject );

	/* Computes failure transitions for each state */
    void computeStates();

	/* Changes current FSM state */
    bool feedCharacter( wchar_t );

	/* Returns list of patterns associated with current state*/
    const MatchList& getCurrentMatch() const;

	/* Resets current state to root state*/
    void reset();
    
private:
    ACPMMachine( const ACPMMachine& );
    ACPMMachine& operator= ( const ACPMMachine& );

private:
    static ACPMMachineState* FindTransition ( ACPMMachineState* state, wchar_t character );
    static void AddTransition ( ACPMMachineState* from, ACPMMachineState* to );
private:
    ACPMMachineState* currentState_;
    ACPMMachineState* root_;
}; 

#endif // AHO_CORASICK_MACHINE_HPP_INCLUDED

By viewing downloads associated with this article you agree to the Terms of Service and the article's licence.

If a file you wish to view isn't highlighted, and is a text file (not binary), please let us know and we'll add colourisation support for it.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Chief Technology Officer Apriorit Inc.
United States United States
ApriorIT is a software research and development company specializing in cybersecurity and data management technology engineering. We work for a broad range of clients from Fortune 500 technology leaders to small innovative startups building unique solutions.

As Apriorit offers integrated research&development services for the software projects in such areas as endpoint security, network security, data security, embedded Systems, and virtualization, we have strong kernel and driver development skills, huge system programming expertise, and are reals fans of research projects.

Our specialty is reverse engineering, we apply it for security testing and security-related projects.

A separate department of Apriorit works on large-scale business SaaS solutions, handling tasks from business analysis, data architecture design, and web development to performance optimization and DevOps.

Official site: https://www.apriorit.com
Clutch profile: https://clutch.co/profile/apriorit
This is a Organisation

33 members

Comments and Discussions