Click here to Skip to main content
Click here to Skip to main content

Aho-Corasick string matching in C#

By , 3 Dec 2005
 

Introduction

In this article, I will describe the implementation of an efficient Aho-Corasick algorithm for pattern matching. In simple words, this algorithm can be used for searching a text for specified keywords. The following code is useful when you have a set of keywords and you want to find all occurrences of a keywords in the text or check if any of the keywords is present in the text. You should use this algorithm especially if you have a large number of keywords that don't change often, because in this case, it is much more efficient than other algorithms that can be simply implemented using the .NET class library.

Aho-Corasick algorithm

In this section, I'll try to describe the concept of this algorithm. For more information and for a more exact explanation, please take a look at the links at the end of this article. The algorithm consists of two parts. The first part is the building of the tree from keywords you want to search for, and the second part is searching the text for the keywords using the previously built tree (state machine). Searching for a keyword is very efficient, because it only moves through the states in the state machine. If a character is matching, it follows goto function otherwise it follows fail function.

Tree building

In the first phase of the tree building, keywords are added to the tree. In my implementation, I use the class StringSearch.TreeNode, which represents one letter. The root node is used only as a place holder and contains links to other letters. Links created in this first step represents the goto function, which returns the next state when a character is matching.

During the second phase, the fail and output functions are found. The fail function is used when a character is not matching and the output function returns the found keywords for each reached state. For example, in the text "SHIS", the failure function is used to exit from the "SHE" branch to "HIS" branch after the first two characters (because the third character is not matching). During the second phase, the BFS (breadth first search) algorithm is used for traversing through all the nodes. Functions are calculated in this order, because the fail function of the specified node is calculated using the fail function of the parent node.

Building of the keyword tree (figure 1 - after the first step, figure 2 - tree with the fail function)

Searching

As I already mentioned, searching only means traversing the previously built keyword tree (state machine). To demonstrate how this algorithm works, let's look at the commented method which returns all the matches of the specified keywords:

// Searches passed text and returns all occurrences of any keyword
// Returns array containing positions of found keywords
public StringSearchResult[] FindAll(string text)
{
  ArrayList ret=new ArrayList(); // List containing results
  TreeNode ptr=_root;            // Current node (state)
  int index=0;                   // Index in text

  // Loop through characters
  while(index<text.Length)
  {
    // Find next state (if no transition exists, fail function is used)
    // walks through tree until transition is found or root is reached
    TreeNode trans=null;
    while(trans==null)
    {
      trans=ptr.GetTransition(text[index]);
      if (ptr==_root) break;
      if (trans==null) ptr=ptr.Failure;
    }
    if (trans!=null) ptr=trans;

    // Add results from node to output array and move to next character
    foreach(string found in ptr.Results)
      ret.Add(new StringSearchResult(index-found.Length+1,found));
    index++;
  }
  
  // Convert results to array
  return (StringSearchResult[])ret.ToArray(typeof(StringSearchResult));
}

Algorithm complexity

Complexity of the first part is not so important, because it is executed only once. Complexity of the second part is O(m+z) where m is the length of the text and z is the number of found keywords (in simple words, it is very fast and it's speed doesn't drop quickly for longer texts or many keywords).

Performance comparison

To show how efficient this algorithm is, I created a test application which compares this algorithm with two other simple methods that can be used for this purpose. The first algorithm uses the String.IndexOf method to search the text for all the keywords, and the second algorithm uses regular expressions - for example, for keywords he, she, and his, it creates a regular expression (he|she|his). The following graphs show the results of tests for two texts of different sizes. The number of used keywords is displayed on the X axis and the time of search is displayed on the Y axis.

The interesting thing is that for less than 70 keywords, it is better to use a simple method using String.IndexOf. Regular expressions are almost always slower than other algorithms. I also tried compiling the test under both .NET 1.1 and .NET 2.0 to see the difference. Although my measuring method may not be very precise, it looks like .NET 2.0 is a bit faster (about 5-10%), and the method with regular expressions gives much better results (about 60% faster).

Two charts comparing the speed of the three described algorithms - Aho-Corasick (green), IndexOf (blue), and Regex (yellow)

How to use the code

I decided to implement this algorithm when I had to ban some words in a community web page (vulgarisms etc.). This is a typical use case because searching should be really fast, but blocked keywords don't change often (and the creation of the keyword tree can be slower).

The search algorithm is implemented in a file StringSearch.cs. I created the interface that represents any search algorithm (so it is easy to replace it with another implementation). This interface is called IStringSearchAlgorithm, and it contains a property Keywords (gets or sets keywords to search for) and methods for searching. The method FindAll returns all the keywords in the passed text, and FindFirst returns the first match. Matches are represented by the StringSearchResult structure that contains the found keyword and its position in the text. The last method is ContainsAny, which returns true when the passed text contains a keyword. The class that implements the Aho-Corasick algorithm is called StringSearch.

Initialization

The following example shows how to load keywords from a database and create a SearchAlgorithm instance:

// Initialize DB connection
SqlConnection conn = new SqlConnection(connectionString);
SqlCommand cmd = new SqlCommand("SELECT BlockedWord" + 
                                " FROM BlockedWords",conn);
conn.Open();

// Read list of banned words
ArrayList listWords = new ArrayList();
using(SqlDataReader reader = 
  cmd.ExecuteReader(CommandBehavior.CloseConnection))
{
  while(reader.Read()) 
    listWords.Add(myReader.GetString(0));
}
string[] arrayWords = (string[])listWords.ToArray(typeof(string));

// Create search algorithm instance
IStringSearchAlgorithm searchAlg = new StringSearch();
searchAlg.Keywords = arrayWords;

You can also use the StringSearch constructor which takes an array of keywords as parameter.

Searching

Searching the passed text for keywords is even easier. The following sample shows how to write all the matches to the console output:

// Find all matching keywords  
StringSearchResult[] results=searchAlg.FindAll(textToSearch);

// Write all results  
foreach(StringSearchResult r in results)
{
  Console.WriteLine("Keyword='{0}', Index={1}", r.Keyword, r.Index);
}

Conclusion

This implementation of the Aho-Corasick search algorithm is very efficient if you want to find a large number of keywords in a text of any length, but if you want to search only for a few keywords, it is better to use a simple method like String.IndexOf. The code can be compiled in both .NET 1.1 and .NET 2.0 without any modifications. If you want to learn more about this algorithm, take a look at the link in the next section, it was very useful for me during the implementation of the algorithm and explains the theory behind this algorithm.

Links and references

Future work and history

  • 12/03/2005 - First version of this article published at CodeProject.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

About the Author

Tomas Petricek
Czech Republic Czech Republic
Member
I live in Prague, the capital city of Czech republic (most of the time Smile | :) ). I've been very interested in functional programming recently and I have a passion for the new Microsoft F# language. I'm writing a book about Functional Programming in the Real World that shows the ideas using examples in C# 3.0 and F#.
 
I've been Microsoft MVP (for C#) since 2004 and I'm one of the most active members of the F# community. I'm a computer science student at Charles University of Prague. My hobbies include photography, fractals and of course many things related to computers (except fixing them). My favorite book writers are Terry Pratchett and Philip K Dick and I like paintings by M. C. Escher.
 
PS: My favorite codeproject icon is Sheep | [baah] .

Sign Up to vote   Poor Excellent
Add a reason or comment to your vote: x
Votes of 3 or less require a comment

Comments and Discussions

 
You must Sign In to use this message board.
Search this forum  
    Spacing  Noise  Layout  Per page   
GeneralIndeed ver nice job ...memberWilenX10 Sep '07 - 7:42 
However, it is highly inefficient using C#'s GUI TreeNode for aho-corasick searching.
 
Say you have a very long list of keywords you want to search, and so generating the tree in that case would result in larger memory consuming.
 
I'd very like to see that implemented in a memory efficient way.
GeneralRe: Indeed ver nice job ...memberTomas Petricek10 Sep '07 - 7:55 
Hi, the TreeNode class that is being used isn't of course the one from WinForms. It is a simple nested class and is part of the source code.
 
Homepage: TomasP.net | Photo of the month: Calendar | C# and LINQ, F#, Phalanger: My Blog
Latest article: Phalanger, PHP for .NET: Introduction for .NET developers

QuestionI need some helpmemberstefan stoian12 Sep '06 - 5:02 
I have the following situation:
2 tables containing data like bellow:
 
table 1:
id date dur dest_number orig_number country_code
53438505 2006-07-20 00:34:56.000 281 6581393705 4084289843 65
53580862 2006-07-20 21:07:00.000 86 6581507886 4084289843 65
53583601 2006-07-20 21:32:34.000 1070 6581507886 4084289843 65
 

table 2:
id date dest_number orig_number durat. country_code
98400146 2006-07-19 22:01:24.000 0064403726 4084289843 60 00065
98400251 2006-07-19 22:04:09.000 0064403726 4084289843 42 00065
98412892 2006-07-20 00:26:39.000 0064403726 4084289843 30 00065
98412909 2006-07-20 00:28:00.000 0064403726 4084289843 66 00065
98412937 2006-07-20 00:29:15.000 0064403726 4084289843 54 00065
98412965 2006-07-20 00:34:51.000 0081393705 4084289843 300 00065
98471254 2006-07-20 21:06:48.000 0081507886 4084289843 114 00065
98471536 2006-07-20 21:32:27.000 0081507886 4084289843 1086 00065
 
For every record in table 1 i need to find 1 or more matches in table 2. Matching must be between dest_number. The problem is tha in table 2 dest_number comes sometime malformed like in this case..
Do you have any ideea how to do this? I looked for some algorithms but i did not find a matching one for this situation.
 
Thank you in very much
GeneralSpecialize to "Find if all in string"memberl_d_allan28 Feb '06 - 4:43 
Thanks for provding this.
 
Can this code (or a derivative) detect if all search tokens are in a string, and quit as soon as this has been determined? My impression is that the existing code continues to the end of the string and finds all matches of all tokens.
 
Here is a scenario: a 40 meg text file is in a memory buffer and split into about 300,000 lines. I want to make a list of those lines which contain all of the search words. The memory buffer will be searched repeatedly for different groups of tokens.
 
Suppose line 1000 is:
"In this article, I will describe the implementation of an efficient Aho-Corasick algorithm for pattern matching. In simple words, this algorithm can be used for searching a text for specified keywords. The following code is useful when you have a set of keywords and you want to find all occurrences of a keywords in the text or check if any of the keywords is present in the text. You should use this algorithm especially if you have a large number of keywords that don't change often, because in this case, it is much more efficient than other algorithms that can be simply implemented using the .NET class library."
 
If the search tokens were "simple" and "this", then this line can be considered a "match" after about 120 characters (not all 617), and move on to the next line.
 
I wonder if there is a way for the code to stop looking for the token "this" after it has been found (about position 8), and proceed to only look for "simple". Can a token be removed part way thru a search if this would speed up the search (maybe it wouldn't speed up the search more than a minor amount, if at all)?
 
Another possible optimization would be to switch to something like Boyer-Moore-Horspool for the rest of the string when there was only one token left (but that is getting pretty complicated).
 
Can a call such as DetectWhetherAllTokensExist be done with the existing code? If so, how? If not, what modifications would be involved?
 
Thanks again.

GeneralRe: Specialize to "Find if all in string"memberLior Kaduri14 Jun '06 - 2:06 
I know this may be WAY too late - but look at my post below 'Some improvements'. I think it may just what you are (were Smile | :) ) looking for..
GeneralVery strange tests...memberIvan A. Gusev12 Jan '06 - 22:46 
It seems that you test only ContainsAny method, but Aho-Corasick algorithm is used in cases when you want to find ALL occurencess of a set of patterns within a text.
I slightly changed your performance tests (I call Regex.Matches instead of Match) and it seems that Regex works faster (!).
Could you comment this behavior?
GeneralGreat! some improvementsmemberLior Kaduri3 Jan '06 - 23:26 
Great work! this is just what I was looking for! Big Grin | :-D
 
However, since I needed a high-performance, case-insensitive searcher, I made some modifications:
1. Improved performance of char-hashtable by using a custom Char-Hashtable (no boxing/unboxing).
2. Added FindAllUnique(..) methods for searching unique occurances.
3. Added optional upper limit to FinAllxxx methods ('find no more than X occurances')
4. Search is now case-insensitive
 
Here is the new class:

/* Aho-Corasick text search algorithm implementation
*
* For more information visit
* - http://www.cs.uku.fi/~kilpelai/BSA05/lectures/slides04.pdf
* also at: http://www.codeproject.com/useritems/ahocorasick.asp
*
* modified by lior kaduri for an optimized Hashtable and unique keywords
*/
using System.Collections;
using System.Collections.Specialized;
 
namespace EeekSoft.Text.New
{
///
/// Interface containing all methods to be implemented
/// by string search algorithm
///

public interface IStringSearchAlgorithm
{
#region Methods & Properties
 
///
/// List of keywords to search for
///

string[] Keywords { get; set; }

 
///
/// Searches passed text and returns all occurrences of any keyword
///

/// Text to search
/// Array of occurrences
StringSearchResult[] FindAll(string text);
 
///
/// Searches passed text and returns all occurrences of any keyword, but no more than
/// a given limit
///

/// Text to search
/// Array of occurrences
StringSearchResult[] FindAll(string text, int maxKeywords);

///
/// Searches passed text and returns all the unique occurrences of any keyword
///

/// Text to search
/// Array of occurrences
StringSearchResult[] FindAllUnique(string text);
 
///
/// Searches passed text and returns all the unique occurrences of any keyword, but no more than
/// a given limit
///

/// Text to search
/// Array of occurrences
StringSearchResult[] FindAllUnique(string text, int maxUniqueKeywords);

///
/// Searches passed text and returns first occurrence of any keyword
///

/// Text to search
/// First occurrence of any keyword (or StringSearchResult.Empty if text doesn't contain any keyword)
StringSearchResult FindFirst(string text);
 
///
/// Searches passed text and returns true if text contains any keyword
///

/// Text to search
/// True when text contains any keyword
bool ContainsAny(string text);
 
#endregion
}
 
///
/// Structure containing results of search
/// (keyword and position in original text)
///

public struct StringSearchResult
{
#region Members

private int _index;
private string _keyword;
 
///
/// Initialize string search result
///

/// Index in text
/// Found keyword
public StringSearchResult(int index,string keyword)
{
_index=index; _keyword=keyword;
}
 

///
/// Returns index of found keyword in original text
///

public int Index
{
get { return _index; }
}
 

///
/// Returns keyword found by this result
///

public string Keyword
{
get { return _keyword; }
}
 

///
/// Returns empty search result
///

public static StringSearchResult Empty
{
get { return new StringSearchResult(-1,""); }
}
 
#endregion
}
 

///
/// Case-Insensitive Class for searching string for one or multiple
/// keywords using efficient Aho-Corasick search algorithm
///

public class CIStringSearch : IStringSearchAlgorithm
{
#region Objects
 
///
/// Tree node representing character and its
/// transition and failure function
///

class TreeNode
{
#region Constructor & Methods
 
///
/// Initialize tree node with specified character
///

/// Parent node
/// Character
public TreeNode(TreeNode parent,char c)
{

_char=c;
_parent=parent;
_results=new ArrayList();
_resultsAr=new string[] {};
 
_transitionsAr=new TreeNode[] {};
_transHash=new TreeNodeByCharHash();
}
 

///
/// Adds pattern ending in this node
///

/// Pattern
public void AddResult(string result)
{
if (_results.Contains(result)) return;
_results.Add(result);
_resultsAr=(string[])_results.ToArray(typeof(string));
}
 
///
/// Adds trabsition node
///

/// Node
public void AddTransition(TreeNode node)
{
_transHash.Add(node.Char,node);
TreeNode[] ar=new TreeNode[_transHash.Values.Count];
_transHash.Values.CopyTo(ar,0);
_transitionsAr=ar;
}
 

///
/// Returns transition to specified character (if exists)
///

/// Character
/// Returns TreeNode or null
public TreeNode GetTransition(char c)
{
return _transHash[c];
}
 

///
/// Returns true if node contains transition to specified character
///

/// Character
/// True if transition exists
public bool ContainsTransition(char c)
{
return GetTransition(c)!=null;
}
 
#endregion
#region Properties

private char _char;
private TreeNode _parent;
private TreeNode _failure;
private ArrayList _results;
private TreeNode[] _transitionsAr;
private string[] _resultsAr;
private TreeNodeByCharHash _transHash;
 
///
/// Character
///

public char Char
{
get { return _char; }
}
 

///
/// Parent tree node
///

public TreeNode Parent
{
get { return _parent; }
}
 

///
/// Failure function - descendant node
///

public TreeNode Failure
{
get { return _failure; }
set { _failure=value; }
}
 

///
/// Transition function - list of descendant nodes
///

public TreeNode[] Transitions
{
get { return _transitionsAr; }
}
 

///
/// Returns list of patterns ending by this letter
///

public string[] Results
{
get { return _resultsAr; }
}
 
#endregion
}
 
// Simple hash table for
// This table is used instead of System.Collections.Hashtable
// for performance reasons
class TreeNodeByCharHash
{
// Used internally by this class
private struct HashItem
{
// Hash key
public char Key;
// Value
public TreeNode Value;
}
 
private HashItem[][] Items;
private int Count;
private IList m_values;
 
public TreeNodeByCharHash()
{
Items = new HashItem[256][];
m_values = new ArrayList();
}

// Returns value specified by Key
// and null if the Key isn't found
public TreeNode Get(char Key)
{
int HashedKey = Key % 256;
if (Items[HashedKey] != null)
{
// The most likely variant
if (Items[HashedKey][0].Key == Key)
return Items[HashedKey][0].Value;
for(int i = 1; i < Items[HashedKey].Length; i++)
if (Items[HashedKey][i].Key == Key)
return Items[HashedKey][i].Value;
}
return null;
}
 
// Adds a key-value pair to the hash table
// If the key is already present in the table this method does nothing
public void Add(char Key, TreeNode Value)
{
if (Get(Key) != null)
return;
int HashedKey = Key % 256;
HashItem HI = new HashItem();
HI.Key = Key;
HI.Value = Value;
if (Items[HashedKey] == null)
Items[HashedKey] = new HashItem[1] {HI};
else
{
HashItem[] NewItems = new HashItem[Items[HashedKey].Length+1];
Items[HashedKey].CopyTo(NewItems, 0);
NewItems[Items[HashedKey].Length] = HI;
Items[HashedKey] = NewItems;
}
Count++;
m_values.Add(Value);
}
 
public TreeNode this[char key]
{
get { return this.Get(key);}
}
 
public IList Values
{
get { return m_values; }
}
}
 
#endregion
#region Local fields

///
/// Root of keyword tree
///

private TreeNode _root;
 
///
/// Keywords to search for
///

private string[] _keywords;
 
#endregion
 
#region Initialization

///
/// Initialize search algorithm (Build keyword tree)
///

/// Keywords to search for
public CIStringSearch(string[] keywords)
{
Keywords=keywords;
}
 

///
/// Initialize search algorithm with no keywords
/// (Use Keywords property)
///

public CIStringSearch()
{ }
 
#endregion
#region Implementation
 
///
/// Build tree from specified keywords
///

void BuildTree()
{
// Build keyword tree and transition function
_root=new TreeNode(null,' ');
foreach(string keyword in _keywords)
{
string keywordLower = keyword.ToLower();
// add pattern to tree
TreeNode nd=_root;
for (int i = 0; i < keywordLower.Length; i++)
{
char c = keywordLower[i];
// find a transition for c
TreeNode ndNew = nd.GetTransition(c);
 
if (ndNew==null)
{
ndNew=new TreeNode(nd,c);
nd.AddTransition(ndNew);
}
nd=ndNew;
}
// add the keyword to the final node
nd.AddResult(keyword);
}
 
// Find failure functions
ArrayList nodes=new ArrayList();

// level 1 nodes - fail to root node
foreach(TreeNode nd in _root.Transitions)
{
nd.Failure=_root;
foreach(TreeNode trans in nd.Transitions)
nodes.Add(trans);
}
// other nodes - using BFS
while(nodes.Count!=0)
{
ArrayList newNodes=new ArrayList();
foreach(TreeNode nd in nodes)
{
TreeNode r=nd.Parent.Failure;
char c=nd.Char;
 
while(r!=null && !r.ContainsTransition(c)) r=r.Failure;
if (r==null)
nd.Failure=_root;
else
{
nd.Failure=r.GetTransition(c);
foreach(string result in nd.Failure.Results)
nd.AddResult(result);
}

// add child nodes to BFS list
foreach(TreeNode child in nd.Transitions)
newNodes.Add(child);
}
nodes=newNodes;
}
_root.Failure=_root;
}
 

#endregion
#region Methods & Properties
 
///
/// Keywords to search for (setting this property is slow, because
/// it requieres rebuilding of keyword tree)
///

public string[] Keywords
{
get { return _keywords; }
set
{
_keywords=value;
BuildTree();
}
}
 

///
/// Searches passed text and returns all occurrences of any keyword
///

/// Text to search
/// Array of occurrences
public StringSearchResult[] FindAll(string text)
{
return FindAll(text, int.MaxValue);
}
 
///
/// Searches passed text and returns all occurrences of any keyword, but no more than
/// a given limit
///

/// Text to search
/// Array of occurrences
public StringSearchResult[] FindAll(string text, int maxKeywords)
{
StringSearchResult[] uniqueMatches;
StringSearchResult[] allMatches;
InnerFindAll(text, maxKeywords, int.MaxValue, out allMatches, out uniqueMatches);
return allMatches;
}
 
///
/// Searches passed text and returns all the unique occurrences of any keyword
///

/// Text to search
/// Array of occurrences
public StringSearchResult[] FindAllUnique(string text)
{
return FindAllUnique(text, int.MaxValue);
}
 
///
/// Searches passed text and returns all the unique occurrences of any keyword, but no more than
/// a given limit
///

/// Text to search
/// Array of occurrences
public StringSearchResult[] FindAllUnique(string text, int maxUniqueKeywords)
{
StringSearchResult[] uniqueMatches;
StringSearchResult[] allMatches;
InnerFindAll(text, int.MaxValue, maxUniqueKeywords, out allMatches, out uniqueMatches);
return uniqueMatches;
}
 
///
/// finds all the occurances (unique and not unique) of any keyword in a given text,
/// limited to maximum upper bounds
///

///
///
///
///
///
private void InnerFindAll(string text, int maxKeywords, int maxUniqueKeywords, out StringSearchResult[] allMatches, out StringSearchResult[] uniqueMatches)
{
ArrayList allMatchesList = new ArrayList();
HybridDictionary uniqueMatchesList = new HybridDictionary(maxUniqueKeywords);
 
TreeNode ptr=_root;
int index=0;
 
while(index= maxUniqueKeywords ||
allMatchesList.Count >= maxKeywords)
break;
index++;
}
// set the output variables
allMatches = (StringSearchResult[])allMatchesList.ToArray(typeof(StringSearchResult));
uniqueMatches = new StringSearchResult[uniqueMatchesList.Count];
uniqueMatchesList.Values.CopyTo(uniqueMatches,0);
}
 

///
/// Searches passed text and returns first occurrence of any keyword
///

/// Text to search
/// First occurrence of any keyword (or StringSearchResult.Empty if text doesn't contain any keyword)
public StringSearchResult FindFirst(string text)
{
StringSearchResult[] result = FindAll(text, 1);
if (result.Length > 0)
return result[0];
return StringSearchResult.Empty;
}
 
///
/// Searches passed text and returns true if text contains any keyword
///

/// Text to search
/// True when text contains any keyword
public bool ContainsAny(string text)
{
StringSearchResult[] result = FindAll(text, 1);
return result.Length > 0;
}
 
#endregion
}
}

 

-- modified at 5:47 Wednesday 4th January, 2006
GeneralRe: Great! some improvementsmemberrmircea14 Jun '06 - 0:14 
Hello! I need to find whole words in a text and marked them after that. The finding of whole words do not work in the versions above. Can you help me with this please?!

 
Mircea Ruja
Developer
QuestionLicensememberleppie30 Dec '05 - 23:13 
Hi Tomas
 
I dont see any licensing info for this. I wanna use it in my xacc project. Would that be ok?
 
Thanks
 
leppie
 
xacc.ide-0.1.1 released! :) Download and screenshots
GeneralPlease remove source controlmemberGary Thom5 Dec '05 - 8:30 
The solution and projects include source control information, which is plain nasty.
 
Interesting article though.
 
Gary
 
FireFox 1.5The award-winning Web browser just got better. It’s free, and easy to use. Join the millions of people worldwide enjoying a better Web experience. - so far, not so good Sigh | :sigh:
GeneralTomas, its nice :)memberJan Seda (Skilldrive.com)4 Dec '05 - 22:34 
Hi Tomas!
 
Again you wrote a very nice article Smile | :) Seems like other Skilldrive guys are behind you Smile | :)
 
Regards,
Jan
GeneralTypo :pmemberleppie3 Dec '05 - 21:56 
The text legend of the graph is 'slightly' wrong Smile | :)
 
xacc.ide-0.1 released! Download and screenshots
AnswerRe: Typo :pmemberTomas Petricek4 Dec '05 - 2:00 
Fixed Blush | :O
Thanks for noticing.
 

Tomas Petricek (Microsoft C# MVP)
www.eeeksoft.net | Photos | ASP.NET Multi-column layout control

QuestionQuestionsmemberleppie3 Dec '05 - 21:49 
1. What pattern & options did you use with the Regex?
2. How does Boyer Moore compare (esp large texts)?
 
xacc.ide-0.1 released! Download and screenshots
AnswerRe: QuestionsmemberTomas Petricek4 Dec '05 - 2:09 
1. This is decribed in Performance comparsion section. For example for he,she,his keywords I used (he,she,his) pattern.
 
2. I looked for some information about this algorithm now, but I'm not sure wether it supports multiple keywords (thats what I wanted)?
 

Tomas Petricek (Microsoft C# MVP)
www.eeeksoft.net | Photos | ASP.NET Multi-column layout control

GeneralRe: Questionsmemberleppie4 Dec '05 - 2:24 
1. Did you use the Compiled option?
 
2. Yes, you are correct, I didnt think of that, but it would be possible. I will look into adapting my version of BM and paste you some code Smile | :)
 
xacc.ide-0.1 released! Download and screenshots
GeneralRe: QuestionsmemberTomas Petricek4 Dec '05 - 3:13 
I used compiled option for testing, because it should be faster (compilation is done only once before searching), but I tried it without Compiled and there wasn't big diference.
 

Tomas Petricek (Microsoft C# MVP)
www.eeeksoft.net | Photos | ASP.NET Multi-column layout control

GeneralAwesome!memberSuper Lloyd3 Dec '05 - 19:51 
I was looking for something like that!
 
For now I only browsed through you article and it seems a perfect answer to a problem I will face... later Laugh | :laugh:
 
But I bookmarked it! Big Grin | :-D

General General    News News    Suggestion Suggestion    Question Question    Bug Bug    Answer Answer    Joke Joke    Rant Rant    Admin Admin   

Permalink | Advertise | Privacy | Mobile
Web02 | 2.6.130516.1 | Last Updated 3 Dec 2005
Article Copyright 2005 by Tomas Petricek
Everything else Copyright © CodeProject, 1999-2013
Terms of Use
Layout: fixed | fluid