
hOOt - full text search engine

24 Feb 2019 | CPOL | 17 min read
Smallest full-text search engine (Lucene replacement), built from scratch using an inverted MGRB bitmap index and highly compact storage, operating in database and document modes
v1.0
-----
- initial release

v1.1
----
- tweaked parameters and reduced index size by 46%
- 5x faster bitmap index save
- bug fix: sample UI
- bug fix: bitarray resize
- thread-safe internals
- code refactoring
- OptimizeIndex() implemented

v1.2
----
- FindDocumentFileNames() for a faster, string-only return
- better word extractor, ~19% smaller index
    - breaks up camel case compound words
    - ignores strings longer than 60 chars or shorter than 2 chars
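
The word extractor rules listed for v1.2 can be illustrated with a minimal sketch. This is not hOOt's actual tokenizer: the class name `WordExtractorSketch`, the lowercasing of output, and the choice to apply the 60-character limit before splitting and the 2-character limit after splitting are all assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper for illustration only; not part of the hOOt source.
public static class WordExtractorSketch
{
    const int MinLength = 2;   // changelog: strings shorter than 2 chars are ignored
    const int MaxLength = 60;  // changelog: strings longer than 60 chars are ignored

    public static List<string> Extract(string text)
    {
        var words = new List<string>();
        var run = new StringBuilder();
        foreach (char c in text)
        {
            if (char.IsLetterOrDigit(c))
                run.Append(c);          // still inside a run of word characters
            else
                Flush(run, words);      // any other character ends the run
        }
        Flush(run, words);
        return words;
    }

    // Splits one run of letters/digits at lower-to-upper transitions.
    static void Flush(StringBuilder run, List<string> words)
    {
        if (run.Length == 0) return;
        string token = run.ToString();
        run.Clear();
        if (token.Length > MaxLength) return;   // assumption: drop the whole over-long run

        int start = 0;
        for (int i = 1; i < token.Length; i++)
        {
            if (char.IsLower(token[i - 1]) && char.IsUpper(token[i]))
            {
                Add(token.Substring(start, i - start), words);
                start = i;
            }
        }
        Add(token.Substring(start), words);
    }

    static void Add(string part, List<string> words)
    {
        if (part.Length >= MinLength)
            words.Add(part.ToLowerInvariant()); // assumption: index is case-insensitive
    }
}
```

Under these assumptions, `WordExtractorSketch.Extract("FindDocumentFileNames a")` yields the four words find, document, file and names, and the single-character "a" is dropped. Fewer and shorter junk tokens is one plausible reason for the smaller index reported above.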

v1.3
----
- bug fix: bitarray operations

v1.4
----
- replaced WAHBitarray with v2
- ~9x bitmap save speed increase
- ** index must be rebuilt from previous version **

v1.5
----
- added support for wildcard characters (*,?)
- added support for AND (+) and NOT (-) queries
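
A rough sketch of how the v1.5 query syntax could be evaluated is shown here. It is not hOOt's implementation: the `QuerySketch` class is hypothetical, a `Dictionary<string, HashSet<int>>` of document ids stands in for the bitmap index, and the semantics (plain terms are OR-ed, `+` terms are intersected, `-` terms are subtracted) are an assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only; not part of the hOOt source.
public static class QuerySketch
{
    // Turns a term containing * or ? into an anchored regular expression.
    static Regex ToRegex(string term)
    {
        string pattern = "^" + Regex.Escape(term).Replace("\\*", ".*").Replace("\\?", ".") + "$";
        return new Regex(pattern);
    }

    // Collects the ids of all documents containing a word that matches the term.
    static HashSet<int> DocsFor(string term, Dictionary<string, HashSet<int>> index)
    {
        var docs = new HashSet<int>();
        if (term.IndexOfAny(new[] { '*', '?' }) < 0)
        {
            if (index.TryGetValue(term, out var exact)) docs.UnionWith(exact);
            return docs;
        }
        Regex rx = ToRegex(term);
        foreach (var entry in index)
            if (rx.IsMatch(entry.Key)) docs.UnionWith(entry.Value);
        return docs;
    }

    public static HashSet<int> Search(string query, Dictionary<string, HashSet<int>> index)
    {
        var optional = new HashSet<int>();
        var required = new List<HashSet<int>>();
        var excluded = new HashSet<int>();

        string[] terms = query.ToLowerInvariant()
                              .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        foreach (string raw in terms)
        {
            if (raw.Length > 1 && raw[0] == '+')
                required.Add(DocsFor(raw.Substring(1), index));        // AND term
            else if (raw.Length > 1 && raw[0] == '-')
                excluded.UnionWith(DocsFor(raw.Substring(1), index));  // NOT term
            else
                optional.UnionWith(DocsFor(raw, index));               // plain term
        }

        // Assumption: when + terms exist they define the result; otherwise plain terms are OR-ed.
        HashSet<int> result;
        if (required.Count > 0)
        {
            result = new HashSet<int>(required[0]);
            for (int i = 1; i < required.Count; i++) result.IntersectWith(required[i]);
        }
        else
        {
            result = optional;
        }
        result.ExceptWith(excluded);
        return result;
    }
}
```

With an index where "search" maps to documents 1 and 2 and "engine" maps to documents 2 and 3, the query `+search +engine` returns document 2 and `search -engine` returns document 1. In a bitmap index the same set operations would map to bitwise AND, OR and AND NOT over the per-word bitmaps.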

v2.0
----
- updated and embedded fastJSON v2.0.11 in the project
- back-ported code from RaptorDB
- thread-safe lock updates
- bug fix: logger threading
- restructured source code
- fixed sample form tab order
- added incremental indexing
- document file information is now stored for future checking

v2.1
----
- upgrade to fastJSON v2.0.15
- bug fix: last word was missing its last character


License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Architect
United Kingdom
Mehdi first started programming at the age of 8 on a BBC+128k machine in 6512 processor language. After various hardware and software changes, he eventually came across .NET and C#, which he has been using since v1.0.
He is formally educated as a systems analyst and industrial engineer, but his programming passion continues.

* Mehdi is the 5th person to get 6 out of 7 Platinums on CodeProject (13th Jan '12)
* Mehdi is the 3rd person to get 7 out of 7 Platinums on CodeProject (26th Aug '16)
