Click here to Skip to main content
15,860,859 members
Articles / Database Development / NoSQL

RaptorDB - the Key Value Store

Rate me:
Please Sign up or sign in to vote.
4.89/5 (118 votes)
22 Jan 2012CPOL22 min read 904.8K   9.9K   266  
Smallest, fastest embedded nosql persisted dictionary using b+tree or MurMur hash indexing. (Now with Hybrid WAH bitmap indexes)
v1
---

v1.1
----
- Count() 
- InMemoryIndex property and SaveIndex() -> log files kept until index saved to disk
- storage file header size bug fix
- safedictionary byte[] key fix 

v1.2
-----
- EnumerateStorageFile() : keyvaluepair -> using yield
- GetDuplicates(key)
- FetchDuplicate(offset)  
- FreeCacheOnCommit property

v1.3
-----
- bug fix CheckHeader in StorageFile
- bug fix Get in LogFile 

v1.4
----
- reverse short, int, long bytes (big endian) so they sort properly in byte compare (in ascending order)
- read operation on data file uses another stream 
- bug fix duplicate page not set in keypointer

v1.4.1
------
- bug fix helper.compare 

v1.5
-----
- changed long to int for smaller index size and lower memeory usage should get 50% saving in both
- all internal operations are on record numbers instead of file offsets
- added a new file (mgrec extension) for mapping record numbers (int) to file offset (long) 
- ** index files are not backward compatible with pre v1.5 ** 
- bug in unsafe bitconvert routines, using ms versions until workaround found

v1.5.1
------
- unsafe bitconvert fixed
- speed optimized storage write to disk, removed slow file.seek and file.length (amazingly moving from long to int took a ~50% performance hit, now back to ~5% to the original code speed)

v1.5.2
------
- bug fix int seek overflow in index
- change to flush(true) for all files
- bug fix hash duplicate code

v1.6
-----
- IndexerThread will do indexing continuosly until the queue is empty
- resolved a concurrrency issue with Get() and IndexerThread
- replaced flush(true) with flush() for .net2 compatibility

v1.7
-----
- implemented Shutdown() to close all files and shutdown the engine
- duplicates are now encoded as WAH bitarrays (hybrid b+tree/bitmap and hash/bitmap indexes)
- special case WAH bitmap for optimized memory usage
- ** index files are not backward compatible with pre v1.7 because of duplicate encoding and compare changes **
- now using IEnumerable<> instead of List<> for lower memory usage
- code restructuring and cleanup
- SaveIndex(true) will now flush all in-memory structures to disk indexes
- added unit test project
- optimized compare() routine ~18% faster, thanks to Dave Dolan
- Read speed optimized 10x faster
- Separate read/write streams for bitmap indexes

v1.7.5
------
- added ThrottleInputWhenLogCount = 400 this kicks in after about 8 million items in the log queue
- btree cache clearing now only clears leaf nodes to save memory and keep performance
- optimized the indexer thread
- bug fix shutdown
- optimized Dave's compare()
- added Twenty_Million_Set_Get_BTREE() test to unit tests
- bug fix shutdown()
- moved from SortedList to Dictionary for internal caches

v1.8
------
- hash caching activated, read speed is on par with btree
- zero length files are deleted on shutdown
- bug fix hash shutdown 
- new tests to test shutdown and flushing logic
- Added RaptorDBString with unlimited key size
- Count() now uses the storage file contents instead of the index (faster for huge row count >10million)
- Changed internals to generics
- Implemented IDisposable for clean shutdown
- optimized bitmap index read speed
- added minilogger for log progress files
- bug fix WAHBitArray
- Shutdown can flush memory logs t index on flag
- added SaveMemoryIndexOnShutdown to Global

By viewing downloads associated with this article you agree to the Terms of Service and the article's licence.

If a file you wish to view isn't highlighted, and is a text file (not binary), please let us know and we'll add colourisation support for it.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Architect -
United Kingdom United Kingdom
Mehdi first started programming when he was 8 on BBC+128k machine in 6512 processor language, after various hardware and software changes he eventually came across .net and c# which he has been using since v1.0.
He is formally educated as a system analyst Industrial engineer, but his programming passion continues.

* Mehdi is the 5th person to get 6 out of 7 Platinum's on Code-Project (13th Jan'12)
* Mehdi is the 3rd person to get 7 out of 7 Platinum's on Code-Project (26th Aug'16)

Comments and Discussions