Comments and Discussions

|
Mehdi,
I really like the performance of your project. Would like to use it on multiple mobile devices. Have you considered creating a PCL solution? Has anyone tried this yet?
Thanks,
Mark

|
I am using RaptorDB for a project that inserts a few million records (both keys and values are typically close to 100 bytes or more) and I have been getting OutOfMemory exceptions. I've played with various options to control memory usage (decreasing PageItemCount/SaveTimerSeconds, and setting both FreeBitMapMemoryOnSave and FlushStorageFileImmediatley to true), but none of them seem to have any effect on how much memory is used. In particular, I would think that setting FlushStorageFileImmediatley to true would minimize memory usage, if data is not buffered for long before being written to disk.
Any ideas what I might be missing?
Thanks!

|
I ran a small insert test, using ESENT for comparison, and found that the results are not the same.
RaptorDB inserted 3221327 strings, but ESENT inserted 3222542 strings.
RaptorDB is much faster, though: it took about 20 minutes, while ESENT took 4 hours.

|
After inserting several times, opening and closing the database in between, I found that pages containing items were missing.
I do not know why this happened and I am confused by it. The database is a key/value store with keys of type long.

|
Excellent article. One thing that was lacking was byte-valued keys. Here ya' go: ByteKey gist[^]
He who asks a question is a fool for five minutes. He who does not ask a question remains a fool forever. [Chinese Proverb]
Jonathan C Dickinson (C# Software Engineer)

|
Hi Mehdi,
I am evaluating RaptorDB and I have to say it works very well!
However, when I use it on a server with a slow disk there is a big degradation in performance, especially in throughput (20K reads/sec with a 20M data set).
Is there a way to load the entire DB into memory?
Thanks,
Alex.

|
Should these handlers (double_handler, byte_handler, float_handler, ...) in DataTypes.cs be singleton instances?

|
There are some helper methods in "SafeDictionary.cs", for example ToInt32.
In .NET, "System.BitConverter" can do the same thing.
Why did you implement them yourself?

|
Is there a way to list all the values?

|
I found "RaptorDB - The Key Value Store V2" and "RaptorDB - the Document Store".
What is the difference between the two projects?
Thank you.

|
I'm interested in trying this DB out; it looks like it's well implemented, and well supported. However, I'm ideally trying to find a solution that will work across a variety of .NET platforms, specifically Windows, Windows RT, and Windows Phone. Does RaptorDB support usage on Windows RT and/or Windows Phone? I'm not seeing any documentation that says it does, but figured it couldn't hurt to ask. If not, do you plan to add this support? Regardless, nice job!
Jamie Nordmeyer Portland, Oregon, USA

|
Hi Mehdi,
Really great piece of code! I have a situation where I'm currently using a Dictionary with 10 million items in it. The problem is that populating the Dictionary can take a while.
RaptorDB could be a great way to cut this startup time. However, every time I open the DB, it rebuilds all indexes (even if I'm only using it for reading). With these 10 million items, it takes about a minute.
2012-12-22 04:47:18|DEBUG|10|RaptorDB.KeyStore`1|| Current Count = 9,999,872
2012-12-22 04:47:18|DEBUG|10|RaptorDB.KeyStore`1|| Checking Index state...
2012-12-22 04:47:18|DEBUG|10|RaptorDB.KeyStore`1|| Rebuilding index...
2012-12-22 04:47:18|DEBUG|10|RaptorDB.KeyStore`1|| last index count = 0
2012-12-22 04:47:18|DEBUG|10|RaptorDB.KeyStore`1|| data items count = 9999872
2012-12-22 04:47:18|DEBUG|10|RaptorDB.KeyStore`1|| 100,000 items re-indexed
2012-12-22 04:47:18|DEBUG|10|RaptorDB.KeyStore`1|| 100,000 items re-indexed
Is there anything that I can do to prevent this?
Thanks,
Danny

|
How can we enumerate all keys with foreach?
foreach (var key in storage) {
    var itemContent = raptor.Get(key)....
    // do something with value
    // delete key
}
or
foreach (var item in storage) {
    var itemContent = item.Value;
    // do something with value
    // delete key
}
I want to store logs in RaptorDB and later use Quartz to check whether there are any keys and, if so, copy them to the original DB.
Still amater

|
Hi,
Thank you for this amazing library, I'm currently porting it to Windows CE (certainly will let you know how it helps us)
While reading the code of Helper.CompareMemCmp in SafeDictionary, I noticed that its return value may not be that of a general comparison function. For example, the following comparison reports the two arrays as equal, which is obviously not right. Can you confirm whether this is a bug or a deliberate optimization based on the usage pattern in your code?
byte[] left = new byte[] {1, 2, 3, 4, 5};
byte[] right = new byte[] {1, 2, 3};
int res = Helper.CompareMemCmp(left, right);
res will be 0, while I expected it to be +1!
Regards,
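For reference, here is a minimal sketch of what a general lexicographic comparison of two byte arrays would return for the case above. It is not RaptorDB's Helper.CompareMemCmp; the class and method names are hypothetical, and whether the library deliberately treats a shared prefix as equal is not established in this thread.

```csharp
using System;

// Hypothetical helper, not part of RaptorDB.
public static class ByteArrayComparer
{
    // Lexicographic comparison: the first differing byte decides the result.
    // If one array is a prefix of the other, the shorter array sorts first.
    public static int Compare(byte[] left, byte[] right)
    {
        if (left == null) throw new ArgumentNullException(nameof(left));
        if (right == null) throw new ArgumentNullException(nameof(right));

        int common = Math.Min(left.Length, right.Length);
        for (int i = 0; i < common; i++)
        {
            if (left[i] != right[i])
                return left[i] < right[i] ? -1 : 1;
        }

        // All shared positions are equal, so the lengths break the tie.
        return left.Length.CompareTo(right.Length);
    }
}
```

With the arrays from the comment, `ByteArrayComparer.Compare(left, right)` falls through the loop and compares the lengths 5 and 3, so it returns a positive value rather than 0.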

|
Excellent project. But I have one question: what is the easiest way to get the maximum key from RaptorDB?

|
I want to use RaptorDB with Mono on Linux. It works perfectly with only minimal changes.
The only change needed is to replace the hard-coded path separator "\\" with the .NET field
"Path.DirectorySeparatorChar" in the following files:
BitmapIndex.cs
MGIndex.cs
KeyStore.cs
mylogger.cs
So in a few minutes the RaptorDB is cross-platform ready with MONO. Maybe you can change this in the next version of RaptorDB.
br,
Alex
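A minimal sketch of the kind of change described above, assuming the files build paths by concatenating a folder, "\\" and a file name. The class name, method name and the file name in the usage note are hypothetical and not taken from the RaptorDB source. Path.Combine inserts the platform's separator, so the same code works on Windows and on Mono/Linux.

```csharp
using System.IO;

// Hypothetical helper, not part of RaptorDB.
public static class DbPaths
{
    // Instead of: folder + "\\" + fileName
    // Path.Combine uses Path.DirectorySeparatorChar for the current platform.
    public static string FileIn(string folder, string fileName)
    {
        return Path.Combine(folder, fileName);
    }
}
```

For example, `DbPaths.FileIn("data", "t1.idx")` yields `data\t1.idx` on Windows and `data/t1.idx` on Linux.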

|
I have a scenario where I just need to test for the existence of a key in the index. Here is an addition to the KeyStore.cs file to just return if a key exists.
public bool Exists( T key ) {
    lock ( _lock ) {
        int off;
        return _index.Get( key, out off );
    }
}
Thoughts?

|
Can more processes (on the same host) open and read/write the same database?
Thanks.

|
Why is the maximum CPU usage only 25%, even for reads and writes when running as an exe?
I tested on an SSD, RAID 5 and a RAM disk; all benchmarks take a similar time.
10 million items: ~25 sec set, ~45 sec get
Isn't it supposed to use the full CPU, and why is there no difference between SSD, RAID 5 and RAM disk?
CPU is an i5-2500, quad core, 3.3 GHz
Raptor 2.5 version
Still amater

|
Hi, it's a great, fast database. We want to use it on a moderately loaded service (5000 requests per minute).
We've configured it to save the index every 30 seconds. After a simulated unclean shutdown,
we noticed that the indexes are rebuilt from the beginning, which takes a lot of time.
1. Is it possible to rebuild the index for new data only?
2. Sometimes the system hangs for a long time while rebuilding indexes on startup, without writing anything to the log. The last records are:
2012-10-03 03:22:26|DEBUG|32|RaptorDB.KeyStore`1|| Checking Index state...
2012-10-03 03:22:26|DEBUG|32|RaptorDB.KeyStore`1|| Rebuilding index...
2012-10-03 03:22:26|DEBUG|32|RaptorDB.KeyStore`1|| last index count = 0
2012-10-03 03:22:26|DEBUG|32|RaptorDB.KeyStore`1|| data items count = 2510698
2012-10-03 03:22:26|DEBUG|32|RaptorDB.KeyStore`1|| 100,000 items re-indexed
Can we do something about these problems?

|
Am I doing something wrong? It crashes on EnumerateStorageFile
RaptorDB<int> db1 = RaptorDB<int>.Open("c:\\raptordbtest\\t1", false);
RaptorDB<int> db2 = RaptorDB<int>.Open("c:\\raptordbtest\\t2", true);
db1.Set(1, "s1");
db1.Set(2, "s2");
db1.Set(3, "s3");
db2.Set(2, "log2lab2: str1");
db2.Set(2, "log2lab2: str2");
db2.Set(3, "log3lab3: str1");
db2.Set(3, "log3lab3: str2");
db2.Set(1, "log1lab1: str1");
db2.Set(1, "log1lab1: str2");
foreach (var pair in db2.EnumerateStorageFile())
{
    listBox1.Items.Add(pair.Key.ToString());
}
db1.Shutdown();
db2.Shutdown();

|
Hi, sorry for my bad English.
First, thank you for this work.
The functions FetchRecordString and RemoveKey are not thread safe. If I add a lock, the problem is resolved.
Another scenario seems to be problematic:
I add 100 items, iterate with a for loop, call FetchRecordString, and remove the key. After the iteration, I save the index and call the Count() method, and I receive a count of 200! How is that possible? I removed everything, so the count is supposed to be 0.

|
First, thanks very much for your work!
Second, I have very little knowledge of document-oriented databases, so sorry if I'm asking some obvious questions, but:
1) How do we use the full text search feature? I understand putting the [FullText] attribute on the appropriate field; is there anything else that needs to be done? And is searching done through the same db.Query() function?
2) Also, how would you achieve something like paging over a table of data? Is it possible, e.g., to retrieve just the top 25 results, or results 26-50, etc.?
thanks again,
Jeremy

|
Hello
Thank you for your updates. It's nice to see this project is alive.
Can I add some suggestions?
. EnumerateStorageFile should not return duplicates, or should do so only as an option
(currently it does even when I open the database with AllowDuplicateKeys=false).
. A new function, EnumerateKeys, to obtain the keys only.
Values are sometimes useless for a filter, so this would help performance.
. Publish the enumerate functions on RaptorDBString, not only on KeyStore.
. Add some information to KeyValuePair: IsADuplicate, IsDeleted, etc.
. Add standard documentation comments to the functions and classes (with ///).
It is nice to have some details directly inside Visual Studio (with F12).
. KeyStore.Set should replace (i.e. delete) the previous value.
As of v2.5, I get duplicates when I use EnumerateStorageFile.
. A new function, SetIfDifferent, that calls Set only when the bytes differ from the previous value.
AFAIK Raptor always adds a new key and value, which can cause a lot of duplicates. BTW, I noticed this is true even when the size of the new value equals the size of the old value. Why not reuse the old location in the file?
I know this function would have poor performance, but it can be a choice, depending on what is in the database.
Thank you, and have a nice day.

|
Is there any tool like RaptorDB manager or browser?

|
Great!
But one question: is the license CPOL (CodeProject) or Apache 2 (CodePlex)?

|
Hi,
Nice work and thanks for open sourcing it !
I just wanted to point out that the GET performance numbers appear to be the best case numbers and don't seem to reflect the average performance. When I added a shuffling of the guids between the SET and GET phase with following code :
guids = guids.OrderBy(a => Guid.NewGuid()).ToList();
Then the get performance decreased (about 3 times slower).
Regards,
Samuel
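The shuffle above sorts by a fresh Guid per element, which works but costs a sort. As an alternative sketch for the same randomized GET test, a Fisher-Yates shuffle runs in linear time and, with a seeded Random, gives a repeatable order between benchmark runs. The class and method names are hypothetical and not part of RaptorDB or its test code.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of RaptorDB.
public static class Shuffler
{
    // In-place Fisher-Yates shuffle: walk from the end and swap each item
    // with a randomly chosen item at or before its position.
    public static void Shuffle<T>(IList<T> items, Random rng)
    {
        for (int i = items.Count - 1; i > 0; i--)
        {
            int j = rng.Next(i + 1); // 0 <= j <= i
            T tmp = items[i];
            items[i] = items[j];
            items[j] = tmp;
        }
    }
}
```

Usage would be `Shuffler.Shuffle(guids, new Random(42));` between the SET and GET phases; the seed value is arbitrary.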

|
Article written by Riyad Kalla:
[^]

|
Hi,
I am exploring the option of using RaptorDB for caching purposes. In my application, we will need to trim the cache periodically based on value being outdated, or overtime etc.
I see the RemoveKey() method, but, is there a way to remove a list of keys based on their value? Or, is there a way to remove last x unused keys for y number of days?
Thanks! BTW, v2.4 was very easy to integrate into my application!

|
Hi,
Thanks for this project. Could you please consider adding "record expiration" support in an upcoming release? Redis has this feature. Or maybe some other datetime attribute. I'm currently thinking of storing viewstate (about 400-600 KB per postback) on the server in a NoSQL DB. To keep the DB size on disk small, I was thinking of having a worker thread that deletes all records older than 3 hours (if automatic expiration is not built in).
P.S.: One of the caveats you mention in the article is that data is not deleted. Basically this means that I can't benefit from RaptorDB in that case (to keep disk usage at a minimum), am I right?
Thanks in advance!

|
As far as I can see, RaptorDB is limited to 2^31 records. It would be very nice to upgrade this to a 64-bit value.
Some comments:
- Some array resizing is done without the built-in Array.Resize. I think there can be some performance gain there.
- The Helper class supports Array.Reverse, but it is not used within the project. Some performance gain could be found there.
- The Helper class supports unsafe conversion of basic types, while BitConverter does that too. Removing this makes the footprint slightly smaller and removes some unsafe methods, which is good practice.
- Within the Index file the nodesize = 65536, but it is loaded as an Int16, so only 65536/2 can be used.
- In several places, bitwise operations can replace the often slower modulo (%32 == &31), multiplication (*8 == <<3) or division (/8 == >>3).
- I might be wrong, but the MurMur hashing limits the effective indexing to 2^32, while NoSQL databases are experts in going beyond 2^32. To overcome this limit you can add another Int32 hashing technique and combine MurMur with that other hash to get a UInt64.
I personally use:
internal static uint Hash1(uint a)
{
    a = (a ^ 61) ^ (a >> 16);
    a = a + (a << 3);
    a = a ^ (a >> 4);
    a = a * 0x27d4eb2d;
    a = a ^ (a >> 15);
    return a;
}
internal static uint Hash0(uint a)
{
    a = (a + 0x7ed55d16) + (a << 12);
    a = (a ^ 0xc761c23c) ^ (a >> 19);
    a = (a + 0x165667b1) + (a << 5);
    a = (a + 0xd3a2646c) ^ (a << 9);
    a = (a + 0xfd7046c5) + (a << 3);
    a = (a ^ 0xb55a4f09) ^ (a >> 16);
    return a;
}
They work fine and the code is safe unlike MurMur.
But thanks for the project!
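On the bitwise replacements suggested in the list above (%32 == &31, /8 == >>3): a small sketch showing that in C# these equivalences hold for non-negative operands but not for negative signed ones, so they are only a safe swap where the value is unsigned or known to be non-negative. The class and method names are hypothetical and not part of RaptorDB.

```csharp
// Hypothetical helper, not part of RaptorDB.
public static class BitTricks
{
    // True when the remainder and the bit mask give the same result for n.
    // Holds for n >= 0; fails for most negative n because % keeps the sign.
    public static bool ModMatchesMask(int n)
    {
        return (n % 32) == (n & 31);
    }

    // True when division and arithmetic right shift give the same result for n.
    // Holds for n >= 0; for negative n, / truncates toward zero while >> rounds down.
    public static bool DivMatchesShift(int n)
    {
        return (n / 8) == (n >> 3);
    }
}
```

For example, `-1 % 32` is `-1` while `-1 & 31` is `31`, and `-1 / 8` is `0` while `-1 >> 3` is `-1`. Compilers commonly apply this kind of rewrite themselves for constant power-of-two operands, so measuring before hand-optimizing is probably worthwhile.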

|
Hi, I found a little bug in the EnumerateStorageFile method. You have to flush the data before reading it.
Here is the code:
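public IEnumerable<KeyValuePair<T, byte[]>> Traverse()
{
    // Flush pending writes so the enumeration sees all data.
    if (_flushNeeded)
    {
        _writefile.Flush();
        _recordfile.Flush();
        _flushNeeded = false;
    }
    long offset = 0;
    offset = _fileheader.Length;
    // ... (rest of the method not shown in this excerpt)
}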

|
As a favor could you put some sample code up on how to enumerate through all the records in the db? In other words how exactly do you use the EnumerateStorageFile method?
Thanks and great job!
Irwin

|
That's a huge difference. Of course the allocation profile for the objects is going to be totally different now, so much more will happen stack-side instead of in the heap(s).. have you cracked open a profiler on it to see which operations are the heaviest? (I have a perf profiler, but not a good memory profiler, or I'd try it myself.)
Also, if you don't mind my saying so a little more forcefully, you should add that range query code. It doesn't have to be the code I sent you, but the main benefit of an index is range queries, and so far you do not expose this ability. Since you have the key markers on each block, it's easy to know which block needs to be checked for the end of the range; all the other blocks you can simply enumerate. I'd be happy to help with this part.

|
Mehdi,
Correct me if I'm wrong, but after starting the RaptorDB engine for the first time, my first Get call throws this exception.
A suggestion to correct this is to change IndexFile.cs, line 290 from
SeekPage(number);
to this
SeekPage(number - 1);
This makes the index look at the first page saved on disk. I believe the current version always looks at the second page, which does not exist at first, causing this error. I also believe the first page in the file is never touched; this change would probably correct that too.
Regards

|
RaptorDB looks to be great. Just what I am looking for. Are you considering making it available on Nuget?

|
A very well written and illustrated article that presents a very useful storage tool. Five from me.
Just because the code works, it doesn't mean that it is good code.

|
For a dual quad-core E5420 @2.5GHz
Windows Server 2008
16 GB RAM
15K SAS Drive
I got for the 10,000,000 writes test:
Page Count = 1208
Total Split Time = 6.69499999...
set time = 75.489
get time = 98.648
This is for version 2.2.

|
First, nice piece of work.
When I call the method "RemoveKey", I get a NullReferenceException:
System.NullReferenceException was unhandled
Message=Object reference not set to an instance of an object.
Source=RaptorDB
StackTrace:
at RaptorDB.StorageFile`1.WriteData(T key, Byte[] data, Boolean deleted) in ...\RaptorDB_v2.1\RaptorDB\Storage\StorageFile.cs:line 136
at RaptorDB.RaptorDB`1.RemoveKey(T key) in ...\RaptorDB_v2.1\RaptorDB\RaptorDB.cs:line 308
byte[] hdr = CreateRowHeader(kl, data.Length);
data is a byte array, and RemoveKey passes null for this parameter.

|
I was just looking at SafeDictionary code, and there is a small useless overhead.
When you declare the _Dictionary variable you initialize it, and then in the constructors you initialize it again.
Considering that the constructors may pass parameters to it, it should not be initialized during declaration.

|
Do you have any benchmarks which compare Raptor DB with Redis?

|
I have been reading your article since v1.6. At that time, RaptorDB did not support deleting, and now V2 supports it. You are great!
Please also take a look at freeMDB: http://code.google.com/p/freemdb/
|
|
congratulations

|
On my SSD(Micron M4 128G) + i5 + 6GB
Page Count = 1198, Total Split Time = 5.60031889999999
set time = 74.7922778
get time = 57.2912769
Sounds not as fast as I expected?
2012-01-28 08:06:01|DEBUG|1|RaptorDB.RaptorDB`1|| Current Count = 0
2012-01-28 08:06:01|DEBUG|1|RaptorDB.RaptorDB`1|| Checking Index state...
2012-01-28 08:06:01|DEBUG|1|RaptorDB.RaptorDB`1|| Starting save timer
2012-01-28 08:08:14|DEBUG|1|RaptorDB.RaptorDB`1|| Shutting down
2012-01-28 08:08:14|DEBUG|1|RaptorDB.RaptorDB`1|| saving to disk
2012-01-28 08:08:14|DEBUG|1|RaptorDB.MGIndex`1|| Total split time (s) = 5.60031889999999
2012-01-28 08:08:14|DEBUG|1|RaptorDB.MGIndex`1|| Total pages = 1198
2012-01-28 08:08:19|DEBUG|1|RaptorDB.RaptorDB`1|| index saved
2012-01-28 08:08:19|DEBUG|1|RaptorDB.IndexFile`1|| Shutdown IndexFile
2012-01-28 08:08:19|DEBUG|1|RaptorDB.BitmapIndex|| Shutdown BitmapIndex
2012-01-28 08:08:19|DEBUG|1|RaptorDB.RaptorDB`1|| Shutting down log
Regards,
unruledboy_at_gmail_dot_com
http://www.xnlab.com

|
Great article, as usual

|
Really interesting article that covers many performance aspects. Thx!
I have some questions:
* RemoveKey seems to have a problem, because the WriteData method it calls reads data.Length on a null data (CreateRowHeader(kl, data.Length);). I changed this to CreateRowHeader(kl, data != null ? data.Length : 0);. I'm not so sure about this.
* I was planning on writing a defrag method by creating a temporary RaptorDb, copying all values (last value for each key, skipping duplicates), and finally reopening a RaptorDb instance from these files. I don't know how to do the copy step:
static RaptorDB<T> DefragmentDataBase<T>(string fileName, RaptorDB<T> raptorDb) where T : IComparable<T>
{
    string path = Path.GetDirectoryName(fileName);
    string dbName = Path.GetFileName(fileName);
    string tempFileName = fileName + "Tmp";
    string tempDbName = Path.GetFileName(tempFileName);
    DeleteDataBase(tempFileName);
    var tempDb = new RaptorDB<T>(tempFileName, false);
    // The copy of the last value for each key into tempDb would go here.
    raptorDb.Dispose();
    tempDb.Dispose();
    DeleteDataBase(fileName);
    // Rename each temporary file to the original database name, keeping its extension.
    Directory.GetFiles(path, tempDbName + ".*").ToList().ForEach(f => File.Move(f, Path.Combine(path, dbName + Path.GetExtension(f))));
    return new RaptorDB<T>(fileName, false);
}
|
Even faster Key/Value store nosql embedded database engine utilizing the new MGIndex data structure with MurMur2 Hashing and WAH Bitmap indexes for duplicates. (with MonoDroid support)
| Type | Article |
| Licence | CPOL |
| First Posted | 18 Jan 2012 |
| Views | 130,581 |
| Downloads | 3,660 |
| Bookmarked | 125 times |