
Persistent Hash Table

27 Oct 2011
A key-value store

Introduction 

I was inspired by the NoSQL movement and wanted to write a key-value store myself. After looking at a few very good open source C++ NoSQL databases such as MongoDB and LevelDB, I decided to start with a minimal set of goals that I could complete in a weekend. This article is the result. I hope it will be helpful to anyone who wants to develop a key-value store as a beginner.

Background

There are many ways to create a table and access it; each has its own advantages and disadvantages. Here, I use a hash function to locate a record in a file. Other methods include maintaining index tables, sequential access, B+ trees, persistent pointers, etc.

Using the Code

The logic behind the persistent hash table is simple. Once the number of buckets, the bucket size, and the record size are fixed, storing and retrieving records is straightforward. To write or read a record, the key is passed to the hash function, and the bucket's file offset is derived from the hash value. We then seek to that offset and write or read the record. Before writing, a check is made whether a record is already present at that location; if it is (a collision), we advance to the next record slot in the bucket and write there.

  • CHash(long nBuck, long bucksize, long recsize): an overloaded constructor taking the number of buckets, the bucket size, and the size of each record. With this constructor, the key must be of type long.
  • CHash(long nBuck, long bucksize, long recsize, unsigned long (*uh)(void *)): another constructor that also accepts a user-defined hash function pointer, so the key can be a user-defined type.
  • createdb: creates the hdb file with the given name, after which writes and reads can be made.
  • writedb: opens the file for writing.
  • readdb: opens the file for reading.
  • closedb: closes the file handle.
  • writerec: has two overloads, one for a long key and one for a user-defined key type. It writes the user-defined data into the bucket; if a record is already present (i.e., a collision occurred), it advances to the next record slot until it finds a vacant place to write.
  • readrec: has two overloads, one for a long key and one for a user-defined key type. It reads the record at the bucket offset; if the value found is not the one searched for, readnextrec can be used to fetch the next record, passing the offset returned by the previous readrec call.
  • readnextrec: as mentioned above, reads consecutive records in the same bucket, given the offset from a readrec call.
  • writeoffset, readoffset, GetbucketOffset: helper functions to seek around the file for writing and reading.
int CHash::GetbucketOffset(long buckno) // returns the file offset of a bucket
{
    return buckno * m_bucketsize;
}
int CHash::writerec(void* key, void *data)
{
    long buckno = GetbucketOffset(hash(key));
    void *temp = malloc(m_recsize);
    while (0 != readoffset(buckno, temp, m_recsize))
    {
        buckno += m_recsize; // slot occupied (collision), probe the next record
    }
    free(temp);
    writeoffset(buckno, data, m_recsize);
    return 0;
}
int CHash::writerec(long key, void *data) // overload for a long key
{
    long buckno = GetbucketOffset(hash(key));
    void *temp = malloc(m_recsize);
    while (0 != readoffset(buckno, temp, m_recsize))
    {
        buckno += m_recsize; // slot occupied (collision), probe the next record
    }
    free(temp);
    writeoffset(buckno, data, m_recsize);
    return 0;
}
int CHash::readrec(long key, void *data, long &offset)
{
    long buckno = GetbucketOffset(hash(key));
    readoffset(buckno, data, m_recsize);
    offset = buckno; // returned so readnextrec can continue from here
    return 0;
}
int CHash::readrec(void* key, void *data, long &offset)
{
    long buckno = GetbucketOffset(hash(key));
    readoffset(buckno, data, m_recsize);
    offset = buckno; // returned so readnextrec can continue from here
    return 0;
}
int CHash::readnextrec(long offset, void *data)
{
    offset += m_recsize; // advance to the next record in the same bucket
    readoffset(offset, data, m_recsize);
    return 0;
}

Testhash.cpp provides a demo of how to use this class.

Limitations

In a future article, I may remove the need to pre-define the bucket and table sizes, either by building the tables with another class or by using a different technique. Another limitation is that if sufficient space is not allocated to a bucket, an overflowing bucket will probe into its neighbour's slots and data corruption can occur. All of these issues stem from the fixed sizes.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


About the Author

Dharmateja Challa
Software Developer (Senior) EMC
India India
I have worked in the ESM, storage, and embedded domains, mainly taking part in design, coding, and maintenance in C/C++ on Windows.
For fun I play Counter-Strike: GO and read books.
