Click here to Skip to main content
Click here to Skip to main content

Hash Tables as an Efficient Means of Search

, 16 Nov 2011
Rate this:
Please Sign up or sign in to vote.
An implementation of hash tables as a means of fast lookup in C++.

Introduction

When I was once involved in the design of a routing optimization software for a telecom consultancy, a required component was a mechanism for finding least cost routes for traffic demands between communicating network sites. In other words, an efficient means of searching for the best tariffs provided by all of the telecom carriers.

There are a huge number of tariffs for routing different types of telephone calls, be they geographic, non-geographic, mobile, premium rate, freefone, international, etc., which are frequently subject to change. Of concern was the problem of creating a mapping between strings representing telephone numbers and their associated tariffs in ways that were accurate and quick.

When looking up tariffs for many thousands of numbers on large networks, a sequential search for anything but trivial examples is impractical for a commercial least-cost routing package. Binary searches on sorted lists were investigated and were a big improvement, giving a good performance even on very large lists, akin to an average of O( log n ). It was also realized that hash tables are among the fastest means of lookup, usually executing in constant time O( 1 ).

About Hash Tables

A hash table is a software data structure based on the idea of using an indexing lookup table to significantly accelerate the mapping. Given that the telephone numbers are usually represented by strings and the tariffs represented by numbers, we cannot directly use the strings to index an array. Instead we calculate the string’s index value using a special hash function, and use this index value to access the array containing the pre-defined tariffs that we want.

Real-world Hashing

In reality, perfect hashing is less than straightforward to implement. What usually happens is that the hashing function will sometimes map two or more different strings on to the same index value, known as collisions. To get around this, our hash table needs to map each string onto a linked list of potential choices, not just a single one, so that by searching the candidate linked list we can find the string we are after, as well as the tariff value it contains. When using sequential searches (for loops), the average number of comparisons needed to find what we are after is n/2. When using hash tables, the average number of comparisons required is one, plus the one hash function calculation.

Defining a Hash Table Class

Here is the definition of the class to implement the hash table. The hash table itself is an indexed array of linked lists. The majority of these linked lists will contain none or one entry. When collisions occur with more than one string getting hashed to the same index value, the entry gets added to the same linked list via the Add function:

class HashTable
{
private:
    int hash( const char* str ) const;
    LinkedList list[ TableSize ];
    LinkedList const & FindList( const char* str ) const;

public:

    void Add( const char* str, int id );
    bool Get( const char* str, int& value );
};

Here is the choice of hash function, which is based on a shift-and-add algorithm as a means of randomizing the strings.

// Hashing function to convert string into integer index
int HashTable::hash( const char* str ) const
{
    assert( str != NULL && str[ 0 ] != NULL );
    unsigned h = str[ 0 ];

    for ( int i = 1; str[ i ] != NULL; ++i )
    {
        h = (h << 4) + str[ i ];
    }

    // Return remainder
    return h % TableSize;
}

To find the appropriate linked list in the hash table, it is a simple matter of calculating the hash value and using this value to index the correct linked list:

// Return the linked list containing one or more items
LinkedList const & HashTable::FindList( const char* str ) const
{
    int h = hash( str );
    return list[ h ];
}

Example Usage

Here is some code outlining how to use the HashTable APIs to add new items and search for items using the hashing algorithm:

HashTable hTable;  
int value = 0;    
      
// Add some entries  
hTable.Add( "01316530969", 11 );  
hTable.Add( "01314405220", 12 );  
hTable.Add( "02920591318", 10 );  
hTable.Add( "02920565967", 14 );  
      
// Get the value from the return linked list  
bool success = hTable.Get( "01314405220", value );

The Visual Studio project is also downloadable from here.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Share

About the Author

AndyUk06
Software Developer http://technical-recipes.com
United Kingdom United Kingdom
Software Engineer who has worked on projects in telecommunications, industrial robotics and image processing.
 
My home page is http://technical-recipes.com
Follow on   Twitter

Comments and Discussions

 
QuestionFault Pinmemberjohn_china29-Jun-12 21:52 
GeneralRe: Fault PinmemberAndyUk061-Jul-12 10:01 
GeneralRe: Fault [modified] PinmemberAndyUk061-Jul-12 10:07 
Generalcollision [modified] Pinmemberjohn_china29-Jun-12 16:24 
GeneralRe: collision PinmemberAndyUk0629-Jun-12 21:43 
GeneralRe: collision Pinmemberjohn_china30-Jun-12 0:14 
QuestionWhere's the beef? Pinmembernv316-Nov-11 2:34 
AnswerRe: Where's the beef? PinmemberAndyUk0616-Nov-11 4:28 
GeneralRe: Where's the beef? PinmemberOscar Caulfield16-Nov-11 6:23 
SuggestionNice attempt PinmemberShakti Misra15-Nov-11 22:49 
Attempt is good always. But here are a few concerns/suggestions:
There is no provision for chaining.
You can take a look at other hash function libraries, like boost's hash (if you are using other boost components).
It's always better to use some other hash function library and base your code on that if time and cost permits. Like http://libhashish.sourceforge.net/[^]
or std::tr1::hash or http://cmph.sourceforge.net/[^]
Over this you can build your library as a combination of policies.
GeneralRe: Nice attempt [modified] PinmemberAndyUk0616-Nov-11 3:52 
GeneralRe: Nice attempt PinmemberShakti Misra16-Nov-11 6:55 

General General    News News    Suggestion Suggestion    Question Question    Bug Bug    Answer Answer    Joke Joke    Rant Rant    Admin Admin   

Use Ctrl+Left/Right to switch messages, Ctrl+Up/Down to switch threads, Ctrl+Shift+Left/Right to switch pages.

| Advertise | Privacy | Mobile
Web01 | 2.8.140827.1 | Last Updated 16 Nov 2011
Article Copyright 2011 by AndyUk06
Everything else Copyright © CodeProject, 1999-2014
Terms of Service
Layout: fixed | fluid