Click here to Skip to main content
Click here to Skip to main content

Implementing Ordering in a Hashtable

By , 20 Jun 2002
 

Contents

Introduction

Hashtables are a convenient choice when it comes to searching, particularly because in the collection, an element is stored depending upon the information in the element itself. This information is extracted in an (almost) exclusive way by use of hash-functions. While hashtables are a useful data structure, they do lack one thing - the order in which elements are stored in a hashtable is not preserved. That means, given a Hashtable, I can get an Iterator on the set of Hashtable keys, (by keySet() method) or I can get an Enumeration (by keys() method). But none of the two is guaranteed to retrieve elements in the same order as they were put in. This is plausible because Hashtables have generally been used more for their hashing capability rather than anything else.

Why Ordering

In our project, we had a situation where the results returned by an SQL query was stored in a Hashtable (so that it could be searched easily later). But at the same time, we needed to preserve the ordering returned by SQL, since ordering at database side is much faster than doing it programmatically. Moreover, database can handle multiple and complex sort orders quite well (using the order by clause). Here was a situation where we needed to implement ordering capabilities into a Hashtable.

The Algorithm

The obvious place to look was into the source code of the Java programming language, which gave us Hashtable.java. The solution came up right in the form of writing another class - OrderedHashtable.java which uses some techniques like inner classes.

The algo was simple: Take a Hashtable and a Vector. Whenever you put the objects into the Hashtable, put the corresponding keys in the Vector. Since Vector is ordered, Enumerations or Iterators on it are guaranteed to traverse the keys in the same order as they were put in. So if later you need to iterate through all the objects in the Hashtable (in the order they were put in), just take hold of the Iterator of the Vector and get the corresponding objects from the Hashtable. There you have an OrderedHashtable! Of course it still has the hashing capabilities of a Hashtable, and you can of course retrieve an object directly if you have its key (same way as in a Hashtable). Only new thing is that, Iterator (or Enumeration) is guaranteed to be ordered.

Some Issues to be Careful of!

Of course, there are some important things to take care of, like synchronization. Since items can be added as well as removed from OrderedHashtable, we need to make sure that the internal Vector and Hashtable of the class remains thread-safe. If two threads access it at the same time and one adds and other deletes the same object, it could leave the Hashtable in an arbitrary, indeterminate state, where either the Vector or the Hashtable or the internal COUNT of number of elements (or all of the three) may be incorrect. This calls for certain sections of code to be synchronized.

Similarly, there is another important issue of overflow. Both Hashtable and Vector are resizable, but the way their size increases is different. In Hashtable, we have a loadfactor which specifies how full the Hashtable is allowed to become before the size increases (by calling the rehash() method). Whenever the number of entries in a Hashtable exceeds the product of loadfactor and current capacity, the capacity is increased by twice the current capacity. Default values of initial capacity and loadfactor for Hashtable are 10 and 0.75 respectively. So in the default case, when the 9th object is put in, the Hashtable is doubled to 20.

The Vector however operates differently. In a Vector, you can specify a capacityIncrement value. Whenever the Vector becomes full, its capacity is incremented by capacityIncrement (or doubled in case the capacityIncrement is 0). Default values of initialcapacity and capacityIncrement for a Vector are 10 and 0. Here in the default case, the size grows only when the 11th object is put in and the size grows to 20. Again when 21st object comes in, size is doubled to 40 and so on. One would need to be careful in put() and get() methods to handle this correctly. Other issues like deletion etc, can also be handled easily with slight care.

Implementation

Now comes the magic of inner classes. Iterators and Enumerations are implemented as inner classes. (Obviously, how else?). When either Enumeration (enumerateKeys() method) or an Iterator (iterateKeys() method) is called, it is this inner class which iterates on a private Vector and returns the corresponding object from the Hashtable and makes sure to keep it within error limits.

Performance

A final thought that comes to mind immediately after this is - "What is the performance overhead of implementing this ordering?" Since we are using a Vector, which is known to be one of the slow performing classes among all Java classes, we do expect our OrderedHashtable to be slower than Hashtable. I have included two small test files apart from the source code of OrderedHashtable, in the download package. The code is well commented, and I hope it is easily understood. While one of them (Test.java) establishes the fact that ordering is properly implemented (after all that is what this article is about!) the other file (Performance.java) does some basic performance testing. It populates a Hashtable and an OrderedHashtable with one million objects and compares the time spent in accessing them. While Test.java prints the items retrieved to the standard output stream to show that they are indeed in the same order as they were inserted, Performance.java doesn't print the objects themselves. There are two reasons for this - one, I don't want to be flooding the screen with two million lines, and second (more importantly) the System.out.println() will itself take some time, whereas we just want to compare the time spent in accessing the elements of the Hashtables.

On my system, a number of trial runs indicate that for very large values (like 1 million), the time required by both are almost same (31 milliseconds). This is surprising because I had expected the Vector to slow down the OrderedHashtable quite a bit. Yes, once in a while, the OrderedHashtable takes twice the time as that of Hashtable, but such instances are rare. You are welcome to test-run it on your system. Your mileage may vary because of the system, the system settings and the current load on it. If there are any issues, comments, suggestions, or new findings, you are most welcome to enlighten me and other fellow programmers at CP.

Final Notes

The source code presented here implements most of the methods of java.util.Hashtable. But there are a few methods which have been left out, some because they are not in line with the concept of ordering (e.g. those which return a Set), and some because of their being declared as protected or private in the underlying Hashtable class (e.g. rehash()). However, enough methods have been implemented to provide the full functionality of a Hashable (for almost all situations) while maintaining ordering at the same time. But if you feel that it should be having some other methods also, which could improve the functionality/usefulness of the OrderedHashtable, then you are most welcome to provide any feedback and discuss it with the CP community. :-)

Thanks for reading the article. Have a great day!

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.

A list of licenses authors might use can be found here

About the Author

Animesh Srivastava
Web Developer
India India
Member
C/C++ Win32 developer, working on Java! Wink | ;)

Sign Up to vote   Poor Excellent
Add a reason or comment to your vote: x
Votes of 3 or less require a comment

Comments and Discussions

 
You must Sign In to use this message board.
Search this forum  
    Spacing  Noise  Layout  Per page   
GeneralNice idea but I was doing some search...memberOlivier Voutat1 Oct '07 - 4:19 
Did you read something about LinkedHashSet and LinkedHashMap?
GeneralNice... but here's something a bit cleanermemberwlburgess7 Jan '06 - 11:16 
First of all, thank you to Animesh for his solution, and getting our creative juices flowing by presenting an interesting solution to a pre 1.4 problem.
 
I agree with the comments that this class is heavy on space usage, but time-wise the performance is not bad (the trade-off here is space).
For small usages, its not bad, and it still has merit as thread-safe even with the existance of LinkedHashMap (needs help being thread safe) since 1.4.2.
 
One thing I noticed was that the class was not taking full use of inheritance and had not only extended the Hashtable class but contained a Hashtable object...
 
So I shortened things a bit and made calls to non-overided methods work at the same time.
 
There's lots of ways to do something similar but this is still of interest.
 
Here's the updated code:
import java.util.*;
 

/* This class is used to combine the key-value lookup capabilities of a
Hashtable along with order preserving capabilities of a Vector.
Iterator on a Set of Hashtable keys, obtained by keySet() or an
Enumeration obtained by keys() method, both are not guaranteed to
iterate in the same order as the values were put in.
 
This class behaves like a queue, (FIFO). Objects are returned in the
same order they were put in.
 
@author Animesh Srivastava
@author Wayne Burgess
*/
 
// friendly only
public class OrderedHashtable extends Hashtable {
 
//member variables
private Vector mSerialOrder;
 

/** Public Constructor */
public OrderedHashtable()
{
super();
this.mSerialOrder = new Vector();
}
 

/** Clears this OrderedHashtable so that it has no keys.
*
* @exception UnsupportedOperationException - clear is not supported by
* the underlying Interface java.util.Map.
*/
synchronized public void clear() throws UnsupportedOperationException
{
super.clear();
this.mSerialOrder.clear();
}
 

/** Removes the key (and its corresponding value) from this OrderedHashtable.
* Does nothing if key is not in the OrderedHashtable.
*
* @param key - the key that needs to be removed.
* @returns the value to which the key had been mapped in this OrderedHashtable,
* or null if the key did not have a mapping.
*/
synchronized public Object remove(Object key)
{
this.mSerialOrder.remove(key);
return super.remove(key);
}
 

/** Maps the specified key to the specified value in this OrderedHashtable.
* Neither the key nor the value can be null. If the key already exists
* then the ordering is not changed. If it does not exists then it is added
* at the end of the OrderedHastable.
*
* @param key - the key.
* @param value - the value.
* @exception - NullPointerException, if the key or value is null.
* @returns the previous value of the specified key in this hashtable, or
* null if it did not have one.
*
*/
synchronized public Object put(Object key,Object value) throws NullPointerException
{
Object toReturn = super.put(key,value);
if(toReturn == null)
this.mSerialOrder.add(key);
return toReturn;
}
 
/** Returns an Iterator to iterate through the keys of the OrderedHashtable.
* Iteration will occur in the same order as the keys were put in into the
* OrderedHashtable.
*
* The remove() method of Iterator interface is optional in jdk1.3 and hence
* not implemented.
*/
public Iterator iterateKeys() {
return new Enumerator();
}
 

/** Returns an Enumeration to enumerate through the keys of the OrderedHashtable.
* Enumeration will occur in the same order as the keys were put in into the
* OrderedHashtable.
*
*/
public Enumeration enumerateKeys() {
return new Enumerator();
}
 

/** Returns a string representation of the OrderedHashtable. */
public String toString()
{
StringBuffer s = new StringBuffer();
s.append("{ ");
Object key=null;
int i=0;
while(i
Generali need codes for a projectmemberaaku21 Dec '05 - 6:46 
hello friends
i need a code for hast table encryption. please send me the code to the mail
ferrai_20@yahoo.com

 
aaku
GeneralNice onememberManish Hatwalne10 Sep '02 - 22:02 
WIth JDK 1.4 you have the built in class - LinkedHashMap to implement this.
 
- Manish
GeneralA bit crazymemberMike Asher22 Jun '02 - 7:58 
If you want a data structure with the lookup characteristics of a hash table, but with gauranteed iteration order, then you probably want a B tree, red/black tree, or something similiar. It will will consume a lot less memory and be far faster than the dual structure listed above.
 

GeneralRe: A bit crazysussAnonymous22 Dec '03 - 4:42 
Would a TreeMap not be better possibly with a custom Comparator?
GeneralRe: A bit crazymembernewkidtown8 Sep '04 - 12:16 
It won't be faster as a sorted tree has access times of log(n) whereas a Hashtable has constant times. 1.4's LinkedHashMap is the answer
GeneralIncorrectmemberMike Asher10 Sep '04 - 8:33 
No, you're not correct. First of all, the author's structure isn't a hashtable-- its a compound hash/vector. The insert and particularly delete times will be far, far slower than any tree-based structure. If you need multithreaded access, it's slower still, since you'll need to lock both structures to guarantee synchronization.
 
Furthermore, this compound structure doesn't allow arbitary ordering...the "ordering" is by insert order only. If you want to iterate or order by key value (the normal case), then you'd need to sort the vector portion of the data structure...and scan it for every access-- at n/2 cost!
 
The actual key-value access time *could* be faster for this structure. If you assume a "good" hashing algorithm, a low collision rate on hash keys, AND you ignore the setup time for the vector and hash table itself. In practice, a tree would have faster access times except in the case of a well-behaved dataset larger than a million rows or so...and even in that case, it would be close. Throw in even a small percentage of inserts or deletes, and your performance dives way below the results for a tree.
 
Finally, the memory cost of the hash table (almost always more memory intensive than a tree) PLUS duplicating all the keys in a vector, makes the memory footprint of this solution prohibitive for large datasets...the very case in which they might actually have a chance to be faster! And all for what? Just to be able to iterate in the order you performed inserts?
 
In summary, this is an **extremely** specialized structure. While I can certainly conceive of some cases where it would outperform a tree, in 99.9% of all real-world cases, its going to do far worse.

GeneralcoolmemberRishabh21 Jun '02 - 5:04 
its cool man, useful too.. But that is not what i liked. The good part is that its implemented neatly.

GeneralRe: coolmemberJamesCook20 Aug '03 - 3:27 
Actually, it's not all that cool. It extends Hashtable but declares another one. The idea is good, but I had to rewrite it before using:
 
import java.util.Enumeration;
import java.util.Hashtable;
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.Vector;
import java.util.Map;
 
/**
* This class is used to combine the key-value lookup capabilities of a
* Hashtable along with order preserving capabilities of a Vector.
* Iterator on a Set of Hashtable keys, obtained by keySet() or an
* Enumeration obtained by keys() method, both are not guaranteed to
* iterate in the same order as the values were put in.
*
* This class behaves like a queue, (FIFO). Objects are returned in the
* same order they were put in.
*
* @author Animesh Srivastava
* @author James Cook
*/
public class OrderedHashtable extends Hashtable {
      //member variables
      private Vector _serialOrder;
 
      //----------------------------------------------------------
      // constructors
      //----------------------------------------------------------
 
      /** Public Constructor */
      public OrderedHashtable() {
            super();
            _serialOrder = new Vector();
      }
 
      //----------------------------------------------------------
      // overridden methods
      //----------------------------------------------------------
 
      /**
      * Clears this OrderedHashtable so that it has no keys.
      *
      * @exception UnsupportedOperationException - clear is not supported by
      * the underlying Interface java.util.Map.
      */
      synchronized public void clear() throws UnsupportedOperationException {
            super.clear();
            _serialOrder.clear();
      }
 
      /**
      * Removes the key (and its corresponding value) from this OrderedHashtable.
      * Does nothing if key is not in the OrderedHashtable.
      *
      * @param key - the key that needs to be removed.
      * @returns the value to which the key had been mapped in this OrderedHashtable,
      *                              or null if the key did not have a mapping.
      */
      synchronized public Object remove(Object key) {
            _serialOrder.remove(key);
            return super.remove(key);
      }
 
      /**
      * Maps the specified key to the specified value in this OrderedHashtable.
      * Neither the key nor the value can be null. If the key already exists
      * then the ordering is not changed. If it does not exists then it is added
      * at the      end of the OrderedHastable.
      *
      * @param key - the key.
      * @param value - the value.
      * @throws NullPointerException if the key or value is null.
      * @returns the previous value of the specified key in this hashtable, or
      *                              null if it did not have one.
      */
      synchronized public Object put(Object key, Object value) throws NullPointerException {
            Object toReturn = super.put(key, value);
            if (toReturn == null) {
                  _serialOrder.add(key);
            }
            return toReturn;
      }
 
      //----------------------------------------------------------
      // public methods
      //----------------------------------------------------------
 
      /**
      * Returns an Iterator to iterate through the keys of the OrderedHashtable.
      * Iteration will occur in the same order as the keys were put in into the
      * OrderedHashtable.
      *
      * The remove() method of Iterator interface is optional in jdk1.3 and hence
      * not implemented.
      *
      * @return the Iterator of keys.
      */
      public Iterator iterateKeys() {
            return new Enumerator(true);
      }
 
      /**
      * Returns an Iterator to iterate through the values of the OrderedHashtable.
      * Iteration will occur in the same order as the values were put in into the
      * OrderedHashtable.
      *
      * The remove() method of Iterator interface is optional in jdk1.3 and hence
      * not implemented.
      *
      * @return the Iterator of values.
      */
      public Iterator iterateValues() {
            return new Enumerator(false);
      }
 
      /**
      * Returns an Enumeration to enumerate through the keys of the OrderedHashtable.
      * Enumeration will occur in the same order as the keys were put in into the
      * OrderedHashtable.
      *
      * @return the Enumeration of keys.
      */
      public Enumeration enumerateKeys() {
            return new Enumerator(true);
      }
 
      /**
      * Returns an Enumeration to enumerate through the values of the OrderedHashtable.
      * Enumeration will occur in the same order as the values were put in into the
      * OrderedHashtable.
      *
      * @return the Enumeration of values.
      */
      public Enumeration enumerateValues() {
            return new Enumerator(false);
      }
 
      /**
      * Returns a string representation of the OrderedHashtable.
      *
      * @return a String representation of this class
      */
      public String toString() {
            StringBuffer s = new StringBuffer();
            s.append("{ ");
            Object key = null;
            int i = 0;
            while (i < _serialOrder.size()) {
                  key = _serialOrder.elementAt(i++);
                  s.append(key);
                  s.append("=");
                  s.append(get(key));
                  s.append("; ");
            }
            s.append(" }");
            return s.toString();
      }
 
      //----------------------------------------------------------
      // inner class
      //----------------------------------------------------------
 
      private class Enumerator implements Enumeration, Iterator {
            private int _count = _serialOrder.size(); //number of elements in the Vector
            private int _serial = 0; //keep track of the current element
            private boolean _keys;
 
            public Enumerator(boolean keys) {
                  _keys = keys;
            }
 
            public boolean hasMoreElements() {
                  return _serial < _count;
            }
 
            public Object nextElement() {
                  synchronized (OrderedHashtable.this) {
                        if ((_count == 0) || (_serial == _count)) {
                              throw new NoSuchElementException("OrderedHashtable Enumerator");
                        }
                        if (_keys) {
                              return _serialOrder.elementAt(_serial++);
                        } else {
                              return get(_serialOrder.elementAt(_serial++));
                        }
                  }
            }
 
            public boolean hasNext() {
                  return hasMoreElements();
            }
 
            public Object next() {
                  return nextElement();
            }
 
            //optional in jdk1.3
            public void remove() {
            }
      }
 
}

General General    News News    Suggestion Suggestion    Question Question    Bug Bug    Answer Answer    Joke Joke    Rant Rant    Admin Admin   

Permalink | Advertise | Privacy | Mobile
Web01 | 2.6.130516.1 | Last Updated 21 Jun 2002
Article Copyright 2002 by Animesh Srivastava
Everything else Copyright © CodeProject, 1999-2013
Terms of Use
Layout: fixed | fluid