Click here to Skip to main content
Click here to Skip to main content

STL containers map, set, and list with fast array access

By , 5 Aug 2002
 
<!-- Link to source file download --> <!-- Add the rest of your HTML here -->

Introduction

This source code contains five STL compliant container classes with the following names:

index_list<T>
index_set<T,Cmp>
index_multiset<T,Cmp>
index_map<Key,T,Cmp>
index_multimap<Key,T,Cmp>

Each of these implements the same interface as the STL class in the second part of its name. In addition, each one also implements the STL interface for the vector<T> class, providing fast array access via operator[] and at(). T is the type of the contained class. Key is the type of the lookup key for pair-associative containers. Cmp is the comparison predicate and must be a binary predicate with arguments of type T for set and type Key for map. If you do not provide a comparison predicate, it will default to std::less<Arg>.

The classes are all derived from an AVL tree that I borrowed form Adreas Jager (copyright included) and modified to allow for the indexed access. Each tree node contains an extra long integer that holds the number of child nodes below it. The algorithm to update these counts does not add to the complexity of any of the original tree algorithms. Since the interfaces are documented in the STL, I will simply provide the complexity analysis below. I believe I have covered the most-used functions of these containers, but some functions may not be included. Please contact me if you wish to add another STL function and I will be glad to update the source.

Complexity Analysis

index_list<T>

An index_list is a sequence which combines the interfaces of array, deque, and list.

Here are the complexity comparisons with other sequence containers:

array deque list index_list
Indexed access 1 1 (N) lgN
Insert/delete N N 1 1+
Front insertion(N) 1+ 1 1+
Back insertion 1+ 1+ 1 1+

It tends to work best for large sets (N >> 1).

It also has the extra function insert(const T&), usually used for sets, which chooses an insert position that keeps the tree balanced with a minimal number of rotation operations.

index_set<T,Cmp>
index_multiset<T,Cmp>
index_map<Key,T,Cmp>
index_multmap<Key,T,Cmp>

These classes implement the STL interfaces for set, multiset, map, and multimap respectively, along with the indexed access (operator[], at()). Here are the complexity comparisons with other associative containers:

arraylistsetindex_set/map etc.
Indexed access 1 (N) (N+) lgN
Insert/delete N 1 1+ 1+
Search lgNN lgN lgN

Enjoy!

History

21 July 2002 - updated downloads

6 Aug 2002 - updated downloads

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

About the Author

Ray Virzi

United States United States
No Biography provided

Sign Up to vote   Poor Excellent
Add a reason or comment to your vote: x
Votes of 3 or less require a comment

Comments and Discussions

 
You must Sign In to use this message board.
Search this forum  
    Spacing  Noise  Layout  Per page   
GeneralMy vote of 5memberDemonS7728-Oct-10 0:13 
i need this!
Questionis this still supportedmembermaciekboni20-Jun-09 5:37 
Hello, just wondering if these classes are still being supported. I just tried to compile with g++ version 4.2.3 on kubuntu 8.04, and I got some errors. The first few are:
 
index_set/iterator_base.h:42: error: to refer to a type member of a template parameter, use ‘typename I::value_type’
index_set/iterator_base.h:42: error: to refer to a type member of a template parameter, use ‘typename I::distance_type’
index_set/iterator_base.h:75: error: type ‘I’ is not derived from type ‘stl_bidirectional_iterator<I>’
 

Thanks.
AnswerRe: is this still supportedmemberRay Virzi20-Jun-09 8:18 
I am not uploading new version, but you can fix this issue by adding the "typename" parameter in front of the class type definitions as reported for line 42. That may get rid of the one on 75. There may be several other places where "typename" must be added.
 
I compiled on MSVC which does not enforce the typename rule which I believe is required by ANSI.
GeneralRe: is this still supportedmembermaciekboni30-Jun-09 18:54 
Thanks Ray. I actually wasn't able to get it to compile as I'm not well versed in typenames, class templates, and the other tools I need to get this to work. ( I also got an error about an expected "nested-name-specifier" ). I mainly use STL classes, but don't add anything into the implementation.
 
Basically, I need a data structure that is an STL set, but where I can select elements at random. So, your index_set would be perfect. Can you recommend another implementation that compiles readily in Ubuntu? Or an implementation of a red-black tree that allows random selection of elements that compiles readily on Ubuntu?
 
Thanks very much.
GeneralRe: is this still supportedmemberRay Virzi30-Jun-09 19:45 
Sorry, never saw another container like you describe. I think you should get a friend who knows STL pretty well and have him/her make the correction. It should be easy. Wherever you see a class definition that has "typedef X: ", you replace it with "typedef typename X: ". For example:
 
typedef T:iterator_type iterator_type
 
should be changed to:
 
typedef typename T:iterator_type iterator_type
 
I hope that helps. I would be a shame not be able to use the class just because of simple compiler error. I'm sorry I did not compile under ANSI restrictions myself.
 
Ray Virzi
GeneralRe: is this still supportedmembermaciekboni1-Jul-09 0:31 
Thanks Ray. I still can't get it to compile. I realize that you're not maintaining this anymore, but if for any reason you end up modifying it to compile on Linux/Ubuntu, please let me know.
 
Best Regards,
Maciek
GeneralI'm stumpedmemberJörgen Sigvardsson8-Nov-04 12:57 
I just can't understand why one would want to map linear structures onto binary search trees. I can understand if one wants to track the order items were put into a tree, but as far as I can tell, that's not possible using these structures.
 
Search trees are used when there is a functional relation between the domain and range (key/value pairs), and linear structures when there exists no such relation.
 
If I understand the code correctly, an index into a tree is not fixed. By that I mean tree[1] may not necessarily be the same value as tree[1] after an insertion or deletion of a value in the tree due to rotations. That doesn't seem too advantageous to me. It looks like a linear structure, but doesn't behave like one.
 
I cannot see what problems indexed access to tree structures can solve. Am I missing something here? Confused | :confused:
 
--
Weiter, weiter, ins verderben.
Wir müssen leben bis wir sterben.
 
I blog too now[^]

GeneralRe: I'm stumpedmemberRay Virzi8-Nov-04 16:01 

I can understand your puzzlement, but you don't have to use the code if you don't need it Wink | ;)
 
But to offer a suggestion, what this container does is to allow lists, sets, and maps to have a random access iterator. Normally, they only have bi-directional iterators. This allows them to work inside algorithms that require random access iterators. You can do a lot of things with and indexed list that you could normally only do with an array. As for sets and maps, it is a little more obscure as you have mentioned. The algorithms you use on these must be non-mutating.
 

GeneralRe: I'm stumpedmemberJörgen Sigvardsson9-Nov-04 9:19 
Ray Virzi wrote:
But to offer a suggestion, what this container does is to allow lists, sets, and maps to have a random access iterator.
 
Aha, I see. Smile | :)
 
--
Weiter, weiter, ins verderben.
Wir müssen leben bis wir sterben.
 
I blog too now[^]

GeneralRe: I'm stumpedmemberlathot25-May-05 9:02 
Another thing that is confusing: stl maps already have an operator[] defined.
The new operator[] defined in index_map has a completely different action.

QuestionHow about Judy Arrays?sussAnonymous8-Sep-03 4:53 
http://judy.sourceforge.net

AnswerRe: How about Judy Arrays?memberRay Virzi8-Sep-03 6:02 

Hadn't heard of them, but I checked out your link. Looks VERY interesting. Seems to be tuned for performance at the machine level. I'll have to read more about it.
 
Thanks
Ray
QuestionDoesn't compile in Visual Studio .NET 2003?memberChris Coble9-Jun-03 9:41 
I tried including the header files into my Visual Studio .NET 2003 C++ project and it fails with all kinds of errors like the following:
 
warning C4346: 'I::value_type' : dependent name is not a type prefix with 'typename' to indicate a type, see reference to class template instantiation 'stl_bidirectional_iterator' being compiled
 
error C2923: 'std::iterator' : 'I::value_type' is invalid as template argument '#2', type expected
 
warning C4346: 'I::distance_type' : dependent name is not a type prefix with 'typename' to indicate a type
 
error C2923: 'std::iterator' : 'I::distance_type' is invalid as template argument '#3', type expected
 

Any suggestions?
AnswerRe: Doesn't compile in Visual Studio .NET 2003?memberRay Virzi11-Jun-03 20:35 

I guess Microsoft decided to adhere to the C++ standard more strictly in VS.NET. How convenient for them.
 
You can fix the errors by putting in the typename keyword just before the template type reference. Like so...
 
typedef typename I:value_type value_type
 
I don't have time to fix it right now, so I hope that helps.
 
Ray
 

GeneralRe: Doesn't compile in Visual Studio .NET 2003? [modified]memberStefan Gebauer18-Aug-07 23:10 
Well as Ray Virzi wrote it seems that one has to put a lot of "typename" keywords, especially before "iterator"(e.g. "AVLTree::iterator"->
"typename AVLTree::iterator"), "const_iterator", "reference",
"const_reference", "value_type", "size_type".
So it compiles under VS 2005.
Questionwhere is the source code ?membermwdev4-Apr-03 18:55 
I downloaded the source files at top of the article, but it only has base_iterator.h and avltree.h... where are all the index_* classes?
thanks,
Matias Confused | :confused:
AnswerRe: where is the source code ?memberRay Virzi4-Apr-03 20:59 
They are at the end of the avltree.h file. These are template classes so they don't need a source code file. Just include the header in your code and declare the class as an instantiated template.
Generalsome small extensionsmemberGeorge Anescu4-Oct-02 3:51 
Sometimes can be useful to know the index where new elements are inserted or elements are found. Some overloaded member functions find() and insert() can be easily implemented based on the provided functionality, for example for index_set:
 
template<class T, class Cmp>
index_set<T,Cmp>::const_iterator index_set<T,Cmp>::find(const T& item, unsigned long& index) const
{
AVLNode<T>* pos = Find(item);
if(!pos)
pos = _endnode;
index = pos->GetIndex();
return AVLTree_Iterator(*this, pos);
}
 
template<class T, class Cmp>
std::pair<index_set<T,Cmp>::iterator,bool> index_set<T,Cmp>::insert(const value_type& item, unsigned long& index)
// Inserts an item into the set. Returns end of sequence and false if the item is
// already in the set, otherwise returns a pointer to the item inserted and true.
{
AVLNode<T>* pos = Insert(item);
index = pos->GetIndex();
bool found = (pos ? true: false);
return std::make_pair(AVLTree_Iterator(*this, pos), found);
}
 

GeneralRe: some small extensionsmemberRay Virzi4-Oct-02 20:06 
Interesting extensions to the standard library. I must point out that the GetIndex() function does not work for the _endnode marker, so you must check for that condition. What is 'index' set to when the item is not found or cannot be inserted? Perhaps -1?
 
I think the conventional way to get the index is to take the iterator returned by both find() and insert() and subtract it from the beginning of the container, like so:
 
index_set<T> myset;
index_set<T>::iterator at;
unsigned index;
 
at = myset.insert(T()).first;   (or at = myset.find(T());)
index = myset.begin() - at;
 
This should give you the 0-based index. If the function fails, at is set to myset.end() and index will be the total size of the container, so this condition must still be checked for.
 
But I have also noticed a bug in my code as well. The insert function won't work this way for set and map if the item cannot be inserted. This is because the end maker must point to _endnode. My last line of the insert function should read:
 
return std::make_pair(AVLTree_Iterator(*this, (pos ? pos : _endnode)), found);
 
I will upload a corrected version of the code when I get a chance.
 
Ray
GeneralRe: some small extensionsmemberGeorge Anescu7-Oct-02 5:55 
I tried your suggestions and everything is fine, exept that instead of:
 
index = myset.begin() - at;
 
it should be:
 
index = at - myset.begin();
 
See the following example code:
 
//Some Testing Code
 
#include "avltree.h"
 
#include <iostream>
 
using namespace std;
 
int main()
{
index_set<int> oSet;
 
oSet.insert(11);
oSet.insert(17);
oSet.insert(5);
oSet.insert(3);
oSet.insert(9);
oSet.insert(15);
oSet.insert(101);
oSet.insert(97);
oSet.insert(91);
 
//List Contents
index_set<int>::const_iterator it = oSet.begin();
while(it != oSet.end())
{
cout << *it << endl;
it++;
}
 
//Check Index Operator
cout << endl;
cout << "[7] = " << oSet[7] << endl;
 
//Test find with index
cout << endl;
it = oSet.find(15);
if(it != oSet.end())
{
unsigned long index = it - oSet.begin();
cout << index << endl; //should be 4
}
 
//Test insert with index
cout << endl;
pair<index_set<int>::iterator, bool> pr = oSet.insert(93);
if(pr.first != oSet.end())
{
unsigned long index = pr.first - oSet.begin(); //should be 7
cout << index << endl;
}
return 0;
}
 
Thanks!
 

GeneralRe: some small extensionsmemberRay Virzi7-Oct-02 16:35 
>>I tried your suggestions and everything is fine, exept >>that instead of:
>>
>>index = myset.begin() - at;
>>
>>it should be:
>>
>>index = at - myset.begin();
 
You are correct. My mistake. Glad the suggestion worked out.
Ray

Generalperformance issuessussm.6-Aug-02 1:25 
I tried the sources on linux, gcc 3.0.1..
first I needed to do some adjustments in sources in order to make it work..
 
I made 3 classes, 1 std::map, 1 indexed_map, 1 std::hash_map, each inserted with 1000000 elements, key type of long int, value type of my class A (consists of int, char, std::string).
I run tests, where I was generating random numbers using random() which was my key and I read the data from containers using at() or find() methods.
 
Filling the index_map was a little faster than filling std::map but a lot slower than filling std::hash_map.
Finding an element in index_map was slower (about 1.3 times slower) than finding an element in std::map and about 5 times slower than finding and element in std::hash_map.
 

GeneralRe: performance issuessussMaxim Locktyukhin6-Aug-02 5:20 
This results are absolutly obvious. std::hash_map works faster then std::map because it uses hash table algorithm that has O(1) time estimation for search|add|delete operations, while map uses "red-black tree" data structure that has O(log N) for this opearions, but it has a capability to store data in sorted order while hash_map could not. And in conclusion the "red-black tree" is the fastest structure in general case from the all wide balanced trees structures.
GeneralRe: performance issuesmemberRay Virzi7-Aug-02 15:11 

I use an AVL tree for this implementation. If your point is that a red-black tree is faster than an AVL tree, my research indicates that they are essentially the same average speed. But these containers could be just as easily implemented with a red-black tree. It's just a matter of choice.
 
The unique thing about it is that I have added fast indexing capability to the tree, which is not possible for a normal tree structure and has no meaning for a hash-map. So until someone does this with a red-black tree instead of AVL, its the best one here Smile | :) .
 
Ray

GeneralRe: performance issuessussMaxim Locktyukhin8-Aug-02 6:43 
By the way, Red black tree in fact is AVL tree, but more advanced. It used the same idea of subtree rotaion, but Red-Black tree is optimized for best perfomance in case of large number of random additions and removing from the tree, because it could be more disbalanced then simple AVL, so it requres less rotations in common case, but could require some much time to search operations it the same time. Anyway the speed diffrence between this realizations is minimal.
GeneralRe: performance issuesmemberJörgen Sigvardsson8-Nov-04 12:29 
No, AVL-trees are AVL-trees, and RB-trees are RB-trees. Different algorithms, although the same goal. RB-trees are slightly faster than AVL-trees, by a constant factor. The AVL algorithm got its name from its inventors: Adelson & Velskii, and RB-trees got its name from Rudolf Bayer (Red/Black was no coincidence Wink | ;) )
 
As the height difference in RB-trees may be at most 1 as in the AVL case, there is a not much of a difference between AVL-trees and RB-trees in terms of "optimization".
 
--
Weiter, weiter, ins verderben.
Wir müssen leben bis wir sterben.
 
I blog too now[^]

GeneralRe: performance issuesmemberRay Virzi7-Aug-02 15:23 

Oops - I didn't see you were replying to another message (dumb webmail system). My reply should have been to m.
 
Ray

Generalindex_multimap->count() function return invalid resultsussLongbird4-Aug-02 15:04 
Wink | ;)
I use your source code at my project. That is very useful thing, but i head serious problem in using count() function at index_multimap.
In case i insert two item at index_multimap, item 1 is <10001, lpData1> and item2 is <10002, lpData2>.
I call count(10001), the result is 0 or -1. (1 is desirable)
I test another method as lower_bound() and upper_bound.
When I count the matched items from lower_bound() to upper_bound(), that's return valid result as 1.
 
Please check count() function.
Thank you.
Smile | :)
GeneralAn example would be helpfulmemberMatt Philmon2-May-02 5:43 
A bundled example would be useful here. For instance, in using the index_map instead of plain old map I can theoretically now iterate through the map using [] instead of an iterator, right? If this is the case, how do I determine the size (number of entries in the map)? I just started looking at this and didn't immediately see how to use it.
 
Thanks,
Matt
GeneralRe: An example would be helpfulmemberRay Virzi2-May-02 21:10 
These classes will not be very useful to you until you have knowledge of the STL container interfaces. I suggest you read a tutorial on the original containers list, set, and map in the standard library. You will find, for example, that the standard function for finding the total number of items in any STL container is to use the size() function.
 
As a side note, you should not use the array operator [] to iterate through any container. The STL provides iterator classes for that. The array operator is for random access at a particular indexed location.
 
Ray
 

GeneralRe: An example would be helpfulmemberCarl Berg6-Aug-02 2:55 
One of STL's more important features is algorithms, which ought to be used whenever possible, with few exceptions. This is something overlooked by many C++ programmers, even experienced ones, mostly due to lack of insight on how efficient algorithms can be.
 
An article well worth reading was written by the 'guru' Scott Meyers on the subject: STL Algorithms vs. Hand-Written Loops (C/C++ Users Journal, October 2001).
QuestionUpdates?memberErnesto Perales Soto3-Dec-01 10:52 
Hi Ray,
 
I see you updated your article... can you post the changes? anything worthwhile?
 
¡Adiós!
 
Smile | :)
AnswerRe: Updates?memberRay Virzi27-Feb-02 15:50 
(Re-post of private email)
 
Just a few bug fixes with the iterators. See the revistion history in the source file.
 
Ray

QuestionBoost.org?memberJames Curran3-Dec-01 8:33 
Ray,
Are you familar with Boost.org? This is an open-source project creating libraries for C++, much like CodeProject, except everything is generic C++ (no MFC) and it's seriously peer-reviewed.
 
Their goal is to create libraries which may one day be added to the Official C++ Standard (many of Boost's founders are committee memebers).
 
Since extensions to the STL such as yours are very much a part of what they do, they may be interested in your contributions.
 
Truth,
James
AnswerRe: Boost.org?memberRay Virzi27-Feb-02 15:52 
(Re-post of private email)
 
Yes, I know about Boost and would love to contribute, but I don't have time to do all the documentation and test examples that they require. If anyone wants to submit my code for me and add their name to the credits be my guest.
 
Ray

GeneralMissing map operator =memberErnesto Perales Soto8-Mar-01 11:47 
Hi!
 
Your stl-style wrapper of the AVL tree is just what i'm looking for. But I'm testing your index_map class, and I found a little detail:

int main(int argc, char* argv[])
{
std::map<DWORD,DWORD> a, b; // this line works
//index_map<DWORD,DWORD> a, b; // this line blows!
 
a=b;
return 0;
}

 
I guess the map needs an assigment operator.:(
GeneralRe: Missing map operator =memberRay Virzi8-Mar-01 16:44 

Oops! I guess I don't copy containers very often. I have just uploaded an update with assignment operators. Consequently, there were also bugs in the copy constructor and swap function of the sorted container base class. I'm glad you caught it before too many downloads.
 
Cheers,
Ray

General General    News News    Suggestion Suggestion    Question Question    Bug Bug    Answer Answer    Joke Joke    Rant Rant    Admin Admin   

Permalink | Advertise | Privacy | Mobile
Web03 | 2.6.130617.1 | Last Updated 6 Aug 2002
Article Copyright 2001 by Ray Virzi
Everything else Copyright © CodeProject, 1999-2013
Terms of Use
Layout: fixed | fluid