Click here to Skip to main content
Click here to Skip to main content

An STL compliant sorted vector

By , 28 Nov 2002
 

Introduction

sorted_vector adapts a std::vector to the interface required by std::set/std::multiset, thereby providing set/multiset and vector functionality (random access) in one container.

Tests show that sorted_vector's element retrieval (find) speed outperforms that of std::set/std::multiset by a factor of 2 (on most machines). On the downward side is the poor performance of sorted_vector's insert member function, because it inserts an element somewhere in the middle of the sorted vector in order to preserve the sort order. By using sorted_vector's low level interface one can solve this performance bottleneck by inserting a bunch of objects using a series of push_back's followed by a call to the member function sort, which restores the sort order.

sorted_vector should be preferred over std::set/std::multiset if many small elements must be stored. Most STL implementations use a variant of a balanced tree to implement set and multiset, which imposes a per element storage overhead of 3 pointers. The most important reason to use a std::set/std::multiset instead of sorted_vector is the need to keep iterators into a set/multiset while inserting more elements into the set. These iterators remain valid in the case of a set/multiset, but are invalidated in the case of a sorted_vector container.

Namespace

sorted_vector resides in the namespace codeproject.

Basic Usage

The following small table shows corresponding declarations of sorted_vector's and set's/multiset's.

STL concept std library sorted_vector
set set<Key,Pred> sorted_vector<Key,true,Pred>
multiset mutliset<Key,Pred> sorted_vector<Key,Pred>

(sorted_vector<Key,Pred,true> means sorted_vector<Key,Pred,bNoDuplicates=true>)

The various set/multiset insert and erase member functions as well as the set/multiset functions count, lower_bound,upper_bound, equal_range are part of sorted_vector's interface and behave the same as there set/multiset counterparts. The following code snippet shows the use of a sorted_vector to hold the set intersection of the content of two std::vector's.

#pragma warning(disable:4786)
#include "sorted_vector.h"
size_t             //build intersection using set interface of sorted_vector
BuildIntersection(  const std::vector<int>& v0,const std::vector<int>& v1,
                    codeproject::sorted_vector<int,true>& svecIntersection)
{		
    codeproject::sorted_vector<int,true> svec(v0.begin(),v0.end());
    for(int i=0;i<v1.size();i++){
        if( svec.find(v1[i])!=svec.end() ){
            svecIntersection.insert(v1[i]);
        }
    }
    return svecIntersection.size();
}

The code example shows the use of the member functions find and insert. If you replace codeproject::sorted_vector<int,true> by std::set<int> you get exactly the same result. This piece of code can be optimized for speed by replacing the insert calls in the loop by calls to push_back of the base container (a vector) followed by a call to the member function sort at the end of the loop.

#pragma warning(disable:4786)
#include "sorted_vector.h"
size_t             //same as previous example, optimized insertions 
BuildIntersection1(  const std::vector<int>& v0,const std::vector<int>& v1,
                    codeproject::sorted_vector<int,true>& svecIntersection)
{		
    codeproject::sorted_vector<int,true> svec(v0.begin(),v0.end());
    codeproject::sorted_vector<int,true>::Cont& vInterSect 
        = svecIntersection.get_container();
    for(int i=0;i<v1.size();i++){
        if( svec.find(v1[i])!=svec.end() ){
            vInterSect.push_back(v1[i]);
        }
    }
    svecIntersection.sort();
    return svecIntersection.size();
}

Interface of sorted_vector

Member Coming from Description
Cont sorted_vector container type, the type of the container used to store the controlled sequence.
value_type vector The type of object, T, stored in the set.
key_type set/multiset The key type associated with value_type.
key_compare set/multiset Function object that compares two keys for ordering.
value_compare set/multiset Function object that compares two values for ordering.
pointer vector Pointer to T.
reference vector Reference to T
const_reference vector Const reference to T
size_type vector An unsigned integral type.
difference_type vector A signed integral type.
iterator vector Random access iterator used to iterate through a vector.
const_iterator vector Random access const iterator used to iterate through a vector
reverse_iterator vector Random access iterator used to iterate backwards through a vector.
const_reverse_iterator vector Random access const iterator used to iterate backwards through a vector.
iterator begin() const vector Returns an iterator pointing to the beginning of the sorted_vector.
iterator end() const vector Returns an iterator pointing to the end of the sorted_vector.
reverse_iterator rbegin() const vector Returns a reverse_iterator pointing to the beginning of the reversed sorted_vector
reverse_iterator rend() const vector Returns a reverse_iterator pointing to the end of the reversed sorted_vector.
size_type size() const vector Returns the size of the sorted_vector.
size_type max_size() const vector Returns the largest possible size of the sorted_vector.
bool empty() const vector true if the sorted_vec's size is 0.
key_compare key_comp() const set/multiset Returns the key_compare object used by the sorted_vector.
value_compare value_comp() const set/multiset Returns the value_compare object used by the sorted_vector.
sorted_vector() set/multiset Creates an empty sorted_vector.
sorted_vector(const key_compare& comp) set/multiset Creates an empty sorted_vector, using comp as the key_compare object.
VC++7.0:
template <class InputIterator><br>sorted_vector(InputIterator f,<br> InputIterator l)

VC++6.0:
sorted_vector(const_iterator f, const_iterator l)
set/multiset Creates a sorted_vector with a copy of a range.
VC++7.0:
template <class InputIterator><br>sorted_vector(InputIterator f,<br>InputIterator l,const key_compare& comp)

VC++6.0:
sorted_vector(const_iterator f, const_iterator l,const key_compare& comp)
set/multiset Creates a sorted_vector with a copy of a range, using comp as the key_compare object.
sorted_vector(const sorted_vector&) set/multiset The copy constructor.
sorted_vector& operator=(const sorted_vector&) set/multiset The assignment operator (assigns other sorted_vector )
sorted_vector& operator=(const Cont&) sorted_vector The assignment operator (assigns other vector<T> )
void swap(sorted_vector&) set/multiset Swaps the contents of two sorted_vector's
pair<iterator, bool>insert(const value_type& x) set/multiset Inserts x into the sorted_vector.
iterator insert(iterator pos, const value_type& x) set/multiset Inserts x into the sorted_vector, using pos as a hint to where it will be inserted.
VC++7.0:
template <class InputIterator><br>void insert(InputIterator f, InputIterator l)

VC++6.0:
void insert(const_iterator first f, const_iterator l)
set/multiset Inserts a range into the sorted_vector.
void erase(iterator pos) vector Erases the element pointed to by pos.
size_type erase(const key_type& k) set/multiset Erases the element whose key is k.
void erase(iterator first, iterator last) vector Erases all elements in a range.
void clear() vector Erases all of the elements.
iterator<br> find(const key_type& k) const
set/multiset Finds an element whose key is k.
size_type<br> count(const key_type& k) const
set/multiset Counts the number of elements whose key is k.
iterator<br> lower_bound(const key_type& k) const
set/multiset Finds the first element whose key is not less than k.
iterator<br> upper_bound(const key_type& k) const
set/multiset Finds the first element whose key is greater than k.
pair<iterator, iterator><br>equal_range(const key_type& k) const
set/multiset Finds a range containing all elements whose key is k.
bool operator==(const sorted_vector&,const sorted_vector&) vector Tests two sorted_vector for equality. This is a global function, not a member function. There is also an operator !=
bool operator<(const sorted_vector&, const sorted_vector&) vector Lexicographical comparison. This is a global function, not a member function. There are also operators <= >= and >
Cont& get_container() sorted_vector Returns a reference to the internal vector used to store the controlled sequence
void sort() sorted_vector Restores the sort order using key_compare
reference at(size_type i) sorted_vector Returns a reference to the element at *(begin()+i) with range check
const_reference at(size_type i) const sorted_vector Returns a reference to the element at *(begin()+i) with range check
const_reference operator[](size_type i) const sorted_vector Returns a reference to the element at *(begin()+i)
reference operator[](size_type i) sorted_vector Returns a reference to the element at *(begin()+i)
const_reference front() const sorted_vector Returns a reference to the first element
reference front() sorted_vector Returns a reference to the first element
reference back() sorted_vector Returns a reference to the last element
const_reference back() const sorted_vector Returns a reference to the last element
void pop_back() sorted_vector Removes the last element from the sorted_vector

Points of Interest

An interview with Alexander Stepanov (inventor of the STL) at http://www.stlport.org/resources/StepanovUSA.html , in which he proposed to implement a sorted container as an adapter of a std::container gave me the idea for this work. The real surprise to me was the unexpected good performance of sorted_vector compared to set/multiset. The outcome of my tests indicate, that the std::set/std::multiset has only one (important) advantage over a sorted vector, namely that iterators remain valid in a set/multiset, even when more elements are added.

The implementation work itself was easy and did not require any advanced C++/STL feature. It can be summarized as follows: Most of the member functions of sorted_vector forward the call to the vec_ data member, which is a std::vector. The set/multiset specific functions for inserting/erasing and locating elements mostly use STL algorithms to deal with the sorted sequence present in the vec_ data member. The low level interface consisting of the member functions get_container() (which returns a reference to the vec_ data member) and sort and stable_sort (which restore the sort order) are necessary to improve insertion performance (by allowing a temporary violation of the sorting order). Only one function was unexpectedly difficult to implement: The function unique, which must be called after sort and stable_sort in the case of a set. This function is part of the STL and requires a predicate as third argument. This predicate must return true, if the passed elements are equal. But the class sorted_vector only has access to a predicate which evaluates the < relation. In theory, it should be possible, to transform a predicate evaluating the < relation into another predicate evaluating ==, but in practice this is only possible, when the < predicate is implemented as object derived from std::binary_function. Ultimately, I had to implement unique myself because of that.

History

  • 1.0: November 19th, 2002; Initial release.
  • 1.1: November 28th, 2002; Documentation and code Update
    • changed namespace from std to codeproject
    • supports member templates for constructing/inserting from iterator range in case of VC++7.0

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.

A list of licenses authors might use can be found here

About the Author

Martin Holzherr
Switzerland Switzerland
Member
No Biography provided

Sign Up to vote   Poor Excellent
Add a reason or comment to your vote: x
Votes of 3 or less require a comment

Comments and Discussions

 
You must Sign In to use this message board.
Search this forum  
    Spacing  Noise  Layout  Per page   
AnswerOther versionmemberalexandre.hamez23 Oct '09 - 10:21 
If you are interested in a sorted vector, you can also check http://code.google.com/p/sorted-vector/[^].
GeneralRe: Other versionmemberMember 126055110 Jan '11 - 6:23 
I don't think this project is available any longer. I get an obtuse "Forbidden" error from Google when I follow this link.
Questionlinux compilation?membermaciekboni21 Apr '09 - 5:24 
From an amateur C++ programmer who needs the functionality of a sorted vector: is this supposed to compile on linux as well? it won't with
 
g++ -o -c main.o main.cpp
 
but is that because I need to include some libraries? If so, which ones?
 
Thanks very much.
Generalsort() requirement violates stable-state of objectmemberj337y d0nut30 Sep '05 - 7:43 
First, thank you. This class will be very useful.
 
I do need to point out, though, that if your class exposes the internal container, it also should detect when that container becomes "dirty" (i.e. if a non-const method is called on the underlying container). Then sorted_vector can call sort() internally (which clears the "dirty" flag) when any member function that relies on the internal order is called. This would, of course, require declaration of vec_ as mutable, since it may need to be resorted in const member functions (like iterator begin() const).
 
An alternative approach would be to assume that, if a large number of inserts are to be done consecutively, then the iterator-range version of insert() will be called. This function can be optimized with the multiple-push_back()-and-sort() algorithm.
 
We know we're reinventing the wheel. We're just trying to do it better.
GeneralRe: sort() requirement violates stable-state of objectmemberMartin.Holzherr2 Oct '05 - 21:34 
Thanks a lot for your remarks.
If find the idea to replace the low level interface by
an appropriate handling of the "insert range" function very appealing - on the other hand it will break the existing interface.
In any case I will provide a new version of the sorted vector which runs on the various MSVC versions.
Martin

GeneralMSVC 7.1memberGabor L. Ugray12 Oct '03 - 3:28 
Hi,
 
I found and tested your sorted_vector using MSVC7.1. There is one important change in template syntax: the obligatory use of the keyword typename in certain places. In order for the file to compile, the following lines need to be changed:
 
typedef typename Cont::allocator_type allocator_type;
typedef typename Cont::size_type size_type;
typedef typename Cont::difference_type difference_type;
typedef typename Cont::reference reference;
typedef typename Cont::const_reference const_reference;
typedef typename Cont::value_type value_type;
...
typedef typename Cont::iterator iterator;
typedef typename Cont::const_iterator const_iterator;
...
typedef typename Cont::const_reverse_iterator
const_reverse_iterator;
typedef typename Cont::reverse_iterator reverse_iterator;
 
Thanks for the article!
 
Gabor
GeneralUsing stl - Beginer questionmemberjeevanmanjunath31 Jul '03 - 15:45 
I have an object a
 
class object
{
private:
string streetname ;
vector listofhomes ;
}
 
i want to use STL template to build a program like
 
this class will have a
 
{ reed street,
list of houses in this street( names).
}
 
I want this to be collection. Like
{ streetname, list of houses)
(streetname, list of houses)...
 
How do i do it ?
 

Thanks in advance.
 
jeevan
GeneralReally BAD CODINGmemberhfghfghgfhfg14 Jan '03 - 16:57 
you have some of the worst coding i have ever seen.
go get a life!!Dead | X|
 

Arash

GeneralRe: Really BAD CODINGmemberChristian Graus14 Jan '03 - 17:41 
Such insight. Such wisdom. How helpful you are. I bow before you.
 
In other words, if you've got a comment to make, try to make it a helpful one, not an arrogant one. You're only making an idiot of yourself.

 
Christian
 
No offense, but I don't really want to encourage the creation of another VB developer.
- Larry Antram 22 Oct 2002

C# will attract all comers, where VB is for IT Journalists and managers - Michael
P Butler 05-12-2002


Again, you can screw up a C/C++ program just as easily as a VB program. OK, maybe not
as easily, but it's certainly doable.
- Jamie Nordmeyer - 15-Nov-2002
GeneralRe: Really BAD CODINGsussVikas K.11 Apr '03 - 1:01 
Hey, before you flame on others, go get yourself a good psychiatristMad | :mad:
GeneralCheck performance on insert/deletememberRay Virzi29 Dec '02 - 21:13 

This implementation is clever and very useful for appropriate situations. One must be careful however. In addition to the disadvantage of not preserving iterators over insert/delete operations, there should be a performance hit on insert/delete operations themselves, even if you use push_back() and then sort() afterward. The first call to sort() will work fine, but after that another sort() can do no better than O(N). This is because the container contents are already mostly sorted to begin with, causing a worst-case condition on the sort algorithm. If your code does a lot of inserts and deletes througout the program, it may be better to use a set/multiset. If you do use a sorted_vector, I think it is fastest to use push_back() and sort() when inserting many items in sequence, but use insert() when inserting one item or only a few. A "few" is defined as few in comparison to the number of items already sorted in the container.
 
As a performance check of this phenomenon, do the following. Fill the container with 100 items using push_back(), but do not sort them. Make sure they are in some random order. Now add one item with push_back() and call sort(). The time to insert the item and sort is the benchmark, and this first test should be the fastest. Make sure the inserted item lies somewhere in the middle of the sorted set. Now take the same 100 items but sort them to begin with. Then add the one item again with push_back() and call sort() again, timing only the insertion and sort. It should end up much slower. Now take the 100 items again in initial sorted order and add the one item using insert() only, and time the insert() call. If my hunch is correct, it will do better than the second test, and may even be closer to the first. I haven't tried this myself - its just theory.
 

 
Ray Virzi
GeneralRe: Check performance on insert/deletememberMartin Holzherr5 Jan '03 - 21:09 
Thank you for your comment, Ray Virzi.Smile | :)
It shows much insight into the problems of
STL containers and especially vectors, which
are usually implemented as one single memory
block, requiring memory moves or copies, when
inserting into the middle of the sequence.
I think, that a sorted sequence should not be
used, when insertions and retrieval operations
alternate each other, but think (in accordance with you)
that a sorted sequence is useful even for large sets,
provided that you initially build the set and then
later on onyl retreive from the set.
Regards Martin
GeneralDon't add to namespace stdmemberCraig Henderson26 Nov '02 - 21:49 
The std namespace is reserved for the standard C++ library and library implementor extensions. Anyone else adding to this namespace is prohibited by the standard.
 
From Section 17.4.3.1, para 1
"It is undefined for a C++ program to add declarations or definitions to namespace std or namespaces within namespace std unless otherwise specified. A program may add template specializations for any standard library template to namespace std. Such a specialization (complete or partial) of a standard library template results in undefined behavior unless the declaration depends on a userdefined name of external linkage and unless the specialization meets the standard library requirements for the original template."
GeneralRe: Don't add to namespace stdmemberCraig Henderson26 Nov '02 - 22:01 
Reading this back after submitting it, it sounds a bit curt. Sorry about that, it wasn't meant to be Smile | :)
Generaltemplate member functionsmemberShmulik Flint26 Nov '02 - 18:41 
Your interface describes some template members that take templated iterators, but the code only handle vector iterators, and doesn't have templated members.
 
You should either change your code, or your interface documentation.
GeneralRe: template member functionsmemberMartin Holzherr26 Nov '02 - 21:11 
Hi,
you are right, the members you are talking about were written by me as template members initially ,
but on some VC6 versions (depending on the SP) this resulted in an internal compiler error
(one of the constructors caused it).
template<class InIt>    
sorted_vector(InIt first, InIt last, const Pred& comp = Pred(), const A& al = A());
Because of that I changed the code to a non templated version.
On VC7, there was no such problem.
I think I will change the code in a way,
that templated members are used on VC7 but not on VC6 (using condidtional compilation).
Expect an update in the next days.
 
Recently there was a poll asking whether full standard C++ comformance is important.
I think it is important. Certainly, if you want to
write your own STL components and if you want to use
third party software.
Useful libraries like 'boost' or 'Loki' will
not compile at all under VC6 and must be modified to compile under VC7.
Regards Martin

GeneralRe: template member functionsmemberShmulik Flint26 Nov '02 - 21:51 
I know this is a problem in VC6 - this is why I was curious to see how you implemented the interface you gave.
 
I thinking sometime to put to use such vector instead of associative containers (map & set) following Scott Meyers advise in Item 23 of Effective STL. Meyers give a list of reasons why sorted vector is better - you can add it to your list of resources Wink | ;)
 
The only problem I have replacing map<> with sorted_vector > is the constness issue, where map<> iterator point to pair and sorted_vector > iterator points to pair. Do you have an idea how such a problem can be solved? I want the compiler to guard against invalid change of keys.
GeneralRe: template member functionsmemberMartin Holzherr26 Nov '02 - 22:19 
Intresting,
I really tried to preserve the constness of
the key, but was not able to get a program,
which compiles. I even think it is nearly impossible
to store a const element in a vector. But it
is possible to preserve the constness when using a tree, because you
construct the tree element once and never change
or move it later.
Please tell me, if you see a solution to this problem.
Regards Martin
GeneralRe: template member functionsmemberCraig Henderson26 Nov '02 - 21:57 
Martin Holzherr wrote:
Useful libraries like 'boost' or 'Loki' will
not compile at all under VC6 and must be modified to compile under VC7

 
It is true that a lot of Loki will not compile under MSVC6 or 7.0 because it uses partial template specialization which is long overdue is MSVC. However, a lot of Boost is compilable with these 6.0 and 7.0 because of a huge effort by the Boost developers to provide workarounds for the limitations.
 
FYI, MSVC7.1 (aka Everett) will compile the Boost libraries without any workarounds (it passes all the Boost regression tests). MSVC7.1 finally supports partial template specialization along with other missing standard language features Smile | :) . In fact, Boost was used as a test bed for early alpha releases of the compiler.
GeneralRe: template member functionsmemberNeville Franks30 Nov '02 - 10:10 
Craig Henderson wrote:
FYI, MSVC7.1 (aka Everett) will compile the Boost libraries without any workarounds (it passes all the Boost regression tests). MSVC7.1 finally supports partial template specialization along with other missing standard language features
 
Thanks for the info on Boost and MSVC7.1. Do you know if MSVC7.1 compiles Loki?
 
Neville Franks, Author of ED for Windows. www.getsoft.com

General General    News News    Suggestion Suggestion    Question Question    Bug Bug    Answer Answer    Joke Joke    Rant Rant    Admin Admin   

Permalink | Advertise | Privacy | Mobile
Web04 | 2.6.130523.1 | Last Updated 29 Nov 2002
Article Copyright 2002 by Martin Holzherr
Everything else Copyright © CodeProject, 1999-2013
Terms of Use
Layout: fixed | fluid