Introduction
sorted_vector
adapts a std::vector
to the interface required by
std::set/std::multiset
, thereby providing set/multiset and vector
functionality (random access) in one container.
Tests show that sorted_vector
's element retrieval (find) speed outperforms
that of std::set/std::multiset
by a factor of 2 (on most machines). On the downward side is
the poor performance of sorted_vector
's insert member function, because it
inserts an element somewhere in the middle of the sorted vector in order
to preserve the sort order. By using sorted_vector
's low level interface one can
solve this performance bottleneck by inserting a bunch of objects using a series of
push_back
's followed by a call to the member function sort
, which restores the sort
order.
sorted_vector
should be preferred over std::set/std::multiset
if many small elements
must be stored. Most STL implementations use a variant of a balanced tree to
implement set and multiset, which imposes a per element storage overhead of 3 pointers.
The most important reason to use a std::set/std::multiset
instead of
sorted_vector
is
the need to keep iterators into a set/multiset while inserting more elements into
the set. These iterators remain valid in the case of a set/multiset, but are invalidated
in the case of a sorted_vector
container.
Namespace
sorted_vector
resides in the namespace codeproject
.
Basic Usage
The following small table shows corresponding declarations of sorted_vector
's
and set
's/multiset
's.
STL concept | std library | sorted_vector |
---|
set | set<Key,Pred> | sorted_vector<Key,true,Pred> |
multiset | mutliset<Key,Pred> | sorted_vector<Key,Pred> |
(sorted_vector<Key,Pred,true>
means sorted_vector<Key,Pred,bNoDuplicates=true>
)
The various set/multiset insert
and erase
member functions as well as the
set/multiset functions count, lower_bound,upper_bound, equal_range
are part of
sorted_vector
's interface and behave the same as there set/multiset counterparts.
The following code snippet shows the use of a sorted_vector
to hold the
set intersection of the content of two std::vector
's.
#pragma warning(disable:4786)
#include "sorted_vector.h"
size_t
BuildIntersection( const std::vector<int>& v0,const std::vector<int>& v1,
codeproject::sorted_vector<int,true>& svecIntersection)
{
codeproject::sorted_vector<int,true> svec(v0.begin(),v0.end());
for(int i=0;i<v1.size();i++){
if( svec.find(v1[i])!=svec.end() ){
svecIntersection.insert(v1[i]);
}
}
return svecIntersection.size();
}
The code example shows the use of the member functions find
and insert
.
If you replace codeproject::sorted_vector<int,true>
by std::set<int>
you get exactly
the same result. This piece of code can be optimized for speed by replacing the insert calls in the loop
by calls to push_back
of the base container (a vector) followed by a call
to the member function sort
at the end of the loop.
#pragma warning(disable:4786)
#include "sorted_vector.h"
size_t
BuildIntersection1( const std::vector<int>& v0,const std::vector<int>& v1,
codeproject::sorted_vector<int,true>& svecIntersection)
{
codeproject::sorted_vector<int,true> svec(v0.begin(),v0.end());
codeproject::sorted_vector<int,true>::Cont& vInterSect
= svecIntersection.get_container();
for(int i=0;i<v1.size();i++){
if( svec.find(v1[i])!=svec.end() ){
vInterSect.push_back(v1[i]);
}
}
svecIntersection.sort();
return svecIntersection.size();
}
Interface of sorted_vector
Member
|
Coming from
|
Description
|
---|
Cont | sorted_vector | container type, the type of the container used to store the controlled sequence. |
value_type | vector | The type of object, T, stored in the set. |
key_type | set/multiset | The key type associated with value_type. |
key_compare | set/multiset | Function object that compares two keys for ordering. |
value_compare | set/multiset | Function object that compares two values for ordering. |
pointer | vector | Pointer to T. |
reference | vector | Reference to T |
const_reference | vector | Const reference to T |
size_type | vector | An unsigned integral type. |
difference_type | vector | A signed integral type. |
iterator | vector | Random access iterator used to iterate through a vector. |
const_iterator | vector | Random access const iterator used to iterate through a vector |
reverse_iterator | vector | Random access iterator used to iterate backwards through a vector.
|
const_reverse_iterator | vector | Random access const iterator used to iterate backwards through a vector. |
iterator begin() const | vector | Returns an iterator pointing to the beginning of the sorted_vector. |
iterator end() const | vector | Returns an iterator pointing to the end of the sorted_vector. |
reverse_iterator rbegin() const | vector | Returns a reverse_iterator pointing to the beginning of the reversed sorted_vector |
reverse_iterator rend() const | vector | Returns a reverse_iterator pointing to the end of the reversed sorted_vector. |
size_type size() const | vector | Returns the size of the sorted_vector. |
size_type max_size() const | vector | Returns the largest possible size of the sorted_vector. |
bool empty() const | vector | true if the sorted_vec's size is 0. |
key_compare key_comp() const | set/multiset | Returns the key_compare object used by the sorted_vector. |
value_compare value_comp() const | set/multiset | Returns the value_compare object used by the sorted_vector. |
sorted_vector() | set/multiset | Creates an empty sorted_vector. |
sorted_vector(const key_compare& comp) | set/multiset | Creates an empty sorted_vector, using comp as the key_compare object. |
VC++7.0: template <class InputIterator><br>sorted_vector(InputIterator f,<br> InputIterator l)
VC++6.0: sorted_vector(const_iterator f, const_iterator l)
| set/multiset | Creates a sorted_vector with a copy of a range. |
VC++7.0: template <class InputIterator><br>sorted_vector(InputIterator f,<br>InputIterator l,const key_compare& comp)
VC++6.0: sorted_vector(const_iterator f, const_iterator l,const key_compare& comp)
| set/multiset | Creates a sorted_vector with a copy of a range, using comp as the key_compare object. |
sorted_vector(const sorted_vector&) | set/multiset | The copy constructor. |
sorted_vector& operator=(const sorted_vector&) | set/multiset | The assignment operator (assigns other sorted_vector ) |
sorted_vector& operator=(const Cont&) | sorted_vector | The assignment operator (assigns other vector<T> ) |
void swap(sorted_vector&) | set/multiset | Swaps the contents of two sorted_vector's |
pair<iterator, bool>insert(const value_type& x) | set/multiset | Inserts x into the sorted_vector. |
iterator insert(iterator pos, const value_type& x) | set/multiset | Inserts x into the sorted_vector, using pos as a hint to where it will be
inserted.
|
VC++7.0: template <class InputIterator><br>void insert(InputIterator f, InputIterator l)
VC++6.0: void insert(const_iterator first f, const_iterator l)
| set/multiset | Inserts a range into the sorted_vector. |
void erase(iterator pos) | vector | Erases the element pointed to by pos. |
size_type erase(const key_type& k) | set/multiset | Erases the element whose key is k. |
void erase(iterator first, iterator last) | vector | Erases all elements in a range. |
void clear() | vector | Erases all of the elements. |
iterator<br> find(const key_type& k) const | set/multiset | Finds an element whose key is k. |
size_type<br> count(const key_type& k) const | set/multiset | Counts the number of elements whose key is k. |
iterator<br> lower_bound(const key_type& k) const | set/multiset | Finds the first element whose key is not less than k. |
iterator<br> upper_bound(const key_type& k) const | set/multiset | Finds the first element whose key is greater than k. |
pair<iterator, iterator><br>equal_range(const key_type& k) const | set/multiset | Finds a range containing all elements whose key is k. |
bool operator==(const sorted_vector&,const sorted_vector&) | vector | Tests two sorted_vector for equality. This is a global function, not a member function.
There is also an operator != |
bool operator<(const sorted_vector&, const sorted_vector&) | vector | Lexicographical comparison. This is a global function, not a member function.
There are also operators <= >= and > |
Cont& get_container() | sorted_vector | Returns a reference to the internal vector used to store the controlled sequence |
void sort() | sorted_vector | Restores the sort order using key_compare |
reference at(size_type i) | sorted_vector | Returns a reference to the element at *(begin()+i) with range check |
const_reference at(size_type i) const | sorted_vector | Returns a reference to the element at *(begin()+i) with range check |
const_reference operator[](size_type i) const | sorted_vector | Returns a reference to the element at *(begin()+i) |
reference operator[](size_type i) | sorted_vector | Returns a reference to the element at *(begin()+i) |
const_reference front() const | sorted_vector | Returns a reference to the first element |
reference front() | sorted_vector | Returns a reference to the first element |
reference back() | sorted_vector | Returns a reference to the last element |
const_reference back() const | sorted_vector | Returns a reference to the last element |
void pop_back() | sorted_vector | Removes the last element from the sorted_vector |
Points of Interest
An interview with Alexander Stepanov (inventor of the STL) at
http://www.stlport.org/resources/StepanovUSA.html
, in which he proposed to implement
a sorted container as an adapter of a std::container
gave me the idea for this work.
The real surprise to me was the unexpected good performance of sorted_vector
compared
to set/multiset. The outcome of my tests indicate, that the std::set/std::multiset
has only one (important) advantage over a sorted vector, namely that iterators remain
valid in a set/multiset, even when more elements are added.
The implementation work itself was easy and did not require any advanced C++/STL feature.
It can be summarized as follows:
Most of the member functions of sorted_vector
forward the call to the vec_
data member,
which is a std::vector
. The set/multiset specific functions for inserting/erasing and
locating elements mostly use STL algorithms to deal with the sorted sequence present
in the vec_
data member. The low level interface consisting of the member functions
get_container()
(which returns a reference to the vec_
data member)
and sort
and stable_sort
(which restore the sort order) are necessary to improve insertion performance (by allowing
a temporary violation of the sorting order).
Only one function was unexpectedly difficult to implement: The function unique
, which
must be called after sort
and stable_sort
in the case of a set.
This function is part of the STL and requires a predicate as third argument. This predicate must return true,
if the passed elements are equal. But the class sorted_vector
only has access to a predicate
which evaluates the < relation. In theory, it should be possible, to transform a predicate
evaluating the < relation into another predicate evaluating ==, but in practice this is only
possible, when the < predicate is implemented as object derived from std::binary_function
.
Ultimately, I had to implement unique
myself because of that.
History
- 1.0: November 19th, 2002; Initial release.
- 1.1: November 28th, 2002; Documentation and code Update
- changed namespace from
std
to codeproject
- supports member templates for constructing/inserting from iterator range in case of VC++7.0
This member has not yet provided a Biography. Assume it's interesting and varied, and probably something to do with programming.