|
Introduction
sorted_vector adapts a std::vector to the interface required by
std::set/std::multiset, thereby providing set/multiset and vector
functionality (random access) in one container.
Tests show that sorted_vector's element retrieval (find) speed outperforms
that of std::set/std::multiset by a factor of 2 (on most machines). On the downward side is
the poor performance of sorted_vector's insert member function, because it
inserts an element somewhere in the middle of the sorted vector in order
to preserve the sort order. By using sorted_vector's low level interface one can
solve this performance bottleneck by inserting a bunch of objects using a series of
push_back's followed by a call to the member function sort, which restores the sort
order.
sorted_vector should be preferred over std::set/std::multiset
if many small elements
must be stored. Most STL implementations use a variant of a balanced tree to
implement set and multiset, which imposes a per element storage overhead of 3 pointers.
The most important reason to use a std::set/std::multiset instead of
sorted_vector is
the need to keep iterators into a set/multiset while inserting more elements into
the set. These iterators remain valid in the case of a set/multiset, but are invalidated
in the case of a sorted_vector container.
Namespace
sorted_vector resides in the namespace codeproject.
Basic Usage
The following small table shows corresponding declarations of sorted_vector's
and set's/multiset's.
| STL concept |
std library |
sorted_vector |
| set |
set<Key,Pred> |
sorted_vector<Key,true,Pred> |
multiset |
mutliset<Key,Pred> |
sorted_vector<Key,Pred> |
(sorted_vector<Key,Pred,true> means sorted_vector<Key,Pred,bNoDuplicates=true>)
The various set/multiset insert and erase member functions as well as the
set/multiset functions count, lower_bound,upper_bound, equal_range are part of
sorted_vector's interface and behave the same as there set/multiset counterparts.
The following code snippet shows the use of a sorted_vector to hold the
set intersection of the content of two std::vector's.
#pragma warning(disable:4786)
#include "sorted_vector.h"
size_t
BuildIntersection( const std::vector<int>& v0,const std::vector<int>& v1,
codeproject::sorted_vector<int,true>& svecIntersection)
{
codeproject::sorted_vector<int,true> svec(v0.begin(),v0.end());
for(int i=0;i<v1.size();i++){
if( svec.find(v1[i])!=svec.end() ){
svecIntersection.insert(v1[i]);
}
}
return svecIntersection.size();
}
The code example shows the use of the member functions find and insert.
If you replace codeproject::sorted_vector<int,true> by std::set<int> you get exactly
the same result. This piece of code can be optimized for speed by replacing the insert calls in the loop
by calls to push_back of the base container (a vector) followed by a call
to the member function sort at the end of the loop.
#pragma warning(disable:4786)
#include "sorted_vector.h"
size_t
BuildIntersection1( const std::vector<int>& v0,const std::vector<int>& v1,
codeproject::sorted_vector<int,true>& svecIntersection)
{
codeproject::sorted_vector<int,true> svec(v0.begin(),v0.end());
codeproject::sorted_vector<int,true>::Cont& vInterSect
= svecIntersection.get_container();
for(int i=0;i<v1.size();i++){
if( svec.find(v1[i])!=svec.end() ){
vInterSect.push_back(v1[i]);
}
}
svecIntersection.sort();
return svecIntersection.size();
}
Interface of sorted_vector
|
Member
|
Coming from
|
Description
|
Cont |
sorted_vector |
container type, the type of the container used to store the controlled sequence. |
value_type |
vector |
The type of object, T, stored in the set. |
key_type |
set/multiset |
The key type associated with value_type. |
key_compare |
set/multiset |
Function object that compares two keys for ordering. |
value_compare |
set/multiset |
Function object that compares two values for ordering. |
pointer |
vector |
Pointer to T. |
reference |
vector |
Reference to T |
const_reference |
vector |
Const reference to T |
size_type |
vector |
An unsigned integral type. |
difference_type |
vector |
A signed integral type. |
iterator |
vector |
Random access iterator used to iterate through a vector. |
const_iterator |
vector |
Random access const iterator used to iterate through a vector |
reverse_iterator |
vector |
Random access iterator used to iterate backwards through a vector.
|
const_reverse_iterator |
vector |
Random access const iterator used to iterate backwards through a vector. |
iterator begin() const |
vector |
Returns an iterator pointing to the beginning of the sorted_vector. |
iterator end() const |
vector |
Returns an iterator pointing to the end of the sorted_vector. |
reverse_iterator rbegin() const |
vector |
Returns a reverse_iterator pointing to the beginning of the reversed sorted_vector |
reverse_iterator rend() const |
vector |
Returns a reverse_iterator pointing to the end of the reversed sorted_vector. |
size_type size() const |
vector |
Returns the size of the sorted_vector. |
size_type max_size() const |
vector |
Returns the largest possible size of the sorted_vector. |
bool empty() const |
vector |
true if the sorted_vec's size is 0. |
key_compare key_comp() const |
set/multiset |
Returns the key_compare object used by the sorted_vector. |
value_compare value_comp() const |
set/multiset |
Returns the value_compare object used by the sorted_vector. |
sorted_vector() |
set/multiset |
Creates an empty sorted_vector. |
sorted_vector(const key_compare& comp) |
set/multiset |
Creates an empty sorted_vector, using comp as the key_compare object. |
VC++7.0: template <class InputIterator> sorted_vector(InputIterator f, InputIterator l)
VC++6.0: sorted_vector(const_iterator f, const_iterator l)
|
set/multiset |
Creates a sorted_vector with a copy of a range. |
VC++7.0: template <class InputIterator> sorted_vector(InputIterator f, InputIterator l,const key_compare& comp)
VC++6.0: sorted_vector(const_iterator f, const_iterator l,const key_compare& comp)
|
set/multiset |
Creates a sorted_vector with a copy of a range, using comp as the key_compare object. |
sorted_vector(const sorted_vector&) |
set/multiset |
The copy constructor. |
sorted_vector& operator=(const sorted_vector&) |
set/multiset |
The assignment operator (assigns other sorted_vector ) |
sorted_vector& operator=(const Cont&) |
sorted_vector |
The assignment operator (assigns other vector<T> ) |
void swap(sorted_vector&) |
set/multiset |
Swaps the contents of two sorted_vector's |
pair<iterator, bool>insert(const value_type& x) |
set/multiset |
Inserts x into the sorted_vector. |
iterator insert(iterator pos, const value_type& x) |
set/multiset |
Inserts x into the sorted_vector, using pos as a hint to where it will be
inserted.
|
VC++7.0: template <class InputIterator> void insert(InputIterator f, InputIterator l)
VC++6.0: void insert(const_iterator first f, const_iterator l)
|
set/multiset |
Inserts a range into the sorted_vector. |
void erase(iterator pos) |
vector |
Erases the element pointed to by pos. |
size_type erase(const key_type& k) |
set/multiset |
Erases the element whose key is k. |
void erase(iterator first, iterator last) |
vector |
Erases all elements in a range. |
void clear() |
vector |
Erases all of the elements. |
iterator find(const key_type& k) const |
set/multiset |
Finds an element whose key is k. |
size_type count(const key_type& k) const |
set/multiset |
Counts the number of elements whose key is k. |
iterator lower_bound(const key_type& k) const |
set/multiset |
Finds the first element whose key is not less than k. |
iterator upper_bound(const key_type& k) const |
set/multiset |
Finds the first element whose key is greater than k. |
pair<iterator, iterator> equal_range(const key_type& k) const |
set/multiset |
Finds a range containing all elements whose key is k. |
bool operator==(const sorted_vector&,const sorted_vector&) |
vector |
Tests two sorted_vector for equality. This is a global function, not a member function.
There is also an operator != |
bool operator<(const sorted_vector&, const sorted_vector&) |
vector |
Lexicographical comparison. This is a global function, not a member function.
There are also operators <= >= and > |
Cont& get_container() |
sorted_vector |
Returns a reference to the internal vector used to store the controlled sequence |
void sort() |
sorted_vector |
Restores the sort order using key_compare |
reference at(size_type i) |
sorted_vector |
Returns a reference to the element at *(begin()+i) with range check |
const_reference at(size_type i) const |
sorted_vector |
Returns a reference to the element at *(begin()+i) with range check |
const_reference operator[](size_type i) const |
sorted_vector |
Returns a reference to the element at *(begin()+i) |
reference operator[](size_type i) |
sorted_vector |
Returns a reference to the element at *(begin()+i) |
const_reference front() const |
sorted_vector |
Returns a reference to the first element |
reference front() |
sorted_vector |
Returns a reference to the first element |
reference back() |
sorted_vector |
Returns a reference to the last element |
const_reference back() const |
sorted_vector |
Returns a reference to the last element |
void pop_back() |
sorted_vector |
Removes the last element from the sorted_vector |
Points of Interest
An interview with Alexander Stepanov (inventor of the STL) at
http://www.stlport.org/resources/StepanovUSA.html
, in which he proposed to implement
a sorted container as an adapter of a std::container gave me the idea for this work.
The real surprise to me was the unexpected good performance of sorted_vector compared
to set/multiset. The outcome of my tests indicate, that the std::set/std::multiset
has only one (important) advantage over a sorted vector, namely that iterators remain
valid in a set/multiset, even when more elements are added.
The implementation work itself was easy and did not require any advanced C++/STL feature.
It can be summarized as follows:
Most of the member functions of sorted_vector forward the call to the vec_ data member,
which is a std::vector. The set/multiset specific functions for inserting/erasing and
locating elements mostly use STL algorithms to deal with the sorted sequence present
in the vec_ data member. The low level interface consisting of the member functions
get_container() (which returns a reference to the vec_ data member)
and sort and stable_sort
(which restore the sort order) are necessary to improve insertion performance (by allowing
a temporary violation of the sorting order).
Only one function was unexpectedly difficult to implement: The function unique, which
must be called after sort and stable_sort in the case of a set.
This function is part of the STL and requires a predicate as third argument. This predicate must return true,
if the passed elements are equal. But the class sorted_vector only has access to a predicate
which evaluates the < relation. In theory, it should be possible, to transform a predicate
evaluating the < relation into another predicate evaluating ==, but in practice this is only
possible, when the < predicate is implemented as object derived from std::binary_function.
Ultimately, I had to implement unique myself because of that.
History
- 1.0: November 19th, 2002; Initial release.
- 1.1: November 28th, 2002; Documentation and code Update
- changed namespace from
std to codeproject
- supports member templates for constructing/inserting from iterator range in case of VC++7.0
| You must Sign In to use this message board. |
|
| | Msgs 1 to 17 of 17 (Total in Forum: 17) (Refresh) | FirstPrevNext |
|
|
 |
|
|
First, thank you. This class will be very useful.
I do need to point out, though, that if your class exposes the internal container, it also should detect when that container becomes "dirty" (i.e. if a non-const method is called on the underlying container). Then sorted_vector can call sort() internally (which clears the "dirty" flag) when any member function that relies on the internal order is called. This would, of course, require declaration of vec_ as mutable, since it may need to be resorted in const member functions (like iterator begin() const).
An alternative approach would be to assume that, if a large number of inserts are to be done consecutively, then the iterator-range version of insert() will be called. This function can be optimized with the multiple-push_back()-and-sort() algorithm.
We know we're reinventing the wheel. We're just trying to do it better.
|
| Sign In·View Thread·PermaLink | |
|
|
|
 |
|
|
Thanks a lot for your remarks. If find the idea to replace the low level interface by an appropriate handling of the "insert range" function very appealing - on the other hand it will break the existing interface. In any case I will provide a new version of the sorted vector which runs on the various MSVC versions. Martin
|
| Sign In·View Thread·PermaLink | |
|
|
|
 |
|
|
Hi,
I found and tested your sorted_vector using MSVC7.1. There is one important change in template syntax: the obligatory use of the keyword typename in certain places. In order for the file to compile, the following lines need to be changed:
typedef typename Cont::allocator_type allocator_type; typedef typename Cont::size_type size_type; typedef typename Cont::difference_type difference_type; typedef typename Cont::reference reference; typedef typename Cont::const_reference const_reference; typedef typename Cont::value_type value_type; ... typedef typename Cont::iterator iterator; typedef typename Cont::const_iterator const_iterator; ... typedef typename Cont::const_reverse_iterator const_reverse_iterator; typedef typename Cont::reverse_iterator reverse_iterator;
Thanks for the article!
Gabor
|
| Sign In·View Thread·PermaLink | |
|
|
|
 |
|
|
I have an object a
class object { private: string streetname ; vector listofhomes ; }
i want to use STL template to build a program like
this class will have a
{ reed street, list of houses in this street( names). }
I want this to be collection. Like { streetname, list of houses) (streetname, list of houses)...
How do i do it ?
Thanks in advance.
jeevan
|
| Sign In·View Thread·PermaLink | 2.00/5 (1 vote) |
|
|
|
 |
|
|
 |
|
|
Such insight. Such wisdom. How helpful you are. I bow before you.
In other words, if you've got a comment to make, try to make it a helpful one, not an arrogant one. You're only making an idiot of yourself.
Christian
No offense, but I don't really want to encourage the creation of another VB developer. - Larry Antram 22 Oct 2002
C# will attract all comers, where VB is for IT Journalists and managers - Michael P Butler 05-12-2002
Again, you can screw up a C/C++ program just as easily as a VB program. OK, maybe not as easily, but it's certainly doable. - Jamie Nordmeyer - 15-Nov-2002
|
| Sign In·View Thread·PermaLink | 5.00/5 (1 vote) |
|
|
|
 |
|
|
 |
|
|
This implementation is clever and very useful for appropriate situations. One must be careful however. In addition to the disadvantage of not preserving iterators over insert/delete operations, there should be a performance hit on insert/delete operations themselves, even if you use push_back() and then sort() afterward. The first call to sort() will work fine, but after that another sort() can do no better than O(N). This is because the container contents are already mostly sorted to begin with, causing a worst-case condition on the sort algorithm. If your code does a lot of inserts and deletes througout the program, it may be better to use a set/multiset. If you do use a sorted_vector, I think it is fastest to use push_back() and sort() when inserting many items in sequence, but use insert() when inserting one item or only a few. A "few" is defined as few in comparison to the number of items already sorted in the container.
As a performance check of this phenomenon, do the following. Fill the container with 100 items using push_back(), but do not sort them. Make sure they are in some random order. Now add one item with push_back() and call sort(). The time to insert the item and sort is the benchmark, and this first test should be the fastest. Make sure the inserted item lies somewhere in the middle of the sorted set. Now take the same 100 items but sort them to begin with. Then add the one item again with push_back() and call sort() again, timing only the insertion and sort. It should end up much slower. Now take the 100 items again in initial sorted order and add the one item using insert() only, and time the insert() call. If my hunch is correct, it will do better than the second test, and may even be closer to the first. I haven't tried this myself - its just theory.
Ray Virzi
|
| Sign In·View Thread·PermaLink | |
|
|
|
 |
|
|
Thank you for your comment, Ray Virzi. It shows much insight into the problems of STL containers and especially vectors, which are usually implemented as one single memory block, requiring memory moves or copies, when inserting into the middle of the sequence. I think, that a sorted sequence should not be used, when insertions and retrieval operations alternate each other, but think (in accordance with you) that a sorted sequence is useful even for large sets, provided that you initially build the set and then later on onyl retreive from the set. Regards Martin
|
| Sign In·View Thread·PermaLink | |
|
|
|
 |
|
|
The std namespace is reserved for the standard C++ library and library implementor extensions. Anyone else adding to this namespace is prohibited by the standard.
From Section 17.4.3.1, para 1 "It is undefined for a C++ program to add declarations or definitions to namespace std or namespaces within namespace std unless otherwise specified. A program may add template specializations for any standard library template to namespace std. Such a specialization (complete or partial) of a standard library template results in undefined behavior unless the declaration depends on a userdefined name of external linkage and unless the specialization meets the standard library requirements for the original template."
|
| Sign In·View Thread·PermaLink | 2.00/5 (1 vote) |
|
|
|
 |
|
|
 |
|
|
Your interface describes some template members that take templated iterators, but the code only handle vector iterators, and doesn't have templated members.
You should either change your code, or your interface documentation.
|
| Sign In·View Thread·PermaLink | |
|
|
|
 |
|
|
Hi, you are right, the members you are talking about were written by me as template members initially , but on some VC6 versions (depending on the SP) this resulted in an internal compiler error (one of the constructors caused it).
template<class InIt> sorted_vector(InIt first, InIt last, const Pred& comp = Pred(), const A& al = A());
Because of that I changed the code to a non templated version. On VC7, there was no such problem. I think I will change the code in a way, that templated members are used on VC7 but not on VC6 (using condidtional compilation). Expect an update in the next days.
Recently there was a poll asking whether full standard C++ comformance is important. I think it is important. Certainly, if you want to write your own STL components and if you want to use third party software. Useful libraries like 'boost' or 'Loki' will not compile at all under VC6 and must be modified to compile under VC7. Regards Martin
|
| Sign In·View Thread·PermaLink | |
|
|
|
 |
|
|
I know this is a problem in VC6 - this is why I was curious to see how you implemented the interface you gave.
I thinking sometime to put to use such vector instead of associative containers (map & set) following Scott Meyers advise in Item 23 of Effective STL. Meyers give a list of reasons why sorted vector is better - you can add it to your list of resources 
The only problem I have replacing map<> with sorted_vector > is the constness issue, where map<> iterator point to pair and sorted_vector > iterator points to pair. Do you have an idea how such a problem can be solved? I want the compiler to guard against invalid change of keys.
|
| Sign In·View Thread·PermaLink | |
|
|
|
 |
|
|
Intresting, I really tried to preserve the constness of the key, but was not able to get a program, which compiles. I even think it is nearly impossible to store a const element in a vector. But it is possible to preserve the constness when using a tree, because you construct the tree element once and never change or move it later. Please tell me, if you see a solution to this problem. Regards Martin
|
| Sign In·View Thread·PermaLink | |
|
|
|
 |
|
|
Martin Holzherr wrote: Useful libraries like 'boost' or 'Loki' will not compile at all under VC6 and must be modified to compile under VC7
It is true that a lot of Loki will not compile under MSVC6 or 7.0 because it uses partial template specialization which is long overdue is MSVC. However, a lot of Boost is compilable with these 6.0 and 7.0 because of a huge effort by the Boost developers to provide workarounds for the limitations.
FYI, MSVC7.1 (aka Everett) will compile the Boost libraries without any workarounds (it passes all the Boost regression tests). MSVC7.1 finally supports partial template specialization along with other missing standard language features . In fact, Boost was used as a test bed for early alpha releases of the compiler.
|
| Sign In·View Thread·PermaLink | 5.00/5 (1 vote) |
|
|
|
 |
|
|
Craig Henderson wrote: FYI, MSVC7.1 (aka Everett) will compile the Boost libraries without any workarounds (it passes all the Boost regression tests). MSVC7.1 finally supports partial template specialization along with other missing standard language features
Thanks for the info on Boost and MSVC7.1. Do you know if MSVC7.1 compiles Loki?
Neville Franks, Author of ED for Windows. www.getsoft.com
|
| Sign In·View Thread·PermaLink | 5.00/5 (1 vote) |
|
|
|
 |
|
|
General News Question Answer Joke Rant Admin
|