Click here to Skip to main content
Click here to Skip to main content

Fast Indexable Stacks

By , 19 Nov 2005
 

Introduction

The ootl::stack is a template class similar to std::vector which provides extremely fast element growth: O(1) complexity in both the average and worst cases, rather than the O(N) complexity of std::vector in the worst case. The ootl::stack template also provides fast indexing, outperforming std::deque by as much as 6x or more.

The reverse doubling ArrayList

The ootl::stack template is based on a data structure called the "reverse doubling array list" (which as far as I know is original, but if anyone knows otherwise please let me know. The reverse doubling array list is a linked list of arrays, each one twice the size of the next. When the array of the first node (or last node depending on how you want to envision it) is filled to capacity, a new node is created with twice the capacity as the previous. Since the original data isn't moved, only 2N memory is used at any given time, no deallocation occurs, and there is no need for the N destruction and copy-construction operations.

Quite clearly this data structure supports efficient growth, since there are no copies, and less wasted space. However what is somewhat counterintuitive is that this data structure provides, on an average, random element access with constant complexity. The key to achieving indexing with O(1) complexity is the list must be traversed from the biggest node to the smallest.

Analyzing the complexity of indexing

The reverse doubling array list has half of its elements in the first node, 1/4 of its elements in the second node, 1/8 in the third node, and so on until some lower bound is reached. On an average you will find a node you want after two elements, and in the worst case you'll find it in O(Log N) time complexity. Here's why: ignoring lookup operations, the mean number of nodes you need to visit to access an element using an index lookup is expressed by the following equation:

  (1/2)n + (2/4)n + (3/8)n + ... + (i/2^i)n / n

The factors of n in the quotient are the infinite series 1/2 + 2/4 + 3/8 + 4/16 + ... + (i/2^i). This series converges at the value of two. In order to understand why this is so, consider that it is equivalent to the following infinite sum of infinite sums:

  (1/2 + 1/4 + 1/8 + ...) + (1/4 + 1/8 + 1/16 + ...) + 
                              (1/8 + 1/16 + 1/32 + ...) + ...

These infinite series are very well-known and easily recognizable to be equal to:

  1 + 1/2 + 1/4 + ...

Benchmarks

I've run a few informal tests to compare the performance of the described stack class with the std::vector and std::vector implementations which come with the STLPort 4.6.2 implementation of the C++ standard library. The following results were achieved under Windows XP Service Pack 2, running on an Intel Celeron 1.80 GHZ machine with 512 MB of RAM, using the MinGW GCC 3.4 compiler with full optimizations:

// appending 12.8 million integers
std::vector = 1015 msec
std::deque = 734 msec
ootl::stack = 577 msec

// indexing speeds of 12.8 million integers
std::vector = 593 msec
std::deque = 6733 msec
ootl::stack = 1171 msec

These are fairly representative excerpts from the full benchmarking suite.

The interface

The ootl::stack template itself (which is listed later) is quite complex so first here is its simplified interface:

template<typename
class stack
{
public:
    // public
    typedef T value_type;
    typedef stack self;
    
    // 'structors
    stack();
    stack(self& x);
    stack(int nsize, const T& x = T());
    ~stack();
    
    // model the OOTL Indexable concept
    T& operator[](int n);
    int count();
    
    // model the OOTL Stackable concept
    void push(const T& x = T());
    void pop();
    bool is_empty();
    T& top();
    
    // model the OOTL Iterable concept
    tempate<Procedure>
    void for_each(Procedure& p);
};

The implementation

Here's the big ol' implementation:

// Public Domain by Christopher Diggins 
// http://www.ootl.org 

#ifndef OOTL_STACK_HPP
#define OOTL_STACK_HPP

#include <cstdlib>
#include <cassert>
#include <memory>
#include <algorithm>
#include <iostream>

namespace ootl {

    template<
      typename T, 
      int Factor_N = 2, 
      int Initial_N = 8, 
      typename Alloc = std::allocator<T> 
    >
    struct indexable_stack_impl
    {
    public:
      typedef T value_type;
      typedef indexable_stack_impl self;  
    private:
      // local buffer type
      struct buffer {
        buffer(int n, int i = 0, buffer* x = NULL) : 
          size(n), index(i), prev(x), begin(a.allocate(n)), end(begin + n)
        { }
        ~buffer() { 
          a.deallocate(begin, size); 
        }
        template<typename Procedure>
        void for_each(Procedure& proc, T* end) {
          if (prev != NULL) {
            prev->for_each(proc, prev->end);
          }
          T* p = begin;
          while (p != end) {
            proc(*p++);
          }
        }
        int index; 
        int size;
        T* begin;
        Alloc a; 
        T* end;
        buffer* prev;
      };  
    private:
      // begin fields 
      buffer* buff;
      T* last;
      int cnt;
      int cap;
      int init;
    public:
      indexable_stack_impl() { 
        initialize(Initial_N);
      }
      indexable_stack_impl(self& x) { 
        initialize(x.capacity());
        x.for_each(stacker(*this));
      }
      indexable_stack_impl(int nsize, const T& x = T()) { 
        if (nsize >= Initial_N) {
          initialize(nsize);
        }
        else {
          initialize(Initial_N);      
        }    
        last = buff->begin + nsize;
        while (cnt < nsize) {
          buff->a.construct(buff->begin + cnt++, x);
        }
        assert(last == buff->begin + cnt);
      }
      ~indexable_stack_impl() { 
        while (cnt > 0) {      
          pop();
        }
        assert(buff != NULL);
        delete buff; 
      } 
      // implementation of OOTL Indexable
      T& operator[](int n) {
        if (n >= buff->index) { 
          return buff->begin[n - buff->index]; 
        }
        buffer* curr = buff->prev;
      loop:
        if (n >= curr->index) { 
          return curr->begin[n - curr->index]; 
        }
        curr = curr->prev;
        goto loop;
      }
      int count() const {
        return cnt;
      }
      // implementation of OOTL Stack concept
      void push(const T& x = T()) {
        assert(last >= buff->begin);
        assert(last <= buff->end);
        if (last == buff->end) {
          add_buffer();
        }
        buff->a.construct(last++, x);
        ++cnt;
      }
      void pop() {        
        assert(last >= buff->begin);
        assert(last <= buff->end);
        if (last == buff->begin) {
          remove_buffer();
        }
        assert(buff != NULL);
        buff->a.destroy(--last);          
        --cnt;
      }
      bool is_empty() {
        return count() == 0;
      }
      T& top() {
        return *(top_pointer());
      }
      // implementation of OOTL Iterable concept 
      template<typename Procedure>
      void for_each(Procedure& proc) {
        buff->for_each(proc, last);
      }  
    private:
      void initialize(int ncap) {
        assert(ncap >= Initial_N);
        buff = new buffer(ncap); 
        cnt = 0;
        cap = ncap;
        last = buff->begin;
      }
      T* top_pointer() {
        assert(!is_empty());
        assert(last >= buff->begin);
        assert(last <= buff->end);
        if (last == buff->begin) {
          return buff->prev->end - 1; 
        }
        else {
          return last - 1;
        }    
      }
      void add_buffer() {
        buff = new buffer(buff->size * Factor_N, cap, buff);
        cap += buff->size;
        last = buff->begin;
        assert(count() < capacity());
      }    
      void remove_buffer() {
        buffer* tmp = buff;
        cap -= buff->size;
        buff = buff->prev;
        tmp->prev = NULL; 
        last = buff->end;
        delete(tmp);
      }    
      int capacity() const {
        return cap;     
      }
    };
    
    ///////////////////////////////////////////////////////////////
    // A stack_extension contains additional functions which 
    // can be easily derived from a stack
    
    template
    <
      typename Implementation,
      typename Inherited 
    >
    struct stack_extension : Inherited
    {
      // public typedef
      typedef Inherited inh;
      typedef Implementation impl;
      typedef typename inh::value_type value_type;
      
      // constructors
      stack_extension() : inh() { }
      stack_extension(impl& x) : inh(x) { }
      stack_extension(int nsize, const value_type& x = value_type()) : 
                                                       inh(nsize, x) { }
    
      // implementation of OOTL Growable concept
      void grow(int n = 1, const value_type& x = value_type()) {      
        while (n > 0) inh::push(x);
      } 
      // implementation of OOTL Shrinkable concept
      void shrink(int n = 1) {      
        while (n--) inh::pop();
      }  
      // implementation of OOTL Resizable concept  
      void resize(int n, const value_type& x = value_type()) {      
        while (n > inh::count()) inh::push(x);
        while (n < inh::count()) inh::pop();
      }    
      // implementation of OOTL Sortable concept
      bool lt(int i, int j) {
        return inh::operator[](i) < inh::operator[](j);
      }
      void swap(int i, int j) {
        std::swap(inh::operator[](i), inh::operator[](j));
      }
      // utility functions
      void clear() {
        while (!inh::is_empty()) {
          inh::pop();
        }
      }
      void dup() {
        inh::push(inh::top());
      }
      void reverse() {
        int max = inh::count() - 1;
        int n = inh::count() / 2;
        for (int i=0; i < n; ++i) {
          inh::swap(i, max - i);
        }
      }
    };
    
    ///////////////////////////////////////////////
    // The contract checks the preconditions and 
    // postconditions of the functions 
    
    template
    <
      typename Implementation,
      typename Inherited 
    >
    struct stack_contract : Inherited
    {
      // public typedef
      typedef Inherited inh;
      typedef Implementation impl;
      typedef typename inh::value_type value_type;
      
      // constructors
      stack_contract() : inh() { }
      stack_contract(impl& x) : inh(x) { }
      stack_contract(int nsize, value_type x = value_type()) 
        : inh(nsize, x) { }
      
      // public functions
      void pop() {
        int old = inh::count();
        assert(!inh::is_empty());
        inh::pop();
        assert(count() == old - 1);
      }  
      void push(const value_type& x) {
        int old = inh::count();
        inh::push(x);
        assert(inh::count() == old + 1);
      }
      value_type& top() {
        assert(!inh::is_empty());
        value_type& ret = inh::top();
        value_type& tmp = inh::operator[](inh::count() - 1);
        assert(&ret == &tmp);
        return ret;
      }
      int count() {
        int ret = inh::count();
        assert(ret >= 0);
        assert(ret > 0 ? !inh::is_empty() : inh::is_empty());
        return ret;
      }
      bool is_empty() {
        bool ret = inh::is_empty();
        assert(ret ? inh::count() == 0 : inh::count() > 0);
        return ret; 
      }
      value_type& operator[](int n) {      
        assert(n >= 0);
        assert(n < inh::count()); 
        return inh::operator[](n);    
      }
    };
    
    /////////////////////////////////////////////////////////
    // The final version of the stack
    
    #ifndef _NDEBUG
      template
      <
        typename T, 
        typename Implementation = indexable_stack_impl<T>, 
        typename Contract = stack_contract<Implementation, Implementation>,
        typename Extension = stack_extension<Implementation, Contract>
      >
      struct stack : Extension
      {     
        typedef Implementation impl;
        typedef Extension inh;
        stack() : inh() { }
        stack(impl& x) : inh(impl& x) { }
        stack(int nsize, const T& x = T()) : inh(nsize, x) { }
      };
    #else 
      template
      <
        typename T, 
        typename Implementation = indexable_stack_impl<T>,
        typename Extension = stack_extension<Implementation, Implementation>
      >
      struct stack : Extension
      {     
        typedef Implementation impl;
        typedef Extension inh;
        stack() : inh() { }
        stack(impl& x) : inh(x) { }
        stack(int nsize, const T& x = T()) : inh(nsize, x) { }
      };
    #endif
}
#endif

Final words

Unfortunately, the code will not work as-is for versions of Visual C++ earlier than 7.1. You'll have to do some trimming to make it work. The code is public domain, so feel free to do what you want with it. I'd always be appreciative though of a quick note on the forum, just to let me know what kind of application this code finds a home in, it'll make me happy to know. If you want to release a more portable version, I'm sure other readers will appreciate it greatly!

License

This article, along with any associated source code and files, is licensed under A Public Domain dedication

About the Author

Christopher Diggins
Software Developer Autodesk
Canada Canada
Member
This article was written by Christopher Diggins, a computer science nerd who currently works at Autodesk as an SDK specialist.

Sign Up to vote   Poor Excellent
Add a reason or comment to your vote: x
Votes of 3 or less require a comment

Comments and Discussions

 
You must Sign In to use this message board.
Search this forum  
    Spacing  Noise  Layout  Per page   
QuestionHow to limit stack size?memberfilippov.anton28 Jun '06 - 0:45 
Thanks for good aticle!
 
I've question.
How to limit size of stack?
In constructor I set size of preallocation (I think), but when I trying push() size of stack is increasing.
I've override push() method like this:

void push(const T& x = T())
{
if (count() > stack_size) // stack_size set in constructor.
peek_top();
 
Extension::push(x);
}
 
void peek_top()
{
buff->a.destroy(buff->begin); // or destroy(top());
}

 
hovewer stack size increased.
 
Thanks,
Anton.

GeneralBug (of sorts)memberStewart Tootill23 Nov '05 - 5:50 
indexable_stack_impl::operator[] has a bug in it. If you give it a -ve number it will walk off the beginning(?) of the list. I know it is STL convention to not range check operator[] (and therefore not throw), but in that case there should possibly be a throwing at() method.
 
If the argument to operator[] were unsigned (which I assume makes sense since the collection cannot hold a negative number of objects) then instead of walking off the beginning of the collection and dereferencing an internal structure pointer which doesn't exist, out of range calls to operator[] would instead return a reference to an object in the first block (in lookup order, last in logical order) which is beyond its bounds (similar to how the standard implementation of std::vector behaves), which would improve the discoverability of the bug since the inevitable access violation will manifest itself in the (buggy) callers code rather then the as (probably) documented library code (I assume that the documentation would say that behaviour of operator[] is undefined with an out of range index) so callers who hit this problem would blame their own code rather than yours.
 
Otherwise it seems like it would be an improvement over deque for some uses, however I would be interested to see how it performs with erasure, since that is how deque is typically used (in my code at lease)
GeneralRe: Bug (of sorts)memberChristopher Diggins23 Nov '05 - 7:01 
Stewart Tootill wrote:
indexable_stack_impl::operator[] has a bug in it. If you give it a -ve number it will walk off the beginning(?) of the list. I know it is STL convention to not range check operator[] (and therefore not throw), but in that case there should possibly be a throwing at() method.

 
During debug builds the call will be routed through the contract checking class which asserts the validity of the index. This class was simply not designed to throw exceptions, but as you mention it is trivial to add a throwing accessor function.

 
Christopher Diggins

GeneralPreallocation, virtual memory is your friend!memberPedroMC23 Nov '05 - 0:32 
The std::vector appending times can be significantly improved if a big enough chunk of memory is reserved at start. It is important to note that the parts of the reserved memory that are never "touched" will never have physical memory allocated to it and will only waste address space (at least in Linux and Windows).
 
This technique will keep memory fragmentation to a minimum and can significantly reduce memory (re)allocation overhead.
 
This has one limitation, if a program needs to use a big portion of the memory address space for effective memory usage then wasting address space is not a possibility.
 
Regards
 
PMC
GeneralNot understandmemberjesusdesantos22 Nov '05 - 12:56 
I don't understand clearly your intention. To me, you are implementing an std::deque that grows exponentially. I'm not sure if the standard indicates that a std::deque should grow constantly. I don't think so. So you have the efficiency of dequee: O(1) random access, and O(1) insertion (worst)
 
Have you tested your code against visualstudio2005?
GeneralRe: Not understandmemberChristopher Diggins23 Nov '05 - 4:57 
jesusdesantos wrote:
I don't understand clearly your intention.

 
To design a more efficient indexable stack data structure.
 
jesusdesantos wrote:
Have you tested your code against visualstudio2005?

 
No I haven't. I'd be appreciative to see other people's benchmark results.
 


 
Christopher Diggins

QuestionFaster indexing and is Destruct/CopyConstruct really necessary?memberSimon Hofverberg22 Nov '05 - 0:16 

Thanks for another interesting article!
 
I've got two comments though.
 
First, I can't really see why you're using a loop (with a goto, aAARGHHH!!! WTF | :WTF: ) in the operator[] code. The Factor_N and Initial_N are template parameters so you should be able to calculate in a single statement which buffer your item resides in.
But: you have a linked-list to the buffers so it's hard to index one of them. But again: you could replace that with a std::vector ... Wink | ;)
But 2: that single-line statement would need a log call, wouldn't it? And that's perhaps slower than the loop.
But 3: perhaps goto's are fashionable again ...
 

 

Second, you say (in one of the messages) that single-buffer arrays:
 ... have to call the destructor and copy-constructor for all N elements each time it grows. 
But is it really necessary to call first the copy-constructor for each element in the new buffer and then the destructor for each element in the old buffer?
 
From a strict OO view it is of course the correct way to do it, but what's the harm of doing a good old memcpy? Apart from the speed improvement and some ugliness, I can't see any particular harm done. You'll copy the vpointer along with everything else. As far as I can see there are only two situations where something can be corrupted:
1. One of the objects holds a pointer to another of the objects.
2. One of the objects holds a pointer to itself.
 
No 1 will be broken either way, and no 2 can always be replaced with an offset from the this pointer.
 
What do you think about that?
 

Anyway thanks again for a good article.
 
/Simon
 
This is not a signature.
AnswerRe: Faster indexing and is Destruct/CopyConstruct really necessary?memberChristopher Diggins22 Nov '05 - 4:06 

Simon Hofverberg wrote:
First, I can't really see why you're using a loop (with a goto, aAARGHHH!!! ) in the operator[] code. The Factor_N and Initial_N are template parameters so you should be able to calculate in a single statement which buffer your item resides in.
But: you have a linked-list to the buffers so it's hard to index one of them. But again: you could replace that with a std::vector ...
But 2: that single-line statement would need a log call, wouldn't it? And that's perhaps slower than the loop.
But 3: perhaps goto's are fashionable again ...

 
If you prefer, just replace
 
loop:
  ...
  goto loop;
 
with
 
  while (true) {
     ...
  }
 
They are both the same as far as I am concerned.
 
Your single liner would recall that pointers be stored in an indexable stack so a std::vector as you suggest would be a good alternative. That would introduce a bunch of extra code to link in and make templates of. I am not sure it would be faster, maybe you might like to give it a try?
 

Simon Hofverberg wrote:
But is it really necessary to call first the copy-constructor for each element in the new buffer and then the destructor for each element in the old buffer?

 
Well only for certain types. To provide a generic solution I had no choice.
 
Simon Hofverberg wrote:
No 1 will be broken either way, and no 2 can always be replaced with an offset from the this pointer.

 
Number 1 won't be broken in the copy/destruct sequence unless the constructor of the object is broken. The number 2 system would work, but it will be tricky to assure that objects use an offset pointer instead of a real pointer.
 
Simon Hofverberg wrote:
Anyway thanks again for a good article

 
You are very kind, thank you!

 
Christopher Diggins

GeneralRe: Faster indexing and is Destruct/CopyConstruct really necessary?memberSimon Hofverberg23 Nov '05 - 3:17 
Christopher Diggins wrote:
Number 1 won't be broken in the copy/destruct sequence unless the constructor of the object is broken. The number 2 system would work, but it will be tricky to assure that objects use an offset pointer instead of a real pointer.

 
Why won't no 1 be broken? If object A (located in the array) has a pointer to object B (another of the objects in the array) and they all will be relocated (either by memcpy or by copy construct followed by destroy) how can object A keep a valid pointer to object B?
To make that work you need some pretty elaborate stuff involving object B keeping a reference to object A (or anybody else pointing to it) and updating those objects whenever object B is relocated. But such a scheme won't work since object B is not relocated, it is simply destroyed, so you would need something even more elaborate than that to keep the reference in the copy/destruct sequence. Or am I missing something?
 
Can you see any other reason than the no 2 case (an object in the array having a pointer to some of its own members) that would mean that you can't use memcpy instead of copy/destruct?
 
Your class of course avoids this question altogether by not relocating any items, which is a very nice solution.
 
My point is that for arrays/containers that do relocate items you might actually get away with memcpy - something that is as bad as goto Smile | :)
 
/Simon
 
This is not a signature.
GeneralRe: Faster indexing and is Destruct/CopyConstruct really necessary?memberChristopher Diggins23 Nov '05 - 4:54 
Simon Hofverberg wrote:
Why won't no 1 be broken? If object A (located in the array) has a pointer to object B (another of the objects in the array) and they all will be relocated (either by memcpy or by copy construct followed by destroy) how can object A keep a valid pointer to object B?
To

 
Sorry I wasn't being clear. Normally in a design, there will be a copy constructor which takes care of making sure that all pointers are valid. I mistakenly assumed that an object without a valid copy constructor is broken. However I did not account for objects which are designed without any copy constructors.
 
Simon Hofverberg wrote:
Can you see any other reason than the no 2 case (an object in the array having a pointer to some of its own members) that would mean that you can't use memcpy instead of copy/destruct?

 
An object could have a pointer to one of its own members.
 
Simon Hofverberg wrote:
My point is that for arrays/containers that do relocate items you might actually get away with memcpy

 
It would be difficult to manage the offset pointers (pointers and references to other members of the array), as far as I can tell you would have to have extra requirements on the design of objects placed into the class.
 
Simon Hofverberg wrote:
- something that is as bad as goto

 
A) memcpy is not bad, misuse of memcpy is bad.
B) goto is not bad, misuse of goto is bad.
 

 
Christopher Diggins

General General    News News    Suggestion Suggestion    Question Question    Bug Bug    Answer Answer    Joke Joke    Rant Rant    Admin Admin   

Permalink | Advertise | Privacy | Mobile
Web01 | 2.6.130516.1 | Last Updated 19 Nov 2005
Article Copyright 2005 by Christopher Diggins
Everything else Copyright © CodeProject, 1999-2013
Terms of Use
Layout: fixed | fluid