Understanding STL

Jais Joy

Rate me:

3.00/5 (24 votes)

18 Dec 20039 min read

58.6K

This article explains STL internals.

Introduction

STL has a lot of powerful features which are undiscovered for many programmers. Complexity of templates stops people from discovering the scope of STL. Here, I am removing that complexity by explaining with template’s generated code with commonly used data types. This article intends to trigger you for exploring the internals of STL and use its powerful features. Once you understand the features of STL, you will get addicted of using it in your development. Explaining all the container and algorithm internals are out of the scope of this article. I choose vector and some sample algorithms to explain the usage of STL.

What is STL?

The Standard Template Library (STL) is a general-purpose C++ library of algorithms and data structures, originated by Alexander Stepanov and Meng Lee. The STL, based on a concept known as generic programming, is part of the standard ANSI C++ library. The STL is implemented by means of the C++ template mechanism, hence its name. While some aspects of the library are very complex, it can often be applied in a very straightforward way, facilitating reuse of the sophisticated data structures and algorithms it contains.

Why should I learn STL?

STL gives generic collections and algorithms which can be applied on those collections. STL code can be used across the platforms. Since STL containers and algorithms are C++ templates, you are able to use any data type, if it satisfies the requirement of the templates. You can make generic code which will accept all types of container classes. Examples of STL containers are deque, list, map, multimap, multiset, set, and vector; and examples of algorithms are find, copy, remove, max, sort, and accumulate.

Vector and Algorithm Relation

Let us try to have an idea of how algorithm ‘remove’ works on a vector. Vector is an STL container which grows dynamically and have similar features of an array.

#include <iostream>
#include <vector>
#include <deque>
#include <list>
#include <algorithm>
using namespace std;
void main()
{  
    typedef /*list/deque*/vector<int > intVect ;
    typedef intVect::iterator intVectIt ;
    intVectIt end ,it,last;
            
    intVect Numbers ;  
    Numbers.push_back ( 10 );Numbers.push_back ( 33 ); Numbers.push_back(50);
    Numbers.push_back(33);    
    cout << "Before calling remove" << endl ;
    end = Numbers.end();
    cout << "Numbers { " ;
    for( it = Numbers.begin(); it != end; it++)
        cout << *it << " " ;
    cout << " }\n" << endl ;
    last = remove(Numbers.begin(), end, 33) ;
    cout << "After calling remove" << endl ;
    cout << "Numbers { " ;
    for(it = Numbers.begin(); it != last; it++)
        cout << *it << " " ;
    cout << " }\n" << endl ;
}

Output of the program

Before calling remove
Numbers { 10 33 50 33  }
 
After calling remove
Numbers { 10 50  }

In the above example, we insert 10, 33, 50, and 33 into vector Numbers. Algorithm ‘remove’ is used to remove all the elements which has a value 33. ‘remove’ uses iterator of vector for this operation. iterators have similarities to pointers and are used to traverse an array of objects. In implementation of vector<int>, it's iterator is int*.

Now, change vector to deque or list in the first line. You can see our program is working exactly same as in the vector case! For changing the container implementation, we had to change only one line of code. This is one of the big benefits which is offered by STL. Now, take the above piece of code to some other operating system (I tested with Windows and Linux ), build and run it. Without any change, it works! This is another big advantage STL offers. You can try different data types instead of int for vector container.

Constructors of vector

Now, let us look at how default constructor and constructor with an integer argument work. I will be using object of the following class for inserting into the vector.

class A
{
    int iValue;
    A()
    {
        cout << "A()" << endl;
    }
};

Class vector<A> has four member variables: allocator, _First, _Last, and _End. allocator is of type std::allocator<A>, and _First, _Last, and _End are iterators which come out to be A*. Default constructor of vector<A> initializes allocator with std::allocator<A> object. _First, _Last, _End are initialized to null pointers.

Now, let us see how constructor vector(size_type _N, const _Ty& _V = _Ty(), const _A& _Al = _A()) works. By removing template variables for class A, prototype of the constructor becomes:

vector ( unsigned int _N , const A& _V = A(), 
  const std::allocator<A> & _A1 = std::allocator<A> () )

This constructor does the following things to initialize its member variables:

_First = allocator.allocate(_N, (void *)0);
_Ufill(_First, _N, _V);
_Last = _First + _N;
_End = _Last;

allocator.allocate calls template function _Allocate passing number of objects to be created (_N). For class A, this template function is generated to A* _Allocate ( int , A * ).

This function allocates _N * sizeof ( A ) memory and returns the starting address. This is assigned to _First. _Ufill does something like this:

for (; 0 < _N; --_N, ++_F)
    allocator.construct(_F, _X);

Here, _F ( of type A* ) points to the address where object to be constructed from _X through A ( const A & ).

_Ufill copies the one object constructed (const reference to object is _V. Refer line 2.) to all _N locations. allocator.construct calls template function _Consturct passing the same arguments. A new object is created at location _F from reference to object _X (new placement operator is used here). In line 3, _Last is assigned to address after last object. _End and _Last point to the same location.

Now, let us have a look at how the destructor works. Destructor of the vector does the following things:

_Destroy(_First, _Last);
allocator.deallocate(_First, _End - _First);
_First = 0, _Last = 0, _End = 0;

_Destroy calls allocator.destroy ( _F ) for each A* _F from _First through _Last. This function will call destructor ~A() explicitly to destroy the object (remember, we allocated using placement new operator). allocator.deallocate deletes the memory pointed to by (_First) which will free memory allocated for all the objects in the vector.

How push_back works?

Most complicated and most frequently used function in a vector is push_back. Let us analyze how push_back works? push_back will insert new data at the end of the vector. It calls insert ( end(), _X). (Member function end() returns _Last and _X is reference to the object (class A) to be added.) insert calls another overloaded function insert (_Last , 1 , _X).

For class A, this function definition becomes:

void insert(A* _P, unsigned int _M, const A& _X)
{
    if (_End - _Last < _M)
    {
        unsigned int _N = size() + (_M < size() ? size() : _M);
        A* _S = allocator.allocate(_N, (void *)0);
        A* _Q = _Ucopy(_First, _P, _S);
        _Ufill(_Q, _M, _X);
        _Ucopy(_P, _Last, _Q + _M);
        _Destroy(_First, _Last);
        allocator.deallocate(_First, _End - _First);
        _End = _S + _N;
        _Last = _S + size() + _M;
        _First = _S;
    }
    else if (_Last - _P < _M)
    {
        _Ucopy(_P, _Last, _P + _M);
        _Ufill(_Last, _M - (_Last - _P), _X);
        fill(_P, _Last, _X);
        _Last += _M; 
    }
    else if (0 < _M)
    {
        _Ucopy(_Last - _M, _Last, _Last);
        copy_backward(_P, _Last - _M, _Last);
        fill(_P, _P + _M, _X);
        _Last += _M; 
    }
}

Before explaining this code, we should look at the difference between _End and _Last. _End is the end of the buffer allocated to the vector, and _Last is the end of the last value inserted. To make it clear, take the case of pushing back object of A to a vector which currently contains three objects, and assume _End and _Last point to the same location. In this case, allocator will allocate memory for 3+3 objects even though we have to insert only one. This is to reduce reallocations. After push_back, _End will point to the end of 6th object, and _Last will point to the end of the 4th object. During push_back, insert checks whether reallocation is needed ((_End - _Last < _M)). If reallocation is needed, it allocates double size or size() + _M, whichever is higher. It copies all the data to the new buffer and removes the earlier buffer. It then inserts new data at the _Last position and reassigns _Last and _End. If we have enough space to insert a new element (i.e., _End - _Last > _M), new element is inserted at _Last, and _Last is reassigned.

STL Algorithms (how it works with a vector)

Before understanding algorithms, we will have to learn how function objects work. Class of a function object defines operator()(). The result is, a template function can not detect whether you passed a pointer to a function or object of a class having operator()(). Following example will give you a clear idea about what are function objects:

#include <iostream.h>
void  f ()
{
    cout << "f()" << endl;
}
 
class X
{
public:
    void operator()()
    {
        cout << "X::operator()" << endl;
    }
};
 
template < class T >
void test_func ( T f1 )
{
    f1();
}
void main()
{
    X a;
    test_func(f );
    test_func(a);
}

Output of the program is:

f()
X::operator()

Function objects can be classified as Generator (no argument), UnaryFunction (single argument), and BinaryFunction (takes two arguments). A special case of unary and binary functions are predicates (UnaryPredicate, BinaryPredicate) which simply means function returns a bool. STL has, in the header file <functional>, a set of templates that automatically create function objects for you. It is powerful not only because it’s a reasonably complete library of tools, but also because it provides a vocabulary for thinking about problem solutions, and because it is a framework for creating additional tools.

How copy works?

Problem: copy all the data of one vector<A> to another vector<A>.

Analysis: For class A, function definition becomes (you have to pass arguments which support operator ++ () and operator *(). You can apply * and ++ to iterators.):

copy ( A* first , A* last , A* x )

Function copy evaluates *(x+N) = * ( first + N ) for all N in the range [0, last-first]. It returns x+N. Consider vector<A> objects sourceA, DestA. For copying all data from sourceA to DestA, you have to call copy (sourceA.begin() , sourceA.end() , DestA.begin()).

Before calling copy, ensure that destination vector has enough space to accommodate all the data in source vector.

How transform works?

Problem: you have a vector<int> of three members and you want to store square of each element in another vector<int>.

Solution: following piece of code does the job for you:

int square_it ( const int x)
{
    return x*x;
}
void main ()
{
    vector < int > v1, v2(3);
    v1.push_back(2);
    v1.push_back(5);
    v1.push_back(83);
    transform ( v1.begin() , v1.end() , v2.begin() , square_it );
    for ( vector<int>::iterator it = v2.begin(); it != v2.end() ; it++ )
        cout << *it;
}

For the above code, function prototype generated for transform is:

int * trnsform ( int * First , int * Last , int* x, int (*f) (const int) )

transform evaluates *(x + N ) = square_it (*(First + N ) ) for all N in the range [0, Last - First]. Note: v2 has enough memory allocated (memory for three integers) before calling transform.

How generate works?

Problem: insert three ascending numbers in a vector without using push_back.

Solution: following piece of code does the job for you:

int f ()
{
    static int x = 1;
    return ++x;
}
void main ()
{
    vector < int > v2(3);
    generate ( v2.begin() , v2.end() , f );
    for ( vector<int>::iterator it = v2.begin(); it != v2.end() ; it++ )
        cout << *it;
}

For the above code, generated function prototype for ‘generate’ is:

void generate ( int* first , int* last , int (*f)() )

generate evaluates *(first + N ) = f ( ) for all N in the range [0 , last - first].

How replace_if works?

Problem: change all 0 to -1 in the vector<int>.

Solution: assume vector<int> v contains integers 1, 2, 0, 6, 0. Following code will change all the 0s to -1:

int f (const int x)
{
   if ( x ==  0 )
       return true;
   else
       return false;
}
replace_if( v.begin() , v.end() , f , -1);

Generated prototype for replace_if is:

void replace_if ( int * first , int * last , int (*f) ( const int) , -1 )

For all N in the range [0, last-first], replace_if evaluates to:

if ( f ( *(first+N) ) )
    *(first + N ) = -1;

How for_each works?

Problem: print all the members of vector.

Solution: following piece of code does the work for you:

void f (const int x)
{
    cout << x << endl;
}
for_each ( v.begin() , v.end() , f);

Prototype generated for the above code is:

void for_each ( int * first , int* last , void (*f) (const int ) );

for_each will evaluate to:

f ( *( first + N ) ) for all N in the range [0 , last - first ]

How fill works?

Problem: fill vector<int> v with value 10.

Solution: following code does the work for you:

fill ( v.begin() , v.end() , 10 );

Generated function prototype for the above invocation of fill is:

void fill(int * first, int * last, const int &) ;

fill evaluates *(first + N) = x once for each N in the range [0, last - first].

How count_if works?

Problem: count number of '2's in an integer array.

Solution: following code does the work for you:

bool f2 ( const int x )
{
    if ( x == 2 )
        return true;
    else
        return false;
}

Assume vector<int> v contains integers. Following code will return number of '2's in the vector:

int iNoof2s = count_if( v.begin() , v.end() , f2 );

Generated function prototype for the above code is:

unsigned int count_if (  int * first , int * last , bool (*fn) ( const int ) );

count_if sets a count n to zero. It then executes ++n for each N in the range [0, last - first] for which the predicate fn(*(first + N)) is true. It evaluates the predicate exactly last - first times.

Conclusion

You have got an idea about how vector and some of the algorithms work. Try using other containers and explore more algorithms. You will discover more interesting things. Happy coding with STL.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.

A list of licenses authors might use can be found here

Written By

Jais Joy

Web Developer

United States

Close to 12 years in software industry.Currently working for IBM. ( Any of my article does not represent IBM but it is purely individual perspective. )

Looking forward to contribute to software industry.

Use Ctrl+Left/Right to switch messages, Ctrl+Up/Down to switch threads, Ctrl+Shift+Left/Right to switch pages.