Click here to Skip to main content
15,881,172 members
Articles / Programming Languages / C++
Article

Multi Dimensional Arrays using Template Meta Programming

Rate me:
Please Sign up or sign in to vote.
3.05/5 (8 votes)
20 Mar 2004CPOL3 min read 87K   465   17   20
A Real Multi-Dimensional Array

Introduction

Multi-dimensional arrays provide a nice mathematical data structure for mathematical and soft computing projects. Tensor is usually the data structure used to store data in multi-dimensional arrays. Helper classes helps in design subscripting operators to select rows, cols, and plates. Helper classes may be implemented by adding constant template arguments describing the state of the array, or sub-array. Refer to "Tensor templates", " 2D Matrix Container with [][] indexing", and "Template-Based Helper Classes: An application to Multidimensional Arrays" for more details. A tensor is a uni-dimensional array with indexing facility that allows manipulation as N-dimensional matrix. Computational cost for accessing a specific element grows exponentially as the number of dimensions increases. Helper classes force many constructors/destructors calls for temporary objects. Actually helper classes provide a nice solution for little projects with short loops.

In mathematics, finding a solution in some space costs a lot. One way to find fast solution is transforming the problem (actually the search space) into a larger space where more dimensions may lead to faster solution.

Now, consider the following piece of code:

int* pi1=new int[20];
int** pi2=new int*[4];
for(int i=0; i<4; i++)
    pi2[i]=new int[5];

Now, what do you think? How shall I access the element located in 2nd row and 3rd col? How much does it cost for each of them?

Obviously, pi2[2][3] is faster than pi1[2*4+3].

How about very large N-dimensional arrays with complex objects!? Can you imagine this!?

"Template-Based Helper Classes: An application to Multidimensional Arrays" provides a nice solution, but subscripting operators cost a lot for computations, costs much if repeated in a very long loop, uses lots of helper temporary classes, and fragments the memory.

As described in linear algebra, adding more dimensions fastens the access to data elements.

Problem Statement

How shall we develop a template class that generates multi-dimensional pointers!!?

int i; 
int* pi; 
int** ppi; 
int*** pppi; 
...

How shall we add number of dimensions as a template argument!?

Basic Solution

The basic idea makes use of Template Meta-Programming, implementing a template class with argument M indicating the number of nested pointers.

Consider the following class:

template<_sizetype M>
class multi_ptr
{
public:
    typedef double T;
    typedef multi_ptr<M-1>::pointer* pointer; 
};

class multi_ptr<0>
{
public:
    typedef double T;
    typedef T pointer;
};

Now, consider how to declare a multi-dimensional pointer:

multi_ptr<3>::pointer p3;

Now, more facilities are needed; we need to add class template argument T indicating the type of this array.

More Enhancement - Adding Class Template Argument

Template Meta-Programming works only for single argument template classes. Adding a class template argument cannot be achieved directly.

An outer class should be used to pass the template argument through. Consider the following multi_array class:

template<class T, _sizetype N>
class multi_array
{
public:
    template<_sizetype M>
    class multi_ptr
    {
    public:
        typedef multi_ptr<M-1>::pointer* pointer; 
    };
    
    class multi_ptr<0>
    {
    public:
        typedef T pointer;
    };

    typedef multi_ptr<N>::pointer pointer;
    typedef const pointer const_pointer;
};

Class template argument T is passed through the outer multi_array class with the number of dimensions N. The inner multi_ptr class makes use of the outer template argument T for type specialization.

Now the very important question is:

How shall we allocate this multi dimensional array!? We need to specialize nested allocating loops!!

How shall we build a varying number of nested allocating loops!?

We need a little addition for allocation!!

Little Enhancement - Adding Allocators

A simple compile-time loop provides a nice solution for generating a variable number of nested loops. An inner class called multi_allocator is responsible for allocating multi-dimensional pointers.

template<_sizetype M>
class multi_allocator
{
public:
    static multi_ptr<M>::pointer alloc(_sizetype* dims)
    {
        _sizetype i;
        multi_ptr<M>::pointer rtn;
        rtn=new multi_ptr<M-1>::pointer[dims[0]];
        for(i=0; i<dims[0]; i++)
            rtn[i]=multi_allocator<M-1>::alloc(dims+1);
        return rtn;
    }
};
    
class multi_allocator<1>
{
public:
    static multi_ptr<1>::pointer alloc(_sizetype* dims)
    {
        multi_ptr<1>::pointer rtn;
        rtn=new T[dims[0]];
        return rtn;
    }
};

Now, fine touches is needed to finalize the multi_array class. Constructors, casting operators, and data members should be added.

Finalizing multi_array - Adding constructors and casting operators

typedef unsigned int _sizetype;
    
template<class T, _sizetype N>
class multi_array
{
public:
    template<_sizetype M>
    class multi_ptr
    {
    public:
        typedef multi_ptr<M-1>::pointer* pointer; 
    };
    
    class multi_ptr<0>
    {
    public:
        typedef T pointer;
    };


    template<_sizetype M>
    class multi_allocator
    {
    public:
        static multi_ptr<M>::pointer alloc(_sizetype* dims)
        {
            _sizetype i;
            multi_ptr<M>::pointer rtn;
            rtn=new multi_ptr<M-1>::pointer[dims[0]];
            for(i=0; i<dims[0]; i++)
                rtn[i]=multi_allocator<M-1>::alloc(dims+1);
            return rtn;
        }
    };
    
    class multi_allocator<1>
    {
    public:
        static multi_ptr<1>::pointer alloc(_sizetype* dims)
        {
            multi_ptr<1>::pointer rtn;
            rtn=new T[dims[0]];
            return rtn;
        }
    };
    
    typedef multi_ptr<N>::pointer pointer;
    typedef const pointer const_pointer;
    typedef multi_allocator<N> allocator;

    
    operator pointer(){return _pdata;}
    
    multi_array(const _sizetype d1, ...)
    {
        memcpy(_pdims, &d1, N*sizeof(_sizetype));
        _pdata=allocator::alloc(_pdims);
    }
    
    multi_array(const _sizetype* pdims)
    {
        memcpy(_pdims, pdims, N*sizeof(_sizetype));
        _pdata=allocator::alloc(_pdims);
    }
    
protected:
    pointer _pdata;
    _sizetype _pdims[N+1];

};

Thank You!!

Thanx a lot for your time. I hope you enjoyed this idea!! Hear from you soon:).

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Engineer
Australia Australia
This member has not yet provided a Biography. Assume it's interesting and varied, and probably something to do with programming.

Comments and Discussions

 
Generalerror using g++ 4.4.5 Pin
axllaruse10-May-11 2:55
axllaruse10-May-11 2:55 
QuestionDeallocate? Pin
GregoryBook29-Sep-06 15:50
GregoryBook29-Sep-06 15:50 
AnswerRe: Deallocate? Pin
Mo Hossny23-Oct-06 18:27
Mo Hossny23-Oct-06 18:27 
Generalinvalid template argument 'N' Pin
SMJT17-Sep-04 7:31
SMJT17-Sep-04 7:31 
GeneralRe: invalid template argument 'N' Pin
SMJT17-Sep-04 7:38
SMJT17-Sep-04 7:38 
GeneralRe: invalid template argument 'N' Pin
Mo Hossny17-Sep-04 12:24
Mo Hossny17-Sep-04 12:24 
GeneralRe: invalid template argument 'N' Pin
SMJT17-Sep-04 21:35
SMJT17-Sep-04 21:35 
GeneralRe: invalid template argument 'N' Pin
Mo Hossny19-Sep-04 9:12
Mo Hossny19-Sep-04 9:12 
GeneralCompiler error on VC++.NET 2003 Pin
Rui A. Rebelo8-Apr-04 4:11
Rui A. Rebelo8-Apr-04 4:11 
GeneralRe: Compiler error on VC++.NET 2003 Pin
Mo Hossny10-Apr-04 22:01
Mo Hossny10-Apr-04 22:01 
GeneralRe: Compiler error on VC++.NET 2003 Pin
Mo Hossny3-Jun-04 1:56
Mo Hossny3-Jun-04 1:56 
Questionpi2[2][3] is faster than pi1[2*4+3] ?? Pin
sahn021-Mar-04 3:20
sahn021-Mar-04 3:20 
AnswerRe: pi2[2][3] is faster than pi1[2*4+3] ?? Pin
Mo Hossny21-Mar-04 3:23
Mo Hossny21-Mar-04 3:23 
GeneralRe: pi2[2][3] is faster than pi1[2*4+3] ?? Pin
Tim Smith21-Mar-04 6:14
Tim Smith21-Mar-04 6:14 
GeneralRe: pi2[2][3] is faster than pi1[2*4+3] ?? Pin
Mo Hossny21-Mar-04 6:26
Mo Hossny21-Mar-04 6:26 
GeneralRe: pi2[2][3] is faster than pi1[2*4+3] ?? Pin
Tim Smith21-Mar-04 7:16
Tim Smith21-Mar-04 7:16 
GeneralRe: pi2[2][3] is faster than pi1[2*4+3] ?? Pin
Don Clugston21-Mar-04 15:28
Don Clugston21-Mar-04 15:28 
GeneralRe: pi2[2][3] is faster than pi1[2*4+3] ?? Pin
Nitron22-Mar-04 3:16
Nitron22-Mar-04 3:16 
GeneralRe: pi2[2][3] is faster than pi1[2*4+3] ?? Pin
John M. Drescher22-Mar-04 3:58
John M. Drescher22-Mar-04 3:58 
GeneralRe: pi2[2][3] is faster than pi1[2*4+3] ?? Pin
Mo Hossny22-Mar-04 22:58
Mo Hossny22-Mar-04 22:58 
sorry 4 being that late Blush | :O .

i guess
<br />
    int GetAt (int pi1 [20], int i, int j)<br />
    int GetAt (int pi2 [4][5], int i, int j)<br />


are the same, multi-dimensional static arrays r actually implemented as a single uni-dimensional array. Thats why u cant pass an int[4][5] into an int** parameter.

I still find ur comparison unfair!! U ve to compare m/c code for an int** vs int*.

And i agree with u!! LEA instruction captures the effective address in one shot Blush | :O .

However consider this for a highr dimensional array. I work in movie enhancement. This means working in at least 4-dimensional arrays (2-dims for spatial coordinates in each frame, 1-dim for color components, and additional 1-dim for timing). Now consider one more dimension to work on multiple movies!!! Confused | :confused: Unsure | :~ Eek! | :eek: Sigh | :sigh:

Acessing a single color component in some frame in ... costs toooo much as i thought!!

I guess multi-dimensional arrays provides acess via 5 hops!!

Now consider this trial Smile | :)
<br />
int getat(int* p1, int i, int j, int k, int l)<br />
{<br />
    return p1[l*60+k*12+i*4+j];<br />
}<br />
<br />
DEBUG CODE GENERATION<br />
	mov	eax, DWORD PTR _l$[ebp]<br />
	imul	eax, 60					; 0000003cH<br />
	mov	ecx, DWORD PTR _k$[ebp]<br />
	imul	ecx, 12					; 0000000cH<br />
	add	ecx, DWORD PTR _j$[ebp]<br />
	add	ecx, eax<br />
	mov	edx, DWORD PTR _i$[ebp]<br />
	lea	eax, DWORD PTR [ecx+edx*4]<br />
	mov	ecx, DWORD PTR _p1$[ebp]<br />
	mov	eax, DWORD PTR [ecx+eax*4]<br />
<br />
<br />
RELEASE OPTIMIZED CODE GENERATION<br />
	mov	eax, DWORD PTR _l$[esp-4]<br />
	mov	ecx, DWORD PTR _k$[esp-4]<br />
	lea	edx, DWORD PTR [ecx+eax*4]<br />
	mov	ecx, DWORD PTR _i$[esp-4]<br />
	add	eax, edx<br />
	lea	edx, DWORD PTR [ecx+eax*2]<br />
	mov	ecx, DWORD PTR _j$[esp-4]<br />
	add	eax, edx<br />
	lea	edx, DWORD PTR [ecx+eax*4]<br />
	mov	eax, DWORD PTR _p1$[esp-4]<br />
	mov	eax, DWORD PTR [eax+edx*4]<br />


and a multi-dimensional array like
<br />
int getat(int**** p2, int i, int j, int k, int l)<br />
{<br />
    return p2[i][j][k][l];<br />
}<br />
<br />
DEBUG CODE GENERATION<br />
	mov	eax, DWORD PTR _i$[ebp]<br />
	mov	ecx, DWORD PTR _p2$[ebp]<br />
	mov	edx, DWORD PTR [ecx+eax*4]<br />
	mov	eax, DWORD PTR _j$[ebp]<br />
	mov	ecx, DWORD PTR [edx+eax*4]<br />
	mov	edx, DWORD PTR _k$[ebp]<br />
	mov	eax, DWORD PTR [ecx+edx*4]<br />
	mov	ecx, DWORD PTR _l$[ebp]<br />
	mov	eax, DWORD PTR [eax+ecx*4]<br />
<br />
RELEASE OPTIMIZED CODE GENERATION<br />
	mov	eax, DWORD PTR _i$[esp-4]<br />
	mov	ecx, DWORD PTR _p2$[esp-4]<br />
	mov	edx, DWORD PTR [ecx+eax*4]<br />
	mov	eax, DWORD PTR _j$[esp-4]<br />
	mov	ecx, DWORD PTR [edx+eax*4]<br />
	mov	edx, DWORD PTR _k$[esp-4]<br />
	mov	eax, DWORD PTR [ecx+edx*4]<br />
	mov	ecx, DWORD PTR _l$[esp-4]<br />
	mov	eax, DWORD PTR [eax+ecx*4]<br />


u got my point!!

i know nothing about cache optimization, but i guess multi-dimensional arrays are more intuitive, faster to compile and optimize, and faster in terms of cpu clock tics at high orders.

thx alot for time, salam Smile | :) .

Yours, M. Hossny

General General    News News    Suggestion Suggestion    Question Question    Bug Bug    Answer Answer    Joke Joke    Praise Praise    Rant Rant    Admin Admin   

Use Ctrl+Left/Right to switch messages, Ctrl+Up/Down to switch threads, Ctrl+Shift+Left/Right to switch pages.