How can I further optimise the for loop in the code below? Is there also a way to avoid the declarations inside the loop, i.e. int x0, x1, etc.?

C++
void tom::add(void* btr)
{
    __declspec(align(16)) short* b = (short*)btr;
    int j;

    for (j = 0; j < 16; j += 4)
    {
        /// 1st stage transform.
        int x0 = (int)(b[j]   + b[j+3]);
        int x3 = (int)(b[j]   - b[j+3]);
        int x1 = (int)(b[j+1] + b[j+2]);
        int x2 = (int)(b[j+1] - b[j+2]);

        /// 2nd stage transform.
        b[j]   = (short)(x0 + x1);
        b[j+2] = (short)(x0 - x1);
        b[j+1] = (short)(x2 + (x3 << 1));
        b[j+3] = (short)(x3 - (x2 << 1));
    }
}

Yeah, put the declarations outside the loop.

You'd also save a lot of time by not casting in the loop. If you need an int for the calculations, cast the incoming parameter to an int before you start the loop, do the loop, and then cast the result back to a short after the loop.
 
 
Declaring the ints inside the loop does not affect performance. Stack space for such locals is reserved once, when you enter the function, not on every iteration. An int also has no constructor, which is the crucial part: declaring an int does not generate any code.

Objects with non-trivial constructors are another matter, and deserve special attention in loops where performance matters.
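To illustrate the point about non-trivial constructors, here is a minimal sketch. The function names and the sum-of-squares task are made up for illustration and have nothing to do with the original code. The first version constructs a std::vector on every iteration (typically one heap allocation and one deallocation per pass), the second hoists it out of the loop and reuses its storage.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical example: sum the squares of every element, row by row,
// going through a scratch buffer.

// The vector is constructed and destroyed on each iteration.
long long sum_squares_ctor_in_loop(const std::vector<std::vector<int>>& rows)
{
    long long total = 0;
    for (const std::vector<int>& row : rows) {
        std::vector<int> scratch(row.size());   // non-trivial constructor runs here
        for (std::size_t i = 0; i < row.size(); ++i)
            scratch[i] = row[i] * row[i];
        for (int v : scratch)
            total += v;
    }                                           // destructor runs here
    return total;
}

// The vector is constructed once; assign() can reuse the existing capacity.
long long sum_squares_ctor_hoisted(const std::vector<std::vector<int>>& rows)
{
    long long total = 0;
    std::vector<int> scratch;
    for (const std::vector<int>& row : rows) {
        scratch.assign(row.begin(), row.end());
        for (int& v : scratch)
            v *= v;
        for (int v : scratch)
            total += v;
    }
    return total;
}
```

Both functions return the same value. Whether the hoisted one is measurably faster depends on the allocator and on the sizes involved, so measure before relying on it.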

Also, watch out for overflow in statements like this:
(int)(b[j] + b[j+3]);
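A small sketch of where the overflow can actually bite, assuming int is wider than short (true for MSVC and the other mainstream compilers). The two short operands are promoted to int before the addition, so the sum itself is safe; the risk is in converting a result back to short afterwards, as the second stage of the loop does. The helper names below are hypothetical.

```cpp
#include <climits>

// Returns true when v can be stored in a short without loss.
bool fits_in_short(int v)
{
    return v >= SHRT_MIN && v <= SHRT_MAX;
}

// First-stage sum from the question, kept as an int.
int stage1_sum(short a, short b)
{
    return a + b;   // both operands are promoted to int before the addition
}
```

For example, stage1_sum(30000, 30000) is 60000, which is a perfectly good int but does not fit in a short. Casting it to short gives an implementation-defined result before C++20 and wraps modulo 2^16 from C++20 onwards, so a check like fits_in_short in a debug build can catch inputs that are out of range.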


If you have nothing better to do, you could also unroll the loop.
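Unrolling could look roughly like the sketch below. The function names are made up, and the block is assumed to be 16 shorts as in the question. The row helper does the same arithmetic as the loop body, except that it multiplies by 2 instead of shifting, which gives the same result and avoids left-shifting a negative value.

```cpp
// One 4-element row transform, same arithmetic as the loop body in the question.
inline void transform_row(short* r)
{
    // 1st stage: the shorts are promoted to int automatically.
    const int x0 = r[0] + r[3];
    const int x3 = r[0] - r[3];
    const int x1 = r[1] + r[2];
    const int x2 = r[1] - r[2];

    // 2nd stage.
    r[0] = static_cast<short>(x0 + x1);
    r[2] = static_cast<short>(x0 - x1);
    r[1] = static_cast<short>(x2 + 2 * x3);
    r[3] = static_cast<short>(x3 - 2 * x2);
}

// Manually unrolled: four rows, no loop counter and no branch.
void transform_unrolled(short* b)
{
    transform_row(b);
    transform_row(b + 4);
    transform_row(b + 8);
    transform_row(b + 12);
}

// Loop version, kept for comparison.
void transform_loop(short* b)
{
    for (int j = 0; j < 16; j += 4)
        transform_row(b + j);
}
```

With a fixed trip count of four, an optimising compiler will quite possibly unroll the loop version on its own, so compare the generated code or time both before keeping the manual version.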

Edit:
There is something else bothering me about your code. Why does the function take a void* argument if the implementation only works with a short*? Would it not be better to take a short* and let the caller, who probably knows what the void pointer really points to far better than add() does, do the conversion?
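Something along these lines is what I mean. It is only a sketch: add_rows and process_opaque_buffer are hypothetical names, written as free functions rather than members of tom, and the buffer is assumed to hold 16 shorts.

```cpp
// The transform states its real requirement in the signature.
void add_rows(short* b)
{
    for (int j = 0; j < 16; j += 4)
    {
        const int x0 = b[j]     + b[j + 3];
        const int x3 = b[j]     - b[j + 3];
        const int x1 = b[j + 1] + b[j + 2];
        const int x2 = b[j + 1] - b[j + 2];

        b[j]     = static_cast<short>(x0 + x1);
        b[j + 2] = static_cast<short>(x0 - x1);
        b[j + 1] = static_cast<short>(x2 + 2 * x3);
        b[j + 3] = static_cast<short>(x3 - 2 * x2);
    }
}

// The caller, which knows what the opaque pointer holds, does the conversion.
void process_opaque_buffer(void* raw)
{
    add_rows(static_cast<short*>(raw));
}
```

Code that already has a short* can then call add_rows directly with no cast at all, and passing the wrong pointer type becomes a compile error instead of a silent bug.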
 

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


