See more: C++

I have a 3d structure that contains a Byte variable cCtrlVal.
I would like to set this at runtime to 255 as fast as possible.
There are 5760 instances of cCtrlVal.

Currently I am looping through the matrix with 3 nested for loops.

Is there a better, quicker way to set this variable?
Thanks in advance.


struct typControlMatrix
{
	BOOL bEnabled;
	BOOL bChanged;
	BOOL LastChange;
	BOOL AdjStripNo;
	BYTE cCtrlVal;
	enum enControlType eCtrlType;
	DWORD lCtrlNo;
	CString sCtrlVal;
	BOOL bDescChanged;
	int TrackPage;
	int CompRatioIndex;
	DWORD lStripNo;
	int lParamNo;
} m_tControlMatrix[128][5][9];
Posted 21-Nov-12 16:33pm
Ron Anders12.2K
Updated 21-Nov-12 18:45pm
Mohibur Rashid 22-Nov-12 0:48am
I don't think you have any other choice.
Agree; this is easy to make worse, but hard to improve.
By the way, using ++index instead of the usual index++ in the "for" statement may make it faster, although for plain integer counters most compilers generate the same code either way -- try it...

Solution 2

With nv3's suggested 'flat approach', using pointers I obtained a somewhat surprising result (about 3x speed improvement):

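  // The timing calls did not survive in the post; QueryPerformanceCounter
  // is assumed here from the use of QuadPart below.
  LARGE_INTEGER t[4];
  register int i, j, k;
  //-> 3 loops
  QueryPerformanceCounter(&t[0]);
  for (i=0; i<128; i++)
    for (j=0; j<5; j++)
      for (k=0; k<9; k++)
        m_tControlMatrix[i][j][k].cCtrlVal = 255;
  QueryPerformanceCounter(&t[1]);
  //-> flat pointers
  QueryPerformanceCounter(&t[2]);
  // p steps through memory one whole struct at a time, so it always
  // lands on the cCtrlVal byte of the next element.
  register BYTE * p = &m_tControlMatrix[0][0][0].cCtrlVal;
  register BYTE * q = p + 5760 * sizeof(typControlMatrix);
  while (p < q)
  {
    *p = 255;
    p += sizeof(typControlMatrix);
  }
  QueryPerformanceCounter(&t[3]);
  CString s;
  s.Format("3 loops: %I64d flat pointers: %I64d speed ratio %g ", (t[1].QuadPart-t[0].QuadPart), (t[3].QuadPart-t[2].QuadPart), ((double)(t[1].QuadPart-t[0].QuadPart))/(t[3].QuadPart-t[2].QuadPart));
  MessageBox(s, "Test");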

The output:
3 loops: 195327 flat pointers: 67743 speed ratio 2.88335 
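
For readers without the Windows headers, the same comparison can be sketched in portable C++. This is only an illustration under stated assumptions: `Cell` is a hypothetical cut-down stand-in for typControlMatrix (the real layout is different), and `g_matrix`, `setNested`, `setFlat` and `timeNs` are made-up names, not part of the original test. It also relies on the nested array being laid out contiguously, which holds in practice, although the standard is strict about pointer arithmetic across inner array bounds.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>

// Hypothetical stand-in for typControlMatrix (not the original layout).
struct Cell {
    int           bEnabled;
    unsigned char cCtrlVal;
    int           lParamNo;
};

Cell g_matrix[128][5][9];

// Variant 1: three nested loops, as in the question.
void setNested(unsigned char value) {
    for (int i = 0; i < 128; ++i)
        for (int j = 0; j < 5; ++j)
            for (int k = 0; k < 9; ++k)
                g_matrix[i][j][k].cCtrlVal = value;
}

// Variant 2: a single loop over all 128*5*9 elements.
void setFlat(unsigned char value) {
    Cell* p = &g_matrix[0][0][0];
    Cell* const end = p + 128 * 5 * 9;
    for (; p != end; ++p)
        p->cCtrlVal = value;
}

// Run fn(value) `repeats` times and return the elapsed nanoseconds.
long long timeNs(void (*fn)(unsigned char), unsigned char value, int repeats) {
    assert(repeats > 0);
    const auto start = std::chrono::steady_clock::now();
    for (int r = 0; r < repeats; ++r)
        fn(value);
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
}
```

Calling `timeNs(setNested, 255, 1000)` and `timeNs(setFlat, 255, 1000)` gives two numbers to compare. The ratio will depend heavily on the compiler and optimization level, and an optimizer may well turn both variants into the same machine code, so the figures above should not be expected to reproduce exactly.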
nv3 22-Nov-12 8:05am
Thanks for implementing the idea and doing the performance measurement! A factor of almost 3 is indeed a nice result and was probably so high because the loop overhead in the 3 nested loops outweighs the loop body, which is only a single byte copy. +5
CPallini 22-Nov-12 8:49am
Thank you.
Like you, I suppose the improvement is due to loop overhead.
BTW you already got my 5 for the good original hypothesis.
nv3 22-Nov-12 9:06am
Thank you!
Ron Anders 22-Nov-12 9:44am
Nice! Thank you so much.
CPallini 22-Nov-12 11:23am
You are welcome.
Please note I've used #define MAX_SCRIBBLE_STRIP_DESC 6, since I had no idea of its actual value. I don't know how much this may have biased the result.
VuNic 14-Dec-12 8:00am
Why were threads not considered? A flat pointer on multiple threads feels like a perfect fit for his performance requirement.
CPallini 14-Dec-12 8:20am
I tried with threads, but got longer execution times.

Solution 1

You can flatten the loop to a 1D affair by taking the address of the first element of the 3D array and then do 128*5*9 iterations.

Today's optimizing compilers might do the same and in that case this technique would not yield any speed advantage. On the other hand, it's worth a try.
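
A minimal sketch of the flattening idea, assuming a cut-down stand-in for the struct so that it compiles on its own. `Cell`, `g_matrix`, `kCells`, `setAllFlat` and `demo` are hypothetical names used only for illustration. The sketch treats the 3D array as one contiguous run of elements, which is how nested arrays are laid out in practice, even though the standard is strict about indexing across the bounds of the innermost array.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for typControlMatrix; only a few fields are kept.
struct Cell {
    int           bEnabled;
    unsigned char cCtrlVal;
    int           lParamNo;
};

const std::size_t kCells = 128 * 5 * 9;   // 5760 elements in total

Cell g_matrix[128][5][9];

// Take the address of the first element and do kCells iterations.
void setAllFlat(unsigned char value) {
    Cell* p = &g_matrix[0][0][0];
    for (std::size_t n = 0; n < kCells; ++n)
        p[n].cCtrlVal = value;
}

// Usage: set every cCtrlVal to 255 and spot-check the last element.
void demo() {
    setAllFlat(255);
    assert(g_matrix[127][4][8].cCtrlVal == 255);
}
```

Indexing through a `Cell*` rather than a raw byte pointer lets the compiler handle the stride, so no `sizeof` arithmetic is needed.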

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Web02 | 2.8.161021.1 | Last Updated 22 Nov 2012
Copyright © CodeProject, 1999-2016