As you probably know, especially if you've found this article through Google :), the pixel access performance of the ATL CImage class is terrible. Although it's a fairly common problem, I have not found even a simple (certainly not the best :) ) solution to it anywhere. Based on the Bitmap usage extension library, I've created a simple wrapper class that provides pixel access by reading and writing the bitmap bits directly. With minor refactoring, it could easily be separated out to also provide pixel access to standalone DIBs.
The class is designed to make it easy to optimize or extend software projects that already use the CImage class, or to use CImage in new projects, without worrying about pixel access performance.
The wrapper class public interface and usage
CImagePixelAccessOptimizer( CImage* _image );
CImagePixelAccessOptimizer( const CImage* _image );
COLORREF GetPixel( int _x, int _y ) const;
void SetPixel( int _x, int _y, const COLORREF _color );
If you need fast per-pixel access in your code, all you have to do is create a temporary stack variable of the CImagePixelAccessOptimizer class and then change/add calls to the GetPixel/SetPixel methods so that they use the temporary optimizer object rather than the CImage object directly. An example from my turf is a trivial image rotation:
CImagePixelAccessOptimizer tempImageOpt( pTmpImage );
CImagePixelAccessOptimizer currImageOpt( pCurrentImage );
for( unsigned x=0; x < uOrgWidth; ++x )
    for( unsigned y=0; y < uOrgHeight; ++y )
        tempImageOpt.SetPixel( uOrgHeight - y - 1, x, currImageOpt.GetPixel( x, y ) );
It's probably not the fastest way to rotate images, but it works, and shows the point quite well.
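The coordinate mapping used in the loop above can be checked in isolation. The following is a minimal sketch on a plain row-major buffer rather than on CImage objects; the function name Rotate90 and the use of std::vector are assumptions made for illustration only. It applies the same mapping, source (x, y) to destination (height - y - 1, x), which corresponds to a 90 degree clockwise rotation when y grows downward.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Rotates a width x height row-major grid by 90 degrees clockwise.
// The result is `height` pixels wide and `width` pixels tall.
// Hypothetical helper for illustration; not part of the wrapper class.
inline std::vector<int> Rotate90(const std::vector<int>& src, int width, int height) {
    assert(src.size() == static_cast<std::size_t>(width) * height);
    std::vector<int> dst(src.size());
    for (int x = 0; x < width; ++x)
        for (int y = 0; y < height; ++y) {
            const int dstX = height - y - 1;  // same mapping as the SetPixel call
            const int dstY = x;
            dst[dstY * height + dstX] = src[y * width + x];
        }
    return dst;
}
```

For a 3x2 grid holding 1 2 3 / 4 5 6, the result is the 2x3 grid 4 1 / 5 2 / 6 3.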
The class encapsulates simple methods found here and there that let you access pixel information directly from the DIB table(s) based on their native format. Using a temporary class object keeps the original code as simple as possible while still giving you all the room you need for optimization. Every operation whose result stays constant between the GetPixel/SetPixel calls is performed once, in the constructor of the CImagePixelAccessOptimizer class, and the result is remembered. Calculating the row width of the DIB table, or obtaining the palette table and image dimensions, is done only once.
Thanks to this, the GetPixel/SetPixel methods can be really fast, coming down to just a single switch statement and a simple table indirection or two.
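To make the "compute once in the constructor" idea concrete, here is a minimal standalone sketch. It does not use CImage; RawDib, DibRowBytes and PixelReader are hypothetical names introduced for illustration, and the sketch assumes rows are stored one after another with the usual DIB padding of each row to a multiple of 4 bytes.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a DIB; not the CImage API.
struct RawDib {
    int width;
    int height;
    int bitCount;                    // bits per pixel; 8, 24 or 32 in this sketch
    std::vector<std::uint8_t> bits;  // rows stored one after another
};

// DIB rows are padded to a multiple of 4 bytes.
inline int DibRowBytes(int width, int bitCount) {
    return ((width * bitCount + 31) / 32) * 4;
}

// Caches everything that stays constant between pixel reads.
class PixelReader {
public:
    explicit PixelReader(const RawDib& dib)
        : m_bits(dib.bits.data()),
          m_width(dib.width),
          m_height(dib.height),
          m_rowBytes(DibRowBytes(dib.width, dib.bitCount)),
          m_bytesPerPixel(dib.bitCount / 8) {}

    // Returns a pointer to the first byte of pixel (x, y).
    // Only two multiplications and two additions per call.
    const std::uint8_t* PixelAt(int x, int y) const {
        assert(x >= 0 && x < m_width && y >= 0 && y < m_height);
        return m_bits + m_rowBytes * y + x * m_bytesPerPixel;
    }

private:
    const std::uint8_t* m_bits;
    int m_width;
    int m_height;
    int m_rowBytes;
    int m_bytesPerPixel;
};
```

The per-pixel work shrinks to pointer arithmetic because the stride and pixel size are no longer recomputed on every call.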
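inline COLORREF CImagePixelAccessOptimizer::GetPixel( int _x, int _y ) const
{
    ASSERT( PositionOK( _x, _y ) );
    FOR_GET_SET_PIXEL_ASSERT( const COLORREF color = m_image->GetPixel( _x, _y ) );
    const RGBQUAD* rgbResult = NULL;
    RGBQUAD tempRgbResult = { 0 };   // holds the unpacked 16-bit pixel
    switch( m_bitCnt )
    {
    case 1:   // monochrome: shift the pixel's bit down so it is a palette index (0 or 1)
        rgbResult = &m_colors[ (*(m_bits + m_rowBytes*_y + _x/8) >> (7 - _x%8)) & 0x01 ];
        break;
    case 4:   // two pixels per byte, even x in the high nibble
        rgbResult = &m_colors[ (*(m_bits + m_rowBytes*_y + _x/2) >> ((_x&1) ? 0 : 4)) & 0x0f ];
        break;
    case 8:   // one palette index per byte
        rgbResult = &m_colors[ *(m_bits + m_rowBytes*_y + _x) ];
        break;
    case 16:  // 5-5-5 layout assumed; channels stay 5-bit (0..31)
        {
            WORD dummy = *(LPWORD)(m_bits + m_rowBytes*_y + _x*2);
            tempRgbResult.rgbBlue = (BYTE)(0x001F & dummy);
            tempRgbResult.rgbGreen = (BYTE)(0x001F & (dummy >> 5));
            tempRgbResult.rgbRed = (BYTE)(0x001F & (dummy >> 10));
            rgbResult = &tempRgbResult;
        }
        break;
    case 24:  // only the blue, green and red bytes are read below
        rgbResult = (LPRGBQUAD)(m_bits + m_rowBytes*_y + _x*3);
        break;
    case 32:
        rgbResult = (LPRGBQUAD)(m_bits + m_rowBytes*_y + _x*4);
        break;
    default:  // unsupported bit depth
        ASSERT( false );
        return 0;
    }
    const COLORREF rgbResultColorRef = RGB( rgbResult->rgbRed,
        rgbResult->rgbGreen, rgbResult->rgbBlue );
    GET_SET_PIXEL_ASSERT( rgbResultColorRef == color );
    return rgbResultColorRef;
}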
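The packed formats handled by the switch are easier to follow when pulled out into small functions. The sketch below is an illustration only: the names Index1bpp, Index4bpp, Rgb555 and Unpack555 are hypothetical and not part of the class, and it assumes the standard DIB packing (leftmost pixel in the most significant bits, 5-5-5 channel layout for 16-bit pixels). Note that the masked bits are shifted down so that the result can be used directly as a palette index.

```cpp
#include <cassert>
#include <cstdint>

// Palette index (0 or 1) of pixel x in a 1-bit row; leftmost pixel is the top bit.
inline unsigned Index1bpp(const std::uint8_t* row, int x) {
    assert(x >= 0);
    return (row[x / 8] >> (7 - x % 8)) & 0x01u;
}

// Palette index (0..15) of pixel x in a 4-bit row; even x lives in the high nibble.
inline unsigned Index4bpp(const std::uint8_t* row, int x) {
    assert(x >= 0);
    const unsigned b = row[x / 2];
    return (x & 1) ? (b & 0x0Fu) : (b >> 4);
}

// Channels of a 5-5-5 pixel, each in the range 0..31 (not scaled to 0..255).
struct Rgb555 {
    unsigned red;
    unsigned green;
    unsigned blue;
};

inline Rgb555 Unpack555(std::uint16_t v) {
    return { (v >> 10) & 0x1Fu, (v >> 5) & 0x1Fu, v & 0x1Fu };
}
```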
If you find issues with the code where colors are set incorrectly or in the wrong places, try un-commenting the ENABLE_GET_SET_PIXEL_VERIFICATION define. It enables checks in which the optimized results are compared with the behavior of the CImage class itself. Please report any issues that you find.
The code used for the checks can be seen in the above example. If ENABLE_GET_SET_PIXEL_VERIFICATION is defined, then GET_SET_PIXEL_ASSERT becomes a "standard" ASSERT ( ;) ) statement, and FOR_GET_SET_PIXEL_ASSERT becomes just the enclosed statement. If ENABLE_GET_SET_PIXEL_VERIFICATION is not defined, then both defines give empty statements. Thanks to this, you can enable additional code, and assertions that use that code, with a single define while keeping the code clean and simple at the same time (no three-line #ifdef ... #endif blocks). By default, CImagePixelAccessOptimizer does not use the additional verification, as it would bring us back to where we started performance-wise :).
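One way such a pair of macros could be defined is sketched below. This is a guess at the shape, not the header's actual contents: it uses the standard assert in place of the MFC/ATL ASSERT so that it compiles anywhere, and ReferenceOffset and FastOffset are hypothetical functions added only to show the macros in use.

```cpp
#include <cassert>

// Un-comment to compare every fast result with the slow reference.
// #define ENABLE_GET_SET_PIXEL_VERIFICATION

#ifdef ENABLE_GET_SET_PIXEL_VERIFICATION
    #define GET_SET_PIXEL_ASSERT( expr )     assert( expr )
    #define FOR_GET_SET_PIXEL_ASSERT( stmt ) stmt
#else
    // Both expand to nothing, leaving only an empty statement behind.
    #define GET_SET_PIXEL_ASSERT( expr )
    #define FOR_GET_SET_PIXEL_ASSERT( stmt )
#endif

// Slow reference: walks the rows one by one.
inline int ReferenceOffset(int x, int y, int rowBytes) {
    int offset = 0;
    for (int row = 0; row < y; ++row)
        offset += rowBytes;
    return offset + x;
}

// Fast path, optionally verified against the reference.
inline int FastOffset(int x, int y, int rowBytes) {
    FOR_GET_SET_PIXEL_ASSERT( const int expected = ReferenceOffset( x, y, rowBytes ) );
    const int result = rowBytes * y + x;
    GET_SET_PIXEL_ASSERT( result == expected );
    return result;
}
```

With the define commented out, the reference call and the comparison vanish at preprocessing time, so the fast path pays nothing for them.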
Success story ;)
Using this method, I have optimized out practically all pixel access performance issues from my simple image viewing and bad-pixel detecting program called ImageViewer. In a matter of hours, pixel access performance went from a major usability issue to a non-issue, and now, for you, it will take seconds. :)
To be honest, it's probably not the best idea to use the built-in CImage class at all, but if you're already there or don't want to install/link third-party libraries into your project, then this simple wrapper located in a single header may be just the thing you need. You get it for free with one exception :). While running the code from the Bitmap usage extension library, I found an issue that caused a "memory can't be read" problem: the code copied a whole RGBQUAD structure from the end of the 24-bit DIB table, so the reserved member of the RGBQUAD structure lay outside the memory allocated for the DIB. If you find anything like this, or images on which the code does not work correctly, please let me know.
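The over-read described above can be shown with plain byte arithmetic. The sketch below is an illustration under stated assumptions, not the library's code: Bgr, ReadBgr24 and FourByteCopyOverruns are hypothetical names. A 24-bit pixel occupies 3 bytes, so copying a 4-byte structure from the last pixel touches one byte past the buffer whenever the row has no padding (for example, a width of 4 pixels gives exactly 12 bytes per row).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// The three channels actually stored for a 24-bit pixel.
struct Bgr {
    std::uint8_t blue;
    std::uint8_t green;
    std::uint8_t red;
};

// Reads a 24-bit pixel at a byte offset, touching exactly 3 bytes.
inline Bgr ReadBgr24(const std::uint8_t* bits, std::size_t size, std::size_t offset) {
    assert(offset + 3 <= size);
    return { bits[offset], bits[offset + 1], bits[offset + 2] };
}

// True if copying a 4-byte structure from `offset` would run past the buffer.
inline bool FourByteCopyOverruns(std::size_t offset, std::size_t size) {
    return offset + 4 > size;
}
```

For a single 12-byte row, the last pixel starts at offset 9: reading 3 bytes stays inside the buffer, while a 4-byte copy would reach byte 12, which does not exist.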