I am trying to speed up a routine that generates fractal objects and displays them. I want it to take no more than 40 msec. I am presently at 800 msec.

I am calculating pixel color values and inserting them into a GDI bitmap of the whole client area, as shown in the simplified code below.

As far as I can tell by measuring elapsed time with GetCurrentTime(), about half the time (400 msec) is spent calculating pixel colors and the other half (400 msec) is spent writing them into the bitmap. Painting the bitmap to the client area takes less than 1 msec.

Will someone please advise me how I can put the color data into the bitmap in less time than SetPixel takes? SetPixel is notoriously slow when writing directly to the screen. I understand it has to do a color match with the display adapter each time it is executed, which is about 500,000 times on my mid-resolution display. I was hoping there would be no color matching when writing to the bitmap, and I was expecting some delay painting the bitmap to the screen. It looks like I was wrong on both counts.

My actual code is slightly more complicated than this, but it illustrates the problem.

Variables declared earlier in the program:
C++
static double  maxFractalX, minFractalX, maxFractalY, minFractalY; // fractal set dimensions
static  int        maxX,maxY,maxI;   //client area dimensions in pixels 


C++
case WM_PAINT:

    int x,y;
    HDC hdc,hdcMem;
    HBITMAP hBitMap; 
    double FractalX, FractalY;

    hdc = BeginPaint(hWnd, &ps);
    hdcMem=CreateCompatibleDC(hdc);
    hBitMap=CreateCompatibleBitmap(hdc, maxX, maxY);
    SelectObject(hdcMem,hBitMap); 

    for (y = 0; y < maxY; y++)   // maxY, not maxy: C++ identifiers are case-sensitive
    {
        FractalY = minFractalY + (maxFractalY-minFractalY)*y/(maxY-1);
        for (x = 0; x < maxX; x++)
        {
            FractalX = minFractalX + (maxFractalX-minFractalX)*x/(maxX-1);
            i = ValidateMandelbrotElement(FractalX, FractalY, maxI);
            //determines if the fractal element represented by this pixel diverges. If so, how fast?
            SetPixel(hdcMem, x, y, RGB(a,b,c)); // a,b,c keyed to i
            //How can I replace this statement with something much faster?
        }
    }
    BitBlt(hdc,0,0,maxX,maxY,hdcMem,0,0,SRCCOPY); //Surprisingly fast
    DeleteDC(hdcMem);
    DeleteObject(hBitMap);
    EndPaint(hWnd, &ps);
    break;
Updated 9-Mar-13 10:16am
Comments
SoMad 9-Mar-13 16:26pm    
I formatted your code (you can do this yourself by selecting the text and choosing the appropriate language from the "code" drop-down menu), but I also made a few adjustments since they looked wrong, such as "for(y=0; y<maxy;>".

Anyway, with your current code, you are only recalculating whenever the window has to be painted. Is that really what you want?
You can go the DirectX way as suggested in Solution 1 (I agree DirectX is faster than GDI), but maybe you should try experimenting with having a copy of the bitmap in memory, which gets updated independently of the draw events (a separate timer or whatever you may wish to use) and then just copy the bitmap from memory whenever the WM_PAINT occurs.

Soren Madsen
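
The idea in the comment above (keep the rendered image in memory and only copy it on WM_PAINT) can be sketched in portable C++ roughly as follows. This is a minimal illustration, not code from the thread: `FrameCache`, `invalidate`, `frame` and `renderCount` are hypothetical names, and the render callback stands in for whatever routine fills the pixels.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical cache: holds the last rendered frame plus a "dirty" flag.
struct FrameCache {
    int width = 0, height = 0;
    std::vector<std::uint32_t> pixels; // one 32-bit value per pixel
    bool dirty = true;
    int renderCount = 0;               // for illustration only

    void resize(int w, int h) {
        width = w;
        height = h;
        pixels.assign(static_cast<std::size_t>(w) * h, 0);
        dirty = true;
    }

    // Call when the view changes (zoom, pan, iteration limit, ...).
    void invalidate() { dirty = true; }

    // Call from the paint handler: the expensive render runs only
    // when something has changed since the last call.
    template <typename RenderFn>
    const std::vector<std::uint32_t>& frame(RenderFn render) {
        if (dirty) {
            render(pixels, width, height);
            ++renderCount;
            dirty = false;
        }
        return pixels;
    }
};
```

With something like this, a repaint caused by another window uncovering the client area only pays for the copy to the screen, not for the fractal calculation.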
Roland Anderson 11-Mar-13 16:59pm    
Soren, These are snippets to be read to understand the problem, not complete code. Heaven only knows what you "solved by yourself"

Gday, just been playing with fractals myself for a couple of weeks, 4d julia sets in my case - the problems you mention are one of the definite gotchas in the pursuit.

The way to do it is to allocate a buffer that will hold the rgba values of the pixels you wish to plot. You can access the pixels in virtually no time.

In order to further speed up your program, you should use worker threads. Each thread completes a portion of the picture before signalling that it's done. Each thread can use the same screen buffer, which you then blast to screen in a single fell swoop.

Here's the basic code:

C++
void *screenBuffer;
long imageWidth, imageHeight;

void setupBuffer(int w, int h)
{
    screenBuffer = (long*)malloc(w * h * 4); //sizeof(long));
    imageWidth = w;
    imageHeight = h;
}

void myPixel(int x, int y, long rgbaVal)
{
    ((long*)(screenBuffer))[y*imageWidth+x] = rgbaVal; //myPos[y*imageWidth + x] =  rgbaVal;
}

// uncomment as needed
//#define BYTE unsigned char
//#define RGBA(r,g,b,a) ((unsigned int)((BYTE)(b)|((BYTE)(g) << 8)|((BYTE)(r) << 16)|((unsigned int)(BYTE)(a) << 24))) // alpha belongs in the top byte

// blit a 32bit buffer to a window
void videoput32(void *buf, int imgBufWidth, int imgBufHeight, HWND destHwnd)
{
    char bibuf[ sizeof(BITMAPINFOHEADER)+12 ];
    BITMAPINFO &bi = *(BITMAPINFO*)&bibuf;
    BITMAPINFOHEADER &bih = bi.bmiHeader;
    bih.biSize = sizeof(bih);
    bih.biWidth = imgBufWidth;
    bih.biHeight = -imgBufHeight;
    bih.biPlanes = 1;
    bih.biBitCount = 32;
    bih.biCompression = BI_BITFIELDS;
    bih.biSizeImage = 0;
    bih.biXPelsPerMeter = 0;
    bih.biYPelsPerMeter = 0;
    bih.biClrUsed = 0;
    bih.biClrImportant=0;
//  ((unsigned long*)bi.bmiColors)[0] = 0;
    ((unsigned long*)bi.bmiColors)[0]=0x00FF0000;
    ((unsigned long*)bi.bmiColors)[1]=0x0000FF00;
    ((unsigned long*)bi.bmiColors)[2]=0x000000FF;
    RECT r;
    HDC dc=GetDC(destHwnd);
    GetClientRect(destHwnd, &r);
    // source rectangle is the whole image buffer: width first, then height
    StretchDIBits(dc, 0, 0, r.right, r.bottom, 0, 0, imgBufWidth, imgBufHeight, buf, &bi, DIB_RGB_COLORS, SRCCOPY);
    ReleaseDC(destHwnd,dc);
    ValidateRect(destHwnd, &r);
}
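
To make the pixel layout concrete, here is a small portable sketch of a buffer whose values match the three masks set in `videoput32` (red 0x00FF0000, green 0x0000FF00, blue 0x000000FF). `packRGB` and `PixelBuffer` are hypothetical helpers for illustration, not part of the code above.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Pack 8-bit channels so they line up with the masks
// R = 0x00FF0000, G = 0x0000FF00, B = 0x000000FF.
inline std::uint32_t packRGB(std::uint8_t r, std::uint8_t g, std::uint8_t b) {
    return (std::uint32_t(r) << 16) | (std::uint32_t(g) << 8) | std::uint32_t(b);
}

// Row-major 32-bit pixel buffer; row 0 is the top row, which matches
// the negative biHeight (top-down DIB) used when blitting.
struct PixelBuffer {
    int width, height;
    std::vector<std::uint32_t> data;

    PixelBuffer(int w, int h)
        : width(w), height(h), data(static_cast<std::size_t>(w) * h, 0) {}

    void set(int x, int y, std::uint32_t v) {
        data[static_cast<std::size_t>(y) * width + x] = v;
    }
    std::uint32_t get(int x, int y) const {
        return data[static_cast<std::size_t>(y) * width + x];
    }
};
```

Note that the Win32 `RGB()` macro produces a COLORREF laid out as 0x00BBGGRR, so passing `RGB(a,b,c)` straight into a buffer with these masks would swap red and blue.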



And a small section from the DialogProc:
C++
case WM_COMMAND:
            {
                switch(LOWORD(wParam))
                {
                case IDC_RENDER_BTN:
                    threadsRemaining = onRenderButtonClick();
                    SetDlgItemInt(hwndDlg, IDC_THREADS_REMAINING_TXT, threadsRemaining, true);
                    break;
                }
            }
            return TRUE;

        case WM_THREAD_COMPLETE:
            threadsRemaining--;
            SetDlgItemInt(hwndDlg, IDC_THREADS_REMAINING_TXT, threadsRemaining, true);
            if (threadsRemaining == 0)
            {
                 drawZbufferToImage();
                 SendMessage(outputWnd, WM_SET_DATA, 0, (LPARAM) &bufferSpec);

                 threadsRemaining = onRenderButtonClick();
            }
            break;


And another - a snippet of the thread function and it telling the main window that it's done.
C++
#define WM_THREAD_COMPLETE (WM_USER + 1) // parenthesised so the macro expands safely inside expressions
void threadFunction2(void *threadData)
{
//  printf("threadFunction2\n");
  myThreadData_t *tmp = (myThreadData_t*)threadData;

  drawIntoZbuffer3( frameNum, (RECT){tmp->left,tmp->top,tmp->right,tmp->bottom} );
  SendMessage(mainHwnd, WM_THREAD_COMPLETE, 0, 0);
}
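
The same scheme (each thread fills its own portion of one shared buffer) can be sketched with standard C++ threads instead of the Win32 thread and message plumbing. This is only an illustration under assumptions: `escapeTime` is a hypothetical stand-in for the per-pixel calculation, the fixed view (-2.5..1.0 by -1.5..1.5) and the grey colouring are arbitrary, and width and height are assumed to be at least 2.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical per-pixel calculation: Mandelbrot escape-time count.
int escapeTime(double cx, double cy, int maxIter) {
    double zx = 0.0, zy = 0.0;
    int i = 0;
    while (i < maxIter && zx * zx + zy * zy <= 4.0) {
        double t = zx * zx - zy * zy + cx;
        zy = 2.0 * zx * zy + cy;
        zx = t;
        ++i;
    }
    return i;
}

// Fill rows [y0, y1) of the shared buffer. Each thread gets a disjoint
// band of rows, so no locking is needed.
void renderRows(std::vector<std::uint32_t>& buf, int w, int h,
                int y0, int y1, int maxIter) {
    for (int y = y0; y < y1; ++y) {
        double cy = -1.5 + 3.0 * y / (h - 1);
        for (int x = 0; x < w; ++x) {
            double cx = -2.5 + 3.5 * x / (w - 1);
            std::uint32_t grey =
                static_cast<std::uint32_t>(255 * escapeTime(cx, cy, maxIter) / maxIter);
            buf[static_cast<std::size_t>(y) * w + x] = (grey << 16) | (grey << 8) | grey;
        }
    }
}

// Split the image into horizontal bands, one thread per band, then wait for all.
std::vector<std::uint32_t> renderParallel(int w, int h, int maxIter, unsigned nThreads) {
    std::vector<std::uint32_t> buf(static_cast<std::size_t>(w) * h);
    int n = static_cast<int>(std::max(1u, nThreads));
    int band = (h + n - 1) / n; // rows per thread, rounded up
    std::vector<std::thread> workers;
    for (int y0 = 0; y0 < h; y0 += band) {
        int y1 = std::min(h, y0 + band);
        workers.emplace_back(renderRows, std::ref(buf), w, h, y0, y1, maxIter);
    }
    for (auto& t : workers) t.join();
    return buf;
}
```

Joining the threads plays the role of counting `WM_THREAD_COMPLETE` messages down to zero: only once every band is finished is the buffer handed to the blit routine.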
Comments
SoMad 10-Mar-13 0:44am    
Nice! Very, very nice!

Soren Madsen
enhzflep 10-Mar-13 1:13am    
Thanks Soren. I certainly have been having fun coding it. Didn't think I'd so soon have a party looking for the same sort of code I'd been playing with. Sounds like win-win to me. :-)
Simon
Roland Anderson 11-Mar-13 21:42pm    
I'm having trouble thanking the person who started with "Gday" because I'm really new to this website. Anyway, thanks. As I look at my code, it looks like I've been using 32-bit colors, so I should be able to use your suggestions pretty much verbatim. Thanks.
enhzflep 11-Mar-13 22:19pm    
No dramas. :) If you want to add a comment to an answer, you click the 'Have a Question or Comment?' button that follows their post (left side of this column). It may be immediately after the answer or it may follow the comments to that answer (that's what you did with your comment).

If, on the other hand, you wish to reply to a comment, you click the ( <- reply ) button at the top-right of the comment.

Something else I can point out is that the code uses 32-bit pixels for basically one reason: speed. It's much easier to multiply by 4 than by 3 (i.e. X shl 2 is faster than X mul 3; the compiler will use 'shift left 2' to multiply by 4, but be forced to use '* 3' for 24-bit buffers). That said, the time that SHL/SHR and MUL instructions take may well have changed since I last looked at instruction timings some years ago. There may indeed be no speed improvement in terms of the multiplication method, although 4-byte elements mean that they may all be aligned in memory, which will equate to faster access.

Unless you're drawing to an offscreen 32 bit DIB, only 24 bits of the data is used anyway. In fact, when I use that code, I just pass 0 for the alpha value.
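
For anyone weighing 24-bit against 32-bit buffers, the addressing difference can be written out explicitly. The sketch below uses the usual rule that each row of an uncompressed DIB is padded to a multiple of 4 bytes; `dibStride` and `pixelOffset` are hypothetical helper names, not Win32 functions.

```cpp
#include <cstddef>

// Bytes per row of an uncompressed DIB: rows are padded up to a
// multiple of 4 bytes (32 bits).
constexpr std::size_t dibStride(std::size_t widthPixels, std::size_t bitsPerPixel) {
    return ((widthPixels * bitsPerPixel + 31) / 32) * 4;
}

// Byte offset of pixel (x, y) from the start of the pixel data,
// for formats with a whole number of bytes per pixel (24 or 32 bpp).
constexpr std::size_t pixelOffset(std::size_t x, std::size_t y,
                                  std::size_t widthPixels, std::size_t bitsPerPixel) {
    return y * dibStride(widthPixels, bitsPerPixel) + x * (bitsPerPixel / 8);
}
```

For a 5-pixel-wide image, a 24-bit row holds 15 bytes of pixel data padded to 16, while a 32-bit row is exactly 20 bytes with no padding, so with 32-bit pixels the buffer can be indexed as a plain array of 4-byte values (`y*imageWidth + x`).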
You could use DirectX to plot pixels to a back buffer. This would be done on the GPU (graphics processing unit), which would be significantly faster than SetPixel. You would really just have to initialize DirectX, create a window, set the D3DPRESENT_PARAMETERS, lock the back buffer and set the pixels using array notation.

GDI is SLOW.

If you wanted to render images to the back buffer, you can also use GDI for this.

Just a suggestion.

If you want fast pixel plotting, I recommend PixelToaster too. Google it and you will find it.

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


