 |
|
 |
No doubt, this is a great article. Although I do not have a DSP background, I was able to understand almost all of it. I highly appreciate Chesnokov Yuriy for writing such a great article and hope to see more from him in the future.
My question is as follows:
I have an HBITMAP and want to resize it. After resizing, I want to have another HBITMAP or an object such as CxImage or CBitmapEx. Can you please tell me how to do this?
Regards,
|
|
|
|
 |
|
 |
Many thanks.
All you need is to get the raw pixel data from the HBITMAP handle, or from any other object.
Чесноков
|
|
|
|
 |
|
 |
Thanks for the reply.
Actually, my problem is reconstructing the image after calling resize on the pixel data. As described in the article, I am accessing the downsampled image using the following functions:
char** getr() const
char** getg() const
char** getb() const
But I do not know how to correctly combine the individual color arrays into a BITMAP again.
What I am doing is taking a screen capture of the desktop (1366 x 768) and scaling it down to 640 x 480. Following is the code snippet that I am using:
// The following 4 lines get the desktop screenshot
HDC hDesktopCompatibleDC=CreateCompatibleDC(hDesktopDC);
HBITMAP hDesktopCompatibleBitmap=CreateDIBSection(hDesktopDC,&bmpInfo,DIB_RGB_COLORS,&pBits,NULL,0);
SelectObject(hDesktopCompatibleDC,hDesktopCompatibleBitmap);
BitBlt(hDesktopCompatibleDC,0,0,nWidth,nHeight,hDesktopDC,0,0,SRCCOPY|CAPTUREBLT);
// After capturing the desktop screenshot, pBits contains the pixels
// Now I am resizing the captured bitmap with the help of your code.
unsigned int steps = 20;
unsigned int width = 1366;
unsigned int height = 768;
ImageResize resize;
resize.init(width, height, 0.5f);
for (unsigned int i = 0; i < steps; i++)
{
    resize.resize((const unsigned char *)pBits );
}
char** pr = resize.getr();
char** pb = resize.getb();
char** pg = resize.getg();
DWORD *ptr=new DWORD[1366/2*480/2];
memset(ptr,0,1366/2*480/2);
for(unsigned int y = 0; y < height * 0.125f; y++)
{
    for(unsigned int x = 0; x < width * 0.125f; x++)
    {
        ptr[y* ((int)(height * 0.5f)) + x]=RGB(pr[y][x]+128 ,pg[y][x]+128,pb[y][x]+128);
    }
}
HBITMAP h=::CreateBitmap(1366/2,480/2,1,32,ptr);
CxImage abc;
abc.CreateFromHBITMAP(h);
if (abc.IsValid())
abc.Save("c:\\scaleddown.bmp",1);
When I open "c:\\scaleddown.bmp", I do not see a scaled-down bitmap but a mix of colors.
Looking forward to your kind help.
Regards,
|
|
|
|
 |
|
 |
Dear Chesnokov Yuriy,
I am waiting for your reply. I would be thankful if you could reply at your earliest convenience.
Thanks in advance.
|
|
|
|
 |
|
 |
Hi, Sir:
When I try to use your code under WinCE, I get compile errors: many MMX functions and types cannot be found.
How should I deal with this problem?
Thanks.
wolf
|
|
|
|
 |
|
 |
Simple enough: write a non-MMX Haar transform.
chesnokov
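For anyone porting to a target without MMX, here is a minimal scalar sketch of one Haar step in plain C++. It is not the article's code: the function name haarStep is hypothetical, and the averaging normalization (sum and difference scaled by 1/2) is an assumption, so the article's filters may scale their coefficients differently.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One level of a 1D Haar transform in plain C++ (no MMX/SSE intrinsics).
// Input:  an even number of samples.
// Output: the first half holds pairwise averages (the half-size signal),
//         the second half holds pairwise half-differences (the detail).
std::vector<float> haarStep(const std::vector<float>& in)
{
    assert(in.size() % 2 == 0);
    const std::size_t half = in.size() / 2;
    std::vector<float> out(in.size());
    for (std::size_t i = 0; i < half; ++i) {
        const float a = in[2 * i];
        const float b = in[2 * i + 1];
        out[i]        = 0.5f * (a + b);  // low-pass: average of the pair
        out[half + i] = 0.5f * (a - b);  // high-pass: half the difference
    }
    return out;
}
```

Applying such a step to every row and then to every column, and keeping only the low-pass quarter, would give a half-size image, which is the same rows-then-columns pattern that transrows and transcols follow in the library.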
|
|
|
|
 |
|
 |
Hi,
My question is not going to be very interesting, since I am not doing anything with DSP, especially not at postgraduate level. But, as an ordinary person trying to benefit from your research, I am wondering how long it takes to resize CIF to QCIF at 30 fps.
And, with your code, what measures need to be taken for the implementation?
Thanks a bunch.
~God Bless The Internet~
|
|
|
|
 |
|
 |
// Your bitmap data goes in this layout:
// unsigned char* pBGR = new unsigned char[width*height*3];
unsigned int width = 352;
unsigned int height = 288;
float zoom = 0.5;
ImageResize resize;
resize.init(width, height, zoom);
// Keep resizing incoming data after initialization.
resize.resize(pBGR);
// The resized image is in separate channels
char** r = resize.getr(); // get Red channel
char** g = resize.getg(); // get Green channel
char** b = resize.getb(); // get Blue channel
The RGB resizing takes about 5 ms on a 2 GHz CPU -> 200 fps.
chesnokov
|
|
|
|
 |
|
 |
Hi Yuriy,
Thanks for your response.
This is what I did so far:
if(isVideoSend)
{
//change from CIF to QCIF format
unsigned int steps = 20;
unsigned int width = IMAGE_WIDTH_LOCAL; //352
unsigned int height = IMAGE_HEIGHT_LOCAL; //288
float zoom = 0.5;
unsigned char* pBGR = new unsigned char[width * height * 3];
pBGR = (unsigned char*) data; //pRGB is the Entry point
ImageResize resize;
resize.init(width, height, zoom); //downsized into half
wprintf(L" downsampling 352x288 RGB image to 176x144 image\n\n");
__int64 ms = 0;
for (unsigned int i = 0; i < steps; i++) {
    tic();
    resize.resize(pBGR);
    ms += toc();
}
wprintf(L"\n avrg time: %dms for 640x480 image", int(float(ms) / float(steps)));
//log.WriteString("\n Converting to YUV format..");
//Convert the data from rgb format to YUV format
ConvertRGB2YUV(QCIF_WIDTH,QCIF_HEIGHT,data,yuv); //Yuriy, what's the output point of your source code?
// Reset the counter
count=0;
//Compress the data...to h263
cparams.format=CPARAM_QCIF;
cparams.inter = CPARAM_INTRA;
cparams.Q_intra = 8;
cparams.data=yuv; // Data in YUV format...
CompressFrame(&cparams, &bits);
// Transmit the compressed frame
//log.WriteString("Transmitting the frame");
dvideo.SendVideoData(cdata,count);
}
But it gave me error LNK2019: unresolved external symbol "public: __int64 __thiscall CVideoNetDlg::toc(void)"
AND
error LNK2019: unresolved external symbol "public: void __thiscall CVideoNetDlg::tic(void)"
I have added __int64 toc() and tic() to the ".h" and ".cpp" of the respective file.
What went wrong? And is there anything else I have overlooked?
~God Bless The Internet~
|
|
|
|
 |
|
 |
what do toc() and tic() do?
~God Bless The Internet~
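Judging from how the loop above uses them (tic() before the call, ms += toc() after it, then an average in milliseconds), tic() appears to start a timer and toc() to return the elapsed milliseconds. That reading is an assumption, since the demo's own definitions are not shown in this thread. A portable stand-in with std::chrono might look like the sketch below; the global g_start is a hypothetical helper.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical replacements for the demo's tic()/toc(), written as free
// functions rather than class members.
namespace {
std::chrono::steady_clock::time_point g_start;
}

// Remember the current time.
void tic()
{
    g_start = std::chrono::steady_clock::now();
}

// Return the milliseconds elapsed since the last tic().
long long toc()
{
    const auto elapsed = std::chrono::steady_clock::now() - g_start;
    return std::chrono::duration_cast<std::chrono::milliseconds>(elapsed).count();
}
```

As for LNK2019: it means the linker saw a declaration of CVideoNetDlg::tic and CVideoNetDlg::toc but found no definition. One common cause is declaring them inside the class in the ".h" while defining them in the ".cpp" without the CVideoNetDlg:: qualifier, which creates unrelated free functions.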
|
|
|
|
 |
|
 |
Hi Yuriy!
I looked at basefwt.cpp and ImageResize.cpp, and I found that the red, green, and blue data from pBGR are each stored and later downsized using the function below:
int BaseFWT2D::trans(unsigned int scales, unsigned int th)
{
    if (m_status <= 0)
        return -1;
    J = scales;
    TH = th;
    unsigned int w = m_width;
    unsigned int h = m_height;
    for (unsigned int j = 0; j < J; j++) {
        transrows(tspec2d, spec2d, w, h);
        transcols(spec2d, tspec2d, w, h);
        w /= 2;
        h /= 2;
        TH /= 4;
    }
    return 0;
}
Here you take the red data in the J variable and the scales in the th variable, don't you?
How do I put them together in pBGR after they have been downsized to 176x144? Does the Haar algorithm also handle upsizing?
PS: Hey, get a broadband soon will ya!
~God Bless The Internet~
|
|
|
|
 |
|
 |
All I can say is that I cannot help you with learning to program.
You do not need to call resize.init() multiple times once you have initialized it to a specific width and height.
J is the number of scales; th is the threshold.
To put everything together, use getr(), getg(), and getb() and read them into an RGB buffer.
chesnokov
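A minimal sketch of what reading the three channels into one buffer could look like. The function name packBGR is hypothetical, and the +128 offset is an assumption taken from the earlier poster's code (signed samples centered on zero); the article does not confirm it in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Interleave three planar channels (one row pointer per image line) into a
// tightly packed 24-bit buffer in B, G, R order.
// Assumption: samples are signed and centered on zero, so +128 maps them
// back to 0..255. w and h are the dimensions of the RESIZED image.
std::vector<unsigned char> packBGR(char** r, char** g, char** b,
                                   unsigned int w, unsigned int h)
{
    assert(r && g && b);
    std::vector<unsigned char> out(static_cast<std::size_t>(w) * h * 3);
    for (unsigned int y = 0; y < h; ++y) {
        // The row stride is the output width (times 3 bytes), not the height.
        unsigned char* row = out.data() + static_cast<std::size_t>(y) * w * 3;
        for (unsigned int x = 0; x < w; ++x) {
            row[3 * x + 0] = static_cast<unsigned char>(b[y][x] + 128);
            row[3 * x + 1] = static_cast<unsigned char>(g[y][x] + 128);
            row[3 * x + 2] = static_cast<unsigned char>(r[y][x] + 128);
        }
    }
    return out;
}
```

For a 352x288 frame resized with zoom 0.5, w and h would be 176 and 144. If the buffer is handed to a Windows DIB, keep in mind that DIB rows are padded to a multiple of 4 bytes and that a positive-height DIB is stored bottom-up.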
|
|
|
|
 |
|
 |
OK Yuriy,
But I still don't understand your main(), since I want to keep sampling until I press the stop video capture button in MFC.
Another thing I noticed is that this will make the RGB data fit a YUV422 conversion. What should I do to make it fit a YUV420 conversion?
Thanks a million.
~God Bless The Internet~
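On the 4:2:2 versus 4:2:0 question: 4:2:2 halves the chroma planes horizontally only, while 4:2:0 halves them both horizontally and vertically, so for a 176x144 QCIF frame the U and V planes become 88x72. The sketch below shows only that subsampling step, by averaging each 2x2 block. The function name subsample420 is hypothetical, and how ConvertRGB2YUV lays out its output is not fully shown in this thread, so this is an illustration rather than a drop-in fix.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Reduce a full-resolution chroma plane (U or V, w x h, both even) to 4:2:0
// by averaging each 2x2 block. The Y plane stays at full resolution.
std::vector<unsigned char> subsample420(const std::vector<unsigned char>& plane,
                                        unsigned int w, unsigned int h)
{
    assert(w % 2 == 0 && h % 2 == 0);
    assert(plane.size() == static_cast<std::size_t>(w) * h);
    const unsigned int cw = w / 2;
    const unsigned int ch = h / 2;
    std::vector<unsigned char> out(static_cast<std::size_t>(cw) * ch);
    for (unsigned int y = 0; y < ch; ++y) {
        for (unsigned int x = 0; x < cw; ++x) {
            // Index of the top-left sample of the 2x2 block.
            const std::size_t i = static_cast<std::size_t>(2 * y) * w + 2 * x;
            const unsigned int sum = plane[i] + plane[i + 1]
                                   + plane[i + w] + plane[i + w + 1];
            out[static_cast<std::size_t>(y) * cw + x] =
                static_cast<unsigned char>((sum + 2) / 4);  // rounded average
        }
    }
    return out;
}
```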
|
|
|
|
 |
|
 |
// Convert from RGB24 to YUV420
//
int ConvertRGB2YUV(int w,int h,unsigned char *bmp,unsigned int *yuv)
{
/*
unsigned int *u,*v,*y,*uu,*vv;
unsigned int *pu1,*pu2,*pu3,*pu4;
unsigned int *pv1,*pv2,*pv3,*pv4;
unsigned char *r,*g,*b;
int i,j;
*/
uu_uu=new unsigned int[w*h];
vv_vv=new unsigned int[w*h];
if(uu_uu==NULL || vv_vv==NULL)
return 0;
y_y=yuv;
u_u=uu_uu;
v_v=vv_vv;
// Get r,g,b pointers from bmp image data....
ImageResize imres;
float zoom = 0.5;
if(count==0)
imres.init(IMAGE_WIDTH_LOCAL, IMAGE_HEIGHT_LOCAL, zoom); //downsized into half
__int64 ms = 0;
for (unsigned int i = 0; i < sizeof(yuv); i++)
{
    tic();
    imres.resize(bmp);
    ms += toc();
}
r_temp = imres.getr();
g_temp = imres.getg();
b_temp = imres.getb();
// the original containers
//r_r=bmp;
//g_g=bmp+1;
//b_b=bmp+2;
//check values of each r , g, b
if(count<2)
{
Msg(TEXT("r_r=0x%d, g_g=0x%d, b_b=0x%d"), r_temp, g_temp, b_temp);
count++;
}
My question is: why do r_temp, g_temp and b_temp hold the address 0x000000 after iteration 1 (count = 0)?
~God Bless The Internet~
|
|
|
|
 |
|
|
 |
|
 |
Hi,
I'm working on Face Detection using Haar-like features and/or convolutional NN. What's your approach?
Ihor
|
|
|
|
 |
|
 |
Hi,
Image downsampling -> skin and motion detection -> rough NN/SVM non-face rejection -> PCA projection -> NN classification on an image pyramid.
You're talking about the Viola and Jones approach. Is it better than standard PCA? My SSE-optimized code detects at a rate of 15 fps on 640x480 on a 2.2 GHz CPU. I'm planning to post it in about a week or so.
chesnokov
|
|
|
|
 |
|
 |
Hi,
Thank you for your answer.
> Image down sampling -> Skin and motion detection -> rough NN, SVM non-faces rejection -> PCA projection, NN classification on image pyramid.
What detection rate and false alarm count have you achieved on the CMU test set?
What is the NN's architecture?
> You're talking about Viola and Jones approach. Is it better than standard PCA? My code SSE optimized detects at 15fps rate on 640x480 on 2.2Ghz.
|
|
|
|
 |
|
 |
I did not test it on the CMU database. Currently I do not have broadband, and I trained it only on my own images collected over the years and on some from friends. It detects me in real time without leaving a frame undetected. I tested it on some pictures of girls from the Internet, and it gave about a 95% correct rate on the pictures I downloaded.
chesnokov
|
|
|
|
 |
|
 |
Every serious article on FD presents the detection rate and false positives on the CMU test set. This makes it very easy to compare different FD methods.
Just follow the link
http://vasc.ri.cmu.edu/idb/images/face/frontal_images/images.tar
download and try this set.
I think it would be interesting to you (and me) to compare your results with others.
Ihor
|
|
|
|
 |
|
 |
Absolutely agree, though this is not the face detection article yet.
I'm currently not able to download more than a couple of megabytes over a GPRS mobile connection, and their http://vasc.ri.cmu.edu/idb/images/face/frontal_images/images.html link does not work.
By the time I post the paper you may evaluate it yourself on their set; don't forget to retrain it on the 200 MB CBCL database.
chesnokov
|
|
|
|
 |
|
 |
sorry, something went wrong.
please see "broken link" link above.
double(U)
|
|
|
|
 |
|
 |
> You're talking about Viola and Jones approach. Is it better than standard PCA? My code SSE optimized detects at 15fps rate on 640x480 on 2.2Ghz.
15fps with motion detection + skin color segmentation + ...?
Viola & Jones achieved 64ms on 320x240 image.
Det. rate on CMU test set = 76%/10 false alarms, 88%/31FA, 91%/50FA
Ihor
|
|
|
|
 |
|
 |
15 fps at the least. It does not depend on image size, as the image is downscaled to about 80x60.
19x19 face -> 361-dimensional vector -> motion detection + skin segmentation -> 2-vector SVM or 361-2-1 ANN as a prefilter -> PCA projection 361 -> 40 -> final ANN classification 40-20-10-1
With that scheme it provides 15-25 fps with SSE optimization. Since I use floats and Viola uses integers, their speed is close to mine. Do they provide integer approximations for the ANN?
By the time I post the code you may test it on CMU or on yourself in real time. However, you should retrain the ANN and SVM on the CBCL database, as it provides a lot of face data. I cannot currently download about 200 MB, as I have only a GPRS connection.
On my collected samples, faces (~1700) and non-faces (~34000), the ANN rates on the PCA-projected data are:
train set: 893 17154 se: 99.55 sp: 100.00 pp: 100.00 np: 99.98 ac: 99.98 er: 0.000519
validation set: 447 8578 se: 96.42 sp: 99.90 pp: 97.95 np: 99.81 ac: 99.72 er: 0.001824
test set: 447 8578 se: 97.32 sp: 99.90 pp: 97.97 np: 99.86 ac: 99.77 er: 0.001746
chesnokov
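For readers comparing these figures with results quoted elsewhere, the columns can be read as sensitivity (se), specificity (sp), positive predictivity (pp), negative predictivity (np) and accuracy (ac), all in percent. That expansion of the abbreviations is an assumption, and the er column is not reproduced here. The sketch below computes the five rates from raw counts; the struct and function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

struct Rates { double se, sp, pp, np, ac; };

// tp/fn: faces accepted / rejected; tn/fp: non-faces rejected / accepted.
// All results are percentages.
Rates rates(unsigned tp, unsigned fn, unsigned tn, unsigned fp)
{
    assert(tp + fn > 0 && tn + fp > 0 && tp + fp > 0 && tn + fn > 0);
    Rates r;
    r.se = 100.0 * tp / (tp + fn);                    // faces found
    r.sp = 100.0 * tn / (tn + fp);                    // non-faces rejected
    r.pp = 100.0 * tp / (tp + fp);                    // detections that are faces
    r.np = 100.0 * tn / (tn + fn);                    // rejections that are non-faces
    r.ac = 100.0 * (tp + tn) / (tp + fn + tn + fp);   // overall correct
    return r;
}
```

As a check of that reading, the counts tp = 431, fn = 16, tn = 8569, fp = 9 (back-computed from the validation row's 447 faces and 8578 non-faces, not given in the post) come out within rounding of the reported se 96.42, sp 99.90, pp 97.95, np 99.81 and ac 99.72.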
|
|
|
|
 |