|
Hi All,
I have an urgent requirement to create a crawler that can fetch data from a URL. The IDE should be VC++.
Thanks A Ton
Ash_VCPP
|
|
|
|
|
very good.. now what is the problem??
|
|
|
|
|
Wow, starting the working day with a smile is very good, my five.
If the Lord God Almighty had consulted me before embarking upon the Creation, I would have recommended something simpler.
-- Alfonso the Wise, 13th Century King of Castile.
This is going on my arrogant assumptions. You may have a superb reason why I'm completely wrong.
-- Iain Clarke
[My articles]
|
|
|
|
|
Do you have any idea about crawlers? If yes, then please tell me how to start working on this. It's urgent.
Thanks A Ton
Ash_VCPP
|
|
|
|
|
Ash_VCPP wrote: Do you have any idea about crawler
Yes.
Ash_VCPP wrote: then please provide me the way to start working its urgent......
Sorry, *urgent* questions automatically fall to the bottom of the stack (just a bit above *very urgent* questions).
|
|
|
|
|
Then can you please provide me with any code, guidelines, or a URL where I can find something useful?
Thanks A Ton
Ash_VCPP
|
|
|
|
|
I need to decide many things before starting the project, because I am the only one responsible for it. So please give me some initial guidelines to start with: what should I use (Win32 EXE, Win32 DLL, COM, etc.), and which inter-process communication mechanism should I use?
Thanks A Ton
Ash_VCPP
|
|
|
|
|
By definition:
Crawler: a person who tries to please someone in order to gain a personal advantage.
Do you need it to please someone for some personal advantage?
Did you try to meet the requirements? Go get the IDE...
You need to google first, if you have "It's urgent please" mentioned in your question.
_AnShUmAn_
|
|
|
|
|
As you may have seen from the responses, it's not a very good question.
1/ You haven't actually asked a question - you've just told us you have work to do. While we are, of course, very happy for you, there's not much to answer.
2/ You've got quite a big challenge, especially if you're starting from scratch.
3/ You can break it down into several challenges... Handling delays, timeouts, getting HTTP pages, parsing them into links, etc.
I've attached below some code I wrote years ago, grabbing a certain page from a specific URL every hour or so - an early RSS reader, essentially. It may help you with your search terms.
There are other articles on CodeProject about grabbing information from web pages. John Simmons wrote one recently that scrapes information from a CodeProject page.
Good luck with your task!
Iain.
// Requires <windows.h>, <wininet.h> and <string>; link with wininet.lib.
// Assumes an ANSI (non-UNICODE) build. g_hEventStop, g_CS_Updates and
// g_UpdateArray are globals defined elsewhere in the application.
DWORD WINAPI UpdatePageThread ( LPVOID lpParameter )
{
    HWND hWnd = (HWND)lpParameter;
    DWORD dw, dwDelay = 100;            // first fetch happens almost immediately
    HINTERNET hInternet = NULL, hIConnect = NULL, hIRequest = NULL;
    BOOL bSuccess;
    DWORD dwStatus, dwSize, dwIndex;
    LPCSTR AcceptTypes [] = { "text/*", NULL };

    hInternet = ::InternetOpen ("OC UK Notify", INTERNET_OPEN_TYPE_PRECONFIG, NULL, NULL, 0);
    if (hInternet)
        hIConnect = ::InternetConnect (hInternet, "www.overclock-uk.net", INTERNET_DEFAULT_HTTP_PORT, "user", "pass", INTERNET_SERVICE_HTTP, 0, 1);
    if (hIConnect)
    {
        hIRequest = ::HttpOpenRequest (hIConnect, NULL, "update.ocuk", NULL, NULL, AcceptTypes,
            INTERNET_FLAG_NO_CACHE_WRITE | INTERNET_FLAG_NO_COOKIES | INTERNET_FLAG_NO_UI | INTERNET_FLAG_RELOAD | INTERNET_FLAG_NO_AUTH,
            1);
    }
    if (!hIRequest)
    {
        // Release whatever was opened before giving up.
        if (hIConnect) ::InternetCloseHandle (hIConnect);
        if (hInternet) ::InternetCloseHandle (hInternet);
        return 1;
    }

    char buf [4096];
    std::string Page;

    while (1)
    {
        // Sleep until the next fetch is due, or leave when the stop event fires.
        dw = WaitForSingleObject (g_hEventStop, dwDelay);
        if (dw != WAIT_TIMEOUT)
            break;
        dwDelay = 90 * 60000;           // subsequent fetches every 90 minutes

        bSuccess = ::HttpSendRequest (hIRequest, NULL, 0, NULL, 0);
        if (!bSuccess)
            continue;

        // Only accept 2xx responses.
        dwSize = sizeof (DWORD);
        dwIndex = 0;
        bSuccess = ::HttpQueryInfo (hIRequest, HTTP_QUERY_STATUS_CODE | HTTP_QUERY_FLAG_NUMBER, &dwStatus, &dwSize, &dwIndex);
        if (!bSuccess)
            continue;
        dwStatus /= 100;
        if (dwStatus != 2)
            continue;

        // Read the whole response body into Page.
        Page.erase ();
        while (1)
        {
            memset (buf, 0, sizeof (buf));
            bSuccess = ::InternetReadFile (hIRequest, buf, sizeof (buf), &dwSize);
            if (!bSuccess || dwSize == 0)
                break;
            Page.append (buf, dwSize);
        }

        // Collect the text of every <p>...</p> element.
        std::string::size_type nFind = 0;   // size_type so the npos comparison is valid
        std::string Temp;
        EnterCriticalSection (&g_CS_Updates);
        g_UpdateArray.clear ();
        while (1)
        {
            nFind = Page.find ("<p>");
            if (nFind == std::string::npos)
                break;
            Page.erase (0, nFind + 3);
            nFind = Page.find ("</p>");
            if (nFind == std::string::npos)
                break;
            Temp = Page.substr (0, nFind);
            Page.erase (0, nFind + 4);
            g_UpdateArray.push_back (Temp);
        }
        LeaveCriticalSection (&g_CS_Updates);

        // Tell the UI thread that new data is available.
        PostMessage (hWnd, WM_USER + 1, 0, 0);
    }

    if (hIRequest) ::InternetCloseHandle (hIRequest);
    if (hIConnect) ::InternetCloseHandle (hIConnect);
    if (hInternet) ::InternetCloseHandle (hInternet);
    return 0;
}
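The "parsing them into links" challenge mentioned above could be sketched roughly as follows. This is only a minimal illustration in standard C++, not part of the code above: the helper name extractLinks is hypothetical, it only handles quoted href values, and a real crawler would want a proper HTML parser plus resolution of relative URLs.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical helper: returns the values of all quoted href attributes
// found in an HTML string. Unquoted attribute values are skipped.
std::vector<std::string> extractLinks(const std::string& html)
{
    std::vector<std::string> links;

    // Lower-cased copy so that "HREF=" and "href=" are both matched.
    std::string lower(html);
    for (char& c : lower)
        c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));

    const std::string key = "href=";
    std::string::size_type pos = 0;
    while ((pos = lower.find(key, pos)) != std::string::npos)
    {
        pos += key.size();
        if (pos >= html.size())
            break;
        const char quote = html[pos];
        if (quote != '"' && quote != '\'')
            continue;                       // unquoted value: not handled here
        const std::string::size_type end = html.find(quote, pos + 1);
        if (end == std::string::npos)
            break;                          // unterminated attribute
        links.push_back(html.substr(pos + 1, end - pos - 1));
        pos = end + 1;
    }
    return links;
}
```

Each link found this way could be queued and fetched in turn, which is the "crawling" part; keeping a set of already-visited URLs avoids fetching the same page twice.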
Codeproject MVP for C++, I can't believe it's for my lounge posts...
|
|
|
|
|
Hi Iain,
Thanks for providing this important information and code. I will try it this way, and if I run into any difficulties I will let you know. Once again, thanks for the reply.
Thanks A Ton
Ash_VCPP
|
|
|
|
|
The website / page this code pointed to has long since gone, by the way!
And take the error checking with heavy skepticism...
Iain.
|
|
|
|
|
Hi Ash,
Do you still need the code? If yes, then please let me know.
|
|
|
|
|
Hi Sandeep,
Actually, along with the code I also need to do some planning, as I have to start the project from scratch. So please give me some ideas along with the code about which approach would be better.
Thanks A Ton
Ash_VCPP
|
|
|
|
|
Ash_VCPP wrote: I have an urgent requirement to create a crawler...
Care to define this?
"Old age is like a bank account. You withdraw later in life what you have deposited along the way." - Unknown
"Fireproof doesn't mean the fire will never come. It means when the fire comes that you will be able to withstand it." - Michael Simmons
|
|
|
|
|
I got your point to some extent, but I would be pleased if you could explain it more.
Thanks A Ton
Ash_VCPP
|
|
|
|
|
Ash_VCPP wrote: ...i would be pleased if you can explain it more...
I believe that was the question I posed to you. The term "crawler" can take on several different meanings. What is yours?
"Old age is like a bank account. You withdraw later in life what you have deposited along the way." - Unknown
"Fireproof doesn't mean the fire will never come. It means when the fire comes that you will be able to withstand it." - Michael Simmons
|
|
|
|
|
Basically, I need an EXE which can fetch data from any URL and dump it into a database.
Thanks A Ton
Ash_VCPP
|
|
|
|
|
Ash_VCPP wrote: ...fetch data from any url...
Such as URLDownloadToFile() ?
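For reference, a call to URLDownloadToFile() might look roughly like the sketch below. The wrapper name downloadToFile is hypothetical, and the non-Windows branch is only a stub so that the snippet compiles elsewhere; on Windows it needs urlmon.lib.

```cpp
#include <string>

#ifdef _WIN32
#include <windows.h>
#include <urlmon.h>
#pragma comment(lib, "urlmon.lib")
#endif

// Hypothetical wrapper: downloads 'url' into the local file 'path'.
// Returns true on success, false on any failure.
bool downloadToFile(const std::string& url, const std::string& path)
{
#ifdef _WIN32
    // The ANSI variant is called explicitly so the sketch does not depend
    // on the UNICODE setting of the project.
    const HRESULT hr = ::URLDownloadToFileA(nullptr, url.c_str(), path.c_str(), 0, nullptr);
    return SUCCEEDED(hr);
#else
    (void)url;
    (void)path;
    return false;   // URLDownloadToFile is a Windows-only API
#endif
}
```

The downloaded file can then be read back and parsed before its contents are written to the database. The call blocks until the download finishes, so in a GUI application it probably belongs on a worker thread.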
"Old age is like a bank account. You withdraw later in life what you have deposited along the way." - Unknown
"Fireproof doesn't mean the fire will never come. It means when the fire comes that you will be able to withstand it." - Michael Simmons
|
|
|
|
|
I am not sure that it will work, because I remember that a few months ago I used it to download an XML file and icons from a server.
Thanks A Ton
Ash_VCPP
|
|
|
|
|
Hi,
I am able to draw a vertical line on a dialog; the line moves from left to right over a period of time. While it moves, I erase the previous line at its previous pixels and draw a new line at the new pixels.
But while the line is moving, if I open any other application on top of the dialog box, the previous line is not erased and remains alongside the new line.
Can anyone help me understand why this is happening?
Thanks in advance.
Any advice will be appreciated.
|
|
|
|
|
What exactly are you doing in the OnPaint handler ?
What you have to keep in mind is that Windows doesn't remember the drawings for your window. Whenever it needs to repaint the window (for instance because another window was on top of it), it asks your window to repaint itself by sending a WM_PAINT message. If you don't redraw your window properly, you will lose some data.
Also, the best way to erase the previous line in your example is simply to request a repaint of your window and draw the line at the new position (as everything is cleared, your previous line won't be visible anymore).
|
|
|
|
|
Hi Cedric,
Thanks for your reply. Actually, I am drawing the line along with the slider position on the dialog: the slider moves from left to right, and the line moves along with it.
I have written the code below in a thread:
m_pDlg->GetClientRect( &deflatedClientRect );
deflatedClientRect.DeflateRect( TB_WIDTH, TB_WIDTH );
m_pDlg->m_slider_bar1.SetPos(slide_pos);
m_pDlg->m_slider_bar2.SetPos(slide_pos);
m_pDlg->m_slider_bar1.GetThumbRect( &thumbRect );
m_pDlg->m_slider_bar1.ClientToScreen( &thumbRect );
m_pDlg->ScreenToClient( &thumbRect );
ptStart.x = thumbRect.CenterPoint().x;
ptStart.y = TB_WIDTH;
//ptEnd.x = ptStart.x;
//ptEnd.y = deflatedClientRect.bottom;
m_pDlg->x1 = ptStart.x;
m_pDlg->x2 = ptStart.x;
//m_pDlg->x2 = ptEnd.x;
//m_pDlg->y2 = 280;
m_pDlg->InvalidateRect(deflatedClientRect,TRUE);
With this code I get the slider's pixel position, and I call InvalidateRect from here.
In OnPaint I am using the code snippet below:
int nOldmode=dc.SetROP2(R2_NOTXORPEN);
dc.MoveTo(old_x1,old_y1);
dc.LineTo(old_x2,old_y2);
dc.MoveTo(x1,y1);
dc.LineTo(x2,y2);
old_x1 = x1;
old_y1 = y1;
old_x2 = x2;
old_y2 = y2;
Please check the code and let me know what the problem is.
Thanks in advance.
|
|
|
|
|
Did you read my previous message ?
Why are you repainting the old line ?
venki502 wrote: I have written the below code in one thread..
You are doing that in a separate thread ?
I wouldn't do that: instead simply handle the slider moved event and update the position of your line at that point. Why do you need a thread for that ?
|
|
|
|
|
Hi,
Actually, I am setting the slider position in the OnTimer function. Okay, can you please tell me which event is fired when the slider moves, so that I can try with that?
Thanks.
|
|
|
|
|
This is getting painful to read.
1/ In your thumbtrack handler, just call InvalidateRect () to cause the window to redraw.
2/ In the OnPaint routine, get the thumb positions, calculate your two end points for the line. Draw the line.
Go home and sleep.
It's not a complex problem. I have no idea why you're bringing in threads, etc.
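Step 2 above (calculating the two end points from the thumb position) could be sketched as a small pure function. The Point, Rect and Line structs and the name lineForThumb are hypothetical stand-ins for CPoint/CRect, so this shows only the arithmetic; in a real OnPaint handler the thumb rectangle would come from the slider (for example via GetThumbRect, converted to dialog client coordinates) and the result would be passed to MoveTo/LineTo.

```cpp
// Hypothetical stand-ins for CPoint / CRect, to keep the sketch portable.
struct Point { int x; int y; };
struct Rect  { int left; int top; int right; int bottom; };
struct Line  { Point from; Point to; };

// Computes a vertical line through the horizontal centre of the slider
// thumb, spanning the client area minus a margin at top and bottom.
// Both rectangles are assumed to be in the dialog's client coordinates.
Line lineForThumb(const Rect& thumb, const Rect& client, int margin)
{
    const int x = (thumb.left + thumb.right) / 2;
    return Line{ Point{ x, client.top + margin },
                 Point{ x, client.bottom - margin } };
}
```

Because the line is recomputed from the current slider state on every paint, there is no "old" line to track or erase: invalidating the window clears the background and the handler simply draws the line where it belongs now.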
If you want, you could be clever and use R2_NOT to undraw a line, and draw it in the new position, but that seems a little over complex at the moment.
Good luck,
Iain.
|
|
|
|
|