Hi. I'm scraping data from a website that uses pagination. I traverse all the pages in the pagination, but after a page has fully loaded, my code still waits a little longer before it starts scraping the data.

My question is: is there a better implementation than the one below, so that scraping starts as soon as the page has completely loaded?

C#
webBrowser1.Navigate("javascript:changeCar('"+br+"')");
while (webBrowser1.ReadyState != WebBrowserReadyState.Complete)
{
    Application.DoEvents();
}


The problem is with Application.DoEvents(), which makes the code wait...

Thanks, Cheers

For Web scraping, you should rather use the class System.Net.HttpWebRequest. It is actually far better suited, as a Web browser is largely irrelevant to the problem.

Please see my past answers:
How to get the data from another site,
get specific data from web page.
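
A minimal sketch of the HttpWebRequest approach, for illustration only. The class name PageFetcher, the method Download and the example URL are hypothetical, and the sketch assumes the paginated data can be reached through a plain HTTP GET; if changeCar loads its data some other way, the request it sends would have to be reproduced instead.

```csharp
using System;
using System.IO;
using System.Net;

// Hypothetical helper: downloads the raw HTML of one page without a browser control.
static class PageFetcher
{
    public static string Download(string url)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.Method = "GET";

        // GetResponse blocks until the server answers, so no polling loop is needed.
        using (var response = (HttpWebResponse)request.GetResponse())
        using (var reader = new StreamReader(response.GetResponseStream()))
        {
            // StreamReader assumes UTF-8 here; other encodings need an explicit Encoding.
            return reader.ReadToEnd();
        }
    }

    static void Main()
    {
        // Placeholder URL, not taken from the question.
        string html = Download("http://example.com/");
        Console.WriteLine(html.Length);
    }
}
```

The returned string still has to be parsed to extract the data; that step is not shown here.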

—SA
 
 
Comments
Joezer BH 10-Sep-13 3:29am    
5ed!
Sergey Alexandrovich Kryukov 10-Sep-13 10:39am    
Thank you.
—SA
You would be better off using the WebBrowser.DocumentCompleted event; then you would not need to run any loop while waiting for the page to load.
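
A minimal sketch of that idea, under a few assumptions: the form class ScraperForm, the ScrapePage method, the LastPage constant and the start URL are hypothetical, and it assumes that calling changeCar triggers a navigation that raises DocumentCompleted. If the site swaps content in place without navigating, the event may not fire and a different signal would be needed.

```csharp
using System;
using System.Windows.Forms;

public class ScraperForm : Form
{
    private readonly WebBrowser webBrowser1 = new WebBrowser();
    private int br = 1;              // page index, named as in the question
    private const int LastPage = 5;  // hypothetical number of pages

    public ScraperForm()
    {
        webBrowser1.Dock = DockStyle.Fill;
        Controls.Add(webBrowser1);

        // Subscribe once; the handler replaces the DoEvents polling loop.
        webBrowser1.DocumentCompleted += WebBrowser1_DocumentCompleted;
        webBrowser1.Navigate("http://example.com/"); // placeholder start URL
    }

    private void WebBrowser1_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
    {
        // The event can fire more than once per page (for example, once per frame),
        // so skip calls made before the whole document is ready.
        if (webBrowser1.ReadyState != WebBrowserReadyState.Complete)
            return;

        ScrapePage(webBrowser1.Document);

        // Move on to the next page, using the same call as in the question.
        if (br < LastPage)
        {
            br++;
            webBrowser1.Navigate("javascript:changeCar('" + br + "')");
        }
    }

    // Hypothetical placeholder for the actual scraping logic.
    private void ScrapePage(HtmlDocument document)
    {
        Console.WriteLine(document.Title);
    }

    [STAThread]
    private static void Main()
    {
        Application.Run(new ScraperForm());
    }
}
```

Because the handler runs on the UI thread when the control reports completion, the form stays responsive without Application.DoEvents().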

Best regards
Espen Harlinn
 
 
Comments
Sergey Alexandrovich Kryukov 9-Sep-13 16:55pm    
As the OP mentioned that the purpose is Web scraping, using a Web browser is in general pointless; HttpWebRequest should be used. Please see my answer.
—SA
@Espen I will put DocumentCompleted into my code, because the app sometimes breaks with Application.DoEvents().

@Sergey I also use HttpWebRequest sometimes, but I'm a little more comfortable using the Internet Explorer wrapper, and the speed of my app is not a problem here. :)

Thanks both. New ideas here from everyone are welcome.
 
 
Comments
Sergey Alexandrovich Kryukov 9-Sep-13 16:57pm    
First of all, this is not an answer, so it should not be here.
An IE wrapper cannot speed anything up; it will be much, much slower, and it's just pointless. Isn't it obvious? If you need to scrape, scrape it. You really need to use HttpWebRequest.
—SA

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


