See more: C#
I need some help finding a C# .NET solution for scraping an Ajax website.
Anyone?
Posted 27 Nov '12 - 7:40

Comments
Plyswthsqurles - 27 Nov '12 - 15:30
What exactly are you trying to grab from these websites? You have a number of options. 1) XPath to load an HTML document, which can be tricky with malformed HTML. 2) Selenium (browser automation, but it has .NET capabilities). 3) Html Agility Pack to load a website; it also handles malformed HTML. A non-C# solution that is still browser automation is Watir, which is Ruby.
Paw Jershauge - 27 Nov '12 - 15:35
Well, I'm not big on website building anymore; I stopped at classic ASP ;) I'm more into WinForms. So let's see if I can explain myself. Here goes: I have a website that posts the status of some systems. The status message and associated information are posted back via Ajax, and therefore the normal HTMLElements won't hold the correct text in the innerText property. Hope that makes sense ;)
ryanb31 - 27 Nov '12 - 15:55
AJAX can easily return strings. What exactly is coming back from the AJAX call that can't go into the html elements? Something doesn't seem right here.
Paw Jershauge - 27 Nov '12 - 15:58
ryanb31, it's not that Ajax can't return the data; it does, and I can view the message in my browser. But when I look at the HTML source of the site, the message is not there. There is only a {{message}} variable or something similar in the place where the message text should be.

1 solution

Well, I believe I found a workaround for this issue.
I just use the WebBrowser instead of the WebClient and have the WebBrowser render the whole site before extracting the HtmlDocument. It takes time, but it works.
 
Here's the code:
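        // Requires: using System; using System.Threading; using System.Windows.Forms;
        // Must be called on an STA thread (for example, the WinForms UI thread).
        public HtmlDocument GetHtmlAjax(Uri uri, int ajaxLoadTimeoutMs)
        {
            using (WebBrowser wb = new WebBrowser())
            {
                wb.Navigate(uri);
                // Pump messages until the initial page load has finished.
                while (wb.ReadyState != WebBrowserReadyState.Complete)
                    Application.DoEvents();
                // Give the page's Ajax calls time to return, then process the queued callbacks.
                Thread.Sleep(ajaxLoadTimeoutMs);
                Application.DoEvents();
                // The browser is disposed on return; if the returned document misbehaves,
                // extract the text you need inside this block instead.
                return wb.Document;
            }
        }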
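
If the fixed sleep turns out to be too short or wastefully long, one possible variation is to poll a single element until its template placeholder has been replaced, and to return the text rather than the HtmlDocument. This is only a sketch under assumptions: the names AjaxScraper, GetRenderedText and IsRendered are made up for illustration, and it assumes the target element has an id and that an unrendered value still contains "{{", as in the {{message}} placeholder described above.

```csharp
using System;
using System.Diagnostics;
using System.Threading;
using System.Windows.Forms;

public static class AjaxScraper
{
    // Hypothetical check: text counts as rendered once it is non-empty
    // and no longer contains a "{{" template placeholder.
    public static bool IsRendered(string text)
    {
        return !string.IsNullOrEmpty(text) && !text.Contains("{{");
    }

    // Hypothetical helper: returns the element's text once it is rendered,
    // or null if that does not happen within timeoutMs milliseconds.
    // Must be called on an STA thread, like any WebBrowser code.
    public static string GetRenderedText(Uri uri, string elementId, int timeoutMs)
    {
        using (WebBrowser wb = new WebBrowser())
        {
            wb.ScriptErrorsSuppressed = true; // avoid script error dialogs
            wb.Navigate(uri);
            Stopwatch timer = Stopwatch.StartNew();
            while (timer.ElapsedMilliseconds < timeoutMs)
            {
                Application.DoEvents(); // let the page load and run its scripts
                if (wb.ReadyState == WebBrowserReadyState.Complete && wb.Document != null)
                {
                    HtmlElement element = wb.Document.GetElementById(elementId);
                    string text = element != null ? element.InnerText : null;
                    if (IsRendered(text))
                        return text; // read before the browser is disposed
                }
                Thread.Sleep(50); // short pause so the loop does not spin the CPU
            }
            return null; // timed out
        }
    }
}
```

A call might look like AjaxScraper.GetRenderedText(new Uri("http://example.com/status"), "statusMessage", 10000), where both the URL and the element id are placeholders. Returning a string avoids handing out a document that belongs to a disposed browser, at the cost of only reading one element per call.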

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Last Updated 27 Nov 2012
Copyright © CodeProject, 1999-2013
All Rights Reserved. Terms of Use