See more: C#
I need some help finding a C# .NET solution for scraping an Ajax website.
Anyone?
Posted 27-Nov-12 8:40am
Comments
Plyswthsqurles at 27-Nov-12 15:30pm
   
What exactly are you trying to grab from these websites? You have a number of options.

1) XPath against a loaded HTML document (can be tricky with malformed HTML)
2) Selenium (browser automation, but it has .NET capabilities)
3) Html Agility Pack to load a website; it also handles malformed HTML

A non-C# solution, also browser-automation related, is Watir (it's Ruby).
Paw Jershauge at 27-Nov-12 15:35pm
   
Well, I'm not big on website building anymore; I stopped at classic ASP ;) I'm more into WinForms. So let's see if I can explain myself. Here goes:
I have a website that posts the status of some systems. The status message and associated information are posted back via Ajax, and therefore the normal HTMLElements won't hold the correct text in the innerText property. Hope that makes sense ;)
ryanb31 at 27-Nov-12 15:55pm
   
AJAX can easily return strings. What exactly is coming back from the AJAX call that can't go into the html elements? Something doesn't seem right here.
Paw Jershauge at 27-Nov-12 15:58pm
   
ryanb31, it's not that Ajax can't return the data; it does, and I can view the message in my browser. But when I look at the HTML source of the site, the message is not there; there is only a {{message}} variable or something similar in the place where the message text should be.
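
To make the symptom concrete, here is a minimal sketch of a check for markup that still holds unfilled template placeholders, which is what a plain HTTP download of such a page would return. The mustache-style `{{name}}` syntax is taken from the description above; the class name, method name, and sample markup are hypothetical, and the real site's placeholder format may differ.

```csharp
using System;
using System.Text.RegularExpressions;

static class TemplateCheck
{
    // Matches mustache-style placeholders such as {{message}}.
    // Assumption: the site uses this placeholder syntax.
    private static readonly Regex Placeholder =
        new Regex(@"\{\{\s*[\w.]+\s*\}\}", RegexOptions.Compiled);

    // True if the markup still contains placeholders, i.e. the
    // client-side script has not filled in the values yet.
    public static bool HasUnrenderedPlaceholders(string html)
    {
        return !string.IsNullOrEmpty(html) && Placeholder.IsMatch(html);
    }
}

static class Program
{
    static void Main()
    {
        // Hypothetical markup as a plain download might return it.
        string raw = "<span id=\"status\">{{message}}</span>";
        // Hypothetical markup after a browser has run the page's script.
        string rendered = "<span id=\"status\">All systems running</span>";

        Console.WriteLine(TemplateCheck.HasUnrenderedPlaceholders(raw));      // True
        Console.WriteLine(TemplateCheck.HasUnrenderedPlaceholders(rendered)); // False
    }
}
```

A check like this can tell you whether the HTML you downloaded was rendered at all, before you try to read values out of it.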

1 solution

Solution 1

Well, I believe I found a workaround for this issue.
I just use the WebBrowser instead of the WebClient and have the WebBrowser render the whole site before extracting the HtmlDocument. It takes time, but it works.

Here's the code:
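        // Requires: using System; using System.Diagnostics; using System.Windows.Forms;
        // Must be called on an STA thread (e.g. the WinForms UI thread).
        public HtmlDocument GetHtmlAjax(Uri uri, int AjaxTimeLoadTimeOut)
        {
            using (WebBrowser wb = new WebBrowser())
            {
                wb.Navigate(uri);
                // Wait for the initial page load to finish.
                while (wb.ReadyState != WebBrowserReadyState.Complete)
                    Application.DoEvents();
                // Keep pumping messages for AjaxTimeLoadTimeOut milliseconds so the
                // Ajax calls can finish; a plain Thread.Sleep would block the
                // thread that the page's scripts run on.
                Stopwatch timer = Stopwatch.StartNew();
                while (timer.ElapsedMilliseconds < AjaxTimeLoadTimeOut)
                    Application.DoEvents();
                // Caution: wb is disposed on return, so the document may not stay
                // usable; copy out what you need from it promptly.
                return wb.Document;
            }
        }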
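
As a possible refinement of the workaround above, the sketch below polls the rendered page until one element's text no longer contains a `{{` placeholder, instead of always waiting for a fixed time. This is an assumption-laden variant, not the poster's method: `AjaxScraper`, `LooksRendered`, `GetRenderedText`, and the idea of identifying the target by element id are all hypothetical, and it assumes the status text lives in an element with a known id.

```csharp
using System;
using System.Diagnostics;
using System.Threading;
using System.Windows.Forms;

static class AjaxScraper
{
    // True once the text is non-empty and holds no "{{" placeholder marker.
    public static bool LooksRendered(string text)
    {
        return !string.IsNullOrEmpty(text) && !text.Contains("{{");
    }

    // Returns the rendered text of the element, or null if the timeout expires.
    // Call this on an STA thread, for example the WinForms UI thread.
    public static string GetRenderedText(Uri uri, string elementId, int timeoutMs)
    {
        using (WebBrowser wb = new WebBrowser())
        {
            wb.Navigate(uri);
            Stopwatch timer = Stopwatch.StartNew();
            while (timer.ElapsedMilliseconds < timeoutMs)
            {
                Application.DoEvents(); // pump messages so the page scripts can run
                Thread.Sleep(10);       // short pause to avoid spinning the CPU
                if (wb.ReadyState != WebBrowserReadyState.Complete || wb.Document == null)
                    continue;
                HtmlElement element = wb.Document.GetElementById(elementId);
                string text = element == null ? null : element.InnerText;
                if (LooksRendered(text))
                    return text; // a plain string, safe to use after wb is disposed
            }
            return null;
        }
    }
}
```

A call might look like `AjaxScraper.GetRenderedText(new Uri("http://example.com/status"), "status", 10000)`, where both the URL and the element id are placeholders. Returning a string rather than the HtmlDocument avoids handing back an object that belongs to a disposed browser control.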

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Last Updated 27 Nov 2012
Copyright © CodeProject, 1999-2015