See more: C# HTML
Hello, I want a simple HTML analyzer (in C#) that can get the contents of an HTML element. Let me explain: I want to download a page, get the contents of the element matching ".class1 .class2 #id1 div", and then display it to the user. Do you have any leads (besides System.Net.WebClient)?

P.S. So far I have found the HTML Agility Pack, which uses XPath to get an element.
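
For reference, here is a rough sketch of how the selector from the question might be expressed as XPath with the HTML Agility Pack. It assumes the HtmlAgilityPack NuGet package is referenced; the class name SelectorDemo and the helper names are made up for illustration, and the XPath is only an approximation of the CSS descendant selector.

```csharp
using System;
using HtmlAgilityPack; // assumption: the HtmlAgilityPack NuGet package is installed

public static class SelectorDemo
{
    // XPath test for "element carries this CSS class", tolerant of multiple classes.
    private static string HasClass(string name)
    {
        return "contains(concat(' ', normalize-space(@class), ' '), ' " + name + " ')";
    }

    // Approximates the CSS selector ".class1 .class2 #id1 div" and returns the
    // inner HTML of the first matching div, or null when nothing matches.
    public static string InnerHtmlOfFirstMatch(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html); // lenient parser: the input need not be well-formed XML

        string xpath = "//*[" + HasClass("class1") + "]" +
                       "//*[" + HasClass("class2") + "]" +
                       "//*[@id='id1']//div";

        HtmlNode node = doc.DocumentNode.SelectSingleNode(xpath);
        return node == null ? null : node.InnerHtml;
    }
}
```

The page string itself could come from WebClient.DownloadString, as mentioned in the question; SelectNodes can be used instead of SelectSingleNode when every match is wanted.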
Posted 30-Aug-12 12:22pm
Edited 30-Aug-12 23:35pm

Solution 2

Hi,

I think jQuery can get all the information from your given HTML control. You can get the inner HTML of a particular div or table through its ID or its associated class.

Suppose you have the following HTML content:
<div class="demo-container">
  <div class="demo-box">Demonstration Box</div>
</div>

Then you can extract the inner div using:
$('div.demo-container').html();
and the result would be:
<div class="demo-box">Demonstration Box</div>
(The code above is taken from jQuery.)

But this is only possible if you already know the element hierarchy to navigate in the HTML.

Hope I answered your query,
Thanks
-Amit Gajjar.
Comments
mostwanted4 31-Aug-12 4:28am
   
Thanks, this is exactly what I want to do, but in C# (I forgot to mention it).
@amitgajjar 31-Aug-12 4:29am
   
If you have a web application, you can get the value using jQuery, store it in a hidden field, and read it from C# :)
mostwanted4 31-Aug-12 4:34am
   
I don't have a web application. I have a simple Windows Form in which I download the page as a string (using WebClient) and then process it. It would be awesome to run jQuery under these conditions.
@amitgajjar 31-Aug-12 4:36am
   
Check http://jint.codeplex.com/
mostwanted4 31-Aug-12 4:38am
   
Cool! Thanks ;)
@amitgajjar 31-Aug-12 4:39am
   
This works for JavaScript, but I have no idea whether it works with jQuery. Give me some time and I will post another solution.
@amitgajjar 31-Aug-12 4:43am
   
You can use the WebBrowser control and load your HTML page along with jQuery in that control. The WebBrowser.DocumentText property will give you your desired result.

Solution 1

You can use regex to parse a file if you know the tags you're looking for. XmlDocument also works, if it's XHTML.
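
As a minimal sketch of the regex approach, assuming the target is a single element with a known tag name and id (the names RegexExtract and InnerHtmlById are invented for this example):

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexExtract
{
    // Returns the inner HTML of the first <tag ... id="..."> element, or null.
    // Caveat: the lazy match stops at the first closing tag of that name,
    // so an element that contains nested tags of the same name is cut short.
    public static string InnerHtmlById(string html, string tag, string id)
    {
        string pattern =
            "<" + Regex.Escape(tag) + @"\b[^>]*\sid\s*=\s*[""']" + Regex.Escape(id) + @"[""'][^>]*>" +
            "(?<inner>.*?)" +
            "</" + Regex.Escape(tag) + @"\s*>";

        // Singleline lets '.' match newlines; IgnoreCase accepts <DIV> as well as <div>.
        Match m = Regex.Match(html, pattern, RegexOptions.IgnoreCase | RegexOptions.Singleline);
        return m.Success ? m.Groups["inner"].Value : null;
    }
}
```

This only holds up for simple, predictable markup. For something like ".class1 .class2 #id1 div", where nesting matters, a real HTML parser is the safer choice.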

Solution 3

Use LINQ to XML to achieve this.
For example:
string htmlString = @"<html>
                        <body>
                          <p>hi</p>
                          <table>...</table>
                        </body>
                     </html>";
 
XDocument htmlDoc = XDocument.Parse(htmlString);
Now htmlDoc contains the elements as a tree of XElement nodes (XElement derives from XNode):
html --> body --> p, table
Comments
mostwanted4 31-Aug-12 4:43am
   
Nice, but I'm afraid I'm dealing with HTML, not XHTML.
pramodhegde88 31-Aug-12 4:58am
   
This still works with HTML.
mostwanted4 31-Aug-12 9:11am
   
I tried it and it throws errors for tags like <link type="" rel="" href=""> where the tag is never closed.
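
The failure described above can be reproduced with a small check like the following. It is only a sketch; XmlParseCheck and TryParse are names invented for the example.

```csharp
using System;
using System.Xml;
using System.Xml.Linq;

public static class XmlParseCheck
{
    // Returns true if the markup parses as XML; otherwise returns false
    // and hands back the parser's error message.
    public static bool TryParse(string markup, out string error)
    {
        try
        {
            XDocument.Parse(markup);
            error = null;
            return true;
        }
        catch (XmlException ex)
        {
            // Thrown for markup that is not well-formed, e.g. an unclosed <link>.
            error = ex.Message;
            return false;
        }
    }
}
```

Passing a page whose <link> is written as <link ... /> returns true, while the same page with a bare <link ...> returns false, which is why XDocument is only an option for XHTML or otherwise well-formed markup.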

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Last Updated 31 Aug 2012
Copyright © CodeProject, 1999-2016