Could anyone tell me how to parse HTML in .NET?

I have read many articles online, and people seem to both recommend and criticize each of the options: Regex, MSHTML, Html Agility Pack, and SGMLReader.

All I need is to extract the href value from an <a> tag along with the tag text, e.g.:

<a href="www.somesite.com">click here</a>


What I need in this case is the href value and the text "click here".

Thanks.
Updated 5-Jun-11 21:34pm
Comments
Dalek Dave 6-Jun-11 3:34am    
Edited for Grammar and Readability.

 
Comments
Cool Smith 5-Jun-11 14:39pm    
I already mentioned that I have used that, but I still have problems. Maybe you could give some example code?
Manfred Rudolf Bihy 5-Jun-11 14:46pm    
No, you have not! You said that you've read articles and collected opinions. You said nothing about trying HAP or about the problems you ran into.
Kim Togo 5-Jun-11 14:52pm    
See updated answer with a link to code example.
And you have not mentioned anything about having tried HAP.
Manfred Rudolf Bihy 5-Jun-11 14:43pm    
Exactly what I'd use! +5
Kim Togo 6-Jun-11 3:02am    
Thanks Manfred
HtmlAgilityPack is the way to go. What you're looking for is what that tool was practically made for. Can't supply a sample just now as I'm writing this on a mobile device.

Cheers!

--MRB
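
Since example code was asked for in the comments, here is a minimal sketch of how the href value and link text could be pulled out with Html Agility Pack. It assumes the HtmlAgilityPack NuGet package is referenced. The class name LinkExtractor and the method ExtractLinks are hypothetical names made up for this illustration and are not part of the library. Treat it as a starting point under those assumptions, not as the definitive way to do it.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack; // third-party package, assumed to be installed via NuGet

public static class LinkExtractor
{
    // Hypothetical helper (not part of HAP): returns one (href, text) pair
    // for every <a> element that carries an href attribute.
    public static List<KeyValuePair<string, string>> ExtractLinks(string html)
    {
        var links = new List<KeyValuePair<string, string>>();

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes returns null, not an empty collection, when nothing matches.
        HtmlNodeCollection anchors = doc.DocumentNode.SelectNodes("//a[@href]");
        if (anchors == null)
        {
            return links;
        }

        foreach (HtmlNode anchor in anchors)
        {
            // The second argument is the fallback used if the attribute is missing.
            string href = anchor.GetAttributeValue("href", string.Empty);

            // InnerText keeps entities such as &amp; encoded, so decode and trim it.
            string text = HtmlEntity.DeEntitize(anchor.InnerText).Trim();

            links.Add(new KeyValuePair<string, string>(href, text));
        }

        return links;
    }

    public static void Main()
    {
        const string html = "<a href=\"www.somesite.com\">click here</a>";

        foreach (KeyValuePair<string, string> link in ExtractLinks(html))
        {
            Console.WriteLine("{0} -> {1}", link.Key, link.Value);
        }
    }
}
```

For the snippet from the question this should print "www.somesite.com -> click here". The XPath expression //a[@href] deliberately skips anchors without an href (for example named anchors), and the null check matters because SelectNodes does not return an empty list when there are no matches.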
 
Comments
Kim Togo 5-Jun-11 14:47pm    
My 5 for the HtmlAgilityPack.
Wonde Tadesse 5-Jun-11 15:53pm    
5+
arindamrudra 6-Jun-11 3:22am    
+5
Dalek Dave 6-Jun-11 3:35am    
Good Call.

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


