Hi All,

I am writing C# code to download the content of a web page, but the result should not contain the HTML tags.

How can I download just the text, or otherwise remove the tags after downloading the whole page?

Thanks in advance.
govardhan
Updated 23-May-11 0:29am

1 solution

Check out HttpWebRequest.GetResponse.

The documentation page has a nice example of how to get content from a web site.
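
For the download step, something along these lines should work. This is only a minimal sketch, not the sample from the documentation page: the class and method names (PageDownloader, Download, ReadAll) are made up for illustration, and it assumes the page is encoded as UTF-8, which is not true of every site.

```csharp
using System;
using System.IO;
using System.Net;
using System.Text;

// Hypothetical helper class; the names are illustrative only.
public static class PageDownloader
{
    // Reads an entire stream as text using the given encoding.
    // Kept separate so it can be tried without a network connection.
    public static string ReadAll(Stream body, Encoding encoding)
    {
        using (var reader = new StreamReader(body, encoding))
        {
            return reader.ReadToEnd();
        }
    }

    // Downloads the raw HTML of a page.
    // Assumption: the response body is UTF-8.
    public static string Download(string url)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        using (var response = (HttpWebResponse)request.GetResponse())
        using (var body = response.GetResponseStream())
        {
            return ReadAll(body, Encoding.UTF8);
        }
    }
}
```

Calling `PageDownloader.Download(url)` returns the page source with all tags still in it. GetResponse throws a WebException for network failures and HTTP error statuses, so real code would want a try/catch around the call.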
Stripping out the HTML tags is somewhat more complex.

One way is to use a regular expression to replace the HTML tags, but running regular expressions over HTML is fragile and can be very tricky.
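
To show what the regular expression route looks like, here is a rough sketch. The names (HtmlText, StripTags) are made up for illustration. It assumes reasonably well-formed markup and will get some input wrong: a ">" inside a comment or an attribute value confuses it, and because every tag is replaced with a space, a word split by an inline tag comes out as two words. For anything beyond simple pages, a real HTML parser is the safer choice.

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

// Hypothetical helper class; the names are illustrative only.
public static class HtmlText
{
    // Matches <script> and <style> elements together with their contents.
    private static readonly Regex ScriptOrStyle = new Regex(
        @"<(script|style)\b[^>]*>.*?</\1\s*>",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    // Matches any remaining tag, assuming no '>' appears inside it.
    private static readonly Regex Tag = new Regex(@"<[^>]+>");

    // Matches runs of whitespace left behind by the removals.
    private static readonly Regex Whitespace = new Regex(@"\s+");

    public static string StripTags(string html)
    {
        // 1. Drop script and style blocks, since their contents are not page text.
        string text = ScriptOrStyle.Replace(html, " ");
        // 2. Replace every remaining tag with a space.
        text = Tag.Replace(text, " ");
        // 3. Turn entities such as &amp; back into characters.
        text = WebUtility.HtmlDecode(text);
        // 4. Collapse whitespace and trim the ends.
        return Whitespace.Replace(text, " ").Trim();
    }
}
```

For example, `HtmlText.StripTags("<p>Hello <b>world</b> &amp; all</p>")` gives `Hello world & all`.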
 
Comments
Dalek Dave 23-May-11 8:55am    
Good link and Sage Advice!
Kim Togo 23-May-11 9:17am    
Thanks
Sergey Alexandrovich Kryukov 23-May-11 11:37am    
The sample on the reference page shows what to do. My 5.
--SA

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


