Remove all the HTML tags and display a plain text only inside (in case XML is not well formed)

5.00/5 (2 votes)

Feb 15, 2012

CPOL
11804 views

I think the following Regex and HtmlDecode would do:
string html = ...;
string textonly = HttpUtility.HtmlDecode(
         Regex.Replace(html, @"<!--[\S\s]*?-->|<(?:"".*?""|'.*?'|[\S\s])*?>", ""));
Is there any HTML construct that this would not strip properly?
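
To try the expression against concrete inputs, here is a minimal console sketch. The names HtmlStripper and ToPlainText and the sample strings are made up for illustration and are not part of the original post. It assumes HttpUtility from System.Web is available (on .NET Framework that means adding a reference to System.Web.dll).

```csharp
using System;
using System.Text.RegularExpressions;
using System.Web; // HttpUtility; on .NET Framework, reference System.Web.dll

// Hypothetical wrapper for illustration; the names are not from the original post.
static class HtmlStripper
{
    // Same pattern as above: HTML comments, or tags whose quoted
    // attribute values may themselves contain '>'.
    const string TagOrComment =
        @"<!--[\S\s]*?-->|<(?:"".*?""|'.*?'|[\S\s])*?>";

    public static string ToPlainText(string html)
    {
        return HttpUtility.HtmlDecode(Regex.Replace(html, TagOrComment, ""));
    }
}

static class Program
{
    static void Main()
    {
        // Quoted '>' inside an attribute, an entity, and a comment.
        Console.WriteLine(HtmlStripper.ToPlainText(
            "<p class=\"a>b\">Hello &amp; <b>world</b></p><!-- note -->"));
        // prints: Hello & world

        // Only the tags are removed; the script text itself stays.
        Console.WriteLine(HtmlStripper.ToPlainText("<script>var x = 1;</script>Done"));
        // prints: var x = 1;Done

        // An unescaped '<' ... '>' pair in plain text is treated as a tag.
        Console.WriteLine(HtmlStripper.ToPlainText("1<2 and 3>2"));
        // prints: 12
    }
}
```

The last two inputs suggest where to look for problems: the contents of script and style elements are left in the output, and text containing unescaped angle brackets can be removed as if it were a tag.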