Remove all HTML tags and display only the plain text inside (in case the XML is not well formed)

Dec 27, 2010

CPOL
NOTE: If you really want plain text, you should also decode the HTML entities (System.Web.HttpUtility.HtmlDecode()) in the resulting text; otherwise your output will still contain HTML/XML character entities such as &amp;amp; and &amp;#91;. If you are going to write the text straight to a browser, however, you don't need to decode it.
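using System;
using System.Text.RegularExpressions;
using System.Web;
 . . .
class foo {
   public void bar() {
      // Illustrative input containing a tag pair and an encoded entity.
      string ss = "<b>Remove tags</b> &amp; HTML Entities";
      // Matches '<', any run of characters other than '>', then '>'.
      Regex regex = new Regex(@"<[^>]*>");
      // Response is only available inside a Page, so fetch it from the current context.
      HttpResponse Response = HttpContext.Current.Response;
      Response.Write(String.Format("Before: '{0}'\n", ss));
      ss = regex.Replace(ss, String.Empty);   // strip the tags first
      ss = HttpUtility.HtmlDecode(ss);        // then decode the entities
      Response.Write(String.Format("After: '{0}'\n", ss));
   }
}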
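
The order of the two calls matters: strip the tags first, then decode. If you decode first, an encoded "&lt;" or "&gt;" that was meant as literal text turns into a real angle bracket and the regex will eat whatever sits between them. The console program below is a minimal sketch of that comparison, not part of the original tip. It assumes System.Net.WebUtility.HtmlDecode as a stand-in for HttpUtility.HtmlDecode so that it runs without a reference to System.Web, and the names PlainText, StripThenDecode and DecodeThenStrip are made up for illustration.

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

public static class PlainText
{
    // Matches '<', any run of characters other than '>', then '>'.
    private static readonly Regex TagPattern = new Regex(@"<[^>]*>");

    // Order used in the tip: remove the tags, then decode the entities.
    public static string StripThenDecode(string html)
    {
        string withoutTags = TagPattern.Replace(html, String.Empty);
        return WebUtility.HtmlDecode(withoutTags);
    }

    // Reversed order, shown only for comparison.
    public static string DecodeThenStrip(string html)
    {
        string decoded = WebUtility.HtmlDecode(html);
        return TagPattern.Replace(decoded, String.Empty);
    }

    public static void Main()
    {
        string input = "<p>if a &lt; b and b &gt; c</p>";

        // Prints: if a < b and b > c
        Console.WriteLine(StripThenDecode(input));

        // Loses " b and b " because the decoded '<' ... '>' pair looks like a tag.
        Console.WriteLine(DecodeThenStrip(input));
    }
}
```

Bear in mind that a regex like this is only a rough fallback for markup that is not well formed; it knows nothing about comments, scripts, or a '>' inside an attribute value.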