Posted 17 Dec 2010

Remove all the HTML tags and display only the plain text inside (in case the XML is not well formed)

19 Dec 2010
How about loading it into an XmlDocument and getting the InnerText? (Provided the HTML is well-formed XML, of course.)
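
A minimal sketch of the XmlDocument approach suggested above, in C#. The class and method names (XmlTextExtractor, TryGetInnerText) and the wrapping <root> element are assumptions made for illustration, not part of the original tip. The method reports failure instead of throwing when the input is not well-formed XML.

```csharp
using System;
using System.Xml;

public static class XmlTextExtractor
{
    // Tries to load the markup as XML and return the concatenated text content.
    // Returns false (and an empty string) when the markup is not well-formed XML.
    public static bool TryGetInnerText(string markup, out string text)
    {
        var document = new XmlDocument();
        try
        {
            // Assumption: wrap the input in a single root element so that
            // fragments with several top-level elements can still be loaded.
            document.LoadXml("<root>" + markup + "</root>");
        }
        catch (XmlException)
        {
            text = string.Empty;
            return false;
        }

        // InnerText concatenates the values of all descendant text nodes.
        text = document.DocumentElement?.InnerText ?? string.Empty;
        return true;
    }
}
```

Under these assumptions, <p>Hello <b>world</b>!</p> yields Hello world!, while <p>Hello<br>world</p> fails to load because the <br> element is never closed. That is the "not well formed" case the title of this tip refers to.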


Members may post updates or alternatives to this current article in order to show different approaches or add new features.

20 Dec 2010
Consider using the open source HTML Agility Pack library. It lets you use XPath queries to access very specific parts of an HTML document, and the HTML does not have to be valid, well-formed XML. In addition to accessing the raw inner text of an element you can...
5 Jan 2011
NOTE: If you really want plain text, you should also decode the HTML entities (System.Web.HttpUtility.HtmlDecode()) in the resulting text, or you'll wind up with HTML/XML character entity text in your output, such as &amp; and &#91;. If you're going to immediately output the...
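
As a small illustration of the decoding step, assuming the tags have already been removed by some other means. EntityDecoder and ToPlainText are hypothetical names; HttpUtility.HtmlDecode is the framework method named in the note. On the .NET Framework this needs a reference to System.Web.

```csharp
using System.Web;

public static class EntityDecoder
{
    // Converts character entities left in tag-stripped text (named ones
    // such as &amp; and numeric ones such as &#91;) back to characters.
    public static string ToPlainText(string strippedText)
    {
        return HttpUtility.HtmlDecode(strippedText);
    }
}
```

For example, the input Fish &amp; chips &#91;1&#93; comes back as Fish & chips [1].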
18 Jan 2011
Sorry, but I have to vote this way down. Your regular expression (or @Chris's) is not robust enough for what I would consider "real world" data. Especially if this is used on any kind of public web site, I would be afraid of JavaScript injection attacks and other things (depending on its usage)....
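
To make the robustness concern concrete, here is a sketch using a deliberately naive pattern, <[^>]*>. The pattern is an assumption chosen for illustration; it is not necessarily the expression from the original tip or from Chris's version. NaiveTagStripper and Strip are hypothetical names.

```csharp
using System.Text.RegularExpressions;

public static class NaiveTagStripper
{
    // Hypothetical pattern: "<", then any characters except ">", then ">".
    // It has no notion of attribute values, scripts, or comments.
    private static readonly Regex Tag = new Regex("<[^>]*>");

    // Removes every substring that matches the naive tag pattern.
    public static string Strip(string html)
    {
        return Tag.Replace(html, "");
    }
}
```

With this pattern, <script>alert(1)</script> becomes alert(1): the tags are removed but the script body stays in the output. A > inside an attribute value also ends the match early, so <a title="a > b">link</a> leaves the fragment  b">link behind.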
15 Feb 2012
Andreas Gieriet
I think the following Regex and HtmlDecode would do:

    string html = ...;
    string textonly = HttpUtility.HtmlDecode(Regex.Replace(html, @"|", ""));

Any HTML construct that would not be stripped off properly by this?


This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


About the Author

Software Developer (Senior)
United States
BSCS 1992 Wentworth Institute of Technology

Originally from the Boston (MA) area. Lived in SoCal for a while. Now in the Phoenix (AZ) area.

OpenVMS enthusiast, ISO 8601 evangelist, photographer, opinionated SOB, acknowledged pedant and contrarian

Last Updated 19 Dec 2010
Article Copyright 2010 by PIEBALDconsult
Everything else Copyright © CodeProject, 1999-2017