HTML white-listing

21 Aug 2009
Prevent XSS vulnerabilities when rendering user-editable HTML

Introduction

This code removes potential XSS vulnerabilities from HTML by using a white-list definition of safe elements and attributes. The white-list definition is based on a subset of the valid_elements definition used by TinyMCE, a WYSIWYG JavaScript HTML editor:

http://wiki.moxiecode.com/index.php/TinyMCE:Configuration/valid_elements
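
To make the white-list format more concrete, here is a minimal sketch of how a definition string in this style could be parsed into a lookup of allowed elements and attributes. It is not the parser used by this project: the WhiteListParser class and its Parse method are hypothetical names, it handles only the element[attr|attr=value] form, and it assumes that =value denotes a default value, as in TinyMCE's valid_elements syntax. Prefix characters in front of an element name are not interpreted.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration; not part of the article's download.
public static class WhiteListParser
{
    // Returns: element name -> (attribute name -> default value).
    // The default value is an empty string when the rule gives none.
    public static Dictionary<string, Dictionary<string, string>> Parse(string whiteList)
    {
        var rules = new Dictionary<string, Dictionary<string, string>>(
            StringComparer.OrdinalIgnoreCase);

        // Rules are comma-separated, e.g. "a[id|href],b,br".
        foreach (string rawRule in whiteList.Split(
                     new[] { ',' }, StringSplitOptions.RemoveEmptyEntries))
        {
            string rule = rawRule.Trim();
            if (rule.Length == 0) continue;

            string element = rule;
            var attributes = new Dictionary<string, string>(
                StringComparer.OrdinalIgnoreCase);

            // An optional [..] suffix lists the allowed attributes, separated by '|'.
            int open = rule.IndexOf('[');
            if (open >= 0 && rule.EndsWith("]", StringComparison.Ordinal))
            {
                element = rule.Substring(0, open).Trim();
                string inner = rule.Substring(open + 1, rule.Length - open - 2);

                foreach (string rawAttr in inner.Split(
                             new[] { '|' }, StringSplitOptions.RemoveEmptyEntries))
                {
                    // "target=_blank" -> name "target", assumed default "_blank".
                    string[] parts = rawAttr.Split(new[] { '=' }, 2);
                    attributes[parts[0].Trim()] =
                        parts.Length == 2 ? parts[1].Trim() : string.Empty;
                }
            }

            rules[element] = attributes;
        }

        return rules;
    }
}
```

With this sketch, WhiteListParser.Parse("a[id|href|target=_blank],b,br") yields three element entries, and the entry for a maps target to _blank. Any element or attribute missing from the lookup would be treated as unsafe and dropped by a cleaner built on top of it.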

Background

I originally wrote this project in VB.NET (way back in 2003) to provide server-side validation for TinyMCE. Since then, my coding style, commenting and general programming skills have improved dramatically, so I'm sure many of you will spot areas for improvement.

Using the code

A basic call passes the HTML to clean and a white-list definition string to AntiXssSafeHtmlClean.AntiXssSafeHtmlCleanString, which returns the cleaned HTML:

//
// A very basic call to clean XSS and other unwanted tags/attributes
//
string DirtyHtml = "<a href=\"www.google.com\">google link<a><br><script>function dosomethingbad(){alert('gotcha');}</script>";
string WhiteList = "a[id|href|target=_blank|class=myclass],b,p[align],br,i,ul,li,~script,div";
string CleanHtml = AntiXssSafeHtmlClean.AntiXssSafeHtmlCleanString(DirtyHtml, WhiteList);

Other features include:

  • Removing whitespace from the output
  • Applying a CSS class to inbound links
  • Applying a CSS class to outbound links
  • Setting a maximum length for hyperlink text
  • Converting links to anchor tags
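
The link-related options in the list above could work roughly as follows. This is a sketch under stated assumptions, not the project's code: LinkHelper and its methods are hypothetical names, a link counts as outbound only when it is an absolute http or https URL whose host differs from the site's host, and overlong link text is cut and suffixed with "...".

```csharp
using System;

// Hypothetical helper for illustration; not part of the article's download.
public static class LinkHelper
{
    // True when href is an absolute http(s) URL pointing at another host.
    // Relative URLs and scheme-less values such as "www.google.com"
    // are treated as inbound in this sketch.
    public static bool IsOutbound(string href, string siteHost)
    {
        Uri uri;
        if (!Uri.TryCreate(href, UriKind.Absolute, out uri))
            return false;

        bool isWeb = uri.Scheme == Uri.UriSchemeHttp
                  || uri.Scheme == Uri.UriSchemeHttps;
        if (!isWeb)
            return false;

        return !string.Equals(uri.Host, siteHost,
                              StringComparison.OrdinalIgnoreCase);
    }

    // Picks the CSS class to apply to a link.
    public static string ClassFor(string href, string siteHost,
                                  string inboundClass, string outboundClass)
    {
        return IsOutbound(href, siteHost) ? outboundClass : inboundClass;
    }

    // Shortens link text to at most maxLength characters plus "...".
    public static string TruncateLinkText(string text, int maxLength)
    {
        if (maxLength < 0)
            throw new ArgumentOutOfRangeException("maxLength");
        if (text.Length <= maxLength)
            return text;
        return text.Substring(0, maxLength) + "...";
    }
}
```

For example, LinkHelper.ClassFor("http://www.google.com/", "example.com", "inbound", "outbound") selects the outbound class, while a relative href such as "/about" selects the inbound one. Comparing hosts exactly is a deliberate simplification; a real site may also want to treat its own subdomains as inbound.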

Points of Interest

If you find any bugs, or have any suggestions, please message me.

History

V 1.0 (alpha) - First release to public 

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.
