
A managed wrapper for the HTML Tidy library

By Uwe Keim, 12 Jan 2007
 

(For the latest changes, please see the history section at the end of the article)

Introduction

This is a small library, still at an early stage, that provides a native .NET way to access the functions of the HTML Tidy library.

HTML Tidy is an open source C library for checking and generating clean XHTML/HTML. In other words: you can pass malformed HTML to the library and it will do its best to repair the errors and remove unnecessary items/tags from the HTML.

The Library

There is already a way to access the library from .NET, namely the ATL wrapper by Charles Reitzel (SourceForge CVS repository of the sources here). However, you need to register the COM ActiveX control first.

To get rid of this registration requirement, I created a C++/CLI wrapper around the original HTML Tidy C library. The wrapper is a regular assembly that you can use in your .NET applications by simply adding a reference to it.

Please note that my library does not yet deserve to be called a "library", because so far it consists of just one single function.

The reason I am publishing it here and now anyway is that I want to share the basic idea as early as possible with anyone in the same situation as me (needing a .NET wrapper for HTML Tidy). It is rather easy to take my library as a starting point and add the functions you need. I did the core work; you simply add the functions you like.

Of course I will gradually add more functions to the library as my requirements grow. I also encourage you to enhance it yourself and send me your code so that I can include it.

The underlying C library

It was a pleasure to compile the original HTML Tidy C library. I opened the provided Visual Studio .NET project file, compiled it for debug and release, and received no errors and no warnings. Amazing! I have never had such a seamless experience compiling third-party C/C++ libraries.

Using my .NET library

The library currently has one function to call:

public string CleanHtml( string html );

Simply pass a string and get a cleaned up string back. Easy, isn't it?

An example usage could be:

using ( HtmlTidy tidy = new HtmlTidy() )
{
  string html =
    @"
    <html>
      <head>
        <meta http-equiv=""Content-Type"" content=""text/html; charset=utf-16"">
      </head>
      <body>
        <p>Hello, <b><i>With German</b></i>: ÄÖÜ. Some Chinese: &#35754;.</p>
      <body>
    </html>
    ";

 
  string s = tidy.CleanHtml(
    html,
    HtmlTidyOptions.ConvertToXhtml );

 
  Console.WriteLine( s );
}

As you can see, you simply pass the string to the function. There is also an overload that takes an option (currently only one; more will be added in the future).

Redistributing

In order to redistribute the library, please ensure that the Microsoft CRT runtime DLLs "msvcr80.dll", "msvcm80.dll" and "msvcp80.dll" are distributed as well. These DLLs are usually found in the folder "C:\Program Files\Microsoft Visual Studio 8\VC\redist\x86\Microsoft.VC80.CRT".
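
A quick deployment check along the following lines may help catch a missing runtime file early. This is only a sketch: the CrtCheck class and its FindMissing method are hypothetical names and not part of the wrapper, and it assumes the DLLs are deployed into the application's own folder. It only tests whether the files are present; it does not verify their versions.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper, not part of the wrapper library.
static class CrtCheck
{
    // The three CRT DLLs named in this article.
    static readonly string[] RequiredDlls =
        { "msvcr80.dll", "msvcm80.dll", "msvcp80.dll" };

    // Returns the names of the required DLLs that are not in the given folder.
    public static List<string> FindMissing(string folder)
    {
        List<string> missing = new List<string>();
        foreach (string name in RequiredDlls)
        {
            if (!File.Exists(Path.Combine(folder, name)))
            {
                missing.Add(name);
            }
        }
        return missing;
    }

    static void Main()
    {
        // Assumption: the DLLs are expected next to the executable.
        string folder = AppDomain.CurrentDomain.BaseDirectory;
        foreach (string name in FindMissing(folder))
        {
            Console.WriteLine("Missing CRT DLL: " + name);
        }
    }
}
```

Running such a check at startup, or as part of an installer test, gives a clearer message than the load error that would otherwise appear when the wrapper assembly is first used.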

History

  • 2007-01-14
    Added the section about redistributing the CRT library.
     
  • 2007-01-12
    First version published.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

About the Author

Uwe Keim
Chief Technology Officer, Zeta Producer Desktop CMS
Germany
Uwe has been programming since 1989, with experience in Assembler, C++, MFC and lots of web and database stuff, and now uses ASP.NET and C# extensively, too. He has also taught programming to students at the local university.
 
In his free time, he goes climbing, running and mountain biking. He recently became the father of a cute boy.
 

Article Copyright 2007 by Uwe Keim