You can split the problem into two big cases: whether the original file is well-formed XML or not. If it is (it does not even matter whether it complies with the XHTML schema), the problem is solved very easily: use the classes
System.Xml.XmlTextReader/System.Xml.XmlTextWriter or System.Xml.XmlDocument. If it is not, this is a boring manual task with no 100% guarantee of the result. Search for something like this:
http://en.lmgtfy.com/?q=html+tidy
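
To decide which of the two cases you are in, one option is simply to try loading the text with System.Xml.XmlDocument and see whether it throws. The following is only a minimal sketch: the helper name TryLoadXml and the two sample strings are made up for illustration, and it assumes the input has no DOCTYPE and no HTML-only entities such as &nbsp;, which would need extra handling.

```csharp
using System;
using System.Xml;

static class WellFormedCheck
{
    // Hypothetical helper: returns the parsed document,
    // or null if the text is not well-formed XML.
    static XmlDocument TryLoadXml(string text)
    {
        var document = new XmlDocument();
        try
        {
            document.LoadXml(text); // throws XmlException on malformed input
            return document;
        }
        catch (XmlException)
        {
            return null;
        }
    }

    static void Main()
    {
        // Illustrative inputs only: the second one has an unclosed <br>.
        string wellFormed = "<html><body><p>Hello</p></body></html>";
        string tagSoup = "<html><body><p>Hello<br></body></html>";

        Console.WriteLine(TryLoadXml(wellFormed) != null); // True
        Console.WriteLine(TryLoadXml(tagSoup) != null);    // False
    }
}
```

When the helper returns a document, you are in the easy case and can keep working with the System.Xml classes. When it returns null, you are in the second case and need something like HTML Tidy before the XML classes become usable.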
—SA