I am trying to put together a security demonstration for work, and I want to show why DTD entity processing can be dangerous and therefore why we want to prevent it. I am trying to get it working in a larger context, but here is the code I am having difficulty with.

C#
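// XML with an internal DTD subset that declares an external entity (xxe).
String str = "<?xml version=\"1.0\"?><!DOCTYPE foo [<!ELEMENT foo ANY> <!ENTITY xxe    SYSTEM \"http://www.google.com\" >]><root>&xxe;</root>";
String encoded = System.Web.HttpUtility.UrlEncode(str); // not used by the load below
System.Xml.XmlDocument xDoc = new System.Xml.XmlDocument();
xDoc.LoadXml(str); // the exception described below is thrown here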


The difficulty is that every time I load my XML document, I get an exception saying that the DOCTYPE token was not expected. How do I enable loading a DTD in C# with LoadXml and parse my DTD with entity expansion, without getting an exception?
Comments
CdnSecurityEngineer 25-Sep-13 15:24pm    
I think you didn't really understand or answer my question. My question was specifically how to get an XML document to load using XmlDocument.Load, given the DTD attached. What I specifically want to know is why, when I call XmlDocument.Load, I don't get entity expansion. Can you answer that?

1 solution

First of all, there is no such thing as a DTD separate from XML. There is a DOCTYPE, which can be defined using a separate "external entity" file.

DOCTYPE parsing is a very unusual thing. Most XML parsers use the DOCTYPE for validation but don't provide access to the parsed DOCTYPE elements. Also, there is no DOM standard for DOCTYPE comparable to the one defined for the rest of XML. When I needed DOCTYPE parsing (to create metadata, a schema), I had to develop my own parser. There are a number of Java parsers, but I have never heard of anything for .NET. You can try to find one on the Web.
Moreover, with the standardization of XML Schema, the situation has only become worse, because the use of DOCTYPE has been greatly reduced. You, too, should think about migrating to Schema:
http://en.wikipedia.org/wiki/XML_Schema_%28W3C%29

I found that, reportedly, a DOCTYPE structure can be parsed using an SGML parser. Please see:
http://stackoverflow.com/questions/3760220/how-do-i-parse-a-dtd-file
http://stackoverflow.com/questions/1148083/sgml-parser-net-recommendations
https://github.com/MindTouch/SGMLReader
http://en.wikipedia.org/wiki/Standard_Generalized_Markup_Language

—SA
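
For the question as asked (loading a document whose DOCTYPE declares an external entity and seeing that entity expanded), here is a minimal sketch of one possible approach. It is not a confirmed fix for the poster's environment. It assumes .NET 4.0 or later, where XmlReaderSettings.DtdProcessing is available, and it reads the document through an XmlReader instead of calling LoadXml directly. The class name DtdEntityDemo, the temp file name xxe-demo.txt, and the use of a local file in place of the http://www.google.com URL are illustrative assumptions, chosen so the example runs without network access.

```csharp
using System;
using System.IO;
using System.Xml;

// Hypothetical demo class; the name is not from the original post.
class DtdEntityDemo
{
    static void Main()
    {
        // Assumption: a local temp file stands in for the external resource
        // that the entity points at.
        string targetPath = Path.Combine(Path.GetTempPath(), "xxe-demo.txt");
        File.WriteAllText(targetPath, "contents of a local file");
        string entityUri = new Uri(targetPath).AbsoluteUri;

        // Same shape as the XML in the question, with the DOCTYPE name
        // matching the root element and the entity pointing at the temp file.
        string xml =
            "<?xml version=\"1.0\"?>" +
            "<!DOCTYPE root [<!ELEMENT root ANY> " +
            "<!ENTITY xxe SYSTEM \"" + entityUri + "\">]>" +
            "<root>&xxe;</root>";

        XmlReaderSettings settings = new XmlReaderSettings();
        // Readers from XmlReader.Create reject DTDs by default (Prohibit).
        settings.DtdProcessing = DtdProcessing.Parse;
        // A resolver is needed for the reader to fetch external entities.
        settings.XmlResolver = new XmlUrlResolver();

        XmlDocument doc = new XmlDocument();
        using (XmlReader reader = XmlReader.Create(new StringReader(xml), settings))
        {
            doc.Load(reader);
        }

        // If the entity was expanded, the root's text is the file's contents.
        Console.WriteLine(doc.DocumentElement.InnerText);
    }
}
```

The two settings shown are also the ones to lock down when the goal is to prevent this behavior: leaving DtdProcessing at Prohibit, or setting XmlResolver to null, stops the reader from fetching external entities.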

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)