Hi, how can I use the valid part of an XML document that sits inside broken XML?

Example:
<nodea>
<nodeb>DATA</nodeb>
<nodec> DATA </nodec>

In the example above, nodea is never closed, so this is broken XML, but nodeb and nodec are well-formed elements inside nodea. Can I retrieve the data inside them using libxml2 or not?
Updated 3-Apr-12 23:53pm

1 solution

I don't know the specifics of libxml2, but if it's an event-driven XML parser like expat then yes, it'll parse the data. If it's a DOM parser it might, but it would flag up an error while doing it.

Event-driven parsers will work because they only read the document one tag at a time and call you back as they encounter each one. So in your example you'll get a sequence of calls like:

- Hi, I'm opening nodea!
- Hi, I'm opening nodeb!
- Ooo, nodeb contains DATA
- Hi, I'm closing nodeb
- Hi, I'm opening nodec
- Ooo, nodec contains DATA
- Hi, I'm closing nodec
- err, I've finished processing and nodea hasn't been closed! Panic, world in flames!
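
The sequence above can be reproduced with a short sketch. It uses Python's standard-library expat bindings (xml.parsers.expat) rather than libxml2, purely for illustration; collect_events and BROKEN_XML are hypothetical names, and the sample assumes the missing ">" on the nodeb closing tag in the question was a typo.

```python
import xml.parsers.expat

# The question's document: nodea is opened but never closed.
BROKEN_XML = "<nodea>\n<nodeb>DATA</nodeb>\n<nodec> DATA </nodec>\n"


def collect_events(text):
    """Feed text to expat and record each callback until the parser gives up."""
    events = []
    parser = xml.parsers.expat.ParserCreate()
    parser.buffer_text = True  # deliver contiguous character data in one call

    def on_start(name, attrs):
        events.append(("start", name))

    def on_end(name):
        events.append(("end", name))

    def on_text(data):
        # Skip the whitespace-only text between elements.
        if data.strip():
            events.append(("text", data.strip()))

    parser.StartElementHandler = on_start
    parser.EndElementHandler = on_end
    parser.CharacterDataHandler = on_text

    try:
        parser.Parse(text, True)  # True: this is the final chunk of input
    except xml.parsers.expat.ExpatError as exc:
        # Callbacks for nodeb and nodec have already fired by this point.
        events.append(("error", str(exc)))
    return events


if __name__ == "__main__":
    for event in collect_events(BROKEN_XML):
        print(event)
```

The error only arrives once the input runs out, so anything the handlers stored before then is still usable. Whether that data should be trusted is a separate decision for the calling code.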

A DOM parser on the other hand could say something like:

- I'm opening nodea, let's find the end of it
- Oh no! It's not there! I give up.
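
As a rough illustration of the "I give up" case, here is the same input handed to a strict DOM builder. This sketch assumes Python's standard-library xml.dom.minidom, not libxml2; try_dom_parse is a hypothetical helper, and other DOM parsers may behave differently or offer recovery modes.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError

# The question's document: nodea is opened but never closed.
BROKEN_XML = "<nodea>\n<nodeb>DATA</nodeb>\n<nodec> DATA </nodec>\n"


def try_dom_parse(text):
    """Return (document, None) on success or (None, error message) on failure."""
    try:
        return minidom.parseString(text), None
    except ExpatError as exc:
        # minidom hands back no partial tree; all we get is the error.
        return None, str(exc)


if __name__ == "__main__":
    document, error = try_dom_parse(BROKEN_XML)
    print("document:", document)
    print("error:", error)
```

With this particular builder the caller gets no tree at all for the broken input, so the contents of nodeb and nodec are not reachable through the DOM.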

As I said, I have no idea whether libxml2 can act as an event-driven parser, but if it can't then try expat, provided you can use it in your project.

Cheers,

Ash

PS: It looks like libxml2 provides a SAX interface (SAX is an informal standard for event-driven parsing), so it should be able to cope with incomplete XML.
 

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


