What "difference" means depends heavily on how you define it; in particular, on the XML schema the files use and on how it maps onto the schema of the "difference" file.
First of all, you need to find the difference not between the files but between the logical structures of the XML. The same logical structure can be represented as text in many slightly different ways. This is one of the key ideas of XML: to abstract the logical structure away from the physical entities. In particular, you could parse both files into DOMs and compare the DOMs, not the files. But how do you express the result of the comparison?
Now, about mapping the differences onto some schema. I'll give you a simple example:
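To make the "logical structure, not text" point concrete, here is a minimal sketch in Python. It uses the standard library's `xml.etree.ElementTree` as a stand-in for a DOM; the helper `same_structure` and the two sample strings are made up for illustration, and the sketch assumes that whitespace-only text between elements is insignificant, which is a decision you would have to make for your own schema.

```python
import xml.etree.ElementTree as ET

def same_structure(x, y):
    """Recursively compare two elements by tag, attributes, text and children.

    Assumption: leading/trailing whitespace in text nodes is not significant.
    """
    if x.tag != y.tag or x.attrib != y.attrib:
        return False
    if (x.text or "").strip() != (y.text or "").strip():
        return False
    if (x.tail or "").strip() != (y.tail or "").strip():
        return False
    if len(x) != len(y):
        return False
    return all(same_structure(cx, cy) for cx, cy in zip(x, y))

# Two different texts: one uses a CDATA section on a single line,
# the other uses escaped characters and indentation.
text_1 = "<top><a><b><![CDATA[<p>Same thing</p>]]></b></a></top>"
text_2 = """<top>
  <a>
    <b>&lt;p&gt;Same thing&lt;/p&gt;</b>
  </a>
</top>"""

root_1 = ET.fromstring(text_1)
root_2 = ET.fromstring(text_2)

print(text_1 == text_2)                # the files differ as text
print(same_structure(root_1, root_2))  # the parsed structures match
```

This only answers "same or not"; it says nothing yet about how to write the differences down, which is the harder part.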
File #1:
<?xml version="1.0" encoding="UTF-8"?>
<top>
  <a>
    <b><![CDATA[<p>Same thing</p>]]></b>
    <b>1</b>
    <b>2</b>
  </a>
</top>
File #2:
<?xml version="1.0" encoding="UTF-8"?>
<top>
  <a>
    <b>&lt;p&gt;Same thing&lt;/p&gt;</b>
    <b>2</b>
    <b>3</b>
  </a>
</top>
Some data is in both files, some in the first one only, some in the second one only.
Let's think how it can be represented. For example, like this:
<?xml version="1.0" encoding="UTF-8"?>
<top>
  <a>
    <both_files>
      <b>&lt;p&gt;Same thing&lt;/p&gt;</b>
      <b>2</b>
    </both_files>
    <first_file_only>
      <b>1</b>
    </first_file_only>
    <second_file_only>
      <b>3</b>
    </second_file_only>
  </a>
</top>
First, note that the first b element is identical in both files, despite the different ways it is written.
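As a rough sketch of how such a "difference" file could be produced, the Python below sorts the b values into the three made-up groups. Everything here is an assumption for illustration: the function names `b_values` and `diff_document`, the choice to compare b elements by their text only, and the decision to ignore both ordering and duplicate counts. The sample strings assume the first b is a CDATA section in file #1 and escaped text in file #2.

```python
import xml.etree.ElementTree as ET

# Hypothetical inputs modelled on the two sample files.
FILE_1 = "<top><a><b><![CDATA[<p>Same thing</p>]]></b><b>1</b><b>2</b></a></top>"
FILE_2 = "<top><a><b>&lt;p&gt;Same thing&lt;/p&gt;</b><b>2</b><b>3</b></a></top>"

def b_values(xml_text):
    """Return the text of every top/a/b element, in document order."""
    root = ET.fromstring(xml_text)
    return [b.text or "" for b in root.findall("./a/b")]

def diff_document(xml_1, xml_2):
    """Build a 'difference' tree using the made-up grouping elements."""
    first, second = b_values(xml_1), b_values(xml_2)
    buckets = {
        "both_files": [v for v in first if v in second],
        "first_file_only": [v for v in first if v not in second],
        "second_file_only": [v for v in second if v not in first],
    }
    top = ET.Element("top")
    a = ET.SubElement(top, "a")
    for name, values in buckets.items():
        group = ET.SubElement(a, name)
        for value in values:
            ET.SubElement(group, "b").text = value
    return top

result = diff_document(FILE_1, FILE_2)
print(ET.tostring(result, encoding="unicode"))
```

Note how much of this is arbitrary: a different choice of what counts as "the same element" would give a different, equally valid result.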
Now, what could the schema for the "difference" file be? It is not automatically defined by the schema of the files under comparison. My sample uses an arbitrary syntax I made up, one of many possible. And I did not even touch on such a difficult problem as the ordering of elements. How do you express a difference in ordering? Well, it's feasible, too, but…
The problem does not have one general solution. Did I make it clear?
[EDIT]
Some links, in case you need to understand better what I mean when I question the "mapping":
http://en.wikipedia.org/wiki/Bijection,
http://en.wikipedia.org/wiki/Data_mapping.
[END EDIT]
So the question, as posed, does not have an exact meaning.
—SA