This article introduces a "Delta Calculation Component" (DCC
for simplicity) that provides delta calculation logic for XML documents using
XSLT. DCC takes an XML document as input. This document
contains two sections that present two different XML
structures. DCC runs the delta calculation logic to extract the changes between
these two sections. The result is presented in a single XML section
containing the consolidated changes (similarities are ignored) between the
input’s sections. The output includes flags that annotate the deleted, added, and
updated elements relative to the input file. The values of these flags are
configurable in the component’s configuration file.
The input XML document contains a first section, called the "AS IS"
structure, and a second section, called the "TO BE" structure. Each of the two
sections is wrapped separately in an element with the same name
(whichever element name the user has configured). A sample input file and
a more detailed description are given in the following sections.
The DCC implementation, written in XSLT, compares these two
sections according to the configuration.xml file, which holds the settings defined
by the user. Running the comparison in XSLT can perform well compared with
implementing the same logic in a general-purpose programming language, which
benefits the performance of the user’s application. The result is constructed in
one section wrapped in a single XML element. This section contains the
differences between the elements of the input sections. A flag (the action
attribute) is attached to each XML element according to its change from "AS IS"
to "TO BE". A change can be: an element deleted from "AS IS", an element added
to "TO BE", or an element changed between "AS IS" and "TO BE".
Matching "AS IS" elements to "TO BE" elements relies on the
primary attribute defined for each input element. These points
and the configuration are described in the following sections.
DCC Functionality Details
In Figure 1, the input consists of two "structure" elements
under the same "ABC" parent element. DCC considers these two "structure" elements
in its delta calculation. Under each "structure" element there
are book elements (these could be anything else, depending on the user’s input
file). DCC evaluates the elements under "structure". Each element, as described
later, has a primary key attribute. In the case of books, one attribute serves as
the primary key, so book elements are compared through the value of that
attribute. The primary key of each element is defined by the user. If the user
has not defined a primary key for an element, that element in "AS IS"
matches the same element in "TO BE" under the same hierarchy, even if their
other attributes differ.
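The matching rule described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the DCC XSLT itself: the `PRIMARY_KEYS` mapping, the `title` attribute, and the sample elements are all hypothetical, and the sketch assumes that key values are unique among siblings.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping: element name -> primary-key attribute (an assumption).
PRIMARY_KEYS = {"book": "title"}

def match_key(element):
    """Identity used to pair an AS IS element with a TO BE element."""
    key_attr = PRIMARY_KEYS.get(element.tag)
    # Without a configured key, elements match on their name alone.
    key_value = element.get(key_attr) if key_attr else None
    return (element.tag, key_value)

def pair_children(as_is_parent, to_be_parent):
    """Pair the children of two parents; an unmatched side is None."""
    to_be_by_key = {match_key(child): child for child in to_be_parent}
    pairs, seen = [], set()
    for child in as_is_parent:
        key = match_key(child)
        pairs.append((child, to_be_by_key.get(key)))
        seen.add(key)
    for child in to_be_parent:
        if match_key(child) not in seen:
            pairs.append((None, child))
    return pairs

# Hypothetical sample data: "note" has no primary key configured.
as_is = ET.fromstring(
    '<structure><book title="A"/><book title="B"/><note lang="en"/></structure>')
to_be = ET.fromstring(
    '<structure><book title="A"/><book title="C"/><note lang="fr"/></structure>')
pairs = pair_children(as_is, to_be)
```

Here the two "note" elements are paired even though their `lang` attributes differ, because no primary key is defined for them, while book "B" has no partner in "TO BE" and book "C" has none in "AS IS".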
Figure 1: Delta Calculation Sample Input
After matching the elements of the
two separate "structure" elements in this way, DCC determines which elements
were added or removed. Any parent that has an added or removed child is
considered updated.
Similar elements that have no updates are absent from
the generated output; DCC presents only the changes.
According to Figure 1, DCC reports the following in its output:
Language" book is removed with all its child elements.
Language" book is added with all its child elements.
Language" book is updated: "Java_New_Author" is added and "Java_Old_Author"
is removed ("Java_Author" is not mentioned at all, as there are no changes in it nor in
Figure 2: Delta Calculation
As shown in Figure 2, DCC
includes three types of artifacts:
blocks: the user-defined input file and the generated output file.
blocks: user-exposed artifacts, namely a configuration file and the DCC entry
point XSL file.
blocks: DCC internal artifacts that contain the delta calculation logic. These
files should not be modified by the user.
The input XML file is the user-defined input
file. As in Figure 1, the input file contains two "structure" elements. The
"structure" element, as described later, is the wrapper of the input "AS IS" and
"TO BE" sections. Underneath this element, the structure follows the user’s
specific problem. Elements outside that element are outside the scope of the
delta calculation logic. An example of such an outer element is the "DEF"
element in the sample input.
The Out_XXX XML file (where XXX is the
input file’s name) is the generated
output file. As presented in Figure 3, it contains a "DeltaCalculationOutputWrapper"
element. The changed, added, and removed elements appear under this element.
Elements that are the same in both inputs are not shown in the output. Any added
element is marked as added, and any removed element is marked as removed. In both
cases, all ancestor elements are marked as updated. As shown in Figure
3, a book that has at least one added or removed author is handled as updated.
Figure 3: Delta Calculation sample output XML file
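To make the output rules above concrete, here is a minimal Python sketch of the same idea. It is not the DCC XSLT, only an illustration under stated assumptions: the key attributes (`title`, `name`), the flag values, the `action` attribute name, the book titles, and the sample input are hypothetical, and treating a differing attribute or text value as "updated" is also an assumption.

```python
import copy
import xml.etree.ElementTree as ET

KEYS = {"book": "title", "author": "name"}   # hypothetical primary keys
FLAGS = {"added": "added", "removed": "removed", "updated": "updated"}  # assumed values

def identity(el):
    key = KEYS.get(el.tag)
    return (el.tag, el.get(key) if key else None)

def flagged_copy(el, flag):
    """Copy an element with all its children and flag the copy's root."""
    clone = copy.deepcopy(el)
    clone.set("action", flag)
    return clone

def diff(as_is, to_be):
    """Return a flagged element describing the changes, or None if there are none."""
    out = ET.Element(to_be.tag, dict(to_be.attrib))
    changed = (as_is.attrib != to_be.attrib
               or (as_is.text or "").strip() != (to_be.text or "").strip())
    remaining = {identity(child): child for child in to_be}
    for old in as_is:
        new = remaining.pop(identity(old), None)
        if new is None:
            out.append(flagged_copy(old, FLAGS["removed"]))
        else:
            sub = diff(old, new)
            if sub is not None:          # unchanged elements are left out
                out.append(sub)
    for new in remaining.values():       # present only in TO BE
        out.append(flagged_copy(new, FLAGS["added"]))
    if not changed and len(out) == 0:
        return None
    out.set("action", FLAGS["updated"])  # any change below marks the ancestor
    return out

def delta(root, wrapper="structure"):
    as_is, to_be = root.findall(wrapper)[:2]
    result = ET.Element("DeltaCalculationOutputWrapper")
    sub = diff(as_is, to_be)
    if sub is not None:
        result.extend(list(sub))
    return result

INPUT = """
<ABC>
  <structure>
    <book title="Old_Book"><author name="Old_Book_Author"/></book>
    <book title="Java_Book">
      <author name="Java_Author"/>
      <author name="Java_Old_Author"/>
    </book>
  </structure>
  <structure>
    <book title="Java_Book">
      <author name="Java_Author"/>
      <author name="Java_New_Author"/>
    </book>
    <book title="New_Book"><author name="New_Book_Author"/></book>
  </structure>
</ABC>
"""
result = delta(ET.fromstring(INPUT))
```

With this input, the wrapper holds three books flagged removed, updated, and added, and the unchanged "Java_Author" does not appear under the updated book.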
The DCC_EntryPoint XSL file is the entry point for DCC.
The user calls this XSL file after setting the path of the input file.
The call can be made from code (a Java implementation, for
example) or with a tool such as Altova XMLSpy. The user does not need to
change anything in this file, only to call it.
The DCC_Configuration XML file is a configuration file that
needs some attention from the user. Here the user defines the following:
- "DC-ComparedElement" is the entry that defines the
input wrapper element name ("structure" in our sample scenario).
- "Component" is where the user defines all the
components that need to be compared. In each "component" element, the user
states two pieces of information: the element name from the user’s problem
domain and the attribute of this element that is used to decide whether two
elements are identical or not. This attribute
acts like a primary key in a database, which differentiates between rows.
- "DC-UpdatedElement" is where the user
defines the flag that marks the updated elements in the output file.
- "DC-RemovedElement" is where the user
defines the flag that marks the removed elements in the output file.
- "DC-AddedElement" is where the user
defines the flag that marks the added elements in the output file.
Figure 4: DCC_Configuration XML sample file
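As an illustration of how these settings fit together, the Python sketch below parses a configuration document into a small object. Only the entry names ("DC-ComparedElement", "Component", "DC-UpdatedElement", "DC-RemovedElement", "DC-AddedElement") come from the description above. The root element name, the `name` and `key` attributes, the flag values, and the overall layout of `SAMPLE_CONFIG` are assumptions made for this example and may differ from the real DCC_Configuration file.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

# Hypothetical layout: only the entry names are taken from the article.
SAMPLE_CONFIG = """
<DCC_Configuration>
  <DC-ComparedElement>structure</DC-ComparedElement>
  <Component name="book" key="title"/>
  <Component name="author" key="name"/>
  <DC-UpdatedElement>updated</DC-UpdatedElement>
  <DC-RemovedElement>removed</DC-RemovedElement>
  <DC-AddedElement>added</DC-AddedElement>
</DCC_Configuration>
"""

@dataclass
class DccConfig:
    compared_element: str   # name of the wrapper around AS IS / TO BE
    keys: dict              # element name -> primary-key attribute
    updated_flag: str
    removed_flag: str
    added_flag: str

def load_config(xml_text):
    """Read the assumed configuration layout into a DccConfig."""
    root = ET.fromstring(xml_text)
    return DccConfig(
        compared_element=root.findtext("DC-ComparedElement"),
        keys={c.get("name"): c.get("key") for c in root.findall("Component")},
        updated_flag=root.findtext("DC-UpdatedElement"),
        removed_flag=root.findtext("DC-RemovedElement"),
        added_flag=root.findtext("DC-AddedElement"),
    )

config = load_config(SAMPLE_CONFIG)
```

Grouping the wrapper name, the primary keys, and the three flags in one object mirrors how the configuration file separates user settings from the internal calculation logic.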
XSL file is an
internal XSL file that contains the logic needed for the calculation. This file
should not be modified by the user.
The DCC_Utility XSL file is an internal XSL file that
contains the utility logic needed for the calculation. This file should not be
modified by the user.
Getting started: using DCC without a user interface
To use the component internally in your
application, follow these steps:
- Configure the "DCC_Configuration.xml" file according to your input schema, as
described earlier in this article.
- Place all the DCC files (the XSLs and the configuration XML) in the same directory.
- Have your implementation call the "DCC_EntryPoint.xsl" file directly, passing it
the input file (without any specific template call).
- Collect the output of the call; it is the output of the delta calculation logic.
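One possible way to script these steps is to run the entry point through the Saxon command line. The Python sketch below only builds the command; it assumes a Saxon jar is available (the file name `saxon-he.jar` is a placeholder), that the DCC files sit in `dcc_dir`, and that the result should be written next to the input as `Out_<input name>`. Whether DCC writes that file itself or relies on the caller to direct the output is not stated above, so the `-o:` argument is an assumption.

```python
from pathlib import Path

def build_saxon_command(input_xml, dcc_dir, saxon_jar="saxon-he.jar"):
    """Build a Saxon command line that applies DCC_EntryPoint.xsl to the input."""
    input_path = Path(input_xml)
    stylesheet = Path(dcc_dir) / "DCC_EntryPoint.xsl"
    # Assumed naming: Out_XXX, where XXX is the input file's name.
    output_path = input_path.with_name("Out_" + input_path.name)
    return [
        "java", "-cp", str(saxon_jar), "net.sf.saxon.Transform",
        f"-s:{input_path}",      # source document
        f"-xsl:{stylesheet}",    # DCC entry point stylesheet
        f"-o:{output_path}",     # where the result is written
    ]

command = build_saxon_command("data/input.xml", "dcc")
# To run it: import subprocess; subprocess.run(command, check=True)
```

Depending on the Saxon version, additional jars may be needed on the classpath, so treat the command as a starting point to adapt.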
Getting started: using DCC with a user interface
Figure 5: Delta Calculation GUI
Figure 5 shows the first form that appears when you run
the delta calculation component jar. The form has three fields. The
first is the XSLT file path, which should contain the correct path of the DCC_EntryPoint.xslt
file; the second should contain the correct path of the input file; and
the third contains the path of the folder where you want the output to be written.
Figure 6: Delta calculation filled form with correct paths
Figure 6 shows the form with its three fields
filled with the correct paths.
Figure 7: Delta calculation completed successfully
After you fill in the correct paths and click the
start button, the delta calculation runs and a message box
appears with a success message. The generated file is then ready for checking.
Figure 8: Delta calculation failed scenario
If an incorrect input is used, a message box appears
with a failure message describing what is wrong (as in Figure 8).
Figure 9: Delta Calculation with missing field
As in Figure 9, if the
start button is clicked while one of the paths has not been filled in, a
message box appears asking you to complete it.
The DCC component is a good
fit for middleware applications that need this feature implemented
in XSLT. Delta calculation logic is easy to implement in
any object-oriented or structured programming language; the novel idea here
is implementing such complex logic in XSLT, which can offer better
performance than a general-purpose programming language (Java, .NET, ...).
Although DCC fits
middleware applications well, it can also be used in desktop, web, and enterprise
applications (provided that the application has the library needed
to execute XSLT, such as the Saxon jars).
The GUI part of DCC (the jar
file) can be used by any non-technical user to get the same logic
through the GUI, with the support of the configuration file.