Introduction
This is a beginner's tutorial on transforming XML with XSLT. It assumes some knowledge of XML, XSLT, and XPath, so work through
an introductory tutorial on each first if necessary.
How you structure your data in XML is largely a matter of choice. You
can represent the same data in several different ways. Below is XML that
represents the same data four different ways. The XML represents a Census
record. A Census record has a country, a year, a small page size, and a large
page size. This information can be represented with elements only, or with a
mix of elements and attributes.
- TYPE1 uses attributes and makes a very small footprint.
- TYPE2 uses only elements and because of formatting for readability it uses
a little more screen real-estate.
- TYPE3 and TYPE4 use a combination of elements and attributes.
<?xml version="1.0" encoding="utf-8"?>
<STUFF>
<TYPE1>
<CENSUS COUNTRY="USA" YEAR="1930">
<PAGE SIZE="SMALL">17x11</PAGE>
<PAGE SIZE="LARGE">27x19</PAGE>
</CENSUS>
<CENSUS COUNTRY="USA" YEAR="1880">
<PAGE SIZE="SMALL">17x11</PAGE>
<PAGE SIZE="LARGE">19x25</PAGE>
</CENSUS>
<CENSUS COUNTRY="UK" YEAR="1871">
<PAGE SIZE="SMALL">9.5x15</PAGE>
<PAGE SIZE="LARGE">9.5x15</PAGE>
</CENSUS>
<CENSUS COUNTRY="UK" YEAR="1891">
<PAGE SIZE="SMALL">11x16</PAGE>
<PAGE SIZE="LARGE">11x16</PAGE>
</CENSUS>
</TYPE1>
<TYPE2>
<CENSUS>
<COUNTRY>USA</COUNTRY>
<YEAR>1930</YEAR>
<PAGE>
<SIZE>
<SMALL>17x11</SMALL>
<LARGE>27x19</LARGE>
</SIZE>
</PAGE>
</CENSUS>
<CENSUS>
<COUNTRY>USA</COUNTRY>
<YEAR>1880</YEAR>
<PAGE>
<SIZE>
<SMALL>17x11</SMALL>
<LARGE>19x25</LARGE>
</SIZE>
</PAGE>
</CENSUS>
<CENSUS>
<COUNTRY>UK</COUNTRY>
<YEAR>1871</YEAR>
<PAGE>
<SIZE>
<SMALL>9.5x15</SMALL>
<LARGE>9.5x15</LARGE>
</SIZE>
</PAGE>
</CENSUS>
<CENSUS>
<COUNTRY>UK</COUNTRY>
<YEAR>1891</YEAR>
<PAGE>
<SIZE>
<SMALL>11x16</SMALL>
<LARGE>11x16</LARGE>
</SIZE>
</PAGE>
</CENSUS>
</TYPE2>
<TYPE3>
<CENSUS>
<USA YEAR="1930">
<PAGE SIZE="SMALL">17x11</PAGE>
<PAGE SIZE="LARGE">27x19</PAGE>
</USA>
<USA YEAR="1880">
<PAGE SIZE="SMALL">17x11</PAGE>
<PAGE SIZE="LARGE">19x25</PAGE>
</USA>
<UK YEAR="1871">
<PAGE SIZE="SMALL">9.5x15</PAGE>
<PAGE SIZE="LARGE">9.5x15</PAGE>
</UK>
<UK YEAR="1891">
<PAGE SIZE="SMALL">11x16</PAGE>
<PAGE SIZE="LARGE">11x16</PAGE>
</UK>
</CENSUS>
</TYPE3>
<TYPE4>
<CENSUS>
<COUNTRY>
USA
<YEAR>
1930
<PAGE>
<SIZE TYPE="SMALL">17x11</SIZE>
<SIZE TYPE="LARGE">27x19</SIZE>
</PAGE>
</YEAR>
<YEAR>
1880
<PAGE>
<SIZE TYPE="SMALL">17x11</SIZE>
<SIZE TYPE="LARGE">19x25</SIZE>
</PAGE>
</YEAR>
</COUNTRY>
<COUNTRY>
UK
<YEAR>
1871
<PAGE>
<SIZE TYPE="SMALL">9.5x15</SIZE>
<SIZE TYPE="LARGE">9.5x15</SIZE>
</PAGE>
</YEAR>
<YEAR>
1891
<PAGE>
<SIZE TYPE="SMALL">11x16</SIZE>
<SIZE TYPE="LARGE">11x16</SIZE>
</PAGE>
</YEAR>
</COUNTRY>
</CENSUS>
</TYPE4>
</STUFF>
Points of Interest
Often an XML document needs to be converted to a new structure. That is
where XSLT comes in. There are lots of good tutorials on XSLT, but I found
few examples that covered more than a few aspects of the language. So I wrote
a stylesheet for each type in my XML document that converts all of the other
types into that type. Some things I had to overcome were converting attributes
into elements, converting element values into attributes, selecting nodes back
up the tree (the parent in my case), stripping off white space, and adding
white space.
The following XSLT converts TYPE2, TYPE3, and TYPE4 into TYPE1.
Notice the following:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
<TYPE1>
<xsl:for-each select="STUFF/TYPE2/CENSUS">
<CENSUS>
<xsl:attribute name="COUNTRY">
<xsl:value-of select="COUNTRY"/>
</xsl:attribute>
<xsl:attribute name="YEAR">
<xsl:value-of select="YEAR"/>
</xsl:attribute>
<PAGE>
<xsl:attribute name="SIZE">
<xsl:text>SMALL</xsl:text>
</xsl:attribute>
<xsl:value-of select="PAGE/SIZE/SMALL"/>
</PAGE>
<PAGE>
<xsl:attribute name="SIZE">
<xsl:text>LARGE</xsl:text>
</xsl:attribute>
<xsl:value-of select="PAGE/SIZE/LARGE"/>
</PAGE>
</CENSUS>
</xsl:for-each>
<xsl:comment> ----------------- </xsl:comment>
<xsl:for-each select="STUFF/TYPE3/CENSUS">
<xsl:for-each select="child::*">
<CENSUS>
<xsl:attribute name="COUNTRY">
<xsl:value-of select="name()"/>
</xsl:attribute>
<xsl:copy-of select="@*"/>
<xsl:copy-of select="PAGE"/>
</CENSUS>
</xsl:for-each>
</xsl:for-each>
<xsl:comment> ----------------- </xsl:comment>
<xsl:for-each select="STUFF/TYPE4/CENSUS">
<xsl:for-each select="COUNTRY">
<xsl:for-each select="YEAR">
<CENSUS>
<xsl:attribute name="COUNTRY">
<xsl:value-of select="normalize-space(../text())"/>
</xsl:attribute>
<xsl:attribute name="YEAR">
<xsl:value-of select="normalize-space(text())"/>
</xsl:attribute>
<xsl:copy-of select="PAGE"/>
</CENSUS>
</xsl:for-each>
</xsl:for-each>
</xsl:for-each>
</TYPE1>
</xsl:template>
</xsl:stylesheet>
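The xsl:attribute instructions above can also be written with attribute value templates, where an XPath expression in curly braces inside a literal attribute is evaluated at run time. The following is a minimal sketch of the TYPE2 to TYPE1 conversion only. It is an alternative spelling under the assumption that the input has the data.xml layout shown earlier, not a replacement for the stylesheet above.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <TYPE1>
      <xsl:for-each select="STUFF/TYPE2/CENSUS">
        <!-- {COUNTRY} and {YEAR} are evaluated against the current CENSUS. -->
        <CENSUS COUNTRY="{COUNTRY}" YEAR="{YEAR}">
          <!-- SMALL and LARGE are fixed strings, so plain attributes suffice. -->
          <PAGE SIZE="SMALL"><xsl:value-of select="PAGE/SIZE/SMALL"/></PAGE>
          <PAGE SIZE="LARGE"><xsl:value-of select="PAGE/SIZE/LARGE"/></PAGE>
        </CENSUS>
      </xsl:for-each>
    </TYPE1>
  </xsl:template>
</xsl:stylesheet>
```

xsl:attribute is still the tool to reach for when the attribute name itself, and not just its value, has to be computed.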
The following XSLT converts TYPE1, TYPE3, and TYPE4 into TYPE2.
Notice the following:
- Selecting the value of a node by the value of one of its attributes:
<xsl:value-of
select="SOMENODE[@SOMEATTRIBUTE='SOMEVALUE']"/>
- Converting a node's name into a node's value:
<MYNODE><xsl:value-of select="name()"/></MYNODE>
- Selecting a node's value that has white space around it:
<xsl:value-of select="normalize-space(text())"/>
OR selecting the parent node's value that has white space around
it:
<xsl:value-of select="normalize-space(../text())"/>
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
<TYPE2>
<xsl:for-each select="STUFF/TYPE1/CENSUS">
<CENSUS>
<COUNTRY><xsl:value-of select="@COUNTRY"/></COUNTRY>
<YEAR><xsl:value-of select="@YEAR"/></YEAR>
<PAGE>
<SIZE>
<SMALL>
<xsl:value-of select="PAGE[@SIZE='SMALL']"/>
</SMALL>
<LARGE>
<xsl:value-of select="PAGE[@SIZE='LARGE']"/>
</LARGE>
</SIZE>
</PAGE>
</CENSUS>
</xsl:for-each>
<xsl:comment> ----------------- </xsl:comment>
<xsl:for-each select="STUFF/TYPE3/CENSUS">
<xsl:for-each select="child::*">
<CENSUS>
<COUNTRY><xsl:value-of select="name()"/></COUNTRY>
<YEAR><xsl:value-of select="@YEAR"/></YEAR>
<PAGE>
<SIZE>
<SMALL>
<xsl:value-of select="PAGE[@SIZE='SMALL']"/>
</SMALL>
<LARGE>
<xsl:value-of select="PAGE[@SIZE='LARGE']"/>
</LARGE>
</SIZE>
</PAGE>
</CENSUS>
</xsl:for-each>
</xsl:for-each>
<xsl:comment> ----------------- </xsl:comment>
<xsl:for-each select="STUFF/TYPE4/CENSUS">
<xsl:for-each select="child::COUNTRY">
<xsl:for-each select="YEAR">
<CENSUS>
<COUNTRY><xsl:value-of select="normalize-space(../text())"/>
</COUNTRY>
<YEAR>
<xsl:value-of select="normalize-space(text())"/>
</YEAR>
<PAGE>
<SIZE>
<SMALL>
<xsl:value-of select="PAGE/SIZE[@TYPE='SMALL']"/>
</SMALL>
<LARGE>
<xsl:value-of select="PAGE/SIZE[@TYPE='LARGE']"/>
</LARGE>
</SIZE>
</PAGE>
</CENSUS>
</xsl:for-each>
</xsl:for-each>
</xsl:for-each>
</TYPE2>
</xsl:template>
</xsl:stylesheet>
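The stylesheets in this article walk the tree with nested xsl:for-each loops. The same TYPE1 to TYPE2 conversion can be sketched with one template per node type and xsl:apply-templates, which some readers find easier to extend. This is only an illustration of that style, assuming the data.xml layout shown earlier. It handles TYPE1 input alone.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Start at the root and hand only the TYPE1 records to the templates. -->
  <xsl:template match="/">
    <TYPE2>
      <xsl:apply-templates select="STUFF/TYPE1/CENSUS"/>
    </TYPE2>
  </xsl:template>

  <!-- One TYPE1 CENSUS becomes one TYPE2 CENSUS. -->
  <xsl:template match="TYPE1/CENSUS">
    <CENSUS>
      <COUNTRY><xsl:value-of select="@COUNTRY"/></COUNTRY>
      <YEAR><xsl:value-of select="@YEAR"/></YEAR>
      <PAGE>
        <SIZE>
          <xsl:apply-templates select="PAGE"/>
        </SIZE>
      </PAGE>
    </CENSUS>
  </xsl:template>

  <!-- The SIZE attribute value (SMALL or LARGE) becomes the element name. -->
  <xsl:template match="TYPE1/CENSUS/PAGE">
    <xsl:element name="{@SIZE}">
      <xsl:value-of select="."/>
    </xsl:element>
  </xsl:template>
</xsl:stylesheet>
```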
The following XSLT converts TYPE1, TYPE2, and TYPE4 into TYPE3.
Notice the following:
- Create an element name from an attribute OR
convert an attribute into an element or node:
<xsl:element name="{@ATTRIBUTENAME}">
- To copy an element without change is easy with:
<xsl:copy-of select="SOMENODE"/>
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
<TYPE3>
<CENSUS>
<xsl:for-each select="STUFF/TYPE1/CENSUS">
<xsl:element name="{@COUNTRY}">
<xsl:attribute name="YEAR">
<xsl:value-of select="@YEAR"/>
</xsl:attribute>
<xsl:copy-of select="PAGE"/>
</xsl:element>
</xsl:for-each>
</CENSUS>
<xsl:comment> --------------- </xsl:comment>
<CENSUS>
<xsl:for-each select="STUFF/TYPE2/CENSUS">
<xsl:element name="{COUNTRY}">
<xsl:attribute name="YEAR">
<xsl:value-of select="YEAR"/>
</xsl:attribute>
<PAGE>
<xsl:attribute name="SIZE">
<xsl:text>SMALL</xsl:text>
</xsl:attribute>
<xsl:value-of select="PAGE/SIZE/SMALL"/>
</PAGE>
<PAGE>
<xsl:attribute name="SIZE">
<xsl:text>LARGE</xsl:text>
</xsl:attribute>
<xsl:value-of select="PAGE/SIZE/LARGE"/>
</PAGE>
</xsl:element>
</xsl:for-each>
</CENSUS>
<xsl:comment> --------------- </xsl:comment>
<CENSUS>
<xsl:for-each select="STUFF/TYPE4/CENSUS/COUNTRY/YEAR">
<xsl:element name="{normalize-space(../text())}">
<xsl:attribute name="YEAR">
<xsl:value-of select="normalize-space(./text())"/>
</xsl:attribute>
<PAGE>
<xsl:attribute name="SIZE">
<xsl:text>SMALL</xsl:text>
</xsl:attribute>
<xsl:value-of select="PAGE/SIZE[@TYPE='SMALL']"/>
</PAGE>
<PAGE>
<xsl:attribute name="SIZE">
<xsl:text>LARGE</xsl:text>
</xsl:attribute>
<xsl:value-of select="PAGE/SIZE[@TYPE='LARGE']"/>
</PAGE>
</xsl:element>
</xsl:for-each>
</CENSUS>
</TYPE3>
</xsl:template>
</xsl:stylesheet>
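Building an element name from data only works when the value is a legal XML name. USA and UK are, but a value containing a space is not. The sketch below shows one possible guard for the TYPE1 to TYPE3 case: it trims the value and replaces spaces with underscores. The country "New Zealand" mentioned in the comment is a hypothetical value that does not occur in the sample data, and other illegal characters (a leading digit, for example) are not handled.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <TYPE3>
      <CENSUS>
        <xsl:for-each select="STUFF/TYPE1/CENSUS">
          <!-- Trim the value, then turn any remaining spaces into underscores,
               so a hypothetical "New Zealand" would yield New_Zealand. -->
          <xsl:element name="{translate(normalize-space(@COUNTRY), ' ', '_')}">
            <!-- Attributes must be added before any child elements. -->
            <xsl:copy-of select="@YEAR"/>
            <xsl:copy-of select="PAGE"/>
          </xsl:element>
        </xsl:for-each>
      </CENSUS>
    </TYPE3>
  </xsl:template>
</xsl:stylesheet>
```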
The following XSLT converts TYPE1, TYPE2, and TYPE3 into TYPE4.
TYPE4 has formatting to make it human readable. Adding formatting to XML
is not as easy as I had hoped. It seems advisable to always add formatting by
placing it in an xsl:text node.
Notice the following:
- Adding a carriage return:
<xsl:text>&#13;</xsl:text>
- Adding a line feed:
<xsl:text>&#10;</xsl:text>
- Adding a tab:
<xsl:text>&#9;</xsl:text>
I only perform the formatting for the conversion of TYPE1 into TYPE4. I
found that the formatting statements cluttered the conversion statements.
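One way to reduce that clutter is to move the white space into a named template and call it with an indentation level. The sketch below is an assumption about how that could look, not the approach used in the stylesheet in this section. The template name newline, its level parameter, and the tabs variable are all made up for the illustration, and only the COUNTRY text is formatted.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- A pool of tab characters; four is enough for this sketch. -->
  <xsl:variable name="tabs">
    <xsl:text>&#9;&#9;&#9;&#9;</xsl:text>
  </xsl:variable>

  <!-- Hypothetical helper: emits a line feed followed by $level tabs. -->
  <xsl:template name="newline">
    <xsl:param name="level" select="0"/>
    <xsl:text>&#10;</xsl:text>
    <xsl:value-of select="substring($tabs, 1, $level)"/>
  </xsl:template>

  <xsl:template match="/">
    <CENSUS>
      <xsl:for-each select="STUFF/TYPE1/CENSUS">
        <COUNTRY>
          <!-- Line feed plus one tab before the country name. -->
          <xsl:call-template name="newline">
            <xsl:with-param name="level" select="1"/>
          </xsl:call-template>
          <xsl:value-of select="@COUNTRY"/>
          <!-- Line feed only (level defaults to 0) before the end tag. -->
          <xsl:call-template name="newline"/>
        </COUNTRY>
      </xsl:for-each>
    </CENSUS>
  </xsl:template>
</xsl:stylesheet>
```

The full stylesheet, with the formatting statements written inline, follows.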
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
<CENSUS>
<xsl:for-each select="STUFF/TYPE1/CENSUS">
<COUNTRY>
<xsl:text>&#10;</xsl:text> <!-- line feed -->
<xsl:text>&#9;</xsl:text> <!-- tab -->
<xsl:value-of select="@COUNTRY"/>
<xsl:text>&#10;</xsl:text> <!-- line feed -->
<xsl:text>&#9;</xsl:text> <!-- tab -->
<YEAR>
<xsl:text>&#10;</xsl:text> <!-- line feed -->
<xsl:text>&#9;</xsl:text> <!-- tab -->
<xsl:text>&#9;</xsl:text> <!-- tab -->
<xsl:value-of select="@YEAR"/>
<xsl:text>&#10;</xsl:text> <!-- line feed -->
<xsl:text>&#9;</xsl:text> <!-- tab -->
<xsl:text>&#9;</xsl:text> <!-- tab -->
<PAGE>
<xsl:text>&#10;</xsl:text> <!-- line feed -->
<xsl:text>&#9;</xsl:text> <!-- tab -->
<xsl:text>&#9;</xsl:text> <!-- tab -->
<xsl:text>&#9;</xsl:text> <!-- tab -->
<SIZE>
<xsl:attribute name="TYPE">
<xsl:text>SMALL</xsl:text>
</xsl:attribute>
<xsl:value-of select="PAGE[@SIZE='SMALL']"/>
</SIZE>
<xsl:text>&#10;</xsl:text> <!-- line feed -->
<xsl:text>&#9;</xsl:text> <!-- tab -->
<xsl:text>&#9;</xsl:text> <!-- tab -->
<xsl:text>&#9;</xsl:text> <!-- tab -->
<SIZE>
<xsl:attribute name="TYPE">
<xsl:text>LARGE</xsl:text>
</xsl:attribute>
<xsl:value-of select="PAGE[@SIZE='LARGE']"/>
</SIZE>
<xsl:text>&#10;</xsl:text> <!-- line feed -->
<xsl:text>&#9;</xsl:text> <!-- tab -->
<xsl:text>&#9;</xsl:text> <!-- tab -->
</PAGE>
<xsl:text>&#10;</xsl:text> <!-- line feed -->
<xsl:text>&#9;</xsl:text> <!-- tab -->
</YEAR>
<xsl:text>&#10;</xsl:text> <!-- line feed -->
</COUNTRY>
</xsl:for-each>
</CENSUS>
<xsl:comment> --------------- </xsl:comment>
<CENSUS>
<xsl:for-each select="STUFF/TYPE2/CENSUS">
<COUNTRY>
<xsl:value-of select="COUNTRY"/>
<YEAR>
<xsl:value-of select="YEAR"/>
<PAGE>
<SIZE>
<xsl:attribute name="TYPE">
<xsl:text>SMALL</xsl:text>
</xsl:attribute>
<xsl:value-of select="PAGE/SIZE/SMALL"/>
</SIZE>
<SIZE>
<xsl:attribute name="TYPE">
<xsl:text>LARGE</xsl:text>
</xsl:attribute>
<xsl:value-of select="PAGE/SIZE/LARGE"/>
</SIZE>
</PAGE>
</YEAR>
</COUNTRY>
</xsl:for-each>
</CENSUS>
<xsl:comment> --------------- </xsl:comment>
<CENSUS>
<xsl:for-each select="STUFF/TYPE3/CENSUS/*">
<COUNTRY>
<xsl:value-of select="name()"/>
<YEAR>
<xsl:value-of select="@YEAR"/>
<PAGE>
<SIZE>
<xsl:attribute name="TYPE">
<xsl:text>SMALL</xsl:text>
</xsl:attribute>
<xsl:value-of select="PAGE[@SIZE='SMALL']"/>
</SIZE>
<SIZE>
<xsl:attribute name="TYPE">
<xsl:text>LARGE</xsl:text>
</xsl:attribute>
<xsl:value-of select="PAGE[@SIZE='LARGE']"/>
</SIZE>
</PAGE>
</YEAR>
</COUNTRY>
</xsl:for-each>
</CENSUS>
</xsl:template>
</xsl:stylesheet>
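In the sample data, TYPE4 lists each country once and nests all of its years inside it, whereas the stylesheet above emits one COUNTRY element per census record. If the grouped form is wanted, XSLT 1.0 has no built-in grouping instruction, but a common technique (often called Muenchian grouping) uses xsl:key and generate-id(). The following is a sketch of that technique for TYPE1 input only. The key name census-by-country and the TYPE4 wrapper element are assumptions made for the example, and no white space formatting is added.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Index every TYPE1 CENSUS element by its COUNTRY attribute. -->
  <xsl:key name="census-by-country" match="TYPE1/CENSUS" use="@COUNTRY"/>

  <xsl:template match="/">
    <TYPE4>
      <CENSUS>
        <!-- Keep only the first CENSUS found for each distinct country. -->
        <xsl:for-each select="STUFF/TYPE1/CENSUS[generate-id() =
            generate-id(key('census-by-country', @COUNTRY)[1])]">
          <COUNTRY>
            <xsl:value-of select="@COUNTRY"/>
            <!-- Emit one YEAR for every record that shares this country. -->
            <xsl:for-each select="key('census-by-country', @COUNTRY)">
              <YEAR>
                <xsl:value-of select="@YEAR"/>
                <PAGE>
                  <xsl:for-each select="PAGE">
                    <!-- PAGE/@SIZE becomes SIZE/@TYPE; the text is kept. -->
                    <SIZE TYPE="{@SIZE}"><xsl:value-of select="."/></SIZE>
                  </xsl:for-each>
                </PAGE>
              </YEAR>
            </xsl:for-each>
          </COUNTRY>
        </xsl:for-each>
      </CENSUS>
    </TYPE4>
  </xsl:template>
</xsl:stylesheet>
```

With the sample data this yields two COUNTRY elements, USA and UK, each holding two YEAR elements.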
The C# code necessary to perform the XSL transformations is extremely
simple with .NET. Save my XML data with the four types into a file called
data.xml. Save the XSLT that you want to run as style.xslt.
using System;
using System.IO;
using System.Xml;
using System.Xml.XPath;
using System.Xml.Xsl;

class Program
{
    static void Main(string[] args)
    {
        string fileName = "data.xml";
        // Open the XML data once and hand the reader to the transform.
        using (FileStream fs = new FileStream(fileName, FileMode.Open, FileAccess.Read))
        {
            XmlTextReader reader = new XmlTextReader(fs);
            TestXSLT(reader, "style.xslt");
        }
    }

    static void TestXSLT(XmlTextReader reader, string fileName)
    {
        Console.WriteLine(fileName);
        // XslTransform is the .NET 1.x class; later framework versions mark
        // it obsolete in favor of XslCompiledTransform.
        XslTransform xslt = new XslTransform();
        xslt.Load(fileName);
        XPathDocument xdoc = new XPathDocument(reader);
        // Write the indented result to the console.
        XmlTextWriter writer = new XmlTextWriter(Console.Out);
        writer.Formatting = Formatting.Indented;
        xslt.Transform(xdoc, null, writer, null);
        writer.Flush();
    }
}
Conclusion
I hope this has helped you understand XSLT a little better. Study each of
the types of Census data in my XML at the top of the document. Look at how
they differ. Think of the issues of converting one type to another. Then look
at the XSLT that does the conversion.
Master Degree in C.S. .NET, Unix, Macintosh (OS X, 9, 8...), PC server side, and MFC. 17 years experience. Graphics, Distributed processing, Object Oriented Methods and Models.
Java, C#, C++. Webservices. XML. Real name is Geoffrey Slinker.