Introduction
This is a beginner's tutorial on using XSLT to restructure XML. It assumes some knowledge of XML, XSLT, and XPath, so read an introductory tutorial on each first if necessary.
The structure you choose for your data in XML is largely arbitrary: the same data can be represented in several different ways. Below is XML that represents the same data in four different ways. The XML represents Census records. Each Census record has a country, a year, a small page size, and a large page size. This information can be represented by elements alone or by a combination of elements and attributes.
TYPE1 relies on attributes and has a very small footprint. TYPE2 uses only elements and, because it is formatted for readability, takes a little more screen real estate. TYPE3 and TYPE4 use a combination of elements and attributes.
<?xml version="1.0" encoding="utf-8"?>
<STUFF>
<TYPE1>
<CENSUS COUNTRY="USA" YEAR="1930">
<PAGE SIZE="SMALL">17x11</PAGE>
<PAGE SIZE="LARGE">27x19</PAGE>
</CENSUS>
<CENSUS COUNTRY="USA" YEAR="1880">
<PAGE SIZE="SMALL">17x11</PAGE>
<PAGE SIZE="LARGE">19x25</PAGE>
</CENSUS>
<CENSUS COUNTRY="UK" YEAR="1871">
<PAGE SIZE="SMALL">9.5x15</PAGE>
<PAGE SIZE="LARGE">9.5x15</PAGE>
</CENSUS>
<CENSUS COUNTRY="UK" YEAR="1891">
<PAGE SIZE="SMALL">11x16</PAGE>
<PAGE SIZE="LARGE">11x16</PAGE>
</CENSUS>
</TYPE1>
<TYPE2>
<CENSUS>
<COUNTRY>USA</COUNTRY>
<YEAR>1930</YEAR>
<PAGE>
<SIZE>
<SMALL>17x11</SMALL>
<LARGE>27x19</LARGE>
</SIZE>
</PAGE>
</CENSUS>
<CENSUS>
<COUNTRY>USA</COUNTRY>
<YEAR>1880</YEAR>
<PAGE>
<SIZE>
<SMALL>17x11</SMALL>
<LARGE>19x25</LARGE>
</SIZE>
</PAGE>
</CENSUS>
<CENSUS>
<COUNTRY>UK</COUNTRY>
<YEAR>1871</YEAR>
<PAGE>
<SIZE>
<SMALL>9.5x15</SMALL>
<LARGE>9.5x15</LARGE>
</SIZE>
</PAGE>
</CENSUS>
<CENSUS>
<COUNTRY>UK</COUNTRY>
<YEAR>1891</YEAR>
<PAGE>
<SIZE>
<SMALL>11x16</SMALL>
<LARGE>11x16</LARGE>
</SIZE>
</PAGE>
</CENSUS>
</TYPE2>
<TYPE3>
<CENSUS>
<USA YEAR="1930">
<PAGE SIZE="SMALL">17x11</PAGE>
<PAGE SIZE="LARGE">27x19</PAGE>
</USA>
<USA YEAR="1880">
<PAGE SIZE="SMALL">17x11</PAGE>
<PAGE SIZE="LARGE">19x25</PAGE>
</USA>
<UK YEAR="1871">
<PAGE SIZE="SMALL">9.5x15</PAGE>
<PAGE SIZE="LARGE">9.5x15</PAGE>
</UK>
<UK YEAR="1891">
<PAGE SIZE="SMALL">11x16</PAGE>
<PAGE SIZE="LARGE">11x16</PAGE>
</UK>
</CENSUS>
</TYPE3>
<TYPE4>
<CENSUS>
<COUNTRY>
USA
<YEAR>
1930
<PAGE>
<SIZE TYPE="SMALL">17x11</SIZE>
<SIZE TYPE="LARGE">27x19</SIZE>
</PAGE>
</YEAR>
<YEAR>
1880
<PAGE>
<SIZE TYPE="SMALL">17x11</SIZE>
<SIZE TYPE="LARGE">19x25</SIZE>
</PAGE>
</YEAR>
</COUNTRY>
<COUNTRY>
UK
<YEAR>
1871
<PAGE>
<SIZE TYPE="SMALL">9.5x15</SIZE>
<SIZE TYPE="LARGE">9.5x15</SIZE>
</PAGE>
</YEAR>
<YEAR>
1891
<PAGE>
<SIZE TYPE="SMALL">11x16</SIZE>
<SIZE TYPE="LARGE">11x16</SIZE>
</PAGE>
</YEAR>
</COUNTRY>
</CENSUS>
</TYPE4>
</STUFF>
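To make the difference between the four layouts concrete, the sketch below reads the same fact (the small page size of the 1930 USA census) out of each type. It is only an illustration: the text output method and the choice of record are assumptions, not part of the conversions that follow.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Plain text output so only the selected values are printed -->
  <xsl:output method="text"/>
  <xsl:template match="/">
    <!-- TYPE1: country, year, and size name are attributes -->
    <xsl:value-of select="STUFF/TYPE1/CENSUS[@COUNTRY='USA'][@YEAR='1930']/PAGE[@SIZE='SMALL']"/>
    <xsl:text>&#10;</xsl:text>
    <!-- TYPE2: everything is an element -->
    <xsl:value-of select="STUFF/TYPE2/CENSUS[COUNTRY='USA'][YEAR='1930']/PAGE/SIZE/SMALL"/>
    <xsl:text>&#10;</xsl:text>
    <!-- TYPE3: the country is the element name -->
    <xsl:value-of select="STUFF/TYPE3/CENSUS/USA[@YEAR='1930']/PAGE[@SIZE='SMALL']"/>
    <xsl:text>&#10;</xsl:text>
    <!-- TYPE4: country and year are text nodes surrounded by white space -->
    <xsl:value-of select="STUFF/TYPE4/CENSUS/COUNTRY[normalize-space(text())='USA']/YEAR[normalize-space(text())='1930']/PAGE/SIZE[@TYPE='SMALL']"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

Run against the data above, each of the four expressions should select the same value, 17x11. The TYPE4 path is the longest because the country and year have to be trimmed with normalize-space() before they can be compared.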
Points of Interest
Often, an XML document needs to be converted to a new structure. That is where XSLT comes in. There are lots of good tutorials on XSLT, but I found very few examples that covered more than a few aspects of it. So I wrote XSLT to convert each type in my XML document into each of the other types. Some things I had to overcome were converting attributes into elements, converting element values into attributes, selecting nodes back up the tree (the parent, in my case), stripping white space, and adding white space.
The following XSLT converts TYPE2, TYPE3, and TYPE4 into TYPE1.
Notice the following:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
<TYPE1>
<xsl:for-each select="STUFF/TYPE2/CENSUS">
<CENSUS>
<xsl:attribute name="COUNTRY">
<xsl:value-of select="COUNTRY"/>
</xsl:attribute>
<xsl:attribute name="YEAR">
<xsl:value-of select="YEAR"/>
</xsl:attribute>
<PAGE>
<xsl:attribute name="SIZE">
<xsl:text>SMALL</xsl:text>
</xsl:attribute>
<xsl:value-of select="PAGE/SIZE/SMALL"/>
</PAGE>
<PAGE>
<xsl:attribute name="SIZE">
<xsl:text>LARGE</xsl:text>
</xsl:attribute>
<xsl:value-of select="PAGE/SIZE/LARGE"/>
</PAGE>
</CENSUS>
</xsl:for-each>
<xsl:comment> ----------------- </xsl:comment>
<xsl:for-each select="STUFF/TYPE3/CENSUS">
<xsl:for-each select="child::*">
<CENSUS>
<xsl:attribute name="COUNTRY">
<xsl:value-of select="name()"/>
</xsl:attribute>
<xsl:copy-of select="@*"/>
<xsl:copy-of select="PAGE"/>
</CENSUS>
</xsl:for-each>
</xsl:for-each>
<xsl:comment> ----------------- </xsl:comment>
<xsl:for-each select="STUFF/TYPE4/CENSUS">
<xsl:for-each select="COUNTRY">
<xsl:for-each select="YEAR">
<CENSUS>
<xsl:attribute name="COUNTRY">
<xsl:value-of select="normalize-space(../text())"/>
</xsl:attribute>
<xsl:attribute name="YEAR">
<xsl:value-of select="normalize-space(text())"/>
</xsl:attribute>
<xsl:copy-of select="PAGE"/>
</CENSUS>
</xsl:for-each>
</xsl:for-each>
</xsl:for-each>
</TYPE1>
</xsl:template>
</xsl:stylesheet>
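The TYPE2 branch of the stylesheet above builds each attribute with xsl:attribute. XSLT 1.0 also allows attribute value templates on literal result elements, which can say the same thing more compactly. The following is a sketch of that alternative for the TYPE2 branch only; it is not the form used in the rest of this article.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <TYPE1>
      <xsl:for-each select="STUFF/TYPE2/CENSUS">
        <!-- {COUNTRY} and {YEAR} are attribute value templates: the XPath
             inside the braces is evaluated relative to the current CENSUS
             and its string value becomes the attribute value -->
        <CENSUS COUNTRY="{COUNTRY}" YEAR="{YEAR}">
          <!-- Fixed attribute values can simply be written literally -->
          <PAGE SIZE="SMALL"><xsl:value-of select="PAGE/SIZE/SMALL"/></PAGE>
          <PAGE SIZE="LARGE"><xsl:value-of select="PAGE/SIZE/LARGE"/></PAGE>
        </CENSUS>
      </xsl:for-each>
    </TYPE1>
  </xsl:template>
</xsl:stylesheet>
```

xsl:attribute is still the tool to reach for when the attribute name itself is computed or when the attribute is added conditionally.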
The following XSLT converts TYPE1, TYPE3, and TYPE4 into TYPE2.
Notice the following:
- Selecting the value of a node that has a particular attribute value:
<xsl:value-of select="SOMENODE[@SOMEATTRIBUTE='SOMEVALUE']"/>
- Converting a node's name into a node's value:
<MYNODE><xsl:value-of select="name()"/></MYNODE>
- Selecting a node's value that has white space around it:
<xsl:value-of select="normalize-space(text())"/>
OR selecting the parent node's value that has white space around it:
<xsl:value-of select="normalize-space(../text())"/>
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
<TYPE2>
<xsl:for-each select="STUFF/TYPE1/CENSUS">
<CENSUS>
<COUNTRY><xsl:value-of select="@COUNTRY"/></COUNTRY>
<YEAR><xsl:value-of select="@YEAR"/></YEAR>
<PAGE>
<SIZE>
<SMALL>
<xsl:value-of select="PAGE[@SIZE='SMALL']"/>
</SMALL>
<LARGE>
<xsl:value-of select="PAGE[@SIZE='LARGE']"/>
</LARGE>
</SIZE>
</PAGE>
</CENSUS>
</xsl:for-each>
<xsl:comment> ----------------- </xsl:comment>
<xsl:for-each select="STUFF/TYPE3/CENSUS">
<xsl:for-each select="child::*">
<CENSUS>
<COUNTRY><xsl:value-of select="name()"/></COUNTRY>
<YEAR><xsl:value-of select="@YEAR"/></YEAR>
<PAGE>
<SIZE>
<SMALL>
<xsl:value-of select="PAGE[@SIZE='SMALL']"/>
</SMALL>
<LARGE>
<xsl:value-of select="PAGE[@SIZE='LARGE']"/>
</LARGE>
</SIZE>
</PAGE>
</CENSUS>
</xsl:for-each>
</xsl:for-each>
<xsl:comment> ----------------- </xsl:comment>
<xsl:for-each select="STUFF/TYPE4/CENSUS">
<xsl:for-each select="child::COUNTRY">
<xsl:for-each select="YEAR">
<CENSUS>
<COUNTRY><xsl:value-of select="normalize-space(../text())"/>
</COUNTRY>
<YEAR>
<xsl:value-of select="normalize-space(text())"/>
</YEAR>
<PAGE>
<SIZE>
<SMALL>
<xsl:value-of select="PAGE/SIZE[@TYPE='SMALL']"/>
</SMALL>
<LARGE>
<xsl:value-of select="PAGE/SIZE[@TYPE='LARGE']"/>
</LARGE>
</SIZE>
</PAGE>
</CENSUS>
</xsl:for-each>
</xsl:for-each>
</xsl:for-each>
</TYPE2>
</xsl:template>
</xsl:stylesheet>
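The stylesheet above drives everything from nested xsl:for-each loops. The same TYPE1 to TYPE2 conversion can also be written with separate templates and xsl:apply-templates. The sketch below shows only the TYPE1 branch; splitting it into three templates is an illustrative choice, not something the rest of the article relies on.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Entry point: wrap the result and hand each TYPE1 record to its template -->
  <xsl:template match="/">
    <TYPE2>
      <xsl:apply-templates select="STUFF/TYPE1/CENSUS"/>
    </TYPE2>
  </xsl:template>

  <!-- One TYPE1 record: attributes become child elements -->
  <xsl:template match="TYPE1/CENSUS">
    <CENSUS>
      <COUNTRY><xsl:value-of select="@COUNTRY"/></COUNTRY>
      <YEAR><xsl:value-of select="@YEAR"/></YEAR>
      <PAGE>
        <SIZE>
          <xsl:apply-templates select="PAGE"/>
        </SIZE>
      </PAGE>
    </CENSUS>
  </xsl:template>

  <!-- One PAGE: the SIZE attribute value (SMALL or LARGE) becomes the element name -->
  <xsl:template match="TYPE1/CENSUS/PAGE">
    <xsl:element name="{@SIZE}">
      <xsl:value-of select="."/>
    </xsl:element>
  </xsl:template>
</xsl:stylesheet>
```

Because the PAGE template derives the element name from the attribute, it would also pick up any additional size names without further changes, whereas the for-each version names SMALL and LARGE explicitly.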
The following XSLT converts TYPE1, TYPE2, and TYPE4 into TYPE3.
Notice the following:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
<TYPE3>
<CENSUS>
<xsl:for-each select="STUFF/TYPE1/CENSUS">
<xsl:element name="{@COUNTRY}">
<xsl:attribute name="YEAR">
<xsl:value-of select="@YEAR"/>
</xsl:attribute>
<xsl:copy-of select="PAGE"/>
</xsl:element>
</xsl:for-each>
</CENSUS>
<xsl:comment> --------------- </xsl:comment>
<CENSUS>
<xsl:for-each select="STUFF/TYPE2/CENSUS">
<xsl:element name="{COUNTRY}">
<xsl:attribute name="YEAR">
<xsl:value-of select="YEAR"/>
</xsl:attribute>
<PAGE>
<xsl:attribute name="SIZE">
<xsl:text>SMALL</xsl:text>
</xsl:attribute>
<xsl:value-of select="PAGE/SIZE/SMALL"/>
</PAGE>
<PAGE>
<xsl:attribute name="SIZE">
<xsl:text>LARGE</xsl:text>
</xsl:attribute>
<xsl:value-of select="PAGE/SIZE/LARGE"/>
</PAGE>
</xsl:element>
</xsl:for-each>
</CENSUS>
<xsl:comment> --------------- </xsl:comment>
<CENSUS>
<xsl:for-each select="STUFF/TYPE4/CENSUS/COUNTRY/YEAR">
<xsl:element name="{normalize-space(../text())}">
<xsl:attribute name="YEAR">
<xsl:value-of select="normalize-space(./text())"/>
</xsl:attribute>
<PAGE>
<xsl:attribute name="SIZE">
<xsl:text>SMALL</xsl:text>
</xsl:attribute>
<xsl:value-of select="PAGE/SIZE[@TYPE='SMALL']"/>
</PAGE>
<PAGE>
<xsl:attribute name="SIZE">
<xsl:text>LARGE</xsl:text>
</xsl:attribute>
<xsl:value-of select="PAGE/SIZE[@TYPE='LARGE']"/>
</PAGE>
</xsl:element>
</xsl:for-each>
</CENSUS>
</TYPE3>
</xsl:template>
</xsl:stylesheet>
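TYPE3 turns a data value (the country) into an element name, and xsl:element requires that name to be a legal XML name. USA and UK are fine, but a value containing a space would not be. The sketch below guards against that one case with translate(); the country "NEW ZEALAND" mentioned in the comment is hypothetical and does not appear in the sample data, and other illegal characters (or a leading digit) would need further handling.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <TYPE3>
      <CENSUS>
        <xsl:for-each select="STUFF/TYPE1/CENSUS">
          <!-- Assumption: spaces are the only characters that need replacing.
               A hypothetical COUNTRY="NEW ZEALAND" would become NEW_ZEALAND. -->
          <xsl:element name="{translate(normalize-space(@COUNTRY), ' ', '_')}">
            <!-- Attributes must be added before any child elements -->
            <xsl:copy-of select="@YEAR"/>
            <xsl:copy-of select="PAGE"/>
          </xsl:element>
        </xsl:for-each>
      </CENSUS>
    </TYPE3>
  </xsl:template>
</xsl:stylesheet>
```

With the sample data the output is the same as the TYPE1 branch of the stylesheet above, since neither USA nor UK contains a space.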
The following XSLT converts TYPE1, TYPE2, and TYPE3 into TYPE4.
TYPE4 has formatting to make it human readable. Adding formatting to XML is not as easy as I had hoped. It seems advisable to always add formatting by placing it in an xsl:text node.
Notice the following:
- Adding a carriage return:
<xsl:text>&#13;</xsl:text>
- Adding a line feed:
<xsl:text>&#10;</xsl:text>
- Adding a tab:
<xsl:text>&#9;</xsl:text>
I only perform the formatting for the conversion of TYPE1 into TYPE4. I found that the formatting statements cluttered the conversion statements.
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
<TYPE4>
<CENSUS>
<xsl:for-each select="STUFF/TYPE1/CENSUS">
<COUNTRY>
<xsl:text>&#10;&#9;</xsl:text>
<xsl:value-of select="@COUNTRY"/>
<xsl:text>&#10;&#9;</xsl:text>
<YEAR>
<xsl:text>&#10;&#9;&#9;</xsl:text>
<xsl:value-of select="@YEAR"/>
<xsl:text>&#10;&#9;&#9;</xsl:text>
<PAGE>
<xsl:text>&#10;&#9;&#9;&#9;</xsl:text>
<SIZE TYPE="SMALL">
<xsl:value-of select="PAGE[@SIZE='SMALL']"/>
</SIZE>
<xsl:text>&#10;&#9;&#9;&#9;</xsl:text>
<SIZE TYPE="LARGE">
<xsl:value-of select="PAGE[@SIZE='LARGE']"/>
</SIZE>
<xsl:text>&#10;&#9;&#9;</xsl:text>
</PAGE>
<xsl:text>&#10;&#9;</xsl:text>
</YEAR>
<xsl:text>&#10;</xsl:text>
</COUNTRY>
</xsl:for-each>
</CENSUS>
<xsl:comment> --------------- </xsl:comment>
<CENSUS>
<xsl:for-each select="STUFF/TYPE2/CENSUS">
<COUNTRY>
<xsl:value-of select="COUNTRY"/>
<YEAR>
<xsl:value-of select="YEAR"/>
<PAGE>
<SIZE TYPE="SMALL">
<xsl:value-of select="PAGE/SIZE/SMALL"/>
</SIZE>
<SIZE TYPE="LARGE">
<xsl:value-of select="PAGE/SIZE/LARGE"/>
</SIZE>
</PAGE>
</YEAR>
</COUNTRY>
</xsl:for-each>
</CENSUS>
<xsl:comment> --------------- </xsl:comment>
<CENSUS>
<xsl:for-each select="STUFF/TYPE3/CENSUS/*">
<COUNTRY>
<xsl:value-of select="name()"/>
<YEAR>
<xsl:value-of select="@YEAR"/>
<PAGE>
<SIZE TYPE="SMALL">
<xsl:value-of select="PAGE[@SIZE='SMALL']"/>
</SIZE>
<SIZE TYPE="LARGE">
<xsl:value-of select="PAGE[@SIZE='LARGE']"/>
</SIZE>
</PAGE>
</YEAR>
</COUNTRY>
</xsl:for-each>
</CENSUS>
</TYPE4>
</xsl:template>
</xsl:stylesheet>
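In the sample data, TYPE4 nests every YEAR for a country under a single COUNTRY element, while a plain xsl:for-each over the records produces one COUNTRY per record. One common XSLT 1.0 way to group records is the Muenchian method, which combines xsl:key with generate-id(). The sketch below applies it to the TYPE1 records only; the key name by-country is an assumption chosen for this example, and the formatting xsl:text nodes are left out to keep the grouping logic visible.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Index every TYPE1 record by its COUNTRY attribute -->
  <xsl:key name="by-country" match="TYPE1/CENSUS" use="@COUNTRY"/>

  <xsl:template match="/">
    <TYPE4>
      <CENSUS>
        <!-- Keep only the first record of each country: a record qualifies
             when it is the first node the key returns for its own country -->
        <xsl:for-each select="STUFF/TYPE1/CENSUS[generate-id() = generate-id(key('by-country', @COUNTRY)[1])]">
          <COUNTRY>
            <xsl:value-of select="@COUNTRY"/>
            <!-- Visit every record that shares this country -->
            <xsl:for-each select="key('by-country', @COUNTRY)">
              <YEAR>
                <xsl:value-of select="@YEAR"/>
                <PAGE>
                  <xsl:for-each select="PAGE">
                    <SIZE TYPE="{@SIZE}"><xsl:value-of select="."/></SIZE>
                  </xsl:for-each>
                </PAGE>
              </YEAR>
            </xsl:for-each>
          </COUNTRY>
        </xsl:for-each>
      </CENSUS>
    </TYPE4>
  </xsl:template>
</xsl:stylesheet>
```

With the sample data this should yield two COUNTRY elements, USA and UK, each containing two YEAR elements in document order.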
The C# code necessary to perform the XSL transformations is extremely simple with .NET. Save my XML data with the four types into a file called data.xml. Save the XSLT that you want to run as style.xslt.
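using System;
using System.IO;
using System.Xml;
using System.Xml.XPath;
using System.Xml.Xsl;

class Program
{
    static void Main(string[] args)
    {
        string fileName = "data.xml";
        // Open the source document read-only and close the stream when done.
        using (FileStream fs = new FileStream(fileName, FileMode.Open, FileAccess.Read))
        {
            XmlTextReader reader = new XmlTextReader(fs);
            TestXSLT(reader, "style.xslt");
        }
    }

    static void TestXSLT(XmlTextReader reader, string fileName)
    {
        Console.WriteLine(fileName);
        // XslTransform is marked obsolete from .NET 2.0 onward;
        // XslCompiledTransform is its replacement.
        XslTransform xslt = new XslTransform();
        xslt.Load(fileName);
        XPathDocument xdoc = new XPathDocument(reader);
        // Write the result to the console, indented for readability.
        XmlTextWriter writer = new XmlTextWriter(Console.Out);
        writer.Formatting = Formatting.Indented;
        xslt.Transform(xdoc, null, writer, null);
        writer.Flush();
    }
}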
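In the C# code the indentation of the result comes from the XmlTextWriter, not from the stylesheet. When a stylesheet is run by a processor that serializes the result itself (an assumption outside the setup used in this article), the xsl:output element can request indentation instead. The sketch below simply copies the TYPE1 records to show where the declaration goes.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Ask the processor to indent the serialized result.
       indent="yes" is a request; processors may honor it differently. -->
  <xsl:output method="xml" indent="yes" encoding="UTF-8"/>
  <!-- Drop white-space-only text nodes from the source before processing -->
  <xsl:strip-space elements="*"/>

  <xsl:template match="/">
    <TYPE1>
      <xsl:copy-of select="STUFF/TYPE1/CENSUS"/>
    </TYPE1>
  </xsl:template>
</xsl:stylesheet>
```

Note that when the result is sent to an XmlWriter, as in the C# code above, the .NET processor should ignore xsl:output and leave formatting to the writer.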
Conclusion
I hope this has helped you understand XSLT a little better. Study each of the types of Census data in my XML at the top of the document. Look at how they differ. Think of the issues of converting one type to another. Then look at the XSLT that does the conversion.
License
This article has no explicit license attached to it, but may contain usage terms in the article text or the download files themselves.
Master Degree in C.S. .NET, Unix, Macintosh (OS X, 9, 8...), PC server side, and MFC. 17 years experience. Graphics, Distributed processing, Object Oriented Methods and Models.
Java, C#, C++. Webservices. XML. Real name is Geoffrey Slinker.