XSD Tutorial Parts
- Elements and Attributes
- Conventions and Recommendations
- Extending Existing Types
- Namespaces
- Other Useful bits...
Introduction
It is often useful to be able to take the definition for an existing entity and extend it to add more specific information. In most development languages, we would call this inheritance or subclassing. The same concepts also exist in the XSD standard. This allows us to take an existing type definition and extend it. We can also restrict an existing type (although this behavior has no real parallel in most development languages).
Extending an Existing ComplexType
It is possible to take an existing <xs:complexType> and extend it. Let's see how this may be useful with an example.
Looking at the AddressType that we defined earlier (in part 1), let's assume our company has now gone international and we need to capture country-specific addresses. In this case, we need specific information for UK addresses (County and Postcode), and for US addresses (State and Zipcode).
So we can take our existing definition of address and extend it as follows:
<xs:complexType name="UKAddressType">
  <xs:complexContent>
    <xs:extension base="AddressType">
      <xs:sequence>
        <xs:element name="County" type="xs:string"/>
        <xs:element name="Postcode" type="xs:string"/>
      </xs:sequence>
    </xs:extension>
  </xs:complexContent>
</xs:complexType>
<xs:complexType name="USAddressType">
  <xs:complexContent>
    <xs:extension base="AddressType">
      <xs:sequence>
        <xs:element name="State" type="xs:string"/>
        <xs:element name="Zipcode" type="xs:string"/>
      </xs:sequence>
    </xs:extension>
  </xs:complexContent>
</xs:complexType>
This is clearer when viewed graphically, but in essence it says: we are defining a new <xs:complexType> called "USAddressType"; it extends the existing type "AddressType" and adds to it a sequence containing the elements "State" and "Zipcode".
There are 2 new things here: the <xs:extension> element and the <xs:complexContent> element. We'll get to these shortly.

We can now use these new types as follows:
<xs:element name="UKAddress" type="UKAddressType"/>
<xs:element name="USAddress" type="USAddressType"/>
Some sample XML for these elements may look like this:
<UKAddress>
<Line1>34 thingy street</Line1>
<Line2>someplace</Line2>
<County>somerset</County>
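Written out in full, instances of both elements might look like the sketch below. It assumes AddressType contributes Line1 and Line2, as the sample values suggest. The Postcode value is a made-up placeholder, and the USAddress values are borrowed from the Person example used later in this part.

```xml
<!-- Two separate instance documents, shown together for comparison -->

<!-- Inherited Line1/Line2 come first, then the elements added by the extension -->
<UKAddress>
  <Line1>34 thingy street</Line1>
  <Line2>someplace</Line2>
  <County>somerset</County>
  <Postcode>AB1 2CD</Postcode> <!-- placeholder value, not from the original -->
</UKAddress>

<USAddress>
  <Line1>234 Lancaseter Av</Line1>
  <Line2>SmallsVille</Line2>
  <State>Florida</State>
  <Zipcode>34543</Zipcode>
</USAddress>
```

The element order matters: an extension appends its sequence after the content of the base type, so County or State cannot appear before Line1.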
Restricting an Existing ComplexType
We are defining a new type "InternalAddressType". The <xs:restriction> element says we are restricting the existing type "AddressType", and we are only allowing the existing child element "Line1" to be used in this new definition.
Note: Because we are restricting an existing type, the only definitions that can appear in the <xs:restriction> are a subset of the ones defined in the base type "AddressType". They must also be enclosed in the same compositor (in this case, a sequence) and appear in the same order.
We can now use this new type as follows:
<xs:element name="InternalAddress" type="InternalAddressType"/>
Some sample XML for this element may look like this:
<InternalAddress>
  <Line1>Desk 4, Second Floor</Line1>
</InternalAddress>
Note: The <xs:complexContent> element is just a container for the extension or restriction - we can largely ignore it for now.
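Putting the restriction together, a definition matching the description above might look like the following sketch. The AddressType shown here is an assumption (the real one is defined in part 1): Line2 is marked optional with minOccurs="0", because a restriction may only drop an element that the base type already allows to be absent.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Assumed base type; Line2 is optional so a restriction can leave it out -->
  <xs:complexType name="AddressType">
    <xs:sequence>
      <xs:element name="Line1" type="xs:string"/>
      <xs:element name="Line2" type="xs:string" minOccurs="0"/>
    </xs:sequence>
  </xs:complexType>

  <!-- Restricted type: same compositor, same order, only Line1 kept -->
  <xs:complexType name="InternalAddressType">
    <xs:complexContent>
      <xs:restriction base="AddressType">
        <xs:sequence>
          <xs:element name="Line1" type="xs:string"/>
        </xs:sequence>
      </xs:restriction>
    </xs:complexContent>
  </xs:complexType>

  <xs:element name="InternalAddress" type="InternalAddressType"/>

</xs:schema>
```

Unlike an extension, a restriction restates the whole content model it wants to keep, rather than listing only the differences from the base type.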
Use of Extended/Restricted Types
We have just shown how we can create new types based on existing ones. This in itself is pretty useful, and will potentially reduce the amount of complexity in your schemas, making them easier to maintain and understand. However, there is an aspect to this that has not yet been covered. In the above examples, we created 3 new types (UKAddressType, USAddressType and InternalAddressType), all based on AddressType.
So, if we have an element that specifies it is of type UKAddressType, then that is what must appear in the XML document. But if an element specifies it is of type "AddressType", then any of the 4 types can appear in the XML document (UKAddressType, USAddressType, InternalAddressType or AddressType).
The thing to consider now is how the XML parser will know which type you meant to use. Surely it needs to know, otherwise it cannot do proper validation?
Well, it knows because if you want to use a type other than the one explicitly specified in the schema (in this case AddressType), then you have to let the parser know which type you are using. This is done in the XML document using the xsi:type attribute.
Let's look at an example.
<xs:element name="Person">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="Name" type="xs:string" />
      <xs:element name="HomeAddress" type="AddressType" />
    </xs:sequence>
  </xs:complexType>
</xs:element>

This sample XML is the kind of thing you would expect to see.
<?xml version="1.0"?>
<Person>
  <Name>Fred</Name>
  <HomeAddress>
    <Line1>22 whatever place, someplace</Line1>
    <Line2>sometown, ss1 6gy </Line2>
  </HomeAddress>
</Person>
But the following is also valid.
<?xml version="1.0"?>
<Person xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <Name>Fred</Name>
  <HomeAddress xsi:type="USAddressType">
    <Line1>234 Lancaseter Av</Line1>
    <Line2>SmallsVille</Line2>
    <State>Florida</State>
    <Zipcode>34543</Zipcode>
  </HomeAddress>
</Person>
Let's look at that in more detail.
- We have added the attribute xsi:type="USAddressType" to the "HomeAddress" element. This tells the XML parser that the element actually contains data described by "USAddressType".
- The xmlns:xsi attribute in the root element (Person) tells the XML parser that the alias xsi maps to the namespace "http://www.w3.org/2001/XMLSchema-instance".
- The xsi: part of the xsi:type attribute is a namespace qualifier. It basically says the attribute "type" is from the namespace aliased by "xsi", which was defined earlier to mean "http://www.w3.org/2001/XMLSchema-instance".
- The "type" attribute in this namespace is an instruction to the XML parser to tell it which definition to use to validate the element.
But more about namespaces in the next section.
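The same mechanism should work for the restricted type as well. The sketch below assumes InternalAddressType is declared in the same schema as Person (with no target namespace) and reuses the sample value from the InternalAddress example.

```xml
<?xml version="1.0"?>
<Person xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <Name>Fred</Name>
  <!-- xsi:type selects the restricted type, so only Line1 is allowed here -->
  <HomeAddress xsi:type="InternalAddressType">
    <Line1>Desk 4, Second Floor</Line1>
  </HomeAddress>
</Person>
```

The type named in xsi:type has to be derived from the declared type (here AddressType); naming an unrelated type would be a validation error.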
Extending Simple Types
There are 3 ways in which a simpleType can be extended: Restriction, List or Union. The most common is Restriction, but we will cover the other 2 as well.
Restriction
Restriction is a way to constrain an existing type definition. We can apply a restriction to the built-in data types xs:string, xs:integer, xs:date, etc. or ones we create ourselves.
Here, we are defining a restriction on the existing type "string", and applying a regular expression to it to limit the values it can take.
<xs:simpleType name="LetterType">
  <xs:restriction base="xs:string">
    <xs:pattern value="[a-zA-Z]"/>
  </xs:restriction>
</xs:simpleType>
[Figure: LetterType shown graphically in Liquid XML Studio]
Let's go through this line by line:
- An <xs:simpleType> tag is used to define our new type. We must give the type a unique name - in this case, "LetterType".
- We are restricting an existing type, so the tag is <xs:restriction> (you can also extend an existing type - but more about this later). We are basing our new type on a string, so base="xs:string".
- We are applying a restriction in the form of a regular expression, specified using the <xs:pattern> element. The regular expression means the data must contain a single lower or upper case letter, a through z.
- Closing tag for the restriction.
- Closing tag for the simple type.
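To see the effect of the pattern, consider the sketch below. The element name "Initial" is hypothetical and only serves to attach LetterType to something. An XSD pattern has to match the whole value, not just part of it, which is why the two-character value is rejected.

```xml
<!-- Hypothetical element declaration using LetterType -->
<xs:element name="Initial" type="LetterType"/>

<!-- Instance fragments, each checked on its own -->
<Initial>A</Initial>   <!-- valid: exactly one letter -->
<Initial>AB</Initial>  <!-- invalid: two characters, the pattern allows one -->
<Initial>7</Initial>   <!-- invalid: a digit is not in [a-zA-Z] -->
```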
The individual constraints used in a restriction are referred to as "facets". For a complete list, see the XSD Standard, but to give you an idea, here are a few to get you started.
| Overview | Syntax | Syntax explained |
|---|---|---|
| The minimum and maximum length allowed. Must be 0 or greater. | <xs:minLength value="3"/> <xs:maxLength value="8"/> | The length must be between 3 and 8. |
| The lower and upper range for numerical values. The value must be greater than or equal to the minimum and less than or equal to the maximum. | <xs:minInclusive value="0"/> <xs:maxInclusive value="10"/> | The value must be between 0 and 10, inclusive. |
| The lower and upper range for numerical values. The value must be greater than the minimum and less than the maximum. | <xs:minExclusive value="0"/> <xs:maxExclusive value="10"/> | The value must be greater than 0 and less than 10 (for an integer, between 1 and 9). |
| The exact number of characters allowed. | <xs:length value="30"/> | The length must be exactly 30. |
| The maximum number of digits allowed. | <xs:totalDigits value="9"/> | Cannot have more than 9 digits. |
| A list of values allowed. | <xs:enumeration value="Hippo"/> <xs:enumeration value="Zebra"/> <xs:enumeration value="Lion"/> | The only permitted values are Hippo, Zebra or Lion. |
| The maximum number of decimal places allowed (must be >= 0). | <xs:fractionDigits value="2"/> | The value can have at most 2 decimal places. |
| How whitespace will be handled. Whitespace is line feeds, carriage returns, tabs, spaces, etc. | <xs:whiteSpace value="preserve"/> <xs:whiteSpace value="replace"/> <xs:whiteSpace value="collapse"/> | Preserve - keeps whitespace unchanged. Replace - replaces each tab, line feed and carriage return with a space. Collapse - as replace, then reduces runs of spaces to one space and removes leading and trailing spaces. |
| Pattern determines what characters are allowed and in what order. These are regular expressions and there is a complete list at: http://www.w3.org/TR/xmlschema-2/#regexs | <xs:pattern value="[0-9]"/> | 1 digit between 0 and 9. More examples follow. |

Pattern examples. Note that a character class in square brackets always matches exactly one character, so [0-999] is just a longer way of writing [0-9]; it does not mean the numbers 0 to 999.
- [0-9] - 1 digit between 0 and 9
- [0-9][0-9][0-9] - 3 digits, each between 0 and 9
- [0-9]{1,3} - 1 to 3 digits, i.e. a number from 0 to 999 (leading zeros allowed)
- [a-z][0-9][A-Z] - 1st character is lowercase a to z, 2nd is a digit, 3rd is uppercase A to Z. Patterns are case sensitive.
- [a-zA-Z] - 1 character that can be either a lowercase or an uppercase letter
- [123] - 1 character that has to be 1, 2 or 3
- ([a-z])* - Zero or more occurrences of a to z
- ([q][u])+ - One or more repetitions of a q followed by a u, for example qu or ququ
- ([a-z][0-9])+ - One or more pairs where the 1st character is lowercase a to z and the 2nd is a digit, for example a1 or c9z5
- [a-z0-9]{8} - Must be exactly 8 characters, each either lowercase a to z or a digit 0 to 9
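Several facets can be combined inside one restriction, and a value then has to satisfy all of them. The following is a minimal sketch; the type names UserNameType, AmountType and AnimalType are hypothetical and not part of the tutorial's schema.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- 3 to 8 characters, each a lowercase letter or a digit -->
  <xs:simpleType name="UserNameType">
    <xs:restriction base="xs:string">
      <xs:minLength value="3"/>
      <xs:maxLength value="8"/>
      <xs:pattern value="[a-z0-9]+"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- Non-negative number, at most 8 digits in total, at most 2 after the decimal point -->
  <xs:simpleType name="AmountType">
    <xs:restriction base="xs:decimal">
      <xs:minInclusive value="0"/>
      <xs:totalDigits value="8"/>
      <xs:fractionDigits value="2"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- Only the three listed values are accepted; they are case sensitive -->
  <xs:simpleType name="AnimalType">
    <xs:restriction base="xs:string">
      <xs:enumeration value="Hippo"/>
      <xs:enumeration value="Zebra"/>
      <xs:enumeration value="Lion"/>
    </xs:restriction>
  </xs:simpleType>

</xs:schema>
```

With these definitions, "fred42" would be a valid UserNameType and "Fred" would not (uppercase letter), while 12.5 would be a valid AmountType and 12.345 would not (three decimal places).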
It is important to note that not all facets are valid for all data types - for example, maxInclusive has no meaning when applied to a string. For the combinations of facets that are valid for a given data type, refer to the XSD standard.
Union
A union is a mechanism for combining 2 or more different data types into one.
The following defines 2 simple types: "SizeByNumberType", all the positive integers up to 21 (e.g. 10, 12, 14), and "SizeByStringNameType", the values small, medium and large.
<xs:simpleType name="SizeByNumberType">
  <xs:restriction base="xs:positiveInteger">
    <xs:maxInclusive value="21"/>
  </xs:restriction>
</xs:simpleType>
<xs:simpleType name="SizeByStringNameType">
  <xs:restriction base="xs:string">
    <xs:enumeration value="small"/>
    <xs:enumeration value="medium"/>
    <xs:enumeration value="large"/>
  </xs:restriction>
</xs:simpleType>

We can then define a new type called "USClothingSizeType". We define this as a union of the types "SizeByNumberType" and "SizeByStringNameType" (although we can add any number of types, including the built-in types, separated by whitespace).
<xs:simpleType name="USClothingSizeType">
  <xs:union memberTypes="SizeByNumberType SizeByStringNameType" />
</xs:simpleType>


This means the type can contain any of the values that the 2 members can take (e.g. 1, 2, 3, ..., 20, 21, small, medium, large). This new type can then be used in the same way as any other <xs:simpleType>.
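As a quick illustration, the sketch below attaches the union type to a hypothetical element named "Size" and shows which values a validator would be expected to accept.

```xml
<!-- Hypothetical element declaration using the union type -->
<xs:element name="Size" type="USClothingSizeType"/>

<!-- Instance fragments, each checked on its own -->
<Size>12</Size>      <!-- valid: matches SizeByNumberType -->
<Size>medium</Size>  <!-- valid: matches SizeByStringNameType -->
<Size>22</Size>      <!-- invalid: above 21 and not one of the names -->
<Size>Medium</Size>  <!-- invalid: enumeration values are case sensitive -->
```

A value is accepted as soon as it is valid against one of the member types, so no xsi:type attribute is needed in the instance.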
List
A list allows the value (in the XML document) to contain a number of valid values separated by whitespace.
A List is constructed in a similar way to a Union. The difference is that we can only specify a single type. This new type can contain a list of values that are defined by the itemType property. The values must be whitespace separated. So, a valid value for this type would be "5 9 21".
<xs:simpleType name="SizesinStockType">
  <xs:list itemType="SizeByNumberType" />
</xs:simpleType>
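The sketch below shows the list type in use. The element name "SizesInStock" is hypothetical; each whitespace-separated item is validated against SizeByNumberType on its own.

```xml
<!-- Hypothetical element declaration using the list type -->
<xs:element name="SizesInStock" type="SizesinStockType"/>

<!-- Instance fragments, each checked on its own -->
<SizesInStock>5 9 21</SizesInStock>    <!-- valid: three items, all between 1 and 21 -->
<SizesInStock>5 large</SizesInStock>   <!-- invalid: "large" is not a SizeByNumberType -->
<SizesInStock>5, 9, 21</SizesInStock>  <!-- invalid: items are separated by whitespace, not commas -->
```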

