
XSD Tutorial - Part 1 of 5 - Elements and Attributes

By Simon Sprott, 3 Jul 2014
This article gives a basic overview of the building blocks underlying XML Schemas.

XSD Tutorial Parts

  1. Elements and Attributes
  2. Conventions and Recommendations
  3. Extending Existing Types
  4. Namespaces
  5. Other Useful bits...

Introduction

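This article gives a basic overview of the building blocks underlying XML Schemas and how to use them. It covers elements, cardinality, simple and complex types, compositors, re-use, attributes, and mixed element content.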

Overview

First, let's look at what an XML schema is. A schema formally describes what a given XML document contains, in the same way a database schema describes the data that can be contained in a database (table structure, data types). An XML schema describes the overall shape of the XML document: which attributes and child elements an element can contain, and so on. It can also describe the values that can be placed into any element or attribute.

A Note About Standards

  • "DTD" was the first formalized standard, but is rarely used anymore.
  • "XDR" was an early attempt by Microsoft to provide a more comprehensive standard than DTD. This standard has largely been abandoned in favor of XSD.
  • "XSD" is currently the de facto standard for describing XML documents. There are 2 versions in use, 1.0 and 1.1, which are on the whole the same (you have to dig quite deep before you notice the difference). An XSD schema is itself an XML document; there is even an XSD schema to describe the XSD standard.
  • There are also a number of other standards, but their take-up has been patchy at best.

The XSD standard has evolved over a number of years and is controlled by the W3C. It is extremely comprehensive and, as a result, has become rather complex. For this reason, it is a good idea to make use of design tools when working with XSDs (see XML Studio, a free XSD development tool). When working with XML documents programmatically, XML data binding is a much easier, object-oriented way to manipulate your documents (see Liquid XML Data Binding).

The remainder of this tutorial guides you through the basics of the XSD standard, things you should really know even if you are using a design tool like Liquid XML Studio.

Elements

Elements are the main building block of any XML document; they contain the data and determine the structure of the document. An element can be defined within an XML Schema (XSD) as follows:

<xs:element name="x" type="y"/>

An element definition within the XSD must have a name property; this is the name that will appear in the XML document. The type property describes what can be contained within the element when it appears in the XML document. There are a number of predefined types, such as xs:string, xs:integer, xs:boolean or xs:date (see the XSD standard for a complete list). You can also create a user-defined type using the <xs:simpleType> and <xs:complexType> tags, but more on these later.
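The snippets in this article leave out the enclosing <xs:schema> element. As a minimal sketch, a complete schema document holding a single element definition could look like the following. The xs prefix is only a convention; what matters is that it is bound to the XML Schema namespace URI shown.

```xml
<!-- A minimal, complete schema document (sketch) -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
    <!-- One top-level element whose content must be a 32-bit integer -->
    <xs:element name="OrderID" type="xs:int"/>
</xs:schema>

<!-- An XML document that is valid against the schema above -->
<OrderID>5756</OrderID>
```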

If we have set the type property for an element in the XSD, then the corresponding value in the XML document must be in the correct format for its given type (failure to do this will cause a validation error). Examples of simple elements and their XML are below:

Sample XSD | Sample XML
<xs:element name="Customer_dob" type="xs:date"/> | <Customer_dob>2000-01-12</Customer_dob>
<xs:element name="Customer_address" type="xs:string"/> | <Customer_address>99 London Road</Customer_address>
<xs:element name="OrderID" type="xs:int"/> | <OrderID>5756</OrderID>
<xs:element name="Body" type="xs:string"/> | <Body></Body>

Note: an element typed as xs:string may have no content, as <Body> shows; this is not true of all data types. An xs:date value holds a date only (for example 2000-01-12); a value with a time part, such as 2000-01-12T12:13:14Z, requires xs:dateTime.

The previous XSD definitions shown graphically using Liquid XML Studio

The value an element takes in the XML document can be further constrained using the fixed and default properties.

Default means that if no value is specified in the XML document, the application reading the document (typically an XML parser or XML data binding library) should use the default specified in the XSD.

Fixed means the value in the XML document can only be the value specified in the XSD.

For this reason, it does not make sense to use both default and fixed in the same element definition (in fact, it is illegal to do so).

<xs:element name="Customer_name" type="xs:string" default="unknown"/>
<xs:element name="Customer_location" type="xs:string" fixed="UK"/>

The previous XSD shown graphically using Liquid XML Studio
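To make the difference concrete, here is a small sketch of how default and fixed affect instance documents. The element names Status and Country are hypothetical and chosen only for this illustration. Note that an element default applies when the element is present but empty.

```xml
<!-- Hypothetical declarations, for illustration only -->
<xs:element name="Status" type="xs:string" default="unknown"/>
<xs:element name="Country" type="xs:string" fixed="UK"/>

<!-- Instance fragments -->
<Status/>                   <!-- empty: a schema-aware processor treats the value as "unknown" -->
<Status>active</Status>     <!-- an explicit value overrides the default -->
<Country>UK</Country>       <!-- valid: matches the fixed value -->
<Country>France</Country>   <!-- invalid: differs from the fixed value -->
```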

Cardinality

Specifying how many times an element can appear is referred to as cardinality, and is specified using the attributes minOccurs and maxOccurs. In this way, an element can be mandatory, optional, or appear many times. minOccurs can be assigned any non-negative integer value (e.g. 0, 1, 2, 3... etc.), and maxOccurs can be assigned any non-negative integer value or the string constant "unbounded" meaning no maximum.
The default value for both minOccurs and maxOccurs is 1. So if both attributes are absent, as in all the previous examples, the element must appear once and once only.

Sample XSD | Description
<xs:element name="Customer_dob" type="xs:date"/> | If we don't specify minOccurs or maxOccurs, the default values of 1 are used, so in this case there has to be one and only one occurrence of Customer_dob.
<xs:element name="Customer_order" type="xs:integer" minOccurs="0" maxOccurs="unbounded"/> | Here, a customer can have any number of Customer_orders (even 0).
<xs:element name="Customer_hobbies" type="xs:string" minOccurs="2" maxOccurs="10"/> | In this example, the element Customer_hobbies must appear at least twice, but no more than 10 times.

The previous XSD shown graphically using Liquid XML Studio
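As a sketch of what these rules allow, the fragment below would satisfy all three cardinality constraints. It assumes the three declarations sit, in the order shown, inside the <xs:sequence> of a hypothetical parent element named Customer; minOccurs and maxOccurs are only permitted on element declarations nested inside a parent, not on top-level ones.

```xml
<!-- Assumed context: the three declarations above sit inside the
     <xs:sequence> of a hypothetical parent element named Customer -->
<Customer>
    <!-- exactly one Customer_dob (minOccurs and maxOccurs default to 1) -->
    <Customer_dob>2000-01-12</Customer_dob>
    <!-- no Customer_order elements: allowed because minOccurs="0" -->
    <!-- two Customer_hobbies: the minimum permitted by minOccurs="2" -->
    <Customer_hobbies>Chess</Customer_hobbies>
    <Customer_hobbies>Cycling</Customer_hobbies>
</Customer>
```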

Simple Types

So far, we have touched on a few of the built-in data types: xs:string, xs:integer and xs:date. You can also define your own types by modifying the existing ones.

Examples of this would be:

  • Defining an ID; this may be an integer with a maximum limit.
  • A postcode or zip code could be restricted to ensure it is the correct length and complies with a regular expression.
  • A field may have a maximum length.

Creating your own types is covered more thoroughly in the next section.
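As a rough preview, the three cases listed above could be sketched with <xs:restriction> as follows. The type names, the limits and the five-digit pattern are assumptions made up for this illustration, not part of the original examples.

```xml
<!-- An ID: an integer with an upper (and lower) limit -->
<xs:simpleType name="IdType">
    <xs:restriction base="xs:integer">
        <xs:minInclusive value="1"/>
        <xs:maxInclusive value="99999"/>
    </xs:restriction>
</xs:simpleType>

<!-- A zip code: a string that must match a regular expression (five digits) -->
<xs:simpleType name="ZipCodeType">
    <xs:restriction base="xs:string">
        <xs:pattern value="[0-9]{5}"/>
    </xs:restriction>
</xs:simpleType>

<!-- A field with a maximum length of 50 characters -->
<xs:simpleType name="ShortNameType">
    <xs:restriction base="xs:string">
        <xs:maxLength value="50"/>
    </xs:restriction>
</xs:simpleType>
```

A named simple type is then used like a built-in one, for example <xs:element name="CustomerID" type="IdType"/>.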

Complex Types

A complex type is a container for other element definitions. It lets you specify which child elements an element can contain, and so provides structure within your XML documents.

Have a look at these simple elements:

<xs:element name="Customer" type="xs:string"/>
<xs:element name="Customer_dob" type="xs:date"/>
<xs:element name="Customer_address" type="xs:string"/>

<xs:element name="Supplier" type="xs:string"/>
<xs:element name="Supplier_phone" type="xs:integer"/>
<xs:element name="Supplier_address" type="xs:string"/> 

We can see that some of these elements should really be represented as child elements: "Customer_dob" and "Customer_address" belong to a parent element "Customer", while "Supplier_phone" and "Supplier_address" belong to a parent element "Supplier". We can therefore rewrite this in a more structured way:

<xs:element name="Customer">
    <xs:complexType>
        <xs:sequence>
            <xs:element name="Dob" type="xs:date"/>
            <xs:element name="Address" type="xs:string"/>
        </xs:sequence>
    </xs:complexType>
</xs:element>

<xs:element name="Supplier">
    <xs:complexType>
        <xs:sequence>
            <xs:element name="Phone" type="xs:integer"/>
            <xs:element name="Address" type="xs:string"/>
        </xs:sequence>
    </xs:complexType>
</xs:element>

The previous XSD shown graphically using Liquid XML Studio

Example XML

<Customer>
    <Dob>2000-01-12</Dob>
    <Address> 34 thingy street, someplace, sometown, w1w8uu </Address>
</Customer>

<Supplier>
    <Phone>0123987654</Phone>
    <Address>22 whatever place, someplace, sometown, ss1 6gy </Address>
</Supplier>

What's changed?

Let's look at this in detail.

  • We created a definition for an element called "Customer".
  • Inside the <xs:element> definition we added a <xs:complexType>. This is a container for other <xs:element> definitions, allowing us to build a simple hierarchy of elements in the resulting XML document.
  • Note that the "Customer" and "Supplier" elements do not have a type property. Their content is described by the nested <xs:complexType>, which does not extend or restrict an existing type; it is a new definition built from scratch.
  • The <xs:complexType> element contains another new element <xs:sequence>, but more on these in a minute.
  • The <xs:sequence> in turn contains the definitions for the 2 child elements "Dob" and "Address". Note the customer/supplier prefix has been removed as it is implied from its position within the parent element "Customer" or "Supplier".

So, in English, this is saying we can have an XML document that contains an element <Customer>, which must have 2 child elements, <Dob> and <Address>.

Compositors

There are 3 types of compositors: <xs:sequence>, <xs:choice> and <xs:all>. These compositors allow us to determine how the child elements within them appear within the XML document.

Compositor | Description
Sequence | The child elements in the XML document MUST appear in the order they are declared in the XSD schema.
Choice | Only one of the child elements described in the XSD schema can appear in the XML document.
All | The child elements described in the XSD schema can appear in the XML document in any order.

Notes

The compositors <xs:sequence> and <xs:choice> can be nested inside other compositors, and be given their own minOccurs and maxOccurs properties. This allows for quite complex combinations to be formed.
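The examples so far only use <xs:sequence>. As a sketch of the other two compositors, the element names below (Contact, Name, Email, Phone, Size, Width, Height) are hypothetical and exist only for this illustration.

```xml
<!-- A choice nested inside a sequence: Name first, then either Email or Phone -->
<xs:element name="Contact">
    <xs:complexType>
        <xs:sequence>
            <xs:element name="Name" type="xs:string"/>
            <xs:choice>
                <xs:element name="Email" type="xs:string"/>
                <xs:element name="Phone" type="xs:string"/>
            </xs:choice>
        </xs:sequence>
    </xs:complexType>
</xs:element>

<!-- All: Width and Height must both appear, in either order -->
<xs:element name="Size">
    <xs:complexType>
        <xs:all>
            <xs:element name="Width" type="xs:decimal"/>
            <xs:element name="Height" type="xs:decimal"/>
        </xs:all>
    </xs:complexType>
</xs:element>
```

With these definitions, <Contact><Name>Ann</Name><Phone>555 0100</Phone></Contact> is valid, but a <Contact> containing both <Email> and <Phone> is not.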

One step further: the definitions of "Customer->Address" and "Supplier->Address" are currently not very usable, as each address is held in a single field. In the real world, it would be better to break this out into a few fields. Let's fix this using the same technique shown above:

  <xs:element name="Customer">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="Dob" type="xs:date" />
        <xs:element name="Address">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="Line1" type="xs:string" />
              <xs:element name="Line2" type="xs:string" />
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>


  <xs:element name="Supplier">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="Phone" type="xs:integer" />
        <xs:element name="Address">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="Line1" type="xs:string" />
              <xs:element name="Line2" type="xs:string" />
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

The previous XSD shown graphically using Liquid XML Studio

This is much better, but we now have 2 definitions for Address, which are the same.

Re-use

It would make much more sense to have one definition of "Address" that could be used by both Customer and Supplier.
We can do this by defining a complexType independently of an element and giving it a unique name:

<xs:complexType name="AddressType">
    <xs:sequence>
        <xs:element name="Line1" type="xs:string"/>
        <xs:element name="Line2" type="xs:string"/>
    </xs:sequence>
</xs:complexType> 

The previous XSD shown graphically using Liquid XML Studio

We have now defined a <xs:complexType> that describes our representation of an Address, so let's use it.

Remember when we started looking at elements, we said you could define your own type instead of using one of the standard ones (xs:string, xs:integer)? Well, that's exactly what we're doing now.

<xs:element name="Customer">
    <xs:complexType>
        <xs:sequence>
            <xs:element name="Dob" type="xs:date"/>
            <xs:element name="Address" type="AddressType"/>
        </xs:sequence>
    </xs:complexType>
</xs:element>

<xs:element name="Supplier">
    <xs:complexType>
        <xs:sequence>
            <xs:element name="Phone" type="xs:integer"/>
            <xs:element name="Address" type="AddressType"/>
        </xs:sequence>
    </xs:complexType>
</xs:element> 

The previous XSD shown graphically using Liquid XML Studio

The advantage should be obvious: instead of having to define Address twice (once for Customer and once for Supplier), we have a single definition. This makes maintenance simpler; for example, if you decide to add "Line3" or "Postcode" elements to your address, you only have to add them in one place.

Example XML

<Customer>
    <Dob>2000-01-12</Dob>
    <Address>
        <Line1>34 thingy street, someplace</Line1>
        <Line2>sometown, w1w8uu </Line2>
    </Address>
</Customer>

<Supplier>
    <Phone>0123987654</Phone>
    <Address>
        <Line1>22 whatever place, someplace</Line1>
        <Line2>sometown, ss1 6gy </Line2>
    </Address>
</Supplier>

Note: Only complex types defined globally (as children of the <xs:schema> element) can have their own name and be re-used throughout the schema. If they are defined inline within an <xs:element>, they cannot have a name (they are anonymous) and cannot be reused elsewhere.

Attributes

An attribute provides extra information within an element. Attributes are defined within an XSD as follows, having name and type properties.

<xs:attribute name="x" type="y"/>

An attribute can appear 0 or 1 times within a given element in the XML document. Attributes are either optional or mandatory (by default, they are optional). The "use" property in the XSD definition specifies whether the attribute is optional or mandatory.

So the following are equivalent:

<xs:attribute name="ID" type="xs:string"/>
<xs:attribute name="ID" type="xs:string" use="optional"/>

The previous XSD shown graphically using Liquid XML Studio

To specify that an attribute must be present, set use="required" (Note: use may also be set to "prohibited", but we'll come to that later).

An attribute is typically specified within the XSD definition for an element; this ties the attribute to the element. Attributes can also be specified globally and then referenced (but more about this later).

Sample XSD:
<xs:element name="Order">
    <xs:complexType>
        <xs:attribute name="OrderID" type="xs:int"/>
    </xs:complexType>
</xs:element>
Sample XML:
<Order OrderID="6"/> or <Order/>

Sample XSD:
<xs:element name="Order">
    <xs:complexType>
        <xs:attribute name="OrderID" type="xs:int" use="optional"/>
    </xs:complexType>
</xs:element>
Sample XML:
<Order OrderID="6"/> or <Order/>

Sample XSD:
<xs:element name="Order">
    <xs:complexType>
        <xs:attribute name="OrderID" type="xs:int" use="required"/>
    </xs:complexType>
</xs:element>
Sample XML:
<Order OrderID="6"/>

The default and fixed attributes can be specified within the XSD attribute specification (in the same way as they are for elements).
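A short sketch of default and fixed on attributes follows. The Currency and Version attributes and their values are hypothetical, added only to illustrate the point.

```xml
<xs:element name="Order">
    <xs:complexType>
        <xs:attribute name="OrderID" type="xs:int" use="required"/>
        <!-- If Currency is omitted, a schema-aware processor supplies "GBP" -->
        <xs:attribute name="Currency" type="xs:string" default="GBP"/>
        <!-- Version may be omitted, but if present it must be "1.0" -->
        <xs:attribute name="Version" type="xs:string" fixed="1.0"/>
    </xs:complexType>
</xs:element>

<!-- Instance fragments -->
<Order OrderID="6"/>                  <!-- valid: Currency is treated as "GBP" -->
<Order OrderID="6" Currency="USD"/>   <!-- valid: explicit value overrides the default -->
<Order OrderID="6" Version="2.0"/>    <!-- invalid: differs from the fixed value -->
```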

Mixed Element Content

So far, we have seen how an element can contain data, other elements or attributes. Elements can also contain a combination of all of these, including text mixed in with child elements. You can allow this in the XSD schema by setting the mixed property on the complex type.

<xs:element name="MarkedUpDesc">
    <xs:complexType mixed="true">
        <xs:sequence>
            <xs:element name="Bold" type="xs:string" />
            <xs:element name="Italic" type="xs:string" />
        </xs:sequence>
    </xs:complexType>
</xs:element> 

A sample XML document could look like this.

<MarkedUpDesc>
    This is an <Bold>Example</Bold> of <Italic>Mixed</Italic> Content,
    Note there are elements mixed in with the elements data.
</MarkedUpDesc>
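Because the schema above uses <xs:sequence> with the default cardinality, <Bold> must appear exactly once and be followed by exactly one <Italic>; only the surrounding text is free-form. For markup-style content it is common to want the elements in any order and any number. One possible sketch of that, as an alternative rather than a correction to the schema above, is a repeating <xs:choice>:

```xml
<xs:element name="MarkedUpDesc">
    <xs:complexType mixed="true">
        <!-- Zero or more Bold/Italic elements, in any order, with text in between -->
        <xs:choice minOccurs="0" maxOccurs="unbounded">
            <xs:element name="Bold" type="xs:string"/>
            <xs:element name="Italic" type="xs:string"/>
        </xs:choice>
    </xs:complexType>
</xs:element>
```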

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

About the Author

Simon Sprott
Software Developer (Senior) Liquid Technologies
United Kingdom

Article Copyright 2007 by Simon Sprott