License: The Code Project Open License (CPOL)
XSD Tutorial - Part 1 of 5 - Elements and Attributes
By Simon Sprott
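This article gives a basic overview of the building blocks underlying XML Schemas and how to use them. It covers elements, cardinality, simple and complex types, compositors, reuse of types, attributes, and mixed element content.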
First, let's look at what an XML schema is. A schema formally describes what a given XML document contains, in the same way a database schema describes the data that can be contained in a database (table structure, data types). An XML schema describes the coarse shape of the XML document: which fields an element can contain, which sub-elements it can contain, and so on. It can also describe the values that can be placed into any element or attribute.
The XSD standard has evolved over a number of years and is controlled by the W3C. It is extremely comprehensive and, as a result, has become rather complex. For this reason, it is a good idea to use design tools when working with XSDs (see XML Studio, a free XSD development tool). Likewise, when working with XML documents programmatically, XML Data Binding is a much easier, object-oriented way to manipulate your documents (see Liquid XML Data Binding).
The remainder of this tutorial guides you through the basics of the XSD standard, things you should really know even if you are using a design tool like Liquid XML Studio.
Elements are the main building blocks of any XML document; they contain the data and determine the structure of the document. An element can be defined within an XML Schema (XSD) as follows:
<xs:element name="x" type="y"/>
An element definition within the XSD must have a name property; this is the name that will appear in the XML document. The type property describes what can be contained within the element when it appears in the XML document. There are a number of predefined types, such as xs:string, xs:integer, xs:boolean or xs:date (see the XSD standard for a complete list). You can also create a user-defined type using the <xs:simpleType> and <xs:complexType> tags, but more on these later.
If we have set the type property for an element in the XSD, then the corresponding value in the XML document must be in the correct format for its given type (failure to do this will cause a validation error). Examples of simple elements and their XML are below:
| Sample XSD | Sample XML |
| --- | --- |
| <xs:element name="Customer_dob" … | <Customer_dob>2000-01-12T12:13:14Z</Customer_dob> |
| <xs:element name="Customer_address" … | <Customer_address> … |
| <xs:element name="OrderID" … | <OrderID> … |
| <xs:element name="Body" type="xs:string"/> | <Body></Body> (a type can be defined as a string but not have any content; this is not true of all data types, however). |
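As a rough sketch, the four declarations in the table could sit together in one schema document like this. The element names come from the table; the types for Customer_dob, Customer_address and OrderID are assumptions, because they are cut off above. xs:dateTime is assumed for Customer_dob only because the sample value carries a time component (an xs:date value would be written 2000-01-12).

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Assumed type: accepts values such as 2000-01-12T12:13:14Z -->
  <xs:element name="Customer_dob" type="xs:dateTime"/>
  <!-- Assumed type: any text, including an empty element -->
  <xs:element name="Customer_address" type="xs:string"/>
  <!-- Assumed type: digits only; an empty element is not a valid integer -->
  <xs:element name="OrderID" type="xs:integer"/>
  <!-- From the table: a string element may be empty -->
  <xs:element name="Body" type="xs:string"/>
</xs:schema>
```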
The previous XSD definitions are shown graphically in Liquid XML Studio as follows
The value the element takes in the XML document can further be affected using the fixed and default properties.
Default means that if no value is specified in the XML document then the application reading the document (typically an XML parser or XML Data binding Library) should use the default specified in the XSD.
Fixed means the value in the XML document can only have the value specified in the XSD.
For this reason it does not make sense to use both default and fixed in the same element definition (in fact it is illegal to do so).
<xs:element name="Customer_name" type="xs:string" default="unknown"/>
<xs:element name="Customer_location" type="xs:string" fixed="UK"/>
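To illustrate how these two declarations affect an XML document, here is a small sketch. It assumes the fixed value is exactly "UK"; the Samples wrapper element and the values "Jane Smith" and "France" are hypothetical and exist only to group the cases. For elements, the default is applied when the element is present but empty.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical wrapper, used only to hold the individual cases -->
<Samples>
  <!-- Empty element: a schema-aware processor uses the default "unknown" -->
  <Customer_name/>
  <!-- Explicit value: used as written, the default is ignored -->
  <Customer_name>Jane Smith</Customer_name>
  <!-- Valid: matches the fixed value -->
  <Customer_location>UK</Customer_location>
  <!-- Invalid: differs from the fixed value, so validation fails -->
  <Customer_location>France</Customer_location>
</Samples>
```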
Specifying how many times an element can appear is referred to as cardinality, and is specified using the attributes minOccurs and maxOccurs. In this way, an element can be mandatory, optional, or appear many times. minOccurs can be assigned any non-negative integer value (e.g. 0, 1, 2, 3... etc.), and maxOccurs can be assigned any non-negative integer value or the string constant "unbounded" meaning no maximum.
The default value for both minOccurs and maxOccurs is 1. So if both the minOccurs and maxOccurs attributes are absent, as in all the previous examples, the element must appear once and once only.
| Sample XSD | Description |
| --- | --- |
| <xs:element name="Customer_dob" … | If we don't specify minOccurs or maxOccurs, the default value of 1 is used for both, so in this case there has to be one and only one occurrence of Customer_dob. |
| <xs:element name="Customer_order" … | Here, a customer can have any number of Customer_order elements (even 0). |
| <xs:element name="Customer_hobbies" … | In this example, the element Customer_hobbies must appear at least twice, but no more than 10 times. |
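A minimal sketch of declarations matching the three descriptions in the table might look like the following. The element types are assumptions, since the XSD cells are cut off. The declarations are wrapped in a Customer element because minOccurs and maxOccurs are not allowed on elements declared directly under <xs:schema>; the wrapper uses a complex type, which is explained later in this article.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Customer">
    <xs:complexType>
      <xs:sequence>
        <!-- No minOccurs/maxOccurs: both default to 1, so exactly one -->
        <xs:element name="Customer_dob" type="xs:date"/>
        <!-- Zero or more occurrences (type is an assumption) -->
        <xs:element name="Customer_order" type="xs:integer"
                    minOccurs="0" maxOccurs="unbounded"/>
        <!-- At least 2 and at most 10 occurrences (type is an assumption) -->
        <xs:element name="Customer_hobbies" type="xs:string"
                    minOccurs="2" maxOccurs="10"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```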
So far, we have touched on a few of the built in data types xs:string, xs:integer, xs:date. But you can also define your own types by modifying the existing ones.
An example of this would be an ID, which may be an integer with a maximum limit. Creating your own types is covered more thoroughly in the next section.
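As a brief preview, an ID with a maximum limit could be sketched as a restriction of xs:integer. The type name IdType and the bounds 1 and 99999 are hypothetical values chosen for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Hypothetical simple type: an integer between 1 and 99999 inclusive -->
  <xs:simpleType name="IdType">
    <xs:restriction base="xs:integer">
      <xs:minInclusive value="1"/>
      <xs:maxInclusive value="99999"/>
    </xs:restriction>
  </xs:simpleType>
  <!-- An element using the new type in place of a built-in one -->
  <xs:element name="OrderID" type="IdType"/>
</xs:schema>
```

With this definition, <OrderID>5756</OrderID> would validate, while <OrderID>100000</OrderID> would not.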
A complex type is a container for other element definitions; this allows you to specify which child elements an element can contain. This allows you to provide some structure within your XML documents.
Have a look at these simple elements:
<xs:element name="Customer" type="xs:string"/>
<xs:element name="Customer_dob" type="xs:date"/>
<xs:element name="Customer_address" type="xs:string"/>
<xs:element name="Supplier" type="xs:string"/>
<xs:element name="Supplier_phone" type="xs:integer"/>
<xs:element name="Supplier_address" type="xs:string"/>
We can see that some of these elements should really be represented as child elements: "Customer_dob" and "Customer_address" belong to a parent element "Customer", while "Supplier_phone" and "Supplier_address" belong to a parent element "Supplier". We can therefore rewrite this in a more structured way:
<xs:element name="Customer">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="Dob" type="xs:date" />
      <xs:element name="Address" type="xs:string" />
    </xs:sequence>
  </xs:complexType>
</xs:element>
<xs:element name="Supplier">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="Phone" type="xs:integer"/>
      <xs:element name="Address" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>
<Customer>
  <Dob>2000-01-12</Dob>
  <Address>34 thingy street, someplace, sometown, w1w8uu</Address>
</Customer>
<Supplier>
  <Phone>0123987654</Phone>
  <Address>22 whatever place, someplace, sometown, ss1 6gy</Address>
</Supplier>
Let's look at this in detail.
Firstly, notice that within the "Customer" <xs:element> definition we added a <xs:complexType>. This is a container for other <xs:element> definitions, allowing us to build a simple hierarchy of elements in the resulting XML document.
The elements "Customer" and "Supplier" do not have a type specified, as they do not extend or restrict an existing type; they are new definitions built from scratch.
The <xs:complexType> element contains another new element, <xs:sequence>, but more on these in a minute.
The <xs:sequence> in turn contains the definitions for the 2 child elements "Dob" and "Address". Note the customer/supplier prefix has been removed, as it is implied by the element's position within the parent element "Customer" or "Supplier".
So, in English, this is saying we can have an XML document that contains an element <Customer> which must have 2 child elements, <Dob> and <Address>.
There are 3 types of compositors <xs:sequence>, <xs:choice> and <xs:all>. These compositors allow us to determine how the child elements within them appear within the XML document.
| Compositor | Description |
| --- | --- |
| Sequence | The child elements in the XML document MUST appear in the order they are declared in the XSD schema. |
| Choice | Only one of the child elements described in the XSD schema can appear in the XML document. |
| All | The child elements described in the XSD schema can appear in the XML document in any order. |
The compositors <xs:sequence> and <xs:choice> can be nested inside other compositors, and be given their own minOccurs and maxOccurs properties. This allows for quite complex combinations to be formed.
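The sketch below shows one way the compositors can be combined. The element names Contact, Name, Phone, Email, Settings, Theme and Language are hypothetical and not part of the Customer/Supplier example. The first element nests an <xs:choice> with its own cardinality inside an <xs:sequence>; the second uses <xs:all>.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Contact">
    <xs:complexType>
      <xs:sequence>
        <!-- Name must come first -->
        <xs:element name="Name" type="xs:string"/>
        <!-- Then between 1 and 3 items, each either a Phone or an Email -->
        <xs:choice minOccurs="1" maxOccurs="3">
          <xs:element name="Phone" type="xs:string"/>
          <xs:element name="Email" type="xs:string"/>
        </xs:choice>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="Settings">
    <xs:complexType>
      <!-- Theme and Language must both appear, in either order -->
      <xs:all>
        <xs:element name="Theme" type="xs:string"/>
        <xs:element name="Language" type="xs:string"/>
      </xs:all>
    </xs:complexType>
  </xs:element>
</xs:schema>
```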
One step further: the definitions of "Customer->Address" and "Supplier->Address" are currently not very usable, as each address is held in a single field. In the real world it would be better to break it out into a few fields. Let's fix this using the same technique shown above:
<xs:element name="Customer">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="Dob" type="xs:date" />
      <xs:element name="Address">
        <xs:complexType>
          <xs:sequence>
            <xs:element name="Line1" type="xs:string" />
            <xs:element name="Line2" type="xs:string" />
          </xs:sequence>
        </xs:complexType>
      </xs:element>
    </xs:sequence>
  </xs:complexType>
</xs:element>
<xs:element name="Supplier">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="Phone" type="xs:integer" />
      <xs:element name="Address">
        <xs:complexType>
          <xs:sequence>
            <xs:element name="Line1" type="xs:string" />
            <xs:element name="Line2" type="xs:string" />
          </xs:sequence>
        </xs:complexType>
      </xs:element>
    </xs:sequence>
  </xs:complexType>
</xs:element>
This is much better, but we now have 2 identical definitions for Address.
It would make much more sense to have 1 definition of "Address" that could be used by both Customer and Supplier.
We can do this by defining a complexType independently of an element, and giving it a unique name:
<xs:complexType name="AddressType">
  <xs:sequence>
    <xs:element name="Line1" type="xs:string"/>
    <xs:element name="Line2" type="xs:string"/>
  </xs:sequence>
</xs:complexType>
We have now defined a <xs:complexType> that describes our representation of an Address, so let's use it.
Remember when we started looking at elements and we said you could define your own type instead of using one of the standard ones (xs:string, xs:integer)? Well, that's exactly what we're doing now.
<xs:element name="Customer">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="Dob" type="xs:date"/>
      <xs:element name="Address" type="AddressType"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>
<xs:element name="Supplier">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="Phone" type="xs:integer"/>
      <xs:element name="Address" type="AddressType"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>
The advantage should be obvious: instead of having to define Address twice (once for Customer and once for Supplier), we have a single definition. This makes maintenance simpler; if you decide to add "Line3" or "Postcode" elements to your address, you only have to add them in one place.
<Customer>
  <Dob>2000-01-12</Dob>
  <Address>
    <Line1>34 thingy street, someplace</Line1>
    <Line2>sometown, w1w8uu</Line2>
  </Address>
</Customer>
<Supplier>
  <Phone>0123987654</Phone>
  <Address>
    <Line1>22 whatever place, someplace</Line1>
    <Line2>sometown, ss1 6gy</Line2>
  </Address>
</Supplier>
Note: Only complex types defined globally (as children of the <xs:schema> element) can have their own name and be re-used throughout the schema. If they are defined inline within an <xs:element>, they cannot have a name (they are anonymous) and cannot be reused elsewhere.
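Putting the pieces together, a complete schema document might look like the sketch below, with the named type declared as a direct child of <xs:schema>. The optional Postcode element is a hypothetical addition, included only to show that a change to AddressType is picked up by both Customer and Supplier.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Global (named) type: a direct child of xs:schema, so it can be reused -->
  <xs:complexType name="AddressType">
    <xs:sequence>
      <xs:element name="Line1" type="xs:string"/>
      <xs:element name="Line2" type="xs:string"/>
      <!-- Hypothetical extra field, added once and shared by all users -->
      <xs:element name="Postcode" type="xs:string" minOccurs="0"/>
    </xs:sequence>
  </xs:complexType>
  <xs:element name="Customer">
    <!-- Anonymous (inline) type: usable only by this element -->
    <xs:complexType>
      <xs:sequence>
        <xs:element name="Dob" type="xs:date"/>
        <xs:element name="Address" type="AddressType"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="Supplier">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="Phone" type="xs:integer"/>
        <xs:element name="Address" type="AddressType"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```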
An attribute provides extra information within an element. Attributes are defined within an XSD as follows, having name and type properties.
<xs:attribute name="x" type="y"/>
An attribute can appear 0 or 1 times within a given element in the XML document. Attributes are either optional or mandatory (by default, they are optional). The "use" property in the XSD definition specifies whether the attribute is optional or mandatory.
So the following are equivalent:
<xs:attribute name="ID" type="xs:string"/>
<xs:attribute name="ID" type="xs:string" use="optional"/>
To specify that an attribute must be present, set use="required" (Note: use may also be set to "prohibited", but we'll come to that later).
An attribute is typically specified within the XSD definition for an element; this ties the attribute to the element. Attributes can also be specified globally and then referenced (but more about this later).
| Sample XSD | Sample XML |
| --- | --- |
| … | <Order OrderID="6"/> or <Order/> |
| <xs:element name="Order"> … | <Order OrderID="6"/> or <Order/> |
| <xs:element name="Order"> … | <Order OrderID="6"/> |
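Since the XSD cells in the table above are incomplete, here is a hedged sketch of how attributes are usually attached to an element and how "use" changes what is valid. The type xs:integer for OrderID and the extra Priority and Notes attributes are assumptions made for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Order">
    <xs:complexType>
      <!-- Mandatory: <Order OrderID="6"/> is valid, <Order/> is not -->
      <xs:attribute name="OrderID" type="xs:integer" use="required"/>
      <!-- Explicitly optional (hypothetical attribute) -->
      <xs:attribute name="Priority" type="xs:string" use="optional"/>
      <!-- No use given: optional by default (hypothetical attribute) -->
      <xs:attribute name="Notes" type="xs:string"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
```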
The default and fixed attributes can be specified within the XSD attribute specification (in the same way as they are for elements).
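A small sketch of default and fixed on attributes follows. The attribute names Currency and Version and their values are hypothetical. Unlike elements, an attribute default is applied when the attribute is absent from the XML document, and a default may only be combined with use="optional" (or no use at all).

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Order">
    <xs:complexType>
      <!-- If Currency is omitted, the processor supplies "GBP" -->
      <xs:attribute name="Currency" type="xs:string" default="GBP"/>
      <!-- If Version is present, its value must be exactly "1.0" -->
      <xs:attribute name="Version" type="xs:string" fixed="1.0"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
```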
So far we have seen how an element can contain data, other elements or attributes. Elements can also contain a combination of all of these. You can also mix elements and data. You can specify this in the XSD schema by setting the mixed property.
<xs:element name="MarkedUpDesc">
  <xs:complexType mixed="true">
    <xs:sequence>
      <xs:element name="Bold" type="xs:string" />
      <xs:element name="Italic" type="xs:string" />
    </xs:sequence>
  </xs:complexType>
</xs:element>
A sample XML document could look like this.
<MarkedUpDesc>
  This is an <Bold>Example</Bold> of <Italic>Mixed</Italic> Content,
  Note there are elements mixed in with the element's data.
</MarkedUpDesc>
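Note that the schema above uses <xs:sequence>, so the text must contain exactly one <Bold> followed by exactly one <Italic>. For free-form marked-up text, one possible variation (an assumption, not part of the original example) is a repeating <xs:choice>, which allows any number of the child elements in any order:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="MarkedUpDesc">
    <xs:complexType mixed="true">
      <!-- Zero or more Bold/Italic elements, in any order, between text -->
      <xs:choice minOccurs="0" maxOccurs="unbounded">
        <xs:element name="Bold" type="xs:string"/>
        <xs:element name="Italic" type="xs:string"/>
      </xs:choice>
    </xs:complexType>
  </xs:element>
</xs:schema>
```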
Last Updated: 16 Apr 2007. Editor: Sean Ewington
Copyright 2007 by Simon Sprott. Everything else Copyright © CodeProject, 1999-2010