Generic Data Points Series XML format and its validated loading with LINQ to XML






How to express a series of generic data points in XML and read them without much pain.
Foreword
It's often desirable to store generic data point series in an XML format. XML gives the ability to decorate a series with attributes, nest series in complex data objects, mix different data series and data objects in one data file, and load them with concise LINQ to XML code.
A generic data point is a structure with just two required {X, Y} properties expressing the point position in 2D space. Each point dimension has its own "Base Type", e.g., numeric, DateTime, etc. That's why the term "generic" is applied. The Data Point object type is defined by the pair of these Base Types. There are many point types we can express in XML. For example, the basic set of Data Point types is produced by the Cartesian product of the set of XML atomic types with itself. This basic set could be extended by the inclusion of XPath data types, simple types derived from XML simple types by restrictions, etc.
A Data Point Series contains one or more Data Points of the same type. The Data Point Series type is defined by the type of Data Points it contains.
One Data Point Series XML document could contain multiple Data Point Series of different types. Every application can impose its own requirements on the Data Point Series types it will accept or forbid. The Loader library must be able to validate the file format against the following list of requirements:
- Ensure that all the Data Points in the Data Point Series have the same type.
- Restrict the list of Data Point Series types it contains.
- Validate the Data Point dimension values against the XSD namespace and, optionally, other namespaces (like XPath) where the Base Types are defined.
- More...
Some of these requirements are application-specific, so the application must provide the Loader class with the appropriate information in some way.
We'll use an XML schema to validate the content of a Data Point Series XML document. This approach gives the following opportunities:
- Abstract the Loader code from the features that can be described in terms of the XML schema. This allows the Loader code to be both generic and concise. We'll use LINQ to XML to load the data.
- Pass the XML schema data to the Loader class in one form or another. The Loader class can use:
  - The default schema stored in the library. This is the easiest option to use, but it is neither configurable nor extensible.
  - A dynamically generated schema based on type mappings (see below for details). This option lets you define the list of expected Data Point Series types, but (at present) limits the Base Types to the XML schema's atomic types.
  - A user-provided schema. This is the most powerful option, but the user has to be familiar with the XML schema language.
Generic Data Point Series XML Format
First, we have to define the root element. Suppose it is called Items. For the sake of safety, we'll require it to define the default XML namespace urn:PointSeries-schema. The root element looks like this:
<?xml version="1.0" encoding="utf-8"?>
<Items xmlns="urn:PointSeries-schema">
...
</Items>
The root element contains an unrestricted number of Data Point Series. First, we will try to define point series elements as follows:
<Items xmlns="urn:PointSeries-schema">
<Points ...>
</Points>
<Points ...>
</Points>
...
</Items>
That won't work because different Data Point Series elements could contain points of different Base Types and, so, the Data Point Series elements themselves could be of different types. XML schema rules don't allow elements of different types to have the same name in the same scope. Hence, we must assign different names to Data Point Series elements of different types.
So, we name the Data Point Series elements according to the following patterns:
- <Points.BaseType ...> if both data series dimensions have the same Base Type, e.g., <Points.Double ...>.
- <Points.XBaseType.YBaseType ...> if the data series dimensions have different Base Types, e.g., <Points.DateTime.Int ...>.

The BaseType, XBaseType, and YBaseType parts of the Data Point Series element name are collectively called "type strings". We need to agree on how to define these type strings, and establish the mapping between the type strings, XSD-defined data types, and CLR data types.
XSD Type | Description | Examples | Type string | .NET type |
---|---|---|---|---|
xsd:int | An integer that can be represented as a four-byte, two's complement number | -2147483648, 2147483645,..., -3, -2, -1, 0, 1, 2, 3, ... | Int | System.Int32 |
xsd:double | IEEE 754 64-bit floating-point number | -INF, 1.401E-90, -1E4, -0, 0, 12.78E-2, 12, INF, NaN, 3.4E42 | Double | System.Double |
xsd:dateTime | A particular moment in Coordinated Universal Time, up to an arbitrarily small fraction of a second | 1999-05-31T13:20:00.000-05:00, 1999-05-31T18:20:00.000Z, 1999-05-31T13:20:00.000, 1999-05-31T13:20:00.000-05:32 | DateTime | System.DateTime |
xsd:date | A specific day in history | 0044-03-15, 0001-01-01, 1969-06-27, 2000-10-31, 2001-11-17 | Date | System.DateTime |
xsd:gMonth | A month in no particular year | --01--, --02--, --03--,..., --09--, --10--, --11--, --12-- | Month | System.Int32 |
This table (Table 1) contains a partial list of XSD simple types. You can extend it by including other XML types.
According to the mapping above, for example, the <Points.Double ...> Data Point Series XML element should contain Data Points of xsd:double type for both x and y dimensions, and these points will be loaded as points with System.Double x, y properties.
The Point element itself is something like <Point x="2008-01-01" y="-20"/> with the required x and y attributes.
Shown below is an excerpt from the example input XML data file:
<?xml version="1.0" encoding="utf-8"?>
<Items xmlns="urn:PointSeries-schema">
<Points.Int.Double YName="y=x^2">
<Point x="0" y="0"/>
<Point x="1" y="0.01"/>
...
</Points.Int.Double>
<Points.Date.Int YName="temperature" XName="Date">
<Point x="2008-01-01" y="-20"/>
<Point x="2008-02-01" y="-25"/>
...
</Points.Date.Int>
<Points.Month.Double YName="2008 year month temperatures" XName="Month">
<Point x="--01--" y="-20.8"/>
<Point x="--02--" y="-25.2"/>
...
</Points.Month.Double>
...
</Items>
Note: the point series elements are decorated with the optional YName and XName attributes, which are intended to carry the y and x dimension labels.
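To make the format concrete, here is a minimal sketch of how such a document can be walked with LINQ to XML before any validation is involved. It is not the Loader code from this article: the PointSeriesPeek class and its ReadRaw method are hypothetical names introduced only for illustration, and the attribute values are returned as raw strings. Note that the elements live in the urn:PointSeries-schema namespace, while the x and y attributes are unqualified.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper, for illustration only (not part of the Loader library).
public static class PointSeriesPeek
{
    // Namespace required by the format described above.
    static readonly XNamespace Ns = "urn:PointSeries-schema";

    // Returns (series element local name, x, y) for every Point, as raw strings.
    public static List<Tuple<string, string, string>> ReadRaw(string xml)
    {
        XElement items = XDocument.Parse(xml).Element(Ns + "Items");
        if (items == null)
            throw new InvalidOperationException(
                "Root element Items not found in namespace " + Ns.NamespaceName);

        return (from series in items.Elements()
                from pt in series.Elements(Ns + "Point")
                select Tuple.Create(series.Name.LocalName,
                                    (string)pt.Attribute("x"),
                                    (string)pt.Attribute("y"))).ToList();
    }
}
```

A document whose root element is not in the expected namespace makes Element(Ns + "Items") return null, which is why the namespace check comes for free with the qualified name.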
XML Schema
A generic Data Point Series XML format is defined by an XML schema whose excerpt follows:
<?xml version="1.0" encoding="utf-8"?>
<xs:schema attributeFormDefault="unqualified" elementFormDefault="qualified"
targetNamespace="urn:PointSeries-schema"
xmlns="urn:PointSeries-schema"
xmlns:xs="http://www.w3.org/2001/XMLSchema">
<!-- Root element -->
<xs:element name="Items" type="itemsType"/>
<!-- Root element type -->
<xs:complexType name="itemsType">
<xs:choice maxOccurs="unbounded">
<xs:element name="Points.Int" type="pointsIntIntType"/>
<xs:element name="Points.Int.DateTime" type="pointsIntDttmType"/>
...
<xs:element name="Points.Double" type="pointsDblDblType"/>
<xs:element name="Points.Double.Int" type="pointsDblIntType"/>
...
</xs:choice>
</xs:complexType>
<!-- Point Series Type attributes -->
<xs:attributeGroup name="pointSetAttributes">
<xs:attribute name="YName"
type="xs:string" use="optional" />
<xs:attribute name="XName"
type="xs:string" use="optional" />
</xs:attributeGroup>
<!-- Point Series Types -->
<xs:complexType name="pointsIntIntType">
<xs:sequence>
<xs:element minOccurs="1"
maxOccurs="unbounded" name="Point">
<xs:complexType>
<xs:attribute name="x"
type="xs:int" use="required" />
<xs:attribute name="y"
type="xs:int" use="required" />
</xs:complexType>
</xs:element>
</xs:sequence>
<xs:attributeGroup ref="pointSetAttributes"/>
</xs:complexType>
<xs:complexType name="pointsIntDttmType">
<xs:sequence>
<xs:element minOccurs="1"
maxOccurs="unbounded" name="Point">
<xs:complexType>
<xs:attribute name="x"
type="xs:int" use="required" />
<xs:attribute name="y"
type="xs:dateTime" use="required" />
</xs:complexType>
</xs:element>
</xs:sequence>
<xs:attributeGroup ref="pointSetAttributes"/>
</xs:complexType>
...
<xs:complexType name="pointsDblIntType">
<xs:sequence>
<xs:element minOccurs="1"
maxOccurs="unbounded" name="Point">
<xs:complexType>
<xs:attribute name="x"
type="xs:double" use="required" />
<xs:attribute name="y"
type="xs:int" use="required" />
</xs:complexType>
</xs:element>
</xs:sequence>
<xs:attributeGroup ref="pointSetAttributes"/>
</xs:complexType>
<xs:complexType name="pointsDblDblType">
<xs:sequence>
<xs:element minOccurs="1"
maxOccurs="unbounded" name="Point">
<xs:complexType>
<xs:attribute name="x"
type="xs:double" use="required" />
<xs:attribute name="y"
type="xs:double" use="required" />
</xs:complexType>
</xs:element>
</xs:sequence>
<xs:attributeGroup ref="pointSetAttributes"/>
</xs:complexType>
...
</xs:schema>
This schema defines the <Items ...> root element, whose expected content is defined by the XSD choice group. You should restrict the contents of that choice to just those Data Point Series element types your application expects.
The rest of the schema contains the long list of element type definitions. Each of these types defines the Data Point Series with specific x, y Base Types.
You can define new Base Types in the schema using XML Schema type derivation rules.
Type Mapping
Writing or editing the Data Point Series XML schema by hand is tedious, and requires knowledge of the XML schema specification (see Part 1 and Part 2).
Instead, the schema could be composed on the fly. If you look at the schema excerpt above, you'll see that most of the text is repeated from one type definition to another. The information which varies from one schema to another can be expressed in a much more compact form than the schema itself. All that is required to compose the schema is data like those in Table 1. We should describe the Data Point Series types by defining the Base Types and the mapping between the XSD and CLR types along with the "type string" used to construct the Data Point Series XML element tag name.
Here is an excerpt from an example type mapping XML document:
<?xml version="1.0" encoding="utf-8"?>
<Mappings xmlns="urn:PointSeries-mapping">
<Mapping>
<XAxis xsd-type="int" clr-type="System.Int32" type-string="Int"/>
<YAxis xsd-type="double" type-string="Double"/>
</Mapping>
<Mapping>
<XAxis xsd-type="double" clr-type="System.Double" type-string="Double"/>
<YAxis xsd-type="date" clr-type="System.DateTime" type-string="Date"/>
</Mapping>
<Mapping>
<XAxis xsd-type="double" clr-type="System.Double" type-string="Double"/>
<YAxis xsd-type="gMonth" clr-type="System.Int32" type-string="Month"/>
</Mapping>
...
<Mapping>
<XAxis xsd-type="dateTime" type-string="DateTime"/>
<YAxis xsd-type="double" type-string="Double"/>
</Mapping>
</Mappings>
The root Mappings element declares the urn:PointSeries-mapping XML namespace. It contains one or more Mapping elements.
A Mapping element defines a Data Point Series type. It contains exactly two elements: XAxis for the x dimension, and YAxis for the y dimension.
Every ...Axis element defines the type of the dimension in the world of XML (xsd-type) and the world of .NET (clr-type). The type-string attribute provides the name used to compose the name of the Data Point Series element in the data XML file. For example, the first Mapping element in the snippet above will produce the type definition for the <Points.Int.Double> element. The xsd-type and type-string attributes are required, and the clr-type attribute is optional. If it's omitted, the CLR type is deduced from the default XSD-to-CLR type mapping table hardcoded into the Loader library (it's the same mapping as .NET uses, see Mapping XML Data Types to CLR Types). If it's present, the Loader will attempt to convert the value of the XSD type to the CLR type specified. For example, see the third Mapping element. The default CLR type for the gMonth XSD type is DateTime, but the clr-type attribute value is System.Int32. The Loader will convert the value of the gMonth type to Int32 with the help of the XML Converter class instance, see below. Note that the clr-type attribute value can be a full assembly-qualified type name.
The mapping file must not contain contradictory entries: it must not define two Data Point Series elements with the same element names.
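The naming rule and the "no contradictory entries" rule can be sketched together. The snippet below is only an illustration under our own assumptions: the MappingChecks class and its methods are hypothetical, and the mapping entries are reduced to plain (x type string, y type string) pairs rather than the Mapping class described later.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, for illustration only (not part of the Loader library).
public static class MappingChecks
{
    // Composes the element name as described in the format section:
    // Points.T when both type strings match, Points.X.Y otherwise.
    public static string ElementName(string xTypeString, string yTypeString)
    {
        return xTypeString == yTypeString
            ? "Points." + xTypeString
            : "Points." + xTypeString + "." + yTypeString;
    }

    // Returns the element names that are produced by more than one mapping entry.
    public static List<string> FindDuplicateNames(
        IEnumerable<Tuple<string, string>> typeStringPairs)
    {
        return typeStringPairs
            .Select(p => ElementName(p.Item1, p.Item2))
            .GroupBy(name => name)
            .Where(g => g.Count() > 1)
            .Select(g => g.Key)
            .ToList();
    }
}
```

An empty result means that every mapping entry yields a distinct element name; any name in the list points to entries that would define the same Data Point Series element twice.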
The mapping document is validated against the following schema:
<?xml version="1.0" encoding="utf-8"?>
<xs:schema attributeFormDefault="unqualified" elementFormDefault="qualified"
targetNamespace="urn:PointSeries-mapping"
xmlns="urn:PointSeries-mapping"
xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="Mappings">
<xs:complexType>
<xs:sequence>
<xs:element maxOccurs="unbounded" name="Mapping" type="mappingType"/>
</xs:sequence>
</xs:complexType>
</xs:element>
<xs:complexType name="axisType">
<xs:attribute name="xsd-type" type="xs:string" use="required" />
<xs:attribute name="clr-type" type="xs:string" use="optional" />
<xs:attribute name="type-string" type="xs:string" use="required" />
</xs:complexType>
<xs:complexType name="mappingType">
<xs:sequence>
<xs:element name="XAxis" type="axisType"/>
<xs:element name="YAxis" type="axisType"/>
</xs:sequence>
</xs:complexType>
</xs:schema>
This schema is stored in the Loader library assembly as a resource.
Reading the Data
In the code attached to the article, all data reading code is placed into the Loader class library project producing the XmlDataPointSeries.Loader assembly. The Loader class contains the data reading/parsing code, and the supplementary classes, XsdDataPoint, DataPoint, and DataPointSeries, provide the place to store the results.
/// <summary>
/// Loads <see cref="DataPointSeries"/> collection from an XML file or Stream.
/// </summary>
/// <remarks>
/// Contents of the Data Point Series XML document are validated against an XML schema.
/// <para>That schema is either
/// <list type="number">
/// <item>Prebuilt and stored as the resource.</item>
/// <item>Provided by the user.</item>
/// <item>Dynamically constructed from the contents of XML mapping data.</item>
/// </list>
/// </para>
/// </remarks>
public static class Loader
{
// XML namespace must be used in XML data files.
internal const string dataNamespaceName = "urn:PointSeries-schema";
#region LoadWithSchema
/// <summary>
/// Loads a <see cref="DataPointSeries"/> collection from
/// the <paramref name="dataReader"/> specified with
/// the XML schema provided by <paramref name="schemaReader"/>.
/// </summary>
/// <param name="dataReader">XML DataPointSeries collection
/// <see cref="System.IO.TextReader"/>.</param>
/// <param name="schemaReader">XML schema <see cref="System.Xml.XmlReader"/>.</param>
/// <returns><see cref="DataPointSeries"/> collection.</returns>
/// <exception cref="ValidationException"/>
public static IEnumerable<DataPointSeries>
LoadWithSchema(TextReader dataReader, XmlReader schemaReader){ ... }
/// <summary>
/// Loads <see cref="DataPointSeries"/> collection from
/// the <paramref name="dataStream"/> specified with
/// the XML schema provided by <paramref name="schemaStream"/>.
/// </summary>
/// <param name="dataStream">Input XML data <see cref="System.IO.Stream"/>.</param>
/// <param name="schemaStream">Input XML schema <see cref="System.IO.Stream"/>.</param>
/// <returns><see cref="DataPointSeries"/> collection.</returns>
/// <exception cref="ValidationException"/>
public static IEnumerable<DataPointSeries>
LoadWithSchema(Stream dataStream, Stream schemaStream) { ... }
/// <summary>
/// Loads <see cref="DataPointSeries"/> collection from the
/// <paramref name="dataFileName"/> file specified with
/// the XML schema provided by <paramref name="schemaFileName"/>.
/// </summary>
/// <param name="dataFileName">Name of the data file.</param>
/// <param name="schemaFileName">Name of the schema file.</param>
/// <returns><see cref="DataPointSeries"/> collection.</returns>
/// <exception cref="ValidationException"/>
public static IEnumerable<DataPointSeries>
LoadWithSchema(string dataFileName, string schemaFileName) { ... }
/// <summary>
/// Loads a <see cref="DataPointSeries"/> collection from
/// the XML data <paramref name="dataReader"/>
/// specified with prebuilt XML schema.
/// </summary>
/// <param name="dataReader">DataPointSeries collection
/// XML <see cref="System.IO.TextReader"/>.</param>
/// <returns><see cref="DataPointSeries"/> collection.</returns>
/// <exception cref="ValidationException"/>
public static IEnumerable<DataPointSeries> LoadWithSchema(TextReader dataReader) { ... }
/// <summary>
/// Loads a <see cref="DataPointSeries"/> collection
/// from the XML data file specified with prebuilt XML schema.
/// </summary>
/// <param name="dataStream">Input XML data <see cref="System.IO.Stream"/>.</param>
/// <returns><see cref="DataPointSeries"/> collection.</returns>
/// <exception cref="ValidationException"/>
public static IEnumerable<DataPointSeries> LoadWithSchema(Stream dataStream) { ... }
/// <summary>
/// Loads a <see cref="DataPointSeries"/> collection
/// from the XML data file specified with the prebuilt XML schema.
/// </summary>
/// <param name="fileName">DataPointSeries collection XML file Name.</param>
/// <returns><see cref="DataPointSeries"/> collection.</returns>
/// <exception cref="ValidationException"/>
public static IEnumerable<DataPointSeries> LoadWithSchema(string fileName) { ... }
#endregion LoadWithSchema
#region LoadWithMappings
/// <summary>
/// Loads a <see cref="DataPointSeries"/> collection
/// from the <paramref name="dataReader"/>
/// specified with the mappings provided by the <paramref name="mappingReader"/>.
/// </summary>
/// <param name="dataReader">Input XML data
/// <see cref="System.IO.TextReader"/>.</param>
/// <param name="mappingReader">Input XML mapping
/// <see cref="System.IO.TextReader"/>.</param>
/// <returns><see cref="DataPointSeries"/> collection.</returns>
/// <exception cref="ValidationException"/>
public static IEnumerable<DataPointSeries>
LoadWithMappings(TextReader dataReader, TextReader mappingReader) { ... }
/// <summary>
/// Loads <see cref="DataPointSeries"/> collection from the
/// <paramref name="dataStream"/> specified.
/// </summary>
/// <param name="dataStream">Input XML data
/// <see cref="System.IO.Stream"/>.</param>
/// <param name="mappingStream">Input XML mapping
/// <see cref="System.IO.Stream"/>.</param>
/// <returns><see cref="DataPointSeries"/> collection.</returns>
/// <exception cref="ValidationException"/>
public static IEnumerable<DataPointSeries>
LoadWithMappings(Stream dataStream, Stream mappingStream) { ... }
/// <summary>
/// Loads <see cref="DataPointSeries"/> collection from the file specified.
/// </summary>
/// <param name="dataFileName">Name of the data file.</param>
/// <param name="mappingFileName">Name of the mapping file.</param>
/// <returns><see cref="DataPointSeries"/> collection.</returns>
/// <exception cref="ValidationException"/>
public static IEnumerable<DataPointSeries>
LoadWithMappings(string dataFileName, string mappingFileName) { ... }
#endregion LoadWithMappings
/// <summary>
/// Parses the point series element tag and returns x,y type strings.
/// </summary>
/// <param name="tagName">Tag name.</param>
/// <param name="xType">Output Type of the x-dimension.</param>
/// <param name="yType">Output Type of the y-dimension.</param>
static void getXYTypeStrings(string tagName, out string xType, out string yType)
{
int n = tagName.IndexOf('}');
Debug.Assert(n > 0, "n > 0");
const string pointsTagPrefix = "Points";
int pointsTagPrefixLength = pointsTagPrefix.Length;
Debug.Assert(tagName.Length > n + pointsTagPrefixLength + 1,
"tagName.Length > n + pointsTagPrefixLength + 1");
string xyTypes = tagName.Substring(n + pointsTagPrefixLength + 2);
n = xyTypes.IndexOf('.');
if (n < 0)
{
xType = xyTypes;
yType = xyTypes;
}
else
{
xType = xyTypes.Substring(0, n);
yType = xyTypes.Substring(n + 1);
}
}
}
This class provides the LoadWithSchema and LoadWithMappings method overloads to load Data Point Series XML documents, validating them against the default schema, a user-supplied schema, or a dynamically generated schema.
By design, the LoadWithSchema and LoadWithMappings methods fail on any error that occurs while opening, reading, parsing, or validating a file, and throw either System.IO exceptions or the Loader library's ValidationException containing the error descriptions. All validation errors are returned by the ValidationException.ValidationErrors property; this gives the user a chance to fix all the errors at once.
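The "all errors at once" behavior relies on the fact that a validation callback is invoked once per problem instead of stopping at the first one. Here is a small self-contained sketch of that idea, assuming a tiny stand-in schema (not the article's full schema) and a hypothetical CollectAllErrors class that simply gathers the messages in a list:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Linq;
using System.Xml.Schema;

// Hypothetical helper, for illustration only (not part of the Loader library).
public static class CollectAllErrors
{
    // Stand-in schema: Items contains Point elements with a required xs:int "x".
    const string Xsd =
        @"<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'
                     targetNamespace='urn:PointSeries-schema'
                     xmlns='urn:PointSeries-schema'
                     elementFormDefault='qualified'>
            <xs:element name='Items'>
              <xs:complexType>
                <xs:sequence>
                  <xs:element name='Point' maxOccurs='unbounded'>
                    <xs:complexType>
                      <xs:attribute name='x' type='xs:int' use='required'/>
                    </xs:complexType>
                  </xs:element>
                </xs:sequence>
              </xs:complexType>
            </xs:element>
          </xs:schema>";

    // Returns one message per validation problem found in the document.
    public static List<string> Validate(string xml)
    {
        var schemas = new XmlSchemaSet();
        using (XmlReader schemaReader = XmlReader.Create(new StringReader(Xsd)))
            schemas.Add("urn:PointSeries-schema", schemaReader);

        var messages = new List<string>();
        XDocument doc = XDocument.Parse(xml);
        // The handler is called for every problem; validation continues afterwards.
        doc.Validate(schemas, (sender, e) => messages.Add(e.Message));
        return messages;
    }
}
```

The Loader methods use the same pattern, but accumulate the messages and positions and throw a single ValidationException at the end.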
Load with Schema
The principal LoadWithSchema method overload is:
public static IEnumerable<DataPointSeries>
LoadWithSchema(TextReader dataReader, XmlReader schemaReader)
{
StringBuilder sbErrors = null;
List<ValidationException.ValidationError> errors = null;
// Load and validate the schema.
XmlSchema schema = XmlSchema.Read(schemaReader, (sender, e) =>
{
if (sbErrors == null)
sbErrors = new StringBuilder();
sbErrors.AppendFormat(
"Schema validation error: {1}{0}Line={2}, position={3}{0}",
System.Environment.NewLine, e.Exception.Message,
e.Exception.LineNumber, e.Exception.LinePosition);
if (errors == null)
errors = new List<ValidationException.ValidationError>();
errors.Add(new ValidationException.ValidationError()
{
Message = e.Exception.Message,
Line = e.Exception.LineNumber,
Position = e.Exception.LinePosition
});
});
if (sbErrors != null)
// Validation error(s) occurred.
throw new ValidationException(sbErrors.ToString(), errors.ToArray());
XmlSchemaSet schemaSet = new XmlSchemaSet();
schemaSet.Add(schema);
// Load and validate the data file.
using (XmlReader reader = XmlReader.Create(dataReader))
{
XDocument doc = XDocument.Load(reader);
doc.Validate(schemaSet, (sender, e) =>
{
if (sbErrors == null)
sbErrors = new StringBuilder();
sbErrors.AppendFormat("Validation error: {1}{0}Line={2}, position={3}{0}"
, System.Environment.NewLine, e.Exception.Message
, e.Exception.LineNumber, e.Exception.LinePosition);
if (errors == null)
errors = new List<ValidationException.ValidationError>();
errors.Add(new ValidationException.ValidationError()
{
Message = e.Exception.Message,
Line = e.Exception.LineNumber,
Position = e.Exception.LinePosition
});
}, true);
if (sbErrors != null)
// Validation error(s) occurred.
throw new ValidationException(sbErrors.ToString(), errors.ToArray());
XNamespace xns = dataNamespaceName;
XElement items = doc.Element(xns + "Items");
// Check the root element name (i.e. Items in "urn:PointSeries-schema" xmlns).
//if (items.Name != xns + "Items")
if (items == null)
throw new ValidationException(string.Format("Root element {0} missed",
xns + "Items"));
// Parse the Point.XXX elements.
return items.Elements().Select<XElement, DataPointSeries>(
(item) =>
{
// Parse item tag name for X/Y type strings.
string xType, yType;
getXYTypeStrings(item.Name.ToString(), out xType, out yType);
// Optional attributes.
var yName = item.Attribute("YName");
var xName = item.Attribute("XName");
IXmlSchemaInfo schemaInfo = item.GetSchemaInfo();
XmlSchemaElement e = schemaInfo.SchemaElement;
DataPointSeries series = new DataPointSeries()
{
XName = xName == null ? "" : xName.Value,
XTypeString = xType,
YName = yName == null ? "" : yName.Value,
YTypeString = yType
};
foreach (var pt in item.Elements(xns + "Point"))
{
XAttribute xAttr = pt.Attribute("x");
if (series.XXsdTypeString == null)
series.XXsdTypeString =
xAttr.GetSchemaInfo().SchemaAttribute.SchemaTypeName.Name;
XAttribute yAttr = pt.Attribute("y");
if (series.YXsdTypeString == null)
series.YXsdTypeString =
yAttr.GetSchemaInfo().SchemaAttribute.SchemaTypeName.Name;
series.XsdPoints.Add(new XsdDataPoint((string)xAttr, (string)yAttr));
}
return series;
});
}
}
First, this method loads the XML schema with the XmlSchema schema = XmlSchema.Read() method call. Then, it creates the XmlReader reader object, loads the XML document with XDocument doc = XDocument.Load(reader), and validates the loaded XML with the Validate extension method. If no errors happen at this point, the data is loaded and validated against the schema.
The LoadWithSchema method gets the root element with:
XNamespace xns = dataNamespaceName;
XElement items = doc.Element(xns + "Items");
Note the xns variable: it ensures that the Items element is defined in the right XML namespace. After that, the LoadWithSchema method parses the loaded XML and returns the result with:
return items.Elements().Select<XElement, DataPointSeries>(...)
DataPointSeries instances are created by the lambda statement, which:
- Extracts the data series Base Types from the XElement tag name with the getXYTypeStrings method.
- Gets the optional attributes.
- Creates the instance of the DataPointSeries class. Extracts the DataPointSeries class instance XXsdTypeString and YXsdTypeString property values from the post-validation IXmlSchemaInfo instances associated with a Point element's x and y attributes. The XClrType and YClrType property values are left null.
- Fills that instance's XsdPoints property with the collection of points.
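The first step of the list above, splitting the tag name into type strings, can also be sketched without searching for the '}' of the expanded name, because XName already exposes the local part. This is an alternative illustration, not the article's getXYTypeStrings method; the TypeStrings class and FromName method are hypothetical names.

```csharp
using System;
using System.Xml.Linq;

// Hypothetical helper, for illustration only (not part of the Loader library).
public static class TypeStrings
{
    // "Points.Int.Double" -> ("Int", "Double"); "Points.Double" -> ("Double", "Double").
    public static Tuple<string, string> FromName(XName name)
    {
        const string prefix = "Points.";
        string local = name.LocalName;   // element name without the namespace part
        if (!local.StartsWith(prefix, StringComparison.Ordinal) ||
            local.Length == prefix.Length)
            throw new ArgumentException(
                "Not a Data Point Series element name: " + local);

        string types = local.Substring(prefix.Length);
        int dot = types.IndexOf('.');
        return dot < 0
            ? Tuple.Create(types, types)                     // same Base Type for x and y
            : Tuple.Create(types.Substring(0, dot), types.Substring(dot + 1));
    }
}
```

Unlike the Debug.Assert checks in the original method, this variant throws on an unexpected name in release builds as well; whether that is desirable depends on how much you trust the preceding schema validation.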
Some of the LoadWithSchema method overloads have just one argument. These overloads use the default schema stored as a resource in the Loader assembly.
Load with Mappings
The principal LoadWithMappings method overload is:
public static IEnumerable<DataPointSeries>
LoadWithMappings(TextReader dataReader, TextReader mappingReader)
{
// Load mappings.
List<Mapping> mappings = Mapping.Load(mappingReader);
// Prepare XmlReaderSettings for input file validation.
XmlReaderSettings settings = new XmlReaderSettings();
settings.ValidationType = ValidationType.Schema;
settings.Schemas.Add(SchemaBuilder.Build(mappings));
StringBuilder sbErrors = null;
List<ValidationException.ValidationError> errors = null;
settings.ValidationEventHandler += (sender, e) =>
{
if (sbErrors == null)
sbErrors = new StringBuilder();
sbErrors.AppendFormat(
"Validation error: {1}{0}Line={2}, position={3}{0}",
System.Environment.NewLine, e.Exception.Message,
e.Exception.LineNumber, e.Exception.LinePosition);
if (errors == null)
errors = new List<ValidationException.ValidationError>();
errors.Add(new ValidationException.ValidationError()
{
Message = e.Exception.Message,
Line = e.Exception.LineNumber,
Position = e.Exception.LinePosition
});
};
// Load and validate the file.
using (XmlReader reader = XmlReader.Create(dataReader, settings))
{
XElement items = XElement.Load(reader);
if (sbErrors != null)
// Validation error(s) occurred.
throw new ValidationException(sbErrors.ToString(), errors.ToArray());
XNamespace xns = dataNamespaceName;
// Check the root element name (i.e. Items in "urn:PointSeries-schema" xmlns).
if (items.Name != xns + "Items")
throw new ValidationException(string.Format("Root element {0} missed",
xns + "Items"));
// Parse the Point.XXX elements.
return items.Elements().Select<XElement, DataPointSeries>(
(item) =>
{
// Parse item tag name for X/Y type strings.
string xType, yType;
getXYTypeStrings(item.Name.ToString(), out xType, out yType);
// Dot-separated type string.
string xyType = xType == yType ? xType : xType + "." + yType;
Mapping map = (from mapItem in mappings
where mapItem.DotSeparatedTypeString == xyType
select mapItem).Single();
// Optional attributes.
var yName = item.Attribute("YName");
var xName = item.Attribute("XName");
DataPointSeries series = new DataPointSeries()
{
XName = xName == null ? "" : xName.Value,
XXsdTypeString = map.XAxis.XsdTypeString,
XClrType = map.XAxis.ClrType,
XTypeString = map.XAxis.TypeString,
YName = yName == null ? "" : yName.Value,
YXsdTypeString = map.YAxis.XsdTypeString,
YClrType = map.YAxis.ClrType,
YTypeString = map.YAxis.TypeString
};
foreach (var pt in item.Elements(xns + "Point"))
{
series.XsdPoints.Add(new XsdDataPoint((string)pt.Attribute("x"),
(string)pt.Attribute("y")));
}
return series;
});
}
}
The LoadWithMappings method calls List<Mapping> mappings = Mapping.Load(mappingReader) to load the mappings from the mapping reader provided, and SchemaBuilder.Build(mappings) to build the schema from them (see later). Then, it creates the validating XmlReader instance with the XmlReader.Create(dataReader, settings) call, and loads the XML into memory with XElement items = XElement.Load(reader). If no errors happen at this point, the data is loaded and validated against the generated schema.
Then, the LoadWithMappings method parses the loaded XML and returns the result with:
return items.Elements().Select<XElement, DataPointSeries>(...)
DataPointSeries instances are created by the lambda statement, which:
- Extracts the data series Base Types from the XElement tag name with the getXYTypeStrings method.
- Gets the Mapping class instance associated with the XElement.
- Gets the optional attributes.
- Creates the instance of the DataPointSeries class. That instance's XXsdTypeString, YXsdTypeString, XClrType, and YClrType property values are taken from the Mapping class instance.
- Fills that instance's XsdPoints property with the collection of points.
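Unlike LoadWithSchema, which validates an already loaded XDocument, LoadWithMappings validates while reading, through XmlReaderSettings. The sketch below isolates that technique. It assumes the schema is supplied as a string instead of being built by SchemaBuilder, and the ValidatingLoad class is a hypothetical name used only here.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Linq;

// Hypothetical helper, for illustration only (not part of the Loader library).
public static class ValidatingLoad
{
    // Loads the XML through a validating reader; problems are appended to "errors".
    public static XElement Load(string xml, string xsd, List<string> errors)
    {
        var settings = new XmlReaderSettings { ValidationType = ValidationType.Schema };
        using (XmlReader schemaReader = XmlReader.Create(new StringReader(xsd)))
        {
            // Passing null means: use the targetNamespace declared in the schema itself.
            settings.Schemas.Add(null, schemaReader);
        }
        settings.ValidationEventHandler += (sender, e) => errors.Add(e.Message);

        using (XmlReader reader = XmlReader.Create(new StringReader(xml), settings))
        {
            // Validation events fire while XElement.Load consumes the reader.
            return XElement.Load(reader);
        }
    }
}
```

Because the handler is attached to the reader settings, the error list must be inspected after the load has finished, which is exactly what LoadWithMappings does before it starts parsing the elements.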
Constructing the XML Schema from the Mappings XML
The XML-CLR-string type mapping XML document and its associated XML schema are described above. In the code, this mapping is represented by two classes.
The first one represents the XML-CLR-string type mapping in one dimension:
/// <summary>
/// XML-CLR-string type mapping in one dimension.
/// </summary>
public class AxisMapping
{
/// <summary>
/// Initializes a new instance of the <see cref="AxisMapping"/> class.
/// </summary>
/// <param name="xsdType">XML Type Name.</param>
/// <param name="clrType">CLR Type Name.</param>
/// <param name="typeString">The type string.</param>
public AxisMapping(string xsdType, string clrType, string typeString)
{
XsdTypeString = xsdType;
ClrType = string.IsNullOrEmpty(clrType) ? null : Type.GetType(clrType);
TypeString = typeString;
}
/// <summary>
/// Gets XML atomic type name like "double" or "gMonth".
/// </summary>
/// <value>XML atomic type name string.</value>
/// <remarks>XSD type name string doesn't
/// contain a namespace prefix.</remarks>
public string XsdTypeString { get; private set; }
/// <summary>
/// Gets the CLR type.
/// </summary>
/// <value>The CLR type or <c>null</c>.</value>
public Type ClrType { get; private set; }
/// <summary>
/// Gets the "type string" assigned to this mapping like Double, Int, etc.
/// </summary>
/// <value>The type string.</value>
/// <remarks>"Type string" is used in XML
/// schema construction.</remarks>
public string TypeString { get; private set; }
...
}
The second one contains the XML-CLR-string type mapping for the x and y dimensions, and defines some Load method overloads to load the mappings from the mappings XML document:
/// <summary>
/// X,Y dimensions XML-CLR-string type mappings.
/// </summary>
/// <remarks>
/// <see cref="IEquatable{Mapping}"/> interface
/// implemented for use with the Distinct() LINQ operator.
/// <para>In order to compare the elements,
/// the Distinct operator uses the elements'
/// implementation of the IEquatable&lt;T&gt;.Equals method if the elements
/// implement the IEquatable&lt;T&gt; interface.
/// It uses their implementation of the
/// Object.Equals method otherwise.</para>
/// </remarks>
public class Mapping : IEquatable<Mapping>
{
public AxisMapping XAxis { get; private set; }
public AxisMapping YAxis { get; private set; }
/// <summary>
/// Loads XML-CLR-string type mappings from the
/// <see cref="System.IO.TextReader"/> specified.
/// </summary>
/// <param name="mappingReader">Input Mapping XML
/// <see cref="System.IO.TextReader"/>.</param>
/// <returns>List of <see cref="Mapping"/> objects.</returns>
/// <remarks>
/// <see cref="Mapping"/> instances
/// with recurring <see cref="TypeString"/> property
/// values are removed from output.
/// </remarks>
/// <exception cref="ValidationException"/>
/// <exception cref="RecurringMappingEntriesException"/>
public static List<Mapping> Load(TextReader mappingReader) { ... }
/// <summary>
/// Loads XML-CLR-string type mappings from the
/// <see cref="System.IO.Stream"/> specified.
/// </summary>
/// <param name="stm">Input Mapping XML data
/// <see cref="System.IO.Stream"/>.</param>
/// <returns>List of <see cref="Mapping"/> objects.</returns>
/// <remarks>
/// <see cref="Mapping"/> instances with
/// recurring <see cref="TypeString"/> property
/// values are removed from output.
/// </remarks>
/// <exception cref="ValidationException"/>
/// <exception cref="RecurringMappingEntriesException"/>
public static List<Mapping> Load(Stream stm) { ... }
/// <summary>
/// Loads XML-CLR-string type mappings from the file specified.
/// </summary>
/// <param name="mappingFileName">Mapping file name.</param>
/// <returns>List of <see cref="Mapping"/> objects.</returns>
/// <remarks>
/// <see cref="Mapping"/> instances with recurring
/// <see cref="TypeString"/> property values are removed from output.
/// </remarks>
/// <exception cref="ValidationException"/>
/// <exception cref="RecurringMappingEntriesException"/>
public static List<Mapping> Load(string mappingFileName) { ... }
...
}
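The remarks on the class above rely on how Distinct() compares elements. The following is a minimal sketch of that pattern, using a hypothetical TypePair class rather than the library's Mapping class. Distinct() groups elements by GetHashCode() before calling Equals, so the sketch overrides GetHashCode() consistently with Equals; whether the real Mapping class does the same is an assumption, since that part of the source is elided.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a mapping entry; not the library's Mapping class.
public sealed class TypePair : IEquatable<TypePair>
{
    public string XType { get; private set; }
    public string YType { get; private set; }

    public TypePair(string xType, string yType)
    {
        XType = xType;
        YType = yType;
    }

    // Used by Distinct() for elements whose hash codes match.
    public bool Equals(TypePair other)
    {
        return other != null && XType == other.XType && YType == other.YType;
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as TypePair);
    }

    // Must agree with Equals: equal objects return equal hash codes.
    public override int GetHashCode()
    {
        unchecked
        {
            return (XType ?? string.Empty).GetHashCode() * 31
                 + (YType ?? string.Empty).GetHashCode();
        }
    }
}

public static class DistinctDemo
{
    // Removes entries that compare equal through IEquatable<TypePair>.
    public static List<TypePair> RemoveDuplicates(IEnumerable<TypePair> pairs)
    {
        return pairs.Distinct().ToList();
    }

    public static void Main()
    {
        var pairs = new[]
        {
            new TypePair("Double", "Double"),
            new TypePair("DateTime", "Int"),
            new TypePair("Double", "Double") // duplicate of the first entry
        };
        foreach (TypePair p in RemoveDuplicates(pairs))
            Console.WriteLine(p.XType + "." + p.YType);
    }
}
```

If GetHashCode() were left at its default, two equal TypePair instances would usually land in different hash buckets and Distinct() would keep both.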
The principal Load method overload is as follows:
public static List<Mapping> Load(TextReader mappingReader)
{
// XML Mapping Schema resource name.
const string mappingSchemaResourceName = "typemappings.xsd";
// XML namespace must be used in XML mappings files.
const string mappingNamespaceName = "urn:PointSeries-mapping";
// Mapping element attributes.
const string attrNameXsdType = "xsd-type"
, attrNameClrType = "clr-type"
, attrNameTypeString = "type-string";
// Get the XML schema stream from the "mappingSchemaResourceName" resource.
Assembly assembly = Assembly.GetAssembly(typeof(Loader));
ResourceManager rm = new ResourceManager(assembly.GetName().Name +
".g", assembly);
using (XmlTextReader schemaReader =
new XmlTextReader(rm.GetStream(mappingSchemaResourceName)))
{
// Prepare XmlReaderSettings for input file validation.
XmlReaderSettings settings = new XmlReaderSettings();
settings.ValidationType = ValidationType.Schema;
settings.Schemas.Add(mappingNamespaceName, schemaReader);
StringBuilder sbErrors = null;
List<ValidationException.ValidationError> errors = null;
settings.ValidationEventHandler += (sender, e) =>
{
if (sbErrors == null)
sbErrors = new StringBuilder();
sbErrors.AppendFormat(
"Validation error: {1}{0}Line={2}, position={3}{0}",
System.Environment.NewLine, e.Exception.Message,
e.Exception.LineNumber, e.Exception.LinePosition);
if (errors == null)
errors = new List<ValidationException.ValidationError>();
errors.Add(new ValidationException.ValidationError()
{
Message = e.Exception.Message,
Line = e.Exception.LineNumber,
Position = e.Exception.LinePosition
});
};
// Load and validate the file.
using (XmlReader reader = XmlReader.Create(mappingReader, settings))
{
XElement mappings = XElement.Load(reader);
if (sbErrors != null)
// Validation error(s) occurred.
throw new ValidationException("Mapping file validation errors\n"
+ sbErrors.ToString(), errors.ToArray());
XNamespace xns = mappingNamespaceName;
// Check the root element name
// (i.e. Mappings in "urn:PointSeries-mapping" xmlns).
if (mappings.Name != xns + "Mappings")
throw new ValidationException(string.Format("Root element {0} is missing",
xns + "Mappings"));
// Parse the Mapping elements.
List<Mapping> mappingList = (from mapping in mappings.Elements(xns + "Mapping")
let xAxis = mapping.Element(xns + "XAxis")
let yAxis = mapping.Element(xns + "YAxis")
select new Mapping()
{
XAxis = new AxisMapping((string)xAxis.Attribute(attrNameXsdType),
(string)xAxis.Attribute(attrNameClrType),
(string)xAxis.Attribute(attrNameTypeString)),
YAxis = new AxisMapping((string)yAxis.Attribute(attrNameXsdType),
(string)yAxis.Attribute(attrNameClrType),
(string)yAxis.Attribute(attrNameTypeString))
}
).ToList();
// Check result for recurring entries.
List<Mapping> recurring = new List<Mapping>();
for (int i = 0; i < mappingList.Count - 1; i++)
{
Mapping map = mappingList[i];
for (int j = i + 1; j < mappingList.Count; j++)
{
Mapping map1 = mappingList[j];
if (map.DotSeparatedTypeString == map1.DotSeparatedTypeString)
recurring.Add(map1);
}
}
if (recurring.Count > 0)
{
StringBuilder sb =
new StringBuilder("Recurring entries found in the mapping file:");
foreach (Mapping map in recurring)
{
sb.Append(System.Environment.NewLine + map.ToString());
}
throw new RecurringMappingEntriesException(sb.ToString(),
recurring.ToArray());
}
return mappingList;
}
}
}
The mapping XML schema file is stored in the Loader library assembly as a resource. The Load method gets it with the ResourceManager and uses it to prepare the XmlReaderSettings instance for loading the mappings XML document with validation. Then, the Load method loads the mappings XML with XmlReader and converts its content to a collection of Mapping objects with a LINQ query. Finally, it checks whether the collection contains recurring entries and, if so, throws a RecurringMappingEntriesException.
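To make the parsing and the recurring-entries check more concrete, here is a self-contained sketch that skips schema validation and works on an inline sample. The element names, attribute names, and namespace come from the Load method above; the attribute values in the sample, the MappingSketch class, and the "X.Y" key format (standing in for DotSeparatedTypeString, whose definition is not shown) are assumptions. It uses GroupBy instead of the nested loops, which is a substitute technique rather than the article's code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class MappingSketch
{
    // Namespace taken from the Load method; mapping files must use it.
    private static readonly XNamespace Ns = "urn:PointSeries-mapping";

    // Sample document; the attribute values are illustrative assumptions.
    public const string SampleXml =
        @"<Mappings xmlns='urn:PointSeries-mapping'>
            <Mapping>
              <XAxis xsd-type='double' clr-type='System.Double' type-string='Double'/>
              <YAxis xsd-type='int' clr-type='System.Int32' type-string='Int'/>
            </Mapping>
            <Mapping>
              <XAxis xsd-type='dateTime' clr-type='System.DateTime' type-string='DateTime'/>
              <YAxis xsd-type='double' clr-type='System.Double' type-string='Double'/>
            </Mapping>
            <Mapping>
              <XAxis xsd-type='double' clr-type='System.Double' type-string='Double'/>
              <YAxis xsd-type='int' clr-type='System.Int32' type-string='Int'/>
            </Mapping>
          </Mappings>";

    // Returns the "X.Y" type strings that occur more than once.
    public static List<string> FindRecurring(string xml)
    {
        XElement root = XElement.Parse(xml);
        return root.Elements(Ns + "Mapping")
            .Select(m =>
                (string)m.Element(Ns + "XAxis").Attribute("type-string")
                + "."
                + (string)m.Element(Ns + "YAxis").Attribute("type-string"))
            .GroupBy(key => key)
            .Where(g => g.Count() > 1)
            .Select(g => g.Key)
            .ToList();
    }

    public static void Main()
    {
        foreach (string key in FindRecurring(SampleXml))
            Console.WriteLine("Recurring entry: " + key);
    }
}
```

Grouping reports each recurring key once, whereas the nested loops add an entry for every matching pair, so an entry repeated three times is listed more than once there.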
Loaded Data Representation
The loaded data is stored in a collection of DataPointSeries objects.
The DataPointSeries class contains the properties describing the x, y dimension types in terms of both XML and the CLR. The loaded points are returned as a Collection<XsdDataPoint> by the DataPointSeries.XsdPoints property. The XsdDataPoint structure stores the x, y point coordinate values as strings, in the same form as they appear in the input XML file.
To get typed Data Points, use the
public IEnumerable<DataPoint> GetPoints(IXmlTypeConverter converter)
method, which converts the XsdDataPoint x, y string values to specific CLR types with the help of the XML-to-CLR type converter provided by the caller. As an alternative, you can use the GetPoints method overload without parameters. It uses the default converter hardcoded into the Loader assembly.
Note that the DataPoint class stores x, y values in fields of the System.Object type. We could aim for more type safety, but with C# 3.0 we would sooner or later be forced to pass such values around as System.Object and use Reflection to work with them. Let's wait for C# 4.0 dynamic types.
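As an illustration of what an XML-to-CLR converter has to do, the sketch below turns XSD lexical values into boxed CLR values with System.Xml.XmlConvert. The XsdValueConverter class, its ToClr method, and the handful of supported type names are hypothetical; the members of the library's converter interface are not shown in this article, so this is not its implementation.

```csharp
using System;
using System.Xml;

public static class XsdValueConverter
{
    // Converts an XSD lexical value to a boxed CLR value.
    // Only an illustrative subset of XSD atomic types is handled.
    public static object ToClr(string xsdTypeName, string value)
    {
        switch (xsdTypeName)
        {
            case "double":
                return XmlConvert.ToDouble(value);
            case "int":
                return XmlConvert.ToInt32(value);
            case "boolean":
                return XmlConvert.ToBoolean(value);
            case "dateTime":
                // RoundtripKind preserves the time zone information, if any.
                return XmlConvert.ToDateTime(
                    value, XmlDateTimeSerializationMode.RoundtripKind);
            case "string":
                return value;
            default:
                throw new NotSupportedException(
                    "Unsupported XSD type: " + xsdTypeName);
        }
    }

    public static void Main()
    {
        // The values are returned as System.Object, as in the DataPoint class.
        object x = ToClr("dateTime", "2009-04-16T00:00:00Z");
        object y = ToClr("double", "1.5");
        Console.WriteLine(x.GetType().Name + ", " + y.GetType().Name);
    }
}
```

XmlConvert parses the culture-invariant XSD lexical forms, which is why it is preferable here to culture-sensitive calls such as double.Parse without an explicit format provider.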
Using the Code
The code attached to this article contains the Visual Studio 2008 SP1 solution targeted at .NET Framework 3.5, with three projects. The main part is the Loader class library project described above.
The other two projects are the simple Console applications which load the data from the XML file pointed to by the first command line argument and (for the second project) the mapping XML file pointed to by the second argument. They either report errors, or display the results of the XML data parsing. The sample input files for these applications are in the root solution directory.
Pay attention to the Unit Test project. It contains tests for many Data Point Series types, and its examples show which data the XML format supports and what that data should look like.
History
- 9th April, 2009: Initial post.
- 16th April, 2009: Second article revision with the following additions:
- Added support for on-the-fly XML schema generation.
- The Loader class interface modified to load Data Points Series XML data with either the default schema, the schema provided by the caller, or the schema generated from the type mappings XML file.
- The IXmlConverter interface and its default implementation added.
- The DataPointSeries class interface modified to return the results of the Data Points Series XML data parsing as either a collection of raw XsdDataPoint objects or typesafe DataPoint objects.