Parsing and Interpreting XSD Using LINQ

6 May 2012, CPOL, 2 min read
How to use a Linq2Xsd generated object to directly manipulate XmlSchema

Background

For all its verbosity, slow parsing and ambiguous data layout, XML is still one of just a few openly enforceable data transport mechanisms in the coder's tool chest. What makes it enforceable is XSD, or XML Schema. Schema interpretation has traditionally been a black art. I wrote an earlier article on using the MS Schema Object Model (SOM). As soon as I finished it, I realized that the SOM has many foibles: it is an exceedingly weakly typed and awkward data model. The pattern "get, is type, cast, use" repeats over and over and introduces misinterpretation bugs.

This article is a more modern, far more robust rework of that first attempt, using a strongly typed, more general approach.
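
To make the "get, is type, cast, use" complaint concrete, here is a minimal sketch of what walking a schema with the SOM classes in System.Xml.Schema tends to look like. The inline schema and the names SomPatternDemo and ChildElementNames are made up for illustration; they are not taken from the earlier article or the source zip.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Schema;

public static class SomPatternDemo
{
    // Tiny illustrative schema (hypothetical, not from the article's download).
    public const string Xsd = @"<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>
  <xs:element name='Cub'>
    <xs:complexType>
      <xs:sequence>
        <xs:element name='Id' type='xs:int'/>
        <xs:element name='First' type='xs:string'/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>";

    // Returns the names of the child elements declared inside each
    // top-level element that has an inline complex type with a sequence.
    public static List<string> ChildElementNames(string xsd)
    {
        var names = new List<string>();
        XmlSchema schema;
        using (var reader = XmlReader.Create(new StringReader(xsd)))
            schema = XmlSchema.Read(reader, null);

        foreach (XmlSchemaObject item in schema.Items)            // get
        {
            var element = item as XmlSchemaElement;                // is type + cast
            if (element == null) continue;

            var complex = element.SchemaType as XmlSchemaComplexType;
            if (complex == null) continue;

            var sequence = complex.Particle as XmlSchemaSequence;
            if (sequence == null) continue;

            foreach (XmlSchemaObject child in sequence.Items)
            {
                var childElement = child as XmlSchemaElement;
                if (childElement != null)
                    names.Add(childElement.Name);                  // use
            }
        }
        return names;
    }
}
```

Every level of the tree needs its own cast and null check, and a forgotten branch (a choice instead of a sequence, say) is silently skipped. That is the kind of misinterpretation bug a strongly typed model is meant to reduce.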

Schema-Schema

XmlSchema.xsd defines the format of schema files themselves: it is the schema for schemas. LinqToXsd is a powerful generator that turns an XSD into LINQ-friendly code. XmlSchema.xsd is a little special in that its core definitions are handled by including files written in the much older definition language DTD (XMLSchema.dtd & datatypes.dtd).

C:\projects>linqtoxsd.exe xmlschema.xsd
[Microsoft (R) .NET Framework, Version v4.0.30319]
Generated xmlschema.cs... 

See XmlSchema.cs in source zip

Parsing

Parsing is not quite as trivial as "just loading": you need to assemble the source XSD you are reading yourself. That means recursively loading the XSDs it imports or includes and building their object maps.

C#
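/// <summary>
///  Load a set of schema files, following their imports and includes via the
///  supplied resolver, and merge everything into one schema object
/// </summary>
/// <param name="files">Paths of the schema files to load</param>
/// <param name="resolver">Maps (including file, schemaLocation) to a loadable path, or null to skip</param>
/// <returns>A single schema holding the merged definitions</returns>
public static schema Load(IEnumerable<string> files, IncludedFileResolver resolver)
{
    schema mtr = new schema();
    foreach (var fil in files)
    {
        var sch = schema.Load(fil);
        Merge(mtr, sch);

        // Combine import & include
        var incs = sch.import.Select(q => q.schemaLocation).ToList();
        incs.AddRange(sch.include.Select(s => s.schemaLocation));

        // Resolve once; locations the resolver cannot find come back as null and are dropped
        var resolved = incs.Select(inc => resolver(fil, inc))
            .Where(q => null != q)
            .ToList();

        // Recurse into the referenced files, passing the resolver along.
        // Note: nothing here guards against circular includes.
        if (resolved.Any())
            Merge(mtr, Load(resolved, resolver));
    }

    return mtr;
}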
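
The listing does not show IncludedFileResolver itself. Judging only from the call resolver(fil, inc) and the null filter, it appears to take the including file and the schemaLocation and return a path or null. A minimal sketch under that assumption follows; the delegate signature, the Resolvers class and RelativeToIncludingFile are all hypothetical stand-ins, not the definitions from the source zip.

```csharp
using System;
using System.IO;

// Assumed signature, inferred from the call resolver(fil, inc) in Load.
public delegate string IncludedFileResolver(string includingFile, string schemaLocation);

public static class Resolvers
{
    // Resolves schemaLocation relative to the directory of the including file.
    // Returns the full path if the file exists, otherwise null so that Load
    // skips it.
    public static string RelativeToIncludingFile(string includingFile, string schemaLocation)
    {
        if (string.IsNullOrEmpty(schemaLocation))
            return null;

        string baseDir = Path.GetDirectoryName(Path.GetFullPath(includingFile));
        string candidate = Path.GetFullPath(Path.Combine(baseDir, schemaLocation));
        return File.Exists(candidate) ? candidate : null;
    }
}
```

With a resolver of this shape, a call might look like Load(new[] { "derby.xsd" }, Resolvers.RelativeToIncludingFile), where derby.xsd is a placeholder file name. A resolver that also consults a catalog or a download cache could be swapped in without touching Load.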

Storing the Objects

As we walk through the schema, it's important that we merge object references as we go.

/// <summary>
///  Combines multiple schemas into one large schema object
/// </summary>
/// <param name="mstr">Master schema that receives the merged definitions</param>
/// <param name="src">Schema whose definitions are merged in</param>
private static void Merge(schema mstr, schema src)
{
    mstr.element = MergeList<element>(mstr.element, src.element);
    mstr.attribute = MergeList<attribute>(mstr.attribute, src.attribute);
    mstr.complexType = MergeList<complexType>(mstr.complexType, src.complexType);
    mstr.simpleType = MergeList<simpleType>(mstr.simpleType, src.simpleType);
    mstr.include = MergeList<include>(mstr.include, src.include);
    mstr.import = MergeList<import>(mstr.import, src.import);
}

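/// <summary>
///  Generic list merge. If dest is null, src is returned by reference;
///  otherwise the items of src that are not already in dest are appended.
/// </summary>
/// <typeparam name="T">Item type of the lists</typeparam>
/// <param name="dest">List to merge into (may be null)</param>
/// <param name="src">List to merge from (may be null)</param>
/// <returns>The merged list</returns>
private static IList<T> MergeList<T>(IList<T> dest, IList<T> src)
{
    if (null == dest)
        return src;

    // Snapshot the new items with ToList() before adding, so dest is not
    // modified while it is still being queried
    if (null != src && src.Count > 0)
        src.Where(q => !dest.Contains(q)).ToList().ForEach(s => dest.Add(s));

    return dest;
}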
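
To see the duplicate handling in isolation, the following sketch runs the same merge logic over plain strings. MergeDemo and Example are hypothetical names, and MergeList is copied here as a public method only so the snippet stands alone.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MergeDemo
{
    // Standalone copy of the MergeList logic shown above.
    public static IList<T> MergeList<T>(IList<T> dest, IList<T> src)
    {
        if (dest == null)
            return src;

        if (src != null)
            foreach (var item in src.Where(q => !dest.Contains(q)).ToList())
                dest.Add(item);

        return dest;
    }

    // Merges two overlapping lists; "Group" appears in both but is kept once.
    public static IList<string> Example()
    {
        IList<string> first = new List<string> { "Cub", "Group" };
        IList<string> second = new List<string> { "Group", "Race" };
        return MergeList(first, second);   // Cub, Group, Race
    }
}
```

Strings compare by value, so the duplicate is caught. Contains uses the item type's default equality, so for classes that do not override Equals only the very same object instance counts as a duplicate; whether the generated schema classes override it is not something this sketch assumes.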

And through the magic of LinqToXsd, that's it for parsing.

Interpretation; but what does it mean?!?

What XML Schema means is non-trivial. Rather than taking a stab at describing it, I'll reference better sources:

Doing Something Useful

We've loaded all the objects, mapped them and found useful purposes for them. Now let's re-interpret them into something different, yet still meaningful. In this case, I created a "new" language I call "SKA" (from SKemA). It's just a more natural, C-like re-interpretation of XSD for demonstration.

Some of the basic rules in English:

  • A symbol on its own line is the root element.
  • "Type [name] {" is a complex type definition. The same goes for Choice, Enum, etc.
  • Simple types are expanded out into their constituent base element.
  • Elements are default typed to a complex type with the same name + "Info".

//
// Ska(c) 2011, Bruce Meacham - An intuitive Xml Schema language
// DO NOT EDIT - This is file was generated on 5/4/2012 6:51:18 AM by ska.exe
//


Type CubInfo {
	!Id 
	!First 
	!Last 
	@Place 
	}

Type GroupInfo {
	Cub [0-n]
	@Name 
	}

Type RacersInfo {
	Group [0-n]
	}

Type ResultInfo {
	!CubId 
	!Time 
	}

Type RaceInfo {
	Result [0-n]
	}

Type RacesInfo {
	Race [0-n]
	}

Type DerbyInfo {
	Racers 
	Races 
	}

Derby

The language gains readability and succinctness at the expense of self-defined XML formatting and the robustness of the formal XmlSchema standard. Try loading some more complex examples. The IRS 1040 MeF Schema was my hard-core test schema, and it is quite interesting to see in Ska.

If you run the example program provided, it will produce this output.
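
As a rough illustration of the translation step, the sketch below emits Ska-like lines for named complex types and top-level elements. It is not the ska.exe implementation: it reads the raw XSD with LINQ to XML instead of the LinqToXsd-generated classes so that it can run on its own, SkaSketch and Emit are hypothetical names, and it leaves out the !, @ and [0-n] markers because their exact rules are not spelled out above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class SkaSketch
{
    static readonly XNamespace Xs = "http://www.w3.org/2001/XMLSchema";

    // Turns an XSD document into a list of Ska-like lines.
    public static List<string> Emit(string xsd)
    {
        var lines = new List<string>();
        XElement root = XDocument.Parse(xsd).Root;

        // Rule: "Type [name] {" is a complex type definition.
        foreach (XElement type in root.Elements(Xs + "complexType"))
        {
            lines.Add("Type " + (string)type.Attribute("name") + " {");
            foreach (XElement child in type.Descendants(Xs + "element"))
            {
                // Use the declared name, falling back to a ref if present.
                string name = (string)child.Attribute("name")
                              ?? (string)child.Attribute("ref");
                lines.Add("\t" + name);
            }
            lines.Add("\t}");
        }

        // Rule: a symbol on its own line is the root element.
        foreach (XElement element in root.Elements(Xs + "element"))
            lines.Add((string)element.Attribute("name"));

        return lines;
    }
}
```

Fed a schema that declares a complex type DerbyInfo with child elements Racers and Races plus a top-level element Derby, this yields the "Type DerbyInfo {" block followed by "Derby" on its own line, matching the shape of the output above.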

Summary

Like my last article on schemas, this is really just a starting point for many XSD-related capabilities: code generators, translators or validators.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Engineer Big Company
United States
My professional career began as a developer fixing bugs on Microsoft Word97 and I've been fixing bad habits ever since. Now I do R&D work writing v1 line of business applications mostly in C#/.Net.

I've been an avid pilot/instructor for 13+ years, I've built two airplanes and mostly fly gliders now for fun. I commute in an all-electric 1986 BMW 325 conversion.

I'd like to get back to my academic roots of programming 3D analysis applications to organize complex systems.
