SequelMax.NET: C# XML SAX Parser





New SAX parsing model comes to .NET!
SequelMax is an improved SAX model in which the programmer registers a reading delegate for each XML element of interest. The SequelMax.NET engine is ported from the C++ library; the C++ SequelMax is in turn ported and modified from the Portable Elmax DOM engine, so in practice SequelMax.NET uses the same engine. Instead of creating and populating an internal DOM tree, this parsing engine invokes the registered delegates during parsing. A JavaScript edition, SequelMax.js, is in the cards.
This is the XML we are going to use for our article example. Notice the last employee does not have a comment.
<?xml version="1.0" encoding="UTF-8"?>
<Employees>
  <Employee EmployeeID="1286" SupervisorID="666">
    <Name>Amanda Dion</Name>
    <Salary>2200</Salary>
    <Gender>Female</Gender>
    <!--Hardworking employee!-->
  </Employee>
  <Employee EmployeeID="1287" SupervisorID="666">
    <Name>John Smith</Name>
    <Salary>3200</Salary>
    <Gender>Male</Gender>
    <!--Hardly working employee!-->
  </Employee>
  <Employee EmployeeID="1288" SupervisorID="666">
    <Name>Sheldon Cohn</Name>
    <Salary>5600</Salary>
    <Gender>Male</Gender>
  </Employee>
</Employees>
We use the Employee class to hold the data from the XML.
class Employee
{
    public int EmployeeID;
    public int SupervisorID;
    public string Name;
    public string Gender;
    public double Salary;
    public string Comment;
}
The parsing code to read the employee XML is listed below. To read an element, a delegate has to be registered for it; an anonymous lambda can serve as the delegate. Each element path and its delegate are stored in a dictionary in the Document class, and the parsing engine invokes the delegate whenever the element path is matched. To do the matching, the current element path is generated from a LIFO stack onto which each element name is pushed; after the element has been processed, its name is popped. The Open method opens and parses the XML, so all the delegates must be registered before Open is called.
static bool ReadDoc(string file, List<Employee> list)
{
    SequelMaxNet.Document doc = new SequelMaxNet.Document();

    // Start of an Employee element: create the object and read its attributes.
    doc.RegisterStartElementDelegate("Employees|Employee", (elem) =>
    {
        Employee emp = new Employee();
        emp.EmployeeID = elem.Attr("EmployeeID").GetInt32(0);
        emp.SupervisorID = elem.Attr("SupervisorID").GetInt32(0);
        list.Add(emp);
    });

    // The delegates below fill in the most recently added employee.
    doc.RegisterEndElementDelegate("Employees|Employee|Name", (text) =>
    {
        list[list.Count - 1].Name = text;
    });
    doc.RegisterEndElementDelegate("Employees|Employee|Gender", (text) =>
    {
        list[list.Count - 1].Gender = text;
    });
    doc.RegisterEndElementDelegate("Employees|Employee|Salary", (text) =>
    {
        // Salary keeps its default value of 0 if the text is not a valid number.
        Double.TryParse(text, out list[list.Count - 1].Salary);
    });
    doc.RegisterCommentDelegate("Employees|Employee", (text) =>
    {
        list[list.Count - 1].Comment = text;
    });

    // Open parses the file, so every delegate must already be registered.
    return doc.Open(file);
}
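To make the matching mechanism described above more concrete, here is a minimal sketch of path-based dispatch built on the standard System.Xml.XmlReader. It is not the SequelMax.NET source: the PathDispatcher class, its RegisterEndElement and Parse methods, and the Demo class are hypothetical names invented for this illustration, and only end-element delegates are covered. It only assumes what the text states, namely a dictionary keyed by element path and a LIFO stack of element names.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

// Hypothetical illustration only; not the SequelMax.NET implementation.
class PathDispatcher
{
    // Maps a '|'-separated element path to the delegate that receives the element text.
    private readonly Dictionary<string, Action<string>> endHandlers =
        new Dictionary<string, Action<string>>();

    public void RegisterEndElement(string path, Action<string> handler)
    {
        endHandlers[path] = handler;
    }

    public void Parse(TextReader input)
    {
        var names = new List<string>();  // used as a LIFO stack of open element names
        var texts = new Stack<string>(); // text collected so far for each open element
        using (XmlReader reader = XmlReader.Create(input))
        {
            while (reader.Read())
            {
                switch (reader.NodeType)
                {
                    case XmlNodeType.Element:
                        names.Add(reader.Name);      // push the element name
                        texts.Push(string.Empty);
                        if (reader.IsEmptyElement)   // <a/> produces no EndElement node
                            Close(names, texts);
                        break;
                    case XmlNodeType.Text:
                    case XmlNodeType.CDATA:
                        texts.Push(texts.Pop() + reader.Value);
                        break;
                    case XmlNodeType.EndElement:
                        Close(names, texts);
                        break;
                }
            }
        }
    }

    private void Close(List<string> names, Stack<string> texts)
    {
        // Build the current path from the stack and look up a matching delegate.
        string path = string.Join("|", names);
        string text = texts.Pop();
        Action<string> handler;
        if (endHandlers.TryGetValue(path, out handler))
            handler(text);
        names.RemoveAt(names.Count - 1);             // pop the element name
    }
}

static class Demo
{
    // Collects the text of every Employees|Employee|Name element.
    public static List<string> ReadNames(string xml)
    {
        var result = new List<string>();
        var dispatcher = new PathDispatcher();
        dispatcher.RegisterEndElement("Employees|Employee|Name", text => result.Add(text));
        dispatcher.Parse(new StringReader(xml));
        return result;
    }

    static void Main()
    {
        string xml = "<Employees><Employee><Name>Amanda Dion</Name></Employee>" +
                     "<Employee><Name>John Smith</Name></Employee></Employees>";
        foreach (string name in ReadNames(xml))
            Console.WriteLine(name);
    }
}
```

Because the lookup key is the full path, a Name element under a different parent does not trigger the delegate registered for Employees|Employee|Name.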
The code below displays the data on the console.
static void DisplayDoc(List<Employee> list)
{
    for (int i = 0; i < list.Count; ++i)
    {
        Console.WriteLine("Name: {0}", list[i].Name);
        Console.WriteLine("EmployeeID: {0}", list[i].EmployeeID);
        Console.WriteLine("SupervisorID: {0}", list[i].SupervisorID);
        Console.WriteLine("Gender: {0}", list[i].Gender);
        Console.WriteLine("Salary: {0}", list[i].Salary);
        if (!string.IsNullOrEmpty(list[i].Comment))
            Console.WriteLine("Comment: {0}", list[i].Comment);
        Console.WriteLine();
    }
}
The console output is below. Notice again the last employee does not have a comment.
Name: Amanda Dion
EmployeeID: 1286
SupervisorID: 666
Gender: Female
Salary: 2200
Comment: Hardworking employee!

Name: John Smith
EmployeeID: 1287
SupervisorID: 666
Gender: Male
Salary: 3200
Comment: Hardly working employee!

Name: Sheldon Cohn
EmployeeID: 1288
SupervisorID: 666
Gender: Male
Salary: 5600
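The salaries in this sample are whole numbers, so they read the same under any regional settings. The two-argument Double.TryParse used in the salary delegate parses according to the current culture, however, so a value such as 2200.50 could be misread on a machine whose regional settings use a comma as the decimal separator. As a minimal sketch, assuming the XML always writes numbers with a dot, the parse can be pinned to the invariant culture. SalaryParsing and TryParseInvariant are hypothetical names for this illustration and are not part of SequelMax.NET.

```csharp
using System;
using System.Globalization;

static class SalaryParsing
{
    // Hypothetical helper: parses a number the same way on every machine
    // by ignoring the current regional settings.
    public static bool TryParseInvariant(string text, out double value)
    {
        return double.TryParse(text, NumberStyles.Float,
                               CultureInfo.InvariantCulture, out value);
    }

    static void Main()
    {
        double salary;
        if (TryParseInvariant("2200.50", out salary))
            Console.WriteLine(salary.ToString(CultureInfo.InvariantCulture)); // prints 2200.5
    }
}
```

Inside the salary delegate, such a helper could take the place of the Double.TryParse call without any other change to the parsing code.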
Please download the source code from GitHub.