License: The Code Project Open License (CPOL)
Convert PBS Legacy Files to XML
By Henrik Thomsen
C#, .NET (.NET 2.0), Dev
Legacy file formats, such as UN-EDIFACT with a record per line and fixed-length fields, still exist and are widely used for B2B transactions. A tool that can convert legacy files to human-readable XML might come in handy. The tool I present here converts files that are similar, but not identical, to UN-EDIFACT. The file format in question is used by Payment Business Services (PBS) in Denmark, see http://www.pbs.dk/en/. The tool might not be terribly relevant outside Denmark, but it does show how to validate, search and convert legacy files larger than 100 megabytes to XML in a fairly general manner. So I have decided to place it on CodeProject in spite of the strong local coupling to PBS in Denmark. This tool uses the Arguments class from the article C#/.NET Command Line Arguments Parser, thanks to R. LOPES.
The tool works like this:
pbs2Xml.exe -s InfoService.xml -i Leverance.txt -o Leverance.xml -f "John Schmidt"
-s: the specification file, which must follow the schema in PbsSpecification.xsd.
-i: the input file in legacy format.
-o: the output file in XML format. This is optional; leave it out when all you want is to validate the legacy file.
-f: a search filter. This is optional. It can be handy when dealing with very large files. If you are looking for information regarding a specific SSN, use this option to convert only records containing that SSN.

| Description | File |
| --- | --- |
| Information service. Information types 100, 150: Pension and 700: LetLøn | InformationsService.zip |
| Payment Service Invoicing: 601, section 112 | |
| Payment Service Invoicing: 601, section 117 | BetalingsService601-0117.zip |
| Payment Service Payments: 602 | BetalingsService602.zip |
I needed a tool to validate files used for business transactions in banking, pension and life insurance and convert them to XML. I also needed a general approach because the business rules for validating data were unclear. Basically I wanted a general parser that could read a legacy file with a record per line, fixed-length fields and a hierarchical record structure like the one in UN-EDIFACT documents. The parser must not know the specifics of the records, fields and validation rules. The specifics must be provided in a specification file so that changing parsing details does not require code changes, but merely changes to an XML file containing the parsing rules.
pbs2xml is just a parser, and a parser of a specific B2B legacy file format, which is only used in Denmark. This sounds like application-specific code, not suited for CodeProject!
Well, maybe not. It does however demonstrate an interesting technique: pulling out all of the business rules for parsing and validating a specific file format from the code and into an XML specification file.
The specification file must follow some ground rules that are common for all B2B files used by Payment Business Services (PBS); these rules are represented by the schema in PbsSpecification.xsd. The overall format is similar to UN-EDIFACT: one record per line with fixed-length fields and a hierarchy of record types.
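The idea of keeping field positions and validation rules in XML rather than in code can be sketched as follows. This is a minimal illustration, not the pbs2xml implementation: the FieldRule class, the FixedLengthSketch helper and the record/field XML layout with its name, position, length and pattern attributes are all assumptions made for this example and are not taken from PbsSpecification.xsd.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;
using System.Xml;

// Hypothetical field rule; names and XML layout are illustrative only
// and are not taken from PbsSpecification.xsd.
public class FieldRule
{
    public string Name;
    public int Position;   // zero-based offset of the field in the line
    public int Length;     // fixed length of the field
    public string Pattern; // regular expression the field value must match
}

public static class FixedLengthSketch
{
    // Loads field rules from XML such as:
    // <record><field name="type" position="0" length="2" pattern="^[0-9]{2}$"/></record>
    public static List<FieldRule> LoadRules(string specXml)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(specXml);
        List<FieldRule> rules = new List<FieldRule>();
        foreach (XmlElement e in doc.SelectNodes("/record/field"))
        {
            FieldRule rule = new FieldRule();
            rule.Name = e.GetAttribute("name");
            rule.Position = int.Parse(e.GetAttribute("position"));
            rule.Length = int.Parse(e.GetAttribute("length"));
            rule.Pattern = e.GetAttribute("pattern");
            rules.Add(rule);
        }
        return rules;
    }

    // Cuts one line into named fields. A field that does not fit in the line
    // is skipped and reported; a field that fails its pattern is reported
    // but still returned.
    public static Dictionary<string, string> ParseLine(
        string line, List<FieldRule> rules, List<string> errors)
    {
        Dictionary<string, string> values = new Dictionary<string, string>();
        foreach (FieldRule rule in rules)
        {
            if (line.Length < rule.Position + rule.Length)
            {
                errors.Add(rule.Name + ": line too short");
                continue;
            }
            string value = line.Substring(rule.Position, rule.Length);
            if (!Regex.IsMatch(value, rule.Pattern))
            {
                errors.Add(rule.Name + ": invalid value '" + value + "'");
            }
            values[rule.Name] = value;
        }
        return values;
    }
}
```

With a layout like this, supporting a changed record only means editing the XML; the parsing loop stays the same.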
The following classes model the entities in the specification schema:
Field: Specifies the position, length and validation rule of a field in a record of fixed-length fields. Set Field.Key to true if the field is part of what identifies the record. Set Field.Optional to true if the field is not always supplied in the input file.
Record: Contains fields.
Section: Contains a start record, some data records and an end record.
Leverance: Contains sections.

The class PbsReader can read and validate an input file given a valid specification:
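// Load the specification that describes records, fields and validation rules.
XmlDocument spec = new XmlDocument();
spec.Load("InformationsService.xml");
Leverance leverance = new Leverance(spec);

// Read and validate the legacy input file against the specification.
PbsReader target = new PbsReader();
target.Read("Leverance.txt", leverance);

// List the format errors found during the read.
Console.WriteLine("Errors:");
foreach (Error error in target.Errors)
{
    Console.WriteLine(error);
}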
If the input file does not honor the ground rules, a PbsFormatException is thrown. Fields with format errors are counted in PbsReader.ErrorCount, and the first 100 errors are collected in PbsReader.Errors.
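Counting every error while storing only the first 100 keeps memory use bounded even when a very large file is badly malformed. The sketch below shows one way to get that behavior; the ErrorLog class is a hypothetical helper written for this example and is not the actual PbsReader code.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not the PbsReader implementation: counts every error
// but stores only the first MaxStored messages.
public class ErrorLog
{
    public const int MaxStored = 100;

    private readonly List<string> errors = new List<string>();
    private int errorCount;

    // Total number of errors reported, including those not stored.
    public int ErrorCount { get { return errorCount; } }

    // The first MaxStored error messages, in the order they were reported.
    public IList<string> Errors { get { return errors; } }

    public void Add(string message)
    {
        errorCount++;
        if (errors.Count < MaxStored)
        {
            errors.Add(message);
        }
    }
}
```

A caller can compare ErrorCount with Errors.Count to tell whether the stored list was truncated.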
PbsWriter inherits from PbsReader and converts the input file to XML.
PbsSearcher also inherits from PbsReader and converts only a selection of records to XML, based on a search filter.
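For files larger than 100 megabytes, filtering and conversion are best done as a stream: read one line, decide, write, and move on. The following sketch shows that pattern with TextReader and XmlWriter. It is not the PbsSearcher implementation; the SearchSketch class and the records/record element names are assumptions for this example, and a plain substring match stands in for whatever matching the real filter performs.

```csharp
using System;
using System.IO;
using System.Xml;

public static class SearchSketch
{
    // Reads the input one line at a time and writes each line that contains
    // the filter text as a <record> element. A null filter matches every line.
    // Returns the number of records written.
    public static int Convert(TextReader input, TextWriter output, string filter)
    {
        XmlWriterSettings settings = new XmlWriterSettings();
        settings.OmitXmlDeclaration = true;

        int matches = 0;
        using (XmlWriter writer = XmlWriter.Create(output, settings))
        {
            writer.WriteStartElement("records");
            string line;
            int lineNumber = 0;
            while ((line = input.ReadLine()) != null)
            {
                lineNumber++;
                if (filter != null && line.IndexOf(filter, StringComparison.Ordinal) < 0)
                {
                    continue; // the line does not contain the filter text
                }
                writer.WriteStartElement("record");
                writer.WriteAttributeString("line", lineNumber.ToString());
                writer.WriteString(line); // XmlWriter escapes special characters
                writer.WriteEndElement();
                matches++;
            }
            writer.WriteEndElement();
        }
        return matches;
    }
}
```

To run this against files, pass a StreamReader for the legacy file and a StreamWriter for the XML file; only the current line is held in memory at any time.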
This tool was developed by my colleague Lotte Jensen and me during a programming course with Kent Beck. I learned at least two important things during that course:
Last Updated: 22 Jun 2009. Editor: Deeksha Shenoy
Copyright 2008 by Henrik Thomsen. Everything else Copyright © CodeProject, 1999-2009