Flattening out the complexity in flat file schemas in BizTalk 2004 - Part 1

27 Feb 2006
An article explaining how to write various flat file schemas in BizTalk Server 2004.

Introduction - First things first!

This article is the first in a series on writing flat file schemas in BizTalk Server 2004. Flat file schemas have a reputation for being complex and cryptic; this article tries to take the fear out of writing one.

Flat file structure

Unlike an XML file, a flat file has no visible, inherent structure. Its structure follows from how the file is used, and reading it usually requires some domain knowledge. A flat file can be structured in several ways:

  1. Delimited flat file.
  2. Positional flat file.
  3. A flat file with a combination of delimited and positional records.

Let's look at some example flat files and understand their structure.

Example 1 - A "TAB" delimited flat file

JOHN DOE    1964-10-05    CLAYTON
ROBERT B    1978-11-10    EDWARD STREET
JOHN LENON    1927-02-30    WORTHING
EDMOND DANTES    1910-09-12    COVENTRY
SIR CHAMBERS    1934-05-18    HARRODS
  1. The example above has 5 lines; each line represents a "record" of information.
  2. A "record" consists of several "fields". In the example, the "fields" are "name", "date of birth" and "place of origin".
  3. The "fields" are separated by a "TAB" character, which acts as the "delimiter" between them.
  4. Each "line" ends with a "CRLF" (a carriage return followed by a line feed). This combination is used on Windows-based systems; UNIX-based systems use only a "line feed" (LF).
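
The record and field structure described in the list above can be sketched in a few lines of Python. This only illustrates the delimiters; it is not how BizTalk parses the file, and the function name `parse_tab_delimited` and the sample string are assumptions made for the example.

```python
def parse_tab_delimited(text):
    """Split flat-file text into records (lines) and fields (TAB-separated)."""
    records = []
    # splitlines() accepts both CRLF (Windows) and LF (UNIX) line endings.
    for line in text.splitlines():
        if not line:
            continue  # ignore blank lines
        name, dob, place = line.split("\t")
        records.append({"name": name, "dob": dob, "place": place})
    return records


sample = "JOHN DOE\t1964-10-05\tCLAYTON\r\nROBERT B\t1978-11-10\tEDWARD STREET\r\n"
records = parse_tab_delimited(sample)
print(records[1]["place"])  # EDWARD STREET
```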

Example 2 - A comma (,) separated flat file

JOHN DOE, 1964-10-05, CLAYTON
ROBERT B, 1978-11-10, EDWARD STREET
JOHN LENON, 1927-02-30, WORTHING
EDMOND DANTES, 1910-09-12, COVENTRY
PETER JAMES,  , GATWICK
SIR CHAMBERS, 1934-05-18, HARRODS
  1. This structure is very similar to the one used in example 1. The difference is that the fields are separated by a comma (,), which acts as the delimiter.
  2. The fifth line does not have a value for the "date of birth" field, but the commas are still in place. This is known as a placeholder: the commas remain to indicate that the field exists but has no value.
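
As a rough illustration of the placeholder behaviour, the sketch below reads two of the sample lines with Python's standard `csv` module. The variable names are hypothetical, and stripping the spaces after each comma is a choice made for this example, not something the article prescribes.

```python
import csv
import io

sample = (
    "EDMOND DANTES, 1910-09-12, COVENTRY\r\n"
    "PETER JAMES,  , GATWICK\r\n"
)

# newline="" lets the csv module handle the CRLF line endings itself.
reader = csv.reader(io.StringIO(sample, newline=""))
# strip() drops the spaces that follow each comma in the sample data.
rows = [[field.strip() for field in row] for row in reader]

for name, dob, place in rows:
    # The placeholder keeps the field count at three even when DOB is empty.
    print(name, dob or "<no value>", place)
```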

Example 3 - A positional flat file

12345678901234567890123456789012345678901234567890
JOHN DOE            1964-10-05CLAYTON             
ROBERT B            1978-11-10EDWARD STREET       
JOHN LENON          1927-02-30WORTHING            
EDMOND DANTES       1910-09-12COVENTRY            
SIR CHAMBERS        1934-05-18HARRODS
  1. A positional flat file is one whose fields occupy fixed positions (columns) and have fixed lengths. In the above example, the "name" field has a fixed size of 20 characters, the "date of birth" field has a fixed size of 10 characters and the "place of origin" field has a fixed size of 20 characters.
  2. In BizTalk, a positional record must always be a child of a delimited record. The delimiter character specified for the parent delimited record must not appear in the data of the child positional record, because there is no way to escape it there.
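
Reading a positional record amounts to cutting each line at fixed column offsets. The following is a minimal sketch using the widths from the example (20, 10 and 20 characters); the names `FIELDS` and `parse_positional` are assumptions for illustration and have nothing to do with BizTalk's own parser.

```python
# Field widths taken from the example: name 20, date of birth 10, place of origin 20.
FIELDS = [("name", 20), ("dob", 10), ("place", 20)]


def parse_positional(line):
    """Cut one fixed-width record into fields by column position."""
    record = {}
    start = 0
    for field_name, width in FIELDS:
        # Slicing past the end of a short line is safe and yields what is there.
        record[field_name] = line[start:start + width].rstrip()
        start += width
    return record


# Build the record with ljust() so the padding is exact.
line = "JOHN DOE".ljust(20) + "1964-10-05" + "CLAYTON".ljust(20)
print(parse_positional(line))
```

Trailing padding is stripped from each value, so the last record of the example, which is not padded to the full 50 characters, is read the same way.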

Creating the BizTalk flat file schema solution

Create a new BizTalk Server Solution in Visual Studio.

Step 1: In the Visual Studio .NET menu, select File -> New -> "Blank Solution" and type the name "FFSchemas":

Image 1

Step 2: In the Solution Explorer, right click on the solution name "FFSchemas" and select Add -> New Project. In the "Add Project" dialog box, for the type of project, select "BizTalk Projects". Select the template "Empty BizTalk Project" and create a project named "FlatFileSchema".

Building the schemas - Example 1

We shall create a schema based on example 1.

Step 1: Right-click on the project in the Solution Explorer and select the "Add New Item" option. Then, select the item "Schema" and name it "FFSchema_TAB". When the schema shows up, rename the "Root" element to "TSV".

Step 2: Right-click the "Schema" item and select Properties. Change the "Schema Editor Extensions" property to "Flat File Extension":

Image 2

Step 3: Right-click the "TSV" item, select Properties, and set the following properties:

TSV "Root Node" properties
Property Name          Property Value
Child Delimiter Type   Hexadecimal
Child Delimiter        0x0D 0x0A
Structure              Delimited

Step 4: Right-click the "TSV" item and select Insert Schema Node -> Child Record. Name the record "Record" and create the child elements "Name", "DOB" and "Address":

Image 3

Step 5: Right-click the "Record" item and select Properties. Set the properties as shown in the table below. The hexadecimal value "0x09" represents the TAB character. The child order is set to "Infix" because the tab appears between the fields of each record. To support multiple records, set Max Occurs to "*" or "unbounded":

"Record" Node properties
Property Name          Property Value
Child Delimiter Type   Hexadecimal
Child Delimiter        0x09
Child Order            Infix
Min Occurs             1
Max Occurs             unbounded
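
To make the "Child Order" setting more concrete, here is a small sketch of where the delimiter sits for each order. "Infix" is the value used in this article; "Prefix" and "Postfix" are other values of the same property. The function `split_children` is hypothetical and only mimics the idea, not BizTalk's actual parsing rules.

```python
def split_children(data, delimiter, child_order="infix"):
    """Split a record's data into children for a given child order."""
    if child_order == "prefix" and data.startswith(delimiter):
        data = data[len(delimiter):]     # delimiter precedes every child
    elif child_order == "postfix" and data.endswith(delimiter):
        data = data[:-len(delimiter)]    # delimiter follows every child
    # For infix, the delimiter only appears between children.
    return data.split(delimiter)


TAB = "\x09"  # the 0x09 child delimiter of the "Record" node
record = TAB.join(["JOHN DOE", "1964-10-05", "CLAYTON"])
print(split_children(record, TAB))
```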

Step 6: In the Solution Explorer, right-click the schema "FFSchema_TAB.xsd" and select "Properties". In the property pages screen, set the properties as shown in the image. For "Input Instance File Name", choose the path where the input files (*.txt) are located:

Image 4

Validating and testing the schema created

Once the schema is written, we need to validate and test it.

Step 1: In the Solution Explorer, right-click the schema "FFSchema_TAB.xsd" and select "Validate Schema". In the output window you should see a message starting with "Validate Schema succeeded for file...".

Step 2: In the Solution Explorer, right-click the schema "FFSchema_TAB.xsd" and select "Validate Instance". In the output window you should see a message starting with "Validate Instance succeeded for schema FFSchema_TAB.xsd...". Now click on the link in the message that starts with "Validation generated XML output...".

The XML output file would look like this:

Image 5
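
For readers without access to the image, the sketch below approximates the shape of the XML using the node names from the schema ("TSV", "Record", "Name", "DOB", "Address"). It is an assumption-laden illustration: the function `flat_file_to_xml` is hypothetical, and the real output generated by BizTalk may differ in details such as namespaces and prefixes.

```python
import xml.etree.ElementTree as ET


def flat_file_to_xml(text):
    """Build a TSV/Record/Name/DOB/Address tree from tab-delimited text."""
    root = ET.Element("TSV")
    for line in text.split("\r\n"):      # root child delimiter: 0x0D 0x0A
        if not line:
            continue
        record = ET.SubElement(root, "Record")
        values = line.split("\t")        # record child delimiter: 0x09
        for tag, value in zip(("Name", "DOB", "Address"), values):
            ET.SubElement(record, tag).text = value
    return root


sample = "JOHN DOE\t1964-10-05\tCLAYTON\r\nROBERT B\t1978-11-10\tEDWARD STREET"
xml_text = ET.tostring(flat_file_to_xml(sample), encoding="unicode")
print(xml_text)
```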

Building the schemas - Example 2

We shall create a schema based on example 2.

Step 1: Right-click on the project in the Solution Explorer and select the "Add New Item" option. Then, select the item "Schema" and name it "FFSchema_CSV". When the schema shows up, rename the "Root" element to "CSV".

Step 2: Right-click the "Schema" item and select Properties. Change the "Schema Editor Extensions" property to "Flat File Extension":

Image 6

Step 3: Right-click the "CSV" item, select Properties, and set the following properties:

CSV "Root Node" properties
Property Name          Property Value
Child Delimiter Type   Hexadecimal
Child Delimiter        0x0D 0x0A
Structure              Delimited

Step 4: Right-click the "CSV" item and select Insert Schema Node -> Child Record. Name the record "Record" and create the child elements "Name", "DOB" and "Address":

Image 7

Step 5: Right-click the "Record" item and select Properties. Set the properties as shown in the table below. The child delimiter is the comma character (","), entered with the delimiter type "Character". The child order is set to "Infix" because the comma appears between the fields of each record. To support multiple records, set Max Occurs to "*" or "unbounded":

"Record" Node properties
Property Name          Property Value
Child Delimiter Type   Character
Child Delimiter        ,
Child Order            Infix
Min Occurs             1
Max Occurs             unbounded

Step 6: In the Solution Explorer, right-click the schema "FFSchema_CSV.xsd" and select "Properties". In the property pages screen, set the properties as shown in the image. For "Input Instance File Name", choose the path where the input files (*.txt) are located:

Image 8

Validating and testing the schema created

Once the schema is written, we need to validate and test it.

Step 1: In the Solution Explorer, right-click the schema "FFSchema_CSV.xsd" and select "Validate Schema". In the output window you should see a message starting with "Validate Schema succeeded for file...".

Step 2: In the Solution Explorer, right-click the schema "FFSchema_CSV.xsd" and select "Validate Instance". In the output window you should see a message starting with "Validate Instance succeeded for schema FFSchema_CSV.xsd...". Now click on the link in the message that starts with "Validation generated XML output...".

The XML output file would look like this:

Image 9

Quick takeaways

  1. Set the schema's editor extensions property before you start with the flat file schema.
  2. Set the "Child Delimiter Type" property to Hexadecimal for non-printable delimiters such as TAB (0x09) and CRLF (0x0D 0x0A) to avoid character ambiguity.
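
To see which characters the hexadecimal delimiter values stand for, a short sketch like the following can help. The helper `decode_hex_delimiter` is hypothetical and simply converts the notation used in the property tables ("0x0D 0x0A", "0x09") into the corresponding characters.

```python
def decode_hex_delimiter(value):
    """Turn a delimiter written as '0x0D 0x0A' into the characters it denotes."""
    # int(token, 16) accepts the "0x" prefix when the base is 16.
    return "".join(chr(int(token, 16)) for token in value.split())


print(repr(decode_hex_delimiter("0x0D 0x0A")))  # '\r\n'
print(repr(decode_hex_delimiter("0x09")))       # '\t'
```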

Part 2

The next part of this series discusses positional flat files. Until then, happy schema programming.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.

Written By
Architect AT&T Wi-Fi Services
United States
Naveen holds a Masters (M.S.) in Computer Science, started his career programming mainframes and now has more than a decade of programming, development and design experience. Naveen has a sharp eye and keen observation skills. He has worked for several companies, building large-scale business applications and bringing better solutions to the table.
Quite recently Naveen built a fairly complex integration platform for a large bank. His hobbies include training, mentoring and research. Naveen spends his free time visiting National Parks nationwide.

Naveen has developed the BizTalk Control Center (BCC)
http://biztalkcontrolcenter.codeplex.com
