
A Modified C# Implementation of Tony Selke's TextFieldParser

27 Feb 2005
A C# implementation of the TextFieldParser class submitted by Tony Selke, with added support for defining the schema in an XML file and loading the data directly into a DataTable.

Application after parsing 1,500,000 fields.

Introduction

One day while browsing the Code Project, I found an excellent article by Tony Selke, 'Wrapper Class for Parsing Fixed-Width or Delimited Text Files'. I decided to port the code to C# because that is my language of choice. While doing this, I also added a couple of features:

  1. I added the ability to put the schema for the text file in an XML file.
  2. I added the ability to parse the text file directly to a DataTable.

This is my first article submitted to the Code Project, so be gentle.

What can it do?

The library imports delimited or fixed-width files in two ways. It can raise the RecordFound event for each record, so the developer decides what to do with it, or it can load the file directly into a DataTable.

How does it work?

The developer sets up a schema either with code or in an XML schema file. This determines the data types that will be used in the DataTable. Based on the schema, the text values are parsed and converted to the respective data types and either put in a DataTable or simply passed to the calling object as an event.
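The conversion step described above can be illustrated with a small sketch. It is not the library's actual code; it only assumes that each field's target type is known as a TypeCode, as in the schema examples in this article. FieldConversionSketch and ConvertField are hypothetical names used for illustration.

```csharp
using System;
using System.Globalization;

public static class FieldConversionSketch
{
    // Converts the raw text of one field to the CLR type named by typeCode.
    // InvariantCulture keeps the result independent of the machine's regional settings.
    // Throws FormatException when the text does not match the requested type.
    public static object ConvertField(string text, TypeCode typeCode)
    {
        return Convert.ChangeType(text.Trim(), typeCode, CultureInfo.InvariantCulture);
    }
}
```

For example, ConvertField("42", TypeCode.Int32) returns a boxed Int32. A parser built this way would catch the exception and report the failed record rather than let it escape.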

Using the code

First, add a reference to the library. Then add the using directive at the top of your source file.

C#
using WhaysSoftware.Utilities.FileParsers;

Create an instance of the TextFieldParser object.

C#
TextFieldParser tfp = new TextFieldParser(filePath);

If you will be using an XML schema file, use the constructor that has the 'schemaFile' parameter.

C#
TextFieldParser tfp = new TextFieldParser(filePath, schemaPath);

An XML schema file looks like this:

XML
<TABLE Name="TEST" FileFormat="Delimited" ID="Table1">
    <FIELD Name="LineNumber" DataType="Int32" />
    <FIELD Name="Quoted String" DataType="String" Quoted="true" />
    <FIELD Name="Unquoted String" DataType="String" Quoted="false" />
    <FIELD Name="Double" DataType="Double" />
    <FIELD Name="Boolean" DataType="Boolean" />
    <FIELD Name="Decimal" DataType="Decimal" />
    <FIELD Name="DateTime" DataType="DateTime" />
    <FIELD Name="Int16" DataType="Int16" />
</TABLE>
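To make the schema format more concrete, here is a minimal sketch of how FIELD elements like the ones above could be read into name and TypeCode pairs. This is an assumption about one possible approach, not the library's own loader. SchemaSketch and ReadFields are hypothetical names, and only the Name and DataType attributes are handled.

```csharp
using System;
using System.Collections.Generic;
using System.Xml.Linq;

public static class SchemaSketch
{
    // Reads every FIELD element under the root TABLE element and returns
    // (field name, TypeCode) pairs in document order.
    public static List<KeyValuePair<string, TypeCode>> ReadFields(string schemaXml)
    {
        var fields = new List<KeyValuePair<string, TypeCode>>();
        XElement table = XElement.Parse(schemaXml);
        foreach (XElement field in table.Elements("FIELD"))
        {
            string name = (string)field.Attribute("Name");
            // DataType values such as "Int32" match the member names of System.TypeCode.
            var type = (TypeCode)Enum.Parse(typeof(TypeCode), (string)field.Attribute("DataType"));
            fields.Add(new KeyValuePair<string, TypeCode>(name, type));
        }
        return fields;
    }
}
```

Enum.Parse throws if a DataType value is missing or is not a TypeCode member, so a malformed schema fails early instead of producing untyped columns.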

The source code download includes a complete description of the schema file attributes. Here is the same schema defined in code.

C#
TextFieldCollection fields = new TextFieldCollection();
fields.Add(new TextField("LineNumber", TypeCode.Int32));
fields.Add(new TextField("Quoted String", TypeCode.String, true));
fields.Add(new TextField("Unquoted String", TypeCode.String, false));
fields.Add(new TextField("Double", TypeCode.Double));
fields.Add(new TextField("Boolean", TypeCode.Boolean));
fields.Add(new TextField("Decimal", TypeCode.Decimal));
fields.Add(new TextField("DateTime", TypeCode.DateTime));
fields.Add(new TextField("Int16", TypeCode.Int16));
tfp.TextFields = fields;

Now you can either subscribe to the RecordFound event if you want to do something custom with the records...

C#
// Subscribe before parsing so the handler sees every record.
tfp.RecordFound += new RecordFoundHandler(tfp_RecordFound);
tfp.ParseFile();
...
private void tfp_RecordFound(ref int CurrentLineNumber,
                             TextFieldCollection TextFields)
{
    // Do something with the TextFields parameter
}

or you can call ParseToDataTable to get the results in a DataTable.

C#
DataTable dt = tfp.ParseToDataTable();
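As a rough illustration of how the schema's data types can carry over to a DataTable, the sketch below builds typed columns from field names and TypeCode values. It is an assumption about the general approach, not the body of ParseToDataTable. DataTableSketch and BuildTable are hypothetical names.

```csharp
using System;
using System.Data;

public static class DataTableSketch
{
    // Creates an empty DataTable with one typed column per schema field.
    public static DataTable BuildTable(string[] names, TypeCode[] types)
    {
        var table = new DataTable();
        for (int i = 0; i < names.Length; i++)
        {
            // TypeCode members such as Int32 or DateTime share their names
            // with the corresponding types in the System namespace.
            Type columnType = Type.GetType("System." + types[i]);
            table.Columns.Add(names[i], columnType);
        }
        return table;
    }
}
```

With typed columns in place, each converted value can be stored in its row without further casting, and the DataTable rejects values of the wrong type.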

Note: Even when calling ParseToDataTable, the RecordFound event is still fired.

You can also subscribe to the RecordFailed event to be notified when a record fails to parse. In the event handler, you can decide whether parsing should continue.

C#
tfp.RecordFailed += new RecordFailedHandler(tfp_RecordFailed);
...
private void tfp_RecordFailed(ref int CurrentLineNumber,
        string LineText, string ErrorMessage, ref bool Continue)
{
    // Report the failure; set Continue to indicate whether parsing should go on.
    MessageBox.Show("Error: " + ErrorMessage + Environment.NewLine +
                    "Line: " + LineText);
}
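The role of a ref bool parameter in a failure handler can be shown with a self-contained sketch. The types below are hypothetical stand-ins that only mirror the shape of the handler signature. The default value of the flag and the stop-on-false behavior are assumptions, not confirmed details of the library.

```csharp
using System;
using System.Collections.Generic;

public static class ContinueFlagSketch
{
    // Hypothetical delegate shaped like the RecordFailed handler signature.
    public delegate void LineFailedHandler(ref int lineNumber, string lineText,
                                           string errorMessage, ref bool keepGoing);

    // Parses each line as an Int32 and returns the values parsed before stopping.
    public static List<int> ParseInts(string[] lines, LineFailedHandler onFailed)
    {
        var values = new List<int>();
        for (int i = 0; i < lines.Length; i++)
        {
            int value;
            if (int.TryParse(lines[i], out value))
            {
                values.Add(value);
                continue;
            }

            int lineNumber = i + 1;
            bool keepGoing = true; // assumed default: continue unless the handler says otherwise
            if (onFailed != null)
                onFailed(ref lineNumber, lines[i], "Not a valid Int32", ref keepGoing);
            if (!keepGoing)
                break; // the handler asked to stop, so later lines are skipped
        }
        return values;
    }
}
```

A handler that sets the flag to false stops the loop at the first bad line, while a handler that leaves it unchanged lets the remaining lines be parsed.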

That's it. I look forward to your comments and suggestions.

History

  • 02/27/2005

    Initial release.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.

Written By
Web Developer
United States
I have a BA in Computer Science from a small college in Indiana. I have been programming for about 7 years, mostly business applications.

Comments and Discussions

 
Question: Missing columns in csv file
sentdata21-Nov-12 8:39
Hello, I have looked all over for an answer to the problem of missing columns when importing a csv file in SSIS. Do you think this might be the way to go?
Thanks, Chris