
Simple CSV Parser/Reader Function written in C#

By Mandar Ranjit Date

This function parses a string read from a CSV file and returns the values in an ArrayList object.
C# 1.0, C# 2.0, Windows, .NET 1.0, .NET 1.1, .NET 2.0, VS.NET2003, VS2005, Dev
Posted:1 Mar 2007
Updated:14 Oct 2008

Introduction

This is a simple parser function named CSVParser for parsing a CSV (comma-separated values) file. It takes a string containing a single line of the CSV file as input and returns an ArrayList of the values on that line.

CSV follows these rules:

1) All values are separated by commas.

2) If a comma is part of a value, the value is enclosed in double quotes.

e.g. a,b,"12,000",c

3) If a double quote is part of a value, it is replaced with two double quotes and the value is enclosed in double quotes.

e.g. a,b,"He said ""Hi""",c
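
The three rules can also be read in the writing direction. The following is a minimal sketch, not part of the article's download, that produces a line following rules 1 to 3. The names CsvFieldWriter, EscapeField and BuildLine are hypothetical and chosen only for illustration.

```csharp
using System;
using System.Text;

public static class CsvFieldWriter
{
    // Hypothetical helper: quotes one value according to rules 2 and 3.
    public static string EscapeField(string value)
    {
        bool needsQuotes = value.IndexOf(',') >= 0 || value.IndexOf('"') >= 0;
        if (!needsQuotes)
        {
            return value;
        }
        // Rule 3: double every embedded quote. Rule 2: wrap the value in quotes.
        return "\"" + value.Replace("\"", "\"\"") + "\"";
    }

    // Rule 1: escaped values are joined with commas.
    public static string BuildLine(params string[] values)
    {
        StringBuilder line = new StringBuilder();
        for (int i = 0; i < values.Length; i++)
        {
            if (i > 0)
            {
                line.Append(',');
            }
            line.Append(EscapeField(values[i]));
        }
        return line.ToString();
    }
}
```

Calling BuildLine("a", "b", "12,000", "c") gives the line from rule 2, and BuildLine("a", "b", "He said \"Hi\"", "c") gives the line from rule 3.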

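Usage:

//Requires: using System.Collections; using System.IO;
ArrayList alResult;
string strLineText;
using (StreamReader objReader = new StreamReader(@"C:\Testfile.csv"))
{
    //ReadLine returns null once the end of the file is reached
    while ((strLineText = objReader.ReadLine()) != null)
    {
        alResult = CSVParser(strLineText);
        //do processing
    }
}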
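
The parser listed in this article clears its ArrayList when it meets erroneous input (state 6), so an empty result can be treated as a rejected line. The sketch below shows one way to collect the accepted rows and count the rejected ones. CsvFileReader, ReadRows and the LineParser delegate are hypothetical names; the delegate stands in for CSVParser so that the block is self-contained.

```csharp
using System;
using System.Collections;
using System.IO;

// Hypothetical delegate type; CSVParser from this article matches this signature.
public delegate ArrayList LineParser(string line);

public static class CsvFileReader
{
    // Reads every line from the reader and parses it with parseLine.
    // Lines for which the parser returns an empty list are counted as rejected.
    public static ArrayList ReadRows(TextReader reader, LineParser parseLine,
                                     out int rejected)
    {
        ArrayList rows = new ArrayList();
        rejected = 0;
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            ArrayList fields = parseLine(line);
            if (fields.Count == 0)
            {
                rejected++;
                continue;
            }
            rows.Add(fields);
        }
        return rows;
    }
}
```

Taking a TextReader instead of a file path is a design choice of this sketch: a StreamReader can be passed for files and a StringReader for tests.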

How does it work?

This function is based on the finite state automaton concept. A finite state automaton has a current state and receives an input. For each input, it looks up the next state in a transition table, moves to that state, and performs the action associated with it.

The state diagram is as follows:

[Figure: CSVParser.jpg, state diagram of the parser]

The parser function maintains two objects: a StringBuilder (henceforth called TEMPSTR) that temporarily stores characters, and an ArrayList (henceforth called ARRLIST) that collects the parsed values.

INPUT

" (double quote) (indicated by 0)
, (comma) (indicated by 1)
O (any character other than , " and N) (indicated by 2)
N (newline) (indicated by 3)

TRANSITION TABLE
CUR_STATE>INPUT>NEXT STATE
0 > 0 > 2
0 > 1 > 0
0 > 2 > 1
0 > 3 > 5

1 > 0 > 6
1 > 1 > 0
1 > 2 > 1
1 > 3 > 5

2 > 0 > 4
2 > 1 > 3
2 > 2 > 3
2 > 3 > 6

3 > 0 > 4
3 > 1 > 3
3 > 2 > 3
3 > 3 > 6

4 > 0 > 2
4 > 1 > 8
4 > 2 > 6
4 > 3 > 7

5 > X > 5

6 > X > 6

7 > X > 5

8 > X > 0

(X = Any input)
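
To see the table in action, the following sketch records the state entered after each character of a line. It is an illustration only: CsvStateTrace and Trace are hypothetical names, and the sketch assumes that state 8 hands over to state 0 immediately, without consuming another character, and that the newline input (3) is supplied once at the end of the line.

```csharp
using System;
using System.Collections.Generic;

public static class CsvStateTrace
{
    // Rows are the current states 0..8; columns are the inputs
    // 0 (quote), 1 (comma), 2 (other) and 3 (newline).
    private static readonly int[][] Table =
    {
        new int[] { 2, 0, 1, 5 },
        new int[] { 6, 0, 1, 5 },
        new int[] { 4, 3, 3, 6 },
        new int[] { 4, 3, 3, 6 },
        new int[] { 2, 8, 6, 7 },
        new int[] { 5, 5, 5, 5 },
        new int[] { 6, 6, 6, 6 },
        new int[] { 5, 5, 5, 5 },
        new int[] { 0, 0, 0, 0 }
    };

    private static int InputId(char c)
    {
        if (c == '"') return 0;
        if (c == ',') return 1;
        return 2;
    }

    // Returns the state entered after each character, followed by the
    // state entered when the end of the line (input 3) is reached.
    public static List<int> Trace(string line)
    {
        List<int> visited = new List<int>();
        int state = 0;
        foreach (char c in line)
        {
            state = Table[state][InputId(c)];
            visited.Add(state);
            // Assumption: state 8 is transient ("8 > X > 0").
            if (state == 8)
            {
                state = 0;
            }
        }
        visited.Add(Table[state][3]);
        return visited;
    }
}
```

For the line a,"b,c",d the trace is 1, 0, 2, 3, 3, 3, 4, 8, 1, 5: the comma inside the quotes keeps the automaton in state 3, and the comma after the closing quote leads to state 8. For the malformed line a"b the trace is 1, 6, 6, 6, because state 6 is never left once it is entered.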

The code is as follows:

The 9x4 aActionDecider array represents the above transition table. The first dimension represents the state (S0 to S8), and the second dimension represents the input character (0 to 3). The array gives the next state based on the current state and the input character. For example, if the current state is 3 and the next input character is a quote (0), then the next state is aActionDecider[3][0], i.e. 4. Note that GetInputID never returns 3; the parser supplies input 3 itself once the whole string has been consumed.

//Requires: using System.Collections; using System.Text;
private static ArrayList CSVParser(string strInputString)
{
    int intCounter = 0, intLength;
    StringBuilder strElem = new StringBuilder();
    ArrayList alParsedCsv = new ArrayList();
    intLength = strInputString.Length;
    int intCurrState = 0;
    int[][] aActionDecider = new int[9][];
    //Build the state array: one row per state, one column per input ID
    aActionDecider[0] = new int[4] { 2, 0, 1, 5 };
    aActionDecider[1] = new int[4] { 6, 0, 1, 5 };
    aActionDecider[2] = new int[4] { 4, 3, 3, 6 };
    aActionDecider[3] = new int[4] { 4, 3, 3, 6 };
    aActionDecider[4] = new int[4] { 2, 8, 6, 7 };
    aActionDecider[5] = new int[4] { 5, 5, 5, 5 };
    aActionDecider[6] = new int[4] { 6, 6, 6, 6 };
    aActionDecider[7] = new int[4] { 5, 5, 5, 5 };
    aActionDecider[8] = new int[4] { 0, 0, 0, 0 };
    for (intCounter = 0; intCounter < intLength; intCounter++)
    {
        //look up the next state for the current character
        intCurrState = aActionDecider[intCurrState]
                                  [GetInputID(strInputString[intCounter])];
        //take the necessary action depending upon the state
        PerformAction(ref intCurrState, strInputString[intCounter],
                     ref strElem, ref alParsedCsv);
    }
    //End of line reached, hence input ID is 3
    intCurrState = aActionDecider[intCurrState][3];
    PerformAction(ref intCurrState, '\0', ref strElem, ref alParsedCsv);
    return alParsedCsv;
}

private static int GetInputID(char chrInput)
{
    if (chrInput == '"')
    {
        return 0;
    }
    else if (chrInput == ',')
    {
        return 1;
    }
    else
    {
        return 2;
    }
}
private static void PerformAction(ref int intCurrState, char chrInputChar, 
                    ref StringBuilder strElem, ref ArrayList alParsedCsv)
{
    string strTemp = null;
    switch (intCurrState)
    {
    case 0:
        //Separate out value to array list
        strTemp = strElem.ToString();
        alParsedCsv.Add(strTemp);
        strElem = new StringBuilder();
        break;
    case 1:
    case 3:
    case 4:
        //accumulate the character
        strElem.Append(chrInputChar);
        break;
    case 5:
        //End of line reached. Separate out value to array list
        strTemp = strElem.ToString();
        alParsedCsv.Add(strTemp);
        break;
    case 6:
        //Erroneous input. Reject line.
        alParsedCsv.Clear();
        break;
    case 7:
        //wipe ending " and separate out value to array list
        strElem.Remove(strElem.Length - 1, 1);
        strTemp = strElem.ToString();
        alParsedCsv.Add(strTemp);
        strElem = new StringBuilder();
        intCurrState = 5;
        break;
    case 8:
        //wipe ending " and separate out value to array list
        strElem.Remove(strElem.Length - 1, 1);
        strTemp = strElem.ToString();
        alParsedCsv.Add(strTemp);
        strElem = new StringBuilder();
        //goto state 0
        intCurrState = 0;
        break;
    }
}
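
For readers on .NET 2.0 or later, the same state machine can be condensed into a single method that returns a typed list. This is a sketch under that assumption, not the author's code: CsvLineParser and ParseLine are hypothetical names, and the loop runs one extra iteration to supply the end-of-line input (3) instead of calling a separate action method.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class CsvLineParser
{
    // Same transition table as aActionDecider.
    private static readonly int[][] Table =
    {
        new int[] { 2, 0, 1, 5 }, new int[] { 6, 0, 1, 5 },
        new int[] { 4, 3, 3, 6 }, new int[] { 4, 3, 3, 6 },
        new int[] { 2, 8, 6, 7 }, new int[] { 5, 5, 5, 5 },
        new int[] { 6, 6, 6, 6 }, new int[] { 5, 5, 5, 5 },
        new int[] { 0, 0, 0, 0 }
    };

    public static List<string> ParseLine(string line)
    {
        List<string> fields = new List<string>();
        StringBuilder current = new StringBuilder();
        int state = 0;
        // The iteration with i == line.Length supplies the end-of-line input (3).
        for (int i = 0; i <= line.Length; i++)
        {
            bool atEnd = (i == line.Length);
            char c = atEnd ? '\0' : line[i];
            int input = atEnd ? 3 : (c == '"' ? 0 : (c == ',' ? 1 : 2));
            state = Table[state][input];
            switch (state)
            {
                case 0: // comma after an unquoted value: store the value
                    fields.Add(current.ToString());
                    current.Length = 0;
                    break;
                case 1:
                case 3:
                case 4: // accumulate the character
                    current.Append(c);
                    break;
                case 5: // end of line: store the last value
                    fields.Add(current.ToString());
                    break;
                case 6: // erroneous input: reject the whole line
                    return new List<string>();
                case 7:
                case 8: // drop the closing quote, then store the value
                    current.Length--;
                    fields.Add(current.ToString());
                    current.Length = 0;
                    state = (state == 7) ? 5 : 0;
                    break;
            }
        }
        return fields;
    }
}
```

With this sketch, the line a,b,"He said ""Hi""",c yields the four values a, b, He said "Hi" and c, while the malformed line a"b yields an empty list. Returning early in state 6 gives the same result as the article's code, because state 6 is never left once it is entered.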

About The Demo

The demo program reads a CSV file with 4 columns and displays the data in a datagrid. It only shows how to use the parser function and has no other practical use.

Download the zip file and run the Windows application. Select the "temp.csv" file in the "data" folder and click Parse. The datagrid then shows the content of the CSV file.

License

This article, along with any associated source code and files, is licensed under The GNU General Public License (GPL)

About the Author

Mandar Ranjit Date


Member

Occupation: Web Developer
Location: United States

Copyright 2007 by Mandar Ranjit Date