Posted 27 Sep 2014
Licensed CPOL

C# - Light and Fast CSV Parser

Yuriy Magurdumov, 27 Sep 2014
A light yet functional CSV parser with custom delimiters and qualifiers that returns records lazily via yield return.

Introduction

Parsing CSV files may sound like an easy task, but quoted delimiters, escaped quotes and mixed line endings make it less trivial than it looks. Below is a CsvParser class implementation that I use in my own projects. It supports the following features that I find critical:

  • Custom delimiter and qualifier characters
  • Quoting notation (a quoted value may contain the delimiter character)
  • Quote escaping (a doubled qualifier inside a quoted value yields one qualifier character)
  • Both '\n' and '\r\n' line endings (a lone '\r' is not treated as a line break)
  • Returns IEnumerable via yield return, so records are produced one at a time instead of buffering the whole input
  • Can return the header and the remaining lines separately (using Tuple)
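
To make the quoting and escaping rules above concrete, here is a minimal sketch of the reverse direction: writing a single field in the notation the parser reads. CsvQuoting and QuoteField are hypothetical names used only for illustration and are not part of CsvParser. The sketch also quotes values that start with whitespace, on the assumption that unquoted leading whitespace is dropped (as the parser below does).

```csharp
using System;

public static class CsvQuoting
{
    // Hypothetical helper, not part of CsvParser: formats one field so that
    // delimiters, qualifiers, line breaks and leading whitespace survive parsing.
    public static string QuoteField(string value, char delimiter, char qualifier)
    {
        if (value == null)
            throw new ArgumentNullException("value");

        var needsQuotes = value.IndexOf(delimiter) >= 0
                       || value.IndexOf(qualifier) >= 0
                       || value.IndexOf('\n') >= 0
                       || value.IndexOf('\r') >= 0
                       || (value.Length > 0 && char.IsWhiteSpace(value[0]));
        if (!needsQuotes)
            return value;

        // Escape each qualifier by doubling it, then wrap the value in qualifiers.
        var q = qualifier.ToString();
        return q + value.Replace(q, q + q) + q;
    }
}
```

For example, with ',' as the delimiter and '"' as the qualifier, the value a,b becomes "a,b", and the value say "hi" becomes "say ""hi""".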

Source Code

using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class CsvParser
{
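    // Splits a sequence into its first element and a lazy sequence of the rest.
    // For an empty source the head is default(T) (null for reference types) and the tail is empty.
    private static Tuple<T, IEnumerable<T>> HeadAndTail<T>(this IEnumerable<T> source)
    {
        if (source == null)
            throw new ArgumentNullException("source");
        var en = source.GetEnumerator();
        var head = en.MoveNext() ? en.Current : default(T);
        return Tuple.Create(head, EnumerateTail(en));
    }

    // Yields the remaining elements and disposes the enumerator once enumeration ends.
    // The tail can be enumerated only once.
    private static IEnumerable<T> EnumerateTail<T>(IEnumerator<T> en)
    {
        using (en)
        {
            while (en.MoveNext()) yield return en.Current;
        }
    }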

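    public static IEnumerable<IList<string>> Parse(string content, char delimiter, char qualifier)
    {
        // Enumerate inside the using block: returning the lazy sequence directly
        // would dispose the reader before the first record is read.
        using (var reader = new StringReader(content))
            foreach (var record in Parse(reader, delimiter, qualifier))
                yield return record;
    }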

    public static Tuple<IList<string>, IEnumerable<IList<string>>> ParseHeadAndTail(TextReader reader, char delimiter, char qualifier)
    {
        return HeadAndTail(Parse(reader, delimiter, qualifier));
    }

    public static IEnumerable<IList<string>> Parse(TextReader reader, char delimiter, char qualifier)
    {
        var inQuote = false;
        var record = new List<string>();
        var sb = new StringBuilder();

        while (reader.Peek() != -1)
        {
            var readChar = (char) reader.Read();

            if (readChar == '\n' || (readChar == '\r' && (char) reader.Peek() == '\n'))
            {
                // If it's a \r\n combo consume the \n part and throw it away.
                if (readChar == '\r')
                    reader.Read();

                if (inQuote)
                {
                    if (readChar == '\r')
                        sb.Append('\r');
                    sb.Append('\n');
                }
                else
                {
                    if (record.Count > 0 || sb.Length > 0)
                    {
                        record.Add(sb.ToString());
                        sb.Clear();
                    }

                    if (record.Count > 0)
                        yield return record;

                    record = new List<string>(record.Count);
                }
            }
            else if (sb.Length == 0 && !inQuote)
            {
                if (readChar == qualifier)
                    inQuote = true;
                else if (readChar == delimiter)
                {
                    record.Add(sb.ToString());
                    sb.Clear();
                }
                else if (char.IsWhiteSpace(readChar))
                {
                    // Ignore leading whitespace
                }
                else
                    sb.Append(readChar);
            }
            else if (readChar == delimiter)
            {
                if (inQuote)
                    sb.Append(delimiter);
                else
                {
                    record.Add(sb.ToString());
                    sb.Clear();
                }
            }
            else if (readChar == qualifier)
            {
                if (inQuote)
                {
                    if ((char) reader.Peek() == qualifier)
                    {
                        reader.Read();
                        sb.Append(qualifier);
                    }
                    else
                        inQuote = false;
                }
                else
                    sb.Append(readChar);
            }
            else
                sb.Append(readChar);
        }

        if (record.Count > 0 || sb.Length > 0)
            record.Add(sb.ToString());

        if (record.Count > 0)
            yield return record;
    }
}
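
Because Parse is written as an iterator, nothing is read from the TextReader until the result is enumerated, and only as much as the consumer asks for. The sketch below illustrates that behaviour with a simplified stand-in that yields whole lines; LazyDemo, ReadLines and TakeAndRest are hypothetical names for this illustration, not part of CsvParser.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class LazyDemo
{
    // Simplified stand-in for an iterator-based parser: yields one line per step.
    public static IEnumerable<string> ReadLines(TextReader reader)
    {
        string line;
        while ((line = reader.ReadLine()) != null)
            yield return line;
    }

    // Takes the first `count` lines, then returns them together with
    // whatever the reader has not consumed yet.
    public static Tuple<List<string>, string> TakeAndRest(string content, int count)
    {
        // The reader has to stay open for as long as the sequence is enumerated,
        // so the enumeration happens inside the using block.
        using (var reader = new StringReader(content))
        {
            var first = ReadLines(reader).Take(count).ToList();
            return Tuple.Create(first, reader.ReadToEnd());
        }
    }
}
```

Calling LazyDemo.TakeAndRest("a\nb\nc\nd", 2) yields the lines a and b and leaves "c\nd" unread, which shows that the iterator stopped as soon as the consumer did. The same reasoning explains why a reader passed to Parse should not be disposed until the records have been consumed.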

Using the Code

Here is an example of reading a CSV file. The following snippet parses the first five records and prints them to the Console as key/value pairs:

// Requires: using System; using System.IO; using System.Linq;
const string fileName = @"C:\Temp\file.csv";
using (var stream = File.OpenRead(fileName))
using (var reader = new StreamReader(stream))
{
    var data = CsvParser.ParseHeadAndTail(reader, ',', '"');

    var header = data.Item1;
    var lines = data.Item2;

    foreach (var line in lines.Take(5))
    {
        // A record may have fewer fields than the header, so bound the index by both.
        for (var i = 0; i < header.Count && i < line.Count; i++)
            if (!string.IsNullOrEmpty(line[i]))
                Console.WriteLine("{0}={1}", header[i], line[i]);
        Console.WriteLine();
    }
}
Console.ReadLine();
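
If key/value access is needed beyond printing, the header and the records can be paired into dictionaries. This is a minimal sketch under stated assumptions: CsvRecords and ToDictionaries are hypothetical names, not part of CsvParser, and the choice to ignore extra fields and let a repeated column name overwrite the earlier value is only one possible policy.

```csharp
using System;
using System.Collections.Generic;

public static class CsvRecords
{
    // Hypothetical helper: lazily pairs each record with the header names.
    public static IEnumerable<Dictionary<string, string>> ToDictionaries(
        IList<string> header, IEnumerable<IList<string>> lines)
    {
        if (header == null)
            throw new ArgumentNullException("header");
        if (lines == null)
            throw new ArgumentNullException("lines");
        return Iterate(header, lines);
    }

    private static IEnumerable<Dictionary<string, string>> Iterate(
        IList<string> header, IEnumerable<IList<string>> lines)
    {
        foreach (var line in lines)
        {
            var row = new Dictionary<string, string>();
            // Stop at the shorter of the two so short records do not throw;
            // a repeated column name overwrites the earlier value.
            for (var i = 0; i < header.Count && i < line.Count; i++)
                row[header[i]] = line[i];
            yield return row;
        }
    }
}
```

With the data tuple from the snippet above, CsvRecords.ToDictionaries(data.Item1, data.Item2) would produce one dictionary per record, still one at a time.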

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

About the Author

Yuriy Magurdumov
Architect
United States

Article Copyright 2014 by Yuriy Magurdumov