65.9K
CodeProject is changing. Read more.
Home

Simple typesafe CSV Parser

starIconstarIconstarIconstarIcon
emptyStarIcon
starIcon

4.91/5 (11 votes)

Feb 8, 2014

CPOL

1 min read

viewsIcon

21887

downloadIcon

512

Load CSV files as a list of typesafe record entities

Introduction

Doing the bookkeeping and reporting for our app, I had to import a lot of different CSV files, as:

  • Google merchant reports
  • Apple reports
  • Paypal stuff
  • Bank account exports

Creating CSV record entities, reading and converting data and setting the properties of the entities was quite boring work to do. So I wanted a simple solution for operating with the data of a CSV file in a typesafe way.

Using the Code

The idea was to create a class describing a line of a CSV file. With the attribute "CsvField", the name of the CSV file column is bound to the property.

// One record (=line) of the CSV file
public class MyCsvRecord{
    [CsvField("Order ID")]
    public String Orderid{get;set;}
    
    [CsvField("Merchant Order Number")]
    public double Merchantordernumber{get;set;}
    
    [CsvField("Order Creation Date",DateFormat="yyyy-MM-dd")]
    public DateTime Ordercreationdate{get;set;}
 
    // Loads the given CSV file.
    public static List<mycsvrecord> LoadFromFile(String file){
        return CsvParser.ParseCsvAuto<MyCsvrecord>(file, false, ',', Encoding.UTF8);
    }
}

Currently, the following datatypes are supported:

  • String
  • int
  • double
  • DateTime

If DateTime is used, the additional property DateFormat of CsvField should be specified (ParseExact String). For other types, you can specify an IValueConverter:

...
[CsvField("FieldName", ValueConverter=typeof(MyConverter))]
public bool IsEur{get;set;}
...
// Any converter converting a String to the wanted value (as object).
class MyConverter : IValueConverter{
    public object FromString(string value)
    {
        return value.ToUpper() == "EUR";
    }
}

Using the Code Generator

The code also includes a generator (CsvParser.Generator) creating the source for a CSV record class from a given CSV file.

The datatypes of the properties are guessed by investigating the data of the given CSV file.
Details of the used rules for guessing the datatype can be found in:

/// Guesses the type of a value from a given value.
private static String GuessType(String dataValue, out String dateFormat){
    dateFormat = null;
 
    if (dataValue.Trim()=="") return "String";
    int numericCharCount = 0;
    int minusCount = 0;
    int dotCount = 0;
    int commaCount = 0;
    int colonCount = 0;
    int blankCount = 0;
    int noneNumeriCharCount = 0;
 
    foreach(var ch in dataValue){
        if ('0' <= ch && ch <= '9') ++numericCharCount;
        else if (ch=='.') ++dotCount;
        else if (ch==',') ++commaCount;
        else if (ch=='-') ++minusCount;
        else if (ch==':') ++colonCount;
        else if (ch==' ') ++blankCount;
        else{
            ++noneNumeriCharCount;
        }
    }
    // If we have none-numeric chars it is string.
    if (noneNumeriCharCount > 0) return "String";
    // If we only have numeric chars it is integer.
    if ((minusCount + commaCount + dotCount + colonCount) == 0) return "int";
    // If we have one decimal separator it is double.
    if (colonCount==0 && (commaCount+dotCount)==1) return "double";
 
    // If we do not have commas, it could be a Date:
    if (commaCount==0){
        // Date only:
        if (colonCount==0){
            if (minusCount == 2 && dotCount == 0) {
                dateFormat = "yyyy-MM-dd";
                return "DateTime";
            }
            if (minusCount == 0 && dotCount == 2) {
                dateFormat = "dd.MM.yyyy";
                return "DateTime";
            }
        }
        // Date and time: e.g. 2013-06-08 16:03:53
        if (colonCount==2){
            if (minusCount == 2 && dotCount == 0) {
                dateFormat = "yyyy-MM-dd HH:mm:ss";
                return "DateTime";
            }
            if (minusCount == 0 && dotCount == 2) {
                dateFormat = "dd.MM.yyyy HH:mm:ss";
                return "DateTime";
            }
        }
    }
    // Default is string:
    return "String";
}

History

  • 8th February, 2014: Initial version
  • 9th February, 2014: Uploaded a new version of the source adding support for parsing double values with thousands separator
Simple typesafe CSV Parser - CodeProject