Click here to Skip to main content
11,495,758 members (60,982 online)
Click here to Skip to main content

Tagged as

Simple typesafe CSV Parser

, 8 Feb 2014 CPOL 8.4K 311 23
Rate this:
Please Sign up or sign in to vote.
Load CSV files as a list of typesafe record entities

Introduction

Doing the bookkeeping and reporting for our app, I had to import a lot of different CSV files, as:

  • Google merchant reports
  • Apple reports
  • Paypal stuff
  • Bank account exports

Creating CSV record entities, reading and converting data and setting the properties of the entities was quite boring work to do. So I wanted a simple solution for operating with the data of a CSV file in a typesafe way.

Using the Code

The idea was to create a class describing a line of a CSV file. With the attribute "CsvField", the name of the CSV file column is bound to the property.

// One record (=line) of the CSV file
public class MyCsvRecord{
    [CsvField("Order ID")]
    public String Orderid{get;set;}
    
    [CsvField("Merchant Order Number")]
    public double Merchantordernumber{get;set;}
    
    [CsvField("Order Creation Date",DateFormat="yyyy-MM-dd")]
    public DateTime Ordercreationdate{get;set;}
 
    // Loads the given CSV file.
    public static List<mycsvrecord> LoadFromFile(String file){
        return CsvParser.ParseCsvAuto<MyCsvrecord>(file, false, ',', Encoding.UTF8);
    }
}

Currently, the following datatypes are supported:

  • String
  • int
  • double
  • DateTime

If DateTime is used, the additional property DateFormat of CsvField should be specified (ParseExact String). For other types, you can specify an IValueConverter:

...
[CsvField("FieldName", ValueConverter=typeof(MyConverter))]
public bool IsEur{get;set;}
...
// Any converter converting a String to the wanted value (as object).
class MyConverter : IValueConverter{
    public object FromString(string value)
    {
        return value.ToUpper() == "EUR";
    }
}

Using the Code Generator

The code also includes a generator (CsvParser.Generator) creating the source for a CSV record class from a given CSV file.

The datatypes of the properties are guessed by investigating the data of the given CSV file.
Details of the used rules for guessing the datatype can be found in:

/// Guesses the type of a value from a given value.
private static String GuessType(String dataValue, out String dateFormat){
    dateFormat = null;
 
    if (dataValue.Trim()=="") return "String";
    int numericCharCount = 0;
    int minusCount = 0;
    int dotCount = 0;
    int commaCount = 0;
    int colonCount = 0;
    int blankCount = 0;
    int noneNumeriCharCount = 0;
 
    foreach(var ch in dataValue){
        if ('0' <= ch && ch <= '9') ++numericCharCount;
        else if (ch=='.') ++dotCount;
        else if (ch==',') ++commaCount;
        else if (ch=='-') ++minusCount;
        else if (ch==':') ++colonCount;
        else if (ch==' ') ++blankCount;
        else{
            ++noneNumeriCharCount;
        }
    }
    // If we have none-numeric chars it is string.
    if (noneNumeriCharCount > 0) return "String";
    // If we only have numeric chars it is integer.
    if ((minusCount + commaCount + dotCount + colonCount) == 0) return "int";
    // If we have one decimal separator it is double.
    if (colonCount==0 && (commaCount+dotCount)==1) return "double";
 
    // If we do not have commas, it could be a Date:
    if (commaCount==0){
        // Date only:
        if (colonCount==0){
            if (minusCount == 2 && dotCount == 0) {
                dateFormat = "yyyy-MM-dd";
                return "DateTime";
            }
            if (minusCount == 0 && dotCount == 2) {
                dateFormat = "dd.MM.yyyy";
                return "DateTime";
            }
        }
        // Date and time: e.g. 2013-06-08 16:03:53
        if (colonCount==2){
            if (minusCount == 2 && dotCount == 0) {
                dateFormat = "yyyy-MM-dd HH:mm:ss";
                return "DateTime";
            }
            if (minusCount == 0 && dotCount == 2) {
                dateFormat = "dd.MM.yyyy HH:mm:ss";
                return "DateTime";
            }
        }
    }
    // Default is string:
    return "String";
}

History

  • 8th February, 2014: Initial version
  • 9th February, 2014: Uploaded a new version of the source adding support for parsing double values with thousands separator

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Share

About the Author

Michael Kurz, ape@map
Software Developer Onyx Technologie OG
Austria Austria
No Biography provided

Comments and Discussions

 
QuestionMy vote of 5 Pin
Clemens Lackner25-Feb-14 9:14
memberClemens Lackner25-Feb-14 9:14 
GeneralEmpty string at end of input lines Pin
Alex Strickland18-Feb-14 12:21
memberAlex Strickland18-Feb-14 12:21 
QuestionMy vote of 5 Pin
Steven M Hunt10-Feb-14 11:52
memberSteven M Hunt10-Feb-14 11:52 
GeneralMy vote of 5 Pin
johannesnestler10-Feb-14 1:44
memberjohannesnestler10-Feb-14 1:44 
GeneralMy vote of 5 Pin
Wooters9-Feb-14 6:45
memberWooters9-Feb-14 6:45 

General General    News News    Suggestion Suggestion    Question Question    Bug Bug    Answer Answer    Joke Joke    Rant Rant    Admin Admin   

Use Ctrl+Left/Right to switch messages, Ctrl+Up/Down to switch threads, Ctrl+Shift+Left/Right to switch pages.

| Advertise | Privacy | Terms of Use | Mobile
Web01 | 2.8.150520.1 | Last Updated 8 Feb 2014
Article Copyright 2014 by Michael Kurz, ape@map
Everything else Copyright © CodeProject, 1999-2015
Layout: fixed | fluid