Introduction
Sometimes there's the need to import and handle sets of data coming from a CVS file. The usual steps are to read each line of the flat file as an array of strings
and map them to the corresponding entity.
Scope
The aim of the article is to provide a strongly typed access to flat files; this is done by associating, with a lambda expression, the property of our entity with
the position on the file. This can be handy specially in these days where newnesses like EF Code First are coming out and there's the need to initialize the table with
more than a few records created manually.
Using the code
The class responsible to parse and cast the file into a list of the desired objects is CsvManager
. Once we define the target type, we can associate
the columns we want to get and their position, and the unmapped fields will be ignored. We can also specify if the first line of the CSV contains the column names
in order to skip it. The SetField
method only supports string properties but I guess it can be easily extended.
Update (26/08/2011)
I've updated the SetField
method, it now supports:
The Solution contains three projects:
- GM.CsvToObj.Lib contains the CsvToObj library responsible for the CSV handling.
- GM.CsvToObj.Models contains a few entities used to run the code sample. It also contains the initializer for Code First and the Entity Framework Context.
- GM.CsvToObj.ConsoleApp is responsible for creating the database and calling the initializer. It also triggers the parsing of the CSVs and populates the database.
CsvManager<Postcode> postcodeManager =
new CsvManager<Postcode>(@"c:\temp\postcodes.csv", true);
postcodeManager.SetField(x => x.Name, 0);
postcodeManager.SetField(x => x.Latitute, 1);
postcodeManager.SetField(x => x.Longitudine, 2);
postcodeManager.SetField(x => x.Easting, 3);
postcodeManager.SetField(x => x.Northing, 4);
postcodeManager.SetField(x => x.Grid, 5);
postcodeManager.SetField(x => x.Area, 6);
postcodeManager.SetField(x => x.Region, 7);
List<Postcode> postcodesResult = postcodeManager.GetObjectList();
Here are the CsvManager
and the CsvReader
:
public class CsvManager<T> where T: new()
{
public char DefaultSeparator { get; set; }
public List<CsvFieldTarget> CsvFieldsToMap { get; set; }
public string FilePath { get; set; }
public CsvReader Reader { get; set; }
public bool IsFirstLineColumnName { get; private set; }
public CsvManager(string filePath, bool isFirstLineColumnName, char separator)
{
CsvFieldsToMap = new List<CsvFieldTarget>();
DefaultSeparator = separator;
IsFirstLineColumnName = isFirstLineColumnName;
FilePath = filePath;
}
public CsvManager(string filePath, bool isFirstLineColumnName):
this(filePath, isFirstLineColumnName,','){}
#region Public methods
public void SetField(Expression<Func<T, string>> expression, int position)
{
GetAndSetFieldTarget(expression, position);
}
public void SetField(Expression<Func<T,ValueType>> expression, int position)
{
GetAndSetFieldTarget(expression, position);
}
public void SetField(Expression<Func<T, DateTime>> expression, int position)
{
GetAndSetFieldTarget(expression, position);
}
public List<T> GetObjectList()
{
Reader = new CsvReader(FilePath, CsvFieldsToMap,
DefaultSeparator, IsFirstLineColumnName);
var csvRows = GetRowsFromFile();
var resultList = new List<T>(csvRows.Count);
foreach (var csvRow in csvRows)
{
var destinationObject = new T();
var createdObj = SetPropertyViaReflection(destinationObject, csvRow);
resultList.Add(createdObj);
}
return resultList;
}
#endregion
#region Private methods
private void GetAndSetFieldTarget(Expression expression, int position)
{
var property = GetMemberInfo(expression);
CsvFieldsToMap.Add(new CsvFieldTarget()
{
FieldName = property.Name,
Position = position
});
}
private List<CsvRow> GetRowsFromFile()
{
var csvRows = Reader.ReadCsvRows();
return csvRows;
}
private T SetPropertyViaReflection(T destinationObj, CsvRow csvRow)
{
Type type = destinationObj.GetType();
foreach (var csvFieldResult in csvRow.CsvFieldsResult)
{
PropertyInfo prop = type.GetProperty(csvFieldResult.FieldName);
var propertyType = prop.PropertyType;
var convertedValue =
Convert.ChangeType(csvFieldResult.FieldValue, propertyType);
prop.SetValue(destinationObj, convertedValue, null);
}
return destinationObj;
}
private static MemberInfo GetMemberInfo(Expression method)
{
var lambda = method as LambdaExpression;
if (lambda == null)
throw new InvalidCastException("Invalid lambda expression");
MemberExpression memberExpression = null;
switch (lambda.Body.NodeType)
{
case ExpressionType.Convert:
memberExpression =
((UnaryExpression) lambda.Body).Operand as MemberExpression;
break;
case ExpressionType.MemberAccess:
memberExpression = lambda.Body as MemberExpression;
break;
}
if (memberExpression == null)
throw new ArgumentException("Invalid expression");
return memberExpression.Member;
}
#endregion
}
public class CsvReader
{
#region Public members
public List<CsvFieldTarget> CsvFieldTargets { get; private set; }
public List<CsvRow> CsvRowsResult { get; private set; }
public char Separator { get; private set; }
public bool IsFirstLineColumnName { get; private set; }
public string FilePath { get; private set; }
#endregion
#region Constructors
public CsvReader(string filePath, List<CsvFieldTarget>
csvFieldTargets, char separator, bool isFirstLineColumnName)
{
FilePath = filePath;
CsvFieldTargets = csvFieldTargets;
Separator = separator;
IsFirstLineColumnName = isFirstLineColumnName;
CsvRowsResult = new List<CsvRow>();
}
#endregion
#region Public methods
public List<CsvRow> ReadCsvRows()
{
CsvRowsResult = IsFirstLineColumnName ?
GetCsvRowIterator().Skip(1).ToList() : GetCsvRowIterator().ToList();
return CsvRowsResult;
}
#endregion
#region Private methods
private IEnumerable<CsvRow> GetCsvRowIterator()
{
using (StreamReader readFile = new StreamReader(FilePath))
{
string line;
string[] row;
while ((line = readFile.ReadLine()) != null)
{
row = line.Split(Separator);
var resultRow = BuildCsvRow(row);
yield return resultRow;
}
}
}
private CsvRow BuildCsvRow(string[] row)
{
var resultRow = new CsvRow(CsvFieldTargets.Count);
foreach (var csvField in CsvFieldTargets)
{
var field = new CsvFieldResult()
{
FieldName = csvField.FieldName,
Position = csvField.Position,
FieldValue = row[csvField.Position]
};
resultRow.CsvFieldsResult.Add(field);
}
return resultRow;
}
#endregion
}
Usage example
The sample attached contains a console application that triggers the creation of three tables with EF Code First that gets then initialized using the CsvToObj library
with two flat files, one for countries and one for the UK postcodes; the result can be seen through Management Studio:

Update (26/08/2011)
The sample file USPresident_Wikipedia.csv has been added. The corresponding class UsPresident
has a few properties as DateTimes and integers.
public class UsPresident
{
[Key]
public int PresidencyId { get; set; }
public string President { get; set; }
public string WikipediaEntry { get; set; }
public DateTime TookOffice { get; set; }
public DateTime LeftOffice { get; set; }
public string Party { get; set; }
public string Portrait { get; set; }
public string Thumbnail { get; set; }
public string HomeState { get; set; }
}
Mapping:
private static List<UsPresident> LoadUsPresidents()
{
CsvManager<UsPresident> presidentsManager =
new CsvManager<UsPresident>(
@"c:\temp\USPresident_Wikipedia.csv", true);
presidentsManager.SetField(x=>x.PresidencyId, 0);
presidentsManager.SetField(x => x.President, 1);
presidentsManager.SetField(x => x.WikipediaEntry, 2);
presidentsManager.SetField(x => x.TookOffice, 3);
presidentsManager.SetField(x => x.LeftOffice, 4);
presidentsManager.SetField(x => x.Party, 5);
presidentsManager.SetField(x => x.Portrait, 6);
presidentsManager.SetField(x => x.Thumbnail, 7);
presidentsManager.SetField(x => x.HomeState, 8);
List<UsPresident> presidentsResult = presidentsManager.GetObjectList();
return presidentsResult;
}
Here's the corresponding table created by EF:

Dependencies
In order to run the sample, you will need the latest version of Entity Framework. This is only needed if you want to use the Code First feature
in the sample: http://www.microsoft.com/download/en/details.aspx?id=8363.
Last thing: create the c:\temp folder and add the CSV sample files included in the sample.