Combining DataTables with Different Schema






4.33/5 (3 votes)
There is a problem with accessing Excel files via OleDb which contain more than 255 columns: they are truncated and need to be fetched using ranges. But what do you do with DataTables after that?
Introduction
This came up because there is a problem in QA with OleDb access to Excel files which means that only 255 columns are fetched at a time. That's not difficult to work around - use Ranges to fetch the separate columns into different tables. But ... what then? It's a pain to handle two (or more) separate tables in order to process a single row, and the DataTable.Merge
method combines two tables with the same schema to provide more rows, not more columns.
This tip provides a "combine columns" method to "merge" tables producing the same number of rows, but more columns. Do note that it will fail if the column names overlap - It wouldn't be difficult to add code to get round that, but it would slow the process so I'll leave it as an exercise for the reader.
The Code that Does the Work
/// <summary>
/// Merge a list of tables into a single table with more columns.
/// </summary>
/// <remarks>
/// This isn't the most efficient method, I'm sure - but the code
/// would get seriously incomprehensible to do it all via one
/// single Linq method call. For the sake of readability and
/// maintainability I'll live with the inefficiency.
/// </remarks>
/// <param name="baseTable"></param>
/// <param name="additionalTables"></param>
/// <returns></returns>
public static DataTable MergeTables(DataTable baseTable, params DataTable[] additionalTables)
{
// Build combined table columns
DataTable merged = baseTable;
foreach (DataTable dt in additionalTables)
{
merged = AddTable(merged, dt);
}
return merged;
}
/// <summary>
/// Merge two tables into a single table with more columns.
/// </summary>
/// <param name="baseTable"></param>
/// <param name="additionalTable"></param>
/// <returns></returns>
public static DataTable AddTable(DataTable baseTable, DataTable additionalTable)
{
// Build combined table columns
DataTable merged = baseTable.Clone(); // Include all columns from base table in result.
foreach (DataColumn col in additionalTable.Columns)
{
string newColumnName = col.ColumnName;
merged.Columns.Add(newColumnName, col.DataType);
}
// Add all rows from both tables
var bt = baseTable.AsEnumerable();
var at = additionalTable.AsEnumerable();
var mergedRows = bt.Zip(at, (r1, r2) => r1.ItemArray.Concat(r2.ItemArray).ToArray());
foreach (object[] rowFields in mergedRows)
{
merged.Rows.Add(rowFields);
}
return merged;
}
Only one line is complicated:
var mergedRows = bt.Zip(at, (r1, r2) => r1.ItemArray.Concat(r2.ItemArray).ToArray());
Zip just executes a Lambda on each "pair" of items: a DataRow
from the base table, and a DataRow
from the additional table.
The Lambda just concatenates the two item collections into a single collection and converts it into an array of objects. The array just makes it easier to describe the collection content in the following loop: using a var
there confuses the compiler a bit. (Now you see why I do this the inefficient way!)
Using the Code
It's pretty obvious how to use it - you need a few tables:
ID | Name | |
1 | Mike | Mike@MyDomain.com |
2 | Susan | Susan@MyDomain.com |
City | Country |
London | England |
Cardiff | Wales |
Favourite Colour | Favourite Fruit |
Green | Apple |
Yellow | Banana |
As in:
DataTable dt1 = new DataTable();
dt1.Columns.Add("ID", typeof(int));
dt1.Columns.Add("Name", typeof(string));
dt1.Columns.Add("Email", typeof(string));
dt1.Rows.Add(1, "Mike", "Mike@MyDomain.com");
dt1.Rows.Add(2, "Susan", "Susan@MyDomain.com");
var dt2 = new DataTable();
dt2.Columns.Add("City", typeof(string));
dt2.Columns.Add("Country", typeof(string));
dt2.Rows.Add("London", "England");
dt2.Rows.Add("Cardiff", "Wales");
var dt3 = new DataTable();
dt3.Columns.Add("Favourite Colour", typeof(string));
dt3.Columns.Add("Favourite Fruit", typeof(string));
dt3.Rows.Add("Green", "Apple");
dt3.Rows.Add("Yellow", "Banana");
Then just combine the tables:
DataTable dtMerged = MergeTables(dt1, dt2, dt3);
ID | Name | City | Country | Favourite Colour | Favourite Fruit | |
1 | Mike | Mike@MyDomain.com | London | England | Green | Apple |
2 | Susan | Susan@MyDomain.com | Cardiff | Wales | Yellow | Banana |
History
- 2017-11-21 First version