5,550,131 members and growing! (19,611 online)
Email Password   helpLost your password?
Enterprise Systems » Office Development » Microsoft Excel     Intermediate License: The Code Project Open License (CPOL)

A Very Easy to Use Excel XML Import-Export Library

By Yogesh Jagota

Import export library for Excel XML Format, which reduces a programmer's work to the bare minimum
C++/CLI, C# (C# 1.0, C# 2.0, C# 3.0, C#), VB, .NET (.NET, .NET 2.0, .NET 3.5, .NET 3.0)

Posted: 15 Jan 2008
Updated: 23 Jul 2008
Views: 53,526
Bookmarked: 177 times
Announcements
Want a new Job?



Search    
Advanced Search
Sitemap
48 votes for this Article.
Popularity: 6.50 Rating: 3.87 out of 5
5 votes, 11.1%
1
2 votes, 4.4%
2
3 votes, 6.7%
3
9 votes, 20.0%
4
26 votes, 57.8%
5
Note: This is an unedited contribution. If this article is inappropriate, needs attention or copies someone else's work without reference then please Report This Article

Code breaking changes in v3.06

Formula system does not work the way it used to, so the previous code might break. For backward compatibility, I have included a static class FormulaHelper which can be used with the previous code. Just replace the previous code in the following way:

cell.Value = new Formula("Sum", new Range(sheet[3, 3], sheet[6, 3]));

to

cell.Value = FormulaHelper.Formula("Sum", new Range(sheet[3, 3], sheet[6, 3]));

Introduction

Just like any other Excel XML export article/library writer, my quest started when I wanted to export a simple report to Excel. I knew that writing a Excel XML library is very easy and I was sure I will find plenty of such libraries on the net. I did found many, but none of them was simple, yet intuitive, powerful and implemented everything I needed. I can take ages to write a background component like a export library, but when using it, I want it to do my work fast and without fuss. The library is compatible with .Net 2.0/3.0/3.5.

So I came with this library which is easy and fast to use, yet is very powerful to be considered to be one of the most powerful XML export libraries present. OK, I will count it as a shameless self promotion.

PS: The XML file format is only supported by Excel versions XP, 2003 and 2007. Previous versions, i.e. Excel 97 and Excel 2000 do not support this feature.

Features

There are a multitude of features which are present in the library. They are...

  • Full import and export of Excel XML files
  • Access cells from anywhere including ExcelXmlWorkbook, Worksheet, Row and Cell
  • Full support for formulae and ranges
  • Imported files's formulae and ranges are parsed to Formula and Range instances
  • Multitude of Insert, Delete and Add functions available for ExcelXmlWorkbook, Worksheet, Row and Cell
  • Auto management of cell references in the book on any change made via any of the Insert, Delete and Add functions
  • Full support for styles in Worksheet, Row and Cell and even Range
  • No instance declaration. All the dirty work is done by the library
  • Cell ContentType, which enables you to know what the cell actually contains
  • Dataset to ExcelXmlWorkbook conversion facility
  • Absolute and Non-absolute Ranges and Named ranges
  • Auto filter
  • Very fast speed as everything is done via XmlWriter
  • Ability to freeze rows or columns or both
  • Almost all print options
  • Almost all style options
  • Many formatting options
  • Hidden columns and rows

Using the Library

Using the code is very easy. This was the primary concern when I was building this library. The primary or top level class is ExcelXmlWorkbook which contains multiple Worksheets. The library resides in Yogesh.Extensions.ExcelXml. The following example shows the various ways of adding cells in a Workbook right from creating a instance.

// Create the instance
ExcelXmlWorkbook book = new ExcelXmlWorkbook();

// Many such properties exist. Details can be found in the documentation
book.Properties.Author = "Yogesh Jagota"; // The author of the document

// This returns the first worksheet.
// Note that we have not declared a instance of a new worksheet
// All the dirty work is done by the library.
Worksheet sheet = book[0];

// Name is the name of the sheet. If not set, the default name
// style is "sheet" + sheet number, like sheet1, sheet2
sheet.Name = "AgewiseOutstanding";

// More on this in documentation
sheet.FreezeTopRows = 3;

// and this too...
sheet.PrintOptions.Orientation = PageOrientation.Landscape;
sheet.PrintOptions.SetMargins(0.5, 0.4, 0.5, 0.4);

// This is the actual code which sets out the cell values
// Note again, that we don't declare any instance at all.
// All the work is done by the library.
// Index operator takes first value as column and second as row.
sheet[0, 0].Value = "Outstanding as on " + DateTime.Now;

sheet[0, 1].Value = "Name of Party";
sheet[1, 1].Value = "RSM";
sheet[2, 1].Value = "ASM";
sheet[3, 1].Value = "0-30";
sheet[4, 1].Value = "31-60";
sheet[5, 1].Value = "61-90";
sheet[6, 1].Value = "91+";

sheet[0, 2].Value = "M/s Stupid Paymaster";
sheet[1, 2].Value = "Mr. Nonsense";
sheet[2, 2].Value = "Mr. More Nonsense";
sheet[3, 2].Value = 0;
sheet[4, 2].Value = 5000;
sheet[5, 2].Value = 45000;
sheet[6, 2].Value = 0;

sheet[0, 3].Value = "M/s Good Paymaster";
sheet[1, 3].Value = "Mr. Good RSM";
sheet[2, 3].Value = "Mr. Good ASM";
sheet[3, 3].Value = 32000;
sheet[4, 3].Value = 0;
sheet[5, 3].Value = 0;
sheet[6, 3].Value = 0;
sheet[7, 3].Value = sheet[6, 3];

string outputFile = "Outstanding File.xml"; // no extension is added if not present

book.Export(outputFile);

Importing a File

To import a file, you can either supply a file name or a Stream object to the static ExcelXmlWorkbook.Import method which returns a ExcelXmlWorkbook instance loaded with the fine. If any error occurs, the function simply returns null and there is no way to find out what error occurred. So the preferred way of import is to pass a Stream because it gives you more control with error management.

Exporting the File

All the code is written to disk only when the Export function is called. Export must be supplied with either a Stream or a File name. If any error occurs, the function simply returns false and there is no way to find out what error occurred. So the preferred way of export is to pass a Stream because it gives you more control with error management.

Assigning Values to Cells

Notice the last assignment in the previous example:

sheet[7, 3].Value = sheet[6, 2];

Here we are actually assigning a cell to a cell. What will be the value of the cell you might wonder? The cell will not have a value at all. It will have a reference to the assigned cell, something like this when you will open the file in excel, =G3. It wont be a absolute reference, more on that later. So, We can assign these values to a cell.

  1. string
  2. bool
  3. All integar types i.e. byte, sbyte, int, uint, long, ulong, float, double, decimal.
  4. DateTime
  5. Cell
  6. Formula [More on this below]

Knowing the type of content a cell contains

Every Cell contains a ContentType readonly field which can be used to check what value type the cell contains. Available values are String, Number, Boolean, DateTime, Formula, UnresolvedValue.

Retrieving Values from Cells

A readonly property GetValue<T> returns the cell value converted to the type supplied. You can use ContentType with GetValue<T> to retrieve the exact value of a cell. GetValue<T> enables strict type checking when retrieving a cell value. Further, if the type supplied with GetValue<T> does not matches the type of the cell type, default(T) is returned instead. For example: if a cell's ContentType == ContentType.Numeric, the only way to retrieve value of the cell is to supply byte, sbyte, int, uint, long, ulong, float, double. If a cell's ContentType == ContentType.String, the only way to retrieve value of the cell is to supply string etc.

Various Ways of Accessing the Cells

There is no hard coded way of accessing a particular cell. There are numerous ways of doing so. For example, the fourth column of the second row in the last example can be set to a value of 1 by...

  1. Directly using the ExcelXmlWorkbook class.
    book[0][3, 1].Value = 1
  2. Using the Worksheet class.
    Worksheet sheet = book[0];
    sheet[3, 1].Value = 1
  3. Using the Row class.
    Worksheet sheet = book[0];
    Row row = sheet[1];
    row[3].Value = 1
  4. Using the Cell class.
    Worksheet sheet = book[0];
    Row row = sheet[1];
    Cell cell = row[3];
    cell.Value = 1

Note that we do not need to declare a instance of a new worksheet, row or cell. All the dirty work is done by the library. This style of coding opens many ways of accessing cells and rows.

Styles

All cells, rows, worksheets have styles which can be set individually. These are Font, Alignment, Interior, Border and DisplayFormat. More information can be found in the documentation about members of the style classes. Changing a worksheet style setting affects all cells in the worksheet. A row setting affects all child cells in the row and a single cell setting affects, well, that very cell. Example:

sheet[1, 3].Font.Bold = true;

All the functionality of a style is implemented in a class XmlStyle. You can create a instance to XmlStyle in your code and assign it to the Style property which is present in all cells, rows and worksheets. Example:

XmlStyle style = new XmlStyle();
style.Font.Bold = true;
sheet[1, 3].Style = style;

Ranges

The main reason of writing my own implementation was ranges, which I found missing or not having the powers which ranges should have. In this library, ranges are very powerful and extendible. Range have all the style elements found in cells, rows and worksheets. Example:
// This sets the text of cells 1-8 of row 3 to bold
Range range = new Range(sheet[0, 2], sheet[7, 2]);
range.Font.Bold = true;

Even this is valid code, although many might recommend of doing it the first way...

new Range(sheet[0, 2], sheet[7, 2]).Font.Bold = true;

Please note that ranges can not be assigned to a cell value. Assigning it will generate a empty cell. Range can contain a single cell or range of cells. In the above example, we are providing the constructor with the first cell and the last cell. Ranges always contain rectangular ranges just like in Excel.

Applying auto filter to ranges

To apply auto filter to a range, you only need to call the range's AutoFilter method and you are done. Example:
new Range(sheet[0, 1], sheet[6, 3]).AutoFilter();

Absolute and Non-absolute Ranges

By default, all ranges output a non-absolute reference. To set up a absolute reference, just set the Absolute property of the range to true.

Range range = new Range(sheet[0 ,2], sheet[7, 2]);
range.Font.Bold = true;
range.Absolute = true;

Functions

Now we come to the real use of ranges and their Absolute property: Adding functions. I think a function in my library can be easily understood by this example which uses the first example in this article.

sheet[7, 3].Value = FormulaHelper.Formula("sum", new Range(sheet[3, 3], sheet[6, 3]));
or
sheet[7, 3].Value = new Formula().Add("sum").StartGroup().Add(new Range(sheet[3, 3], sheet[6, 3])).EndGroup();

When you will open this book in excel, the value of the cell will be =SUM(D4:G4).

Function Parameters

Here we have added one single parameter in the formula constructor. You can add as many parameters as you want using the Add function of the Formula class. Only two types of parameters are allowed though, string or Range. The string parameter type can be used to add any value and named ranges also (Read more about named ranges in documentation). Example:

Formula formula = new Formula().Add("sum").StartGroup();

formula.Add("D4").Operator(',');

// Here I am using the object initializers just to fit the code in one line
// The library is compatible with both VS2005 and VS2008
formula.Add(new Range(sheet[4, 3]) { Absolute = true } ).Operator(',');

formula.Add(new Range(sheet[5, 3], Range(sheet[6, 3])).EndGroup();

sheet[7, 3].Value = formula;

When you will open this book in excel, the value of the cell will be =SUM(D4, $E$4, F4:G4).

Filtering Cells as Parameters by Checking Cell Value or Style

You can also filter all cells and auto add them to the parameter list of a formula by passing a parameter, i.e. a delegate which accepts Cell as its value and returns bool to both Formula constructor or Add. All the values accessors (i.e. Value, IntValue etc.) and cell style can be checked. Examples:

  • // Lets assume column 1,2,3,6 and 7 are bold...
    XmlStyle style = new XmlStyle();
    style.Font.Bold = true;
    
    // VS2008 style
    sheet[7, 3].Value = FormulaHelper.Formula("sum", new Range(sheet[0, 3], sheet[6, 3]), 
            cell => cell.Style == style);
    
    // or VS2005 style
    sheet[7, 3].Value = FormulaHelper.Formula("sum", new Range(sheet[0, 3], sheet[6, 3]), 
            delegate (Cell cell) { return cell.Style == style; } );
  • sheet[7, 3].Value = FormulaHelper.Formula("sum", new Range(sheet[0, 3], sheet[6, 3]), 
                cell => cell.GetValue<int>() > 10000 && cell.GetValue<int>() <= 50000);

In the first example of style, the value of the cell will be =SUM(A4:C4, F4:G4). Continuous ranges matching to true will be joined as one parameter, i.e. A4:C4 and not three parameters, i.e. A4,B4,C4.

Modifying imported excel xml files

Imported excel xml files can be modified directly via direct assignment just like new files. Further to this, there are many functions which allow Insertion, Deletion and Addition of...

  • One or multiple worksheets in books, eg. InsertSheetBefore, InsertSheetAfter
  • One or multiple rows and columns in sheets eg. InsertColumnAfter, InsertColumnsAfter, InsertRowBefore, InsertRowsBefore
  • One or multiple cells in rows eg. InsertCellBefore, InsertCellsBefore

See the documentation for more on these functions.

Exporting a dataset to a ExcelXmlWorksheet

A static member, in ExcelXmlWorksheet, DataSetToWorkbook is provided which converts a dataset and returns a worksheet reference. All the tables are converted into different sheets of the workbook.

Usage:

ExcelXmlWorksheet sheet = ExcelXmlWorksheet.DataSetToWorkbook(sourceDataSet)

Cell Collection

Cell collection is a strongly typed List collection with full support for Linq. You can use the Add method to add Worksheet, Range, Row or Cell. You can add all cells or you can filter the cells using a predicate.

Memory

Looking at all this code might make you think that all the cells, rows, worksheets, ranges must be using too much memory. They must also be having there own separate copy of styles which will cause extra overhead. The answer is no.

I have optimized the library to use as little memory as I could. As far as the styles go, if you have a 100,000 cell workbook written programmatically, which contains only 10 individual styles, the number of styles in memory will be only 11, i.e. 10 separate styles + 1 default style. Although the styles are added on a book level, so if you have 10 books with 10 same styles present in all of them, the number of style instances active in the program will be 110.

Updates

  • [23 Jul 2008] Revision 3.06
    • New Formula system implemented.
    • RowSpan and ColumnSpan properties added to cell.
    • Small error in named range rename fixed.
    • Freeze column won't work if freeze row is set. Fixed.
    • Style was not saved in Column export. Fixed.
    • Assembly file contained wrong information about this library. Fixed.
    • Workbook export can throw a NullReferenceException when new XmlWriter is created. Fixed.
    • Documented IStyle interfaces.
  • [10 Jun 2008] Revision 2.89
    • Added Print Area support.
    • Added TabColor property to Worksheet.
    • Added reference (HRef) support to cell.
    • Added support for custom display formats.
    • Added support for cell patterns.
    • Fixed error where "Long Date", "Short Date", "Time", "@" as DisplayFormat caused a exception.
    • Fixed number format issue where some international formats are not saved properly.
  • [03 Apr 2008] Revision 2.82
    • Fixed single cell merge bug.
  • [19 Mar 2008] Revision 2.81
    • Fixed multi worksheet import bug.
  • [10 Mar 2008] Revision 2.80
    • Added decimal support.
    • Fixed a error where assigning 0 to a cell still caused a exception.
  • [06 Mar 2008] Revision 2.79
    • Added Cell Merge/Unmerge support.
    • Added GetEnumerator support for sheets, rows and ranges.
    • Added CellCollection class.
    • Fixed a error where numeric output of the cell contained global number format where it should only be US only format. Thanks to Reinhard.
    • Added 6 new display format types and removed Custom format type.
    • Added Index property to cell which also has a ExcelColumnIndex property which returns columns in excel format, eg. A, AA, AC, FA.
  • [28 Feb 2008] Revision 2.45
    • Fixed a error where GetValue<T> was not accepting string and DateTime types. Thanks to Karl for pointing this out.
  • [21 Feb 2008] Revision 2.44
    • Fixed a error where assigning 0 to a cell caused a exception. Thanks to Ralf for pointing this out.
  • [20 Feb 2008] Revision 2.43
  • [19 Jan 2008] Revision 1.30
    • Added complete documentation
    • Added dataset export capability
    • Added range auto filter
    • Added print row and column headers
    • Some name changes to remove warnings reported by FxCop. (ExcelXmlWorkBook to ExcelXmlWorkbook, WorkSheet to Worksheet and CellCompareDelegate to CellCompare.
    • CellCompare is now in Yogesh.Extensions.ExcelXml and not in Yogesh.Extensions.ExcelXml.Formula
    • Added cell comments
  • [16 Jan 2008] Revision 1.0
    • First release on The Code Project

Conclusion

I will love to hear your comments and suggestions. Any bugs can be reported here.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

About the Author

Yogesh Jagota



My Blog

Location: India India

Other popular Office Development articles:

Article Top
Sign Up to vote for this article
You must Sign In to use this message board.
FAQ FAQ Noise ToleranceSearch Search Messages 
 Layout  Per page   
 Msgs 1 to 25 of 100 (Total in Forum: 100) (Refresh)FirstPrevNext
Subject  Author Date 
GeneralAnother way to handle Excel export/importmemberZeljkoS5:34 29 Sep '08  
GeneralRe: Another way to handle Excel export/importmemberJasonC726hrs 15mins ago 
QuestionUsing DataValidationmemberforbite12:43 24 Sep '08  
GeneralCould you send me a copy?memberliu_jiangtao72120:58 21 Sep '08  
QuestionVBControlsmemberkuesu9:59 31 Jul '08  
AnswerRe: VBControlsmemberYogesh Jagota9:12 11 Aug '08  
QuestionDateFormat AM/PMmemberlklinger61010:32 30 Jul '08  
AnswerRe: DateFormat AM/PMmemberYogesh Jagota9:14 11 Aug '08  
QuestionImporting Excel FilememberPhillip Martin14:49 19 Jul '08  
AnswerRe: Importing Excel FilememberYogesh Jagota23:20 22 Jul '08  
GeneralRe: Importing Excel Filememberygeordy13:30 9 Aug '08  
GeneralRe: Importing Excel FilememberYogesh Jagota9:11 11 Aug '08  
GeneralRe: Importing Excel FilememberYogesh Jagota4:14 12 Aug '08  
GeneralCustomFormatStringmemberSeishin#4:13 15 Jul '08  
GeneralRe: CustomFormatStringmemberSeishin#2:57 17 Jul '08  
GeneralRe: CustomFormatStringmemberYogesh Jagota23:24 22 Jul '08  
GeneralYogesh.Extensions source codememberMember 4868534:23 24 Jun '08  
GeneralRe: Yogesh.Extensions source codememberYogesh Jagota23:19 22 Jul '08  
GeneralPosting source code in a weekmemberYogesh Jagota11:59 6 Jun '08  
GeneralRe: Posting source code in a weekmemberYogesh Jagota22:47 13 Jun '08  
GeneralError when ExcelXmlWorkbook.Importmemberxwandragon785:24 6 Jun '08  
GeneralRe: Error when ExcelXmlWorkbook.ImportmemberYogesh Jagota6:28 10 Jun '08  
GeneralUseful utility.memberdnpro1:25 3 Jun '08