Click here to Skip to main content
15,890,512 members
Articles / Programming Languages / C#

Excel Reader

Rate me:
Please Sign up or sign in to vote.
4.92/5 (109 votes)
20 Feb 2009LGPL32 min read 1.1M   25.8K   357   173
Create, read and modify Excel *.xls files in pure C# without COM interop
Sample Image - ExcelReader.png

Introduction

This library is built based on the following documents, thanks to their authors:

Record structures in BIFF8/BIFF8X format are considered.

Now this project is hosted on Google Code.
You can get the latest code, report issues and submit improvements from there. 

What It Can Do?

  • It can read worksheets in a workbook and read cells in a worksheet.
  • It can read cell content (text, number, datetime, or error) and cell format (font, alignment, linestyle, background, etc.).
  • It can read pictures in the file, get information of image size, position, data, and format.
  • It can create a workbook and save to file.

Using the Code

  1. Open file:
    C#
    Stream fileStream = File.OpenRead(file);
    Workbook book = new Workbook();
    book.Open(fileStream);
    Worksheet sheet = book.Worksheets[0];
  2. Read cell:
    C#
    int row = 1;
    int col = 0;
    string ID = sheet.Cells[row, col].StringValue;
    
    Picture pic = sheet.ExtractPicture(row, col);
  3. Create workbook:
    C#
    string file = "C:\\newdoc.xls";
    Workbook workbook = new Workbook();
    Worksheet worksheet = new Worksheet("First Sheet");
    worksheet.Cells[0, 1] = new Cell(1);
    worksheet.Cells[2, 0] = new Cell(2.8);
    worksheet.Cells[3, 3] = new Cell((decimal)3.45);
    worksheet.Cells[2, 2] = new Cell("Text string");
    worksheet.Cells[2, 4] = new Cell("Second string");
    worksheet.Cells[4, 0] = new Cell(32764.5, "#,###.00");
    worksheet.Cells[5, 1] = new Cell(DateTime.Now, @"YYYY\-MM\-DD");
    worksheet.Cells.ColumnWidth[0, 1] = 3000;
    workbook.Worksheets.Add(worksheet);
    workbook.Save(file);
  4. Traverse worksheet:
    C#
    // traverse cells
    foreach (Pair<Pair<int, int>, Cell> cell in sheet.Cells)
    {
        dgvCells[cell.Left.Right, cell.Left.Left].Value = cell.Right.Value;
    }
    
    // traverse rows by Index
    for (int rowIndex = sheet.Cells.FirstRowIndex;
       rowIndex <= sheet.Cells.LastRowIndex; rowIndex++)
    {
        Row row = sheet.Cells.GetRow(rowIndex);
        for (int colIndex = row.FirstColIndex;
       colIndex <= row.LastColIndex; colIndex++)
        {
            Cell cell = row.GetCell(colIndex);
        }
    }
    

History

  • 2007-5-17
    • Displayed each Sheet in separate tabpage
    • Fixed some bugs in FORMULA.cs and Form1.cs
  • 2007-5-26
    • Decoded FORMULAR result
  • 2008-1-23
    • Changed default Encoding from ASCII to UTF8 in Record.cs ln97 (suggested by ragundo)
    • Return EmptyCell instead of null for non existing cells (requested by amrlafi)
    • Bug fix in Record.cs ln83 and ln133 (proposed by dhirshjr and ilogpasig)
    • Tried to fix bug in CompoundDocument.cs (found by stevenbright)
  • 2008-9-15
    • Create, Open and Modify CompoundDocument
  • 2008-9-22
    • Save modified CompoundDocument
  • 2008-10-26
    • Encoded workbook and worksheet to BIFF8 records
    • Fixed a bug that compressed Unicode string wrongly decoded by UTF8
  • 2008-11-10
    • Fixed XF format so that the created *.xls file can be opened by Microsoft Excel 2003
  • 2008-11-16
    • Assigned number format for number and date time values
    • Set column widths
  • 2009-1-3
    • Updated project source code and article text
    • Many bug fixes
    • Added support to create and modify *.xls files
  • 2009-01-20
    • Fix for getting 'NaN' when reading string value of a formula cell
    • Start a Google Code project
  • 2009-01-23
    • Support for decode cell format and encode predefined cell format
    • Changed namespaces and directory structure
    • Added some test cases
  • 2009-02-12
    • Initial implementation of encode images

License

This article, along with any associated source code and files, is licensed under The GNU Lesser General Public License (LGPLv3)


Written By
Architect YunCheDa Hangzhou
China China
This member has not yet provided a Biography. Assume it's interesting and varied, and probably something to do with programming.

Comments and Discussions

 
GeneralRe: Here's an update for VS2008 (nunit tests coming) Pin
Liu Junfeng24-Feb-09 17:42
Liu Junfeng24-Feb-09 17:42 
GeneralContinuedRecord error Pin
sshenoi17-Feb-09 5:48
sshenoi17-Feb-09 5:48 
GeneralRe: ContinuedRecord error Pin
jetcat_au17-Feb-09 13:30
jetcat_au17-Feb-09 13:30 
GeneralNew Msg in the Google Code... Pin
Guilherme Morais11-Feb-09 8:26
Guilherme Morais11-Feb-09 8:26 
GeneralArgumentOutOfRangeException("sectorID") Pin
bhavana_14in10-Feb-09 23:36
bhavana_14in10-Feb-09 23:36 
GeneralRe: ArgumentOutOfRangeException("sectorID") Pin
Guilherme Morais11-Feb-09 10:19
Guilherme Morais11-Feb-09 10:19 
AnswerRe: ArgumentOutOfRangeException("sectorID") [modified] Pin
won tae,,Jeong28-Mar-09 17:45
won tae,,Jeong28-Mar-09 17:45 
AnswerDate Value/Format Solution [modified] Pin
mars.coder30-Jan-09 10:14
mars.coder30-Jan-09 10:14 
Hi.
Thank you for the library. Keep doing what you are doing,it's the most complete source in .net.
For the complete api in java you can find the code here.

http://www.docjar.com/projects/jexcelapi_2_6_8-code.html[^]

I added some code to solve the date problem. I'll try to list it bellow:

In WorkbookDecoder.cs i stored the number formats found in excel:

private static List<BOUNDSHEET> DecodeRecords(List<Record> records, out SharedResource sharedResource)
        {
            sharedResource = new SharedResource();
            List<BOUNDSHEET> boundSheets = new List<BOUNDSHEET>();
            foreach (Record record in records)
            {
                record.Decode();
                switch (record.Type)
                {
                    case RecordType.BOUNDSHEET:
                        boundSheets.Add(record as BOUNDSHEET);
                        break;
                    case RecordType.FORMAT:
                       FORMAT frm = record as FORMAT;
                       if (!sharedResource.CellFormats.ContainsKey(frm.FormatIndex)) {
                            sharedResource.CellFormats.Add(frm.FormatIndex, frm);
                        }
                        break;
                    case RecordType.XF:
                        sharedResource.ExtendedFormats.Add(record as XF);
                        break;
                    case RecordType.SST:
                        sharedResource.SharedStringTable = record as SST;
                        break;
                    case RecordType.DATEMODE:
                        DATEMODE dateMode = record as DATEMODE;
                        switch (dateMode.Mode)
                        {
                            case 0:
                                sharedResource.BaseDate = DateTime.Parse("1899-12-31");
                                break;
                            case 1:
                                sharedResource.BaseDate = DateTime.Parse("1904-01-01");
                                break;
                        }
                        break;
                    case RecordType.PALETTE:
                        PALETTE palette = record as PALETTE;
                        int colorIndex = 8;
                        foreach (int color in palette.RGBColours)
                        {
                            sharedResource.ColorPalette[colorIndex] = Color.FromArgb(color);
                            colorIndex++;
                        }
                        break;
                }
            }



In WorksheetDecoder.cs i checked if the records have date or number formats and populated the cells accordingly

private static CellCollection PopulateCells(List<Record> records, SharedResource sharedResource)
        {
            CellCollection cells = new CellCollection();
            cells.SharedResource = sharedResource;
            foreach (Record record in records)
            {
                record.Decode();
                switch (record.Type)
                {
                    //case RecordType.DIMENSIONS:
                    //    DIMENSIONS dimensions = record as DIMENSIONS;
                    //    cells.FirstRowIndex = dimensions.FirstRow;
                    //    cells.FirstColIndex = dimensions.FirstColumn;
                    //    cells.LastRowIndex = dimensions.LastRow-1;
                    //    cells.LastColIndex = dimensions.LastColumn-1;
                    //    break;
                    case RecordType.BOOLERR:
                        BOOLERR boolerr = record as BOOLERR;
                        cells.CreateCell(boolerr.RowIndex, boolerr.ColIndex, boolerr.GetValue(), boolerr.XFIndex);
                        break;
                    case RecordType.LABELSST:
                        LABELSST label = record as LABELSST;
                        cells.CreateCell(label.RowIndex, label.ColIndex, sharedResource.GetStringFromSST(label.SSTIndex), label.XFIndex);
                        break;
                    case RecordType.NUMBER:
                        NUMBER number = record as NUMBER;
                        cells.CreateCell(number.RowIndex, number.ColIndex, number.Value, number.XFIndex);
                        break;
                    case RecordType.RK:
                        RK rk = record as RK;
                        if (sharedResource.IsDate(rk.XFIndex)) {
                            cells.CreateCell(rk.RowIndex, rk.ColIndex, sharedResource.DecodeDateTime((double)Record.DecodeRK(rk.Value)), rk.XFIndex);
                        }
                        else {
                            cells.CreateCell(rk.RowIndex, rk.ColIndex, Record.DecodeRK(rk.Value), rk.XFIndex);
                        }
                        break;
                    case RecordType.MULRK:
                        MULRK mulrk = record as MULRK;
                        int row = mulrk.RowIndex;
                        for (int col = mulrk.FirstColIndex; col <= mulrk.LastColIndex; col++)
                        {
                            int index = col - mulrk.FirstColIndex;
                            object value = Record.DecodeRK(mulrk.RKList[index]);
                            int XFindex = mulrk.XFList[index];
                            if (sharedResource.IsDate(XFindex)) {
                                cells.CreateCell(row, col, sharedResource.DecodeDateTime((double)value), XFindex);
                            }
                            else {
                                cells.CreateCell(row, col, value, XFindex);
                            }
                        }
                        break;
                    case RecordType.FORMULA:
                        FORMULA formula = record as FORMULA;
                        cells.CreateCell(formula.RowIndex, formula.ColIndex, formula.DecodeResult(), formula.XFIndex);
                        break;
                }
            }
            return cells;
        }


Added the class FormatType.cs :

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace Office.Excel {
    public class FormatType {
        public static int[] DateFormats = new int[20]   {
         0xe,
         0xf,
         0x10,
         0x11,
         0x12,
         0x13,
         0x14,
         0x15,
         0x16,
         0x2d,
         0x2e,
         0x2f,
         0xbb,
         0xb5,
         0xb6,
         0xba,
         0xbb,
         0xbc,
         0xbd,
         0xbe
        };

          //DateFormat.SHORT,
          //DateFormat.MEDIUM,
          //"d-MMM",
          //"MMM-yy",
          //"h:mm a",
          //"h:mm:ss a",
          //"H:mm",
          //"H:mm:ss",
          //"M/d/yy H:mm",
          //"mm:ss",
          //"H:mm:ss",
          //"mm:ss.S";
          //"dd-MMM-yy"
          // '[$-409]h:mm:ss\\ AM/PM' 
          // '[$-409]dddd\\,\\ mmmm\\ dd\\,\\ yyyy' 
          //  dd/mm/yyyy
          // '[$-409]dd\\-mmm\\-yy;@' 
          // 'mmm\\-yyyy' ,"
          // 'mmm\\-yyyy' ,
          // '[$-409]dd\\-mmm\\-yy;@'



        public static int[] NumberFormats = new int[21] {
         0x1,
         0x2,
         0x3,
         0x4,
         0x5,
         0x6,
         0x7,
         0x8,
         0x9,
         0xa,
         0xb,
         0x25,
         0x26,
         0x27,
         0x28,
         0x29,
         0x2a,
         0x2b,
         0x2c,
         0x30,
         0xb3
        };

          //"0//"
          //"0.00//"
          //"#,##0//"
          //"#,##0.00//"
          //"$#,##0;($#,##0)//"
          //"$#,##0;($#,##0)//"
          //"$#,##0.00;($#,##0.00)//"
          //"$#,##0.00;($#,##0.00)//"
          //"0%//"
          //"0.00%//"
          //"0.00E00//"
          //"#,##0;(#,##0)//"
          //"#,##0;(#,##0)//"
          //"#,##0.00;(#,##0.00)//"
          //"#,##0.00;(#,##0.00)//"
          //"#,##0;(#,##0)//"
          //"$#,##0;($#,##0)//"
          //"#,##0.00;(#,##0.00)//"
          //"$#,##0.00;($#,##0.00)//"
          //"##0.0E0//")};
          // '0.0000%'

    }
}



Added IsDate and IsDouble properies on XF.cs and FORMAT.cs:

public partial class XF : Record
	{
		public XF(Record record) : base(record) {    
        }

		public XF()
		{
			this.Type = RecordType.XF;
		}

        private bool isDate = false;
        public bool IsDate {
            get {
                return isDate;
            }
        }


        private bool isDouble = false;
        public bool IsDouble {
            get {
                return isDouble;
            }
        }

		public UInt16 FontIndex;

		public UInt16 FormatIndex;

		public UInt16 CellProtection;

		public Byte Alignment;

		public Byte Rotation;

		public Byte Indent;

		public Byte Attributes;

		public UInt32 LineStyle;

		public UInt32 LineColor;

		public UInt16 Background;

        public override void Decode() {
            MemoryStream stream = new MemoryStream(Data);
            BinaryReader reader = new BinaryReader(stream);
            this.FontIndex = reader.ReadUInt16();
            this.FormatIndex = reader.ReadUInt16();
            this.CellProtection = reader.ReadUInt16();
            this.Alignment = reader.ReadByte();
            this.Rotation = reader.ReadByte();
            this.Indent = reader.ReadByte();
            this.Attributes = reader.ReadByte();
            this.LineStyle = reader.ReadUInt32();
            this.LineColor = reader.ReadUInt32();
            this.Background = reader.ReadUInt16();

            for (int i = 0; i < FormatType.DateFormats.Length && isDate == false; i++) {
                if (FormatIndex == FormatType.DateFormats[i]) {
                    isDate = true;
                }
            }

            for (int i = 0; i < FormatType.NumberFormats.Length && isDouble == false; i++) {
                if (FormatIndex == FormatType.NumberFormats[i]) {
                    isDouble = true;
                }
            }
        }

		public override void Encode()
		{
			MemoryStream stream = new MemoryStream();
			BinaryWriter writer = new BinaryWriter(stream);
			writer.Write(FontIndex);
			writer.Write(FormatIndex);
			writer.Write(CellProtection);
			writer.Write(Alignment);
			writer.Write(Rotation);
			writer.Write(Indent);
			writer.Write(Attributes);
			writer.Write(LineStyle);
			writer.Write(LineColor);
			writer.Write(Background);
			this.Data = stream.ToArray();
			this.Size = (UInt16)Data.Length;
			base.Encode();
		}

	}


public partial class FORMAT : Record
	{
		public FORMAT(Record record) : base(record) { }

		public FORMAT()
		{
			this.Type = RecordType.FORMAT;
		}

        private bool isDate = false;
        public bool IsDate {
            get {
                return isDate;
            }
        }


        private bool isDouble = false;
        public bool IsDouble {
            get {
                return isDouble;
            }
        }

		/// <summary>
		/// Format index used in other records
		/// </summary>
		public UInt16 FormatIndex;

		/// <summary>
		/// Number format string
		/// </summary>
		public String FormatString;

		public override void Decode()
		{
			MemoryStream stream = new MemoryStream(Data);
			BinaryReader reader = new BinaryReader(stream);
			this.FormatIndex = reader.ReadUInt16();
			this.FormatString = this.ReadString(reader, 16);

            for (int i = 0; i < FormatType.DateFormats.Length && IsDate == false; i++) {
                if (FormatIndex == FormatType.DateFormats[i]) {
                    isDate = true;
                }
            }

            for (int i = 0; i < FormatType.NumberFormats.Length && isDouble == false; i++) {
                if (FormatIndex == FormatType.NumberFormats[i]) {
                    isDouble = true;
                }
            }
		}

		public override void Encode()
		{
			MemoryStream stream = new MemoryStream();
			BinaryWriter writer = new BinaryWriter(stream);
			writer.Write(FormatIndex);
			Record.WriteString(writer, FormatString,16);
			this.Data = stream.ToArray();
			this.Size = (UInt16)Data.Length;
			base.Encode();
		}

	}



Also i had to add some helpers on SharedResource.cs

public Hashtable CellFormats = new Hashtable();
public bool IsDate(int pos) {
           XF xfr = (XF)ExtendedFormats[pos];
           if (xfr.IsDate) {
               return true;
           }

           if (CellFormats.ContainsKey(xfr.FormatIndex)) {
               return ((FORMAT)CellFormats[xfr.FormatIndex]).IsDate;
           }

           return false;        }

public DateTime DecodeDateTime(double value) {
           double days = value;
           if (days > 366) days--;
           return BaseDate.AddDays(days);
       }


Now the cell has the value saved with the excel type string, double or date time so i had to change the getters for some properties. So this is the changed code :

public object Value
       {
           get { return _value; }
           set
           {
               if (IsEmpty) throw new Exception("Can not set value to an empty cell.");
               _value = value;
           }
       }

       public string StringValue
       {
           get
           {
               if (_value == null)
               {
                   return String.Empty;
               }
               if(_value is String)
               {
                   return _value.ToString();
               }
               return string.Empty;
           }
       }

       public DateTime DateTimeValue
       {
           get {
               if (_value == null) {
                   return DateTime.MinValue;
               }
               if (_value is DateTime) {
                   return (DateTime)_value;
               }
               else {
                   return DateTime.MinValue;
               }
           }
       }


And now here is how you can read the excel in a DataTable in the most common scenario :

public class NetExcelDataAdapter {
      private DateTime mindate = new DateTime(1900, 1, 1);
      private Workbook workbook = null;

      public NetExcelDataAdapter(string filename) {
          CompoundDocument doc = null;
          MemoryStream streamdata = null;
          try {
              doc = CompoundDocument.Open(filename);
              byte[] bookdata = doc.GetStreamData("Workbook");
              if (bookdata != null) {
                  streamdata = new MemoryStream(bookdata);
                  workbook = WorkbookDecoder.Decode(streamdata);
              }
          }
          catch (Exception ex) {
              throw new Exception(string.Format("Eroare la incarcarea   fisierului {0} \n\r cu exceptia : {1}"
                 , filename
                 , ex.Message));
          }
          finally {
              doc.Close();

              if (streamdata != null) {
                  streamdata.Close();
                  streamdata.Dispose();
              }
          }
      }

      public DataTable GetExcelContent(int index) {
          if (workbook != null && workbook.Worksheets.Count > index + 1) {
              Worksheet sheet = workbook.Worksheets[index];
              return GetExcelContent(sheet);
          }
          return new DataTable();
      }

      public DataTable GetExcelContent(string sheetName) {
          if (workbook != null) {
              foreach (Worksheet sheet in workbook.Worksheets) {
                  if (sheet.Name == sheetName) {
                      return GetExcelContent(sheet);
                  }
              }
          }
          return new DataTable();
      }

      private DataTable GetExcelContent(Worksheet sheet) {
          DataTable result = new DataTable();

          if (sheet.Cells.LastColIndex > 0 && sheet.Cells.LastRowIndex > 1) {
              result = new DataTable(sheet.Name);
              for (int columnno = 0; columnno <= sheet.Cells.LastColIndex; columnno++) {
                  try {
                      DataColumn column = new DataColumn(sheet.Cells[0, columnno].Value.ToString());

                      if (sheet.Cells[1, columnno].Value is double) {
                          column.DataType = typeof(Double);
                      }
                      else if (sheet.Cells[1, columnno].Value is DateTime) {
                          column.DataType = typeof(DateTime);
                      }
                      else {
                          column.DataType = typeof(String);
                      }

                      result.Columns.Add(column);
                  }
                  catch (Exception ex) {
                      throw new Exception();
                  }
              }

              foreach (Pair<Pair<int, int>, Cell> cell in sheet.Cells) {
                  try {
                      if (cell.Left.Left > 0) {
                          if (result.Rows.Count < cell.Left.Left) {
                              result.Rows.Add(result.NewRow());
                          }
                          if (result.Columns[cell.Left.Right].DataType == typeof(DateTime)) {
                              result.Rows[cell.Left.Left - 1][cell.Left.Right] = cell.Right.DateTimeValue;
                          }
                          else {
                              result.Rows[cell.Left.Left - 1][cell.Left.Right] = cell.Right.Value;
                          }
                      }
                  }
                  catch (Exception ex) {
                      throw new Exception();
                  }
              }
          }

          result.AcceptChanges();
          return result;
      }
  }


The only problem that i had with this till now is that it crashes for more than 20000 records with more than 15 columns. I wish you good luck.
AnswerRe: Date Value/Format Solution Pin
Liu Junfeng3-Feb-09 22:48
Liu Junfeng3-Feb-09 22:48 
GeneralThank U Pin
Watson Jason18-Jan-09 20:07
Watson Jason18-Jan-09 20:07 
QuestionAny plan to move the code into a separate open source project? Pin
jetcat_au18-Jan-09 12:27
jetcat_au18-Jan-09 12:27 
AnswerRe: Any plan to move the code into a separate open source project? Pin
Liu Junfeng20-Jan-09 0:50
Liu Junfeng20-Jan-09 0:50 
GeneralExcel upload error Pin
Member 407264316-Jan-09 19:40
Member 407264316-Jan-09 19:40 
GeneralRe: Excel upload error Pin
bharati Terse9-Mar-09 20:15
bharati Terse9-Mar-09 20:15 
GeneralIssues with writting large excel file Pin
jetcat_au15-Jan-09 20:47
jetcat_au15-Jan-09 20:47 
GeneralRe: Issues with writting large excel file Pin
rcoronel2-Jun-11 21:58
rcoronel2-Jun-11 21:58 
GeneralCell format extension Pin
jetcat_au15-Jan-09 16:59
jetcat_au15-Jan-09 16:59 
GeneralPossible fix for getting 'NaN' is returned when reading string value of a formula field Pin
jetcat_au15-Jan-09 16:45
jetcat_au15-Jan-09 16:45 
GeneralGreat work !!! Pin
Ashutosh Phoujdar13-Jan-09 0:57
Ashutosh Phoujdar13-Jan-09 0:57 
GeneralQuestion about Dates Pin
TMerrifield12-Jan-09 12:28
TMerrifield12-Jan-09 12:28 
GeneralRe: Question about Dates Pin
Liu Junfeng14-Jan-09 0:14
Liu Junfeng14-Jan-09 0:14 
QuestionWhat namespaces are you using? Pin
Jonathan Killen12-Jan-09 8:50
Jonathan Killen12-Jan-09 8:50 
Generalvery good, but some bugs Pin
Huisheng Chen6-Jan-09 18:25
Huisheng Chen6-Jan-09 18:25 
GeneralProblem with Protected Excel Sheets [modified] Pin
Shah Nazar4-Dec-08 9:01
Shah Nazar4-Dec-08 9:01 
Questionproblem in reading Decimal data Pin
bharati Terse2-Nov-08 22:35
bharati Terse2-Nov-08 22:35 

General General    News News    Suggestion Suggestion    Question Question    Bug Bug    Answer Answer    Joke Joke    Praise Praise    Rant Rant    Admin Admin   

Use Ctrl+Left/Right to switch messages, Ctrl+Up/Down to switch threads, Ctrl+Shift+Left/Right to switch pages.