Click here to Skip to main content
Click here to Skip to main content

Yet another way to Generate Excel documents programmatically

, 13 Dec 2005
Rate this:
Please Sign up or sign in to vote.
XML documents can represent Microsoft Excel spreadsheets. This article provides a working example of a relatively low-tech C# program generating a properly formatted XML document that Microsoft Excel loads and views as one would expect.

Introduction

I needed to be able to generate a simple report showing data from a collection of objects. My initial thought was to generate a simple .txt file and view it in Notepad. However, this seemed very boring and some of my users would want to do some analysis on the data (e.g. averages, sub-totals and so forth). Of course, the natural place for someone to do that kind of work is Microsoft Excel.

I did a bit of searching for a very simple low-tech Microsoft Excel interface example. I didn't try very hard, since in fact, I wanted to figure this out myself anyway.

I knew that one can save Excel spreadsheets in a variety of formats, so I pulled up Excel, created the spreadsheet I wanted, including formatting and multiple data types and saved it as an XML document. I then pulled that up into my favorite text editor and started to reverse engineer it.

As sharp-eyed codeproject.com readers know, this subject has been thoroughly discussed already. I had already written this article before it occurred to me to check codeproject.com for Excel-related articles. There are many of them. However, almost all of them involve a more or less direct interface with Excel (typically by adding a reference to your VS project, instantiating Excel, etc). The approach described in this article provides two advantages:

  1. It's lower tech, and
  2. This approach does not require that Microsoft office be installed / available to the C# program that generates the XML. In some cases, this is a significant advantage. A reader kindly pointed this out to me and I am grateful.

As a final note, all we are doing here is generating XML. C# is not required to do that, so this is applicable to VB and any other language that is capable of generating XML.

Background

Several MS office products allow you to save the associated document to an XML file instead of the usual binary format (e.g. .doc, .xls). The XML that Office 2003 generates is somewhat complicated but still quite readable. To save an Excel file as .xml, go to File, Save-As and then change the output format to "XML Spreadsheet".

Microsoft Office identifies a particular XML file as being an "XML Spreadsheet" via the following:

<?xml version="1.0"?>
<?mso-application progid="Excel.Sheet"?> 

The first line shows the usual xml version information and the second line causes Office to open Excel to process the XML.

The XML itself is broken up into several major sections:

  • Workbook: Root node of the XML, parent to all the other sections.
  • DocumentProperties: Most of the importation one sees when accessing File -> Properties is defined here.
  • Styles: Formatting information defined here is available to different rows, columns and specific cells in the worksheets.
  • Worksheet(s): As many worksheets as you want are defined here.

Worksheets consist of two major sections:

  • Table: This is where all visible data in the spreadsheet is stored.
  • WorksheetOptions: Global options for the worksheet itself.

Finally, the Table section defines two components:

  • Column
  • Row / Cell

With this information in hand, one can plan out how to create the XML document that will then load into Microsoft Excel and look, feel and operate like a document created by a sophisticated, loving user.

Using the code

The code for this project does the following:

  1. Creates a set of objects with data that we want to appear in the spreadsheet. This sample data is modeled after a simple price list.
  2. Generates properly formatted XML that one can then open with Excel.
  3. Launches Excel.

The demo includes a simple menu screen with a link label. Upon clicking the link, it instantiates an ExcelReport object.

The constructor of the ExcelReport object instantiates a handful of SampleData objects.

The simple menu object then invokes the GenerateSelf() method on the ExcelReport object which saves the Excel XML to a hard-coded file name, c:\temp\prices.xml

Once generated, the simple menu program launches Excel.

Points of Interest

  • The technique here has been successfully used to generate far more complicated Excel documents, including multi-sheet documents with relatively complicated graphs.
  • Though I have never done so, Microsoft Word can also save documents in an XML format and this same technique can be used to create Word XML documents with all the Microsoft Word formatting as well.
  • Excel does present one significant problem with documents created this way. Happily, it's quite easy to work around. It is also very strange.
    In the generated document, you will notice in the Table section, a line like this:
    <Cell><Data ss:Type="Number">25</Data></Cell>
    

    Though it's not necessary, my preference when generating XML is to split nodes out on their own lines. I find it's easier to read in a text editor. So, I would prefer this:

    <Cell>
      <Data ss:Type="Number">25</Data>
    </Cell>
    

    As far as I'm concerned, that's valid XML. However Excel 2003 complains when it opens up that XML document with a message in a dialog box, "Problems During Load". It complains in a bit more detail, "Problems came up in the following areas during load:" and lists the "Table" area.

    It then kindly tells you that the errors are listed in a deep-down directory:
    "This file cannot be opened because of errors. Errors are listed in: C:\Documents and Settings\pagalvin.DFKBZ221\Local Settings\Temporary Internet Files\Content.MSO\D74388D6.log"

    Sadly, in a Win2K environment, you can't copy/paste that file name, so it's a bit tricky to find. (A reader has pointed out that you can do so in WinXP). Worse, it's not so helpful once you do find it:

    XML PARSE ERROR: Extraneous end-tag
    Error occurs at or below this element stack:
    <ss:Workbook>
      <ss:Worksheet>
        <ss:Table>
          <ss:Row>
            <ss:cell>
    

    Despite that quirk, it's easy to work around. Keep it all on one line.

History

  • 12/13/2005: Created
  • 12/14/2005: Added some comments in response to user feedback
    1. Pointed out that Excel need not be present where this program runs. This makes this solution useful on a server that does not have Microsoft Office installed on it.
    2. Fixed a typo in the XML where I describe the problem in the Table section.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.

A list of licenses authors might use can be found here

Share

About the Author

Paul Galvin
Web Developer
United States United States
Partner at Integrated Systems and Services group (www.issgroup.net).
 
Graduated Lafayette Collage (Easton, PA) with B.S. C.S. in 1991.
 
Married with one son.
 
Living in the great state of New Jersey.
 

Comments and Discussions

 
GeneralGood work! Pinmemberrosnay25-Jun-06 14:49 

General General    News News    Suggestion Suggestion    Question Question    Bug Bug    Answer Answer    Joke Joke    Rant Rant    Admin Admin   

Use Ctrl+Left/Right to switch messages, Ctrl+Up/Down to switch threads, Ctrl+Shift+Left/Right to switch pages.

| Advertise | Privacy | Terms of Use | Mobile
Web01 | 2.8.141220.1 | Last Updated 13 Dec 2005
Article Copyright 2005 by Paul Galvin
Everything else Copyright © CodeProject, 1999-2014
Layout: fixed | fluid