Click here to Skip to main content
Click here to Skip to main content

How to Read and Write .xlsx (Excel 2007) File - Part II

By , 27 Jun 2011
 

Introduction

In Part 1, we discussed about the facts and logic behind the Excel 2007 file. In this article, we will see how we write the Excel 2007 file.

As we have already seen, sharedString.xml and Sheet1.xml now let us explore some other files that are important in Excel 2007 file writing.

1. [ContentType].xml

<?xml version="1.0" encoding="UTF-8" standalone="yes" ?> 
 <Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">
  <Override PartName="/xl/theme/theme1.xml" 
	ContentType="application/vnd.openxmlformats-officedocument.theme+xml" /> 
  <Override PartName="/xl/styles.xml" 
  ContentType="application/vnd.openxmlformats-officedocument.spreadsheetml.styles+xml" /> 
  <Default Extension="rels" 
 	ContentType="application/vnd.openxmlformats-package.relationships+xml" /> 
  <Default Extension="xml" ContentType="application/xml" /> 
  <Override PartName="/xl/workbook.xml" 
   ContentType="application/vnd.openxmlformats-officedocument.spreadsheetml.sheet.main+xml" /> 
  <Override PartName="/docProps/app.xml" 
   ContentType="application/vnd.openxmlformats-officedocument.extended-properties+xml" /> 
  <Override PartName="/xl/worksheets/sheet2.xml" 
   ContentType="application/vnd.openxmlformats-officedocument.spreadsheetml.worksheet+xml" /> 
	<!-- Sheet Information -->
  <Override PartName="/xl/worksheets/sheet3.xml" 
   ContentType="application/vnd.openxmlformats-officedocument.spreadsheetml.worksheet+xml" /> 
	<!-- Sheet Information -->
  <Override PartName="/xl/worksheets/sheet1.xml" 
   ContentType="application/vnd.openxmlformats-officedocument.spreadsheetml.worksheet+xml" /> 
	<!-- Sheet Information -->
  <Override PartName="/xl/sharedStrings.xml" 
   ContentType="application/vnd.openxmlformats-officedocument.spreadsheetml.sharedStrings+xml" /> 
  <Override PartName="/docProps/core.xml" 
   ContentType="application/vnd.openxmlformats-package.core-properties+xml" /> 
  </Types> 

As the name shows, this file contains the content type information of every file. When we write Excel file, the rest of the file would remain the same but the portion which is related with sheet information would be changed because it depends upon the number of sheets that Excel file contains. For example, if the Excel file contains only one sheet, then it contains only one line that would be representing sheet content information.

2. App.xml

 <?xml version="1.0" encoding="UTF-8" standalone="yes" ?> 
  <Properties xmlns="http://schemas.openxmlformats.org/officeDocument/2006/
	extended-properties" 
	xmlns:vt="http://schemas.openxmlformats.org/officeDocument/2006/docPropsVTypes">
  <Application>Microsoft Excel</Application> 
  <DocSecurity>0</DocSecurity> 
  <ScaleCrop>false</ScaleCrop> 
  <HeadingPairs>
  <vt:vector size="2" baseType="variant">
  <vt:variant>
  <vt:lpstr>Worksheets</vt:lpstr> 
  </vt:variant>
  <vt:variant>
  <vt:i4>3</vt:i4> <!-- Total number of Sheets -->
  </vt:variant>
  </vt:vector>
  </HeadingPairs>
  <TitlesOfParts>
  <vt:vector size="3" baseType="lpstr"> <!-- size attribute contains the 
					value of the total number of sheets -->
  <vt:lpstr>Sheet1</vt:lpstr> <!-- Sheet Reference -->
  <vt:lpstr>Sheet2</vt:lpstr> <!-- Sheet Reference -->
  <vt:lpstr>Sheet3</vt:lpstr> <!-- Sheet Reference -->
  </vt:vector>
  </TitlesOfParts>
  <LinksUpToDate>false</LinksUpToDate> 
  <SharedDoc>false</SharedDoc> 
  <HyperlinksChanged>false</HyperlinksChanged> 
  <AppVersion>12.0000</AppVersion> 
  </Properties> 

It is placed at docProp\app.xml, this file contains number and names of the Sheets. As you can see the comments on the above code, when we will write this file we need to change the value of the number of sheets and sheet names according to our Excel file. Suppose we have two Sheets in an Excel file and their names are SampleSheet1 and SampleSheet2, then this file would be like:

<?xml version="1.0" encoding="UTF-8" standalone="yes" ?> 
  <Properties xmlns="http://schemas.openxmlformats.org/officeDocument/
	2006/extended-properties" 
	xmlns:vt="http://schemas.openxmlformats.org/officeDocument/2006/docPropsVTypes">
  <Application>Microsoft Excel</Application> 
  <DocSecurity>0</DocSecurity> 
  <ScaleCrop>false</ScaleCrop> 
  <HeadingPairs>
  <vt:vector size="2" baseType="variant">
  <vt:variant>
  <vt:lpstr>Worksheets</vt:lpstr> 
  </vt:variant>
  <vt:variant>
  <vt:i4>2</vt:i4> 
  </vt:variant>
  </vt:vector>
  </HeadingPairs>
  <TitlesOfParts>
  <vt:vector size="2" baseType="lpstr">
  <vt:lpstr>SampleSheet1</vt:lpstr> 
  <vt:lpstr>SampleSheet2</vt:lpstr> 
  </vt:vector>
  </TitlesOfParts>
  <Company /> 
  <LinksUpToDate>false</LinksUpToDate> 
  <SharedDoc>false</SharedDoc> 
  <HyperlinksChanged>false</HyperlinksChanged> 
  <AppVersion>12.0000</AppVersion> 
  </Properties> 

Here we can see the difference between these two XML files, first the Total number of sheet is changed from 3 to 2, and secondly sheet names are changed.

3. WorkBook.xml.rels

<?xml version="1.0" encoding="UTF-8" standalone="yes" ?> 
  <Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">
  <Relationship Id="rId3" 
  Type=http://schemas.openxmlformats.org/officeDocument/2006/relationships/worksheet 
  Target="worksheets/sheet3.xml" /> 
  <Relationship Id="rId2" 
  Type=http://schemas.openxmlformats.org/officeDocument/2006/relationships/worksheet 
  Target="worksheets/sheet2.xml" /> 
  <Relationship Id="rId1" 
  Type=http://schemas.openxmlformats.org/officeDocument/2006/relationships/worksheet 
  Target="worksheets/sheet1.xml" /> 
  <Relationship Id="rId6" 
  Type=http://schemas.openxmlformats.org/officeDocument/2006/relationships/sharedStrings 
  Target="sharedStrings.xml" /> 
  <Relationship Id="rId5" 
  Type=http://schemas.openxmlformats.org/officeDocument/2006/relationships/styles 
  Target="styles.xml" /> 
  <Relationship Id="rId4" 
  Type=http://schemas.openxmlformats.org/officeDocument/2006/relationships/theme
  Target="theme/theme1.xml" /> 
  </Relationships>   

You can find this file at xl\_rels\workbook.xml.rels. It contains all the relationship information that contains by the WorkBook, and every relationship is assigned an unique Id which set in the value of Id attribute. Let's consider the Excel file which contains two sheets, then how this file would be:

<?xml version="1.0" encoding="UTF-8" standalone="yes" ?> 
 <Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">
  <Relationship Id="rId3" 
  Type=http://schemas.openxmlformats.org/officeDocument/2006/relationships/theme 
  Target="theme/theme1.xml" /> 
  <Relationship Id="rId2" 
  Type=http://schemas.openxmlformats.org/officeDocument/2006/relationships/worksheet 
  Target="worksheets/sheet2.xml" /> 
  <Relationship Id="rId1" 
  Type=http://schemas.openxmlformats.org/officeDocument/2006/relationships/worksheet 
  Target="worksheets/sheet1.xml" /> 
  <Relationship Id="rId5" 
  Type=http://schemas.openxmlformats.org/officeDocument/2006/relationships/sharedStrings 
  Target="sharedStrings.xml" /> 
  <Relationship Id="rId4" 
  Type=http://schemas.openxmlformats.org/officeDocument/2006/relationships/styles 
  Target="styles.xml" /> 
  </Relationships> 

If you compare both the XMLs, the difference you get is that the second XML contains information only for two sheets while the first one contains for three, and of course, the relationship ids are changed. so when we will write this file we should take care of the Id generation and should generate sheet relations according to the number of sheets which Excel file contains.

4. WorkBook.xml

<?xml version="1.0" encoding="UTF-8" standalone="yes" ?> 
  <workbook xmlns=http://schemas.openxmlformats.org/spreadsheetml/2006/main 
	xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships">
  <fileVersion appName="xl" lastEdited="4" lowestEdited="4" rupBuild="4505" /> 
  <workbookPr defaultThemeVersion="124226" /> 
  <bookViews>
  <workbookView xWindow="120" yWindow="105" windowWidth="14295" windowHeight="5130" /> 
  </bookViews>
  <sheets>
  <sheet name="Sheet1" sheetId="1" r:id="rId1" /> 
  <sheet name="Sheet2" sheetId="2" r:id="rId2" /> 
  <sheet name="Sheet3" sheetId="3" r:id="rId3" /> 
  </sheets>
  <calcPr calcId="124519" /> 
  </workbook>  

Workbook contains the information about sheets. This is the file where name of sheets are defined, rest of the files consume these sheet names. This file also contains the mapping between relationships and Sheet names by defining the value of attribute r:Id. We have to be careful when generating this file because we have to map correct relationship ids for the sheets else Excel file would be corrupted.

Logic

So the logic is very simple if we write all those files according to our data properly and compress these files into one file, then it can be read by Excel 2007 engine.

Using the Code

xlsx_basic.jpg

As we know, Excel file is a WorkBook which contains multiple sheets and every sheet contains a collection of rows and columns, while intersection of row and column is called cell.

So to make our code easy to adapt, let's break up our code into three main classes:

  1. WorkBook (that is the root class, like Excel file which contains all the information about sheets and data.)
  2. WorkSheet (this class represents single WorkSheet of Excel file).
  3. Cell (this class represents Cell information of Excel file).

classdiag.jpg

Let's see how we use the code:

Excel2007.WorkBook workBook = new Excel2007.WorkBook();
       
Response.Clear();
Response.BufferOutput = false;

string archiveName = "test.xlsx"; //setting name of Excel File
Response.ContentType = "application/zip";
Response.AddHeader("content-disposition", "filename=" + 
	archiveName); //header information to export as zip.

workBook.WorkSheets.Add("FirstSheet"); //add sheet
workBook.WorkSheets["FirstSheet"].Cells[0, 0].Value = 
	"first value"; //set value on first cell of Sheet
workBook.WorkSheets["FirstSheet"].Cells[0, 1].Value = "second value";
        
workBook.WorkSheets.Add("SecondSheet");
workBook.WorkSheets["SecondSheet"].Cells[0, 0].Value = "third value";

workBook.Save(Response.OutputStream); //writing content of sheet on Response
Response.End();  

So what our code did is, first it took all the information about the data and Sheets and saved it in memory, and when WorkBook.Save method called it wrote all the information to the required XML files and then compress those files via DotNetZip library into a single file.

So this is a very simple library that can write basic Excel 2007 file.

Some More Solutions

For complex Excel writing, you can use different solutions which are available for free. Some are as follows:

  1. ExcelPackage
  2. EPPlus
  3. Open XML

Microsoft also provided System.IO.Packaging library from which you can write Excel file, you can use that library just by adding WindowsBase DLL reference in your project.

History

  • 27th July, 2011: Initial version

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

About the Author

M I developer
Technical Lead
Pakistan Pakistan
Member
<Profile>
<Profession>
I am a Software Engineer from Karachi Pakistan, Being a programmer obviously I love to code but I love to code that adds something new in my knowledge else I do copy paste usually Smile | :)
</Profession>
<Education>
Have done Master and Bachelors of Computer Science from Karachi University Pakistan
</Education>
<Interests>
Anything I found interesting technically or non-technically so nothing specific Wink | ;)
</Interests>
</Profile>

Sign Up to vote   Poor Excellent
Add a reason or comment to your vote: x
Votes of 3 or less require a comment

Comments and Discussions

 
You must Sign In to use this message board.
Search this forum  
    Spacing  Noise  Layout  Per page   
QuestionDetermining whether value is a date? [modified]memberMember 435337130 Apr '13 - 17:58 
Hi M I developer,
 
thanks for the clearly presented example. I was wondering how you test whether a value is a date?
For instance, I know the value in this cell is a date:
-<c r="G4" s="3"><v>41057</v></c>
 
and this is a number:
-<c r="N2" s="5"><v>67.650000000000006</v></c>
 

but I'm unsure how to distinguish it from a number.
 
presumably this has something to do with the s attribute
 
thanks
 
Matt

modified 1 May '13 - 0:07.

GeneralMy vote of 1memberrctaubert3 Sep '12 - 2:27 
Although you article was very informative you code was a waste of time. Your title says Read and Write but does not read an Excel workbook. Your sameples indicate that your code works properly but in fact it does not. It is almost like after having spent so much time writing a good article you couldn't take the time to write code to match. Sorry.
GeneralRe: My vote of 1memberM I developer4 Sep '12 - 20:44 
Well as I said earlier this article and code are only for educational purpose If you want to read and write complex excel files you can use free tools to read and write excel files which I have also mentioned in my article under "SOME MORE SOLUTIONS" Or you can change the code according to your requirements with the help of description that I have written on How to Read and Write .xlsx (Excel 2007) file - Part I[^]
 
Hope it helps
Happy Coding Smile | :)
QuestionDoesn't save non-string valuesmemberrctaubert31 Aug '12 - 6:50 
Your class as downloaded from CodeProject does not save cell values that are anything but string.
 
Your sample of sheet.xml in Part 1 shows that at some time it was able to store non-string values.
 
This sample code is from the sub void writeWorksheetData
 
                        string val = "";
                        if (cell.Value is string)
                        {
                              writer.WriteAttributeString("t", "s");
                              val = sharedStringValues.IndexOf(cell.Value.ToString()).ToString();
                        }
 
                        writer.WriteElementString("v", val.ToString());
 
As you can see if cell.value is not a string "" gets written
 
It also seems that writer.WriteElementString only accepts string values so how do you write an numeric or date value?
 
There doesn't seem to be anywhere else in your class that writes cell values.
 
Do you have a newer version of your source code or a version with a fix for this?
AnswerRe: Doesn't save non-string valuesmemberM I developer2 Sep '12 - 20:31 
No I haven't updated the code i wrote this code for educational purpose only, Feel free to update that code according to your needs
Thanks
QuestionError on the word Responsememberrctaubert30 Aug '12 - 6:08 
M I Developer or anyone,
 
I have tried using your sample code and keep getting errors.   Using the code as is 'Response' shows the error 'Reference to non-shared member requires an object reference."   I am not a C# programmer so have used a C# to VB converter to change the code to VB.Net.
 
I have done many searches and cannot come up with an answer that works.
 
I have tried importing various System.Web configurations.   I have also tried different variations of 'Dim Response as New System.Web.xxx" without finding the right combination.
 
Can you tell me what Imports and Dim I should be using to get this to work.
AnswerRe: Error on the word ResponsememberM I developer30 Aug '12 - 19:39 
The sample code that I have posted is for web applications, where Response is the HttpResponse and I am writing XLSX stream on the HttpResponse output stream. so If you want to write XLSX file on filesystem you only need to Save workbook on FileStream
Hope it helps
GeneralRe: Error on the word Responsememberrctaubert31 Aug '12 - 3:05 
Thank you for your reply.   Your explanation helped very much.   I have it working now.
QuestionCell Formatmembershariflalon22 Jul '12 - 21:18 
How can Excel cell format like border can apply with this library ?
AnswerRe: Cell FormatmemberM I developer22 Jul '12 - 21:25 
well this library is very simple one it doesn't support complex operations, for complex operation you can use different libraries which I have mentioned in my article you can find them under "Some More Solutions"

General General    News News    Suggestion Suggestion    Question Question    Bug Bug    Answer Answer    Joke Joke    Rant Rant    Admin Admin   

Permalink | Advertise | Privacy | Mobile
Web03 | 2.6.130523.1 | Last Updated 27 Jun 2011
Article Copyright 2011 by M I developer
Everything else Copyright © CodeProject, 1999-2013
Terms of Use
Layout: fixed | fluid