We often need to convert Excel data to an XML stream or XML file that can be fed to applications such as web services or middle tiers like BizTalk 2004. In many situations we also need to validate the format of an Excel data sheet against a specified XML schema, or to generate an XML schema from an Excel worksheet. This utility, along with its library, helps you accomplish these tasks.
The following are the salient features of this library:
- Usage of Microsoft Jet Engine to connect to Excel.
- Conversion of Excel Worksheet/Workbook to XML file and XML Schema.
- Generation of XML file and XML Schema based on provided range.
- Validation of Excel Worksheet/Workbook against the provided XML Schema.
- Provision of batch processing capability.
In this article, we will discuss the implementation of the library functions. The library contains the core functionality to access and manipulate Excel data.
The utility merely calls the appropriate functions from the library, so the same library can also be used in ASP.NET applications with minor changes.
There are two ways to manipulate an Excel file: through the Microsoft Office components or through the Microsoft Jet Engine.
Microsoft recommends against using Office components on a server, so they are a poor choice if you want to use this library in a server application. The library therefore connects to Excel through the Jet Engine.
Connection to Excel using Jet Engine
To connect to Excel, one can use OleDb objects, which treat the Excel file as a database; the required information can then be fetched with SQL queries. The important points to consider while connecting to Excel are as follows:
- Connection String:
The connection string must be set on the OleDbConnection object. This is critical, as the Jet Engine might not give a proper error message if the appropriate details are not supplied.
Provider=Microsoft.Jet.OLEDB.4.0;Data Source=<Full Path of Excel File>; Extended Properties="Excel 8.0; HDR=No; IMEX=1"
- Definition of Extended Properties:
Excel = <No>
This specifies the version of the Excel sheet. For Excel 97 and above, set it to Excel 8.0; for earlier versions, use Excel 5.0.
HDR = <Yes/No>
This property specifies whether the first row holds the column headings. If the value is ‘Yes’, the first row is treated as the heading row. Otherwise, the headings are generated by the system as F1, F2 and so on.
IMEX refers to IMport EXport mode. It can take three possible values.
IMEX=2 results in ImportMixedTypes being ignored, and the default value of ‘Majority Types’ is used. In this case, the first 8 rows are examined and the data type of each column is decided from them.
IMEX=1 is the only way to set the value of ImportMixedTypes to Text. Here, everything is treated as text.
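The ‘Majority Types’ behaviour described above can be pictured with a small simulation. This is only a simplified sketch in Python, not the Jet Engine's actual algorithm: the function name guess_column_type is hypothetical, only numbers and text are distinguished, and the tie-breaking rule (numbers win) is an assumption.

```python
def guess_column_type(values, sample_rows=8):
    """Simplified model of majority-type guessing over the first rows.

    Hypothetical helper: only the leading `sample_rows` values are
    inspected, and the type that occurs most often among them is
    reported for the whole column.
    """
    # Empty cells (None) carry no type information and are skipped.
    sample = [v for v in values[:sample_rows] if v is not None]
    numeric = sum(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in sample
    )
    text = len(sample) - numeric
    # Assumption: a tie between numbers and text is resolved as "number".
    return "number" if numeric > 0 and numeric >= text else "text"


# Seven numbers and one string in the first 8 rows: the column is judged
# numeric, and the strings in rows 9 and 10 are never looked at.
column = [1, 2, 3, 4, 5, "x", 7, 8, "abc", "def"]
print(guess_column_type(column))
```

The point of the sketch is that values beyond the sampled rows have no influence on the decision, which is why a column that is mostly numeric at the top can mishandle text further down unless IMEX=1 is used.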
More information on Extended Properties is available in Microsoft's documentation for the Jet OLE DB provider.
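To make the pieces of the connection string concrete, here is a minimal sketch that assembles it from the properties discussed above. It is written in Python purely as an illustration of the string layout; the function name build_excel_connection_string and the example path are hypothetical, and the library itself would pass the resulting string to an OleDbConnection.

```python
def build_excel_connection_string(path, excel_version="Excel 8.0",
                                  hdr=False, imex=1):
    """Assemble a Jet OLE DB connection string for an Excel file.

    Hypothetical helper: `path` is the full path of the workbook,
    `hdr` says whether the first row holds column headings, and
    `imex` is the import/export mode (0, 1 or 2).
    """
    if imex not in (0, 1, 2):
        raise ValueError("IMEX must be 0, 1 or 2")
    # The extended properties travel as one quoted value.
    extended = f"{excel_version};HDR={'Yes' if hdr else 'No'};IMEX={imex}"
    return (
        "Provider=Microsoft.Jet.OLEDB.4.0;"
        f"Data Source={path};"
        f'Extended Properties="{extended}"'
    )


# Example path is an assumption for illustration only.
print(build_excel_connection_string(r"C:\data\book.xls"))
```

Keeping the extended properties inside the double quotes matters, because they contain semicolons that would otherwise be read as separators between the top-level connection string keys.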
Loading of data into DataSet
After successfully connecting to Excel using the Jet Engine, it is easy to load the data into a DataSet. The query is written much like an ANSI-92 SQL query, except that each Excel sheet is treated as a table whose name is the sheet name followed by “$”. A range can also be specified after the “$” sign.
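As an illustration of the table-naming rule, the sketch below builds the query text for a whole sheet and for a range. It is a Python illustration only; the function name build_sheet_query and the sheet and range names are hypothetical. The square brackets around the table name are the usual way to quote a name containing “$” in Jet SQL.

```python
def build_sheet_query(sheet_name, cell_range=None):
    """Build a SELECT statement that reads an Excel sheet as a table.

    Hypothetical helper: the table name is the sheet name followed
    by "$", optionally followed by a range such as "A1:C10".
    """
    table = f"{sheet_name}${cell_range or ''}"
    # Brackets quote the table name, which contains "$" and possibly ":".
    return f"SELECT * FROM [{table}]"


print(build_sheet_query("Sheet1"))            # SELECT * FROM [Sheet1$]
print(build_sheet_query("Sheet1", "A1:C10"))  # SELECT * FROM [Sheet1$A1:C10]
```

In the library, a query of this form would be executed over the open connection and the result loaded into the DataSet.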