Click here to Skip to main content
15,867,686 members
Articles / Programming Languages / XML
Article

Convert Excel to XML file/XML Schema, and validate Excel against XML Schema

Rate me:
Please Sign up or sign in to vote.
4.63/5 (39 votes)
2 Jun 20054 min read 550.9K   33K   143   52
Provides a utility and code library to convert an Excel file to XML file as well as generate XML Schema. It validates an Excel file against XML Schema.

Sample Image - Excel2XML.jpg

Introduction

Often we are in a need to convert Excel data to XML stream/XML file which can be used as a feed to various applications like web services or middle tiers such as BizTalk 2004. There will be many situations where we need to validate the format of Excel data sheet against a specified XML schema. We will also be required to generate XML schema based on an Excel Work Sheet. This utility along with the library will help you to accomplish the same.

The following are the salient features of this library:

  1. Usage of Microsoft Jet Engine to connect to Excel.
  2. Conversion of Excel Worksheet/Workbook to XML file and XML Schema.
  3. Generation of XML file and XML Schema based on provided range.
  4. Validation of Excel Worksheet/Workbook against the provided XML Schema.
  5. Provision of batch processing capability.

In this article, we will discuss the implementation of the library functions. The library contains the core functionality to access and manipulate Excel data.

The utility will merely call the appropriate functions from the library. In this way, one can use this same library in ASP.NET applications also with minute changes.

Connectivity Options

There are two ways to manipulate an Excel file. It can be done either by using Microsoft Office Component (check out here) or with Microsoft Jet Engine.

As per Microsoft recommendation, it is not advisable to use Office components on the server. It means that if you want to use this library for a server application, it’s not a good idea to use the Office component. So the connection will be done using Jet Engine.

Connection to Excel using Jet Engine

To connect to Excel, one can use OleDb objects that will treat Excel as a database, and then the required information can be easily fetched by using SQL queries. The important steps that have to be considered while connecting to Excel are as follows:

  • Connection String:

    The connection string should be set to the OleDbConnection object. This is very critical as Jet Engine might not give a proper error message if the appropriate details are not given.

    Syntax: Provider=Microsoft.Jet.OLEDB.4.0;Data Source=<Full Path of Excel File>; Extended Properties="Excel 8.0; HDR=No; IMEX=1".

  • Definition of Extended Properties:
    • Excel = <No>

      One should specify the version of Excel Sheet here. For Excel 2000 and above, it is set it to Excel 8.0 and for all others, it is Excel 5.0.

    • HDR= <Yes/No>

      This property will be used to specify the definition of header for each column. If the value is ‘Yes’, the first row will be treated as heading. Otherwise, the heading will be generated by the system like F1, F2 and so on.

    • IMEX= <0/1/2>

      IMEX refers to IMport EXport mode. This can take three possible values.

      • IMEX=0 and IMEX=2 will result in ImportMixedTypes being ignored and the default value of ‘Majority Types’ is used. In this case, it will take the first 8 rows and then the data type for each column will be decided.
      • IMEX=1 is the only way to set the value of ImportMixedTypes as Text. Here, everything will be treated as text.

For more info regarding Extended Properties, check this out.

Loading of data in to Dataset

After successfully connecting to Excel using Jet Engine, it is easy to the load the data in to DataSet. One has to write a query similar to ANSI-92 with the only changes being that each Excel sheet will be treated as a table and the table name will be the sheet name with “$”. The range can also be specified after the “$” sign.

C#
public DataTable GetDataTable(string strSheetName)
{
  try
  {
    string strComand;

    if(strSheetName.IndexOf("|") > 0)
    {
      /* if Range is provided.*/
      SheetName = strSheetName.Substring(0,strSheetName.IndexOf("|"));
      Range = strSheetName.Substring(strSheetName.IndexOf("|")+1);
      strComand="select * from ["+SheetName+"$"+Range+"]"; 
    }
    else
    {
      strComand="select * from ["+strSheetName+"$]"; 
    }

    daAdapter=new OleDbDataAdapter(strComand,cn);
    DataTable dt=new DataTable(strSheetName);
    daAdapter.FillSchema(dt,SchemaType.Source);
    daAdapter.Fill(dt); 
    cn.Close(); 
    return dt ;
  }

  catch(Exception ex)
  {
    throw new Exception(ex.Message); 
  }
}

Conversion of Dataset to XML data and XML Schema

After loading the data in to DataSet, GetXML and GetXMLSchema functions will be used to generate XML data as well as XML Schema. This tool can also be used to generate a single file with XML Schema and XML data together. But this should not be done as XML schema is also an XML data. The following code snippet shows the same which should be avoided at any cost.

C#
public string GetXML(string strSheetName, Boolean _blnSchema)
{
  DataSet ds=new DataSet(); 
  ds.Tables.Add(this.GetDataTable(strSheetName)); 
  if(_blnSchema)
    return ds.GetXmlSchema()+ds.GetXml(); 
  else
    return ds.GetXml(); 
}

Validation against XML Schema

Once the data is generated in the form of XML, we can validate the same using the XmlValidatingReader object from System.Xml.Schema.

C#
public string ValidateXML(string strSchemaFile,string WorkSheet)
{
  strParseError="";
  XmlParserContext context = new XmlParserContext(null, 
                   new XmlNamespaceManager(new NameTable()), 
                   null, XmlSpace.None);
  XmlTextReader xmlReader = new XmlTextReader(this.GetXML(WorkSheet, 
                false), XmlNodeType.Element, context);
  XmlValidatingReader objValidator = new XmlValidatingReader(xmlReader);

  /* set the validation type to use an XSD schema */
  objValidator.ValidationType = ValidationType.Schema;
  XmlSchemaCollection objSchemaCol = new XmlSchemaCollection();
  objSchemaCol.Add("", strSchemaFile);
  objValidator.Schemas.Add(objSchemaCol);
  objValidator.ValidationEventHandler += 
     new ValidationEventHandler(ShowCompileErrors);

  try
  {
    while (objValidator.Read()) { }
  }
  catch(Exception objError)
  {
    throw new Exception(objError.Message); 
  }
  finally
  {
    xmlReader.Close();
  }

  return strParseError;
}

private void ShowCompileErrors(object sender, ValidationEventArgs args)
{
  strParseError+="::"+args.Message+"\r\n" ; 
}

Batch Process

Batch process will take the array of strings, and then by using the above functions will generate XML file and Schema per Workbook.

Future Enhancements

Generation of Excel Worksheet based on given XML Schema.

Download Binaries

You can download the working application here [9.48 Kb].

History

  • Version 1.0 released on 3rd June, 2005.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.

A list of licenses authors might use can be found here


Written By
Architect
United States United States
Currently he is working as Senior Soft. Engineer at Cognizant Technology Solution.He is involved in various project activities like System Architecture, Design, and Development. He is fond of conduction training for various technologies. He has have worked on various language and platforms. He is Microsoft and Oracle Certified professional. He is spending quantity and quality time in .Net world. He had also spoiled his hand with java, too.
If work is not demanding, he spends good time with his wife, Purvi.He
blogs
at WebDevs.com.

Comments and Discussions

 
Generalerror in excel uploading Pin
spydeehunk6-May-11 1:49
spydeehunk6-May-11 1:49 
GeneralRe: error in excel uploading Pin
spydeehunk8-May-11 23:21
spydeehunk8-May-11 23:21 
GeneralCan u please help Pin
Member 788609730-Apr-11 20:57
Member 788609730-Apr-11 20:57 
GeneralMy vote of 5 Pin
kiranmayi pinupa15-Feb-11 21:03
kiranmayi pinupa15-Feb-11 21:03 
GeneralGreat Job! Pin
Luca1009014-Jun-10 22:30
Luca1009014-Jun-10 22:30 
GeneralMy vote of 1 Pin
ProtoBytes1-Apr-10 2:24
ProtoBytes1-Apr-10 2:24 
GeneralMy vote of 1 Pin
Carson Palmer4-Jan-10 10:34
Carson Palmer4-Jan-10 10:34 
GeneralExcel files .xlsx is not supported !! Pin
WaleedS25-Aug-09 0:08
WaleedS25-Aug-09 0:08 
Questionfrom where i get ExcelXML reference? Pin
Your Friend Manoj22-Jun-09 18:55
Your Friend Manoj22-Jun-09 18:55 
GeneralIt's Cool! Pin
chinaontology26-Nov-08 20:42
chinaontology26-Nov-08 20:42 
GeneralConnection to Excel using Jet Engine Pin
Rebecca Fox28-Jul-08 10:26
Rebecca Fox28-Jul-08 10:26 
QuestionI am not able to open the binary code . Pin
Savvy16-May-08 0:58
Savvy16-May-08 0:58 
NewsRe: I am not able to open the binary code . Pin
cnataraj24-May-08 2:15
cnataraj24-May-08 2:15 
GeneralIt's Great Article Pin
DEEPCHAND KOSTA17-Apr-07 1:31
DEEPCHAND KOSTA17-Apr-07 1:31 
Generaljava code to generate xml scehma from xml file Pin
Sameer_Iqbal3-Feb-07 7:25
Sameer_Iqbal3-Feb-07 7:25 
GeneralGeneral Conversion of tabular data into XML (VBA) Pin
rkapl9-Jan-07 5:11
rkapl9-Jan-07 5:11 
GeneralMissing some data from first row Pin
QAKiani24-Nov-06 22:27
QAKiani24-Nov-06 22:27 
GeneralRe: Missing some data from first row Pin
QAKiani25-Nov-06 10:22
QAKiani25-Nov-06 10:22 
GeneralRe: Missing some data from first row Pin
ahuitt5-Mar-12 10:28
ahuitt5-Mar-12 10:28 
GeneralCould not find installable ISAM OleDB.Net Pin
Hameer Saleem2-Sep-06 23:53
Hameer Saleem2-Sep-06 23:53 
GeneralRe: Could not find installable ISAM OleDB.Net Pin
Gaurang Desai7-Sep-06 18:51
Gaurang Desai7-Sep-06 18:51 
Generalhelp Pin
zhanglei.net200417-Apr-06 2:11
zhanglei.net200417-Apr-06 2:11 
GeneralRe: help Pin
QAKiani24-Nov-06 20:48
QAKiani24-Nov-06 20:48 
GeneralRe: help Pin
zhanglei.net20047-Jan-07 14:53
zhanglei.net20047-Jan-07 14:53 
GeneralA Caution about Mixed Data Types Pin
Aleks Kravetsker11-Jan-06 15:01
Aleks Kravetsker11-Jan-06 15:01 

General General    News News    Suggestion Suggestion    Question Question    Bug Bug    Answer Answer    Joke Joke    Praise Praise    Rant Rant    Admin Admin   

Use Ctrl+Left/Right to switch messages, Ctrl+Up/Down to switch threads, Ctrl+Shift+Left/Right to switch pages.