Click here to Skip to main content
Click here to Skip to main content

Schema.ini override and bulk upload using SQLXML

By , 15 Mar 2007
 

Introduction

This article shows how to override Schema.ini file on the fly to facilitate bulk uploads and also shows how to use SQLXML feature to do bulk upload into SqlServer 2000 table.

Background

Ever had issues in your web application with excel bulk uploads where the jet engine was not treating the column values as text?

In this article we will see how we can overcome this issue and also see a sample code for uploading bulk data into a SQL Server 2000 table, using the SQLXML features (though there are hundred different ways to do this same thing!).

Using the code

I get this huge file once every two weeks which I need to upload to SQL Server table. The columns have values with leading zeros.

The file with sample data looks something like this:

// sample Provider_file.txt
provider ID,License
0000018, 079376
0000019, 076185
0000021, 034733
0000024, 032921
0000025, 034551
0000043, 079331
0000107, 035911
0000272, 075555
0000295, 074591
0000331, 057518
0000342, 074673
0000371, 079491

Now, I could make use of the OPENROWSET functionality or DTS to upload this file to the target table. But, I have this little file upload interface in my web application that I use to upload the file. This would also enable my clients to go online and upload data whenever they wish.

What I do in the code is save this file with a date time stamp on it to make sure that I do not overwrite the previous file.

<label>UploadFile</label>
<INPUT id=txtFileToUpload type=file name=txtFileToUpload 

runat="server" /> 
&nbsp;&nbsp; 
<INPUT id=btnUpLoad onclick=doProcess(); 
       type=button value=Process name=btnUpLoad 
       Runat="server" />     

Then read the file contents into a data set.

Dim cnn As OleDb.OleDbConnection

Dim dsCSV As New DataSet("Providers") cnn = 
    New OleDb.OleDbConnection("Provider= Microsoft.Jet.OLEDB.4.0;Data

Source=" + filePath + ";Extended Properties=Text")
Dim mySelect As String = String.Empty

mySelect = "Select * from [" & fileName & "]"

Dim da As New OleDb.OleDbDataAdapter(mySelect, cnn)
da.Fill(dsCSV)

Now, while selecting the data into a dataset I observed that the column having data with leading zeros was not treated like a text column type, despite changing the column type to Text in excel and then converting the file to a CSV file. This obviously means that when data is read into a dataset, the leading zeros would vanish. Then I figured out that I needed to put the Schema.ini file with column data types into the folder containing the data file, in order for the jet engine know the data types of the data columns specifically. Otherwise Jet engine would interpret the column as numeric based on the first few rows of data.

Schema.ini override

[PROVIDER_FILE_02232007090418.txt]
ColNameHeader = True
Format = CSVDelimited
CharacterSet=ANSI
Col1=IfProv Char Width 20
Col2=License Char Width 20

The first line of the Schema.ini file contains the target file name including the file extension on which the schema needs to be applied. In the second line, mentions whether the column headers are present in the target file or not. The third line contains the format of the target file. Fourth line contains the character set. After the fourth line, you can start declaring the column names and their data types. Note that the Schema.ini file should be present in the same folder as the target data file you are trying to create.

In my application, since the upload happens biweekly, and since I needed the previously uploaded file to be untouched, I had to come up with a way to change the file name (first line in the schema file) in the Schema.ini file. So I decided to create the Schema.ini file on the fly every time the upload happens by feeding in the first line. And the code for that would look something like this:

fileStream = New FileStream(fileName, 

FileMode.Truncate)
writer = New StreamWriter(fileStream)
writer.WriteLine("[" & UploadfileName & "]")
writer.WriteLine("ColNameHeader = True")
writer.WriteLine("Format = CSVDelimited")
writer.WriteLine("CharacterSet=ANSI")
writer.WriteLine("Col1=IfProv Char Width 20")
writer.WriteLine("Col2=License Char Width 20")

Once the data is read to a dataset, rename table and the column names (if need be) before getting the XML output.

If Not dsCSV Is Nothing Then
   dsCSV.Tables(0).TableName = "Provider"
dsCSV.Tables(0).Columns(0).ColumnName = "IfProv"
   dsCSV.Tables(0).Columns(1).ColumnName = "License"
End If

And the XML data would look something like this after renaming the table names and columns:

<Providers>
    <Provider>
        <License>79376</License>
        <ifprov>0000018</ifprov>
    </Provider>
    <Provider>
        <License>76185</License>
        <ifprov>0000019</ifprov>
    </Provider> 
</Providers>

Send the XML data to Stored Procedure

Pass on the XML to the stored proc to process.

SqlHelper.ExecuteNonQuery(ConnectionString, _
        CommandType.StoredProcedure, _
        "BULK_UPLOAD_XML_DATA", _
        New SqlParameter() { _
        New SqlParameter("@PROVDATA", xmlProviderData), 

returnParameter _
            })

Uploading data using SQLXML

Create a table with two columns as shown below.

CREATE TABLE TBL_PROVIDER(IfProv VARCHAR(20) 

NULL,License VARCHAR(20) NULL)

The stored procedure to upload the XML data in to the table using SQLXML feature would be something like this:

CREATE PROC BULK_UPLOAD_PROVIDER_DATA
@PROVDATA 

    NTEXT 
AS 
    /*PARSE XML AND KEEP IT READY FOR CONSUMPTION IN THE MEMORY */ 
    EXEC SP_XML_PREPAREDOCUMENT @IDOC OUTPUT,@PROVDATA 
    
    --DUMP DATA INTO THE TBL_PROVIDER 
    INSERT INTO 
        TBL_PROVIDER (IfProv,License)
        SELECT * FROM OPENXML
              (@IDOC, '/Providers/Provider')  WITH (IfProv
                    VARCHAR(10)   'IfProv', License
        VARCHAR(10) 'License') WHERE IfProv IS NOT NULL AND License IS NOT _ 
            NULL 
        
    /*REMOVE THE DOCUMENT FROM THE    MEMORY */ 
    
    EXEC SP_XML_REMOVEDOCUMENT @IDOC

Wrapping it all up

I have always enjoyed using SQLXML features of SQL Server and I hope that the Schema.ini override along with this SQLXML features in this article help a lot of people by resolving their day-to-day bulk upload issues.

Disclaimer: The source code contains Microsoft's SQLHelper helper class for making database calls which I use in most of my applications (I'm sure many readers use this!) and it is not written by me. I included it in the source code zip file just to make sure that the code works after downloading it.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

About the Author

Aravind Rajagopal K
Software Developer (Senior) UMASS Medical School
United States United States
Member
A Certified .NET MCSD (Microsoft Certified Solution Developer)that specializes in Microsoft technologies architecting/designing/developing applications using .NET (C# or VB), ASP.NET, SQL Server and BizTalk.
I have contributed as a designer/developer for benchmarked products in the finance industry benchmarked for best transaction engine design and products that offer end to end solution to research based content. Currently working with UMASS Medical School at Worcester MA.
 
Hobbies include music,reading books and playing cricket.

Sign Up to vote   Poor Excellent
Add a reason or comment to your vote: x
Votes of 3 or less require a comment

Comments and Discussions

 
You must Sign In to use this message board.
Search this forum  
    Spacing  Noise  Layout  Per page   
GeneralRewrite intromvpDave Kreskowiak6 Jan '10 - 2:01 
I'd start by rewriting the intro to this article. What is a "Schema.ini" file and what is it used for?
 
Other than that, it's a pretty decent article.
 

A guide to posting questions on CodeProject[^]



Dave Kreskowiak
Microsoft MVP
Visual Developer - Visual Basic
     2006, 2007, 2008
But no longer in 2009...




GeneralRe: Rewrite intromemberAravind Rajagopal K6 Jan '10 - 9:25 
I agree....I didn't do a good job of explaining that in the intro..
 
Thanks
Aravind
 
Aravind Rajagopal

GeneralCreate .csv file programmatically using C#memberJan Palmer11 Sep '07 - 19:23 
Please help,
I was trying to generate a .csv file on my C# code below
 

string filepath ="c:\";
string filename = "myCsv.csv";
 
System.Data.OleDb.OleDbConnection Conn =
new System.Data.OleDb.OleDbConnection(
"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + filepath + ";" +
"Extended Properties=\"text;HDR=Yes;FMT=Delimited\""); //(;) FMT=Delimited;FMT=FixedLength;FMT=Custom
 
//System.Data.OleDb.OleDbCommand Cmd = new System.Data.OleDb.OleDbCommand("create table " + filename + " ([column1] nvarchar(255), [column2] nvarchar(255), [column3] nvarchar(255))", Conn, null);
 
for (int a = 0; a < mapping_src_import.Count ; a++) {
resulta = resulta + " [" + mapping_src_import[a].ToString() + "] nvarchar(255),";//"[column1] nvarchar(255),"
}
resulta =resulta.TrimEnd(',');
System.Data.OleDb.OleDbCommand Cmd = new System.Data.OleDb.OleDbCommand("create table " + filename + "(" + resulta + ")", Conn, null);
 
//" create table file1.csv ([column1] char(255), [column2] char(255), [column3] char(255))", Conn,null); //([column1] char(255), [column2] char(255), [column3] char(255))
Conn.Open();
Cmd.ExecuteNonQuery();
Cmd = null;
//Cmd.Dispose();
Conn.Close();
Conn = null;
 

 
It did generate a .csv file. The problem is when I do run the generated file to do an "Insert into Openrowset". I encounter error msg:
"OLE DB provider 'Microsoft.Jet.OLEDB.4.0' reported an error. The provider did not give any information about the error."
 
I tried manually creating a .csv file by opening Excel and Saving the file As myCsv.csv. I also included the correct headers on the Sheet1 worksheet. When I run my my insert into OpenRowSet query in SQL Query Analyser to check if its running:
 
Insert into Openrowset ('Microsoft.Jet.OLEDB.4.0','Excel 8.0;Database=C:\\myCsv.csv;','Select Partner_Id, PPRO_ID, Location_Full_Name, HQ_SO2, Period_Short, High_BU, Target_CLC from [Q3_07_Colt_Export_Report$]') SELECT distinct ltable.compid AS Partner_Id, vendor.pproid AS PPRO_ID, vendor.Location_full_name AS Location_Full_Name, ltable.so2 AS HQ_SO2, periods.period_short AS Period_Short, parts.bu AS High_BU, ltable.target_clc AS Target_CLC FROM ltable INNER JOIN dbo.vendor ON dbo.ltable.compid = dbo.vendor.company_Id INNER JOIN dbo.periods ON dbo.ltable.period = dbo.periods.period_id INNER JOIN dbo.parts ON dbo.ltable.parts = dbo.parts.l4segshort WHERE (dbo.ltable.period = 29) AND (dbo.ltable.combined = 'N') AND (dbo.ltable.LTypeTbl = 3) AND parts.bu in ('PPP','BBB','CCC','GGG') AND ltable.status1='10' AND ltable.eligible='Y' AND ltable.takespart='Y'
 
It WORKED! but the sad part is, the .csv file I have programmatically generated DIDN'T :c
 

My doubts lies on the C# code to generate the .csv because when I right click Properties of the Manual .csv file I have created compared to the code program generated, I noticed there is a Tab name Custom.
 
My question is from my C# code is how can I customize the command in such a way it will also create a Custom properties on the .csv generated.
 

Thank you .. Kindly Post your reply and email me : janverge@gmail.com for the answer. My boss is killing me.

 
nice one..

QuestionBulk insert not recognizing end of filemembersearching for solution.....6 May '13 - 0:32 
I have data in input file given below -->
1,abc
2,xyz

 
------- End of data----

Notice i have an empty rows after second row.
I m doing bulk insert with FIELDTERMINATOR = ',',ROWTERMINATOR = '\n'. It works perfectly fine when bulk inserting after removing the empty records. But when there is empty record at end of file, it is throwing error given below and no records are inserted in db. I cannot ask the user to remove empty line at EOF.

Msg 4832, Level 16, State 1, Line 1
Bulk load: An unexpected end of file was encountered in the data file.
Msg 7399, Level 16, State 1, Line 1
The OLE DB provider "BULK" for linked server "(null)" reported an error. The provider did not give any information about the error.
Msg 7330, Level 16, State 2, Line 1
Cannot fetch a row from OLE DB provider "BULK" for linked server "(null)".
GeneralUTFmemberastaykov17 Aug '07 - 21:30 
Actually the schema.ini supports Unicode with the:
CharacterSet=Unicode
But it is the UTF-16 code page.
So you must find a way out to transform your files to UTF-16.
 
Anton Staykov
http://blogs.staykov.net/
QuestionProblem with charactersetmemberklamu20 Jun '07 - 23:45 
Thanks for your artical,it's veryhelpfull.
I have the following issue.
I amworking with Access and jet engine DAO.
I have to import data in UTF-8 format.
I couldn't find any way to import it since the schema.ini doesn't seem to support it.
Do you have any clue about this?
Thanks
Klaus

General General    News News    Suggestion Suggestion    Question Question    Bug Bug    Answer Answer    Joke Joke    Rant Rant    Admin Admin   

Permalink | Advertise | Privacy | Mobile
Web01 | 2.6.130516.1 | Last Updated 15 Mar 2007
Article Copyright 2007 by Aravind Rajagopal K
Everything else Copyright © CodeProject, 1999-2013
Terms of Use
Layout: fixed | fluid