Schema.ini override and bulk upload using SQLXML






3.25/5 (5 votes)
Schema.ini override and bulk upload using SQLXML
Introduction
This article shows how to override Schema.ini file on the fly to facilitate bulk uploads and also shows how to use SQLXML feature to do bulk upload into SqlServer 2000 table.
Background
Ever had issues in your web application with excel bulk uploads where the jet engine was not treating the column values as text?
In this article we will see how we can overcome this issue and also see a sample code for uploading bulk data into a SQL Server 2000 table, using the SQLXML features (though there are hundred different ways to do this same thing!).
Using the code
I get this huge file once every two weeks which I need to upload to SQL Server table. The columns have values with leading zeros.
The file with sample data looks something like this:
// sample Provider_file.txt
provider ID,License
0000018, 079376
0000019, 076185
0000021, 034733
0000024, 032921
0000025, 034551
0000043, 079331
0000107, 035911
0000272, 075555
0000295, 074591
0000331, 057518
0000342, 074673
0000371, 079491
Now, I could make use of the OPENROWSET functionality or DTS to upload this file to the target table. But, I have this little file upload interface in my web application that I use to upload the file. This would also enable my clients to go online and upload data whenever they wish.
What I do in the code is save this file with a date time stamp on it to make sure that I do not overwrite the previous file.
<label>UploadFile</label>
<INPUT id=txtFileToUpload type=file name=txtFileToUpload
runat="server" />
<INPUT id=btnUpLoad onclick=doProcess();
type=button value=Process name=btnUpLoad
Runat="server" />
Then read the file contents into a data set.
Dim cnn As OleDb.OleDbConnection
Dim dsCSV As New DataSet("Providers") cnn =
New OleDb.OleDbConnection("Provider= Microsoft.Jet.OLEDB.4.0;Data
Source=" + filePath + ";Extended Properties=Text")
Dim mySelect As String = String.Empty
mySelect = "Select * from [" & fileName & "]"
Dim da As New OleDb.OleDbDataAdapter(mySelect, cnn)
da.Fill(dsCSV)
Now, while selecting the data into a dataset I observed that the column having data with leading zeros was not treated like a text column type, despite changing the column type to Text in excel and then converting the file to a CSV file. This obviously means that when data is read into a dataset, the leading zeros would vanish. Then I figured out that I needed to put the Schema.ini file with column data types into the folder containing the data file, in order for the jet engine know the data types of the data columns specifically. Otherwise Jet engine would interpret the column as numeric based on the first few rows of data.
Schema.ini override
[PROVIDER_FILE_02232007090418.txt]
ColNameHeader = True
Format = CSVDelimited
CharacterSet=ANSI
Col1=IfProv Char Width 20
Col2=License Char Width 20
The first line of the Schema.ini file contains the target file name including the file extension on which the schema needs to be applied. In the second line, mentions whether the column headers are present in the target file or not. The third line contains the format of the target file. Fourth line contains the character set. After the fourth line, you can start declaring the column names and their data types. Note that the Schema.ini file should be present in the same folder as the target data file you are trying to create.
In my application, since the upload happens biweekly, and since I needed the previously uploaded file to be untouched, I had to come up with a way to change the file name (first line in the schema file) in the Schema.ini file. So I decided to create the Schema.ini file on the fly every time the upload happens by feeding in the first line. And the code for that would look something like this:
fileStream = New FileStream(fileName,
FileMode.Truncate)
writer = New StreamWriter(fileStream)
writer.WriteLine("[" & UploadfileName & "]")
writer.WriteLine("ColNameHeader = True")
writer.WriteLine("Format = CSVDelimited")
writer.WriteLine("CharacterSet=ANSI")
writer.WriteLine("Col1=IfProv Char Width 20")
writer.WriteLine("Col2=License Char Width 20")
Once the data is read to a dataset, rename table and the column names (if need be) before getting the XML output.
If Not dsCSV Is Nothing Then
dsCSV.Tables(0).TableName = "Provider"
dsCSV.Tables(0).Columns(0).ColumnName = "IfProv"
dsCSV.Tables(0).Columns(1).ColumnName = "License"
End If
And the XML data would look something like this after renaming the table names and columns:
<Providers>
<Provider>
<License>79376</License>
<ifprov>0000018</ifprov>
</Provider>
<Provider>
<License>76185</License>
<ifprov>0000019</ifprov>
</Provider>
</Providers>
Send the XML data to Stored Procedure
Pass on the XML to the stored proc to process.
SqlHelper.ExecuteNonQuery(ConnectionString, _
CommandType.StoredProcedure, _
"BULK_UPLOAD_XML_DATA", _
New SqlParameter() { _
New SqlParameter("@PROVDATA", xmlProviderData),
returnParameter _
})
Uploading data using SQLXML
Create a table with two columns as shown below.
CREATE TABLE TBL_PROVIDER(IfProv VARCHAR(20)
NULL,License VARCHAR(20) NULL)
The stored procedure to upload the XML data in to the table using SQLXML feature would be something like this:
CREATE PROC BULK_UPLOAD_PROVIDER_DATA
@PROVDATA
NTEXT
AS
/*PARSE XML AND KEEP IT READY FOR CONSUMPTION IN THE MEMORY */
EXEC SP_XML_PREPAREDOCUMENT @IDOC OUTPUT,@PROVDATA
--DUMP DATA INTO THE TBL_PROVIDER
INSERT INTO
TBL_PROVIDER (IfProv,License)
SELECT * FROM OPENXML
(@IDOC, '/Providers/Provider') WITH (IfProv
VARCHAR(10) 'IfProv', License
VARCHAR(10) 'License') WHERE IfProv IS NOT NULL AND License IS NOT _
NULL
/*REMOVE THE DOCUMENT FROM THE MEMORY */
EXEC SP_XML_REMOVEDOCUMENT @IDOC
Wrapping it all up
I have always enjoyed using SQLXML features of SQL Server and I hope that the Schema.ini override along with this SQLXML features in this article help a lot of people by resolving their day-to-day bulk upload issues.
Disclaimer: The source code contains Microsoft's SQLHelper helper class for making database calls which I use in most of my applications (I'm sure many readers use this!) and it is not written by me. I included it in the source code zip file just to make sure that the code works after downloading it.