Click here to Skip to main content
Click here to Skip to main content

Getting the Place Name that Goes with a Zip Code

, 10 Feb 2009
Rate this:
Please Sign up or sign in to vote.
How to import a database-dump from GeoNames.org into SQL Server

Introduction

There are various Web Services that give you a corresponding place name if you feed them a postal-code. GeoNames is such a service.

As an extra service, GeoNames provides the databases under the Creative Commons 3.0 license. First, you download a country-specific zip file from their recent database dumps and extract that.

Using the Code

If you're just interested in importing the data into your database, then run the demo. Enter a valid connection string, point the file dialog to your extracted download, and wait for the progress bar to fill up.

The source code is relatively simple, as this is a straightforward import-job. The import logic is nicely tucked away in the file "FormMain.ImportProcedure.cs". This file holds the backgroundworker-logic, which is based on a foreach-loop:

private void backgroundWorker1_DoWork
	(object sender, System.ComponentModel.DoWorkEventArgs e)
{
    [...]

    string[] lines = File.ReadAllLines(args.SourceFileName, Encoding.UTF8);

    foreach (string line in lines)
    {
        //The data format is tab-delimited text in utf8 encoding..
        string[] items = line.Split('\t');

        IDbCommand sqlInsertItem = db.NewCommand(sqlInsertIntoImportTable);
        sqlInsertItem.Parameters.Add(db.NewParam("countrycode", items[0]));
        [...]
        sqlInsertItem.Parameters.Add(db.NewParam("longitude", items[9]));
        sqlInsertItem.Parameters.Add(db.NewParam("accuracy", items[10]));
        sqlInsertItem.ExecuteNonQuery();

Supporting Different Database Drivers

This version supports multiple database-formats, and has been tried with SQL Server, Microsoft Access, SQL CE, SQLite, Oracle XE and Microsoft Excel.

The import-logic doesn't need to know the specifics of the database-driver used, and is using an interface. The interfaces get replaced by the equivalent driver that the enduser has selected, replacing the IConnection with an object created at runtime. The creation of the appropriate objects is done by the internal DataInterface-class, by calling the static New method.

This method then switch-es to the driver that the user has selected in the dropdownlist. If the user wants a SqlClient, we return a SqlConnection. If the user wants an OracleClient connection, we return a connection that facilitates Oracle.

    switch (dbType)
    {
        case DataProvider.ODBC:
            {
                result._con = new System.Data.Odbc.OdbcConnection(connectionString);
                break;
            }
        case DataProvider.OleDB:
            {
                result._con = new System.Data.OleDb.OleDbConnection(connectionString);
                break;
            }
        case DataProvider.OracleClient:
            {
                result._con = 
		new System.Data.OracleClient.OracleConnection(connectionString);
                break;
            }
        case DataProvider.SqlClient:
            {
                result._con = new System.Data.SqlClient.SqlConnection(connectionString);
                [...]

Each of these databases can be accessed by opening a connection and posting IDbCommands that contain a SQL statement. Most databases can handle ANSI SQL commands. Take a look at Asher Barak's article "Server Indifferent SQL" for more information on the ANSI-standards.

ConnectionStrings

Some sample connections that have been used while testing this code and their testresults:

SQL Server Express 2005 (Provider: SqlClient)

Server=.\SQLEXPRESS;Database=master;Trusted_Connection=True;

16377 records in 1 minute, 4 seconds - averaging 0.0039 seconds per record.

Microsoft Access (Provider: OleDB)

Provider=Microsoft.Jet.OLEDB.4.0;Data Source=C:\database1.mdb;User Id=admin;Password=;

16377 records 54 seconds, averaging 0.0033 seconds per record.

SQL CE (Provider: SqlCe)

Data Source=C:\MyDatabase1.sdf;

16377 records in 38 seconds, averaging 0.023 seconds per record.

SQLite (Provider: SQLite)

Data Source=C:\Database.sqlite;Version=3;

16377 records in 22 minutes and 31 seconds, averaging 0.0825 seconds per record.

Oracle XE 10g (Provider: OracleClient)

Data Source=XE;User Id=Anonymous;Password=Password;

16377 records in 48 seconds, averaging 0.0030 seconds per record.

Microsoft Excel 2007 (Provider: OleDb)

Provider=Microsoft.ACE.OLEDB.12.0;Data Source=c:\Map1.xlsx;
	Extended Properties="Excel 12.0 Xml;HDR=YES";

16377 records in 1 minute and 3 seconds, averaging 0.0039 seconds per record.

How Long is This Going to Take?

There's a ProgressBar to give visual feedback on the progress that we make. This progressbar will fill up slow or fast, depending on the speed of our machine. The value of our feedback declines if the progressbar fills up too slow - since the user isn't getting a real feedback - he's just waiting for another bar to appear in the progressbar, without any clue when this next bar might arrive.

One option is to make the bar longer, effectively giving a more fine-grained progressindication. Another option is to throw in an "estimated waiting time", like the one that you see when copying files with Windows Explorer. This will work best when the operation consists of blocks of work that take an equal amount of time. We need to track at what time the operation started. The next thing we want to know is how long we've been working on the current operation, so we subtract the startMoment from the current date/time:

DateTime now = DateTime.Now;
TimeSpan span = now.Subtract(startMoment);

Now that we know how long we've been processing, we can calculate how long a single item took on average:

// how long does it take to do a single item?
double timePerItem = (span.TotalSeconds / progressBar1.Value);

Once you know how long the processing of the average item takes (being a single step in the progressBar), we can calculate how much time we're (probably) going to need to perform the rest of the operation. You can see the result when you hover the mousecursor over the progressBar.

Notes

  • SQLite was quite a bit slower than expected. There must be a more efficient way to talk to a SQLite database.
  • Oracle doesn't like the @-sign in the parameter-declaration. The runtime haunted me with the error "ORA-01036: illegal variable name/number", until the @-character in the SQL-statement was replaced with a ":"-character.

History

  • 21st January, 2009 - Initial version
  • 8th February, 2009 - Decoupled database-dependencies and moved the importlogic to a backgroundworker

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

About the Author

Eddy Vluggen
Software Developer
Netherlands Netherlands
I'm a Delphi-convert, mostly into WinForms and C#. My first article is from 2001, extending the Delphi-debugger, which is still visible on the WayBackMachine[^] and even available in Russian[^] Smile | :)

Comments and Discussions

 
GeneralMy vote of 1 PinmemberMike Marynowski18-Aug-09 15:16 
Generalbcp PinmemberPIEBALDconsult22-Jan-09 5:30 
After taking a look at the file, I still say use bcp, especially if you do this infrequently and only into SQL Server, as I think is the case.
 
Another option would be to write a more general data import utility that can import various formats into various databases.
 
I definitely wouldn't write a custom single-purpose program for this task.
 
And you don't explain the code at all.
 

P.S. And that data is incomplete; my zipcode covers several communities, but is only listed once.
GeneralRe: bcp PinmemberEddy Vluggen22-Jan-09 5:40 
GeneralRe: bcp PinmemberEddy Vluggen22-Jan-09 5:49 
GeneralRe: bcp PinmemberPIEBALDconsult22-Jan-09 5:54 
GeneralRe: bcp PinmemberEddy Vluggen22-Jan-09 6:13 
GeneralRe: bcp PinmemberPIEBALDconsult22-Jan-09 6:26 
GeneralMy vote of 1 PinmemberNorm .net22-Jan-09 4:49 

General General    News News    Suggestion Suggestion    Question Question    Bug Bug    Answer Answer    Joke Joke    Rant Rant    Admin Admin   

Use Ctrl+Left/Right to switch messages, Ctrl+Up/Down to switch threads, Ctrl+Shift+Left/Right to switch pages.

| Advertise | Privacy | Mobile
Web03 | 2.8.140721.1 | Last Updated 10 Feb 2009
Article Copyright 2009 by Eddy Vluggen
Everything else Copyright © CodeProject, 1999-2014
Terms of Service
Layout: fixed | fluid