Click here to Skip to main content
Click here to Skip to main content
Go to top

Weather Logger

, 12 Dec 2012
Rate this:
Please Sign up or sign in to vote.
Automated process of downloading a XML file and inserting the contents into a database.

Introduction

This article describes the process of downloading and inserting XML file into a database and automating the task using two different approaches, i.e., SQL Server Jobs and Windows Task Scheduler.    

Background 

National Oceanic and Atmospheric Administration (NOAA) provides a wide range of public web services which can be accessed programmatically. Specifically, this link [1] got my attention. NOAA provides access to observed weather conditions across 1800 locations mostly airports (organized by a station ID) in the United States. Various weather sensors record the values such as barometric pressure, wind speed, humidity, temperature, and so on at these locations. The observations are bundled into a XML file primarily targeted for machine to machine transfer [2]. For each hour, RSS and XML files are generated for every observation station. The suggested pick up time is 15 minutes after the hour. This scenario presented an opportunity to automate the task of downloading the new XML file which is getting updated every hour. The next section describes the overview of the application. 

Overview  

The process is divided into four steps as shown below in figure 1. 

Before I jump into the implementation details, would like to discuss the boundaries on which the application is designed. The XML file generated by NOAA is about ~2.25 KB. Normally there should be 24 new files every day because the file is getting updated every hour but what I observed is file doesn’t change for hours together. Check out this link [3] to see the XML file used for the application and I was interested in the following nodes: ObservationTime, Temp_F, Temp_C, Relative Humidity, Wind MPH, pressure_mb, dewpoint_F.

The main logic is embedded into the WeatherLogger.exe code which is step two. This is a console based C# application developed in Visual Studio 2012 with .NET 4.5 framework. In the next section I will go over the design strategies incorporated. From the beginning I wanted to store the values from the XML file to a table in database. For step three, I wrote two stored procs and created two tables called “Observations” and “AppErrors” in SQL Server 2012. For the final part, step four, the automation is taken care in two ways. I used windows task scheduler and SQL Server Job to run the WeatherLogger.exe every one hour to pick the new file. Both approaches are good and better when compared with windows service application.           

The source code is on GitHub, check out this link [4] Feel free to pull the code, try it out and I appreciate criticism on the code. If there is room for improvement, would like to know. The next section describes the implementation details of step two. 

Weather Logger 

This was my initial approach which I scrapped.  The heavy lifting was done by the console app and the database portion was light weight. I thought of poling the NOAA server periodically, download the XML file, parse it, remove the unused nodes and insert the nodes required into a table. I was able to achieve all of it but the tricky part was polling the server at a regular interval. I battled through the Last-Modified and If-Last-Modified http headers [5]. Tuned a timing belt which runs at the suggested pick time. Although the suggested pick up time was 15 minutes after the hour, the pickup time was not reliable. This messed up my timing belt, then I resorted for a second option.

For the second option I decided to go database intensive and keep the console app light weight. I heard lot of good things about the XML data type and the merge statement in SQL Server and decided to use it. This choice significantly reduced the code size and solution became elegant. The main logic is bundled into a stored procedure CheckAndInsert which takes @xml file as a parameter.     

The idea is convert the XML/URI into C# string and call the strod proc by passing the string as an argument. The values from the required nodes are inserted into a #staging table which is identical to the main table which holds the observations data.  

select
current_observation.value('(observation_time)[1]',
'varchar(50)') as ObsTime,
current_observation.value('(temp_f)[1]', 'float') as
Temp_f,
current_observation.value('(temp_c)[1]', 'float') as
Temp_c,
current_observation.value('(relative_humidity)[1]', 'float')
as RelHum,
current_observation.value('(wind_mph)[1]', 'float') as
Wind_mph ,
current_observation.value('(pressure_mb)[1]', 'float')
as Pressure_mb,
current_observation.value('(dewpoint_f)[1]', 'float')
as Dewpt_f
into #Staging
from 
@Raw_Xml.nodes('current_observation') as rawdata(current_observation) 

Then use merge tsql command by choosing the Observations as target table and #Staging as source table. Comparison is done on the observation time, which is most reliable column. If not matched then the data from the XML file is inserted into the database. Finally the #Staging table is dropped.

;merge into Observations
using #Staging
on #Staging.ObsTime = Observations.ObsTime
when not matched then
insert (ObsTime,Temp_f,Temp_c,RelHum,Wind_mph,Pressure_mb,Dewpt_f)
values
  (ObsTime,Temp_f,Temp_c,RelHum,Wind_mph,Pressure_mb,Dewpt_f);

This approach is scalable and suitable to this scenario as the size of the XML is really low, the execution time for merge statement is very fast. There is no need for storing or parsing the XML file. This stored procedure is called in the InsertData method from the InsertIntoDb class. The class diagram for the console app is shown in figure 2. 

There are three main classes in the application, DownloadXML, InserIntoDb, and LogErrors, the Program class orchestrates the flow of the application and it is the entry point. The XML data type in SQL Server is quite powerful, this data type reads XML file as blob and in C# it should be well formatted XML file stored as a simple C# string.

The DownloadWeatherXML method in the DownloadXML uses the WebClient class to download the requested url as a string.

string xml = new WebClient().DownloadString(_url);
string modifiedxml = xml.Replace(" \r\n", "");

If I directly pass this string without replacing the encoding then I got the following error from the sp_CheckAndInsert stored procedure. I was unable to resolve it, but it works if I replace the encoding.

“XMLparsing: line 1, character 43, unable to switch the encoding”

LogErrors is a generic helper class and has a method InsertErrors(). This method is used for recording any errors that are caught by any catch block throughout the code. The data is inserted by InsertIntoAppErros stored procedure. I was compelled to use this class after reading the book Joel on Software [6].   

Automation   

The WeatherLogger.exe has to run every hour, which calls in for an automation. I wanted to work with both the SQL Server Job and Windows Task Scheduler. I took screenshots on how to setup the job, check the links below, uploaded them on flickr 

Scheduling SQL Server Job
Windows Task Scheduler  

Results  

I ran the scheduled jobs and task on two different days for about couple hours, below is the screenshot of the observations table.

Conclusion

Overall, I personally feel this approach is scalable, the choice of executing the exe can be delegated to the system admin or a database administrator. I have done XML parsing in C# but ease provided by the merge statement and XML data type from SQL Server is unbeatable.  For future, I would like to set up a SSRS web service and render the report for a client using the observations data. Also, would like to implement the complete process as SQL CLR stored procedure.    

References

[1] http://w1.weather.gov/xml/current_obs/

[2] http://products.weather.gov/PDD/NWS_Current_Observations_RSS_XML.pdf

[3] http://w1.weather.gov/xml/current_obs/KEWR.xml

[4] https://github.com/tkmallik/WeatherLogger.git

[5] http://msdn.microsoft.com/en-us/library/system.net.httpwebrequest.ifmodifiedsince.aspx

[6] http://www.amazon.com/Joel-Software-Occasionally-Developers-Designers/dp/1590593898

[7] http://www.flickr.com/photos/90662753@N08/sets/72157632124494853/

[8] http://www.flickr.com/photos/90662753@N08/sets/72157632128725180/

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Share

About the Author

Venkata Turlapati
Database Developer
United States United States
No Biography provided

Comments and Discussions

 
-- There are no messages in this forum --
| Advertise | Privacy | Mobile
Web01 | 2.8.140926.1 | Last Updated 12 Dec 2012
Article Copyright 2012 by Venkata Turlapati
Everything else Copyright © CodeProject, 1999-2014
Terms of Service
Layout: fixed | fluid