
Introduction
I am always amazed by the space-related images that NASA shares daily (see http://antwrp.gsfc.nasa.gov/apod/astropix.html). I visit that site frequently and almost always download the images to my computer for offline viewing later (we still don't have an always-on world, at least not a cost-effective one). Moreover, I often wanted to download several images at once (for example, all pictures of a month), and clicking through them was tedious.
Hence, I developed my own automated software for that!
The software and lessons shared here consist of the following:
- Perform HTTP requests to the APOD website (for a selected date)
- Retrieve and process the HTML code
- Identify and retrieve the JPG images referenced in the HTML (currently only JPG)
- Store each JPG image in the same directory as the executable, named <year><month><day>.jpg
Moreover, I tried to develop a user-friendly window. However, only date selection is available at present.
Background
No specific background is required, other than basic knowledge of C# (or any other OO language and principles) and Microsoft Visual Studio.
Note that the code was written quickly, so it is light on documentation and polish.
Using the Code
The program is simple. The user selects a specific date, and then three commands are available:
- Get image
- Get all images from selected month (whose date is earlier than selected date)
- Exit
The second command is actually an extension of the first, as I will explain later. The first one does the most complicated work.
Before I start, we need to understand how pages in APOD are structured, which turns out to be simple and intuitive.
The APOD base URL is:
- http://antwrp.gsfc.nasa.gov/
The people who run the site are both smart and kind, for they provide an extensive archive of pictures (current and past), totally free, which may be accessed as:
- http://antwrp.gsfc.nasa.gov/apod/ap<2-digit year><2-digit month><2-digit day>.html
For example, the image for year 2009, month 09 and day 16 is accessed through the following URL:
- http://antwrp.gsfc.nasa.gov/apod/ap090916.html
So, with a rule for fetching the page of any given day, all that remains is to put C#, .NET and HTTP to work.
Let's Start !
The action starts when the user presses the 'GET !' button. The click event calls the following code:
private void buttonGET_Click(object sender, EventArgs e)
{
    GetAPODImage(dateTimePicker.Value);
}
The GetAPODImage method does the main work.
First, we generate the URL for the selected date, following the rule explained above:
string url = GenerateAPOD_URL("http:
This method builds the URL as follows:
private string GenerateAPOD_URL(string rootURL, System.DateTime date)
{
    return rootURL
        + Get2DigitYear(date)
        + Get2DigitNumberAsString(date.Month)
        + Get2DigitNumberAsString(date.Day)
        + ".html";
}
Note that Get2DigitNumberAsString returns a two-digit string. For example, if the selected day is 1, it returns ‘01’.
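The two helpers used by GenerateAPOD_URL are not listed in this article. The following is only a minimal sketch of how they might look; the class name ApodUrlHelpers and both helper bodies are assumptions for illustration, and the root URL in the usage note is inferred from the URL rule described earlier.

```csharp
using System;
using System.Globalization;

public static class ApodUrlHelpers
{
    // Hypothetical helper: zero-pads a number to two digits, e.g. 1 -> "01".
    public static string Get2DigitNumberAsString(int number)
    {
        return number.ToString("00", CultureInfo.InvariantCulture);
    }

    // Hypothetical helper: last two digits of the year, e.g. 2009 -> "09".
    public static string Get2DigitYear(DateTime date)
    {
        return Get2DigitNumberAsString(date.Year % 100);
    }

    // Same concatenation as GenerateAPOD_URL shown in the article.
    public static string GenerateAPOD_URL(string rootURL, DateTime date)
    {
        return rootURL
            + Get2DigitYear(date)
            + Get2DigitNumberAsString(date.Month)
            + Get2DigitNumberAsString(date.Day)
            + ".html";
    }
}
```

Under these assumptions, calling ApodUrlHelpers.GenerateAPOD_URL("http://antwrp.gsfc.nasa.gov/apod/ap", new DateTime(2009, 9, 16)) yields http://antwrp.gsfc.nasa.gov/apod/ap090916.html, which matches the example URL given earlier.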
Next, we retrieve the HTML page:
GetHTTP(url, 5000, ref resultHTML);
The GetHTTP method fetches the HTML page from ‘url’ and stores it in ‘resultHTML’. Note that we pass ‘5000’ (milliseconds, i.e., 5 seconds) as the timeout for fetching the page. This is very important, because it prevents the program from waiting indefinitely on a slow or unresponsive server.
The GetHTTP code is as follows:
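// Requires the System.IO, System.Net and System.Text namespaces.
HttpWebRequest req = (HttpWebRequest)WebRequest.Create(url);
req.Timeout = timeout;          // time allowed to get the response, in ms
req.ReadWriteTimeout = timeout; // time allowed for each stream read, in ms
WebResponse resp = req.GetResponse();
Stream resStream = resp.GetResponseStream();
int count = 0;
byte[] buf = new byte[8192];
resultHTML = "";
// Read the response in chunks until the stream is exhausted.
do
{
    count = resStream.Read(buf, 0, buf.Length);
    if (count != 0)
    {
        resultHTML += Encoding.ASCII.GetString(buf, 0, count);
    }
}
while (count > 0);
resp.Close();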
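The read loop above concatenates strings on every chunk, which is fine for small pages. As a minimal sketch of an alternative, assuming a hypothetical helper named ReadAllAsAscii that is not part of the article's code, the same loop can accumulate into a StringBuilder and accept any Stream, which also makes it easy to try without a network connection:

```csharp
using System.IO;
using System.Text;

public static class HttpTextReader
{
    // Hypothetical helper: reads a stream to its end in 8 KB chunks
    // and decodes each chunk as ASCII, like the loop in GetHTTP.
    public static string ReadAllAsAscii(Stream stream)
    {
        var builder = new StringBuilder();
        byte[] buf = new byte[8192];
        int count;
        // Stream.Read returns 0 once the end of the stream is reached.
        while ((count = stream.Read(buf, 0, buf.Length)) > 0)
        {
            builder.Append(Encoding.ASCII.GetString(buf, 0, count));
        }
        return builder.ToString();
    }
}
```

In GetHTTP, such a helper could be called on the stream returned by resp.GetResponseStream(), ideally inside using blocks so that the response is closed even when a read throws.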
If no exceptions were raised, the next step is to parse ‘resultHTML’ and look for JPGs (the nice pictures we wish to download):
GetHTTPFilesByTypeSuffix(resultHTML, ".JPG", ref list);
I will not go into the details of this function. In short, it looks for ‘href=’ attributes and fetches the referenced file name, which is considered valid only if it ends in ‘.JPG’.
For example, if ‘resultHTML’ contains “<a href="image/0909/tarantula_gleason.jpg">”, then image/0909/tarantula_gleason.jpg is added to the list (variable 'list').
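The body of GetHTTPFilesByTypeSuffix is not shown here. As a rough sketch of the idea only, the scan for ‘href=’ can be approximated with a regular expression; the class HrefScanner and the method FindLinksBySuffix are hypothetical names, and a regular expression is a swapped-in technique that may differ from how the real function works:

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class HrefScanner
{
    // Hypothetical helper: returns every href target that ends with
    // the given suffix, compared case-insensitively.
    public static List<string> FindLinksBySuffix(string html, string suffix)
    {
        var result = new List<string>();
        // Matches href="..." or href='...' and captures the quoted value.
        string pattern = "href\\s*=\\s*[\"']([^\"']+)[\"']";
        foreach (Match m in Regex.Matches(html, pattern, RegexOptions.IgnoreCase))
        {
            string target = m.Groups[1].Value;
            if (target.EndsWith(suffix, StringComparison.OrdinalIgnoreCase))
            {
                result.Add(target);
            }
        }
        return result;
    }
}
```

With the example above, FindLinksBySuffix applied to the tarantula_gleason.jpg anchor tag with the suffix ".JPG" returns a list containing image/0909/tarantula_gleason.jpg. A regular expression is enough for simple pages like these, but it is not a general HTML parser.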
Now that we have the image file names, we download the files and store them on disk:
try
{
    ....
    foreach (string file in list)
    {
        String source = "http:
        string dest = GetYYYYMMDD(date) + ".JPG";
        WebClient Client = new WebClient();
        Client.DownloadFile(source, dest);
        pictureBox.Image = Image.FromFile(dest);
    }
}
catch (Exception ex)
{
    Console.WriteLine("An exception occurred: " + ex.Message);
    pictureBox.Image = Properties.Resources.g_close;
}
Note that:
- After the picture is stored, I show it in pictureBox (in the main window)
- If an error occurs, pictureBox shows an error image
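The href values found in the page, such as image/0909/tarantula_gleason.jpg, are relative to the page they came from, so they must be turned into absolute URLs before downloading. The sketch below shows one way to do that with System.Uri, together with a possible GetYYYYMMDD. Both method bodies and the class name ApodDownloadNames are assumptions for illustration, not the article's actual code.

```csharp
using System;
using System.Globalization;

public static class ApodDownloadNames
{
    // Hypothetical helper: resolves a relative href against the URL
    // of the page that contained it.
    public static string ResolveAgainstPage(string pageUrl, string href)
    {
        return new Uri(new Uri(pageUrl), href).AbsoluteUri;
    }

    // Hypothetical implementation: formats a date as <year><month><day>,
    // e.g. 16 September 2009 -> "20090916".
    public static string GetYYYYMMDD(DateTime date)
    {
        return date.ToString("yyyyMMdd", CultureInfo.InvariantCulture);
    }
}
```

Resolving image/0909/tarantula_gleason.jpg against http://antwrp.gsfc.nasa.gov/apod/ap090916.html gives http://antwrp.gsfc.nasa.gov/apod/image/0909/tarantula_gleason.jpg, which is the kind of value that could be passed as the source argument of DownloadFile.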
Finally, how do you fetch all images for a selected month (which are earlier than the selected date)? Very simple:
private void buttonGETMONTH_Click(object sender, EventArgs e)
{
    DateTime date = dateTimePicker.Value;
    for (int i = 1; date.Month == dateTimePicker.Value.Month; i++)
    {
        GetAPODImage(date);
        date = dateTimePicker.Value.AddDays(-i);
    }
}
The trick is to initialize the temporary variable date to the selected date and then step back one day at a time until the month changes.
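To see which dates the loop visits, the same stepping logic can be pulled into a function that only collects the dates instead of downloading anything. This is a small sketch for illustration; MonthWalker and DatesBackToMonthStart are hypothetical names that do not appear in the project.

```csharp
using System;
using System.Collections.Generic;

public static class MonthWalker
{
    // Hypothetical helper: lists the selected date and every earlier
    // date of the same month, newest first, using the same loop
    // as buttonGETMONTH_Click.
    public static List<DateTime> DatesBackToMonthStart(DateTime selected)
    {
        var dates = new List<DateTime>();
        DateTime date = selected;
        for (int i = 1; date.Month == selected.Month; i++)
        {
            dates.Add(date);
            date = selected.AddDays(-i); // step back i days from the selection
        }
        return dates;
    }
}
```

For 16 September 2009, this returns 16 dates, starting at 16 September and ending at 1 September. The loop stops as soon as the computed date falls on 31 August.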
That’s it! I hope you enjoyed the article.
Final Remarks
You may download the win32 binary and run it directly (you need the .NET 3.5 Framework). If you open the solution, you'll note a dependency on WeAreUtils (which is also included). This library contains several of the utilities referred to in the article. However, I decided to present a simplified version here to make it easier to understand.
If you liked the software, you may find it at SourceForge (http://sourceforge.net/projects/weareapod/)! Join the project!
History
- 16th September, 2009: Initial post