
Logging Using Azure Table Storage

15 Sep 2012
A discussion of using Azure table storage to store an application's logging information for Windows Azure-based cloud deployments.

Introduction

IIS logs have always been the ubiquitous starting point for diagnosing issues with your website. More seasoned developers usually also add application-specific logging to their code, using one of the variety of logging solutions available or a custom logging solution. The cloud-based paradigm of Windows Azure brings extra complexity to how and where these logs can be stored. This article discusses how to implement custom logging using Azure table storage.

In an Azure deployment, logged data needs to be stored externally to the Azure instance the code is running on. Azure instances are transient entities that run parts of your application for a finite time; any data stored on the local instance should be considered liable to disappear at any moment. There are various techniques for periodically moving logged data into Azure table storage, but the latency they introduce brings a risk of data loss. Logging directly to Azure table storage avoids this.

When deploying anything other than the simplest Azure application, it is important to be aware of the intrinsic multi-hosted nature of cloud deployments. Although having separate logs for each instance does have some benefits, most cloud application analysis is going to be concerned with the holistic view of how the entire application is performing. Therefore a single shared location for log storage, common to all instances of the Azure application, has many advantages. If necessary, instance specific information can also be identified in the log to help with diagnostics.

Azure table storage is part of Microsoft's Azure platform and provides a cheap, quick, non-relational storage solution based on a table archetype. Microsoft charges for the volume of data stored and the number of data transactions that take place, but the charges are extremely reasonable: at the time of writing, 10 GB of storage costs less than $1 per month, and another $1 buys 10 million storage transactions. Obviously, these prices are subject to constant change and are included simply to illustrate the competitive nature of Azure table storage. The current charges can be seen at Pricing Details: Windows Azure.

Once your logs are stored in your Azure storage account, they can be accessed directly from your log analysis software or downloaded to a more convenient location within your own network. An important point to note is that an Azure storage account is entirely separate from your Azure services: it can be accessed independently by any Internet-connected application, provided that the application holds the appropriate private keys or that parts of the storage account have been given public access. Your Azure application is just another consumer of the storage service.

Background

The information in this article was researched during my first Azure project. It soon became apparent that the usual methods of logging to the local file store or event log were not going to cut it.

In the past, I have always been a big fan of using the event log for log storage, as it seems a very natural home for this information. I usually create a new event log for each website or web service, which again feels natural and in keeping with the ethos of the rest of the Windows operating system. Therefore, I first attempted to get my new Azure service to log using this method.

This turned out to be quite a complicated task, involving adding start-up code to the Azure instance to initialise the new event log. Ordinarily, when deploying to a Windows environment, I often deploy via a setup program and have it initialise the event log. In the case of Azure this is not possible, as everything is initialised when the Azure instance is created. Furthermore, throughout the lifetime of an Azure deployment the instances may be replaced or re-imaged by the Azure system, so any initialisation code needs to run every time. This means that privileged start-up code must be added to your Azure deployment to ensure the event log is re-initialised on every Azure instance.

However, after getting the above event logging code to work correctly, it soon became apparent that this was not going to be the correct approach for two fundamental reasons. Firstly, the transient nature of the Azure instances means that local storage of logging data can only be considered temporary. If the Azure instance is moved to another virtual machine by the Azure system, all your logging data will be lost. Secondly, the nature of an Azure deployment is a highly distributed architecture. With a website or web service distributed over multiple instances, the granularity of logging you are really concerned with is at the application level not the instance level.

I then moved on to various other solutions involving local storage being periodically transferred to other storage locations, before finally settling on Azure table storage. Initially, I had been loath to use table storage as I was aware there was a "per transaction" charge levied on it. However, once I realised how low the actual cost per transaction was, the decision became a near no-brainer.

Using the Code

The easiest way of interacting with Azure table storage is to derive a class from Microsoft.WindowsAzure.StorageClient.TableServiceEntity. The public properties of any such class are automatically stored in, and retrieved from, the Azure table, with columns generated automatically to match each property. When reading data back, columns with no matching property are simply ignored: the columns of an Azure table are the superset of the properties of all the rows stored within it.

An example of a simple class to store our log data is shown below.

public class Message : TableServiceEntity
{
    private const string MessagePartitionKey = "LogEntry";
    private const string DateFormat = "yyyyMMdd ; HH:mm:ss:fffffff";
    private const string RowKeyFormat = "{0} - {1}";

    public string LogMessage { get; set; }

    // Parameterless constructor, required so the storage client can
    // materialise instances when reading rows back from the table.
    public Message()
    {
    }

    public Message(string logMessage)
    {
        PartitionKey = MessagePartitionKey;

        // Prefix the row key with a UTC timestamp so rows sort chronologically,
        // then append a GUID to guarantee uniqueness across instances.
        string date = DateTime.UtcNow.ToString(DateFormat);
        RowKey = string.Format(RowKeyFormat, date, Guid.NewGuid().ToString());
        LogMessage = logMessage;
    }
}

There are two inherited properties from TableServiceEntity that we must set. PartitionKey identifies the partition the data will be stored in; keeping all our log data in the same partition makes sense, so we use a fixed partition key of "LogEntry". RowKey uniquely identifies the record within its partition; together, PartitionKey and RowKey form the table's composite primary key. To make sure the data is returned in the correct order, we start the row key with a date/time stamp whose format sorts chronologically. However, because we must guarantee uniqueness, especially in a multi-instance cloud application where the key construction code can easily run at exactly the same time on more than one instance, we also suffix a GUID onto the key.

The LogMessage property is the only other property we have used in this simple class. However, any number of other properties can be added in here; for example "Date", "Severity", "Requested URL", etc.

N.B. The empty default constructor is necessary so that instances of this class can be created by the Azure class library when reading the data back from the Azure table.
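For completeness, reading the log entries back is just a LINQ query over the same table service context. The sketch below assumes the same legacy Microsoft.WindowsAzure.StorageClient API used throughout this article, together with a CloudStorageAccount and table name obtained as shown later; the LogReader class and DumpLog method names are hypothetical.

```csharp
using System;
using System.Linq;
using Microsoft.WindowsAzure;
using Microsoft.WindowsAzure.StorageClient;

public static class LogReader
{
    // Dumps the log entries in a table. Because each RowKey begins with an
    // ordered timestamp, rows within the "LogEntry" partition come back in
    // chronological order.
    public static void DumpLog(CloudStorageAccount account, string logName)
    {
        CloudTableClient tableClient = account.CreateCloudTableClient();
        TableServiceContext tableContext = tableClient.GetDataServiceContext();

        var entries = tableContext.CreateQuery<Message>(logName)
                                  .Where(m => m.PartitionKey == "LogEntry");

        foreach (Message entry in entries)
            Console.WriteLine("{0}: {1}", entry.RowKey, entry.LogMessage);
    }
}
```

Filtering on PartitionKey, as here, is one of the few query shapes the table service executes efficiently; filters on ordinary properties force a scan of the table.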

To store a log message from our Azure application, we can use the following method:

public void StoreNewLogMessage(Message logMessage)
{
    CloudStorageAccount account;
    if (CloudStorageAccount.TryParse(CloudStorageAccountName, out account))
    {
        // Ensure the log table exists, then add and commit the new row.
        CloudTableClient tableClient = account.CreateCloudTableClient();
        tableClient.CreateTableIfNotExist(LogName);
        TableServiceContext tableContext = tableClient.GetDataServiceContext();
        tableContext.AddObject(LogName, logMessage);
        tableContext.SaveChangesWithRetries();
    }
    else
    {
        HandleInternalError("Cloud storage could not be opened.", logMessage);
    }
}

The Azure classes in the above are all contained in the Microsoft.WindowsAzure.StorageClient namespace. The above code presumes there are properties or constants defined elsewhere to provide:

  • CloudStorageAccountName (a connection string to the Azure storage account)
  • LogName (the name of the log table within the Azure table storage account)

First, the code opens a writeable connection to the Azure table storage account. The CloudStorageAccountName property/constant should return an Azure connection string that includes an access key with permission to modify the account. For example: AccountName=[Your Storage Account Name];AccountKey=[Your Primary Access Key];DefaultEndpointsProtocol=https.

Next, a CloudTableClient instance is created to provide access to the tables in the specified storage account. The CreateTableIfNotExist call is self-explanatory and is a defensive way of ensuring the table is always present. The code then creates a TableServiceContext, which allows access to any of the tables in the account, and the AddObject call adds a given object (derived from TableServiceEntity) to the named table. Finally, we commit the change using SaveChangesWithRetries; the "retries" in the name reflects that the method automatically retries the operation to cope with the transient failures inherent in Internet-based connections.

The method ends with a call to another function, HandleInternalError, which you need to implement as a catch-all error handler to indicate that something has gone seriously wrong with your storage account. In my Azure sites, I send periodic SMTP emails from this function as an emergency technique, making sure the number of emails delivered is throttled to prevent a spam deluge!
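The article leaves the implementation of HandleInternalError to the reader, but the throttled-email behaviour described above can be sketched roughly as follows. Everything specific here is an assumption for illustration: the SMTP host, the email addresses, and the one-hour throttle window are all hypothetical.

```csharp
using System;
using System.Net.Mail;

public static class EmergencyNotifier
{
    private static readonly object SyncLock = new object();
    private static DateTime lastEmailSentUtc = DateTime.MinValue;

    // Assumed throttle window: at most one email per hour.
    private static readonly TimeSpan Throttle = TimeSpan.FromHours(1);

    // Catch-all handler, called when the storage account itself is unusable.
    public static void HandleInternalError(string error, Message logMessage)
    {
        lock (SyncLock)
        {
            // Throttle the alerts so a persistent storage fault does not
            // produce a spam deluge.
            if (DateTime.UtcNow - lastEmailSentUtc < Throttle)
                return;
            lastEmailSentUtc = DateTime.UtcNow;
        }

        // Hypothetical SMTP host and addresses.
        var client = new SmtpClient("smtp.example.com");
        var mail = new MailMessage(
            "alerts@example.com", "admin@example.com",
            "Logging failure",
            error + Environment.NewLine + logMessage.LogMessage);
        client.Send(mail);
    }
}
```

A production version might also fall back to writing the failed entry to local storage, accepting that it may be lost if the instance is recycled.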

All the above code is maintained and updated on my logging using Azure table storage page.

Summary

To conclude, Azure table storage is an extremely cost-effective way of implementing a cloud-based shared logging solution. As always, a number of off-the-shelf logging solutions are available, but there will always be scenarios that call for a custom-written solution. The techniques above should provide a sound starting point for anyone wishing to develop custom logging for their Azure cloud deployment.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

About the Author

Michael R Duffy
Software Developer (Senior) Symatix Ltd
United Kingdom United Kingdom
I am a software developer for Symatix Ltd, based in the north-west of England. I am a dad and step-dad to 5 children, which keeps me very busy! However, in between that I still find time to share my coding experiences on my site and elsewhere on the Internet.

Article Copyright 2012 by Michael R Duffy