The latest Windows Azure SDKs, v1.7.1 and v1.8, have a nice feature called “StartCopyFromBlob” that lets us instruct the Windows Azure data center to copy blobs across storage accounts. Prior to this, we had to download the blob content in chunks and then upload it to the destination storage account. Hence, “StartCopyFromBlob” is more efficient in terms of both cost and time.
The notable difference in version 2012-02-12 is that the copy operation is now asynchronous. Once you make a copy request to the Windows Azure Storage service, it returns a copy ID (a GUID string), a copy state, and HTTP status code 202 (Accepted). This means that your request has been scheduled. If you check the copy state immediately after this call, it will most probably be “pending”.
StartCopyFromBlob – A Transaction Compensation Operation
Extra care is required when using this API, since it is a real-world example of a service operation that needs transaction compensation. After making the copy request, you need to verify the actual status of the copy operation at a later point in time. That point may be anywhere from a few seconds to two weeks later, depending on constraints such as source blob size, permissions, and connectivity.
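The "check again later" step can be sketched as a polling loop with backoff. This is only a minimal, SDK-independent sketch in Python: `fetch_status` is a hypothetical callable standing in for "refresh the destination blob and read its copy status", the status strings mirror the pending, success, aborted, and failed states, and the delay and timeout values are arbitrary assumptions rather than recommendations.

```python
import time
from typing import Callable

# Terminal states: once reached, the copy status will not change any more.
TERMINAL = {"success", "aborted", "failed"}


def wait_for_copy(fetch_status: Callable[[], str],
                  initial_delay: float = 1.0,
                  max_delay: float = 60.0,
                  timeout: float = 3600.0,
                  sleep: Callable[[float], None] = time.sleep) -> str:
    """Poll a copy status until it is terminal or the timeout elapses.

    fetch_status is a hypothetical stand-in for "refresh the destination
    blob and read its copy status". Returns the last status seen, which is
    still "pending" if the timeout was hit first.
    """
    delay = initial_delay
    waited = 0.0
    status = fetch_status()
    while status not in TERMINAL and waited < timeout:
        sleep(delay)
        waited += delay
        delay = min(delay * 2, max_delay)  # exponential backoff, capped
        status = fetch_status()
    return status


# Usage with a fake service that reports "pending" twice, then "success".
# The sleep function is replaced so the example records delays instead of waiting.
responses = iter(["pending", "pending", "success"])
delays = []
final_status = wait_for_copy(lambda: next(responses), sleep=delays.append)
print(final_status, delays)
```

Because a copy can stay pending for a long time, a real caller would more likely persist the copy ID and re-check from a scheduled job than block in a loop like this one.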
The figure below shows a typical sequence of a StartCopyFromBlob operation invocation.
The CloudBlockBlob and CloudPageBlob classes in the Windows Azure Storage SDK v1.8 provide a StartCopyFromBlob() method, which in turn calls the WAS REST service operation. According to the Windows Azure Storage Team blog post, the request is placed on an internal queue and the call returns a copy ID and a copy state. The copy ID uniquely identifies the copy operation. It can be used later to verify the copy ID on the destination blob, and it is also the way to abort the copy operation at a later point in time. CopyState gives you the copy operation's status, the number of bytes copied, and so on.
Note that sequence 3, “PushCopyBlobMessage”, in the above figure is my assumption about the operation.
ListBlobs – Way for Compensation
Although you have the copy ID in hand, there is no simple API that takes an array of copy IDs and returns the corresponding copy states. Instead, you have to call CloudBlobContainer’s GetXXXBlobReference() to get the copy state. If the blob was created by a copy operation, it will have a CopyState. CopyState might be null for blobs that were not created by a copy operation.
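A batch lookup can be approximated on the client side by walking the destination blobs and matching their copy IDs. The following is a minimal Python sketch, not the SDK API: the `Blob` and `CopyState` classes and the `copy_states_by_id` helper are hypothetical stand-ins for what the storage library returns, and a blob without a copy state is modelled as `None`.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass
class CopyState:
    copy_id: str
    status: str          # "pending", "success", "aborted" or "failed"
    bytes_copied: int
    total_bytes: int


@dataclass
class Blob:
    name: str
    copy_state: Optional[CopyState] = None  # None: not created by a copy


def copy_states_by_id(blobs: Iterable[Blob],
                      copy_ids: Iterable[str]) -> Dict[str, Optional[CopyState]]:
    """Map each requested copy ID to the matching copy state, or None."""
    wanted = set(copy_ids)
    found: Dict[str, Optional[CopyState]] = {cid: None for cid in wanted}
    for blob in blobs:
        state = blob.copy_state
        if state is None:
            continue  # uploaded directly, so there is nothing to check
        if state.copy_id in wanted:
            found[state.copy_id] = state
    return found


# Usage with made-up blobs and copy IDs.
blobs = [
    Blob("report.pdf", CopyState("id-1", "success", 10, 10)),
    Blob("video.mp4", CopyState("id-2", "pending", 4, 100)),
    Blob("notes.txt"),  # no copy state
]
states = copy_states_by_id(blobs, ["id-1", "id-2", "id-3"])
for copy_id in sorted(states):
    state = states[copy_id]
    print(copy_id, state.status if state else "not found")
```

In this sketch a copy ID that maps to `None` only means that none of the inspected blobs carries that ID, so the caller still has to decide how to treat it.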
The compensation action here is whatever we need to do when a blob copy operation has neither succeeded nor is still pending. In most cases, calling StartCopyFromBlob() again will end in a successful blob copy. Otherwise, further remedy is needed.
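The decision described above can be sketched as a small function. This is a hedged Python illustration rather than SDK code: `compensate`, its return values, and the `restart_copy` callable (which stands in for issuing the copy request again and getting a new copy ID back) are all hypothetical, and the retry limit is an assumption added here so the retry cannot loop forever.

```python
from typing import Callable, Optional


def compensate(status: Optional[str],
               restart_copy: Callable[[], str],
               attempts_left: int) -> str:
    """Decide what to do for one destination blob.

    status is the copy status read from the destination blob (None when the
    blob has no copy state). restart_copy is a hypothetical callable that
    issues the copy request again and returns the new copy ID.
    """
    if status == "success":
        return "done"
    if status == "pending":
        return "wait"
    # Aborted, failed, or no copy state at all: try again if allowed.
    if attempts_left > 0:
        new_copy_id = restart_copy()
        return "retried:" + new_copy_id
    return "escalate"  # further remedy is needed, e.g. alerting an operator


# Usage with a fake restart function that always hands back the same ID.
print(compensate("success", lambda: "id-9", 3))
print(compensate("pending", lambda: "id-9", 3))
print(compensate("failed", lambda: "id-9", 1))
print(compensate("failed", lambda: "id-9", 0))
```

The new copy ID returned by a retry has to replace the old one in whatever bookkeeping the caller keeps, since later status checks should match against the latest request.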
It is a great pleasure to use StartCopyFromBlob(). It would be even more of a pleasure if the SDK or the REST API provided simple operations like the following:
GetCopyState(string[] copyIDs) : CopyState[]
RetryCopyFromBlob(string failedCopyId) : void