
Some tips and tricks for Azure Table storage

21 Oct 2013, CPOL
Leveraging the .NET Framework to delete Azure table entities faster

Introduction

So far, this was all about fetching data quickly. Now let’s walk through the options available for deleting data just as fast.

When we have a seemingly endless list of entities and want to delete most of them, going sequentially can be really expensive. Each entity gets its own HTTP request with the DELETE verb, and each delete request takes around 17 sec to respond successfully, so it could take more than an hour to delete an hour’s logs.

What we can do instead is divide the whole list of entities into smaller lists, each holding the entities that belong to the same partition.

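// One entry per distinct partition key (GenericEntityComparer is assumed to compare entities by PartitionKey).
var partitions = entities.Distinct(new GenericEntityComparer()).Select(p => p.PartitionKey);
IEnumerable<IEnumerable<GenericEntity>> ChunksOfWork = null;
foreach (string partition in partitions)
{
    // All entities that live in this partition.
    var ThisPartitionEntities = entities.Where(en => en.PartitionKey == partition).ToList();
}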
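As an alternative to filtering the full list once per partition, the same grouping can be done in a single pass with LINQ's GroupBy. The sketch below is only an illustration: SampleEntity is a hypothetical stand-in for GenericEntity, and PartitionGrouping and GroupByPartition are names invented for this example, not part of the original code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the article's GenericEntity; only the two keys matter here.
public class SampleEntity
{
    public string PartitionKey { get; set; }
    public string RowKey { get; set; }
}

public static class PartitionGrouping
{
    // Groups entities by PartitionKey in one pass over the input.
    public static Dictionary<string, List<SampleEntity>> GroupByPartition(
        IEnumerable<SampleEntity> entities)
    {
        return entities
            .GroupBy(e => e.PartitionKey)
            .ToDictionary(g => g.Key, g => g.ToList());
    }
}

public static class GroupingDemo
{
    public static void Main()
    {
        var entities = new List<SampleEntity>
        {
            new SampleEntity { PartitionKey = "logs-A", RowKey = "1" },
            new SampleEntity { PartitionKey = "logs-B", RowKey = "2" },
            new SampleEntity { PartitionKey = "logs-A", RowKey = "3" }
        };

        // Print how many entities ended up in each partition-specific list.
        foreach (var pair in PartitionGrouping.GroupByPartition(entities))
            Console.WriteLine(pair.Key + ": " + pair.Value.Count + " entities");
    }
}
```

With many partitions this avoids rescanning the whole entity list for every partition key, which is what the Where-based loop does.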

Then, split these partition-specific lists into chunks of 100 entities. Why 100? Because that’s the upper limit on the number of operations allowed per batch. Rules of the game.

var partitions = entities.Distinct(new GenericEntityComparer()).Select(p => p.PartitionKey);
// Start with an empty sequence so that no null check is needed inside the loop.
IEnumerable<IEnumerable<GenericEntity>> ChunksOfWork = Enumerable.Empty<IEnumerable<GenericEntity>>();
foreach (string partition in partitions)
{
    var ThisPartitionEntities = entities.Where(en => en.PartitionKey == partition).ToList();
    // Concat simply appends this partition's chunks; Union would also run a set-based duplicate check we don't need.
    ChunksOfWork = ChunksOfWork.Concat(ThisPartitionEntities.Chunk(100));
}

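public static class IEnumerableExtension
{
    // Splits a sequence into consecutive chunks of at most chunksize items.
    public static IEnumerable<IEnumerable<T>> Chunk<T>(this IEnumerable<T> source, int chunksize)
    {
        while (source.Any())
        {
            // ToList() materializes the chunk so it is not re-enumerated lazily later on.
            yield return source.Take(chunksize).ToList();
            source = source.Skip(chunksize);
        }
    }
}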
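To see what chunking buys us, here is a small sketch that chunks one partition's list by index and counts the batch requests needed. BatchPlanner, InChunksOf and CountBatchRequests are hypothetical names made up for this illustration (InChunksOf is used instead of Chunk because newer .NET versions ship their own Enumerable.Chunk), and the partition sizes are invented sample numbers.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class BatchPlanner
{
    // Upper limit on operations per batch, as stated in the article.
    public const int MaxBatchSize = 100;

    // Splits one partition's items into lists of at most chunkSize items.
    public static List<List<T>> InChunksOf<T>(IList<T> items, int chunkSize)
    {
        if (chunkSize <= 0) throw new ArgumentOutOfRangeException("chunkSize");
        var chunks = new List<List<T>>();
        for (int start = 0; start < items.Count; start += chunkSize)
        {
            int length = Math.Min(chunkSize, items.Count - start);
            var chunk = new List<T>(length);
            for (int i = start; i < start + length; i++)
                chunk.Add(items[i]);
            chunks.Add(chunk);
        }
        return chunks;
    }

    // Batch requests needed when every partition is chunked separately
    // (ceiling of count / MaxBatchSize, summed over the partitions).
    public static int CountBatchRequests(IEnumerable<int> entitiesPerPartition)
    {
        return entitiesPerPartition.Sum(n => (n + MaxBatchSize - 1) / MaxBatchSize);
    }
}

public static class BatchPlannerDemo
{
    public static void Main()
    {
        // Sample numbers: one partition with 250 entities, another with 30.
        var bigPartition = Enumerable.Range(0, 250).ToList();
        var chunks = BatchPlanner.InChunksOf(bigPartition, BatchPlanner.MaxBatchSize);
        Console.WriteLine(chunks.Count);      // prints 3 (chunks of 100, 100 and 50)

        int batches = BatchPlanner.CountBatchRequests(new[] { 250, 30 });
        Console.WriteLine(batches);           // prints 4, versus 280 single delete requests
    }
}
```

Note that chunks never span partitions: the 30 leftover entities of the second partition get their own batch rather than being merged with the 50 left over from the first.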

Create a context, attach all the entities of a single chunk to it, mark each one for deletion, and then send the delete requests as one batch.

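TableServiceContext tsContext = CreateTableServiceContext(tableClient);
foreach (GenericEntity entity in chunk)
{
    // "*" as the ETag makes the delete unconditional (no concurrency check).
    tsContext.AttachTo(SelectedTableName, entity, "*");
    tsContext.DeleteObject(entity);
}
// Send all pending deletes of this chunk as a single batch request.
tsContext.SaveChangesWithRetries(SaveChangesOptions.Batch);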

To make it faster, we can trigger the batch requests in parallel using the .NET Framework’s “Parallel” class. This works because the operations going on in one partition are independent of those in another, and each batch can only touch a single partition anyway.

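// Optional: cap the parallelism (1 forces sequential execution, -1 means no limit)
// by passing "options" as the second argument to Parallel.ForEach.
//   const bool forceNonParallel = true;
//   var options = new ParallelOptions { MaxDegreeOfParallelism = forceNonParallel ? 1 : -1 };
Parallel.ForEach(ChunksOfWork, chunk =>
{
    // One context per chunk, so every chunk is sent as its own batch.
    TableServiceContext tsContext = CreateTableServiceContext(tableClient);
    foreach (GenericEntity entity in chunk)
    {
        tsContext.AttachTo(SelectedTableName, entity, "*");
        tsContext.DeleteObject(entity);
    }
    tsContext.SaveChangesWithRetries(SaveChangesOptions.Batch);
});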
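The commented-out ParallelOptions above hint at capping the degree of parallelism. Below is a minimal sketch of how that could look, assuming a hypothetical ParallelBatchRunner helper and a stub delegate in place of the real batch delete, so that it runs without a storage account. None of these names come from the Azure SDK.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class ParallelBatchRunner
{
    // Calls deleteBatch once per chunk with at most maxParallel chunks in flight
    // (-1 means no limit, 1 forces sequential execution).
    // Returns the number of chunks processed.
    public static int Run<T>(IEnumerable<IList<T>> chunks, Action<IList<T>> deleteBatch, int maxParallel)
    {
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxParallel };
        int processed = 0;
        Parallel.ForEach(chunks, options, chunk =>
        {
            // In the article this step is: attach, DeleteObject, SaveChangesWithRetries.
            deleteBatch(chunk);
            // Several threads update the counter, so increment it atomically.
            Interlocked.Increment(ref processed);
        });
        return processed;
    }
}

public static class ParallelBatchDemo
{
    public static void Main()
    {
        // Sample chunks of made-up entity ids.
        var chunks = new List<IList<int>>
        {
            new List<int> { 1, 2, 3 },
            new List<int> { 4, 5 },
            new List<int> { 6 }
        };

        int deleted = 0;
        // Stub "delete": only counts the entities it was handed.
        int batches = ParallelBatchRunner.Run<int>(
            chunks, chunk => Interlocked.Add(ref deleted, chunk.Count), 2);

        Console.WriteLine(batches + " batches, " + deleted + " entities");
    }
}
```

Capping the parallelism is a judgment call: too many simultaneous batches may get throttled by the storage service, so it can be worth starting low and measuring.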

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

About the Author

Purbasha Ghosh
Software Developer (Senior)
India
Hi, I am Purbasha and I have been doing web development for 5 years now, mostly with Microsoft technologies. I love writing and hope to become a voracious blogger, although I am struggling to squeeze out the time right now.

Last Updated 21 Oct 2013
Article Copyright 2013 by Purbasha Ghosh