This is the first article of what may become a series of articles about Specialized Collections. The first two articles are about Queues - a Cyclic Queue presented here and a Prioritized Queue I hope to present soon enough. Other articles in the same spirit may follow depending on time availability and the responses I get from these articles.
Please note: the code described here refers to the .NET Framework 1.1. I may follow-up with a version using Generics for the .NET Framework 2.0 in a few months, we'll see.
So what is this about? Well, the broad purpose is to explain how you should write specialized collections, do's and don'ts and some tips.
In a more specific view, this article describes a Queue similar to the
System.Collections.Queue, except that it has a fixed buffer size. This means, of course, that the buffer could not be large enough to hold all the items added to the Queue, in which case the oldest items are being dropped.
What Is It Good For?
Loads of things! It can make you coffee, bake you a chocolate cake, massage your back, etc...
As far as I'm concerned, I'm using it to show a log at run-time. I have an application that is writing stuff to a central logging mechanism. This logging mechanism dispatches the logs to various clients (log-files, emails, SMS, etc.). One of the clients is a Windows application that displays the last logs in the system. The thing is that some of the clients like to leave this window open in the background, in order to check on the status of some running process once in a while. This causes the log window to keep enormous amounts of logs, which in the end becomes highly prohibitive in terms of memory requirements and performance. Moreover, due to the nature of the application, only the latest logs are really relevant. If you want to go further back - you can always open the log files.
I always wanted something like that in the SQL Server's Profiler. This very useful application keeps all the logs in memory, so in most cases you can't keep it running at all times. If there was such a feature in the Profiler, you could keep it open at all times (during QA tests for example), defined with a rather large buffer, and then when a problem occurs - you have the information you need at your disposal.
The implementation is very simple, really. The
CyclicQueue is created with a buffer-size. This is the size of the buffer (unless you call the
TrimToSize() method) for the whole lifetime of the object. Internally, we refer to the buffer by increasing indexes using modulus operations, which provides us with a cyclic behavior. So basically you can look at it as a cyclic linked-list, implemented with an array buffer.
So, as long as there is space left in the buffer, new items are added to it in their order of arrival. Once the buffer is full, and a new item arrives, the first item in the buffer (i.e. the 'oldest' one) is removed and the new item takes its place.
Things to Remember
1. Use Decompilers
None of us is perfect, and there are many people out there with nice ideas I couldn't have come up with. So the most important tip I can give you is to try and learn from others. In my case, it was very simple. Since I just wanted to change some of the functionality of the
System.Collections.Queue class, I inherited from it and used the Reflector to decompile the code of the original Queue in the Framework. This decompiled code guided me throughout the process of writing my own code, and helped me not to forget anything.
2. Keep Things Simple
This may sometimes come as a contradiction to the previous tip, but it's important nonetheless. The
System.Collections.Queue has some functionality I didn't need. For instance, its mechanism for handling the buffer is much more complicated, since its buffer increases automatically when needed. My buffer has a fixed size, so it's much easier to handle. So whenever you can - keep things simple.
Each collection that implements
IEnumerable must keep track of internal versioning. Whenever the collection is altered in any way, the version should be changed as well. This provides the enumerator the ability to fail enumerating on the collection if a change occurred during enumeration. The simplest way to support versioning is to have an integer member variable and increase it each time a method is called that changes the collection.
A collection should not be synchronized by default. That is, access to the collection should not be made thread-safe by default. Why? Well, in most cases the collection will be accessed from a single thread, so why add synchronization code that will harm your performance even though you don't need it?
On the other hand, you should provide a means to obtain a synchronized version of the collection. This is done by adding a
Synchronized, which receives the collection and wraps it with a synchronization helper class (exposing it as if it's the same type as the original collection). This synchronization wrapper simply forwards each call to the underlying (unsynchronized) collection, wrapped in a
For example, the call to
CyclicQueue.Synchronized(new CyclicQueue(10000)) will return a new object of type
SynchronizedCyclicQueue. The caller isn't even aware of that, since
SynchronizedCyclicQueue is an inner class and it inherits
CyclicQueue, the caller believes he has got a
CyclicQueue that has been magically made synchronized by someone, somehow...
Clear() method, for example, called on a synchronized version of the
CyclicQueue (i.e. the method called is
SynchronizedCyclicQueue.Clear() ) does the following:
public override void Clear()
Most collections provide a means to enumerate through their items. An enumerator cannot enumerate through a list that has changed. Therefore, as mentioned before, we need to keep track of the changes done. This is done with a single integer variable that is increased each time the collection is altered.
The actual enumeration is performed by another
private class -
CyclicQueueEnumerator. Why? First, it provides nice encapsulation of functionality. The second reason, however, is the most important one - by using a separate class for enumeration, you can use multiple enumerators on the same collection. As long as the collection is not altered, all the enumerators will work and can be used independently from one another.
On instantiation, the enumerator class receives the collection it should enumerate. It keeps the version value of the collection internally (using a readonly variable is a good idea). Each time an enumeration operation is required, it first checks that the version variable hasn't changed.
The Tester Application
I have written a small tester application. The first objective is to test all the methods and properties and make sure they behave properly. The second is to check that the synchronized version doesn't break when it is used with multiple threads.
So the first button creates a
CyclicQueue with a buffer of 4 and adds 6 items to it. It then displays the items so you can see that only the last 4 items remain. Then it runs the various methods and makes sure that the behavior is as expected.
The second button creates a
SynchronizedCyclicQueue with buffer size of 10000 and initiates 100 threads. Part of the threads add items to the queue and part remove items from it. Actually, they all spend quite some time waiting for the lock to be freed, which is exactly what I wanted to check.
Well, that's about it. I hope this article and the attached code will be useful, or at least interesting to some of you.
- 20th November, 2005: Initial post