This is an entry for the continuing series of blog entries that documents the design and implementation process of a library. This library is called, Network Alchemy[^]. Alchemy performs data serialization and it is written in C++.

I presented the design and initial implementation of the Datum[^] object in my previous Alchemy post. A Datum object provides the user with a natural interface to access each data field. This entry focuses on the message body that will contain the Datum objects, as well as a message buffer to store the data. I prefer to get a basis prototype up and running as soon as possible in early design & development in order to observe potential design issues that were not initially considered. In fact, with a first pass implementation that has had relatively little time invested, I am more willing to throw away work if it will lead to a better solution.

Many times we are reluctant to throw away work. However, we then continue to throw more good effort to try to force a bad solution to work; simply because we have already invested so much time in the original solution. One rational explanation for this could be because we believe it will take the same amount of time to reach this same point with a new solution. Spending little time on an initial prototype is a simple way to mitigate this belief. Also remember, the second time around we have more experience with solving this problem, so I would expect to move quicker with more deliberate decisions. So let's continue with the design of a message container.

Experience is invaluable

When I design an object, I always consider Item 18 in, Scott Meyer's, Effective C++:

Item 18: Make interfaces easy to use correctly and hard to use incorrectly.

The instructions for how to use an object's interface may be right there in the comments or documentation, but who the hell read's that stuff?! Usually it isn't even that good when we do take the time to read through it, or at least it first appears that way. For these very reasons, it is important to strive to make your object's interfaces intuitive. Don't surprise the user. And even if the user is expecting the wrong thing, try to make the object do the right thing. This is very vague advice, and yet if you can figure out how to apply it, you will be able to create award winning APIs.

This design and development journey that I am taking you on is record of my second iteration for this library. The first pass proved to me that it could be done, and that it was valuable. I also received some great feedback from users of my library. I fixed a few defects, and discovered some short-comings and missing features. However, the most important thing that I learned was that despite the simplicity of the message interface, I had made the memory management too complicated. I want to make sure to avoid that mistake this time.

Message Design

Although the Message object is the primary interface that the user will interact with Alchemy, it is a very simple interface. Most of the user interaction will be focused on the Datum fields defined for the message. The remainder of the class almost resembles a shared_ptr. However, I did not recognize this until this second evaluation of the Message.

Pros/Cons of the original design

Let me walk you through the initial implementation with rational. I'll describe what I believe I got right, and what seemed to make it difficult for the users. Here is a simplified definition for the class, omitting constructors and duplicate functions for brevity:

template < typename MessageT,  typename ByteOrderT >
class Message
  : public MessageT
{
public:
  //  General ************************
  bool       empty() const;
  size_t     size() const;
  void       clear();
  Message    clone() const;

  //  Buffer Management **************
  void       attach (buffer_ptr sp_buffer);
  buffer_ptr detach();
  buffer_ptr buffer();
  void       flush();

  //  Byte-order Management **********
  bool   is_host_order() const;

  Message < MessageT, HostByteOrder > to_host()    const;
  Message < MessageT, NetByteOrder >  to_network() const;
};

Pros

First the good, it's simpler to explain and sets up the context for the Cons. One of the issues I have noticed in the past is a message is received from the network, and multiple locations in the processing end up converting the byte-order of parameters. For whatever reason the code ended up this way, when it did, it was a time consuming issue to track down. Therefore I thought it would be useful to use type information to indicate the byte-order of the current message. Because of the parameterized definitions I created for byte-order management, the compiler elides any duplicate conversions attempted to the current type.

I believe that I got it right when I decided to manage memory internally for the message. However, I did not provide a convenient enough interface to make this a complete solution for the users. In the end they wrote many wrapper classes to adapt to the short-comings related to this. What is good about the internal memory management is that the Message knows all about the message definition. This allows it to create the correct size of buffer and verify the reads and writes are within bounds.

The Alchemy Datum fields have shadow buffers of the native type they represent to use as interim storage, and to help solve the data-alignment issue when accessing memory directly. At fixed points these fields would write their contents to the underlying packed message buffer that would ultimately be transmitted. The member function flush was added to allow the user to force this to occur. I will keep the shadow buffers, however, I must find a better solution for synchronizing the packed buffer.

Cons

Memory Access

I did not account for memory buffers that were allocated outside of my library. I did provide an attach and detach function, but the buffers these functions used had to be created with Alchemy. I did provide a policy-based integration path that would allow users to customize the memory management, however, this feature was overlooked, and probably not well understood. Ultimately what I observed, was duplicate copies of buffers that could have been eliminated, and a cumbersome interface to get direct access to the raw buffer.

This is also what caused issues for the shadow buffer design, keeping the shadow copy and packed buffer synchronized. For the most part I would write out the data to the packed buffer when I updated the shadow buffer. For reads, if an internal buffer was present I would read that into the shadow copy, and return the value of the shadow copy. The problems arose when I found I could not force the shadow copies to read or write from the underlying buffer on demand for a few types of data.

This led to the users discovering that calling flush would usually solve the problem, but not always. When I finally did resolve this issue correctly, I could not convince the users that calling flush was no longer necessary. My final conclusion on this uncontrolled method of memory access was brittle and became too complex.

Byte-Order Management

While I think I made a good decision to make the current byte-order of the message part of the type, I think it was a mistake to add the conversion functions, to_host and to_network, to the interface of the message. Ultimately, this complicates the interface for the message, and they can just as easily be implemented outside of the Message object. I also believe that I would have arrived at a cleaner memory management solution had I gone this route with conversion.

Finally, it seemed obvious to me that users would only want to convert between host and network byte-orders. It never occurred to me about the protocols that intentionally transmit over the wire in little-endian format. Google's recently open-source, Flat Buffers, is one of the protocol libraries that does it this way. However, some of my users are defining new standard protocols that transmit in little-endian format.

Message Redesign

After some analysis, deep-thought and self-reflection, here is the redesigned interface for the Message:

template < typename MessageT,  typename ByteOrderT >
class Message
  : public MessageT
{
public:
  //  General ************************
  //  ... These functions remain unchanged ...

  //  Buffer Management **************
  void assign(const data_type* p_buffer, size_t count);
  void assign(const buffer_ptr sp_buffer, size_t count);
  void assign(InputIt first, InputIt last);
  const data_type* data()   const;

  //  Byte-order Management **********
  bool   is_host_order() const;
};

// Now global functions
Message < MessageT, HostByteOrder>    to_host(...);
Message < MessageT, NetByteOrder>     to_network(...);
Message < MessageT, BigEByteOrder>    to_big_end(...);
Message < MessageT, LittleEByteOrder> to_little_end(...);

Description of the changes

Synchronize on Input

The new interface is modelled after the standard containers, with respect to buffer management. I have replaced the concept of attaching and detaching a user managed buffer with the assign function. With this function, users will be able to specify an initial buffer they would like to initialize the contents of the message with. At this point, the input buffer will be processed and the data copied to the shadow buffers maintained by the individual Datum objects. Ownership of the memory will not be taken for the buffers passed in by the user, the content will simply be read in order to initialize the message values. This solves half of the synchronization issue.

Synchronize on Output

The previous design was not strict enough with control of the buffers. The buffers should only be accessed by a single path between synchronization points. This forced me to basically write the data down to the buffer on ever field set, and read from the buffer on every get, that is if there was an internal buffer to reference.

Now there is a single accessor function to get access to the packed format of the message, data(). data() will behave very much like std::string::data(). A buffer will be allocated if necessary, the shadow copies will be synchronized, and the user will be given a raw-pointer to the data buffer. The behavior of modifying this buffer directly is undefined. This pointer will become invalid upon another synchronization function being called (to be specified once all calls are determined). Basically, do not hold onto this pointer. Access the data, and forget about the buffer.

This decision will also make a new feature to this version simpler to implement, dynamically-sized messages. This is one of the short-comings of the initial implementation, it could only operate on fixed-format message definitions. I will still be holding off the implementation of this feature until after the full prototype for this version has been developed. However, I know that the path to implement will be simpler now that the management of memory is under stricter control.

Byte-Order Conversion

I'm still shaking my head over this (I understand it, it's just hard to accept.) I will create a more consistent interface by moving the conversion functions to global scope rather than member functions of the message. It will then be simpler to add new converters, such as to_big_end and to_little_end. This also removes some functions from the interface making it even simpler.

Eliminate Flush

flush was an interim solution that over-stayed its invitation. In an attempt to figure things out on their own, users discovered that it solved their problem, most of the time. When I finally solved the real problem, I couldn't pry that function from their hands, and they were still kicking and screaming. Rightly so, they did not trust the library yet.

Adding refined control over synchronization, or even a partial solution usually is a recipe for disaster. I believe that I provided both with the flush member function. The new design model that I have created will allow me to remove some of the complicated code that I refused to throw away, because it was so close to working. This is proof to me yet again, sometimes it is better to take a step back and re-evaluate when things become overly complex and fragile. Programming by force rarely leads to a successful and maintainable solution.

Summary

So now you know this whole time I have actually been redesigning a concept I have worked through once with mediocre success. The foundation in meta-template programming you have read about thus far has remained mostly unchanged. Once we reached the Message class, I felt it would be more valuable to present the big-picture of this project. Now hopefully you can learn from my previous mistakes if you find yourself designing a solution with similar choices. I believe the new design is cleaner, more robust, and definitely harder to use incorrectly. Let me know what you think. Do you see any flaws in my rationale? The next topic will be the internal message buffer, and applying the meta-template processing against the message definition.

Original post blogged at Code of the Damned.