Alchemy: Data

Paul M Watt

Jul 2, 2014

CPOL

This entry continues the series of blog posts that document the design and implementation of a library called Network Alchemy. Alchemy performs data serialization and is written in C++.

By using the template construct, Typelist, I have implemented a basis for navigating the individual data fields in an Alchemy message definition. The Typelist contains no data. This entry describes the foundation and concepts to manage and provide the user access to data in a natural and expressive way.

Value Semantics

I would like to briefly introduce value semantics. You probably know the core concept of value semantics by another name, copy-by-value. Value semantics places importance on the value of an object and not its identity. Value semantics often implies immutability of an object or function call. A simple metaphor for comparison is money in a bank account. We deposit and withdraw based on the value of the money. Most of us do not expect to get the exact same bills and coins back when we return to withdraw our funds. We do, however, expect to get the same amount, or value, of money back (potentially adjusted by a formula that calculates interest).

Value semantics is important to Alchemy because we want to keep interaction with the library simple and natural. The caller will provide the values they would like to transfer in the message, and Alchemy will copy those values to the destination. There are many other important concepts related to value semantics. However, for now I will simply summarize the effects this will have on the design and caller interface.

The caller will interact with the message sub-fields as if they were the same type as defined in the message. The value held in each field should be usable in all of the same ways the caller would expect if they were working with the field's type directly. In essence, the data fields in an Alchemy message will support:

Copy
Assignment
Equality Comparison
Relative Comparison

Datum

Datum is the singular form of the word data. I prefer to keep my object names terse yet descriptive whenever possible. Datum is perfect in this particular instance. A Datum entry will represent a single field in a message structure. The Datum object will be responsible for providing an opaque interface for the actual data, which the user is manipulating. An abstraction like Datum is required to hide the details of the data processing logic. I want the syntax to be as natural as possible, similar to using a struct.

I have attempted to write this next section a couple of times, describing the interesting details of the class that I'm about to show you. However, the class described below is not all that interesting, yet. As I learned a bit later in the development process, this becomes a pivotal class in how the entire system works. At this point we are only interested in a basic data management object that provides value semantics. Therefore, I am simply going to post the code with a few comments.

Data management will be reduced to one Datum instance for each field in a message. The Datum is ultimately necessary to provide the natural syntax to the user and hide the details of byte-order management and portable data alignment.

Class Body

I like to provide typedefs in my generic code that are consistent and generally compatible with the typedefs used in the Standard C++ Library. With small objects, you may be surprised by how many new solutions can be composed from an orthogonal set of compatible type definitions:

	template < size_t   IdxT,
	           typename FormatT
	         >
	class Datum
	{
	public:
	  //  Typedefs ***********************************
	  typedef FormatT                        format_type;
	  typedef typename
	    TypeAt< IdxT, FormatT>::type         value_type;
	 
	  //  Member Functions ...
	 
	private:
	  value_type         m_value;
	};

The Datum template itself takes two parameters: 1) the field index in the Typelist, and 2) the Typelist itself. This is the most interesting statement from the declaration above:

	typedef typename
	    TypeAt< IdxT, FormatT>::type         value_type;

This creates the typedef value_type, which is the type the Datum object will represent. It uses the TypeAt meta-function that I demonstrated in the previous Alchemy entry to extract the type. In the sample message declaration at the end of this entry you will see how this all comes together.
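
For readers who do not have the previous entry at hand, the lookup can be sketched as follows. This is only an approximation: it assumes a C++11 variadic TypeList, which may differ from the Typelist and TypeAt implementations used in the series, and the sample_format name is made up for the example.

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical variadic Typelist; the series' own TypeList may differ.
template <typename... Ts>
struct TypeList { };

// Primary template, intentionally left undefined.
template <std::size_t IdxT, typename ListT>
struct TypeAt;

// Index 0: the head of the list is the answer.
template <typename HeadT, typename... TailT>
struct TypeAt<0, TypeList<HeadT, TailT...> >
{
  typedef HeadT type;
};

// Index N: drop the head and look up N-1 in the tail.
template <std::size_t IdxT, typename HeadT, typename... TailT>
struct TypeAt<IdxT, TypeList<HeadT, TailT...> >
{
  typedef typename TypeAt<IdxT - 1, TypeList<TailT...> >::type type;
};

// Illustrative format with three fields.
typedef TypeList<std::uint8_t, std::uint16_t, float> sample_format;

// The lookup happens entirely at compile time.
static_assert(std::is_same<TypeAt<1, sample_format>::type, std::uint16_t>::value,
              "index 1 names the second entry");
```

Because the result is a type, a Datum declared with index 1 of this format would store a std::uint16_t, with no run-time cost for the lookup.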

Construction

	// default constructor
	Datum()
	  : m_value(0)        
	{ }
	 
	// copy constructor
	Datum(const Datum &rhs)
	  : m_value(rhs.m_value)      
	{ }
	 
	// value constructor
	Datum(value_type rhs)
	  : m_value(rhs)      
	{ }

It is generally advised to qualify all single-parameter constructors with explicit, because implicit conversions can cause problems that are difficult to track down. These problems occur when the compiler is attempting to find "the best fit" for parameter types to be used in function calls. In this case, we do not want an explicit constructor, because it would eliminate the possibility of the natural syntax that I am working towards.
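
The trade-off can be seen in a small side-by-side sketch. ImplicitField and ExplicitField are hypothetical types written only for this comparison; neither is an Alchemy class.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Both types are hypothetical stand-ins, not Alchemy classes.
struct ImplicitField
{
  ImplicitField(std::uint16_t v) : m_value(v) { }           // converting constructor
  std::uint16_t m_value;
};

struct ExplicitField
{
  explicit ExplicitField(std::uint16_t v) : m_value(v) { }  // no implicit conversion
  std::uint16_t m_value;
};

// A raw value converts implicitly only when the constructor is not explicit.
static_assert( std::is_convertible<std::uint16_t, ImplicitField>::value,
              "implicit construction is allowed");
static_assert(!std::is_convertible<std::uint16_t, ExplicitField>::value,
              "explicit construction must be spelled out");

inline std::uint16_t read_implicit(ImplicitField f) { return f.m_value; }
inline std::uint16_t read_explicit(ExplicitField f) { return f.m_value; }

inline void constructor_demo()
{
  ImplicitField a = 7;                 // natural, struct-like syntax
  ExplicitField b(7);                  // direct-initialization is required
  // ExplicitField c = 7;              // would not compile
  // read_explicit(7);                 // would not compile

  assert(read_implicit(9) == 9);       // 9 is converted at the call site
  assert(read_explicit(ExplicitField(9)) == 9);
  assert(a.m_value == b.m_value);
}
```

The silent conversion in read_implicit(9) is exactly what makes explicit the usual recommendation, and it is also what allows a field to be initialized from a plain value.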

Conversion and Assignment

Closely related to the value constructor is the type-conversion operator. This operator provides a means to convert an object to the type named by the operator. Before C++11, the C++ standard did not allow the explicit keyword on type-conversion operators. Regardless of which version of the language you are using, this operator will not be declared explicit in the Datum object:

	// Convert to the underlying value_type
	operator value_type() const {
	  return m_value;
	}
	 
	// Assign a Datum object
	Datum& operator=(const Datum& rhs) {
	  m_value = rhs.m_value;
	  return *this;
	}
	 
	// Assign the value_type directly
	Datum& operator=(value_type rhs) {
	  m_value = rhs;
	  return *this;
	}
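
To show what these members buy the caller, here is a small sketch. MiniDatum is a simplified stand-in that is parameterized directly on the value type (an assumption made to keep the example free of the Typelist machinery), and doubled() is a hypothetical function that knows nothing about it.

```cpp
#include <cassert>
#include <cstdint>

// Simplified stand-in for Datum, parameterized directly on the value type.
template <typename T>
class MiniDatum
{
public:
  typedef T value_type;

  MiniDatum() : m_value(0) { }
  MiniDatum(value_type rhs) : m_value(rhs) { }

  // Implicit conversion back to the underlying type.
  operator value_type() const { return m_value; }

  // Assign the value_type directly.
  MiniDatum& operator=(value_type rhs) { m_value = rhs; return *this; }

private:
  value_type m_value;
};

// A function that knows nothing about MiniDatum.
inline std::uint32_t doubled(std::uint32_t v) { return v * 2u; }

inline std::uint32_t round_trip()
{
  MiniDatum<std::uint32_t> length;
  length = 21;                      // value assignment
  std::uint32_t raw = length;       // conversion operator extracts the value
  assert(raw == 21u);
  return doubled(length) + raw;     // conversion at the call site: 42 + 21
}
```

The caller never names the wrapper's internals; the field reads and writes like a plain std::uint32_t.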

Comparison operations

All of the comparison operators can be implemented in terms of less-than. Here is an example for how to define an equality test:

	bool operator==(const Datum& rhs) const {
	  return !(m_value < rhs.m_value)
	      && !(rhs.m_value < m_value);
	}

I will generally implement a separate equality test because, in many situations, simple data such as the length of a container can immediately rule two objects unequal. Therefore, I use two basic functions to implement relational comparisons:

	bool equal(const Datum &rhs) const {
	  return m_value == rhs.m_value;
	}
	 
	bool less(const Datum &rhs) const {
	  return m_value < rhs.m_value;
	}

All of the comparison operators can be defined in terms of these two functions. This is a good thing, because it eliminates duplicated code, and moves maintenance into two isolated functions.

	bool operator==(const Datum& rhs) const {
	  return  equal(rhs);
	}
	bool operator!=(const Datum& rhs) const {
	  return !equal(rhs);
	}
	bool operator< (const Datum& rhs) const {
	  return  less (rhs);
	}
	bool operator<=(const Datum& rhs) const {
	  return  less (rhs) || equal(rhs);
	}
	bool operator>= (const Datum& rhs) const {
	  return  !less (rhs);
	}
	bool operator> (const Datum& rhs) const {
	  return  !operator<=(rhs);
	}
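
The benefit of a separate equality test is easier to see with a container-like field. The following sketch uses a hypothetical Blob type (not an Alchemy class) whose equal() rejects mismatched lengths before comparing any bytes, while the six operators follow the same pattern as above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical container-like field; not an Alchemy type.
class Blob
{
public:
  Blob(const std::uint8_t* data, std::size_t len) : m_bytes(data, data + len) { }

  bool equal(const Blob& rhs) const {
    // Cheap rejection: different lengths can never be equal.
    if (m_bytes.size() != rhs.m_bytes.size())
      return false;
    return std::equal(m_bytes.begin(), m_bytes.end(), rhs.m_bytes.begin());
  }

  bool less(const Blob& rhs) const {
    return std::lexicographical_compare(m_bytes.begin(), m_bytes.end(),
                                        rhs.m_bytes.begin(), rhs.m_bytes.end());
  }

  bool operator==(const Blob& rhs) const { return  equal(rhs); }
  bool operator!=(const Blob& rhs) const { return !equal(rhs); }
  bool operator< (const Blob& rhs) const { return  less(rhs); }
  bool operator<=(const Blob& rhs) const { return  less(rhs) || equal(rhs); }
  bool operator>=(const Blob& rhs) const { return !less(rhs); }
  bool operator> (const Blob& rhs) const { return !operator<=(rhs); }

private:
  std::vector<std::uint8_t> m_bytes;
};

inline void blob_ordering_demo()
{
  const std::uint8_t bytes[] = { 1, 2, 3 };
  Blob whole(bytes, 3);
  Blob prefix(bytes, 2);

  assert(prefix != whole);   // decided by the length check alone
  assert(prefix <  whole);   // a proper prefix sorts first
  assert(whole  >= prefix);
}
```

Only equal() and less() contain real logic, so a change to how the bytes are compared is made in one place.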

Buffer read and write

One set of functions is still missing. These two functions are a read and a write operation into the final message buffer. I will leave these to be defined when I determine how best to handle memory buffers for these message objects.

Proof of concept message definition

Until now, I have only built up a small collection of simple objects, functions and meta-functions. It's important to test your ideas early, and analyze them often in order to evaluate your progress and determine if corrections need to be made. So I would like to put together a small message to verify the concept is viable. First we need a message format:

	typedef TypeList
	<
	  uint8_t,
	  uint16_t,
	  uint32_t,
	  int8_t,
	  int16_t,
	  int32_t,
	  float,
	  double
	> format_t;

Next is a structure definition that declares each data field. Notice how simple our definition for a data field has become, given that we have a pre-defined Typelist entry to specify as the format. The instantiation of the Datum template will take care of the details based on the specified index:

	struct Msg
	{
	  Datum<  0, format_t > one;
	  Datum<  1, format_t > two;
	  Datum<  2, format_t > three;
	  Datum<  3, format_t > four;
	  Datum<  4, format_t > five;
	  Datum<  5, format_t > six;
	  Datum<  6, format_t > seven;
	  Datum<  7, format_t > eight;
	};

Finally, here is a sample of code that interacts with this Msg definition:

	Msg msg;
	 
	msg.one   = 1;
	msg.two   = 2;
	msg.three = 3;
	 
	// Extracts the value_type value from each Datum,
	// and adds all of the values together.
	uint32_t sum = msg.one
	             + msg.two
	             + msg.three;
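
To try the idea in a single translation unit, the pieces can be assembled roughly as follows. This is a sketch under several assumptions: TypeList and TypeAt are written here as C++11 variadic templates, which may not match the versions from the earlier entries; Datum is reduced to the members this example exercises; and the format is trimmed to three integer fields.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical variadic stand-ins for the series' TypeList and TypeAt.
template <typename... Ts> struct TypeList { };

template <std::size_t IdxT, typename ListT> struct TypeAt;

template <typename H, typename... T>
struct TypeAt<0, TypeList<H, T...> > { typedef H type; };

template <std::size_t IdxT, typename H, typename... T>
struct TypeAt<IdxT, TypeList<H, T...> >
{
  typedef typename TypeAt<IdxT - 1, TypeList<T...> >::type type;
};

// Datum reduced to the members this example exercises.
template <std::size_t IdxT, typename FormatT>
class Datum
{
public:
  typedef FormatT                               format_type;
  typedef typename TypeAt<IdxT, FormatT>::type  value_type;

  Datum() : m_value(0) { }
  Datum(value_type rhs) : m_value(rhs) { }

  operator value_type() const { return m_value; }

  Datum& operator=(value_type rhs) { m_value = rhs; return *this; }

private:
  value_type m_value;
};

// Trimmed format: three unsigned fields of increasing width.
typedef TypeList<std::uint8_t, std::uint16_t, std::uint32_t> format_t;

struct Msg
{
  Datum<0, format_t> one;
  Datum<1, format_t> two;
  Datum<2, format_t> three;
};

inline std::uint32_t sum_fields()
{
  Msg msg;
  msg.one   = 1;
  msg.two   = 2;
  msg.three = 3;

  // Each Datum converts to its value_type before the built-in addition.
  std::uint32_t sum = msg.one + msg.two + msg.three;
  assert(sum == 6u);
  return sum;
}
```

No operator+ is defined for Datum; the additions compile because the non-explicit conversion operator lets the built-in arithmetic operators take over.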

Summary

All of the pieces are starting to fit together rather quickly now. There are only a few more pieces to develop before I will be able to demonstrate a working proof-of-concept Alchemy library. The library will only support the fundamental types provided by the language. However, it will be possible to define message formats, assign values to the Datum fields, and write those values to buffers. The contents of these buffers will automatically be converted to the desired byte-order before they are transmitted to the destination.

To reach the working demo, I still need to implement a memory buffer mechanism, the parent message object, and integrate the byte-order operations that I developed early on. Afterwards, I will continue to document the development, which will include support for these features:

Nested messages
Simulated bit-fields
Dynamically sized message fields
Memory access policies (allows adaptation for hardware register maps)
Utility functions to simplify use of the library

This feature set is called Mercury (Hg), as in Mercury, Messenger of the Gods. Afterwards, there are other feature sets that are orthogonal to Hg, which I will explore and develop. For example, adapters that will integrate Hg messages with Boost.Serialization and Boost.Asio, as well as custom-written communication objects. There is also a need for utilities that translate an incoming message format to an outgoing message format for forwarding.

Feel free to send me comments, questions and criticisms. I would like to hear your thoughts on Alchemy.