Alchemy Typelist Operations

Paul M Watt

Rate me:

5.00/5 (2 votes)

20 Jun 2014CPOL12 min read

10.4K

Alchemy Typelist operations

I discussed the Typelist with greater detail in my previous post. However, up to this point, I haven't demonstrated any practical uses for the Typelist. In this entry, I will further develop operations for use with the Typelist. In order to implement the final operations in this entry, I will need to rely on, and apply the operations that are developed at the beginning in order to create a simple and elegant solution.

Useful Operations

The operations I implemented in the previous entry were fundamental operations that closely mimic the same operations you would find on a traditional runtime linked-list. More sophisticated operations could be composed from those basic commands, however, the work would be tedious, and much of the resulting syntax would be clumsy for the end user. These are two things that I am trying to avoid in order to create a maintainable library. I will approach the implementation of these operations with a different tack, focused on ease of use and maintenance.

We need operations to navigate the data format definitions defined in Alchemy. My goal is to be able to iterate through all of the data fields specified in a network packet format, and appropriately process the data at each location in the packet. This includes identifying the offset of a field in the packet, performing the proper action for a field entry, and also performing static-checks on the types used in the API. There are the operations that I believe I will need:

Length: The number of type entries in the list
TypeAt: The defined type at a specified index
SizeAt: The sizeof of type at a specified index
OffsetOf: The byte offset of the type at a specified index
SizeOf: The total size in bytes for all of the types specified in the list

This list of operations should give me the functionality that I need to programmatically iterate through the well defined fields of a message packet structure. I have already demonstrated the implementation for Length[^]. That leaves four new operations to implement. As you will see, we will use some of these new operations as blocks to build other more complex operations. This is especially important in the functional programming environment that we are working within.

Style / Guidelines

I have a few remarks before I continue. It is legal to use integral types in parameterized-type declarations such as size_t, however floating-point values, string literals, and pointers are not legal syntax. struct types are typically used when creating meta-functions because they use public-scope by default, and this saves a small amount of verbosity in the code definitions.

The keyword typename and class are interchangeable in template parameter declarations. As a matter of preference, I choose typename to emphasize that the required type need not be a class-type. Finally, it is legal to provide default template values using the same rules for default function parameters; all of the parameters after an entry with a default type must also have default values. However, this can only be used for class and struct templates, but not for function templates.

TypeAt

Traditionally, nodes in linked-lists are accessed by iterating through the list until the desired node is found, which makes this a linear operation, O(n). Something to keep in mind when meta-programming is many times operations will reduce to constant runtime, O(1). However, the traditional linked-list iteration method required to traverse the Typelist is inconveniently verbose and will still require O(n) compile-time operations. The code below demonstrates the explicit extraction of the three types for our example Typelist.

C++

// This syntax is required to access the three types:
integral_t::type::head_t                   charVal  = 0; 
integral_t::type::tail_t::head_t           shortVal = 0; 
integral_t::type::tail_t::tail_t::head_t   longVal  = 0;

Because the message definitions in Alchemy may have 20-30 type entries, I want to create access methods that rely on node index rather than explicit initialization. The underlying implementation will be basically the same as a linked-list iteration. However, because the compiler will reduce most of these operations to O(1) time, we will not pay the run-time cost of providing random access to a sequential construct.

If you are worried about creating an enormous increase in your compile times, stop worrying, for now at least. Compilers have become very efficient at instantiating templates, and remembering what has already been generated. If a recursive access of the 25^th node has been generated, then later access to the 26^th node must be generated, modern compilers will detect they have already generated up to the 25^th node. Therefore, only one new instantiation is generated to reach the 26^th element. Unfortunately, this cost must be paid for each compilation unit that these templates are instantiated. There is no need to pre-maturely optimize your template structures until you determine the templates have become a problem.

The first step is to define the function signature for TypeAt. This code is called a template declaration, the definition has not been provided yet.

C++

/// Return the type at the specified index
template < size_t IdxT,
           typename ContainerT,
           typename ErrorT = error::Undefined_Type >
struct Type At;

Generic Definition

We will provide a catch-all generic definition that simply declares the specified Error Type. This is the instance that will be generated for any instance of TypeAt defined with the ContainerT parameter that does not have a defined specialization:

C++

/// Return the type at the specified index
template < size_t IdxT,
           typename ContainerT,
           typename ErrorT = error::Undefined_Type >
struct TypeAt
{
  typedef ErrorT               type;
};

The Final Piece

The first step is to determine what the correct implementation would look like for a type at a specified index. This code shows what the usage would look like to access the third element in our integral_t Typelist

C++

TypeAt < 2, integral_t >::type

It's important to remember that integral_t is actually a typedef for the template instantiation with our defined types. Also, initially there will be 32 available entries in the Typelist definition for an Alchemy message.

The definition of our Typelist contains every type in the index that it is defined. Rather than writing a recursive loop and terminator to count and extract the correct type, we can refer to the indexed type directly. The only catch, is that this method of implementation will require a template definition for each possible index. Therefore, the actual template definition for this operation must be defined like this:

C++

template 
< typename 
    TypeList 
    < typename T0,  typename T1,  typename T2,  typename T3, 
      typename T4,  typename T5,  typename T6,  typename T7, 
      typename T8,  typename T9,  typename T10, typename T11, 
      typename T12, typename T13, typename T14, typename T15, 
      typename T16, typename T17, typename T18, typename T19, 
      typename T20, typename T21, typename T22, typename T23, 
      typename T24, typename T25, typename T26, typename T27, 
      typename T28, typename T29, typename T30, typename T31
    >, 
    typename ErrorT = error::Undefined_Type 
> 
struct TypeAt< (2), 
               TypeList < T0,  T1,  T2,  T3,  T4,  T5,  T6,  T7,
                          T8,  T9,  T10, T11, T12, T13, T14, T15,
                          T16, T17, T18, T19, T20, T21, T22, T23,
                          T24, T25, T26, T27, T28, T29, T30, T31
                         >
             >
{
  typedef T2 type;
};

After I reached this implementation, I decided that I had to find a simpler solution. To implement the final piece of the TypeAt operation, we will rely on MACRO code generation[^]. The work will still be minimal with the MACROs that I introduced earlier. This is the MACRO that will define the template instance for each index.

C++

#define tmp_ALCHEMY_TYPELIST_AT(I)                 \
template < TMP_ARRAY_32(typename T),               \
          typename ErrorT >                        \
struct TypeAt< (I),                                \
               TypeList< TMP_ARRAY_32(T) >,        \
               ErrorT                              \
             >                                     \
{                                                  \
  typedef TypeList < TMP_ARRAY_32(T) >  container; \
  typedef T##I                        type;        \
}

The declaration of the MACRO in this way will define the entire structure above:

C++

tmp_ALCHEMY_TYPELIST_AT(2);

Therefore, I declare an instance of this MACRO for as many indices are allowed in the Typelist.

C++

//  MACRO Declarations for each ENTRY 
//  that is supported for the TypeList size 
tmp_ALCHEMY_TYPELIST_AT(0);
tmp_ALCHEMY_TYPELIST_AT(1);
tmp_ALCHEMY_TYPELIST_AT(2);
tmp_ALCHEMY_TYPELIST_AT(3);
// ... 
tmp_ALCHEMY_TYPELIST_AT(30);
tmp_ALCHEMY_TYPELIST_AT(31);

//  Undefining the MACRO to prevent its further use. 
#undef tmp_ALCHEMY_TYPELIST_AT

Boost has libraries that are implemented in similar ways, however, they have expanded their code to actually have MACROs define each code generation MACRO at compile-time based on a constant. I have simply hand defined each instance because I have not created as sophisticated of a pre-processor library as Boost has. Also, at the moment, my library is a small special purpose library. If it becomes more generic with wide-spread use, it would probably be worth the effort to make the adjustment to a dynamic definition.

SizeOf

Although there already exists a built-in operator to report the size of a type, we will need to acquire the size of a nested structure. We are representing structure formats with a Typelist, which does not actually contain any data. Therefore we will create a sizeof operation that can report both the size of an intrinsic type, as well as a Typelist. We could take the same approach as TypeAt with MACROs to generate templates for each sized Typelist, and a generic version for intrinsic types. However, if we slightly alter the definition of our Typelist definition, we can implement this operation with a few simple templates.

Type-traits

We will now introduce the concept of type-traits[^] into our implementation of the Typelist to help us differentiate the different types of objects and containers that we create. This is as simple as creating a new type.

C++

struct container_trait{};

Now derive the Typelist template from container_trait. The example below is from an expansion of the 3 node declaration for our Typelist:

C++

template< typename T0,  typename T1,  typename T2> 
struct TypeList< T0,  T1,  T2>
  : container_trait
{ 
  typedef TypeNode< T1, 
          TypeNode< T2, 
          TypeNode< T3, empty> 
        > >                          type;
};

We now need a way to be able to discriminate between container_trait types and all other types. We will make use of one of the type templates found in the < type_traits > header file, is_base_of. This template creates a Boolean value set to true if the type passed in derives from the specified base class.

C++

//  Objects derived from the container_trait 
//  are considered type containers.
template < typename T >
struct type_container
  : std::integral_constant
      < bool, std::is_base_of< container_trait, T >::value >
{  };

This type discriminator can be used to discern and call the correct implementation for our implementation of sizeof.

C++

template < typename T >
struct SizeOf
  : detail::SizeOf_Impl
    < T, 
     type_container< T >::value 
    >
{ };

That leaves two distinct meta-functions to implement. One that will calculate the size of container_types, and the other to report the size for all other types. I have adopted a style that Boost uses in its library implementations, which is to enclose helper constructs in a nested namespace called detail. This is a good way to notify a user of your library that the following contents are implementation details since these constructs cannot be hidden out of sight.

C++

namespace detail
{

// Parameterized implementation of SizeOf
template < typename T, bool isContainer = false >
struct SizeOf_Impl
  : std::integral_constant< size_t, sizeof(T) >
{ };

// SizeOf implementation for type_containers
template < typename T>
struct SizeOf_Impl< T, true >
  : std::integral_constant< size_t,
                            ContainerSize< T >::value
                          >
{ };

} // namespace detail

The generic implementation of this template simply uses the built-in sizeof operator to report the size of type T. The container_trait specialization calls another template meta-function to calculate the size of the Typelist container. I will have to wait and show you that after a few more of our operations are implemented.

SizeAt

The implementation for SizeAt builds upon both the TypeAt and sizeof implementations. The implementation also returns to the use of MACRO code generation to reduce the verbosity of the definitions. This implementation queries for the type at the specified index, then uses sizeof to record the size of the type. Here is the MACRO that will be used to define the template for each index:

C++

#define tmp_ALCHEMY_TYPELIST_SIZEOF_ENTRY(I)                   \
template < TMP_ARRAY_32(typename T) >                          \
struct SizeAt< (I), TypeList< TMP_ARRAY_32(T) > >              \
{                                                              \
  typedef TypeList< TMP_ARRAY_32(T) >  Container;              \
  typedef typename TypeAt< (I), Container >::type TypeAtIndex; \
  enum { value = SizeOf< TypeAtIndex >::value };               \
}

Once again, there is a set of explicit MACRO declarations that have been made to define each instance of this meta-function.

C++

// MACRO Declarations for each ENTRY that is supported for the TypeList  size **
tmp_ALCHEMY_TYPELIST_SIZEOF_ENTRY(0);
tmp_ALCHEMY_TYPELIST_SIZEOF_ENTRY(1);
tmp_ALCHEMY_TYPELIST_SIZEOF_ENTRY(2);
// ... 
tmp_ALCHEMY_TYPELIST_SIZEOF_ENTRY(30);
tmp_ALCHEMY_TYPELIST_SIZEOF_ENTRY(31);

// Undefining the declaration MACRO 
// to prevent its further use. 
#undef tmp_ALCHEMY_TYPELIST_SIZEOF_ENTRY

OffsetOf

Now we're picking up steam. In order to calculate the offset of an item in a Typelist, we must start from the beginning, and calculate the sum of all of the previous entries combined. This will require a recursive solution to perform the iteration, as well as the three operations that we have implemented up to this point. Here's the prototype for this meta-function:

C++

//  Forward Declaration 
template < size_t Index,
           typename ContainerT >
struct OffsetOf;

If you have noticed that I do not always provide a forward declaration, it is because it usually depends on if my general implementation will be the first instance encountered by the compiler, dependencies, or if I have a specialization that I would like to put in place. In this case, I am going to implement a specialization for the offset of index zero; the offset will always be zero. This specialization will also act as the terminator for the recursive calculation.

C++

/// The item at index 0 will always have an offset of 0. 
template < typename ContainerT >
struct OffsetOf< 0, ContainerT >
{
  enum { value = 0 };
};

One of the nasty problems to tackle when writing template meta-programs, is that debugging your code becomes very difficult. The reason being, many times by the time you are actually able to see what is generated, the compiler has reduced your code to a single number. Therefore, I like to try and write a traditional function that performs a similar calculation, then convert it to a template. Pretty much the same as if I were trying to convert a class into a parameterized object. This is essentially the logic we will need to calculate the byte offset of an entry from a Typelist definition.

C++

// An array of values to stand-in
// for the Typelist.
size_t elts[] = {1,2,3,4,5,6};

size_t OffsetOf(size_t index)
{
  return OffsetOf(index - 1)
       + sizeof(elts[index-1]);
};

This code adds the size of the item before the requested item, to its offset to calculate the offset if the requested item. In order to get the offset of the previous item, it recursively performs this action until index 0 is reached, which will terminate the recursion. This is what the OffsetOf function looks like once it is converted to the template and code-generating MACRO.

C++

#define tmp_ALCHEMY_TYPELIST_OFFSETOF(I)                  \
template < TMP_ARRAY_32(typename T) >                     \
struct OffsetOf< (I), TypeList< TMP_ARRAY_32(T) > >       \
{                                                         \
  typedef TypeList< TMP_ARRAY_32(T) >   container;        \
                                                          \
  enum { value = OffsetOf< (I)-1, container >::value      \
               + SizeAt  < (I)-1, container >::value };   \
}

This operation also requires the series of MACRO declarations to properly define the template for every index. However, this time we do not define an entry for index 0 since we explicitly implemented a specialization for it.

C++

//  Offset for zero is handled as a special case above
tmp_ALCHEMY_TYPELIST_OFFSETOF(1);
tmp_ALCHEMY_TYPELIST_OFFSETOF(2);
// ... 
tmp_ALCHEMY_TYPELIST_OFFSETOF(31);

//  Undefining the declaration MACRO to prevent its further use.
#undef tmp_ALCHEMY_TYPELIST_OFFSETOF

ContainerSize

Only one operation remains. This operation is one that we had to put aside until we completed more of the operations for the Typelist. The purpose of ContainerSize is to calculate the size of an entire Typelist. This will be very important to be able to support nested data structures. Here is the implementation:

C++

template < typename T >
struct ContainerSize
  : type_check< type_container< ContainerT >::value >
  , std::integral_constant
    < size_t, 
      OffsetOf< Hg::length< T >::value, T >::value
    >
{ };

I will give you a moment to wrap your head around this.

The first think that I do is verify that the type T that is passed into this template is in fact a type container, the Typelist. type_check is a simple template declaration that verifies the input predicate evaluates to true. There is no implementation for any other type, which will trigger a compiler error. In the actual source I have comments that indicate what would cause an error related to type_check and how to resolve it.

Next, the implementation is extremely simple. A value is defined to equal the offset at the item one passed the last item defined in the Typelist. This behaves very much like end interators in STL. It is ok to refer to the element passed the end of the list, as long as it is not dereferenced for a value. The last item will not be dereferenced by OffsetOf because it refers to the specified index minus one.

Summary

This covers just about all of the work that is required of the Typelist for Alchemy. At this point, I have a type container that I can navigate its set of types in order, determine their size and offset in bytes, and I can even support nested Typelists with these operations.

What is the next step? I will need to investigate how I want to internally represent data, provide access with value semantics to the user in an efficient manner. I will also be posting on more abstract concepts that will be important to understand as we get deeper into the implementation of Alchemy, such as SFINAE and ADL lookup for templates.

Original post blogged at Code of the Damned.

This article was originally posted at http://codeofthedamned.com/index.php/alchemy-typelist-operations

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Written By

Paul M Watt

Engineer

United States

I am a software architect and I have been developing software for nearly two decades. Over the years I have learned to value maintainable solutions first. This has allowed me to adapt my projects to meet the challenges that inevitably appear during development. I use the most beneficial short-term achievements to drive the software I develop towards a long-term vision.

C++ is my strongest language. However, I have also used x86 ASM, ARM ASM, C, C#, JAVA, Python, and JavaScript to solve programming problems. I have worked in a variety of industries throughout my career, which include:
• Manufacturing
• Consumer Products
• Virtualization
• Computer Infrastructure Management
• DoD Contracting

My experience spans these hardware types and operating systems:
• Desktop
o Windows (Full-stack: GUI, Application, Service, Kernel Driver)
o Linux (Application, Daemon)
• Mobile Devices
o Windows CE / Windows Phone
o Linux
• Embedded Devices
o VxWorks (RTOS)
o Greenhills Linux
o Embedded Windows XP

I am a Mentor and frequent contributor to CodeProject.com with tutorial articles that teach others about the inner workings of the Windows APIs.

I am the creator of an open source project on GitHub called Alchemy[^], which is an open-source compile-time data serialization library.

I maintain my own repository and blog at CodeOfTheDamned.com/[^], because code maintenance does not have to be a living hell.

Use Ctrl+Left/Right to switch messages, Ctrl+Up/Down to switch threads, Ctrl+Shift+Left/Right to switch pages.