Typelist Operations

Paul M Watt

Rate me:

4.87/5 (5 votes)

21 Jun 2014CPOL12 min read

15.3K

I would like to devote this entry to further discuss the Typelist data type. Previously, I explored the Typelist[^] for use in my network library, Alchemy[^]. I decided that it would be a better construct for managing type info than the std::tuple. The primary reason is the is no data associated with the types placed within the. On the other hand, std::tuple manages data values internally, similar to std::pair. However, this extra storage would cause some challenging conflicts for problems that we will be solving in the near future. I would not have foreseen this, had I not already created an initial version of this library as a proof of concept. I will be sure to elaborate more on this topic when it becomes more relevant.

Linked-Lists

Linked lists are one of the fundamental data structures used through-out computer science. After the fixed-size array, the linked-list is probably the simplest data structure to implement. The Typelist is a good structure to study if you are trying to learn C++ template meta-programming. While the solutions are built in completely different ways, the structure, operations and concepts are quite similar between the two.

Any class or structure can be converted to a node in a linked list by adding a member pointer to your class, typically called next or tail, to refer to the next item in the list. Generally this is not the most effective implementation, however, it is common, simple, and fits very nicely with the concepts I am trying to convey. Here is an example C struture that we will use as a basis for comparison while we develop a complete set of operations for the Typelist meta-construct:

C++

// An integer holder
struct integer
{
  long   value;
};

This structure can become a node in a linked-list by adding a single pointer to a structure of the same type:

C++

// A Node in a list of integers
struct integer
{
  long     value;
  integer *pNext;
};

Given a pointer to the first node in the list called, head, each of the remaining nodes in the list can be accessed by traversing the pNext pointer. The last node in the list should set its pNext member to 0 to indicate the end of the list. Here is an example of a loop that prints out the value of every point node in a list:

C++

void PrintValues (integer *pHead)
{
  integer *pCur = pHead;
  while (pCur)
  {
    printf("Value: %d\n", pCur->value);
    pCur = pCur->pNext;
  }
}

This function is considered to be written with an imperative style because of pCur state variable that is updated with each pass through the loop. Recall that template meta-programming does not allow mutable state; therefore, meta-programs must rely on functional programming techniques to solve problems. Let's modify the C function above to eliminate the use of mutable state. This can be accomplished with recursion.

C++

void PrintValues (integer *pHead)
{
  if (pHead)
  {
    printf("Value: %d\n", pHead->value); 
    PrintValues(pHead->pNext);
  }
}

This last function makes a single test for the validity of the input parameter, then performs the print operation. Afterwards it will call itself recursively for the next node in the list. When the last node is reached, the input test will fail, and the function will exit with no further actions. Since that is the last operation in the function, each instance of the call will pop off of the call stack until the stack frame the call originated from is reached. Incidentally, this type of recursion is called tail recursion. As we saw earlier, this form of recursion can easily be written as a loop in imperative style programs.

Typelists

Let's turn our focus to the main topic now, Typelists. Keep in mind that the goal of using a construct like a Typelist is to manage and process type information at compile-time, rather than process data at run-time. Here is the node definition I presented in a previous post to build up a Typelist with templates:

C++

template < typename T, typename U >
struct Typenode
{
  typedef T        head_t;
  typedef U        tail_t;
};

// Here is a sample declaration of a Typelist.
// Refer to my previous blog entry on Typelists for the details.
typedef Typelist < char, short, long >    integral_t;

// This syntax is required to access the type, long:
integral_t::type::tail_t::tail_t::head_t   longVal = 0;

Object Structure

The concepts and ideas in computer science are very abstract. Even when code is presented, it is merely a representation of an abstract concept in some cryptic combination of symbols. It may be helpful to create a visualization of the structure of the objects that we are dealing with. Figure 1 illustrates the nested structure that is used to construct the Typelist we have just defined:

Another way to relate this purely conceptual type defined with templates, is to define the same structure without the use of templates. Here is the definition of the Typelist above defined using C++ without templates:

C++

struct integral_t
{
  typedef struct type
  {
    typedef char   head_t;
    typedef struct tail_t
    {
      typedef short  head_t;
      typedef struct tail_t
      {
        typedef long    head_t;
        typedef empty_t tail_t;
      };
    };
  };
};

This image depicts the nested structure of this Typelist definition.

Figure 1. Non-template Typelist Structure

There is one final definition that I think will be helpful to demonstrate the similarities shared between the structures of the linked-list and the Typelist. It may be useful to think about how you would solve a problem with the nested linked-list definition when trying to compose a solution for the templated Typelist. Imagine what the structure of the linked-list would look like if the definition for the next node in the list was defined in place, inside of the current Integer holder rather than a pointer. We will replace the zero-pointer terminator with a static constant that indicates if the node is the last node. Finally, I have also changed the names of the fields from value to head and next to tail. Here is the definition required for a 3-entry list.

C++

// A 3-node integer list
struct integer
{
  long   head;
  struct integer
  {
    long   head;
    struct integer
    {
      long.  head;
      static const bool k_isLast = true;
    } tail;
    static const bool k_isLast = false;
  } tail;
  static const bool k_isLast = false;
};

Here is an illustration for the structure of this statically defined linked-list.

Figure 2. Statically defined Linked-List

Take note of the consequences of the last change in structure that we made to the linked-list implementation. It is no longer a dynamic structure. It is now a static definition that is fixed in structure and content once it is compiled. Each nested node does contain a value that can be modified, unlike the Typelist. However, in all other aspects, these are two similar constructs. Hopefully these alternative definitions can help you gain a better grasp of the abstract structures we are working with as we work to create useful operations for these structures.

Basic Operations

Let's run through building a few operations for the Typelist that are similar to operations that are commonly used with a linked-list. The structure of the Typelist that we have defined really only leaves one useful goal for us to accomplish, to access the data type defined inside of a specific node. This is more complicated than it sounds because we have to adhere to the strict rules of functional programming; ie. No mutable state or programming side-effects.

Error Type

Before we continue, it might be useful to define a type that can represent an error has occurred in our meta-program. This will be useful because the Error Type will appear in the compiler error message. This could help us more easily deduce the cause of the problem based on the Error Type reported in the message. We will simply define a few new types, and we can add to this list as necessary:

C++

namespace error
{
struct Undefined_Type;
struct Invalid_Type_Size;
}

The type definitions do not need to be complete definitions because they are never intended to be instantiated. Remember, type declarations are the mechanism we use to define constants and values in meta-programming.

Syntactic Sugar

I wrote my previous entry on using the preprocessor for code generation[^]. I demonstrated how to use the preprocessor to simplify declarations for some of the verbose Typelist definitions that we have had to use up to this point. I make use of these MACROs to provide a syntactic sugar for the definition of some of the implementations below. For example, the regular form of a Typelist declaration looks like this:

C++

template < T0 , T1 , T2 , T3 , T4 , T5 , T6 , T7 ,
           T8 , T9 , T10, T11, T12, T13, T14, T15,
           T16, T17, T18, T19, T20, T21, T22, T23,
           T24, T25, T26, T27, T28, T29, T30, T31>

The previous declaration can be shortened to the following form with the code-generation MACRO:

C++

template < TMP_ARRAY_31(typename T) >

There will be an additional specialization defined for many of the function implementations below that match this format. That is because this form of the definition is the outer wrapper that contains the internally defined TypeNode. All of the implementations below are developed to work upon the TypeNode. If we did not provide this syntactic sugar, a different implementation of each operation would be required for each Typelist of a different size. For 32 nodes, that would require 32 separate implementations.

Length

I showed the implementation for the Length operation in the blog entry that I introduced the Typelist. Here is a link to that implementation for you to review Length[^]. With the Length operation we now have our first meta-function to extract information from the Typelist. Here is what a call to Length looks like:

C++

// Calls the meta-function Length 
// to get the number of items in integral_t. 
size_t count = Length < integral_t >::value;

// count now equals 3

front

Because of the nested structure used to build up the Typelist, accessing the type of the first node will be imperative for us to be able to move on to more complex tasks. There are two fields in each Typenode, the head_t, the current type, and tail_t, the remainder of the list. The name the C++ standard uses to access the first element in a container is, front. Therefore, that is what we will name our meta-function.

The implementation of front is probably the simplest function that we will encounter. There are only two possibilities when we go to access the head_t type in a node; 1) it will contain a type, 2) it will contain empty. Furthermore, the first node is always guaranteed to exist. To implement front, a general template definition will be required, as well as a specialization to account for the empty type.

C++

/// General Implementation
template < TMP_ARRAY_32(typename T) >
struct front < Typelist < TMP_ARRAY_32(T) > >
{
  // The type of the first node in the Typelist.
  typedef T0 type;
};

Here is the specialization definition to handle the empty case:

template < >
struct front < empty >
{ };

Why is there not a definition for the type within the empty specialization? That is because the type of code that we are developing will all be resolved at compile-time. A compiler error will be generated if front < empty > ::type is accessed, because it's invalid. However, if we had defined a type definition for the empty specialization, we would need to then write code to detect this case at run-time. Detecting potential errors at compile-time eliminates unnecessary run-time checks that would add extra processing. The final result is that we are detecting programming logic errors in our code, and the use of our code, by making the logic-errors invalid syntax.

back

Just as we were able to access the first element in the list, we can extract the type from the last node. This is also relatively simple since we have already implemented a method to count the length of the Typelist.

/// This allows the last type of the list to be returned.
template < TMP_ARRAY_32(typename T) >
struct back < TypeList < TMP_ARRAY_32(T) > >
{
  typedef 
    TypeList < TMP_ARRAY_32(T) >  container;
  typedef 
    TypeAt < length < container >::value-1, 
             container 
           >                      type;
};

// This is the specialization for the empty node.
template < >
struct back < empty >
{ };

pop_front

To navigate through the rest of the list we will need to dismantle it node-by-node. The simplest way to accomplish this is to remove the front node until we reach the desired node. A new Typelist is created from the tail of the current Typelist node as a result of the pop_front operation. Because the meta-functions are built from templates, this new Type list will be completely compatible with all of the operations that we develop for this type. Here is the forward-declaration of the meta-function:

C++

template < typename ContainerT > 
struct pop_front;

Up to this point, the meta-functions that we have developed only extracted a single type. The implementation for pop_front differs slightly in structure from the other templates that we have created up to this point. Remember that a new Typelist must be defined as the result. In order to do this, the instantiation of a new Typelist type must be defined within our meta-function definition. The primary implementation is actually a specialization of the template definition that we forward declared.

C++

template < typename TPop, TMP_ARRAY_31(typename T) >
struct pop_front < TypeList < TPop, TMP_ARRAY_31(T) > >
{
  typedef TypeList < TMP_ARRAY_31(T) > type;
};

template < typename T1, typename T2 >
struct pop_front < TypeNode < T1, T2 > >
{
  typedef T2 type;
};

I realize that there are two template parameters in this implementation, as opposed to the single type parameter in the forward declaration. I believe this is perplexing for two reasons

1. Why is there a second parameter?

Our ultimate goal is to decompose a single node into it's two parts, and give the caller access to the interesting part of the node. In this case, the tail, the remainder of the list.

2. Why create a function if we know both types?

The short answer is: Type Deduction.
We will not call this version of the function directly. In fact, we most likely will not know the parameterized types to use in the declaration. This is an important concept to remember when programming generics. We want to focus on the constraints for the class of types to be used with our construct, rather than implementing our construct around a particular type. To design for a particular type, often leads to assumptions about the data that will be used, which in turn leads to a rigid implementation. I will be sure to revisit the topic of genericityin a future post. For now, suffice it to say that most of generic programming would not be possible if the compiler were not capable of type deduction.

All calls to the pop_front function will use the single parameter template. While the compiler is searching it's set of template instantiations for the best fit, it will deduce the two types from the Typelist that we provide. This function becomes a helper method, and is called indirectly by the compiler to create the final type. A direct instantiation would equate to the verbose syntax of the nested linked-list example from above. We use templates to put the compiler to work and generate all of this tedious code.

A specialization to handle the empty node is all that remains to complete the pop_front method.

C++

template < TMP_ARRAY_31(typename T) >
struct pop_front < TypeList< empty, TMP_ARRAY_31(T) > >
{
  typedef empty type;
};

push_front

One final operation that I would like to demonstrate is how to implement is push_front. This will allow use to programmatically build a Typelist in our code. This operation appends a new type at the front of and existing list. Here is the forward declaration of the meta-function defined with the form the caller will use:

C++

 // forward declaration
template < typename ContainerT,
           typename T>
struct push_front;

The primary implementation of this template also contains one more parameter type than we expect. This gives the compiler mechanism to recursively construct the Typelist from a set of nodes. Eventually the existing Typelist sequence will be constructed, and finally the new type T that we specify will be added to the node at the front of the final list.

C++

template < typename T1, typename T2, typename T >
struct push_front < TypeNode < T1, T2 >, T >
{
  typedef TypeNode < T, TypeNode < T1, T2 > > type;
};

// The syntactic sugar definition of this operation.
template < TMP_ARRAY_32(typename T), typename T >
struct push_front < TypeList < TMP_ARRAY_32(T) >, T >
{
  typedef TypeList < T, TMP_ARRAY_31(T) > type;
};

Finally, we must provide a specialization that contains the empty terminator.

C++

template < typename T >
struct push_front < empty, T >
{
  typedef TypeNode < T, empty > type;
};

Summary

The Typelist will be the basis of my implementation for alchemy. In this entry I demonstrated in more detail how a Typelist is constructed, and design rationale for implementing generic programming constructs. I showed how the Typelist itself is not that much different compared to a traditional link-linked list that you most likely have worked with at some point in your career. In truth, these operations barely scratch the possibilities for what is possible for operating on the Typelist. You will find even more Typelist operations in Andrei Alexandrescu's book, Modern C++ Design. Such as rotating the elements of the list, and pruning the list to remove all of the duplicate types.

We are not done with our study of Typelists. In my next Typelist entry, we will move beyond the abstract and academic into the practical. I will explain new operations that I will need to implement Alchemy, and I will demonstrate how they can be applied to accomplish something useful.

Original post blogged at Code of the Damned.

This article was originally posted at http://codeofthedamned.com/index.php/typelist-operations

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Written By

Paul M Watt

Engineer

United States

I am a software architect and I have been developing software for nearly two decades. Over the years I have learned to value maintainable solutions first. This has allowed me to adapt my projects to meet the challenges that inevitably appear during development. I use the most beneficial short-term achievements to drive the software I develop towards a long-term vision.

C++ is my strongest language. However, I have also used x86 ASM, ARM ASM, C, C#, JAVA, Python, and JavaScript to solve programming problems. I have worked in a variety of industries throughout my career, which include:
• Manufacturing
• Consumer Products
• Virtualization
• Computer Infrastructure Management
• DoD Contracting

My experience spans these hardware types and operating systems:
• Desktop
o Windows (Full-stack: GUI, Application, Service, Kernel Driver)
o Linux (Application, Daemon)
• Mobile Devices
o Windows CE / Windows Phone
o Linux
• Embedded Devices
o VxWorks (RTOS)
o Greenhills Linux
o Embedded Windows XP

I am a Mentor and frequent contributor to CodeProject.com with tutorial articles that teach others about the inner workings of the Windows APIs.

I am the creator of an open source project on GitHub called Alchemy[^], which is an open-source compile-time data serialization library.

I maintain my own repository and blog at CodeOfTheDamned.com/[^], because code maintenance does not have to be a living hell.

Use Ctrl+Left/Right to switch messages, Ctrl+Up/Down to switch threads, Ctrl+Shift+Left/Right to switch pages.