Posted 21 Jun 2015
Licenced BSD

Clean Reflective Enums – C++ Enum to String with Nice Syntax

Anton Bachin, 21 Jun 2015
A concise new technique for getting the compiler to generate reflective enum information

Introduction

Reflection on C++ enums is a long-standing problem. It's hard to convert enums to strings, loop over them, count them, or validate them.

The best attempts so far have involved forbidding initializers, ugly macro syntax, or resorting to external scripts (examples). This article presents a simple new technique that does away with all that. The code lives in a single header and lets you use nice syntax like this:

ENUM(ColorChannel, Red = 1, Green, Blue)

int main()
{
    ColorChannel channel = ColorChannel::Green;

    std::cout << channel._to_string() << std::endl;

    return 0;
}

The program above prints Green.

We have turned the compiler into a reflective enum generator. You can try it online here.

Note the use of an initializer on Red. Any initializer that is possible with built-in enums is allowed, including arbitrary constant expressions. Because the macro reuses the built-in syntax, these enums have very little learning overhead. That is the contribution of this article. They are also easy to adopt, since everything lives in a single header.

Many readers avoid macros, which is understandable. I will explain why this one macro is the minimum and cannot be done away with.

To perform the conversion, we will generate two arrays (of names and values). This is typical in other approaches – but we have to take care of initializers. Once we have those arrays, we can do much more than just convert enums to strings. We can:

  • Convert from strings.
  • Iterate over enums.
  • Check that an integer is a valid enum value.
  • Report the number of constants in an enum.

and so on. I won't show how to do all that here – it's easy once you have the arrays. This article focuses on generating them.
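
As a rough illustration of why the arrays are enough, here is a minimal sketch of two of the operations listed above. The arrays are written by hand rather than generated, and the names channel_values, channel_names, is_valid and from_string are assumptions made for this example only, not part of the code presented later.

```cpp
#include <cstddef>
#include <cstring>

enum ColorChannel { Red = 1, Green, Blue };

// Hand-written stand-ins for the two generated arrays. The only property
// relied on is that both arrays list the constants in the same order.
static const int         channel_values[] = { 1, 2, 3 };
static const char* const channel_names[]  = { "Red", "Green", "Blue" };
static const std::size_t channel_count =
    sizeof(channel_values) / sizeof(channel_values[0]);

// Returns true if the integer equals one of the declared constants.
bool is_valid(int value)
{
    for (std::size_t index = 0; index < channel_count; ++index) {
        if (channel_values[index] == value)
            return true;
    }
    return false;
}

// Looks up a constant by name. Returns false and leaves result untouched
// if the name is not in the names array.
bool from_string(const char* name, ColorChannel& result)
{
    for (std::size_t index = 0; index < channel_count; ++index) {
        if (std::strcmp(channel_names[index], name) == 0) {
            result = static_cast<ColorChannel>(channel_values[index]);
            return true;
        }
    }
    return false;
}
```

Iterating over the enum is the same loop over channel_values, and the number of constants is channel_count.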

The solution presented here can also be taken in several other directions, such as:

  • Underlying types, including non-integral ones.
  • Maintaining type safety, such as case checking in switch. This will be touched on in the article.
  • Compile-time reflection for use in metaprogramming.
  • Stream operators.

Complete Implementation

I have created a full-featured single-header enum library based around the technique (docs). You are welcome to try it online here. It takes care of providing all the features listed above in a portable fashion. It takes advantage of C++11 when it is available. It is free to use and fork under the BSD license.

Towards Reflection

This section explains why the code looks the way it does – what problems have to be solved along the way to generating a reflective enum.

Do we need a macro?

The first question is: is it absolutely necessary to use a macro? I believe the answer is yes. To the best of my knowledge, the only way to convert a token such as Red into a string "Red" in C++ is the preprocessor stringization operator #. So, we have to pass the enum constant list to a macro, so that the macro can then stringize the constants in the right place.

Outline

With that out of the way, we want the macro to expand to something like this:

ENUM(ColorChannel, Red = 1, Green, Blue)

// Becomes:

struct ColorChannel {
    enum _enumerated { Red = 1, Green, Blue };

    _enumerated _value;

    static const int  values[] = { 1, 2, 3 };
    static const char *names[] = { "Red", "Green", "Blue" };

    const char* to_string() const { /* Straightforward implementation. */ }
};

This is done using a variadic macro, so the definition begins with:

#define ENUM(EnumName, ...)                        \
struct EnumName {                                  \
    enum _enumerated { __VA_ARGS__ };              \
                                                   \
    _enumerated _value;                            \
                                                   \
    /* We will refer to __VA_ARGS__ again here. */ \
};

The declaration of _enumerated above already triggers the compiler's value assignment procedure, so we have that taken care of: ColorChannel::Red is now 1, ColorChannel::Green is 2, and ColorChannel::Blue is 3.
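
As a small check of that claim, the sketch below uses a cut-down macro that contains only the enum declaration and the stored value. ENUM_SKETCH and the Flags enum are hypothetical names used for this illustration; the sketch assumes a compiler that supports variadic macros.

```cpp
// Cut-down version of the macro: just the nested enum and the stored value.
#define ENUM_SKETCH(EnumName, ...)        \
struct EnumName {                         \
    enum _enumerated { __VA_ARGS__ };     \
    _enumerated _value;                   \
};

// Initializers are pasted into the enum untouched, so the compiler assigns
// values exactly as it would for a hand-written enum.
ENUM_SKETCH(ColorChannel, Red = 1, Green, Blue)
ENUM_SKETCH(Flags, None, Read = 1 << 0, Write = 1 << 1, Both = Read | Write)
```

With these declarations, ColorChannel::Green is 2 and Flags::Both is 3, with no extra work by the macro.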

Values array

This is the first reflection challenge. To generate the values array {1, 2, 3}, we can try to simply refer to the constants of enum _enumerated, since they are in scope:

static const int  values[] = { __VA_ARGS__ };

But that doesn't work. It expands to:

static const int  values[] = { Red = 1, Green, Blue };

The initializer on Red makes this invalid C++, since Red is not an assignable expression.

We can solve that by casting Red to a dummy type, whose only purpose is to have an overloaded assignment operator. The operator will just ignore the assignment. Here is such a type:

struct ignore_assign {
    ignore_assign(int value) : _value(value) { }
    operator int() const { return _value; }

    const ignore_assign& operator =(int dummy) { return *this; }

    int _value;
};

As you can see, an object of type ignore_assign can be constructed from an integer (such as Red), can then be assigned to (which will do nothing), and then can be converted back to an integer. Now, we just need to generate this code for the values array:

static const int  values[] =
    { (ignore_assign)Red = 1,
      (ignore_assign)Green,
      (ignore_assign)Blue };

That is, we have to prefix each of the arguments in __VA_ARGS__ with the cast (ignore_assign). For that, we can use a preprocessor mapping macro, which is a sort of "higher-order" macro. It applies another macro to each of its arguments. I will show it in detail when presenting the final code. For now, just assume that we have a macro MAP(macro, ...) that works like this:

MAP(FOO, a, b, c)

// Expands to:

FOO(a) FOO(b) FOO(c)

Now, we can define:

#define IGNORE_ASSIGN_SINGLE(expression) (ignore_assign)expression,
#define IGNORE_ASSIGN(...) MAP(IGNORE_ASSIGN_SINGLE, __VA_ARGS__)

which finally allows us to declare the contents of the values array:

static const int  values[] = { IGNORE_ASSIGN(__VA_ARGS__) };

This is still not quite right, because you can't define static arrays inside a struct. You can only declare them. I will address that in the section on linkage. For now, at least, we know how to define the contents – the sequence of values – and handle the presence of initializers.
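
To see the cast trick in isolation, here is a sketch in which the casts are written out by hand, the way the macro would produce them. The array lives inside a function (channel_values, an assumed name) only so that the example compiles on its own; the Green initializer is an extra constant expression added for this example.

```cpp
enum ColorChannel { Red = 1, Green = Red + 4, Blue };

struct ignore_assign {
    ignore_assign(int value) : _value(value) { }
    operator int() const { return _value; }

    // Discards the right-hand side and keeps the value given at construction.
    const ignore_assign& operator =(int) { return *this; }

    int _value;
};

const int* channel_values()
{
    // Each element is a cast followed by an optional, ignored assignment.
    // The cast binds tighter than =, so the assignment targets the temporary.
    static const int values[] = {
        (ignore_assign)Red = 1,
        (ignore_assign)Green = Red + 4,
        (ignore_assign)Blue,
    };
    return values;
}
```

The array ends up holding 1, 5 and 6: the values the compiler already assigned to the constants, not the right-hand sides of the ignored assignments.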

Names array

With the values out of the way, we need the sequence of names. They have to be in the same order as the values. This is pretty straightforward. We need to apply the preprocessor operator # to each of the arguments in __VA_ARGS__. We can do that using the same MAP macro mentioned above:

#define STRINGIZE_SINGLE(expression) #expression,
#define STRINGIZE(...) MAP(STRINGIZE_SINGLE, __VA_ARGS__)

and

static const char *names[] = { STRINGIZE(__VA_ARGS__) };

Again, we can't literally just define an array of strings in a struct, but at least we know how to get the constant names. There is one wrinkle, however. What we actually have in the array right now is:

static const char *names[] = { "Red = 1", "Green", "Blue" };

So we will need to trim off the initializers (such as " = 1") before returning strings from the array.

Now that both the values and names are available, it is easy to write to_string. A simple implementation just walks the values array until it finds the enum we are trying to convert, then returns the name string with the same index. A lot of other translations and useful algorithms can be written as well.

Linkage

Of course, we can't define the names and values arrays inside the "body" of the struct. Declaring and defining them like this is also not an option:

struct EnumName {               \
    static const int  values[]; \
};                              \
                                \
const int EnumName::values[] = { /* ... */ };

because then, if we try to use the macro in two translation units, we will get duplicate symbols for values. The same will happen with names. A nice way to solve this is to wrap each array in a static member function defined inside the struct, which makes the function implicitly inline. For example:

static const int* values()
{
    static const int values[] =
        { IGNORE_ASSIGN(__VA_ARGS__) };
    return values;
}

And similarly with names.
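
Here is a hand-written sketch of the resulting shape, without the macro. It relies on two language rules: a function defined inside a class body is implicitly inline, so it may appear in every translation unit that includes the header, and a static local variable of such a function is a single object shared by the whole program.

```cpp
struct ColorChannel {
    enum _enumerated { Red = 1, Green, Blue };

    // Defined in the class body, hence implicitly inline. The array is
    // created once, the first time the function is called.
    static const int* values()
    {
        static const int values[] = { Red, Green, Blue };
        return values;
    }
};
```

Every call returns a pointer to the same array, so ColorChannel::values()[2] is 3 no matter which translation unit asks.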

Name trimming

Since we are wrapping names in a function to solve the linkage problem, it is also a good place to trim the initializers off the stringized names:

static const char* const* names()
{
    static const char* const    raw_names[] =
        {STRINGIZE(__VA_ARGS__) };

    static char*                processed_names[_count];
    static bool                 initialized = false;

    if (!initialized) {
        for (size_t index = 0; index < _count; ++index) {
            size_t length =
                std::strcspn(raw_names[index], " =\t\n\r");

            processed_names[index] = new char[length + 1];

            std::strncpy(
                processed_names[index], raw_names[index], length);
            processed_names[index][length] = '\0';
        }

        initialized = true;
    }

    return processed_names;
}

Now, names()[0] is simply "Red", as opposed to "Red = 1". The constant _count is just the number of constants in the enum. It is easily declared using COUNT, one of the macros used internally by the MAP macro. COUNT is shown in the section that presents the full code.
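
The trimming step can be tried on its own. The sketch below uses std::string instead of manual allocation to keep it short; trim_initializer is an assumed helper name, not part of the final code.

```cpp
#include <cstring>
#include <string>

// Returns the leading identifier of a stringized enumerator by cutting the
// text at the first space, tab, newline, carriage return or '='.
std::string trim_initializer(const char* raw_name)
{
    std::size_t length = std::strcspn(raw_name, " =\t\n\r");
    return std::string(raw_name, length);
}
```

For example, trim_initializer("Red = 1") gives "Red", and a name without an initializer such as "Green" is returned unchanged.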

Conflicts

Declaring member arrays called names and values makes it impossible to have regular enum constants called names and values because they are in the same scope.

// Does not compile:

ENUM(Column, names, phone_numbers)

To avoid this, I recommend prefixing the members with underscores, i.e. _names, _values, and so on.

There is actually an alternative solution, in which the macro doesn't create a struct that wraps an enum, but instead creates a traits type alongside an enum. Then, the names won't clash because they are in different scopes.

The traits approach may be good for certain purposes, but I chose not to use it for reasons described here. The macro presented in this article is so simple that traits and wrapping are equivalent. The traits approach can, however, be more verbose.

Playing nice with switch

Since the struct generated by ENUM wraps an enum value, it would be nice if that struct could be used inside switch statements and still trigger the compiler's case exhaustiveness checking. This is, after all, one of the defining features of enum types. We want this to give us a warning or an error:

ENUM(ColorChannel, Red = 1, Green, Blue)

ColorChannel channel = // ...
switch(channel) {
    case ColorChannel::Red:   // ...
    case ColorChannel::Green: // ...
}

because the Blue case is missing. To get this, we just need to add a conversion operator to the struct:

operator _enumerated() const { return _value; }

While we are at it, we should add a constructor so that we can initialize the struct using enum constants:

EnumName(_enumerated value) : _value(value) { }
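
Put together, the two members give a wrapper that behaves like the enum in a switch. The sketch below writes the struct by hand; channel_index is an illustrative function name.

```cpp
struct ColorChannel {
    enum _enumerated { Red = 1, Green, Blue };

    // Allows ColorChannel channel = ColorChannel::Green;
    ColorChannel(_enumerated value) : _value(value) { }

    // Allows switch (channel): the condition is converted to _enumerated.
    operator _enumerated() const { return _value; }

    _enumerated _value;
};

int channel_index(ColorChannel channel)
{
    switch (channel) {
        case ColorChannel::Red:   return 0;
        case ColorChannel::Green: return 1;
        case ColorChannel::Blue:  return 2;
    }
    return -1;  // Only reached if _value holds a non-constant value.
}
```

Because the switch condition has the enum type after conversion, compilers that warn about unhandled enumerators (for example GCC and Clang with -Wall) should report a case that is removed from this switch.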

MSVC compatibility

Visual C++ doesn't conform to the standard in how it expands __VA_ARGS__. To get around that, we need to wrap every use of a macro where we pass __VA_ARGS__ with an identity macro, i.e.

FOO(__VA_ARGS__)

// Becomes:

#define IDENTITY(x) x
IDENTITY(FOO(__VA_ARGS__))

The Code

All the discussion is out of the way. The code below combines all the previous points and turns your compiler into a working reflective enum type generator. The only difference between this section and the previous is that this one is written in parsing order for the compiler, whereas the previous one is written in development order for the human being.

The only thing not previously explained is the MAP(m, ...) macro. It is based on a well-known technique for counting the number of macro arguments. Once the count N is obtained, MAP expands to MAPN, which expands to N calls of m, as you can see below. The only thing to note is that the largest number N supported is limited. I chose to support 8 in this article to keep the code short. More reasonable values are 64 or 96 – you would just have to write that many copies of MAPN, or write a script to produce them. Also, Boost.Preprocessor provides mapping macros. I didn't use it here only to make the article self-contained (and my library doesn't use it to avoid a dependency).
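
The counting technique is easier to follow at a smaller size. The sketch below supports at most four arguments; PICK_FIFTH and COUNT4 are illustrative names, and the trailing 0 is an addition in this sketch that keeps the variadic part non-empty when four arguments are passed.

```cpp
#define IDENTITY(x) x

// Selects the fifth argument. Each argument passed to COUNT4 pushes the
// descending list 4, 3, 2, 1 one slot to the right, so the number that
// lands in the fifth slot equals the argument count.
#define PICK_FIFTH(_1, _2, _3, _4, count, ...) count

#define COUNT4(...) IDENTITY(PICK_FIFTH(__VA_ARGS__, 4, 3, 2, 1, 0))
```

For instance, COUNT4(x, y, z) becomes PICK_FIFTH(x, y, z, 4, 3, 2, 1, 0), whose fifth argument is 3. The arguments themselves are discarded, so they can be any tokens, including enum constants with initializers.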

The code below should be familiar if you read the section above. Everything up to the end of the ENUM macro can be pasted out into a header file for use in multiple translation units. The function main at the bottom prints the string Green, then exits with status 1.

#include <cstddef>
#include <cstring>


#define MAP(macro, ...) \
    IDENTITY( \
        APPLY(CHOOSE_MAP_START, COUNT(__VA_ARGS__)) \
            (macro, __VA_ARGS__))

#define CHOOSE_MAP_START(count) MAP ## count

#define APPLY(macro, ...) IDENTITY(macro(__VA_ARGS__))

// Needed to expand __VA_ARGS__ "eagerly" on the MSVC preprocessor.
#define IDENTITY(x) x

#define MAP1(m, x)      m(x)
#define MAP2(m, x, ...) m(x) IDENTITY(MAP1(m, __VA_ARGS__))
#define MAP3(m, x, ...) m(x) IDENTITY(MAP2(m, __VA_ARGS__))
#define MAP4(m, x, ...) m(x) IDENTITY(MAP3(m, __VA_ARGS__))
#define MAP5(m, x, ...) m(x) IDENTITY(MAP4(m, __VA_ARGS__))
#define MAP6(m, x, ...) m(x) IDENTITY(MAP5(m, __VA_ARGS__))
#define MAP7(m, x, ...) m(x) IDENTITY(MAP6(m, __VA_ARGS__))
#define MAP8(m, x, ...) m(x) IDENTITY(MAP7(m, __VA_ARGS__))

#define EVALUATE_COUNT(_1, _2, _3, _4, _5, _6, _7, _8, count, ...) count

#define COUNT(...) \
    IDENTITY(EVALUATE_COUNT(__VA_ARGS__, 8, 7, 6, 5, 4, 3, 2, 1))


struct ignore_assign {
    ignore_assign(int value) : _value(value) { }
    operator int() const { return _value; }

    const ignore_assign& operator =(int dummy) { return *this; }

    int _value;
};

#define IGNORE_ASSIGN_SINGLE(expression) (ignore_assign)expression,
#define IGNORE_ASSIGN(...) IDENTITY(MAP(IGNORE_ASSIGN_SINGLE, __VA_ARGS__))

#define STRINGIZE_SINGLE(expression) #expression,
#define STRINGIZE(...) IDENTITY(MAP(STRINGIZE_SINGLE, __VA_ARGS__))


#define ENUM(EnumName, ...)                                            \
struct EnumName {                                                      \
    enum _enumerated { __VA_ARGS__ };                                  \
                                                                       \
    _enumerated     _value;                                            \
                                                                       \
    EnumName(_enumerated value) : _value(value) { }                    \
    operator _enumerated() const { return _value; }                    \
                                                                       \
    const char* _to_string() const                                     \
    {                                                                  \
        for (size_t index = 0; index < _count; ++index) {              \
            if (_values()[index] == _value)                            \
                return _names()[index];                                \
        }                                                              \
                                                                       \
        return NULL;                                                   \
    }                                                                  \
                                                                       \
    static const size_t _count = IDENTITY(COUNT(__VA_ARGS__));         \
                                                                       \
    static const int* _values()                                        \
    {                                                                  \
        static const int values[] =                                    \
            { IDENTITY(IGNORE_ASSIGN(__VA_ARGS__)) };                  \
        return values;                                                 \
    }                                                                  \
                                                                       \
    static const char* const* _names()                                 \
    {                                                                  \
        static const char* const    raw_names[] =                      \
            { IDENTITY(STRINGIZE(__VA_ARGS__)) };                      \
                                                                       \
        static char*                processed_names[_count];           \
        static bool                 initialized = false;               \
                                                                       \
        if (!initialized) {                                            \
            for (size_t index = 0; index < _count; ++index) {          \
                size_t length =                                        \
                    std::strcspn(raw_names[index], " =\t\n\r");        \
                                                                       \
                processed_names[index] = new char[length + 1];         \
                                                                       \
                std::strncpy(                                          \
                    processed_names[index], raw_names[index], length); \
                processed_names[index][length] = '\0';                 \
            }                                                          \
                                                                       \
            initialized = true;                                        \
        }                                                              \
                                                                       \
        return processed_names;                                        \
    }                                                                  \
};

#include <iostream>

ENUM(ColorChannel, Red = 1, Green, Blue)

int main()
{
    ColorChannel    channel = ColorChannel::Green;
    std::cout << channel._to_string() << std::endl;

    switch (channel) {
        case ColorChannel::Red:   return 0;
        case ColorChannel::Green: return 1;
        case ColorChannel::Blue:  return 2;
    }
}

History

This technique, and the associated library, were originally developed in 2012 while I was working at Hudson River Trading, and I have to thank the awesome people at that company for making it publicly available.

License

This article, along with any associated source code and files, is licensed under The BSD License

About the Author

Anton Bachin

Last Updated 22 Jun 2015
Article Copyright 2015 by Anton Bachin