Application of C++11 User-Defined Literals to Handling Scientific Quantities, Number Representation and String Manipulation

Mikhail Semenov

Aug 27, 2012

CPOL

Keywords: user-defined literals, templates, constant expressions, recursive functions

Introduction

In this article, user-defined literals are explained and examples of their application are given. At the time of writing, the examples with user-defined literals compile only in GCC, version 4.7.0 and above.

Templates To Handle Scientific Quantities

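For demonstration, we will consider only the quantities that measure Mass, Length and Time and those derived from them: Area, Speed, Acceleration, Frequency, Volume, Force, Pressure, etc. Some of the solutions expressed here are similar to [1,2,3]. Each quantity is represented by an instance of a class template whose integer parameters M, L and T are the exponents of mass, length and time in the dimension of the quantity (for example, speed is length/time, so M=0, L=1, T=-1):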

template<int M, int L, int T>
class Quantity
{
. . .
};

If we try to add (or subtract) two objects whose classes are based on the Quantity template, but have different values M, L or T, the compiler will signal an error, because their classes will be incompatible. Here is the full definition of the Quantity template:

template<int M, int L, int T>
class Quantity
{
public:
    explicit Quantity(double val=0.0) : value(val){}
 
    Quantity(const Quantity& x) : value(x.value)
    {
        
    }
    Quantity& operator+=(const Quantity& rhs)
    {
        value+=rhs.value;
        return *this;
    }
 
    Quantity& operator-=(const Quantity& rhs)
    {
        value-=rhs.value;
        return *this;
    }
 
    double Convert(const Quantity& rhs) const
    {
        return value/rhs.value;                    
    }
 
    double getValue() const
    {
        return value;
    }        
 
private:    
    double value;    
};

We have to define extra functions to manipulate Quantity objects:

template <int M, int L, int T>
Quantity<M,L,T> operator+(const Quantity<M,L,T>& lhs, const Quantity<M,L,T>& rhs)
{
    return Quantity<M,L,T>(lhs)+=rhs;
}
template <int M, int L, int T>
Quantity<M,L,T> operator-(const Quantity<M,L,T>& lhs, const Quantity<M,L,T>& rhs)
{
    return Quantity<M,L,T>(lhs)-=rhs;
}
template <int M1, int L1, int T1, int M2, int L2, int T2>
Quantity<M1+M2,L1+L2,T1+T2> operator*(const Quantity<M1,L1,T1>& lhs, const Quantity<M2,L2,T2>& rhs)
{
    return Quantity<M1+M2,L1+L2,T1+T2>(lhs.getValue()*rhs.getValue());
}
template <int M, int L, int T>
Quantity<M,L,T> operator*(const double& lhs, const Quantity<M,L,T>& rhs)
{
    return Quantity<M,L,T>(lhs*rhs.getValue());
}
 
template <int M1, int L1, int T1, int M2, int L2, int T2>
Quantity<M1-M2,L1-L2,T1-T2> operator/(const Quantity<M1,L1,T1>& lhs, 
            const Quantity<M2,L2,T2>& rhs)
{
    return Quantity<M1-M2,L1-L2,T1-T2>(lhs.getValue()/rhs.getValue());
}
 
template <int M, int L, int T>
Quantity<-M, -L, -T> operator/(double x, const Quantity<M,L,T>& rhs)
{
    return Quantity<-M,-L,-T>(x/rhs.getValue());
}
 
template <int M, int L, int T>
Quantity<M, L, T> operator/(const Quantity<M,L,T>& rhs, double x)
{
    return Quantity<M,L,T>(rhs.getValue()/x);
}
 
template <int M, int L, int T>
bool operator==(const Quantity<M,L,T>& lhs, const Quantity<M,L,T>& rhs)
{
    return (lhs.getValue()==rhs.getValue());
}
 
template <int M, int L, int T>
bool operator!=(const Quantity<M,L,T>& lhs, const Quantity<M,L,T>& rhs)
{
    return (lhs.getValue()!=rhs.getValue());
}
 
template <int M, int L, int T>
bool operator<=(const Quantity<M,L,T>& lhs, const Quantity<M,L,T>& rhs)
{
    return lhs.getValue()<=rhs.getValue();
}
template <int M, int L, int T>
bool operator>=(const Quantity<M,L,T>& lhs, const Quantity<M,L,T>& rhs)
{
    return lhs.getValue()>=rhs.getValue();
}
template <int M, int L, int T>
bool operator< (const Quantity<M,L,T>& lhs, const Quantity<M,L,T>& rhs)
{
    return lhs.getValue()<rhs.getValue();
}
template <int M, int L, int T>
bool operator> (const Quantity<M,L,T>& lhs, const Quantity<M,L,T>& rhs)
{
    return lhs.getValue()>rhs.getValue();
}

Using the Quantity template we can now define the physical quantity classes:

typedef Quantity<1,0,0> Mass;
typedef Quantity<0,1,0> Length;
typedef Quantity<0,2,0> Area;
typedef Quantity<0,3,0> Volume;
typedef Quantity<0,0,1> Time;
typedef Quantity<0,1,-1> Speed;
typedef Quantity<0,1,-2> Acceleration;
typedef Quantity<0,0,-1> Frequency;
typedef Quantity<1,1,-2> Force;
typedef Quantity<1,-1,-2> Pressure;
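
The way the exponents combine can be checked by the compiler itself. The following is a minimal sketch, assuming a cut-down copy of the Quantity template (value access, multiplication and division only); the function checkProduct is a hypothetical helper introduced only for this illustration.

```cpp
#include <cassert>
#include <type_traits>

// Cut-down stand-in for the Quantity template above.
template<int M, int L, int T>
class Quantity
{
public:
    explicit Quantity(double val = 0.0) : value(val) {}
    double getValue() const { return value; }
private:
    double value;
};

// Multiplication adds the exponents; division subtracts them.
template<int M1, int L1, int T1, int M2, int L2, int T2>
Quantity<M1+M2, L1+L2, T1+T2> operator*(const Quantity<M1,L1,T1>& lhs,
                                        const Quantity<M2,L2,T2>& rhs)
{
    return Quantity<M1+M2, L1+L2, T1+T2>(lhs.getValue() * rhs.getValue());
}

template<int M1, int L1, int T1, int M2, int L2, int T2>
Quantity<M1-M2, L1-L2, T1-T2> operator/(const Quantity<M1,L1,T1>& lhs,
                                        const Quantity<M2,L2,T2>& rhs)
{
    return Quantity<M1-M2, L1-L2, T1-T2>(lhs.getValue() / rhs.getValue());
}

typedef Quantity<1,0,0>  Mass;
typedef Quantity<0,1,0>  Length;
typedef Quantity<0,2,0>  Area;
typedef Quantity<0,0,1>  Time;
typedef Quantity<0,1,-1> Speed;
typedef Quantity<0,1,-2> Acceleration;
typedef Quantity<1,1,-2> Force;

// The result types are verified at compile time, not at run time.
static_assert(std::is_same<decltype(Length() * Length()), Area>::value,
              "Length * Length is an Area");
static_assert(std::is_same<decltype(Length() / Time()), Speed>::value,
              "Length / Time is a Speed");
static_assert(std::is_same<decltype(Mass() * Acceleration()), Force>::value,
              "Mass * Acceleration is a Force");

// Hypothetical helper: m * v * v has the dimension <1,2,-2>; assigning the
// product to any other Quantity type would be rejected by the compiler.
double checkProduct(const Mass& m, const Speed& v)
{
    Quantity<1,2,-2> product = m * v * v;
    assert(product.getValue() == m.getValue() * v.getValue() * v.getValue());
    return product.getValue();
}
```

Replacing Quantity<1,2,-2> in checkProduct with, say, Force makes the function fail to compile, which is exactly the protection the template is meant to give.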

Now we can define the actual physical units:

constexpr Length metre(1.0);
constexpr Length decimetre = metre/10;
constexpr Length centimetre = metre/100;
constexpr Length millimetre = metre/1000;
constexpr Length kilometre = 1000 * metre;
constexpr Length inch = 2.54 * centimetre;
constexpr Length foot = 12 * inch;
constexpr Length yard = 3 * foot;
constexpr Length mile = 5280 * foot;
constexpr Frequency Hz(1.0);
 
constexpr Area kilometre2 = kilometre*kilometre;
constexpr Area metre2 = metre*metre;
constexpr Area decimetre2 = decimetre*decimetre;
constexpr Area centimetre2 = centimetre*centimetre;
constexpr Area millimetre2 = millimetre * millimetre;
constexpr Area inch2 =inch*inch;
constexpr Area foot2 = foot*foot;
constexpr Area mile2 = mile*mile;
 
constexpr Volume kilometre3 = kilometre2*kilometre;
constexpr Volume metre3 = metre2*metre;
constexpr Volume decimetre3 = decimetre2*decimetre;
constexpr Volume litre = decimetre3;
constexpr Volume centimetre3 = centimetre2*centimetre;
constexpr Volume millimetre3 = millimetre2 * millimetre;
constexpr Volume inch3 =inch2*inch;
constexpr Volume foot3 = foot2*foot;
constexpr Volume mile3 = mile2*mile;
 
constexpr Time second(1.0);
constexpr Quantity<0,0,2> second2(1.0);
constexpr Time minute = 60 * second;
constexpr Time hour = 60 * minute;
constexpr Time day = 24 * hour;
 
constexpr Mass kg(1.0);
constexpr Mass gramme = 0.001 * kg;
constexpr Mass tonne = 1000 * kg;
constexpr Mass ounce = 0.028349523125 * kg; 
constexpr Mass pound = 16 * ounce;
constexpr Mass stone = 14 * pound;

You may wish to define extra units, like the gallon, the nautical mile, etc. What if we want to print the value of a quantity? We can either use the member function getValue, which gives the value in the corresponding SI units (since we defined the base units in the SI system), or use the Convert function, in which case we must explicitly specify the unit that we are converting to. Applying the Convert function is more explicit and, therefore, better. Here are some examples of how we can use this approach:

Mass myMass = 80*kg;
cout << "my mass: " << myMass.Convert(kg) << " kg" << endl;
cout << "my mass: " << myMass.Convert(stone) << " stone" << endl;
cout << "my mass: " << myMass.Convert(pound) << " pounds" << endl;

This code will print (I have set the precision to 15 digits):

my mass: 80 kg
my mass: 12.5978435534216 stone
my mass: 176.369809747902 pounds

Here is another example:

Length distance100 = 100*kilometre;
Time time = 2*hour;
Speed sp1 = distance100 / time;               
cout << "100 km in 2 hours: " << sp1.Convert(kilometre/hour) << " km/hour" << endl;            
cout << "100 km in 2 hours: " << sp1.Convert(mile/hour) << " miles/hour" << endl;

This will print:

100 km in 2 hours: 50 km/hour
100 km in 2 hours: 31.0685596118667 miles/hour

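If constexpr is replaced by const in the unit definitions (the Quantity constructor shown so far is not constexpr, and these older compilers do not support the keyword), all the above-mentioned code will compile in Visual C++ 2008/2010/2011/2012 and GCC 4.5 and above.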
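
The extra units mentioned earlier can be added in the same style. The sketch below assumes a cut-down copy of Quantity and uses const instead of constexpr so that it does not depend on C++11. The unit definitions are assumptions of this sketch (international nautical mile of 1852 m, US liquid gallon of 231 cubic inches), and usGallonsToLitres is a hypothetical helper.

```cpp
#include <cassert>

// Cut-down copy of the article's Quantity, enough for unit definitions.
template<int M, int L, int T>
class Quantity
{
public:
    explicit Quantity(double val = 0.0) : value(val) {}
    double Convert(const Quantity& unit) const { return value / unit.value; }
    double getValue() const { return value; }
private:
    double value;
};

template<int M1, int L1, int T1, int M2, int L2, int T2>
Quantity<M1+M2, L1+L2, T1+T2> operator*(const Quantity<M1,L1,T1>& lhs,
                                        const Quantity<M2,L2,T2>& rhs)
{
    return Quantity<M1+M2, L1+L2, T1+T2>(lhs.getValue() * rhs.getValue());
}

template<int M, int L, int T>
Quantity<M,L,T> operator*(double lhs, const Quantity<M,L,T>& rhs)
{
    return Quantity<M,L,T>(lhs * rhs.getValue());
}

typedef Quantity<0,1,0> Length;
typedef Quantity<0,3,0> Volume;

const Length metre(1.0);
const Length decimetre  = 0.1 * metre;
const Length centimetre = 0.01 * metre;
const Length inch       = 2.54 * centimetre;
const Volume litre      = decimetre * decimetre * decimetre;

// Extra units (definitions assumed for this sketch, see the text above).
const Length nauticalMile = 1852 * metre;
const Volume usGallon     = 231 * (inch * inch * inch);

// Hypothetical helper: converts a volume given in US gallons to litres.
double usGallonsToLitres(double gallons)
{
    assert(gallons >= 0.0);   // volumes in this sketch are non-negative
    return (gallons * usGallon).Convert(litre);
}
```

Because usGallon is a Volume, an attempt to write nauticalMile.Convert(usGallon) does not compile: Convert only accepts a unit of the same dimension.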

User-defined Literals in C++11

In C++11 (the new Standard adopted in 2011), it is possible to define new literals using suffix notation. With this approach, we can write literals such as 10.0_kg, 20.0_g, 50.5_s, etc. Let us consider how we can define such literals. If we want to use a floating-point number, the general definition looks as follows (square brackets show that the token is optional):

[constexpr]  return_type  operator""  literal_suffix(long double parameter)
 { statements; return expression; };

It looks similar to a function definition. The constexpr token is not mandatory, but it is advisable to use it if you want your literals to be evaluated at compile time. The return type is the type of the value that the literal produces (in our case, Mass, Time, etc.). The literal_suffix token is an identifier: _kg, _g or _s. The parameter receives the floating-point value written before the suffix: if we write 10.0_kg, the value 10.0 is passed as the parameter.

The type long double (the largest available floating-point type) is the only parameter type allowed for floating-point literals. Here expression is an expression containing the parameter, which is evaluated to produce the result. The definition of the _kg literal can be as follows:

constexpr Mass operator"" _kg(long double x)  { return Mass(x); }

Now we can define other literals in a similar manner:

constexpr Length operator"" _mm(long double x) { return x*millimetre; }
constexpr Length operator"" _cm(long double x)  { return x*centimetre; }
constexpr Length operator"" _m(long double x)  { return x*metre; }
constexpr Speed operator"" _mps(long double x)  { return Speed(x); }
constexpr Speed operator"" _miph(long double x) { return x*mile/hour; }
constexpr Speed operator"" _kmph(long double x) { return x*kilometre/hour; }

You may ask whether the underscore is really necessary. In practice, it is. A compiler may accept a suffix without the underscore, but the standard reserves all such suffixes for future standardization, so a program that defines one is not portable. The standard suffixes also take precedence over user-defined ones: for example, your attempt to redefine the suffix LL will be ignored. So always start your suffixes with an underscore.

Another important point: if you define only the operator with the long double parameter, the literal must be a floating-point number, like 10.0_kg. You are not allowed to write 10_kg. But you may define an overload of the same suffix for integer literals:

constexpr Mass operator"" _kg(unsigned long long x)  { return Mass(static_cast<double>(x)); }

If both definitions are present, we may use 10.0_kg and 10_kg in our program. In order to be able to use constexpr literals we have to redefine our Quantity template, introducing constexpr tokens:

template<int M, int L, int T>
class Quantity
{
public:
    constexpr explicit Quantity(double val=0.0) : value(val){}
 
    // += and -= modify the object, so they cannot be constexpr in C++11
    // (a C++11 constexpr member function is implicitly const).
    Quantity& operator+=(const Quantity& rhs)
    {
        value+=rhs.value;
        return *this;
    }
 
    Quantity& operator-=(const Quantity& rhs)
    {
        value-=rhs.value;
        return *this;
    }
 
    constexpr double Convert(const Quantity& rhs) const
    {
        return value/rhs.value;                    
    }
 
    constexpr double getValue() const
    {
        return value;
    }        
 
private:    
    double value;    
};

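The same applies to the operators and unit definitions: they should all be declared with the constexpr token. Note that in C++11 a constexpr function may contain only a single return statement and a constexpr member function is implicitly const, so operator+ and operator- have to build their result directly, for example Quantity<M,L,T>(lhs.getValue()+rhs.getValue()), rather than through += and -=. Now we can define the following conversion macro, which will enable us to write shorter conversion expressions: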

#define ConvertTo(_x, _y) (_x).Convert(1.0_##_y)

Now, instead of

(20 * mile).Convert(kilometre)

we can write

ConvertTo(20.0_mi, km)

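The second argument of the macro is the suffix written without its leading underscore; the macro pastes it onto 1.0_ to form the literal 1.0_km. This assumes that the suffixes _mi and _km have been defined in the same way as _mm, _cm and _m above.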
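
To see the pieces working together, here is a minimal self-contained sketch of the macro in use. It assumes a cut-down constexpr Quantity, and the suffixes _km and _mi (each with a floating-point and an integer overload) are defined here only for the illustration, as is the helper milesToKilometres.

```cpp
#include <cassert>

// Cut-down constexpr version of Quantity.
template<int M, int L, int T>
class Quantity
{
public:
    constexpr explicit Quantity(double val = 0.0) : value(val) {}
    constexpr double Convert(const Quantity& unit) const { return value / unit.value; }
    constexpr double getValue() const { return value; }
private:
    double value;
};

template<int M, int L, int T>
constexpr Quantity<M,L,T> operator*(double lhs, const Quantity<M,L,T>& rhs)
{
    return Quantity<M,L,T>(lhs * rhs.getValue());
}

typedef Quantity<0,1,0> Length;

constexpr Length metre(1.0);
constexpr Length kilometre = 1000 * metre;
constexpr Length mile      = 1609.344 * metre;   // 5280 feet of 0.3048 m each

// Suffixes assumed for this sketch: one overload for 20.0_mi, one for 20_mi.
constexpr Length operator"" _km(long double x)        { return static_cast<double>(x) * kilometre; }
constexpr Length operator"" _km(unsigned long long x) { return static_cast<double>(x) * kilometre; }
constexpr Length operator"" _mi(long double x)        { return static_cast<double>(x) * mile; }
constexpr Length operator"" _mi(unsigned long long x) { return static_cast<double>(x) * mile; }

// The macro from the article: pastes the unit name onto 1.0_ to form 1.0_km.
#define ConvertTo(_x, _y) (_x).Convert(1.0_##_y)

// Evaluated entirely at compile time.
constexpr double twentyMilesInKm = ConvertTo(20_mi, km);
static_assert(twentyMilesInKm > 32.18 && twentyMilesInKm < 32.19,
              "20 miles are a little over 32 km");

// Hypothetical helper: the same macro with a run-time value.
double milesToKilometres(double miles)
{
    assert(miles >= 0.0);   // distances in this sketch are non-negative
    return ConvertTo(miles * mile, km);
}
```

The static_assert shows that, once everything involved is constexpr, the conversion costs nothing at run time.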

Expressing π multiplied by a factor

You may define a convenient suffix for expressing values of π:

// The L suffix keeps the constant in long double precision.
constexpr long double operator "" _pi(long double x) 
{ return x * 3.1415926535897932384626433832795L;}
constexpr long double operator "" _pi(unsigned long long int x) 
{ return x * 3.1415926535897932384626433832795L;}

In this case you can write literals as follows:

2_pi

-2.5_pi
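
A short sketch of how such a suffix might be used follows. The function circumference is a hypothetical helper; the static_assert lines illustrate that the minus sign in -2.5_pi is not part of the literal but a unary minus applied to its value.

```cpp
#include <cassert>

constexpr long double operator"" _pi(long double x)
{ return x * 3.1415926535897932384626433832795L; }
constexpr long double operator"" _pi(unsigned long long x)
{ return x * 3.1415926535897932384626433832795L; }

// -2.5_pi is parsed as -(2.5_pi): the operator only ever sees 2.5.
static_assert(-2.5_pi < 0, "unary minus is applied to the value of the literal");
static_assert(2_pi > 6.28L && 2_pi < 6.29L, "2*pi is about 6.283");

// Hypothetical helper: circumference of a circle with the given radius.
long double circumference(long double radius)
{
    assert(radius >= 0);
    return 2_pi * radius;
}
```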

Representing Binary Numbers

Basic approach

You may wish to use binary numbers in your program. In this case, the whole sequence of characters that you supply should be treated as a string. The general syntax will be:

[constexpr]  return_type  operator""  literal_suffix (const char* parameter)
{ statements ; return expression; }

In order to convert the string to a number we can write the following recursive function (C++11 constant expressions do not allow loops, so recursion has to be used, see [4]):

constexpr unsigned long long ToBinary(unsigned long long x, const char* s)
{
    return (!*s ? x : ToBinary(x + x + (*s =='1'? 1 : 0), s+1));
}

Now we can define the suffix _b as follows:

constexpr unsigned long long int operator "" _b(const char* s) 
{ return ToBinary(0,s);}

Here are some examples of its use:

std::cout << "1101(2) = " << 1101_b  << std::endl;
std::cout << "1111 1111 1111 1111 1111(2) = " 
    << std::hex << 11111111111111111111_b  << "(16)" << std::endl;
std::cout << "10 1010 1010 1010 1010(2) = " 
    << std::hex << 101010101010101010_b  << "(16)" << std::endl;

This code fragment will print:

1101(2) = 13
1111 1111 1111 1111 1111(2) = fffff(16)
10 1010 1010 1010 1010(2) = 2aaaa(16)
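
ToBinary silently treats every character other than '1' as zero, so 1201_b would be accepted. One possible refinement, sketched below under assumed names (ToBinaryChecked, _bin, parseBinary), is to throw from the constexpr function: in a constant expression an invalid digit then becomes a compile-time error, and at run time it becomes an exception.

```cpp
#include <cassert>
#include <stdexcept>

// Variant of ToBinary that rejects any digit other than 0 or 1.
constexpr unsigned long long ToBinaryChecked(unsigned long long x, const char* s)
{
    return !*s ? x
         : (*s == '0' || *s == '1')
               ? ToBinaryChecked(x + x + (*s == '1' ? 1 : 0), s + 1)
               : throw std::invalid_argument("only 0 and 1 are allowed");
}

constexpr unsigned long long operator"" _bin(const char* s)
{
    return ToBinaryChecked(0, s);
}

// Both literals are evaluated by the compiler.
static_assert(1101_bin == 13, "1101 in binary is 13");
static_assert(11111111_bin == 0xFF, "eight ones are 255");
// static_assert(1201_bin == 0, "");   // would not compile: '2' is rejected

// Hypothetical helper: the same function applied to a run-time string.
unsigned long long parseBinary(const char* text)
{
    assert(text != nullptr);
    return ToBinaryChecked(0, text);
}
```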

Adding a Scale Factor to Binary Numbers

It would be convenient to allow a scale factor in binary numbers, so that we can easily position binary digits in a particular place. The scale factor is written in decimal. We can, for example, write:

1e32_b, which means 2^32;
1011e16_b, which means 11*2^16 = 720896.

We can define the following suffix:

unsigned long long operator"" _b(const char* str) 
{             
    return ToScaledBinary(0,0,str);
}

The task is to define the ToScaledBinary recursive function. First of all, let’s define the Scale function that will convert a sequence of decimal digits to a number:

constexpr unsigned long long Scale(unsigned long long x, const char* s)
{
    return (!*s ? x : Scale(10*x + ((unsigned long long)*s)-((unsigned long long)'0'), s+1));
}

This function will be used to convert the scale factor. Now we can define the ToScaledBinary function, which will be a small enhancement to the ToBinary function that we defined before:

constexpr unsigned long long ToScaledBinary
    (unsigned long long x, unsigned long long scale, const char* s)
{
    return (!*s 
                ? x 
                : ( *s == 'e' || *s == 'E'  
                        ? (x << Scale(0,s+1)) // put the digits in the right position
                        : ToScaledBinary(x + x + (*s =='1'? 1 : 0), scale,  s+1)));
}

Here are some examples of its use:

std::cout << std::hex << "0x" << 1111111111_b << std::endl;    
std::cout << std::hex << "0x" << 1111111111e16_b << std::endl;
std::cout << std::hex << "0x" << 1e32_b << std::endl;    
std::cout << std::dec << 1e32_b << std::endl;    
std::cout << std::dec << 1011e16_b << std::endl;

The output of the program will be:

0x3ff
0x3ff0000
0x100000000
4294967296
720896

An alternative approach would be to define the operator as a variadic template:

template<char...chars>
unsigned long long operator"" _b() 
{         
    constexpr static char str[] = {chars..., '\0'};
    return ToScaledBinary(0,0,str); 
}

With a variadic template, the characters of the literal are passed as template arguments. We define a static array of characters, which is initialized with all those values. I, personally, do not see any advantage in using variadic templates in this case. Defining operators with a string parameter is easier, and it's better to use constant expressions without templates anyway.
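
For comparison, the template form can also be written so that the value is computed purely from the template arguments, without a static array. The sketch below handles plain binary digits only (no scale factor); the names BinaryValue, _bt and crossCheck are assumptions made for this illustration. Whether the extra machinery is worth it is a matter of taste.

```cpp
#include <cassert>

// Primary template: Acc is the value accumulated so far.
template<unsigned long long Acc, char... Digits>
struct BinaryValue;

// Empty pack: the accumulated value is the result.
template<unsigned long long Acc>
struct BinaryValue<Acc>
{
    static constexpr unsigned long long value = Acc;
};

// One step of the recursion: consume the first character D.
template<unsigned long long Acc, char D, char... Rest>
struct BinaryValue<Acc, D, Rest...>
{
    static_assert(D == '0' || D == '1', "binary literal may contain only 0 and 1");
    static constexpr unsigned long long value =
        BinaryValue<2 * Acc + (D == '1' ? 1 : 0), Rest...>::value;
};

template<char... Digits>
constexpr unsigned long long operator"" _bt()
{
    return BinaryValue<0, Digits...>::value;
}

static_assert(1101_bt == 13, "1101 in binary is 13");
static_assert(1111111111_bt == 0x3ff, "ten ones are 0x3ff");

// The string-based ToBinary from the article, used for a cross-check.
constexpr unsigned long long ToBinary(unsigned long long x, const char* s)
{
    return (!*s ? x : ToBinary(x + x + (*s == '1' ? 1 : 0), s + 1));
}

// Both approaches should agree on the same digits.
void crossCheck()
{
    assert(1101_bt == ToBinary(0, "1101"));
    assert(1111111111_bt == ToBinary(0, "1111111111"));
}
```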

String Manipulation

It is possible to apply user-defined suffixes to string literals made of any characters, in which case the characters should be surrounded by double quotes, for example (the suffixes _UP and _S are user-defined):

"apple"_UP
"zxy"_S

The standard also allows UTF-8, UTF-16, UTF-32 and wide-character strings (here _w1, _w2, _w3 and _w4 are user-defined suffixes):

u8"one"_w1
u"one"_w2
U"one"_w3
L"one"_w4

The general format for a literal can be as follows:

[constexpr]  return_type  operator""  literal_suffix(const char_type* str, size_t length)
 { statements; return expression; };

The length parameter is compulsory. The char_type can be one of the following: char, char16_t, char32_t, wchar_t. You can use such user-defined literals to convert strings to a different representation or to put string literals into containers.

Let us look at a simple example, where all the characters of a string literal are converted to uppercase. The literals "apple"_UP, "Apple"_UP and "APPLE"_UP will all be converted to an std::string whose contents are "APPLE". An std::string cannot be used in constant expressions, so we do not define this operator with the constexpr token:

std::string operator "" _UP(const char* s, std::size_t n) 
{
    std::string str(s,n);
    for (char &c: str)
    {       
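       // toupper requires a value representable as unsigned char
       c = static_cast<char>(toupper(static_cast<unsigned char>(c)));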
    }
    return str;
}

Here are some examples:

std::cout << "apple"_UP << std::endl;
std::cout << ("APPLE" == "apple"_UP) << std::endl;

This code will print:

APPLE
1
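
Putting string literals into containers can be sketched as follows. The suffix name _str and the helpers fruitNames and checkEmbeddedNull are assumptions made for this illustration. The example also shows why the length parameter matters: it lets the operator keep characters that follow an embedded '\0'.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Narrow-character version: n does not count the terminating '\0'.
std::string operator"" _str(const char* s, std::size_t n)
{
    return std::string(s, n);
}

// Wide-character overload of the same suffix.
std::wstring operator"" _str(const wchar_t* s, std::size_t n)
{
    return std::wstring(s, n);
}

// Hypothetical helper: the literals are already std::string objects,
// so they can be placed directly into a container.
std::vector<std::string> fruitNames()
{
    return std::vector<std::string>{ "apple"_str, "pear"_str, "plum"_str };
}

// Hypothetical helper: the whole literal is kept, including the part after '\0'.
void checkEmbeddedNull()
{
    assert("ab\0cd"_str.size() == 5);
}
```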

Using Characters in User-Defined Literals

You may use characters in user-defined literals, for example:

u'π'_const

u'e'_const

The general syntax for defining a suffix in this case will be:

[constexpr] return_type  operator""  literal_suffix(char_type parameter)
{ statements; return expression; };

The char_type can be one of the following: char, char16_t, char32_t, wchar_t. You can use this syntax in order to incorporate, for example, characters that are not allowed in identifiers, like 'π' or 'µ'.

Here is a sample suffix definition for character literals:

constexpr double operator "" _const(char16_t c) 
{     
    return (c == u'π' ? 3.141592653589793 
                      : (c == u'e' ? 2.718281828459045 : 0));
}

We may use it as follows:

std::cout << std::setprecision(16) << u'π'_const << std::endl;
std::cout << u'e'_const << std::endl;
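
If the encoding of the source file is a concern, the same idea can be written with universal character names, so that the file stays pure ASCII. The sketch below maps a few SI prefix characters to their multipliers; the suffix _si and the helper microfarads are hypothetical names used only here.

```cpp
#include <cassert>

// Maps an SI prefix character to its multiplier; unknown characters give 1.
constexpr double operator"" _si(char16_t c)
{
    return c == u'\u00B5' ? 1e-6     // MICRO SIGN
         : c == u'm'      ? 1e-3
         : c == u'k'      ? 1e3
         : c == u'M'      ? 1e6
         : 1.0;
}

static_assert(u'k'_si == 1e3, "kilo");
static_assert(u'\u00B5'_si < 1e-5, "micro");
static_assert(u'x'_si == 1.0, "unknown prefix leaves the value unchanged");

// Hypothetical helper: a capacitance given in microfarads, returned in farads.
double microfarads(double value)
{
    assert(value >= 0.0);
    return value * u'\u00B5'_si;
}
```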

The Syntax of User-Defined Literals in Detail: What is Allowed and What is Not

Here is the basic syntax and what is allowed in user-defined literals (optional items are enclosed in square brackets):

(1)

[constexpr]  return_type  operator""  literal_suffix(unsigned long long parameter){…};

the literals can be decimal integer numbers, hexadecimal numbers, octal numbers;

(2)

[constexpr]  return_type  operator""  literal_suffix(long double parameter){…};

here floating-point numbers can be used;

(3)

[constexpr]  return_type  operator""  literal_suffix(const char* parameter){…};

the literals can be decimal integer numbers, hexadecimal numbers, octal numbers or floating-point numbers;

(4)

template<char … chrs> [constexpr]  return_type  operator""  literal_suffix() {…};

the literals can be decimal integer numbers, hexadecimal numbers, octal numbers or floating-point numbers;

(5)

[constexpr]  return_type  operator""  literal_suffix(const char_type* str, size_t  length ){…};

the literals can be strings of char_type.

(6)

[constexpr] return_type  operator""  literal_suffix(char_type parameter) {…};

the literals can be character literals of char_type.

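These rules show that a user-defined literal must start with a number, a string literal or a character literal. The following (suffixes: _txt, _32) are therefore ordinary identifiers, not literals, and cannot be used: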

programming_txt

AHZ_32

But the following (with suffixes: _txt, _32 and _7) are allowed:

"programming"_txt

"AHZ"_32

23456_7
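
The six forms can be put side by side in one small file. This is only a sketch: every suffix name in it (_dozen, _percent, _digits, _count, _len, _code) and the helper checkRawSpelling are invented for the illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// (1) integer literal passed as a value
constexpr unsigned long long operator"" _dozen(unsigned long long n) { return 12 * n; }

// (2) floating-point literal passed as a value
constexpr long double operator"" _percent(long double x) { return x / 100; }

// (3) raw form: receives the spelling of the number as a C string
std::size_t operator"" _digits(const char* s) { return std::strlen(s); }

// (4) template form: receives the spelling as template arguments
template<char... Chars>
constexpr std::size_t operator"" _count() { return sizeof...(Chars); }

// (5) string literal together with its length
constexpr std::size_t operator"" _len(const char*, std::size_t n) { return n; }

// (6) character literal
constexpr int operator"" _code(char c) { return static_cast<int>(c); }

static_assert(3_dozen == 36, "decimal integer");
static_assert(0x10_dozen == 192, "hexadecimal integer");
static_assert(50.0_percent == 0.5L, "floating-point number");
static_assert(0x1F_count == 4, "the spelling includes the 0x prefix");
static_assert("programming"_len == 11, "string literal");
static_assert('A'_code == 'A', "character literal");

// Hypothetical helper: the raw form sees the characters exactly as written.
void checkRawSpelling()
{
    assert(1.5e3_digits == 5);   // "1.5e3"
    assert(23456_digits == 5);   // "23456"
}
```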

References

[1] http://learningcppisfun.blogspot.co.uk/2007/01/units-and-dimensions-with-c-templates.html

[2] David Abrahams and Aleksey Gurtovoy. C++ Template Metaprogramming: Concepts, Tools, and Techniques from Boost and Beyond (C++ in Depth).

[3] http://www.boost.org/doc/libs/1_51_0/doc/html/boost_units.html

[4] http://www.codeproject.com/Articles/417719/Constants-and-Constant-Expressions-in-Cplusplus11