Type Rich Style for C++11

Ilia Glizerin

Feb 10, 2014

CPOL

How to implement the type-rich style recommended by Bjarne Stroustrup

Download source - 691 B

Introduction

The purpose of this article is to share with the community how and why these two lines of code compile and work.

int main(){
	Speed spd = 1m/1.2s;
	Acceleration acc = 8m/3.3s2;
}

Background

After watching Bjarne Stroustrup's lecture on YouTube, I decided to try to decipher the mystery behind the presentation slides starting at minute 19:04. I learned a lot in the process, and I hope you will too.

Prerequisite knowledge required:

C++ Operator Overloading
C++ Templates

Using the Code

I used my MinGW compiler with the -std=c++11 flag to compile all examples.

Article

I will motivate this with a single sentence, as I don't want to repeat what Bjarne said: a strongly typed programming style increases compile-time error checking and code readability. Great, let's get started.

We would like to write code in the style of a standard physics textbook, with meaningful types such as "Speed" and "Acceleration", whose values carry specific units such as "meters per second" or "miles per hour".

Code which would look like this:

Speed speed = 18meters / 9.2seconds;

instead of:

Speed speed(18/9.2);

We will use the features of C++11 combined with operator overloading techniques, topped off with templates (literally topped off).

To start, we need to understand the difference between a value's type and a value's unit. A type is something we all know and love, like an "int" or a "double": simply put, the primitive data construct that bounds the value by its limitations. The unit is the meaning of the value as intended by the programmer. The only difference between 5.6 seconds and 5.6 kilograms/meter² is the unit that is intended. The value is the same, and the underlying primitive type is a double in either case.

We thus must have ourselves a Unit structure like so:

template<int M, int K, int S>
struct Unit{
	static const int m = M;
	static const int k = K;
	static const int s = S;
};

So, for example, if we instantiate a Unit like so:

Unit<1,0,-1> unit;

The applied meaning would be m¹ * k⁰ * s⁻¹, or more simply, m/s.

struct Unit is a template so that we won't have to rewrite it each time we need a new unit. The members are static so that we can access them through the type itself, without creating an object, which will later become useful, and even necessary. Like so:

Unit<1,0,-1>::m
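To make that concrete, here is a minimal sketch of reading the exponents straight from the type. The alias name MetersPerSecond is my own, chosen only for this example; the checks run at compile time, so the snippet does nothing at run time.

```cpp
// Sketch: the exponents of meters, kilograms and seconds live in the type.
template<int M, int K, int S>
struct Unit {
    static const int m = M;
    static const int k = K;
    static const int s = S;
};

// Hypothetical alias, used for illustration only.
using MetersPerSecond = Unit<1, 0, -1>;

// No object is needed: the static members are read through the type.
static_assert(MetersPerSecond::m == 1,  "one power of meters");
static_assert(MetersPerSecond::k == 0,  "no kilograms");
static_assert(MetersPerSecond::s == -1, "seconds in the denominator");
```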

Units are great and all, but they hardly have a reason to exist by themselves, that's why we require the Value structure as well.

template< typename UNIT>
struct Value{
	//the raw naked value
	double val;
	//expose the Unit's exponents so that other Value types can read them at compile time
	static constexpr int getM(){ return UNIT::m; }
	static constexpr int getK(){ return UNIT::k; }
	static constexpr int getS(){ return UNIT::s; }

	//constructors
	explicit Value(double d):val(d){}
	constexpr Value():val(0){}
};

Oooo, lots of fudge words in this one, so let's go lightly over what's going on. Value is also a class template, and its template parameter is where it gets its information about the Unit associated with it. It is important to note that <typename UNIT> has no formal relation to the Unit struct we built earlier (this was very confusing for me in Bjarne's video, as they were both named "Unit"). Because Unit is itself a class template rather than a type, we can't simply name it as the parameter in order to restrict Value to Units.

//won't work
template<Unit unit>
struct Value{};

This doesn't compile with our definition of the Unit template, because Unit needs its template arguments, like so:

//names one specific Unit, so it is far too specific
//(and C++11 rejects it anyway: a non-type template parameter cannot have class type)
template<Unit<1,0,-1> unit>
struct Value{};

Even where a compiler accepts this form (C++20 allows class-type template parameters), such a Value can only receive Units that represent m/s, and that won't do. We need the struct Value to be able to represent any Unit. Thus, we declare the parameter as <typename UNIT>.
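If you do want Value to accept nothing but Units, partial specialization is one way to get there. The following is only a sketch and is not what the code in this article does: the primary template is left undefined, and the constexpr on the constructor is my own addition so that the check can run at compile time.

```cpp
template<int M, int K, int S>
struct Unit {
    static const int m = M;
    static const int k = K;
    static const int s = S;
};

// Primary template: declared but never defined, so Value<int> or
// Value<double> is an incomplete type and cannot be instantiated.
template<typename UNIT>
struct Value;

// Partial specialization: matches only arguments of the form Unit<M, K, S>.
template<int M, int K, int S>
struct Value<Unit<M, K, S>> {
    double val;
    // constexpr is an assumption of this sketch, not part of the article's code.
    explicit constexpr Value(double d) : val(d) {}
};

// Compiles, because Unit<1,0,-1> matches the specialization.
static_assert(Value<Unit<1, 0, -1>>(2.5).val == 2.5, "Value of a Unit works");
```

With this variant, writing Value<int> x(1.0); fails to compile, because only the specialization for Unit has a definition.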

Following on, we see a bunch of static getters. This too wasn't in Bjarne's video, but I find it necessary. Soon we will try to divide a meter Value by a second Value. Those two are different classes because of their template arguments, and, even more problematic, the result will be yet another Value type entirely, one that we can't expect the compiler to figure out by itself. To construct that new type, we need the "m", "k", and "s" of each operand's Unit to be reachable through the Value type. Value never stores a Unit object, and nothing else in Value exposes the UNIT parameter to outside code, so another Value type would have no way to read those numbers. That is what the getters are for.

OK, OK, fine about the getters, but why are they static constexpr? They are static so that we can call them later through the class Value itself, without an object. And they are constexpr so that they can be evaluated at compile time rather than at run time. The return value is always the same, so the compiler can substitute it wherever the call appears. This is not just fancy optimization (trust me, I never optimize anything). It is a necessary feature, because template arguments must be known at compile time; that requirement creates the problem, and "static constexpr" is the solution.
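A small sketch of what "known at compile time" buys us: the getter results can be used directly as template arguments. The alias names Speed and RebuiltUnit are my own and exist only for this example.

```cpp
#include <type_traits>

template<int M, int K, int S>
struct Unit {
    static const int m = M;
    static const int k = K;
    static const int s = S;
};

template<typename UNIT>
struct Value {
    double val;
    static constexpr int getM() { return UNIT::m; }
    static constexpr int getK() { return UNIT::k; }
    static constexpr int getS() { return UNIT::s; }
};

using Speed = Value<Unit<1, 0, -1>>;

// The getters are called through the type (static) and evaluated by the
// compiler (constexpr), so their results are legal template arguments.
using RebuiltUnit = Unit<Speed::getM(), Speed::getK(), Speed::getS()>;

static_assert(std::is_same<RebuiltUnit, Unit<1, 0, -1>>::value,
              "the getters reproduce the original Unit");
```

Dropping constexpr from the getters would turn the RebuiltUnit line into a compile error, since an ordinary function call is not a constant expression.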

As for why the default constructor is declared constexpr: my compiler told me to. And the user-defined constructor is "explicit" because Bjarne told me so. I'll leave it at that.

We have reached the treasure cove. Now we shall proceed to the magic. Let's define a user-defined literal suffix operator, like so.

using second = Unit<0,0,1>;
//suffix for floating-point literals such as 1.2s
Value<second> operator"" s (long double d){
	return Value<second>(d);
}

This of course belongs in the global scope of things, after the definition of Value. Now for our first test:

using Time = Value<second>;
Time time = 18s;  //won't work

This is still not enough. We defined the suffix for a floating-point literal, but 18 is an integer literal. I know compilers are capable of overlooking this difference when we don't want them to, but in this case they play it safe and throw us into oblivion. Not all is lost, however. We can simply overload the suffix for integer literals as well, shown here for the meter suffix.

using meter = Unit<1,0,0>;
//suffix for integer literals such as 18m
Value<meter> operator"" m (unsigned long long d){
	return Value<meter>(d);
}

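So for floating-point literals the parameter is "long double", while for integer literals it's "unsigned long long". This is not a quirk of the demos I learned from: apart from the raw character forms, those are the only two parameter types the standard allows a literal operator to take for numeric literals, so each suffix needs both overloads if it is to accept 18s as well as 1.2s. Also note that the standard reserves suffixes that don't begin with an underscore, so a compiler may warn about s and m; _s and _m are the portable spellings.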
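Here is a sketch of one suffix with both overloads in place. It deliberately differs from the article's code in two ways, both assumptions of mine: the suffix is spelled _m (with the leading underscore the standard asks of user code), and the constructor and operators are constexpr so the results can be checked at compile time.

```cpp
template<int M, int K, int S>
struct Unit {
    static const int m = M;
    static const int k = K;
    static const int s = S;
};

template<typename UNIT>
struct Value {
    double val;
    explicit constexpr Value(double d) : val(d) {}
};

using meter = Unit<1, 0, 0>;

// Chosen for floating-point literals such as 1.5_m.
constexpr Value<meter> operator""_m(long double d) {
    return Value<meter>(static_cast<double>(d));
}

// Chosen for integer literals such as 18_m.
constexpr Value<meter> operator""_m(unsigned long long n) {
    return Value<meter>(static_cast<double>(n));
}

static_assert((18_m).val == 18.0, "integer literal overload");
static_assert((1.5_m).val == 1.5, "floating-point literal overload");
```

The compiler picks the overload from the form of the literal alone, which is why 18_m never falls back to the long double version.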

Now, after all this work, we can instantiate Values with the appropriate Units (assuming both the integer and the floating-point overload exist for each suffix).

using Distance = Value<Unit<1,0,0>>; //global scope
Time time = 1s; 
Distance d = 1.4m;

(Note that the using alias goes into the global scope, while the other lines generally go inside a function body.)

Great, now all we need is to overload all the arithmetic operations we could ever need between Values of different Units. And oh, how grateful you should be that all the fudge has been laid nicely under the red carpet.

This is an example of the + operator overload.

//same Unit in, same Unit out; const because neither operand is modified
Value<UNIT> operator +( Value<UNIT> another ) const {
	return Value<UNIT>(val + another.val);
}

This goes inside the public: section of the Value<> struct. This one is easy, because the parameter Value is of the same type as *this Value. And the result would also be the same type. Meaning that if we are adding Acceleration to Acceleration, we would get Acceleration.

//this works great
using Acceleration = Value<Unit<1,0,-2>>;	//global scope
using second2 = Unit<0,0,2>;	//seconds squared, global scope
Value<second2> operator"" s2 (long double d){	//second-squared suffix operator
	return Value<second2>(d);
}
Acceleration acc1 = 1m/2.4s2;
Acceleration acc2 = 4.8m/2.1s2;
Acceleration acc3 = acc1 + acc2;

As I mentioned somewhere earlier, all the type checking is done at compile time. At run time, all that really happens is the division and addition of doubles. Simply amazing. Well, at least according to what Bjarne says, and that's better than looking at assembly for me :).
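Both claims can be probed without reading assembly. The sketch below is my own: can_add is a hypothetical helper trait (not part of the article's code) that asks the compiler whether A + B is a valid expression, and the sizeof check shows that a Value carries nothing besides its double.

```cpp
#include <type_traits>
#include <utility>

template<int M, int K, int S>
struct Unit {
    static const int m = M;
    static const int k = K;
    static const int s = S;
};

template<typename UNIT>
struct Value {
    double val;
    explicit Value(double d) : val(d) {}
    Value operator+(Value another) const { return Value(val + another.val); }
};

using Speed = Value<Unit<1, 0, -1>>;
using Time  = Value<Unit<0, 0, 1>>;

// Hypothetical detection trait: true only when A + B compiles.
template<typename A, typename B, typename = void>
struct can_add : std::false_type {};

template<typename A, typename B>
struct can_add<A, B, decltype(void(std::declval<A>() + std::declval<B>()))>
    : std::true_type {};

static_assert(can_add<Speed, Speed>::value, "same units can be added");
static_assert(!can_add<Speed, Time>::value, "mixed units are rejected");
static_assert(sizeof(Speed) == sizeof(double), "a Value is just a double");
```

This does not prove what any particular optimizer emits, but it does show that the unit information exists only in the type and takes up no storage.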

Now for the final Fatality. Mortal Kombatttt! Ahem, I meant division overloading.

template< typename OTHER >
Value<Unit<getM()-OTHER::getM(),getK()-OTHER::getK(),getS()-OTHER::getS()>>
operator /
(OTHER other) const {
	//subtract the divisor's exponents from ours to get the result's Unit
	Value<Unit<getM()-OTHER::getM(),getK()-OTHER::getK(),
	getS()-OTHER::getS()>> result(val/other.val);
	return result;
}

So this is the hard part, at least for me. The problem is that a division operator takes two operands, both of them instantiated from class templates we know nothing about in advance, and the result may be a completely different instantiation, while all of this has to be resolved during compilation. That is why it's a good thing that all those constexpr and static defenses are in place. On top of that, this is a member template (the overload) inside a class template (Value) whose parameter isn't formally tied to the Unit template. (Feel free to jump out of your window.)

So how do we go about this? First, let's think about what it means to compute meters/seconds. It means that Unit<1,0,0> is being divided by Unit<0,0,1>, and the result is Speed, which is Unit<1,0,-1>. We get those numbers if, for each exponent, we subtract the denominator's from the numerator's, or simply put, meters<1,0,0> - seconds<0,0,1>. Just like regular vector subtraction, we only need to subtract corresponding components, i.e. "m", "k", and "s". That is why the return type of this wild overload is:

 Value<Unit<getM()-OTHER::getM(),getK()-OTHER::getK(),getS()-OTHER::getS()>>

Remember those getters? They are how we reach the exponents of the Unit associated with each Value. This is where being static and constexpr comes in really handy dandy: the compiler can replace getX() and OTHER::getX() with their values at compile time, which is how it figures out the return type for any two Values being divided.
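To see the exponent subtraction at work, here is a self-contained sketch of the division with a compile-time check of the resulting type. The aliases Meters, Seconds, Speed and Acceleration are names I picked for the example, and the constexpr on the constructor and operator is my own addition so that the value can be checked at compile time as well.

```cpp
#include <type_traits>

template<int M, int K, int S>
struct Unit {
    static const int m = M;
    static const int k = K;
    static const int s = S;
};

template<typename UNIT>
struct Value {
    double val;
    static constexpr int getM() { return UNIT::m; }
    static constexpr int getK() { return UNIT::k; }
    static constexpr int getS() { return UNIT::s; }

    explicit constexpr Value(double d) : val(d) {}

    // The result's Unit is our exponents minus the divisor's exponents.
    template<typename OTHER>
    constexpr Value<Unit<getM() - OTHER::getM(),
                         getK() - OTHER::getK(),
                         getS() - OTHER::getS()>>
    operator/(OTHER other) const {
        return Value<Unit<getM() - OTHER::getM(),
                          getK() - OTHER::getK(),
                          getS() - OTHER::getS()>>(val / other.val);
    }
};

using Meters       = Value<Unit<1, 0, 0>>;
using Seconds      = Value<Unit<0, 0, 1>>;
using Speed        = Value<Unit<1, 0, -1>>;
using Acceleration = Value<Unit<1, 0, -2>>;

// <1,0,0> - <0,0,1> = <1,0,-1>: meters divided by seconds is a Speed.
static_assert(std::is_same<decltype(Meters(6.0) / Seconds(2.0)), Speed>::value,
              "m / s is a speed");
// Dividing once more by seconds gives <1,0,-2>, an Acceleration.
static_assert(std::is_same<decltype(Speed(6.0) / Seconds(2.0)), Acceleration>::value,
              "(m/s) / s is an acceleration");
static_assert((Meters(6.0) / Seconds(2.0)).val == 3.0, "the doubles are divided");
```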

What was my objective again?? Oh that's right, these two innocent lines of code:

int main(){
	Speed spd = 1m/1.2s;
	Acceleration acc = 8m/3.3s2;
}

I hope that you had fun reading, and that you understood at least the theoretical part of this codapalooza. I attached my final source code to the article. It's not very well documented, and is only meant for reviewing finer points that I might have missed. It compiles on my MinGW with this line: "g++ template.cpp -o test_template -std=c++11".

Conclusion

This particular implementation of the style requires C++11 (or some specialized languages). Even though Bjarne claims that "we can write code like this for about 10 years now", I could sense his disappointment in humanity. I can certainly understand why people wouldn't jump at the opportunity to use this style. However, one must consider the benefits before succumbing to a lazy "#define everybody double!" nature. This style provides compile-time type checking that prevents logical errors which might not be visible to the naked eye, and it produces (in the end) more readable code. Most of the fudge can be neatly tucked away into a remote library, so that the casual C++ goer won't have to look at these obscure templates. It will still be necessary to understand how to manually build new Value<Unit<>> types, with all the convenient suffix overloading and stuff. I am still looking for a way to ease the usability of this style, so that instead of having to write this:

using Momentum = Value< Unit<1,1,-1> >; //momentum is kg*meters/second or kg^1 * m^1 * s^-1

One could achieve the same result by writing something like this:

using Momentum = Kilogram * Meter / Second;

If I find the dark magic that can achieve this, I'll add a sequel to this article.
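One candidate for that magic, offered only as a sketch: if Value also gets a multiplication operator that adds exponents, decltype can compute the type of an expression without running it. The operator* below is hypothetical (the article only defines + and /), the constexpr markers are my own addition, and the result is slightly wordier than the wished-for syntax.

```cpp
#include <type_traits>

template<int M, int K, int S>
struct Unit {
    static const int m = M;
    static const int k = K;
    static const int s = S;
};

template<typename UNIT>
struct Value {
    double val;
    static constexpr int getM() { return UNIT::m; }
    static constexpr int getK() { return UNIT::k; }
    static constexpr int getS() { return UNIT::s; }

    constexpr Value() : val(0) {}
    explicit constexpr Value(double d) : val(d) {}

    // Hypothetical: multiplication adds exponents.
    template<typename OTHER>
    constexpr Value<Unit<getM() + OTHER::getM(), getK() + OTHER::getK(),
                         getS() + OTHER::getS()>>
    operator*(OTHER other) const {
        return Value<Unit<getM() + OTHER::getM(), getK() + OTHER::getK(),
                          getS() + OTHER::getS()>>(val * other.val);
    }

    // Division subtracts exponents, as in the article.
    template<typename OTHER>
    constexpr Value<Unit<getM() - OTHER::getM(), getK() - OTHER::getK(),
                         getS() - OTHER::getS()>>
    operator/(OTHER other) const {
        return Value<Unit<getM() - OTHER::getM(), getK() - OTHER::getK(),
                          getS() - OTHER::getS()>>(val / other.val);
    }
};

using Meter    = Value<Unit<1, 0, 0>>;
using Kilogram = Value<Unit<0, 1, 0>>;
using Second   = Value<Unit<0, 0, 1>>;

// decltype only inspects the expression's type; nothing is computed.
using Momentum = decltype(Kilogram() * Meter() / Second());

static_assert(std::is_same<Momentum, Value<Unit<1, 1, -1>>>::value,
              "kg * m / s has exponents <1,1,-1>");
```

The extra parentheses are there because Kilogram names a type and operators only work on values, so each one is default-constructed inside the unevaluated decltype.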

Cheers! :)