C++11 Move Semantics, rvalue Reference

BrainlessLabs.com

Sep 10, 2014

MPL

In this article, we will discuss the move semantics for C++. We will try to figure out what exactly move is.

Problem Statement

Copying is not always the best choice. When an object owns expensive resources, copying it duplicates those resources, which can be a costly operation. In such situations, moving the resources is preferable.

Temporaries cause a related problem: they can hog memory and slow down C++ execution.

Solution

The solution comes in the form of move semantics. We will gradually discover what this is.

rvalue Reference

Rvalue references are a new addition in C++11. We will see what their purpose is and why they were introduced.

The original definition of lvalues and rvalues is as follows:

As per the C-style definition, an lvalue is an expression that may appear on the left or on the right hand side of an assignment, whereas an rvalue is an expression that can only appear on the right hand side of an assignment.

  int a = 42;
  int b = 43;

  // a and b are both l-values:
  a = b; // ok
  b = a; // ok
  a = a * b; // ok

  // a * b is an rvalue:
  int c = a * b; // ok, rvalue on right hand side of assignment
  a * b = 42; // error, rvalue on left hand side of assignment

C++, with its user-defined types, has introduced some subtleties regarding modifiability and assignability that make this definition incorrect. So now we say that an lvalue is an expression that refers to a memory location, and whose address we can therefore take. What is an rvalue? Simply, whatever is not an lvalue.

Now let us formally define these terms and properties a little better.

First, what is an expression?

An expression is a sequence of operators and operands that specifies a computation. It tells C++ what to compute.
An expression can produce a value; for example, the value of 1 + 2 is 3.
An expression can have side effects too, as a function call may.
An expression can be simple or complex.
Each expression has a type and a value category, i.e., if the expression is lvalue, rvalue, etc.
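
The value category of an expression can be inspected at compile time. The following is a minimal sketch, not part of the original article: Category, category_of and VALUE_CATEGORY are hypothetical names introduced here, and the sketch relies on the rule that decltype applied to a parenthesized expression yields T& for an lvalue, T&& for an xvalue and a non-reference T for a prvalue.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical helper names, for illustration only.
enum class Category { lvalue, xvalue, prvalue };

// decltype((expr)) yields T& for an lvalue, T&& for an xvalue,
// and a non-reference T for a prvalue.
template <typename T>
constexpr Category category_of() {
    return std::is_lvalue_reference<T>::value ? Category::lvalue
         : std::is_rvalue_reference<T>::value ? Category::xvalue
         : Category::prvalue;
}

// The double parentheses matter: decltype(a) would give the declared type of a.
#define VALUE_CATEGORY(expr) category_of<decltype((expr))>()

int a = 1;

static_assert(VALUE_CATEGORY(a) == Category::lvalue, "a names an object");
static_assert(VALUE_CATEGORY(a + 1) == Category::prvalue, "a + 1 is a computed value");
static_assert(VALUE_CATEGORY(std::move(a)) == Category::xvalue, "std::move(a) is expiring");
```

Because every check is a static_assert, the sketch is verified simply by compiling it; nothing is evaluated at run time.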

lvalue

An lvalue is an expression that identifies a non-temporary object or a non-member function.

The address of an lvalue can be taken.
A modifiable (i.e., non-const) lvalue can be used on the left side of the "=".
An lvalue can be used to initialize an lvalue reference.
When permitted, an lvalue can have an incomplete type.
An expression that designates a bit field (e.g. s.x where s is an object of type struct S { int x:3; };) is an lvalue expression (or xvalue if s is one): it may be used on the left hand side of the assignment operator, but its address cannot be taken and a non-const lvalue reference cannot be bound to it. A const lvalue reference can be initialized from a bit-field lvalue, but a temporary copy of the bit-field will be made: it won't bind to the bit field directly.

Examples

The name of a variable or function in scope, regardless of type, such as std::cin or std::endl. Even if the variable's type is rvalue reference, the expression consisting of its name is an lvalue expression.
Function call or overloaded operator expression if the function's or overloaded operator's return type is an lvalue reference, such as std::getline(std::cin, str) or std::cout << 1 or str1 = str2 or ++iter
Built-in pre-increment and pre-decrement, dereference, assignment and compound assignment, subscript (except on an array xvalue), member access (except for non-static non-reference members of xvalues, member enumerators, and non-static member functions), member access through pointer to data member if the left-hand operand is lvalue, comma operator if the right-hand operand is lvalue, ternary conditional if the second and third operands are lvalues.
Cast expression to lvalue reference type.
String literal
Function call expression if the function's return type is rvalue reference to function type
Cast expression to rvalue reference to function.
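
Several of the lvalue examples above can be checked at compile time. This is a small sketch under the assumption that decltype((e)) is an lvalue reference type exactly when e is an lvalue; is_lvalue_expr and the variable names are made up for illustration.

```cpp
#include <string>
#include <type_traits>

// For an lvalue expression e, decltype((e)) is an lvalue reference type.
template <typename T>
constexpr bool is_lvalue_expr() { return std::is_lvalue_reference<T>::value; }

int counter = 0;
int values[3] = {1, 2, 3};
int* ptr = values;
std::string s1, s2;

void take(int&& r) {
    // r is declared as an rvalue reference, but the name r is an lvalue.
    static_assert(is_lvalue_expr<decltype((r))>(), "named rvalue reference");
    (void)r;
}

static_assert(is_lvalue_expr<decltype((counter))>(), "variable name");
static_assert(is_lvalue_expr<decltype((++counter))>(), "built-in pre-increment");
static_assert(is_lvalue_expr<decltype((*ptr))>(), "dereference");
static_assert(is_lvalue_expr<decltype((values[1]))>(), "subscript");
static_assert(is_lvalue_expr<decltype((s1 = s2))>(), "operator= returns std::string&");
static_assert(is_lvalue_expr<decltype(("text"))>(), "string literal");
static_assert(!is_lvalue_expr<decltype((counter++))>(), "post-increment is not an lvalue");
```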

prvalue

A pure rvalue (prvalue) is an expression that identifies a temporary object (or a subobject thereof) or is a value not associated with any object.

A prvalue is an rvalue.
A prvalue cannot be polymorphic: the dynamic type of the object it identifies is always the type of the expression.
A non-class non-array prvalue cannot be const-qualified.
A prvalue cannot have incomplete type (except for type void, see below).
The expressions obj.func and ptr->func, where func is a non-static member function, and the expressions obj.*mfp and ptr->*mfp where mfp is a pointer to member function, are classified as prvalue expressions, but they cannot be used to initialize references, as function arguments, or for any purpose at all, except as the left-hand argument of a function call expression, e.g. (pobj->*ptr)(args).
Function call expressions returning void, cast expressions to void, and throw-expressions are classified as prvalue expressions, but they cannot be used to initialize references or as function arguments. They can be used in some contexts (e.g. on a line of their own, as the left argument of the comma operator, etc.) and in the return statement of a function returning void.

Examples

Literal (except string literal), such as 42 or true or nullptr.
Function call or overloaded operator expression if the function's or the overloaded operator's return type is not a reference, such as str.substr(1, 2) or str1 + str2
Built-in post-increment and post-decrement, arithmetic and logical operators, comparison operators, address-of operator, member access for a member enumerator, a non-static member function, or a non-static non-reference data member of an rvalue, member access through pointer to a data member of rvalue or to a non-static member function, comma operator where the right-hand operand is rvalue, ternary conditional where either second or third operands aren't lvalues.
Cast expression to any type other than reference type.
Lambda expressions, such as [](int x){return x*x;}
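
The prvalue examples can be checked in the same way. This sketch assumes that decltype((e)) is a non-reference type exactly when e is a prvalue; is_prvalue_expr is a hypothetical helper. The lambda example is left out because a lambda expression may not appear inside decltype before C++20.

```cpp
#include <string>
#include <type_traits>

// For a prvalue expression e, decltype((e)) is not a reference type.
template <typename T>
constexpr bool is_prvalue_expr() { return !std::is_reference<T>::value; }

int n = 0;
std::string str1 = "move", str2 = "semantics";

static_assert(is_prvalue_expr<decltype((42))>(), "literal");
static_assert(is_prvalue_expr<decltype((nullptr))>(), "nullptr");
static_assert(is_prvalue_expr<decltype((n++))>(), "built-in post-increment");
static_assert(is_prvalue_expr<decltype((n < 10))>(), "comparison");
static_assert(is_prvalue_expr<decltype((&n))>(), "address-of");
static_assert(is_prvalue_expr<decltype((str1 + str2))>(), "operator+ returns by value");
static_assert(is_prvalue_expr<decltype((str1.substr(1, 2)))>(), "substr returns by value");
static_assert(is_prvalue_expr<decltype((static_cast<double>(n)))>(), "cast to non-reference type");
```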

xvalue

An xvalue is an expression that identifies an "eXpiring" object, that is, the object that may be moved from. The object identified by an xvalue expression may be a nameless temporary, it may be a named object in scope, or any other kind of object, but if used as a function argument, xvalue will always bind to the rvalue reference overload if available.

An xvalue is an rvalue.
An xvalue is also a glvalue.
Like prvalues, xvalues bind to rvalue references
Unlike prvalues, an xvalue may be polymorphic, and a non-class xvalue may be cv-qualified.

Examples

A function call or overloaded operator expression if the function's or the overloaded operator's return type is an rvalue reference to object type, such as std::move(val)
A cast expression to an rvalue reference to object type, such as static_cast<T&&>(val) or (T&&)val
A non-static class member access expression, in which the object expression is an xvalue
A pointer-to-member expression in which the first operand is an xvalue and the second operand is a pointer to data member.
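
A few of these xvalue examples, sketched as compile-time checks. Mesh and is_xvalue_expr are hypothetical names, and the sketch assumes that decltype((e)) is an rvalue reference type exactly when e is an xvalue.

```cpp
#include <string>
#include <type_traits>
#include <utility>

// For an xvalue expression e, decltype((e)) is an rvalue reference type.
template <typename T>
constexpr bool is_xvalue_expr() { return std::is_rvalue_reference<T>::value; }

struct Mesh { std::string name; };  // hypothetical type, for illustration
Mesh mesh;

static_assert(is_xvalue_expr<decltype((std::move(mesh)))>(), "std::move returns Mesh&&");
static_assert(is_xvalue_expr<decltype((static_cast<Mesh&&>(mesh)))>(), "cast to rvalue reference");
static_assert(is_xvalue_expr<decltype((std::move(mesh).name))>(), "member of an xvalue");
static_assert(!is_xvalue_expr<decltype((mesh))>(), "the plain name is an lvalue");
```

Note that std::move does not move anything by itself: it only changes the value category of the expression, so that a move overload can be selected.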

glvalue

A glvalue ("generalized" lvalue) is an expression that is either an lvalue or an xvalue.

Its properties are mostly those that applied to pre-C++11 lvalues.
A glvalue may be implicitly converted to prvalue with lvalue-to-rvalue, array-to-pointer, or function-to-pointer implicit conversion.
A glvalue may be polymorphic: the dynamic type of the object it identifies is not necessarily the static type of the expression.
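
The difference in polymorphic behavior between a glvalue and a prvalue can be illustrated with a short sketch. Shape, Circle and the two helper functions are hypothetical names used only for this example.

```cpp
#include <cassert>
#include <string>

struct Shape {
    virtual ~Shape() {}
    virtual std::string name() const { return "Shape"; }
};
struct Circle : Shape {
    std::string name() const override { return "Circle"; }
};

// shape is a glvalue (here an lvalue): its dynamic type may differ from Shape.
std::string through_glvalue(const Shape& shape) { return shape.name(); }

// Shape(shape) is a prvalue of type Shape: the copy is sliced,
// so its dynamic type is always exactly Shape.
std::string through_prvalue(const Shape& shape) { return Shape(shape).name(); }

void run_glvalue_demo() {
    Circle c;
    assert(through_glvalue(c) == "Circle");
    assert(through_prvalue(c) == "Shape");
}
```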

rvalue

An rvalue is an expression that is either a prvalue or an xvalue.

It has properties that apply to both xvalues and prvalues, which means they apply to the pre-C++11 rvalues as well
The address of an rvalue may not be taken: &int(), &i++, &42, and &std::move(val) are invalid.
An rvalue may be used to initialize a const lvalue reference, in which case the lifetime of the object identified by the rvalue is extended until the scope of the reference ends.
An rvalue may be used to initialize an rvalue reference, in which case the lifetime of the object identified by the rvalue is extended until the scope of the reference ends.
When used as a function argument and when two overloads of the function are available, one taking rvalue reference parameter and the other taking lvalue reference to const parameter, rvalues bind to the rvalue reference overload (thus, if both copy and move constructors are available, rvalue arguments invoke the move constructor, and likewise with copy and move assignment operators).
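
The binding rules above can be sketched with two overloads. The functions which and make_name are hypothetical and exist only to report which overload was chosen.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Two overloads: rvalues pick the rvalue reference, lvalues pick const&.
std::string which(const std::string&) { return "const lvalue reference"; }
std::string which(std::string&&)      { return "rvalue reference"; }

std::string make_name() { return "temporary"; }

void run_rvalue_demo() {
    std::string named = "named";
    assert(which(named) == "const lvalue reference");       // lvalue
    assert(which(make_name()) == "rvalue reference");       // prvalue
    assert(which(std::move(named)) == "rvalue reference");  // xvalue

    // Binding a temporary to a reference extends its lifetime
    // until the reference goes out of scope.
    const std::string& kept = make_name();
    std::string&& also_kept = make_name();
    assert(kept == "temporary" && also_kept == "temporary");
}
```

If the rvalue reference overload were removed, all three calls would fall back to the const lvalue reference overload, which is why code written before C++11 keeps working.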

Moving

Let's say we have a 3D model class. The model class holds textures loaded from image files, vertex points that can run into the thousands, and color info for each vertex. For example:

class Vertex {
  // Members are not important here
public:
  void addVertex ( /*vertex type*/ ) {

  }
  ~Vertex ( ) {
    // Destroy vertices
  }
};

class Texture {
  // Members are not important here
public:
  void load (/*info*/ ) {
    // Heavy duty image loading
  }
  ~Texture ( ) {
    // Destroy 
  }
};

class Model3D {
private:
  Vertex* _ver = nullptr;   // null until initialize ( ) runs, so the destructor is always safe
  Texture* _tex = nullptr;
public:
  void initialize ( ) {
    _ver = new Vertex;
    _tex = new Texture;
    for ( int i = 0; i<10000; ++i ) {
      // Some more processing
      _ver->addVertex ( );
    }
    for ( int i = 0; i<500; ++i ) {
      _tex->load ( );
    }

  }
  ~Model3D ( ) {
    delete _ver;
    delete _tex;
  }
};

Model3D retGraphics ( ) {
  Model3D g;
  // Do some operation and return
  return g;
}

Model3D g1 = retGraphics ( );

Here, as you can see, the Model3D class does some heavy duty vertex and texture loading. Now take a look at the statement Model3D g1 = retGraphics ( );. Conceptually, this statement can be expanded into the following pseudo code.

Model3D tempG;
Model3D retGraphics ( ) {
  Model3D g;
  // Do some operation and return
  // g will die with the scope, so copy it to a temp object
  tempG = g; // Clone the resources: calls Model3D::operator =( const Model3D& ) on tempG
  g.~Model3D ( );
}
Model3D g1 = tempG; // Clone tempG: calls the copy constructor Model3D ( const Model3D& ) for g1
tempG.~Model3D ( );

As you see, there is a temporary involved. That means the vertex and texture loading and destruction happen twice, which is time consuming and unnecessary. The careful programmer is left to write code that swaps the resources instead of letting them be destroyed and recreated. This is tedious work that everyone has to repeat, and it increases the source size. Wouldn't it be good if the language supported it, reducing the burden on the programmer? Well, C++ does exactly that with the move functionality. So it does something like:

Model3D& Model3D::operator = ( <move type> rhs ) {
  //swap _ver
  //swap _tex
  return *this;
}

This is the reason C++ allows an overload on the move type, which is a special type that tells the compiler to move the resources rather than delete and reconstruct them. For this to work, the move type has to satisfy the following requirements:

The move type must be a reference.
When there is a choice between two overloads where one takes an ordinary reference and the other takes the move type, rvalues must prefer the move type.
lvalues must prefer the ordinary reference.

So what exactly is this move type? This is the rvalue reference, i.e., Model3D&&.

Model3D& is called the lvalue reference. So what are the properties of the rvalue reference now?

During function overload resolution lvalue prefers lvalue reference and rvalue prefers rvalue reference.

void f ( Model3D& m); // Lvalue reference overload.
void f ( Model3D&& m); // rvalue reference overload.

f ( g1 ); // g1 is an lvalue, so void f ( Model3D& m ); is called.
f ( retGraphics ( ) ); // retGraphics ( ) is an rvalue, so void f ( Model3D&& m ); is called.

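We can overload any function on an rvalue reference. In practice, though, this is mostly done for the copy constructor and the assignment operator, giving us the move constructor and the move assignment operator.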
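
As a rough sketch of what these two overloads could look like for the Model3D class, assuming empty stand-in Vertex and Texture types instead of the heavy ones from the article; the vertices accessor and run_move_demo are added here only so that the behavior can be observed.

```cpp
#include <cassert>
#include <utility>

struct Vertex {};   // stand-ins for the article's heavy resources
struct Texture {};

class Model3D {
    Vertex* _ver;
    Texture* _tex;
public:
    Model3D() : _ver(new Vertex), _tex(new Texture) {}
    ~Model3D() { delete _ver; delete _tex; }

    // Copy: clone the resources (the expensive path).
    Model3D(const Model3D& rhs)
        : _ver(rhs._ver ? new Vertex(*rhs._ver) : nullptr),
          _tex(rhs._tex ? new Texture(*rhs._tex) : nullptr) {}
    Model3D& operator=(const Model3D& rhs) {
        Model3D copy(rhs);           // copy, then swap with the copy
        std::swap(_ver, copy._ver);
        std::swap(_tex, copy._tex);
        return *this;
    }

    // Move: steal the pointers and leave rhs empty but destructible.
    Model3D(Model3D&& rhs) noexcept : _ver(rhs._ver), _tex(rhs._tex) {
        rhs._ver = nullptr;
        rhs._tex = nullptr;
    }
    Model3D& operator=(Model3D&& rhs) noexcept {
        std::swap(_ver, rhs._ver);   // rhs now owns our old resources
        std::swap(_tex, rhs._tex);   // and releases them in its destructor
        return *this;
    }

    const Vertex* vertices() const { return _ver; }  // hypothetical accessor
};

Model3D retGraphics() {
    Model3D g;
    return g;   // moved or elided, never deep-copied
}

void run_move_demo() {
    Model3D a;
    const Vertex* original = a.vertices();
    Model3D b(std::move(a));             // move constructor
    assert(b.vertices() == original);    // same resource, no clone
    assert(a.vertices() == nullptr);     // a was emptied
    Model3D c;
    c = std::move(b);                    // move assignment
    assert(c.vertices() == original);
}
```

The swap-based move assignment follows the pseudo code shown earlier; other designs, such as releasing the old resources immediately, are equally valid.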

So what happens if you implement the rvalue overloads and forget the lvalue overloads? Well, try it yourself. We will cover it later.
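
For readers who want a starting point for that experiment, here is a minimal sketch. MoveOnly and consume are hypothetical names; the checks rely on the rule that declaring a move constructor or move assignment operator causes the implicitly declared copy operations to be deleted.

```cpp
#include <type_traits>

// Hypothetical type that declares only the rvalue (move) overloads.
struct MoveOnly {
    MoveOnly() = default;
    MoveOnly(MoveOnly&&) = default;
    MoveOnly& operator=(MoveOnly&&) = default;
};

// With only move operations declared, the copy operations are deleted.
static_assert(std::is_move_constructible<MoveOnly>::value, "rvalues are accepted");
static_assert(!std::is_copy_constructible<MoveOnly>::value, "lvalues are rejected");
static_assert(!std::is_copy_assignable<MoveOnly>::value, "lvalues are rejected");

// The same holds for an ordinary function: an lvalue cannot bind to MoveOnly&&.
void consume(MoveOnly&&) {}
```

Given MoveOnly m;, the call consume(m) does not compile, while consume(std::move(m)) and consume(MoveOnly()) do. std::unique_ptr is a standard library type that behaves this way.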

For more on move and rvalue, please refer to the blog.
