C++11 Move Semantics, rvalue Reference






In this article, we will discuss move semantics in C++ and try to figure out what exactly a move is.
Problem Statement
Copying is not the optimal solution in all situations. Some situations call for moving instead, because copying duplicates resources and can be an expensive task.
Another problem arises from temporaries. Temporaries may hog memory and slow down C++ execution.
Solution
The solution comes in the form of move semantics. We will gradually discover what this is.
rvalue Reference
The rvalue reference is a new addition in C++11. We will see what its purpose is and why it was introduced.
The original definition of lvalues and rvalues is as follows:
As per the C-style definition, an lvalue is an expression that may appear on the left or on the right hand side of an assignment, whereas an rvalue is an expression that can only appear on the right hand side of an assignment.
int a = 42;
int b = 43;
// a and b are both l-values:
a = b; // ok
b = a; // ok
a = a * b; // ok
// a * b is an rvalue:
int c = a * b; // ok, rvalue on right hand side of assignment
a * b = 42; // error, rvalue on left hand side of assignment
C++ with its user-defined types has introduced some subtleties regarding modifiability and assignability that make this definition incorrect. So now we say that an lvalue is an expression that represents a memory location, and an lvalue lets us take the address of that location. What is an rvalue? Simply put, whatever is not an lvalue.
Now let us formally define these terms and properties a little better.
First, what is an expression?
- An expression is a sequence of operators and operands that specifies a computation. It tells the compiler what to compute.
- An expression can produce a value, like "1+2" (its value is 3).
- An expression can have side effects too like a function call.
- An expression can be simple or complex.
- Each expression has a type and a value category, i.e., whether the expression is an lvalue, an rvalue, etc.
lvalue
An lvalue is an expression that identifies a non-temporary object or a non-member function.
- The address of an lvalue can be taken.
- A modifiable, i.e., non-const, lvalue can be used on the left side of "=".
- An lvalue can be used to initialize an lvalue reference.
- Sometimes, when permitted, an lvalue can have incomplete type.
- An expression that designates a bit field (e.g. s.x where s is an object of type struct S { int x:3; };) is an lvalue expression (or an xvalue if s is one): it may be used on the left hand side of the assignment operator, but its address cannot be taken and a non-const lvalue reference cannot be bound to it. A const lvalue reference can be initialized from a bit-field lvalue, but a temporary copy of the bit field will be made: it won't bind to the bit field directly.
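The properties above can be checked with a small sketch. The struct S mirrors the one in the text; the two function names are hypothetical helpers introduced only for this illustration.

```cpp
struct S { int x : 3; };   // bit field from the text: 3 bits wide

// Returns true if a const reference initialized from a bit field
// keeps the old value after the bit field itself is changed.
bool constRefToBitFieldIsACopy() {
    S s{};
    s.x = 1;               // a bit-field lvalue may appear on the left of "="
    const int& r = s.x;    // binds to a temporary copy, not to the bit field
    s.x = 2;               // does not affect r
    // int* p = &s.x;      // would not compile: address of a bit field
    // int& q = s.x;       // would not compile: non-const lvalue reference
    return r == 1 && s.x == 2;
}

// Returns true if an lvalue reference aliases the original object.
bool lvalueRefAliasesObject() {
    int a = 42;
    int* p = &a;           // the address of an lvalue can be taken
    int& ref = a;          // an lvalue initializes an lvalue reference
    ref = 7;               // writes through to a
    return *p == 7 && &ref == p;
}
```

Both functions return true; the two commented-out lines are the cases the text says are rejected.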
Examples
- The name of a variable or function in scope, regardless of type, such as std::cin or std::endl. Even if the variable's type is rvalue reference, the expression consisting of its name is an lvalue expression.
- Function call or overloaded operator expression if the function's or overloaded operator's return type is an lvalue reference, such as std::getline(std::cin, str) or std::cout << 1 or str1 = str2 or ++iter
- Built-in pre-increment and pre-decrement, dereference, assignment and compound assignment, subscript (except on an array xvalue), member access (except for non-static non-reference members of xvalues, member enumerators, and non-static member functions), member access through pointer to data member if the left-hand operand is an lvalue, comma operator if the right-hand operand is an lvalue, ternary conditional if the second and third operands are lvalues.
- Cast expression to lvalue reference type.
- String literal
- Function call expression if the function's return type is rvalue reference to function type
- Cast expression to rvalue reference to function.
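One way to check several of these examples at compile time is decltype with double parentheses, which yields T& exactly when the expression is an lvalue. This is only a sketch: the macro IS_LVALUE, the global counter, and the functions counterRef and lvalueExamplesHold are hypothetical names made up for the illustration.

```cpp
#include <type_traits>

// decltype((expr)) is an lvalue reference type exactly when expr is an lvalue.
#define IS_LVALUE(expr) std::is_lvalue_reference<decltype((expr))>::value

int counter = 0;
int& counterRef() { return counter; }   // returns an lvalue reference

bool lvalueExamplesHold() {
    int i = 0;
    int arr[3] = {1, 2, 3};
    int&& rref = 5;    // the variable's type is rvalue reference...
    int* p = &i;
    static_assert(IS_LVALUE(i), "name of a variable");
    static_assert(IS_LVALUE(rref), "...but its name is still an lvalue");
    static_assert(IS_LVALUE(counterRef()), "call returning an lvalue reference");
    static_assert(IS_LVALUE(++i), "built-in pre-increment");
    static_assert(IS_LVALUE(*p), "dereference");
    static_assert(IS_LVALUE(arr[1]), "subscript");
    static_assert(IS_LVALUE(i = 3), "assignment");
    static_assert(IS_LVALUE("text"), "string literal");
    static_assert(!IS_LVALUE(i + 1), "arithmetic yields a prvalue instead");
    // The operands of decltype are never evaluated, so nothing changed.
    return i == 0 && arr[0] == 1 && rref == 5 && p == &i;
}
```

Because decltype does not evaluate its operand, expressions such as ++i and i = 3 are only classified, not executed.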
prvalue
A pure rvalue (prvalue) is an expression that identifies a temporary object (or a subobject thereof) or is a value not associated with any object.
- A prvalue is an rvalue.
- A prvalue cannot be polymorphic: the dynamic type of the object it identifies is always the type of the expression.
- A non-class non-array prvalue cannot be const-qualified.
- A prvalue cannot have incomplete type (except for type void, see below).
- The expressions obj.func and ptr->func, where func is a non-static member function, and the expressions obj.*mfp and ptr->*mfp, where mfp is a pointer to member function, are classified as prvalue expressions, but they cannot be used to initialize references, as function arguments, or for any purpose at all, except as the left-hand argument of a function call expression, e.g. (pobj->*ptr)(args).
- Function call expressions returning void, cast expressions to void, and throw-expressions are classified as prvalue expressions, but they cannot be used to initialize references or as function arguments. They can be used in some contexts (e.g. on a line of their own, as the left argument of the comma operator, etc.) and in the return statement in a function returning void.
Examples
- Literal (except string literal), such as 42 or true or nullptr.
- Function call or overloaded operator expression if the function's or the overloaded operator's return type is not a reference, such as str.substr(1, 2) or str1 + str2
- Built-in post-increment and post-decrement, arithmetic and logical operators, comparison operators, address-of operator, member access for a member enumerator, a non-static member function, or a non-static non-reference data member of an rvalue, member access through pointer to a data member of an rvalue or to a non-static member function, comma operator where the right-hand operand is an rvalue, ternary conditional where either second or third operands aren't lvalues.
- Cast expression to any type other than reference type.
- Lambda expressions, such as [](int x){return x*x;}
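Several of the prvalue examples can be verified the same way: for a prvalue, decltype with double parentheses yields a non-reference type. In this sketch the macro IS_PRVALUE and the functions makeName and prvalueExamplesHold are hypothetical names. The lambda example is left out because a lambda expression may not appear inside decltype before C++20.

```cpp
#include <string>
#include <type_traits>

// decltype((expr)) is a non-reference type exactly when expr is a prvalue.
#define IS_PRVALUE(expr) (!std::is_reference<decltype((expr))>::value)

std::string makeName() { return "model"; }   // returns by value

bool prvalueExamplesHold() {
    int i = 1;
    std::string s = "abc";
    static_assert(IS_PRVALUE(42), "literal");
    static_assert(IS_PRVALUE(true), "literal");
    static_assert(IS_PRVALUE(nullptr), "literal");
    static_assert(IS_PRVALUE(makeName()), "call returning a non-reference");
    static_assert(IS_PRVALUE(s.substr(1, 2)), "member call returning by value");
    static_assert(IS_PRVALUE(s + s), "overloaded operator returning by value");
    static_assert(IS_PRVALUE(i++), "built-in post-increment");
    static_assert(IS_PRVALUE(i < 2), "comparison");
    static_assert(IS_PRVALUE(&i), "address-of");
    static_assert(IS_PRVALUE(static_cast<double>(i)), "cast to a non-reference type");
    static_assert(!IS_PRVALUE(s), "a variable name is an lvalue, not a prvalue");
    // Nothing inside decltype is evaluated, so i and s are unchanged.
    return i == 1 && s == "abc";
}
```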
xvalue
An xvalue is an expression that identifies an "eXpiring" object, that is, an object that may be moved from. The object identified by an xvalue expression may be a nameless temporary, it may be a named object in scope, or any other kind of object, but if used as a function argument, an xvalue will always bind to the rvalue reference overload if available.
- An xvalue is both an rvalue and a glvalue.
- Like prvalues, xvalues bind to rvalue references.
- Unlike prvalues, an xvalue may be polymorphic, and a non-class xvalue may be cv-qualified.
Examples
- A function call or overloaded operator expression if the function's or the overloaded operator's return type is an rvalue reference to object type, such as std::move(val)
- A cast expression to an rvalue reference to object type, such as static_cast<T&&>(val) or (T&&)val
- A non-static class member access expression, in which the object expression is an xvalue
- A pointer-to-member expression in which the first operand is an xvalue and the second operand is a pointer to data member.
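The xvalue examples can be sketched as follows. For an xvalue, decltype with double parentheses yields T&&. The type Holder, the overload pair bindsTo, the macro IS_XVALUE, and the function xvalueExamplesHold are all hypothetical names used only to illustrate the list above.

```cpp
#include <string>
#include <type_traits>
#include <utility>

struct Holder { std::string name; };   // simple aggregate with one member

// Overload pair that reports which kind of reference the argument bound to.
std::string bindsTo(const std::string&) { return "lvalue reference"; }
std::string bindsTo(std::string&&)      { return "rvalue reference"; }

// decltype((expr)) is an rvalue reference type exactly when expr is an xvalue.
#define IS_XVALUE(expr) std::is_rvalue_reference<decltype((expr))>::value

bool xvalueExamplesHold() {
    std::string val = "abc";
    Holder h{"h"};
    static_assert(IS_XVALUE(std::move(val)), "call returning an rvalue reference");
    static_assert(IS_XVALUE(static_cast<std::string&&>(val)), "cast to rvalue reference");
    static_assert(IS_XVALUE(std::move(h).name), "member access on an xvalue");
    static_assert(!IS_XVALUE(val), "a plain name is an lvalue");
    // decltype evaluates nothing, so no object was actually moved from.
    return val == "abc" && h.name == "h";
}
```

Calling bindsTo(val) selects the const lvalue reference overload, while bindsTo(std::move(val)) selects the rvalue reference overload, which matches the binding rule stated above. Note that std::move by itself moves nothing: it only produces an xvalue.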
glvalue
A glvalue ("generalized" lvalue) is an expression that is either an lvalue or an xvalue.
- Most of its properties are those that applied to pre-C++11 lvalues.
- A glvalue may be implicitly converted to a prvalue with the lvalue-to-rvalue, array-to-pointer, or function-to-pointer implicit conversion.
- A glvalue may be polymorphic: the dynamic type of the object it identifies is not necessarily the static type of the expression.
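The difference in polymorphic behaviour between glvalues and prvalues can be seen in a minimal sketch. The classes Base and Derived and the three functions are hypothetical, introduced only for this illustration.

```cpp
#include <string>

struct Base {
    virtual ~Base() = default;
    virtual std::string name() const { return "Base"; }
};

struct Derived : Base {
    std::string name() const override { return "Derived"; }
};

// An lvalue (a glvalue): static type Base, dynamic type Derived.
std::string nameViaLvalue() {
    Derived d;
    Base& b = d;
    return b.name();
}

// An xvalue (also a glvalue): still refers to the Derived object.
std::string nameViaXvalue() {
    Derived d;
    return static_cast<Base&&>(d).name();
}

// A prvalue: the cast to a non-reference type creates a new Base object
// (slicing), so the dynamic type is exactly Base.
std::string nameViaPrvalue() {
    Derived d;
    return static_cast<Base>(d).name();
}
```

The first two functions return "Derived" and the third returns "Base".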
rvalue
An rvalue is an expression that is either a prvalue or an xvalue.
- It has properties that apply to both xvalues and prvalues, which means they apply to the pre-C++11 rvalues as well.
- The address of an rvalue may not be taken: &int(), &i++, &42, and &std::move(val) are invalid.
- An rvalue may be used to initialize a const lvalue reference, in which case the lifetime of the object identified by the rvalue is extended until the scope of the reference ends.
- An rvalue may be used to initialize an rvalue reference, in which case the lifetime of the object identified by the rvalue is extended until the scope of the reference ends.
- When used as a function argument and when two overloads of the function are available, one taking an rvalue reference parameter and the other taking an lvalue reference to const parameter, rvalues bind to the rvalue reference overload (thus, if both copy and move constructors are available, rvalue arguments invoke the move constructor, and likewise with copy and move assignment operators).
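Two of these properties, lifetime extension and the choice between copy and move constructors, can be sketched as below. The type Tracker and the function extendedLifetimeLength are hypothetical; Tracker simply counts how it was constructed.

```cpp
#include <cstddef>
#include <string>
#include <utility>

// Records whether an object was produced by the copy or the move constructor.
struct Tracker {
    int copies = 0;
    int moves = 0;
    Tracker() = default;
    Tracker(const Tracker& other) : copies(other.copies + 1), moves(other.moves) {}
    Tracker(Tracker&& other) noexcept : copies(other.copies), moves(other.moves + 1) {}
};

// Both temporaries stay alive until the references go out of scope.
std::size_t extendedLifetimeLength() {
    const std::string& r = std::string("temp");        // const lvalue reference
    std::string&& rr = std::string("temporary");       // rvalue reference
    return r.size() + rr.size();                       // 4 + 9
}
```

With these definitions, Tracker b(a) for an lvalue a goes through the copy constructor, and Tracker c(std::move(a)) goes through the move constructor, because std::move(a) is an rvalue.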
Moving
Let's say we have a 3D model class. The model class holds textures that are image files, vertex points that can run into the thousands, and color info for each vertex. For example:
class Vertex {
    // Members not important
public:
    void addVertex ( /*vertex type*/ ) {
    }
    ~Vertex ( ) {
        // Destroy vertices
    }
};
class Texture {
    // Members not important
public:
    void load ( /*info*/ ) {
        // Heavy duty image loading
    }
    ~Texture ( ) {
        // Destroy
    }
};
class Model3D {
private:
    Vertex* _ver = nullptr;   // null until initialize ( ) runs, so delete is safe
    Texture* _tex = nullptr;
public:
    void initialize ( ) {
        _ver = new Vertex;
        _tex = new Texture;
        for ( int i = 0; i<10000; ++i ) {
            // Some more processing
            _ver->addVertex ( );
        }
        for ( int i = 0; i<500; ++i ) {
            _tex->load ( );
        }
    }
    ~Model3D ( ) {
        delete _ver;
        delete _tex;
    }
};
Model3D retGraphics ( ) {
    Model3D g;
    // Do some operation and return
    return g;
}
Model3D g1 = retGraphics ( );
Here, as you can see, the Model3D class does some heavy duty vertex and texture loading. Now take a look at the statement Model3D g1 = retGraphics ( );. This statement can be converted to the following pseudo code.
Model3D tempG;
Model3D retGraphics ( ) {
    Model3D g;
    // Do some operation and return
    // g will die with the scope, so copy it to a temp object
    tempG = g; // Clone the resources: calls Model3D::operator =( const Model3D& ) on tempG
    g.~Model3D ( );
}
Model3D g1 = tempG; // Clone tempG: calls the copy constructor Model3D ( const Model3D& ) for g1
tempG.~Model3D ( );
As you can see, there is a temporary involved. That means the vertex and texture loading and destruction happen twice, which is time consuming and unnecessary. So the programmer is left to write code that swaps the resources rather than letting them be destroyed. That is tedious, repetitive work that everyone has to do, and it increases the source size. Wouldn't it be good if the language did it, reducing the burden on the programmer? Well, C++ does exactly that with the move functionality. It does something like:
Model3D& Model3D::operator = ( <move type> rhs ) {
    //swap _ver
    //swap _tex
    return *this;
}
This is the reason C++ supports an overload on the move type, which is a special type that tells the compiler to move the resources rather than do the delete and construct operation. With the move type in play, the following rules apply:
- The move type must be a reference
- When there is a choice between two overloads where one is an ordinary reference and the other is the move type, then rvalues must prefer the move type and lvalues must prefer the ordinary reference
So what exactly is this move type? It is the rvalue reference, i.e., Model3D&&. Model3D& is called the lvalue reference. So what are the properties of the rvalue reference?
- During function overload resolution, an lvalue prefers the lvalue reference and an rvalue prefers the rvalue reference.
void f ( Model3D& m); // Lvalue reference overload.
void f ( Model3D&& m); // rvalue reference overload.
f ( g1 ); // Here g1 is lvalue, so call void f ( Model3D& m );
f ( retGraphics ( ) ); // Here the argument is an rvalue, so void f ( Model3D&& m ); is called.
- We can overload any function with an rvalue reference parameter. In practice, though, this is mostly done for the copy constructor and assignment operator.
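As a rough sketch of what such overloads might look like for the Model3D class, here is a move constructor and a move assignment operator built on the swap idea from the pseudo code above. This is one possible implementation under stated assumptions, not the only one: Vertex and Texture are reduced to empty stand-ins, the loaded ( ) helper is hypothetical, and copying is simply disabled to keep the example short.

```cpp
#include <utility>

struct Vertex  {};   // empty stand-ins for the heavy classes in the article
struct Texture {};

class Model3D {
    Vertex*  _ver = nullptr;
    Texture* _tex = nullptr;
public:
    Model3D() = default;

    void initialize() {
        if (!_ver) _ver = new Vertex;
        if (!_tex) _tex = new Texture;
    }

    // Hypothetical helper: true while this object owns its resources.
    bool loaded() const { return _ver != nullptr && _tex != nullptr; }

    // Copying is left out of this sketch.
    Model3D(const Model3D&) = delete;
    Model3D& operator=(const Model3D&) = delete;

    // Move constructor: take over the pointers and leave rhs empty.
    Model3D(Model3D&& rhs) noexcept : _ver(rhs._ver), _tex(rhs._tex) {
        rhs._ver = nullptr;
        rhs._tex = nullptr;
    }

    // Move assignment: swap the resources; the destructor of rhs
    // later releases whatever this object owned before.
    Model3D& operator=(Model3D&& rhs) noexcept {
        std::swap(_ver, rhs._ver);
        std::swap(_tex, rhs._tex);
        return *this;
    }

    ~Model3D() {
        delete _ver;
        delete _tex;
    }
};
```

After Model3D a; a.initialize ( ); Model3D b; b = std::move ( a );, the object b owns the vertex and texture data and a is left empty, with no reloading. Swapping also makes self-assignment harmless.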
So what happens if you implement the rvalue reference overload and forget the lvalue overloads? Well, try it yourself. We will cover it later.
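For readers who want to try it, here is a small compile-time experiment. It relies on two language rules: an rvalue reference parameter cannot bind to an lvalue, and a class that declares a move constructor but no copy constructor has its copy constructor implicitly deleted. The names Model, consume, CanConsume, and MoveOnly are hypothetical.

```cpp
#include <type_traits>
#include <utility>

struct Model {};

void consume(Model&&) {}   // only the rvalue reference overload exists

// Detects at compile time whether consume(arg) would compile
// for an argument of the given type and value category.
template <typename Arg, typename = void>
struct CanConsume : std::false_type {};

template <typename Arg>
struct CanConsume<Arg, decltype(consume(std::declval<Arg>()))> : std::true_type {};

static_assert(CanConsume<Model>::value, "rvalue arguments are accepted");
static_assert(!CanConsume<Model&>::value, "lvalue arguments are rejected");
static_assert(!CanConsume<const Model&>::value, "const lvalues are rejected too");

// A class with only the move operations declared.
struct MoveOnly {
    MoveOnly() = default;
    MoveOnly(MoveOnly&&) = default;
    MoveOnly& operator=(MoveOnly&&) = default;
};

static_assert(std::is_move_constructible<MoveOnly>::value, "moving works");
static_assert(!std::is_copy_constructible<MoveOnly>::value, "copy constructor is deleted");
static_assert(!std::is_copy_assignable<MoveOnly>::value, "copy assignment is deleted");
```

In other words, with only the rvalue overload, calls with lvalues stop compiling unless the caller writes std::move explicitly.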
For more on move semantics and rvalue references, please refer to the blog.