## Changes From the Previous Version

Since the previous version of this article the following changes have been made:

- a section with more formal definitions of the proposed loop constructs have been added;
- the implementation pays more attention to the corner cases (when step is zero, when the loop iterates over the whole range of the corresponding integral type);
- the benchmarks contain results for compilers under Windows and under Linux.
- the section 'Tests for Correctness' has been added.

## Introduction

In this article, the following loops are proposed:

**for (auto&& ***x* : step(*start, count, step-width*)) { ... }

**for (auto ***x* : step(*count*)) {...}

**for (auto ***x* : step_backward(*count*)) {...}

**for (auto ***x* : step_to(*start, finish, step-width*)) {...}

**for (auto ***x* : step_until(*start, finish, step-width*)) {...}

They are based on the C++11 for-range loop. In order to write such loops, the appropriate functions `step`

, `step_backward`

,` step_to`

and `step_until `

have to be implemented. The article shows their implementation, discusses advantages of such loops and includes the benchmark results.

## Background

There are some cases when we need to produce values sequentially, for instance with a fixed step. If we need to use iterators, which is a natural thing to do, we have to store the values them in a container. This can be a problem if a sequence consists of several million elements. It will take much space if we store it in a container; we will have to provide a move constructor and move assignment operator, and, in general, be careful how to pass the containers between functions.

Using standard for-loops may be inconvenient: writing descending sequences using unsigned integers that stop at 0 is not staightforward. The following will not work:

for (std::size_t j = N-1; j >= 0; --j)
{
. . .
}

We have to write the following loop, which is not obvious for beginners:

for (std::size_t j = N; j-- != 0;)
{
. . .
}

Floating-point loops, which step through values of floating-point types, have also some issues. The following loop

for (double x = 0.5; x < 1.0; x += 0.1)
{
std::cout << std::setprecision(17) << x << std::endl;
}

will print (on most computers) something like that:

0.5
0.59999999999999998
0.69999999999999996
0.79999999999999993
0.89999999999999991
0.99999999999999989

We get six values instead of the "expected" five. In order to achieve the desired effect we can have two options. One of them is to use integer steps and write the following code:

double x = 0.5;
for (int i = 0; i < 5; ++i)
{
std::cout << std::setprecision(17) << x << std::endl;
x += 0.1;
}

Another option is to subtract a small epsilon from the upper bound:

double epsilon = 0.05;
double right_bound = 1.0 – epsilon;
for (double x = 0.5; x < right_bound; x += 0.1)
{
std::cout << std::setprecision(17) << x << std::endl;
}

In other programming languages (like Python [__1__], Pascal, etc) it is possible to use ranges with steps.

The implementation proposed in this article has been tested in VC++ CTP 2014, GNU C++ 4.9, GNU C++ 4.9.2 and Clang 3.6.

## Existing Implementations in C++ and Constructs in Other Languages

In Boost [2], there is a class template, called `irange`

, which is defined as follows:

template<class Integer>
iterator_range< range_detail::integer_iterator<Integer> >
irange(Integer first, Integer last);
template<class Integer, class StepSize>
iterator_range< range_detail::integer_iterator_with_step<Integer, StepSize> >
irange(Integer first, Integer last, StepSize step_size);

It can be used to iterate over a range [first, last) with a fixed step:

std::size_t m = 5;
for (auto x : boost::irange(0U, m))
{
std::cout << x << " ";
}
std::cout << std::endl;

This loop will output:

0 1 2 3 4

The last parameter is always excluded from the range. Negative steps are allowed:

std::vector<int> v = { 1, 2, 3, 4 };
for (auto i : boost::irange(static_cast<int>(v.size()) - 1, -1, -1))
{
std::cout << v[i] << std::endl;
}

The irange template allows to write loops for integer ranges. It is similar to the range function in Python [1].

Here are some issues. First of all the first and the last bounds of the loop should be of the same type. The following expressions will not compile:

boost::irange(0, m),
boost::irange(v.size() - 1, -1, -1)

Even **boost::irange(0U, m)** would fail in 64-bit code on most compilers. It should be written as follows:

boost::irange(0ULL, m)

or even better

boost::irange(static_cast<std::size_t>(0), m)

The second issue is that we prefer to use indice of `std::size_t`

and it is not convenient to use irange for loops with descending ranges using unsigned values. It would be better if the descending range excluded the first bound and included the last one.

For floating-point number, there is the linspace function, defined in the Nympy library for Python [3], which allows to write loops as follows:

for x in np.linspace(2.0, 3.0, num=5, endpoint=True):
print(x)

This loop will output the following values:

2.0
2.25
2.5
2.75
3.0

If, on the other hand, endpoint is defined as False, the output will be:

2.0
2.25
2.5
2.75

In MATLAB [4], there is a function called linspace as well, but it always includes the last bound.

## Proposed Loop Constructs

The following four types of loop constructs are proposed.

### 1. The **step** loop

#### 1.1. The three-parameter step loop

The following loop

for(auto&& control_variable : step(start_value, step_count, step_width)) statement

is equivalent to the following statement

{
auto control_variable = start_value;
auto __counter = step_count;
auto __stepw = step_width;
while (__counter-- != 0)
{
statement;
control_variable += __stepw;
}
}

where `__counter `

and `__stepw `

are defined for *exposition only *(which means that they are not explicitly available, and the actual implementation may use different variables to achieve the same effect). The type of `step_count `

must be integral.

##### Example 1

Here is a sample `step`

loop:

for (auto&& i : step(10, 5, -2))
{
std::cout << i << std::endl;
}

It will output the following values:

10
8
6
4
2

##### Example 2

The following loop will not output anything because `step_width`

is zero and the body will not be executed.

for (auto&& i : step(10, 10, 0))
{
std::cout << i << std::endl;
}

##### Example 3

The `<font face="Courier New">steps</font>`

loop can be used for generation not only arithmetic values:

for (auto c : step(std::string("!"), 6, std::string("*")))
{
std::cout << c << std::endl;
}

This code will print:

!
!*
!**
!***
!****
!*****

This means that the`<font face="Courier New"> steps </font>`

loop can also be used, for instance, to iterate over various user-defined Big Numbers.

#### 1.2. The two-parameter step loop

If the last parameter of step is absent, it is assumed to be of `int`

type with the value 1. So, the following

loop

for(auto&& control_variable : step(start_value, step_count)) statement

is equivalent to

for(auto&& control_variable : step(start_value, step_count, 1)) statement

##### Example

Here is sample two-parameter step loop:

for (auto&& i : step(3, 5))
{
std::cout << i << std::endl;
}

It will output this:

3
4
5
6
7

#### 1.3. The single-parameter step loop

The following loop

for(auto control_variable : step(step_count)) statement

is equivalent to this statement

for(auto&& control-variable : step(0,step-count, 1)) statement

##### Example

This is a sample single-parameter step loop:

for (auto i : step(5))
{
std::cout << i << std::endl;
}

It will output the following values:

0
1
2
3
4

### 2. The step_backward loop

This loop

for(auto control_variable : step_backward(step_count) statement

is equivalent to the following statement

{
auto __counter = step_count;
for(auto&& control_variable = step(__counter-1, __counter, -1) statement
}

where `__counter`

is defined for exposition only.

##### Example

This is a sample single-parameter step loop:

for (auto i : step_backward(5))
{
std::cout << i << std::endl;
}

It will output the following values:

4
3
2
1
0

### 3. The step_to loop

#### 3.1. The three-parameter step_to loop

This loop

for (auto control_variable : step_to(start_value, target_value, step_width) statement

is equivalent to the following statement

{
typename template CommonType<decltype(start_value),decltype(target_value)>::type __common_type;
__common_type control_variable = start_value;
__common_type __target = target_value;
auto __stepw = step_width;
if (__stepw > 0)
{
if (control_variable <= __target)
{
while (true)
{
statement;
if (control_variable > __target - __stepw)
break;
control_variable += __stepw;
}
}
}
else if (__stepw < 0)
{
if (control_variable >= __target)
{
while (true)
{
statement;
if (control_variable < __target - __stepw)
break;
control_variable += __stepw;
}
}
}
}

where `__common_type`

, `__target `

and `__stepw `

are defined for exposition only.

The type **CommonType<T,U>::type **is defined as follows:

- if
`T`

and `U`

** ** represent the same type, then this type is used; - otherwise,
**decltype(T()+U()) **is used.

The types of all the expressions `start_value`

, `target_value `

and `step_width `

must be integral. The value of `target_value `

might not be the final value of `control_variable`

. If `__common_type `

is `int`

or of the same or higher rank (`unsigned`

, `long`

, `unsigned long`

,` long long`

, `unsigned long long`

, etc.) then the effect of the loop over the whole range of values of this type are *implementation-dependent*, which means you should not rely on the results and avoid writing such loops.

##### Example 1

Here is a sample step_to loop:

for (auto i : step_to(10, 2, -2))
{
std::cout << i << std::endl;
}

It will output the following values:

10
8
6
4
2

##### Example 2

The effect of the following loop is implementation-dependent:

unsigned long long counter = 0;
for (auto i : step_to(std::numeric_limits<int>::max(), std::numeric_limits<int>::min(), -1))
{
counter++;
}
std::cout << counter << std::endl;

##### Example 3

The effect of this loop is properly defined (the loop counter `i`

will be of type `long long`

):

unsigned long long counter = 0;
long long max1 = std::numeric_limits<int>::max();
long long min1 = std::numeric_limits<int>::min();
for (auto i : step_to(max1, min1, -1))
{
counter++;
}
std::cout << counter << std::endl;

The result will be (a-b+1), which is in most implementations:

4294967296

#### 3.2. The two-parameter step_to loop

In `step_to`

, the last parameter can be absent. In this case it is assumed to of of int type with the value 1.

This loop

for (auto control_variable : step_to(start_value, target_value) statement

produces the same results as the following statement:

for (auto control_variable : step_to(start_value, target_value, 1) statement

### 4. The step_until loop

#### 4.1. The three-parameter step_until loop

The following loop

for (auto control_variable : step_until(start_value, target_value, step_width) statement

is equivalent to the following statement

{
typename template CommonType<decltype(start_value),decltype(target_value)>::type __common_type;
__common_type control_variable = start_value;
__common_type __target = target_value;
auto __stepw = step_width;
if (__stepw > 0)
{
if (control_variable < __target)
{
while (true)
{
statement;
if (control_variable >= __target - __stepw)
break;
control_variable += __stepw;
}
}
}
else if (__stepw < 0)
{
if (control_variable > __target)
{
while (true)
{
statement;
if (control_variable <= __target - __stepw)
break;
control_variable += __stepw;
}
}
}
}

where `<font face="Courier New">__common_type</font>`

, `__target `

and `__stepw `

are defined for exposition only. The types of all the expressions `start_value`

, `target_value `

and `step_width `

must be integral. The value of` target-value `

will never be assigned to control_variable.

##### Example 1

Here is a sample step_until loop:

for (auto i : step_until(10, 2, -2))
{
std::cout << i << std::endl;
}

It will output the following values:

10
8
6
4

##### Example 2

Here is another step_until loop:

unsigned long long counter = 0;
for (auto i : step_to(std::numeric_limits<int>::max(), std::numeric_limits<int>::min(), -1))
{
counter++;
}
std::cout << counter << std::endl;

This code will output the value of the expression

static_cast<long long>(std::numeric_limits<int>::max())-static_cast<long long>(std::numeric_limits<int>::min())

On most 32-bit and 64-bit systems the output will be:

4294967295

#### 4.2. The two-parameter step_until loop

In `step_until`

, the last parameter can be absent. In this case it is assumed to of of int type with the value 1.

This loop

for (auto control_variable : step_until(start_value, target_value) statement

produces the same results as the following statement:

for (auto control_variable : step_until(start_value, target_value, 1) statement

## Implementation

### Creating a Class to be Used in a For-Range Loop

In C++11 there is a for-range loop which allows to iterate over elements of a container. Here is an example:

std::vector<unsigned> a{ 5, 6, 7, 8 };
for (auto x : a)
{
std::cout << x << std::endl;
}

This loop is equivalent to the following loop:

for (auto it = a.begin(), itStop = a.end(); it != itStop; ++it)
{
auto x = *it;
std::cout << x << std::endl;
}

In the for-range loop, the class `a `

does not have to be a container. It can be a class, which does not store values: it can generate them through the iterator provided.

We need a starting value, a number of steps and a step width. The exact types of these values can vary depending on the loop we would like to write. In this case we have to create a template. Here is a general structure of the template:

template<class T, class N, class S>
class steps_class
{
public:
typedef *to-be-defined *iterator;
steps_class(T start, N step_count, S step);
iterator begin() const;
iterator end() const;
T first_value() const;
N size() const;
S step_length() const;
};

We can use this class as follows:

std::string str = "apples";
for (auto i : steps_class<std::size_t, std::size_t,int>(str.size()-1, str.size(), -1))
{
std::cout << str[i] << std::endl;
}

This will print:

s
e
l
p
p
a

The loop over floating-point values, discussed in the introduction, can be re-written as follows:

for (auto x : steps_class<double, unsigned, double>(0.5, 5, 0.1))
{
std::cout << x << std::endl;
}

Here is the full definition of `steps_class`

:

template<class T, class N, class S>
class steps_class
{
public:
class steps_iterator : public std::iterator<std::forward_iterator_tag, T>
{
N m_step_counter;
T m_counter;
S m_step;
public:
steps_iterator() : m_step_counter(N()) {}
steps_iterator(N steps_count, T start, S step) :m_step_counter(steps_count), m_counter(start), m_step(step)
{
}
steps_iterator& operator++()
{
--m_step_counter;
m_counter += m_step;
return *this;
}
steps_iterator operator++(int)
{
steps_iterator it = *this;
--m_step_counter;
m_counter += m_step;
return it;
}
const T& operator*() const
{
return m_counter;
}
const T* operator->() const
{
return &m_counter;
}
bool operator== (const steps_iterator& r) const
{
return m_step_counter == r.m_step_counter;
}
bool operator!= (const steps_iterator& r) const
{
return !(m_step_counter == r.m_step_counter);
}
};
typedef steps_iterator iterator;
private:
iterator m_end_iterator;
iterator m_start_iterator;
public:
steps_class(T start, N step_count, S step) : m_start_iterator(step_count, start, step)
{
}
iterator begin() const
{
return m_start_iterator;
}
iterator end() const { return m_end_iterator; }
};

### Implementation of Loop Constructs

#### 1. Implementation of the step loop

It is not really convenient to use `steps_class`

: it requires to explicitly specify all the types of the parameters. It is much easier to provide convenient functions for writing various loops. The three- and two-parameter `step`

loop can be implemented as follows:

template<class T, class N, class S = int>
inline auto step(T a, N n, S s = 1)->steps_class<decltype(a + s), N, S>
{
static_assert(std::is_integral<N>::value, "step: the second parameter should be of integral type");
return steps_class<decltype(a + s), N, S>(a, n, s);
}

The single-parameter `step`

loop implementation can be as follows :

template<class N>
inline steps_class<N, N, int> step(N n)
{
static_assert(std::is_integral<N>::value, "step: parameter should be of integral type");
return steps_class<N, N, int>(N(0), n, 1);
}

#### 2. Implementation of the step_backward loop

The `step_backward `

loop can be implemented as follows:

template<class N>
inline steps_class<N, N, int> step_backward(N n)
{
static_assert(std::is_integral<N>::value, "step_backward: parameter should be of integral type");
return steps_class<N, N, int>(n + int(-1), n, int(-1));
}

#### 3. Implementation of the step_to and step_until loops

First of all, we have to define the `CommonType`

template definition. Here is a simple definition of this template:

template<class T, class U>
class CommonType
{
typedef typename std::remove_reference<T>::type T1;
typedef typename std::remove_reference<U>::type U1;
public:
typedef decltype(T1() + U1()) type;
};
template<class T>
class CommonType<T,T>
{
typedef typename std::remove_reference<T>::type T1;
public:
typedef T1 type;
};

Since we may pass varaible as parameters to the `step_to`

and `step_until `

loops, the types T and U may be, for instance, `int&`

, `char&`

instead of simple: `int`

and `char`

. That's why we have to use `remove_reference`

.

In this implementation we will be using `steps_class`

to count the number of iterations. It is convenient to use `std::size_t`

and the counter type, unless its rank is less than the common type of the first two parameters (`start_value`

and` target_value`

). In this case we can start the definition of the step_to as follows:

template<class T, class U, class S = int,
class StartType = typename CommonType<T, U>::type,
class CounterType = typename CommonType<std::size_t, StartType>::type>
inline steps_class<StartType, CounterType, S> step_to(T a, U b, S s = 1)

Two extra types have been added to the template parameters: `StartType`

and `CounterType`

. The former will be the type of the values produced in the loop, the latter will be used to count the number of steps.

The full definition of those loop will be as follows:

template<class T, class U, class S = int,
class StartType = typename CommonType<T, U>::type,
class CounterType = typename CommonType<std::size_t, StartType>::type>
inline steps_class<StartType, CounterType, S> step_to(T a, U b, S s = 1)
{
static_assert(std::is_integral<StartType>::value, "step_to requires integral parameters");
CounterType n = 0;
if (s == 0 || b != a && (b < a) != (s < 0))
{
n = 0;
}
else
{
S abs_s = s > 0 ? s : -s;
CounterType diff = b > a ? (long long)b - (long long)a : (long long)a - (long long)b;
if (abs_s == 1)
n = diff;
else
n = diff / abs_s;
n++;
}
return steps_class<StartType, CounterType, S>(a, n, s);
}
template<class T, class U, class S = int,
class StartType = typename CommonType<T, U>::type,
class CounterType = typename CommonType<std::size_t, StartType>::type>
inline steps_class<StartType, CounterType, S> step_until(T a, U b, S s = 1)
{
static_assert(std::is_integral<StartType>::value, "step_until requires integral parameters");
CounterType n;
if (b == a || s == 0 || (b < a) != (s < 0))
{
n = 0;
}
else
{
S abs_s = s > 0 ? s : -s;
CounterType diff = b > a ? (long long)b - (long long)a : (long long)a - (long long)b;
if (abs_s == 1)
n = diff;
else
n = diff / abs_s;
CounterType finish = a + n*s;
if (finish != b)
{
n++;
}
}
return steps_class<StartType, CounterType, S>(a, n, s);
}

## Other Examples

The `steps`

function can be used for generation not only arithmetic values:

for (auto c : step(std::string("!"), 6, std::string("*")))
{
std::cout << c << std::endl;
}

This code will print:

!
!*
!**
!***
!****
!*****

This means that the` steps `

function can also be used for various user-defined Big Numbers.

## Benchmarks

The following loops were tested for benchmarks:

- T1 - two-level loops with single-parameter steps;
- T2 - two-level loops, stepping through vector indices with the step width of 2;
- T3 - two-level loops, stepping through vector indices with the step width of 5;
- T4 - integral calculation using the 7-point open Newton-Cotes rule.

I ran tests under Windows 8.1 and under Linux Mint 17 Qiana. The computers were different:

- Windows ran on a computer with 8GB RAM and a Quad Core i7-4790, 3.6 GHz processor;
- Linux ran on a computer with 6GB RAM and a Dual Core i5-2410M, 2.3 GHz processor.

The following compilers were used:

- VC++ 2014;
- GCC C++ 4.9.0;
- GCC C++ 4.9.2;
- Clang 3.6

For VC++ the following optimizations were set up:

For the other compilers the option **-O3** was used.

The results slightly differ in various compilers and there was some difference between the 32-bit and 64-bit models. The results for Windows are shown in Figure 1.

#### Figure 1. Benchmark results under Windows.

The timings are in milliseconds; percentage was calculated as follows:

**100*(timing_steps - timing_standard)/timing_standard**

The negative percentage means that the steps constructs performed better. I have also highlighted in yellow the results which were close (differed within 1% from one another), in green -- if the steps code peformed better by more than 1%, and in red if the steps perform worse by more than 1%.

The Linux results are shown in Figure 2.

#### Figure 2. Benchmark results under Linux

Overall, the results for the steps constructs showed a very good performance.

## Tests for Correctness

The tests contain various loops with defferent ranges. The implementation-dependent features for stop_to loops have been included as well. In general, the following issues in different implemetations have been observed (they are well-known issues):

- in VC++ in both 32-bit and 64-bit code, the constant
**-2147483648 **is of type `unsigned`

and has the value 2147483648, which is diffrent from std::numeric_limits<int>::min(), which has correct value; at the same time **-2147483648LL **is** **correct; - in GCC and Clang the constant
**-2147483648 **takes 8 bytes in memory.

Both these issues make `step_to`

loop behave differently in different compilers. That is the reason I mentioned some implementation-dependent issues. In the provided tests, the implementation depended issues are categorized as "acceptable errors": they do not produce results that we want, but they are acceptable. We should not rely on the implementation-dependent code anyway and avoid such loops.

Other issues include floating-point calculations. The provided tests compared like with like and produced exactly the same results using standard loops and the `step`

loops. But it is possible to slightly modify those tests so that, with aggressive optimization, they may produce different results, although the the calculations used look the same. There are several reasons for this. I will name two main ones:

- possible change in the order of calculations (for instance, summation of floating-point numbers, due two rounding, may produce different results);
- the Intel floating-point arithmetic traditionally uses 80-bit floating-point numbers, but calculations on
**xmm** registers (which usually produce faster code) use 64-bit numbers (there is an obvious difference in precision).

## References

[1]. Python 3.2 Build-in Function.: https://docs.python.org/3.2/library/functions.html

[2]. Boost irange: http://www.boost.org/doc/libs/1_55_0/libs/range/doc/html/range/reference/ranges/irange.html

[3]. NumPy linspace: http://docs.scipy.org/doc/numpy/reference/generated/numpy.linspace.html

[4]. MATLAB linspace: http://www.mathworks.co.uk/help/matlab/ref/linspace.html

This member has not yet provided a Biography. Assume it's interesting and varied, and probably something to do with programming.