Click here to Skip to main content
13,731,369 members
Click here to Skip to main content
Add your own
alternative version

Stats

5.8K views
70 downloads
9 bookmarked
Posted 30 Dec 2017
Licenced CPOL

Pythonic Operators on STL Set Algorithms

, 16 Jun 2018
Rate this:
Please Sign up or sign in to vote.
Overloaded Operators to write concise code on STL Set Algorithms

Table of Contents

Introduction

Ever since I started working with Python and that have gotten me into a lot of thinking how to redesign my libraries to be pythonic, if I were to implement them from scratch again. In this first article of the series, I want to introduce Python's wonderful and intuitive operators for working with set algebra into C++ world. These operators are nothing more than syntatic-sugar to reduce the amount of code to write.

Table of Python Set Operators

Set Intersection

C++ Reference: std::set_intersection

  • Commutative

set_intersection is an algorithm that produces a set of elements which are common to both sets. It is commutative, meaning that even the 2 sets are switched places, the algorithm returns the same result.

void intersection_example()
{
    std::vector<int> v1{ 1,2,3,4,5,6,7,8 };
    std::vector<int> v2{         5,  7,  9,10 };
    std::sort(v1.begin(), v1.end());
    std::sort(v2.begin(), v2.end());

    std::vector<int> v_intersection;

    std::set_intersection(v1.begin(), v1.end(),
        v2.begin(), v2.end(),
        std::back_inserter(v_intersection));

    for (int n : v_intersection)
        std::cout << n << ' ';
}

Output

5 7

This is the example using & operator to do intersection.

void intersection_example()
{
    std::vector<int> v1{ 1,2,3,4,5,6,7,8 };
    std::vector<int> v2{         5,  7,  9,10 };

    std::vector<int> v_intersection = s(v1) & s(v2);

    for (int n : v_intersection)
        std::cout << n << ' ';
}

I skip showing the output of the operator example as it is the same.

s is a function, not class. If s is to be a class, to instantiate it, a container type would have to be specified (see below).

std::vector<int> v_intersection = s<std::vector<int>>(v1) & s<std::vector<int>>(v2);

In order to make use of automatic type deduction, s has to be a function that does nothing but returns the wrapper class.

#include <algorithm>
#include <iterator>

template<typename T>
struct wrapper
{
    wrapper(T& container) : cont(container) {}
    T& cont;
};

template<typename T>
wrapper<T> s(T& s_cont)
{
    return wrapper<T>(s_cont);
}

The & operator function checks whether to sort the container. Since std::sort only works with random access iterators, so we cannot use this function with STL list and slist which has non-random access iterators. In my 15 years of work, I have not seen a single use of list in any codebase.

template<typename T>
T operator&(wrapper<T>& left, wrapper<T>& right)
{
    T& c1 = left.cont;
    T& c2 = right.cont;
    if (!std::is_sorted(c1.begin(), c1.end()))
        std::sort(c1.begin(), c1.end());
    if (!std::is_sorted(c2.begin(), c2.end()))
        std::sort(c2.begin(), c2.end());

    T v_intersection;

    std::set_intersection(c1.begin(), c1.end(),
        c2.begin(), c2.end(),
        std::back_inserter(v_intersection));

    return v_intersection;
}

All set algorithm precondition requires the ranges to be sorted, hence this is_sorted check.

Set Union

C++ Reference: std::set_union

  • Commutative

set_union is an algorithm that produces a set of elements from both sets. For the elements appearing in intersection, it always picks them from the 1st set, not 2nd set.

void union_example()
{
    std::vector<int> v1 = { 1, 2, 3, 4, 5 };
    std::vector<int> v2 = {       3, 4, 5, 6, 7 };
    std::sort(v1.begin(), v1.end());
    std::sort(v2.begin(), v2.end());

    std::vector<int> dest1;

    std::set_union(v1.begin(), v1.end(),
        v2.begin(), v2.end(),
        std::back_inserter(dest1));

    for (const auto &i : dest1) {
        std::cout << i << ' ';
    }
    std::cout << '\n';
}

Output

1 2 3 4 5 6 7

The code required to write is much less, therefore the code is more concise.

void union_example()
{
    std::vector<int> v1 = { 1, 2, 3, 4, 5 };
    std::vector<int> v2 = {       3, 4, 5, 6, 7 };

    std::vector<int> dest1 = s(v1) | s(v2);

    for (int n : dest1)
        std::cout << n << ' ';
}

The | operator is almost similar to & operator except that algorithm is different.

template<typename T>
T operator|(wrapper<T>& left, wrapper<T>& right)
{
    T& c1 = left.cont;
    T& c2 = right.cont;
    if (!std::is_sorted(c1.begin(), c1.end()))
        std::sort(c1.begin(), c1.end());
    if (!std::is_sorted(c2.begin(), c2.end()))
        std::sort(c2.begin(), c2.end());

    T dest1;

    std::set_union(c1.begin(), c1.end(),
        c2.begin(), c2.end(),
        std::back_inserter(dest1));

    return dest1;
}

Set Difference

C++ Reference: std::set_difference

  • Non-Commutative

set_difference returns the elements in 1st set which is not in the 2nd set and is represented by minus operator in Python. For obvious reasons, the results are different when the arguments have swapped places. set_difference is non-commutative like minus operation.

void set_difference_example() 
{
    std::vector<int> v1{ 1, 2, 5, 5, 5,    9 };
    std::vector<int> v2{    2, 5,       7 };
    std::sort(v1.begin(), v1.end());
    std::sort(v2.begin(), v2.end());

    std::vector<int> diff;

    std::set_difference(v1.begin(), v1.end(), v2.begin(), v2.end(),
        std::inserter(diff, diff.begin()));

    for (auto i : v1) std::cout << i << ' ';
    std::cout << "minus ";
    for (auto i : v2) std::cout << i << ' ';
    std::cout << "is: ";

    for (auto i : diff) std::cout << i << ' ';
    std::cout << '\n';
}

Output

1 2 5 5 5 9 minus 2 5 7 is: 1 5 5 9

This is an example with minus operator.

void set_difference_example()
{
    std::vector<int> v1{ 1, 2, 5, 5, 5, 9 };
    std::vector<int> v2{    2, 5,       7 };

    std::vector<int> diff = s(v1) - s(v2);

    for (auto i : v1) std::cout << i << ' ';
    std::cout << "minus ";
    for (auto i : v2) std::cout << i << ' ';
    std::cout << "is: ";

    for (auto i : diff) std::cout << i << ' ';
    std::cout << '\n';
}

The code for minus operator is shown below:

template<typename T>
T operator-(wrapper<T>& left, wrapper<T>& right)
{
    T& c1 = left.cont;
    T& c2 = right.cont;
    if (!std::is_sorted(c1.begin(), c1.end()))
        std::sort(c1.begin(), c1.end());
    if (!std::is_sorted(c2.begin(), c2.end()))
        std::sort(c2.begin(), c2.end());

    T diff;

    std::set_difference(c1.begin(), c1.end(),
        c2.begin(), c2.end(),
        std::back_inserter(diff));

    return diff;
}

Set Symmetric Difference

C++ Reference: std::set_symmetric_difference

  • Commutative

set_symmetric_difference computes the elements in either set but not both.

void set_symmetric_difference_example()
{
    std::vector<int> v1{ 1,2,3,4,5,6,7,8 };
    std::vector<int> v2{         5,  7,  9,10 };
    std::sort(v1.begin(), v1.end());
    std::sort(v2.begin(), v2.end());

    std::vector<int> v_symDifference;

    std::set_symmetric_difference(
        v1.begin(), v1.end(),
        v2.begin(), v2.end(),
        std::back_inserter(v_symDifference));

    for (int n : v_symDifference)
        std::cout << n << ' ';
}

Output

1 2 3 4 6 8 9 10

set_symmetric_difference is represented by logical exclusive or operator.

void set_symmetric_difference_example()
{
    std::vector<int> v1{ 1,2,3,4,5,6,7,8 };
    std::vector<int> v2{         5,  7,  9,10 };

    std::vector<int> v_symDifference = s(v1) ^ s(v2);

    for (int n : v_symDifference)
        std::cout << n << ' ';
}

The code for logical exclusive or operator is shown below:

template<typename T>
T operator^(wrapper<T>& left, wrapper<T>& right)
{
    T& c1 = left.cont;
    T& c2 = right.cont;
    if (!std::is_sorted(c1.begin(), c1.end()))
        std::sort(c1.begin(), c1.end());
    if (!std::is_sorted(c2.begin(), c2.end()))
        std::sort(c2.begin(), c2.end());

    T v_symDifference;

    std::set_symmetric_difference(c1.begin(), c1.end(),
        c2.begin(), c2.end(),
        std::back_inserter(v_symDifference));

    return v_symDifference;
}

Superset and Subset

C++ Reference: std::includes

  • Non-Commutative

STL includes can be used to find out whether a set is a superset (returns a boolean). To check if it is a subset, just switch the 2 sets.

void is_superset_example()
{
    std::vector<char> v1{ 'a', 'b', 'c', 'f', 'h', 'x' };
    std::vector<char> v2{ 'a', 'b', 'c' };
    std::vector<char> v3{ 'a', 'c' };
    std::vector<char> v4{ 'g' };
    std::vector<char> v5{ 'a', 'c', 'g' };
    std::sort(v1.begin(), v1.end());
    std::sort(v2.begin(), v2.end());
    std::sort(v3.begin(), v3.end());
    std::sort(v4.begin(), v4.end());
    std::sort(v5.begin(), v5.end());

    for (auto i : v1) std::cout << i << ' ';
    std::cout << "\nincludes:\n" << std::boolalpha;

    for (auto i : v2) std::cout << i << ' ';
    std::cout << ": " 
              << std::includes(v1.begin(), v1.end(), v2.begin(), v2.end()) << '\n';
    for (auto i : v3) std::cout << i << ' ';
    std::cout << ": " 
              << std::includes(v1.begin(), v1.end(), v3.begin(), v3.end()) << '\n';
    for (auto i : v4) std::cout << i << ' ';
    std::cout << ": " 
              << std::includes(v1.begin(), v1.end(), v4.begin(), v4.end()) << '\n';
    for (auto i : v5) std::cout << i << ' ';
    std::cout << ": " 
              << std::includes(v1.begin(), v1.end(), v5.begin(), v5.end()) << '\n';

    auto cmp_nocase = [](char a, char b) {
        return std::tolower(a) < std::tolower(b);
    };

    std::vector<char> v6{ 'A', 'B', 'C' };
    for (auto i : v6) std::cout << i << ' ';
    std::cout << ": (case-insensitive) "
        << std::includes(v1.begin(), v1.end(), v6.begin(), v6.end(), cmp_nocase)
        << '\n';
}

Output

a b c f h x
includes:
a b c : true
a c : true
g : false
a c g : false
A B C : (case-insensitive) true

The >= operator example is below. The <= operator example is not shown in this article.

void is_superset_example()
{
    std::vector<char> v1{ 'a', 'b', 'c', 'f', 'h', 'x' };
    std::vector<char> v2{ 'a', 'b', 'c' };
    std::vector<char> v3{ 'a', 'c' };
    std::vector<char> v4{ 'g' };
    std::vector<char> v5{ 'a', 'c', 'g' };

    for (auto i : v1) std::cout << i << ' ';
    std::cout << "\nincludes:\n" << std::boolalpha;

    for (auto i : v2) std::cout << i << ' ';
    std::cout << ": " << (s(v1) >= s(v2)) << '\n';
    for (auto i : v3) std::cout << i << ' ';
    std::cout << ": " << (s(v1) >= s(v3)) << '\n';
    for (auto i : v4) std::cout << i << ' ';
    std::cout << ": " << (s(v1) >= s(v4)) << '\n';
    for (auto i : v5) std::cout << i << ' ';
    std::cout << ": " << (s(v1) >= s(v5)) << '\n';

    auto cmp_nocase = [](char a, char b) {
        return std::tolower(a) < std::tolower(b);
    };

    std::vector<char> v6{ 'A', 'B', 'C' };
    for (auto i : v6) std::cout << i << ' ';
    std::cout << ": (case-insensitive) "
        << std::includes(v1.begin(), v1.end(), v6.begin(), v6.end(), cmp_nocase)
        << '\n';
}

User cannot opt for use of a custom comparator in the >= and <= overloaded operators at the moment, as shown in the case-insensitive example. In this situation, includes has to be called directly.

// Returns true if left is superset of right?
template<typename T>
bool operator>=(wrapper<T>& left, wrapper<T>& right)
{
    T& c1 = left.cont;
    T& c2 = right.cont;

    if (!std::is_sorted(c1.begin(), c1.end()))
        std::sort(c1.begin(), c1.end());
    if (!std::is_sorted(c2.begin(), c2.end()))
        std::sort(c2.begin(), c2.end());

    return std::includes(
        c1.begin(), c1.end(),
        c2.begin(), c2.end());
}

// Returns true if left is subset of right?
template<typename T>
bool operator<=(wrapper<T>& left, wrapper<T>& right)
{
    T& c1 = left.cont;
    T& c2 = right.cont;

    if (!std::is_sorted(c1.begin(), c1.end()))
        std::sort(c1.begin(), c1.end());
    if (!std::is_sorted(c2.begin(), c2.end()))
        std::sort(c2.begin(), c2.end());

    return std::includes(
        c2.begin(), c2.end(),
        c1.begin(), c1.end());
}

"I Have No Use For All These!"

Before you are quick to exclaim that you have no use for these set algorithms, I like to show to you a typical selection example where you can use this. Imagine you are writing a subject enrollment website for college students. On the form, there are currently selected subjects which the student added, and the available subject dropdown which student can pick. It makes sense to remove subject from available dropdown after addition because you do not want the student to accidentally add the same subject twice. One way to compute leftover subjects available for selection, is to just subtract the selected set from the complete set of subjects with minus operator introduced in this article.

Article source code is hosted at Github

Article related to the performance of set-intersection: Intersection of ordered sets

History

  • 30 Dec 2017: First version
  • 16 Jun 2018: Version 1.1 Remove the std::move during return to enable the "Named Return Value Optimization"

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Share

About the Author

Shao Voon Wong
Software Developer (Senior)
Singapore Singapore
Right now, I am picking up DevOps skills at Pluralsight and pursuing CCNA certification. Stay tuned for my CCNA related article!

Coding Tidbit Blog

Latest blogpost: C++ – The Forgotten Trojan Horse by Eric Johnson

IT Certifications

  • IT Infrastructure Library Foundational (ITIL v3)
  • Scrum Alliance Certified Scrum Master (CSM)
  • Certified Secure Software Lifecycle Professional (CSSLP)

View my certificates here.

You may also be interested in...

Comments and Discussions

 
GeneralMy vote of 5 Pin
Degryse Kris1-Jan-18 23:37
memberDegryse Kris1-Jan-18 23:37 

General General    News News    Suggestion Suggestion    Question Question    Bug Bug    Answer Answer    Joke Joke    Praise Praise    Rant Rant    Admin Admin   

Use Ctrl+Left/Right to switch messages, Ctrl+Up/Down to switch threads, Ctrl+Shift+Left/Right to switch pages.

Permalink | Advertise | Privacy | Cookies | Terms of Use | Mobile
Web05-2016 | 2.8.180920.1 | Last Updated 16 Jun 2018
Article Copyright 2017 by Shao Voon Wong
Everything else Copyright © CodeProject, 1999-2018
Layout: fixed | fluid