Modern C++
Programming
19. Advanced Topics II
Federico Busato
2024-01-28
Table of Contents
1 Undefined Behavior
Undefined Behavior Common Cases
Detecting Undefined Behavior
2 Error Handing
C++ Exceptions
Defining Custom Exceptions
noexcept Keyword
Memory Allocation Issues
Alternative Error Handling Approaches
1/64
Table of Contents
3 C++ Idioms
Rule of Zero/Three/Five
Singleton
PIMPL
CRTP
Template Virtual Functions
2/64
Table of Contents
4 Smart pointers
std::unique ptr
std::shared ptr
std::weak ptr
5 Concurrency
Thread Methods
Mutex
Atomic
Task-based parallelism
3/64
Undefined Behavior
Undefined Behavior Overview
Undefined behavior means that the semantic of certain operations is
undefined/unspecified behavior: outside the language/library specification, two or
more choices
illegal, and the compiler presumes that such operations never happen
Motivations behind undefined behavior:
Compiler optimizations, e.g. signed overflow or NULL pointer dereferencing
Simplify compile checks
Some undefined behavior cases provide an implementation-defined behavior depending
on the compiler and platform. In this case, the code is not portable
What Every C Programmer Should Know About Undefined Behavior, Chris Lattner
What are all the common undefined behaviours that a C++ programmer should know
about?
4/64
Undefined Behavior Common Cases 1/5
const cast applied to a const variables
const int var = 3;
const_cast<int&>(var) = 4;
... // use var
Memory alignment
char* ptr = new char[512];
auto ptr2 = reinterpret_cast<uint64_t*>(ptr + 1);
ptr2[3]; // ptr2 is not aligned to 8 bytes (sizeof(uint64_t))
Memory initialization
int var; // undefined value
auto var2 = new int; // undefined value
5/64
Undefined Behavior Common Cases 2/5
Memory access-related
NULL pointer dereferencing
Out-of-bound access: the code could crash or not depending on the
platform/compiler
Platform specific behavior
Endianness
union U {
unsigned x;
char y;
};
Type definition
long x = 1ul << 32u; // different behavior depending on the OS
Intrinsic functions
6/64
Undefined Behavior Common Cases 3/5
Strict aliasing
float x = 3;
auto y = reinterpret_cast<unsigned&>(x);
// x, y break the strict aliasing rule
Lifetime issues
int* f() {
int tmp[10];
return tmp;
}
int* ptr = f();
ptr[0];
7/64
Undefined Behavior Common Cases 4/5
Operations unspecified behavior
A legal operation but the C++ standard does not document the result
Signed shift -2 x (before C++20), large-than-type shift 3u 32 ,
etc.
Arithmetic operation ordering f(i++, i++)
Function evaluation ordering
auto x = f() + g(); // C++ doesn't ensure that f() is evaluated before g()
Signed overflow
for (int i = 0; i < N; i++)
if N is INT MAX , the last iteration is undefined behavior. The compiler can assume that
the loop is finite and enable important optimizations, as opposite to unsigned (wrap
around)
8/64
Undefined Behavior Common Cases 5/5
One Definition Rule violation
Different definitions of inline functions in distinct translation units
Missing return statement
int f(float x) {
int y = x * 2;
}
Dangling reference
iint n = 1;
const int& r = std::max(n-1, n+1); // dagling
// GCC 13 experimental -Wdangling-reference (enabled by -Wall)
9/64
Detecting Undefined Behavior
There are several ways to detect undefined behavior at compile-time and at run-time:
Using GCC/Clang undefined behavior sanitizer (run-time check)
Static analysis tools
Use constexpr expressions as undefined behavior is not allowed
constexpr int x1 = 2147483647 + 1; // compile error
constexpr int x2 = (1 << 32); // compile error
constexpr int x3 = (1 << -1); // compile error
constexpr int x4 = 3 / 0; // compile error
constexpr int x5 = *((int*) nullptr) // compile error
constexpr int x6 = 6
constexpr float x7 = reinterpret_cast<float&>(x6); // compile error
Exploring Undefined Behavior Using Constexpr
10/64
Error Handing
Recoverable Error Handing
Recoverable Conditions that are not under the control of the program. They indicates
“exceptional” run-time conditions. e.g. file not found, bad allocation, wrong user
input, etc.
The common ways for handling recoverable errors are:
Exceptions Robust but slower and requires more resources
Error values Fast but difficult to handle in complex programs
Modern C++ best practices for exceptions and error handling
Back to Basics: Exceptions - CppCon2020
ISO C++ FAQ: Exceptions and Error Handling
Zero-overhead deterministic exceptions: Throwing values
C++ exceptions are becoming more and more problematic
11/64
C++ Exceptions - Advantages
C++ Exceptions provide a well-defined mechanism to detect errors passing the
information up the call stack
Exceptions cannot be ignored. Unhandled exceptions stop program execution
(call std::terminate() )
Intermediate functions are not forced to handle them. They don’t have to
coordinate with other layers and, for this reason, they provide good composability
Throwing an exception acts like a return statement destroying all objects in the
current scope
An exception enables a clean separation between the code that detects the error
and the code that handles the error
Exceptions work well with object-oriented semantic (constructor)
12/64
C++ Exceptions - Disadvantages 1/2
Code readability: Using exception can involve more code than the functionality
itself
Code comprehension: Exception control flow is invisible and it is not explicit in
the function signature
Performance: Extreme performance overhead in the failure case (violate the
zero-overhead principle)
Dynamic behavior: throw requires dynamic allocation and catch requires
RTTI. It is not suited for real-time, safety-critical, or embedded systems
Code bloat: Exceptions could increase executable size by 5-15% (or more*)
*Binary size and exceptions
13/64
C++ Exceptions - Disadvantages 2/2
14/64
C++ Exception Basics
C++ provides three keywords for exception handling:
throw Throws an exception
try Code block containing potential throwing expressions
catch Code block for handling the exception
void f() { throw 3; }
int main() {
try {
f();
} catch (int x) {
cout << x; // print "3"
}
}
15/64
std Exceptions
throw can throw everything such as integers, pointers, objects, etc. The standard
way consists in using the std library exceptions <stdexcept>
# include <stdexcept>
void f(bool b) {
if (b)
throw std::runtime_error("runtime error");
throw std::logic_error("logic error");
}
int main() {
try {
f(false);
} catch (const std::runtime_error& e) {
cout << e.what();
} catch (const std::exception& e) {
cout << e.what(); // print: "logic error"
}
}
16/64
Exception Capture
NOTE: C++, differently from other programming languages, does not require explicit
dynamic allocation with the keyword new for throwing an exception. The compiler
implicitly generates the appropriate code to construct and clean up the exception
object. Dynamically allocated objects require a delete call
The right way to capture an exception is by const -reference. Capturing by-value is
also possible but, it involves useless copy for non-trivial exception objects
catch(...) can be used to capture any thrown exception
int main() {
try {
throw "runtime error"; // throw const char*
} catch (...) {
cout << "exception"; // print "exception"
}
}
17/64
Exception Propagation
Exceptions are automatically propagated along the call stack. The user can also
control how they are propagated
int main() {
try {
...
} catch (const std::runtime_error& e) {
throw e; // propagate a copy of the exception
} catch (const std::exception& e) {
throw; // propagate the exception
}
}
18/64
Defining Custom Exceptions
# include <exception> // to not confuse with <stdexcept>
struct MyException : public std::exception {
const char* what() const noexcept override { // could be also "constexpr"
return "C++ Exception";
}
};
int main() {
try {
throw MyException();
} catch (const std::exception& e) {
cout << e.what(); // print "C++ Exception"
}
}
19/64
noexcept Keyword
C++03 allows listing the exceptions that a function might directly or indirectly throw,
e.g. void f() throw(int, const char*) {
C++11 deprecates throw and introduces the noexcept keyword
void f1(); // may throw
void f2() noexcept; // does not throw
void f3() noexcept(true); // does not throw
void f4() noexcept(false); // may throw
template<bool X>
void f5() noexcept(X); // may throw if X is false
If a noexcept function throw an exception, the runtime calls std::terminate()
noexcept should be used when throwing an exception is impossible or unacceptable.
It is also useful when the function contains code outside user control, e.g. std
functions/objects
20/64
Function-try-block
Exception handlers can be defined around the body of a function
void f() try {
... // do something
} catch (const std::runtime_error& e) {
cout << e.what();
} catch (...) { // other exception
...
}
21/64
Memory Allocation Issues 1/4
The new operator automatically throws an exception ( std::bad alloc ) if it cannot
allocate the memory
delete never throws an exception (unrecoverable error)
int main() {
int* ptr = nullptr;
try {
ptr = new int[1000];
}
catch (const std::bad_alloc& e) {
cout << "bad allocation: " << e.what();
}
delete[] ptr;
}
22/64
Memory Allocation Issues 2/4
C++ also provides an overload of the new operator with non-throwing memory
allocation
# include <new> // std::nothrow
int main() {
int* ptr = new (std::nothrow) int[1000];
if (ptr == nullptr)
cout << "bad allocation";
}
23/64
Memory Allocation Issues 3/4
Throwing exceptions in constructors is fine while it is not allowed in destructors
struct A {
A() { new int[10]; }
A() { throw -2; }
};
int main() {
try {
A a; // could throw "bad_alloc"
// "a" is out-of-scope -> throw 2
} catch (...) {
// two exceptions at the same time
}
}
Destructors should be marked noexcept
24/64
Memory Allocation Issues 4/4
struct A {
int* ptr1, *ptr2;
A() {
ptr1 = new int[10];
ptr2 = new int[10]; // if bad_alloc here, ptr1 is lost
}
};
struct A {
std::unique_ptr<int> ptr1, ptr2;
A() {
ptr1 = std::make_unique<int[]>(10);
ptr2 = std::make_unique<int[]>(10); // if bad_alloc here,
} // ptr1 is deallocated
};
25/64
Alternative Error Handling Approaches 1/2
Global state, e.g. errno
- Easily forget to check for failures
- Error propagation using if statements and early return is manual
- No compiler optimizations due to global state
Simple error code, e.g. int , enum , etc.
- Easily forget to check for failures (workaround [[nodiscard]] )
- Error propagation using if statements and early return is manual
- Potential error propagation through different contexts and losing initial error
information
- Constructor errors cannot be handled
26/64
Alternative Error Handling Approaches 2/2
std::error code , standardized error code
- Easily forget to check for failures (workaround [[nodiscard]] )
- Error propagation using if statements and early return is manual
- Code bloating for adding new enumerators (see Your own error code)
- Constructor errors cannot be handled
Supporting libraries, e.g. Boost Outcome, STX, etc.
- Require external dependencies
- Constructor errors cannot be handled in a direct way
- Extra logic for managing return values
27/64
C++ Idioms
Rule of Zero
The Rule of Zero is a rule of thumb for C++
Utilize the value semantics of existing types to avoid having to implement custom
copy and move operations
Note: many classes (such as std classes) manage resources themselves and should not
implement copy/move constructor and assignment operator
class X {
public:
X(...); // constructor
// NO need to define copy/move semantic
private:
std::vector<int> v; // instead raw allocation
std::unique_ptr<int> p; // instead raw allocation
}; // see smart pointer
28/64
Rule of Three
The Rule of Three is a rule of thumb for C++(03)
If your class needs any of
a copy constructor X(const X&)
an assignment operator X& operator=(const X&)
or a destructor X()
defined explicitly, then it is likely to need all three of them
Some resources cannot or should not be copied. In this case, they should be declared
as deleted
X(const X&) = delete
X& operator=(const X&) = delete
29/64
Rule of Five
The Rule of Five is a rule of thumb for C++11
If your class needs any of
a copy constructor X(const X&)
a move constructor X(X&&)
an assignment operator X& operator=(const X&)
an assignment operator X& operator=(X&&)
or a destructor X()
defined explicitly, then it is likely to need all five of them
30/64
Singleton
Singleton is a software design pattern that restricts the instantiation of a class to one
and only one object (a common application is for logging)
class Singleton {
public:
static Singleton& get_instance() { // note "static"
static Singleton instance { ..init.. } ;
return instance; // destroyed at the end of the program
} // initiliazed at first use
Singleton(const& Singleton) = delete;
void operator=(const& Singleton) = delete;
void f() {}
private:
T _data;
Singleton( ..args.. ) { ... } // used in the initialization
}
31/64
PIMPL - Compilation Firewalls
Pointer to IMPLementation (PIMPL) idiom allows decoupling the interface from
the implementation in a clear way
header.hpp
class A {
public:
A();
A();
void f();
private:
class Impl; // forward declaration
Impl* ptr; // opaque pointer
};
NOTE: The class does not expose internal data members or methods
32/64
PIMPL - Implementation
source.cpp (Impl actual implementation)
class A::Impl { // could be a class with a complex logic
public:
void internal_f() {
..do something..
}
private:
int _data1;
float _data2;
};
A::A() : ptr{new Impl()} {}
A::A() { delete ptr; }
void A::f() { ptr->internal_f(); }
33/64
PIMPL - Advantages, Disadvantages
Advantages:
ABI stability
Hide private data members and methods
Reduce compile type and dependencies
Disadvantages:
Manual resource management
- Impl* ptr can be replaced by unique ptr<impl> ptr in C++11
Performance: pointer indirection + dynamic memory
- dynamic memory could be avoided by using a reserved space in the interface e.g.
uint8 t data[1024]
34/64
PIMPL - Implementation Alternatives
What parts of the class should go into the Impl object?
Put all private and protected members into Impl :
Error prone. Inheritance is hard for opaque objects
Put all private members (but not functions) into Impl :
Good. Do we need to expose all functions?
Put everything into Impl , and write the public class itself as only the public
interface, each implemented as a simple forwarding function:
Good
https://herbsutter.com/gotw/ 100/
35/64
CRTP 1/3
The Curiously Recurring Template Pattern (CRTP) is an idiom in which a class
X derives from a class template instantiation using X itself as template argument
A common application is static polymorphism
template <class T>
struct Base {
void my_method() {
static_cast<T*>(this)->my_method_impl();
}
};
class Derived : public Base<Derived> {
// void my method() is inherited
void my_method_impl() { ... } // private method
};
36/64
CRTP 2/3
# include <iostream>
template <typename T>
struct Writer {
void write(const char* str) {
static_cast<const T*>(this)->write_impl(str);
}
};
class CerrWriter : public Writer<CerrWriter> {
void write_impl(const char* str) { std::cerr << str; }
};
class CoutWriter : public Writer<CoutWriter> {
void write_impl(const char* str) { std::cout << str; }
};
CoutWriter x;
CerrWriter y;
x.write("abc");
y.write("abc");
37/64
CRTP 3/3
template <typename T>
void f(Writer<T>& writer) {
writer.write("abc);
}
CoutWriter x;
CerrWriter y;
f(x);
f(y);
CRTP C++ Examples
38/64
Template Virtual Function 1/3
Virtual functions cannot have template arguments, but they can be emulated by
using the following pattern
class Base {
public:
template<typename T>
void method(T t) {
v_method(t); // call the actual implementation
}
protected:
virtual void v_method(int t) = 0; // v_method is valid only
virtual void v_method(double t) = 0; // for "int" and "double"
};
39/64
Template Virtual Function 2/3
Actual implementations for derived class A and B
class AImpl : public Base {
protected:
template<typename T>
void t_method(T t) { // template "method()" implementation for A
std::cout << "A " << t << std::endl;
}
};
class BImpl : public Base {
protected:
template<typename T>
void t_method(T t) { // template "method()" implementation for B
std::cout << "B " << t << std::endl;
}
};
40/64
Template Virtual Function 3/3
template<class Impl>
class DerivedWrapper : public Impl {
private:
void v_method(int t) override {
Impl::t_method(t);
}
void v_method(double t) override {
Impl::t_method(t);
} // call the base method
};
using A = DerivedWrapper<AImpl>;
using B = DerivedWrapper<BImpl>;
int main(int argc, char* argv[]) {
A a;
B b;
Base* base = nullptr;
base = &a;
base->method(1); // print "A 1"
base->method(2.0); // print "A 2.0"
base = &b;
base->method(1); // print "B 1"
base->method(2.0); // print "B 2.0"
}
method() calls v method() (pure virtual method of Base )
v method() calls t method() (actual implementation)
41/64
Smart pointers
Smart Pointers
Smart pointer is a pointer-like type with some additional functionality, e.g. automatic
memory deallocation (when the pointer is no longer in use, the memory it points to is
deallocated), reference counting, etc.
C++11 provides three smart pointer types:
std::unique ptr
std::shared ptr
std::weak ptr
Smart pointers prevent most situations of memory leaks by making the memory
deallocation automatic
C++ Smart Pointers
42/64
Smart Pointers Benefits
If a smart pointer goes out-of-scope, the appropriate method to release resources
is called automatically. The memory is not left dangling
Smart pointers will automatically be set to nullptr if not initialized or when
memory has been released
std::shared ptr provides automatic reference count
If a special delete function needs to be called, it will be specified in the pointer
type and declaration, and will automatically be called on delete
43/64
std::unique ptr - Unique Pointer 1/4
std::unique ptr is used to manage any dynamically allocated object that is not
shared by multiple objects
# include <iostream>
# include <memory>
struct A {
A() { std::cout << "Constructor\n"; } // called when A()
A() { std::cout << "Destructor\n"; } // called when u_ptr1,
}; // u_ptr2 are out-of-scope
int main() {
auto raw_ptr = new A();
std::unique_ptr<A> u_ptr1(new A());
std::unique_ptr<A> u_ptr2(raw_ptr);
// std::unique_ptr<A> u_ptr3(raw_ptr); // no compile error, but wrong!! (not unique)
// u_ptr1 = raw_ptr; // compile error (not unique)
// u_ptr1 = u_ptr2; // compile error (not unique)
u_ptr1 = std::move(u_ptr2); // delete u_ptr1;
} // u_ptr1 = u_ptr2;
// u_ptr2 = nullptr
44/64
std::unique ptr - Unique Pointer 2/4
std::unique ptr methods
get() returns the underlying pointer
operator* operator-> dereferences pointer to the managed object
operator[] provides indexed access to the stored array (if it supports random
access iterator)
release() returns a pointer to the managed object and releases the ownership
reset(ptr) replaces the managed object with ptr
Utility method: std::make unique<T>() creates a unique pointer to a class T
that manages a new object
45/64
std::unique ptr - Unique Pointer 3/4
# include <iostream>
# include <memory>
struct A {
int value;
};
int main() {
std::unique_ptr<A> u_ptr1(new A());
u_ptr1->value; // dereferencing
(*u_ptr1).value; // dereferencing
auto u_ptr2 = std::make_unique<A>(); // create a new unique pointer
u_ptr1.reset(new A()); // reset
auto raw_ptr = u_ptr1.release(); // release
delete[] raw_ptr;
std::unique_ptr<A[]> u_ptr3(new A[10]);
auto& obj = u_ptr3[3]; // access
}
46/64
std::unique ptr - Unique Pointer 4/4
Implement a custom deleter
# include <iostream>
# include <memory>
struct A {
int value;
};
int main() {
auto DeleteLambda = [](A* x) {
std::cout << "delete" << std::endl;
delete x;
};
std::unique_ptr<A, decltype(DeleteLambda)>
x(new A(), DeleteLambda);
} // print "delete"
47/64
std::shared ptr - Shared Pointer 1/3
std::shared ptr is the pointer type to be used for memory that can be owned by
multiple resources at one time
std::shared ptr maintains a reference count of pointer objects. Data managed by
std::shared ptr is only freed when there are no remaining objects pointing to the data
# include <iostream>
# include <memory>
struct A {
int value;
};
int main() {
std::shared_ptr<A> sh_ptr1(new A());
std::shared_ptr<A> sh_ptr2(sh_ptr1);
std::shared_ptr<A> sh_ptr3(new A());
sh_ptr3 = nullptr; // allowed, the underlying pointer is deallocated
// sh_ptr3 : zero references
sh_ptr2 = sh_ptr1; // allowed. sh_ptr1, sh_ptr2: two references
sh_ptr2 = std::move(sh_ptr1); // allowed // sh_ptr1: zero references
} // sh_ptr2: one references
48/64
std::shared ptr - Shared Pointer 2/3
std::shared ptr methods
get() returns the underlying pointer
operator* operator-> dereferences pointer to the managed object
use count() returns the number of objects referring to the same managed
object
reset(ptr) replaces the managed object with ptr
Utility method: std::make shared() creates a shared pointer that manages a new
object
49/64
std::shared ptr - Shared Pointer 3/3
# include <iostream>
# include <memory>
struct A {
int value;
};
int main() {
std::shared_ptr<A> sh_ptr1(new A());
auto sh_ptr2 = std::make_shared<A>(); // std::make_shared
std::cout << sh_ptr1.use_count(); // print 1
sh_ptr1 = sh_ptr2; // copy
// std::shared_ptr<A> sh_ptr2(sh_ptr1); // copy (constructor)
std::cout << sh_ptr1.use_count(); // print 2
std::cout << sh_ptr2.use_count(); // print 2
auto raw_ptr = sh_ptr1.get(); // get
sh_ptr1.reset(new A()); // reset
(*sh_ptr1).value = 3; // dereferencing
sh_ptr1->value = 2; // dereferencing
}
50/64
std::weak ptr - Weak Pointer 1/3
A std::weak ptr is simply a std::shared ptr that is allowed to dangle (pointer
not deallocated)
# include <memory>
std::shared_ptr<int> sh_ptr(new int);
std::weak_ptr<int> w_ptr = sh_ptr;
sh_ptr = nullptr;
cout << w_ptr.expired(); // print 'true'
51/64
std::weak ptr - Weak Pointer 2/3
It must be converted to std::shared ptr in order to access the referenced object
std::weak ptr methods
use count() returns the number of objects referring to the same managed
object
reset(ptr) replaces the managed object with ptr
expired() checks whether the referenced object was already deleted (true,
false)
lock() creates a std::shared ptr that manages the referenced object
52/64
std::weak ptr - Weak Pointer 3/3
# include <memory>
auto sh_ptr1 = std::make_shared<int>();
cout << sh_ptr1.use_count(); // print 1
std::weak_ptr<int> w_ptr = sh_ptr1;
cout << w_ptr.use_count(); // print 1
auto sh_ptr2 = w_ptr.lock();
cout << w_ptr.use_count(); // print 2 (sh_ptr1 + sh_ptr2)
sh_ptr1 = nullptr;
cout << w_ptr.expired(); // print false
sh_ptr2 = nullptr;
cout << w_ptr.expired(); // print true
53/64
Concurrency
Overview
C++11 introduces the Concurrency library to simplify managing OS threads
# include <iostream>
# include <thread>
void f() {
std::cout << "first thread" << std::endl;
}
int main(){
std::thread th(f);
th.join(); // stop the main thread until "th" complete
}
How to compile:
$g++ -std=c++11 main.cpp -pthread
54/64
Example
# include <iostream>
# include <thread>
# include <vector>
void f(int id) {
std::cout << "thread " << id << std::endl;
}
int main() {
std::vector<std::thread> thread_vect; // thread vector
for (int i = 0; i < 10; i++)
thread_vect.push_back( std::thread(&f, i) );
for (auto& th : thread_vect)
th.join();
thread_vect.clear();
for (int i = 0; i < 10; i++) { // thread + lambda expression
thread_vect.push_back(
std::thread( [](){ std::cout << "thread\n"; } );
}
}
55/64
Thread Methods 1/2
Library methods:
std::this thread::get id() returns the thread id
std::thread::sleep for( sleep duration )
Blocks the execution of the current thread for at least the specified sleep duration
std::thread::hardware concurrency() returns the number of concurrent threads
supported by the implementation
Thread object methods:
get id() returns the thread id
join() waits for a thread to finish its execution
detach() permits the thread to execute independently from the thread handle
56/64
Thread Methods 2/2
# include <chrono> // the following program should (not deterministic)
# include <iostream> // produces the output:
# include <thread> // child thread exit
// main thread exit
int main() {
using namespace std::chrono_literals;
std::cout << std::this_thread::get_id();
std::cout << std::thread::hardware_concurrency(); // e.g. print 6
auto lambda = []() {
std::this_thread::sleep_for(1s); // t2
std::cout << "child thread exit\n";
};
std::thread child(lambda);
child.detach(); // without detach(), child must join() the
// main thread (run-time error otherwise)
std::this_thread::sleep_for(2s); // t1
std::cout << "main thread exit\n";
}
// if t1 < t2 the should program prints:
// main thread exit
57/64
Parameters Passing
Parameters passing by-value or by-pointer to a thread function works in the same way
of a standard function. Pass-by-reference requires a special wrapper ( std::ref ,
std::cref ) to avoid wrong behaviors
# include <iostream>
# include <thread>
void f(int& a, const int& b) {
a = 7;
const_cast<int&>(b) = 8;
}
int main() {
int a = 1, b = 2;
std::thread th1(f, a, b); // wrong!!!
std::cout << a << ", " << b << std::endl; // print 1, 2!!
std::thread th2(f, std::ref(a), std::cref(b)); // correct
std::cout << a << ", " << b << std::endl; // print 7, 8!!
th1.join(); th2.join();
}
58/64
Mutex (The Problem) 1/3
The following code produces (in general) a value < 1000:
# include <chrono>
# include <iostream>
# include <thread>
# include <vector>
void f(int& value) {
for (int i = 0; i < 10; i++) {
value++;
std::this_thread::sleep_for(std::chrono::milliseconds(10));
}
}
int main() {
int value = 0;
std::vector<std::thread> th_vect;
for (int i = 0; i < 100; i++)
th_vect.push_back( std::thread(f, std::ref(value)) );
for (auto& it : th_vect)
it.join();
std::cout << value;
}
59/64
Mutex 2/3
C++11 provide the mutex class as synchronization primitive to protect shared data
from being simultaneously accessed by multiple threads
mutex methods:
lock() locks the mutex, blocks if the mutex is not available
try lock() tries to lock the mutex, returns if the mutex is not available
unlock() unlocks the mutex
More advanced mutex can be found here: en.cppreference.com/w/cpp/thread
C++ includes three mutex wrappers to provide safe copyable/movable objects:
lock guard (C++11) implements a strictly scope-based mutex ownership
wrapper
unique lock (C++11) implements movable mutex ownership wrapper
shared lock (C++14) implements movable shared mutex ownership wrapper
60/64
Mutex 3/3
# include <thread> // iostream, vector, chrono
void f(int& value, std::mutex& m) {
for (int i = 0; i < 10; i++) {
m.lock();
value++; // other threads must wait
m.unlock();
std::this_thread::sleep_for(std::chrono::milliseconds(10));
}
}
int main() {
std::mutex m;
int value = 0;
std::vector<std::thread> th_vect;
for (int i = 0; i < 100; i++)
th_vect.push_back( std::thread(f, std::ref(value), std::ref(m)) );
for (auto& it : th_vect)
it.join();
std::cout << value;
}
61/64
Atomic
std::atomic (C++11) class template defines an atomic type that are implemented
with lock-free operations (much faster than locks)
# include <atomic> // chrono, iostream, thread, vector
void f(std::atomic<int>& value) {
for (int i = 0; i < 10; i++) {
value++;
std::this_thread::sleep_for(std::chrono::milliseconds(10));
}
}
int main() {
std::atomic<int> value(0);
std::vector<std::thread> th_vect;
for (int i = 0; i < 100; i++)
th_vect.push_back( std::thread(f, std::ref(value)) );
for (auto& it : th_vect)
it.join();
std::cout << value; // print 1000
}
62/64
Task-based parallelism 1/2
The future library provides facilities to obtain values that are returned and to catch
exceptions that are thrown by asynchronous tasks
Asynchronous call: std::future async(function, args...)
runs a function asynchronously (potentially in a new thread)
and returns a std::future object that will hold the result
std::future methods:
T get() returns the result
wait() waits for the result to become available
async() can be called with two launch policies for a task executed:
std::launch::async a new thread is launched to execute the task asynchronously
std::launch::deferred the task is executed on the calling thread the first time its
result is requested (lazy evaluation)
63/64
Task-based parallelism 2/2
# include <future> // numeric, algorithm, vector, iostream
template <typename RandomIt>
int parallel_sum(RandomIt beg, RandomIt end) {
auto len = end - beg;
if (len < 1000) // base case
return std::accumulate(beg, end, 0);
RandomIt mid = beg + len / 2;
auto handle = std::async(std::launch::async, // right side
parallel_sum<RandomIt>, mid, end);
int sum = parallel_sum(beg, mid); // left side
return sum + handle.get(); // left + right
}
int main() {
std::vector<int> v(10000, 1); // init all to 1
std::cout << "The sum is " << parallel_sum(v.begin(), v.end());
}
64/64