Multithreading is hard. C++ is hard too. I will demonstrate how using Rust instead lets us avoid some of the pitfalls of C++ and makes multithreading at least a little easier.
Some of the C++ examples listed here are modified versions of examples from "Top 20 C++ multithreading mistakes and how to avoid them" by Deb Haldar; others I made up myself.
Keep in mind that all of the examples listed here are toy examples: in real-world codebases, multithreading is usually much more complicated than this, and preventing these mistakes in C++ may be that much more difficult. In Rust, the compiler will still check what you’re doing even in those complicated situations, preventing you from making these mistakes.
1. Race conditions
#include <iostream>
#include <string>
#include <thread>

int main() {
    auto data = std::string{"Hello, world!"};
    auto thread1 = std::thread([&] {
        data = std::string{"AAAAAAAAAAAAAAAAAAAAAAAA!"};
    });
    auto thread2 = std::thread([&] {
        for (auto&& c : data) {
            c += 1;
        }
    });
    thread1.join();
    thread2.join();
    std::cout << data << '\n';
}
In C++, nothing stops us from introducing race conditions, and in the worst cases we even access invalid memory. If thread1 assigns to data during thread2’s loop, thread2 can easily access memory that has been freed. In large codebases, it can be difficult to tell which classes’ methods are being called from multiple threads, even if we are otherwise cautious about using std::atomic or std::mutex.
fn main() {
    let mut data = "Hello, world!".to_owned();
    // error[E0373]: closure may outlive the current function,
    // but it borrows `data`, which is owned by the current function
    std::thread::spawn(||
        data = "AAAAAAAAAAAAAAAAAAAAAAAA!".to_owned()
    );
    // error[E0502]: cannot borrow `data` as immutable
    // because it is also borrowed as mutable
    // error[E0373]: closure may outlive the current function,
    // but it borrows `data`, which is owned by the current function
    std::thread::spawn(|| {
        for x in data.chars() {
            println!("{}", x);
        }
    });
}
Rust will not let us do this. In Rust, any variable can have an unlimited number of immutable references, or a single mutable reference. This means that we cannot have these kinds of race condition errors in safe Rust code. We have a number of ways to do this correctly; let’s look at mutexes and channels.
use std::sync::{Arc, Mutex};

fn main() {
    let shared_data = Arc::new(Mutex::new(
        "AAAAAAAAAAAAAAAAAAAAAAAA!".to_owned()
    ));
    let t1 = {
        let shared_data = shared_data.clone();
        std::thread::spawn(move || {
            let mut data = shared_data.lock().unwrap();
            *data = "Hello, world!\n".to_owned();
        })
    };
    let t2 = {
        let shared_data = shared_data.clone();
        std::thread::spawn(move || {
            let mut data = shared_data.lock().unwrap();
            *data = "Goodbye, world!\n".to_owned();
        })
    };
    t1.join().unwrap();
    t2.join().unwrap();
    println!("{}", shared_data.lock().unwrap());
}
In this example, we use a thread-safe reference-counting pointer to store our data to ensure that our data lives long enough (try using the single-threaded reference-counting pointer in this example to see if we can get a race condition inside the shared pointer), even if we decide to detach our threads. If we only passed immutable data, this would be enough. But since we want to mutate our shared state, we need to wrap it in one of Rust’s synchronization primitives (see std::sync::Mutex, parking_lot::Mutex, RwLock and atomic types).
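For simple shared state such as a counter, an atomic type can stand in for the mutex. The following is a minimal sketch, not part of the original examples: the helper name `count_with_atomics` and its parameters are made up for illustration.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

// Hypothetical helper: spawns `threads` workers that each increment
// a shared counter `increments` times, then returns the final count.
fn count_with_atomics(threads: usize, increments: usize) -> usize {
    let counter = Arc::new(AtomicUsize::new(0));
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            // Each worker gets its own reference-counted handle to the counter.
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..increments {
                    // An atomic read-modify-write; no lock is needed.
                    counter.fetch_add(1, Ordering::Relaxed);
                }
            })
        })
        .collect();
    // Joining the workers makes all of their increments visible here.
    for handle in handles {
        handle.join().unwrap();
    }
    counter.load(Ordering::Relaxed)
}

fn main() {
    println!("{}", count_with_atomics(4, 1000));
}
```

Relaxed ordering is enough in this sketch because only the counter itself is shared and the final read happens after every worker has been joined.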
use std::{sync::mpsc::channel, thread};

fn main() {
    let (s, r) = channel();
    let (s2, r2) = channel();
    thread::spawn(move || s.send("Hello, ".to_owned()).unwrap());
    thread::spawn(move || s2.send("world!").unwrap());
    let message = r.recv().unwrap() + r2.recv().unwrap();
    println!("{}", message);
}
For passing data between threads, we can use channels instead. These channels will only let us send types that are safe to send; if we tried to send a type that wasn’t safe to use across threads (such as Rc), we would get a compiler error. The standard library’s channels support many producers but only a single consumer; for a faster implementation that also supports many consumers, see crossbeam_channel.
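Several producers can also share one channel by cloning the sender. The sketch below assumes a made-up helper, `collect_from_workers`, in which each worker sends the square of its id; it is only meant to illustrate the pattern.

```rust
use std::sync::mpsc::channel;
use std::thread;

// Hypothetical helper: each worker sends the square of its id
// through a clone of the same sender.
fn collect_from_workers(workers: u32) -> Vec<u32> {
    let (sender, receiver) = channel();
    for id in 0..workers {
        let sender = sender.clone();
        thread::spawn(move || sender.send(id * id).unwrap());
    }
    // Drop the original sender so the receiver's iterator ends
    // once every worker has finished and dropped its clone.
    drop(sender);
    let mut results: Vec<u32> = receiver.iter().collect();
    // Messages arrive in no particular order, so sort for a stable result.
    results.sort();
    results
}

fn main() {
    println!("{:?}", collect_from_workers(4));
}
```

Dropping the original sender matters here: the receiving iterator only finishes when every sender is gone.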
2. Lifetimes and references
#include <chrono>
#include <string>
#include <thread>

int main() {
    {
        auto data = std::string{"DATA"};
        std::thread([&] { data.push_back('!'); }).detach();
    }
    std::this_thread::sleep_for(std::chrono::milliseconds(500));
}
In this example, the thread we spawn may be accessing invalid memory: the data’s destructor may have been called by the time the thread starts accessing the variable. We need to make sure that the data we are using in our threads outlives the thread itself. In many different scenarios (and not exclusively multithreaded ones), we pass references to data in C++ that has to stay alive until the user of that reference is done with it, and we have no automated way of checking that it does.
fn main() {
    let data = "AAAAAAAAAAAAAAAAAAAAAAAA!".to_owned();
    // error[E0373]: closure may outlive the current function,
    // but it borrows `data`, which is owned by the current function
    std::thread::spawn(|| println!("{}", data));
}
As we saw in the example from part 1, the compiler will complain that the borrowed variables don’t live long enough. Since threads can be detached, any data that these threads reference must live for the entire duration of the program.
use std::sync::Arc;

fn main() {
    let data = Arc::new("AAAAAAAAAAAAAAAAAAAAAAAA!".to_owned());
    {
        let data = data.clone();
        std::thread::spawn(move || println!("{}", data));
    }
}
As we saw in the previous examples, we can use shared pointers to ensure our data lives long enough.
use crossbeam_utils::thread;

fn main() {
    let data = "DATA".to_owned();
    thread::scope(|s| {
        let data = &data;
        for c in data.chars() {
            s.spawn(move |_| {
                println!("{}: {}", data, c);
            });
        }
    }).unwrap();
}
We can also use scoped threads from crossbeam-utils, which are guaranteed to be joined before their scope ends.
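Since Rust 1.63, the standard library offers a similar facility, std::thread::scope, so borrowing local data from threads no longer requires an external crate. A minimal sketch, assuming a made-up helper `total_len` that measures each word on its own thread:

```rust
use std::thread;

// Hypothetical helper: measures each word on its own scoped thread
// and returns the sum of the lengths.
fn total_len(words: &[String]) -> usize {
    thread::scope(|s| {
        // Each thread borrows one word; the scope guarantees that all
        // threads are joined before `words` could go away.
        let handles: Vec<_> = words
            .iter()
            .map(|word| s.spawn(move || word.len()))
            .collect();
        handles.into_iter().map(|h| h.join().unwrap()).sum()
    })
}

fn main() {
    let words = vec!["DATA".to_owned(), "!".to_owned()];
    println!("{}", total_len(&words));
}
```

Unlike the crossbeam version, the closure passed to spawn takes no argument, and the scope call returns the closure’s value directly instead of a Result.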
3. Handling errors from threads
#include <exception>
#include <iostream>
#include <stdexcept>
#include <thread>

static std::exception_ptr ep = nullptr;

int main() {
    std::thread([] {
        try {
            throw std::runtime_error("error");
        } catch (...) {
            ep = std::current_exception();
        }
    }).join();
    if (ep) {
        try {
            std::rethrow_exception(ep);
        } catch(std::runtime_error& e) {
            std::cout << e.what() << "\n";
        }
    }
}
The example above shows the minimal code for handling exceptions from a single thread. I think it’s fair to say that handling errors from threads in C++ is fairly complicated.
fn main() {
    let thread = std::thread::spawn(|| {
        // ...
        // change the integer here to see the other results
        match 0 {
            0 => Ok("ok"),
            1 => Err("error"),
            _ => panic!("something went wrong"),
        }
    });
    match thread.join() {
        // Ok return, no panic
        Ok(Ok(x)) => println!("OK {}", x),
        // Error return, no panic
        Ok(Err(x)) => println!("Error: {}", x),
        Err(_) => println!("Thread panicked"),
    }
}
In Rust, we can choose to handle panics from threads (or just call .unwrap() to terminate if we assert the thread can never panic), and we may return a Result<T, E> to signal that the thread may fail. Please note that for most threads, we will not need to handle either of these.
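When a thread panics, join() hands back the panic payload as a boxed Any value, and the message can often be recovered by downcasting. The sketch below uses a made-up helper, `describe_panic`, and assumes the payload is either a &str or a String, which is what panic! normally produces.

```rust
use std::thread;

// Hypothetical helper: joins a thread and describes how it ended.
fn describe_panic(handle: thread::JoinHandle<()>) -> String {
    match handle.join() {
        Ok(()) => "finished".to_owned(),
        Err(payload) => {
            // panic!("literal") carries a &'static str;
            // panic!("{}", value) carries a String.
            if let Some(msg) = payload.downcast_ref::<&str>() {
                format!("panicked: {}", msg)
            } else if let Some(msg) = payload.downcast_ref::<String>() {
                format!("panicked: {}", msg)
            } else {
                "panicked with a non-string payload".to_owned()
            }
        }
    }
}

fn main() {
    let handle = thread::spawn(|| {
        panic!("something went wrong");
    });
    println!("{}", describe_panic(handle));
}
```

The default panic hook still prints the panic message to stderr; the downcast only determines what the joining thread does with it.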
4. Join()ing and detach()ing threads
#include <iostream>
#include <thread>

int main() {
    std::thread([] { std::cout << "Hello!"; });
    return 0;
}
In C++, if you forget to join your threads, your program will terminate. The destructor of a std::thread calls std::terminate if the thread is still joinable, that is, if it has been neither joined nor detached, regardless of whether it is still running.
fn main() {
    std::thread::spawn(|| println!("Hello!"));
}
In Rust, threads implicitly detach when their handles are dropped, so this mistake is impossible to make. (Note that you may not see “Hello!” in your stdout, because the process exits when the main thread returns, possibly before the detached thread has run.)
When I don’t save the thread handle anywhere, I expect the thread to detach; otherwise I would not be dropping the handle. C++ treats this intuitive behavior as an unrecoverable runtime error, even though the API could have made the erroneous state unrepresentable.
5. Join()ing detach()ed threads
#include <iostream>
#include <thread>

int main() {
    std::thread t {[] { std::cout << "Hello!"; }};
    t.detach();
    t.join();
    return 0;
}
Attempting to join a detached thread throws a std::system_error, which crashes the program if it is not caught.
fn main() {
    let thread = std::thread::spawn(|| println!("Hello!"));
    thread.join().unwrap();
    // no detach
}
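This problem doesn’t exist in Rust: there is no separate detach call, because a thread is detached by dropping its handle, and join() consumes the handle, so a detached thread has no handle left to join.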
6. Passing parameters by reference
#include <iostream>
#include <string>
#include <thread>

int main() {
    std::string hello {"Hello!"};
    std::thread {
        [&](const std::string& hi) {
            std::cout << std::boolalpha << (&hi == &hello);
        },
        hello
    }.join();
    return 0;
}
std::thread passes its arguments by value even in cases where you would not expect it to (std::ref would fix this).
fn main() {
    let hello = "Hello!".to_owned();
    std::thread::spawn(move || println!("{}", hello))
        .join()
        .unwrap();
}
In Rust, you can’t pass arguments to threads. Instead, you borrow (with scoped threads) or move your values into your closures.
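To illustrate the borrowing option, here is a minimal sketch using std::thread::scope from the standard library (Rust 1.63 or later). The helper name `shout` is made up for the example. The scoped thread mutates the caller’s string through a borrow, with no copy and no std::ref equivalent needed.

```rust
use std::thread;

// Hypothetical helper: appends '!' to `greeting` from another thread.
fn shout(mut greeting: String) -> String {
    thread::scope(|s| {
        // The closure borrows `greeting` mutably; the compiler rejects
        // any other access to it until the scope has ended.
        s.spawn(|| greeting.push('!'));
    });
    // The scope has joined its thread, so the borrow is over.
    greeting
}

fn main() {
    println!("{}", shout("Hello".to_owned()));
}
```

Whether a value is borrowed or moved is visible in the closure itself, so there is no hidden copy to be surprised by.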
Bonus: returning calculated values from threads
If we want to use a thread in C++ to return a value properly, we need to use some extra synchronization mechanism. I will present two of the most obvious answers: storing through a reference, or std::future.
#include <chrono>
#include <future>
#include <thread>
#include <optional>
#include <iostream>
#include <string>

using namespace std::chrono_literals;

// this only works for simple scenarios
void reference_store() {
    auto data = std::optional<std::string>{std::nullopt};
    auto t = std::thread([&] {
        // calculate the meaning of life
        std::this_thread::sleep_for(500ms);
        data = "42";
    });
    t.join();
    std::cout << *data << '\n';
}

void future() {
    auto promise = std::promise<std::string>{};
    auto future = promise.get_future();
    auto t = std::thread([](auto promise) {
        // calculate the meaning of life
        std::this_thread::sleep_for(500ms);
        promise.set_value("42");
    }, std::move(promise));
    std::cout << future.get() << '\n';
    t.join();
}

int main() {
    reference_store();
    future();
}
Rust’s threads can return values directly: whatever the closure returns comes back through join(), so this operation is built into threads.
fn main() {
    let thread = std::thread::spawn(|| {
        // calculate the meaning of life
        std::thread::sleep(std::time::Duration::from_millis(500));
        "42".to_owned()
    });
    let result = thread.join().unwrap();
    println!("{}", result);
}