Gianni Rodari, a famous Italian writer, wrote in a short poem: “You never finish learning and what you don’t know is more important than what you know”. I’m not sure whether he had C++ in mind, but that sentence certainly fits this programming language quite well.
The program didn’t connect on the second attempt, but the bug was quite easy to spot: the retry loop started by instantiating a std::promise, which was captured by reference by the lambda used as a callback from the network layer.
    while( attemptsToGo > 0 )
    {
        std::promise<int> p;
        connect( [&p]( ConnectionResult cr ){ ... p.set_value(x); ... } );
        auto f = p.get_future();
        f.wait_for( timeout );
        ...
    }
The mistake was that, when the second iteration of the loop was entered, the promise instance from the first iteration was gone, and the callback trying to access the invalid instance wreaked havoc. Easy peasy, I thought, just capture by copy… er, well, but you can’t copy a promise. This makes sense as soon as you ponder the question: a std::promise is a placeholder for a value that will appear in the future. Duplicating a promise makes no sense.
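To make the "no copy, only move" rule concrete, here is a minimal sketch. The helper name fulfilThroughMovedPromise is made up for this illustration; only std::promise, std::future and the type traits come from the standard library.

```cpp
#include <cassert>
#include <future>
#include <type_traits>
#include <utility>

// std::promise deletes its copy operations but keeps the move ones.
static_assert(!std::is_copy_constructible_v<std::promise<int>>);
static_assert(!std::is_copy_assignable_v<std::promise<int>>);
static_assert(std::is_move_constructible_v<std::promise<int>>);

// Hypothetical helper (not from the article): moves the promise to a new
// owner and fulfils it there; the future taken earlier still sees the value.
inline int fulfilThroughMovedPromise(int value)
{
    std::promise<int> original;
    std::future<int> result = original.get_future();
    std::promise<int> owner = std::move(original);  // the shared state moves along
    owner.set_value(value);
    return result.get();
}
```

The future and the promise talk through a shared state, which is why moving the promise does not invalidate a future obtained before the move.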
No problem, it’s a piece of cake, just move the promise into the lambda:
    while( attemptsToGo > 0 )
    {
        std::promise<int> p;
        auto f = p.get_future();
        // mutable is needed: set_value() is a non-const member function
        connect( [p=std::move(p)]( ConnectionResult cr ) mutable { ... p.set_value(x); ... } );
        f.wait_for( timeout );
        ...
    }
I had to rearrange the future extraction before the std::move and use the ugly syntax for capturing by move in a lambda (so ugly that it almost seems that no one in the standard committee thought about this need).
Good?
Not at all! With the typical terse style that C++ reserves for template errors, the compiler flooded me with pages and pages of gibberish. It took a while to decode them and trace the root cause to the std::function argument of the connection callback.
The connect signature was something like this:
void connect( std::function<void(ConnectionResult)> callback );
After some digging into the error messages, I got the failure reason: std::function wants the callable object to be copy-constructible, and given that the std::promise captured by the lambda is not copy-constructible, the lambda itself is not copy-constructible.
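The chain of reasoning above can be checked at compile time. The following is only a sketch: makeMoveOnlyCallback, makeCopyableCallback and callCopyableThroughStdFunction are hypothetical helpers introduced here, not part of the project code.

```cpp
#include <cassert>
#include <functional>
#include <future>
#include <type_traits>
#include <utility>

// Hypothetical helper: returns a lambda that owns a promise.
inline auto makeMoveOnlyCallback(std::promise<int> p)
{
    return [p = std::move(p)](int value) mutable { p.set_value(value); };
}
using MoveOnlyCallback = decltype(makeMoveOnlyCallback(std::promise<int>{}));

// The closure inherits the restrictions of what it captures.
static_assert(std::is_move_constructible_v<MoveOnlyCallback>);
static_assert(!std::is_copy_constructible_v<MoveOnlyCallback>);

// Hypothetical helper: a lambda capturing only copyable state.
inline auto makeCopyableCallback(int offset)
{
    return [offset](int value) { return value + offset; };
}
using CopyableCallback = decltype(makeCopyableCallback(0));
static_assert(std::is_copy_constructible_v<CopyableCallback>);

// Only the copyable closure can be stored in a std::function.
inline int callCopyableThroughStdFunction()
{
    std::function<int(int)> f = makeCopyableCallback(1);
    return f(41);
}
```

Trying to initialize a std::function<void(int)> from a MoveOnlyCallback is what triggers the wall of template errors.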
Gosh, this seems like a problem in the standard library… how am I expected to include a promise in a callback? For sure, I don’t want to turn every class with a callback into a template so that I can write:
    template<typename F>
    class ClassWithCallback
    {
        public:
            void methodWithCallback( F&& callback );
        private:
            F mCallback;
    };
And, as noted in a previous post, C++ lacks a built-in function type. Every lambda, even one with the same parameter list and capturing the same variables in the same way, has a different type.
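A tiny sketch of that last point, using made-up names (first, second, sumThroughCommonType): two textually identical lambdas still have unrelated types, and a wrapper such as std::function is what gives them a common one.

```cpp
#include <cassert>
#include <functional>
#include <type_traits>

// Two lambdas with the same body and signature (illustrative names).
inline auto first  = [](int v) { return v + 1; };
inline auto second = [](int v) { return v + 1; };

// Each lambda expression produces its own unique closure type.
static_assert(!std::is_same_v<decltype(first), decltype(second)>);

// std::function erases the closure types, so both fit in one array.
inline int sumThroughCommonType(int v)
{
    std::function<int(int)> handlers[] = { first, second };
    int sum = 0;
    for (auto& handler : handlers) {
        sum += handler(v);
    }
    return sum;
}
```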
A quick tour of StackOverflow provided some useful information (and a solution). First, this is a well-known problem (!), and, second, it has a solution in… C++23. The solution comes with the handy and short name std::move_only_function, and, of course, it is a template.
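For readers whose toolchain does ship it, this is roughly how the C++23 wrapper would be used. It is a sketch under assumptions: deliverThroughMoveOnlyFunction is a hypothetical helper, and the code falls back to a trivial path when the library does not define the __cpp_lib_move_only_function feature-test macro.

```cpp
#include <cassert>
#include <functional>
#include <future>
#include <utility>
#if __has_include(<version>)
#include <version>
#endif

// Hypothetical helper: stores a promise-owning lambda in the C++23 wrapper.
inline int deliverThroughMoveOnlyFunction(int value)
{
#if defined(__cpp_lib_move_only_function)
    std::promise<int> p;
    std::future<int> f = p.get_future();
    // Unlike std::function, this wrapper accepts callables that cannot be copied.
    std::move_only_function<void(int)> callback =
        [p = std::move(p)](int v) mutable { p.set_value(v); };
    callback(value);
    return f.get();
#else
    // Fallback for standard libraries without std::move_only_function:
    // nothing to demonstrate, just hand the value back.
    return value;
#endif
}
```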
As much as I would love to use the brand new shiny standard, I am stuck with partial support of C++20 because the official toolchain for the project is a bit dated and there is no plan to update it, at least for a while.
So, it is time to roll my own solution to the problem (I have to be honest, StackOverflow pointed in the right direction and provided plenty of inspiration). Briefly, I considered whether to design a generic template to mimic std::move_only_function, but I discarded the idea as soon as I realized the effort needed to design, code, and test such a template. So I resorted to a much simpler and very specific design.
To make a std::function-like template, you have to deal with the different object types that can be handled by this code. In other words, you want different types to fit in the same container so that all these types are compatible. This is called type-erasure because you are erasing the specific type and replacing it with a more generic type.
This immediately calls for polymorphism – you define an abstract base class that exposes the generic operation:
    struct Concept
    {
        virtual ~Concept() = default;
        virtual void operator()(ConnectionResult const&) = 0;
    };
(I used Concept here, and Model below, because they are pretty standard names in the C++ type erasure idiom.)
A struct instead of a class is fine here, since this is part of the implementation and will end up in the private section of my callback class.
Now that the concept is defined, you may write the implementation:
    template<typename Cb>
    struct Model : Concept
    {
        explicit Model(Cb&& callback) : mCallback(std::move(callback)) { }
        void operator()(ConnectionResult const& result) override { mCallback(result); }
        Cb mCallback;
    };
This implementation holds the type-specific callback, yet it can be driven entirely through the virtual methods of the concept.
The constructor takes the callback by rvalue reference, which enforces the move semantics we need for the solution.
Now the two parts can be combined, using a std::unique_ptr. In this way, my callback object type and size are known and the object does not require a template to be used.
    class ConnectionResultCallback
    {
        public:
            template<typename Cb>
            ConnectionResultCallback(Cb&& cb) : mCb(std::make_unique<Model<Cb>>(std::move(cb))) { }
            void operator()( ConnectionResult const& result ) const { (*mCb)(result); }
        private:
            struct Concept
            {
                virtual ~Concept() = default;
                virtual void operator()(ConnectionResult const&) = 0;
            };
            template<typename Cb>
            struct Model : Concept
            {
                explicit Model(Cb&& callback) : mCallback(std::move(callback)) { }
                void operator()(ConnectionResult const& result) override { mCallback(result); }
                Cb mCallback;
            };
            std::unique_ptr<Concept> mCb;
    };
A few finishing touches are needed.
Since std::function can be implicitly constructed, I wanted my class to do the same. But this caused some ambiguity, since the templated implicit constructor is also a viable candidate when the argument is another ConnectionResultCallback, competing with the copy and move constructors. This is why you have to add the std::enable_if spell to the constructor:
    template<typename Cb,
             typename = std::enable_if_t<!std::is_same_v<std::decay_t<Cb>, ConnectionResultCallback>>>
    ConnectionResultCallback(Cb&& cb) : mCb(std::make_unique<Model<Cb>>(std::move(cb))) { }
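To see why the guard matters, here is a small, self-contained sketch. Greedy, Guarded, Picked and the two helper functions are illustrative names only; the classes simply record which constructor overload resolution selected.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

enum class Picked { Template, Copy, Move };

// Unconstrained forwarding constructor: it matches everything.
struct Greedy {
    template<typename T>
    Greedy(T&&) : picked(Picked::Template) {}
    Greedy(Greedy const&) : picked(Picked::Copy) {}
    Greedy(Greedy&&) noexcept : picked(Picked::Move) {}
    Picked picked;
};

// Same class, with the enable_if guard excluding its own type.
struct Guarded {
    template<typename T,
             typename = std::enable_if_t<!std::is_same_v<std::decay_t<T>, Guarded>>>
    Guarded(T&&) : picked(Picked::Template) {}
    Guarded(Guarded const&) : picked(Picked::Copy) {}
    Guarded(Guarded&&) noexcept : picked(Picked::Move) {}
    Picked picked;
};

// Copying from a non-const lvalue: T deduces to Greedy&, an exact match
// that beats the copy constructor's Greedy const& parameter.
inline Picked copyFromNonConstGreedy()
{
    Greedy source(1);
    Greedy copy(source);
    return copy.picked;
}

// With the guard, the template drops out and the copy constructor is used.
inline Picked copyFromNonConstGuarded()
{
    Guarded source(1);
    Guarded copy(source);
    return copy.picked;
}
```

In a type-erasing wrapper the hijacked call is worse than a surprise: the template would try to wrap the wrapper itself instead of copying or moving it.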
Adding a sprinkle of defaulted and deleted constructors and assignment operators completes the solution:
    class ConnectionResultCallback
    {
        public:
            template<typename Cb,
                     typename = std::enable_if_t<!std::is_same_v<std::decay_t<Cb>, ConnectionResultCallback>>>
            ConnectionResultCallback(Cb&& cb) : mCb(std::make_unique<Model<Cb>>(std::move(cb))) { }
            ConnectionResultCallback( ConnectionResultCallback const& ) = delete;
            ConnectionResultCallback( ConnectionResultCallback&& ) = default;
            ConnectionResultCallback& operator=( ConnectionResultCallback const& ) = delete;
            ConnectionResultCallback& operator=( ConnectionResultCallback&& ) = default;
            void operator()( ConnectionResult const& result ) const { (*mCb)(result); }
        private:
            struct Concept
            {
                virtual ~Concept() = default;
                virtual void operator()(ConnectionResult const&) = 0;
            };
            template<typename Cb>
            struct Model : Concept
            {
                explicit Model(Cb&& callback) : mCallback(std::move(callback)) { }
                void operator()(ConnectionResult const& result) override { mCallback(result); }
                Cb mCallback;
            };
            std::unique_ptr<Concept> mCb;
    };
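To show the pieces working together, here is a compact, compilable sketch of one connection attempt. Several parts are assumptions made for the example: ConnectionResult is reduced to a struct with a single code field, connect is a stand-in that reports a fixed result from a worker thread (and returns the thread so it can be joined, unlike the real void function), and connectOnce is a hypothetical helper.

```cpp
#include <cassert>
#include <chrono>
#include <future>
#include <memory>
#include <optional>
#include <thread>
#include <type_traits>
#include <utility>

struct ConnectionResult { int code; };  // assumption: stand-in for the real type

class ConnectionResultCallback {
public:
    template<typename Cb,
             typename = std::enable_if_t<
                 !std::is_same_v<std::decay_t<Cb>, ConnectionResultCallback>>>
    ConnectionResultCallback(Cb&& cb)
        : mCb(std::make_unique<Model<Cb>>(std::move(cb))) {}
    ConnectionResultCallback(ConnectionResultCallback const&) = delete;
    ConnectionResultCallback(ConnectionResultCallback&&) = default;
    ConnectionResultCallback& operator=(ConnectionResultCallback const&) = delete;
    ConnectionResultCallback& operator=(ConnectionResultCallback&&) = default;
    void operator()(ConnectionResult const& result) const { (*mCb)(result); }
private:
    struct Concept {
        virtual ~Concept() = default;
        virtual void operator()(ConnectionResult const&) = 0;
    };
    template<typename Cb>
    struct Model : Concept {
        explicit Model(Cb&& callback) : mCallback(std::move(callback)) {}
        void operator()(ConnectionResult const& result) override { mCallback(result); }
        Cb mCallback;
    };
    std::unique_ptr<Concept> mCb;
};

// The wrapper is move-only, like the promise it is meant to carry.
static_assert(!std::is_copy_constructible_v<ConnectionResultCallback>);
static_assert(std::is_move_constructible_v<ConnectionResultCallback>);

// Hypothetical network layer: invokes the callback from another thread.
inline std::thread connect(ConnectionResultCallback callback)
{
    return std::thread([cb = std::move(callback)] { cb(ConnectionResult{200}); });
}

// One attempt of the retry loop: the promise now lives inside the callback,
// so it stays valid for as long as the network layer holds the callback.
inline std::optional<int> connectOnce(std::chrono::milliseconds timeout)
{
    std::promise<int> p;
    std::future<int> f = p.get_future();
    std::thread worker = connect(
        [p = std::move(p)](ConnectionResult const& cr) mutable { p.set_value(cr.code); });
    std::optional<int> code;
    if (f.wait_for(timeout) == std::future_status::ready) {
        code = f.get();
    }
    worker.join();
    return code;
}
```

A retry loop would call connectOnce until it returns a value or the attempts run out; each iteration gets a fresh promise that is owned by its own callback.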
The solution is ready, the bug is fixed, and the project is safe. Time to reach and identify the bug: 20 minutes; time to research, fix, and test: 4 hours. It hurts a bit to stumble into a reference-to-an-out-of-scope-object bug after some 30 years of C++ practice, but let’s consider it a humbling experience. What bugs me is that fixing this would have taken a few minutes if C++11 had been defined consistently. std::promise/std::future, move semantics, and std::function were all introduced by C++11. Using std::function to store callbacks seems quite an obvious application. Using std::promise and std::future to build synchronization between a callback and the non-callback thread seems another legitimate design solution.
I already had the chance to spend a few words on the C++ standard and its evolution, inconsistencies, and questionable defaults, so I won’t repeat myself. I’m convinced that the C++ standard committee is full of utterly clever people, and there is no other way to keep together such a complex and intricate body of norms. My impression is that politics plays a great role in the decisions, and that evolution is constrained by important sponsors that own large C++ codebases and want to benefit from re-compiling them with the latest standard without paying the cost of adapting.
In comparison, in Rust a move is always the default for assignment and parameter passing, whereas C++ defines a set of byzantine rules, enough to fill an entire book on the subject. On one hand, the comparison between C++ and Rust is not fair: C++ has endured a much longer evolution (it is almost 40 years old) and has a much more widespread usage (meaning many and diverse interests among the adopters). On the other hand, should you pick a language today for an embedded system, unless you are constrained to some specific environment or toolchain, going with Rust is a good idea, as the bug and the fix I presented here illustrate.