A look at modern C++

I've been writing C++ code on and off since the early 2000s, but my previous job was very slow to update compilers, and so I was used to writing C++03-style code. Now that I'm working at Mozilla on Firefox, we use C++17, and so I've been learning all the new neat stuff in the language.

For more about why and how I wrote this article, see this blog post.

Having learned a lot from his earlier books, I decided to read through Scott Meyers's book Effective Modern C++ and take some notes on it. Originally I was focusing on "how backwards-compatibility can result in weird design choices" but I ended up just taking notes on things I thought were interesting.

(That page seems a little broken as of this writing; here's where you can order the book on bookshop.org or Barnes & Noble. Up to you, but I'd recommend a paper copy - it's printed in full color and looks very nice!)

Note that I was reading it with a Firefox-centered eye, so I skipped some of the parts about things that have a Firefox-specific equivalent; for example, mozilla::RefPtr instead of std::shared_ptr.

In this case this is probably because many classes are part of XPCOM and need to be able to be manipulated from C++ and JavaScript, which means they need a method-based way of reference counting. For other more direct analogues like mozilla::UniquePtr and std::unique_ptr, this is probably because the Firefox codebase predates a lot of modern C++ - for example, mozilla::UniquePtr seems to have been created in late 2013, and my guess is that most compilers didn't support C++11 by then - see the post announcing mozilla::UniquePtr which mentions that std::unique_ptr is "not always available right now". Such are the perils of long-lived codebases!

In some cases similar advice applies, but there are definitely differences - for example, mozilla::RefPtr requires that its template parameter has AddRef() and Release() methods that do the reference counting, while std::shared_ptr has no such requirement and thus has its own storage for a reference count. Since Firefox uses C++17, I also ignored things that only apply to C++11.

I think something helpful to keep in mind is that C++ seems to prioritize two things above all else: backwards compatibility so that old code keeps working in newer versions of the standard, and performance so that it's at least possible to squeeze every last ounce out of your hardware. Notably not on this list is safety - even with the new modern constructs there are still plenty of ways to write code that crashes or has undefined behavior!

If you have feedback or see an error, please let me know with a comment on my blog post or an email at greg@gregstoll.com.

This goes through the book in order, and I list things by item number and, where appropriate, page number. (I'm reading from the first edition, twelfth release, for the record.)

Section 1 - Deducing Types (Items 1-4)

Item 2 talks about using auto and how it's great at deducing the correct type! Except...well, it interacts with std::initializer_list and "uniform initialization" in some very confusing ways. Uniform initialization (new in C++11) means initializing a variable with braces like so:

Widget x { 5 };

instead of the more conventional

Widget x(5);

(Both of these call Widget's constructor with a parameter of 5.) The various problems with uniform initialization will be covered later in Item 7. But this item gives an example like:

int x1 = 5;
int x2(5);
int x3 { 5 };

Unsurprisingly, these all initialize their int variables to 5. If we change the type to auto, though, like:

auto x1 = 5;
auto x2(5);
auto x3 { 5 };

x1 and x2 are of type int, but x3 is of type std::initializer_list<int>! Unless your compiler has implemented N3922, in which case x3 is of type int. N3922 is not technically part of C++11 or C++14, but many compilers apply it anyway, probably because this example is painful. So I'm guessing that this is OK now for all practical purposes. But this is not an auspicious start for uniform initialization!
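To check what a given compiler actually deduces, here is a small sketch, assuming a C++17 compiler that implements N3922. The function name checkBraceDeduction is made up for illustration. It also includes the auto x = { 5 } form, which still deduces a std::initializer_list<int> even with N3922.

```cpp
#include <cassert>
#include <initializer_list>
#include <type_traits>

// Hypothetical helper: verifies the deduced types at compile time and the
// values at run time.
inline void checkBraceDeduction() {
  auto x1 = 5;      // copy initialization
  auto x2(5);       // parentheses
  auto x3{5};       // braces, no equals sign
  auto x4 = {5};    // braces with an equals sign

  static_assert(std::is_same_v<decltype(x1), int>);
  static_assert(std::is_same_v<decltype(x2), int>);
  static_assert(std::is_same_v<decltype(x3), int>);  // relies on N3922
  static_assert(std::is_same_v<decltype(x4), std::initializer_list<int>>);

  assert(x1 == 5 && x2 == 5 && x3 == 5);
  assert(x4.size() == 1 && *x4.begin() == 5);
}
```

If the third static_assert fails, the compiler is using the original C++11 deduction rule for braces.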

Item 3 introduces decltype, which is a way to get the type of a variable or expression. This can be useful in some template scenarios - for example, let's say you have a function that wants to wrap a container indexing operation and add a logging statement. If the signature of the function is (ignoring ReturnType for a minute):

template<typename Container, typename Index>
ReturnType accessValueAndLog(Container& c, Index& i)

then you could explicitly declare a variable of the correct type with:

decltype(c[i]) value = c[i];

Of course, this is somewhat artificial, because you'd probably just do

auto value = c[i];

instead and it's perfectly clear what you mean.

But let's come back to ReturnType above - it seems like we should be able to do

template<typename Container, typename Index>
decltype(c[i]) accessValueAndLog(Container& c, Index& i)

but this won't compile!

The parser writer in me understands why this is - when parsing the decltype, the parser doesn't know that c and i are variables. But it's pretty annoying!

Instead, in C++11 you can use this gnarly syntax (known as "trailing return type"):

template<typename Container, typename Index>
auto accessValueAndLog(Container& c, Index& i) -> decltype(c[i])

And in C++14, if it's clear what type the body is returning as in the example below, you can just use auto:

template<typename Container, typename Index>
auto accessValueAndLog(Container& c, Index& i) {
  logSomething();
  return c[i];
}

But! This all works fine if we just want to return the value of c[i]. What if we want to return a reference to c[i], so callers could do something like this?

accessValueAndLog(c, i) = 5;

The good news is that you can do this! The bad news is that it looks like this:

template<typename Container, typename Index>
decltype(auto) accessValueAndLog(Container& c, Index& i) // same function body as before

Sigh. There is an actual explanation in the book about how type deduction for auto and decltype(auto) differ, and I'm assuming the reason they made decltype(auto) a thing was that they couldn't change the behavior of auto. Anyway, I think of decltype(auto) as "super-duper auto" and that's good enough for me!
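Here is a complete sketch of the difference, assuming C++17. logSomething() and the counter behind it are stand-ins I made up so the example is self-contained, and accessCopyAndLog is just my name for the plain auto version.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for the logging call in the text.
inline int gLogCalls = 0;
inline void logSomething() { ++gLogCalls; }

// Plain auto: the reference is dropped, so callers get a copy.
template <typename Container, typename Index>
auto accessCopyAndLog(Container& c, Index& i) {
  logSomething();
  return c[i];
}

// decltype(auto): the return type is exactly the type of c[i].
template <typename Container, typename Index>
decltype(auto) accessValueAndLog(Container& c, Index& i) {
  logSomething();
  return c[i];
}

inline void demoAccess() {
  std::vector<int> v{1, 2, 3};
  std::size_t i = 1;
  static_assert(std::is_same_v<decltype(accessCopyAndLog(v, i)), int>);
  static_assert(std::is_same_v<decltype(accessValueAndLog(v, i)), int&>);
  accessValueAndLog(v, i) = 5;  // assigns through the returned reference
  assert(v[1] == 5);
}
```

With the plain auto version, the assignment in demoAccess() would not compile, since you can't assign to a temporary int.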

Fun fact: my notes here say "aaaaaaaa", not the last time that this is used! (see Item 27)

Another "fun" bit about decltype(auto) - compare these two functions:

decltype(auto) thisIsFine() {
  int x = 0;
  return x; // returns an int, fine
}
decltype(auto) thisIsBad() {
  int x = 0;
  return (x); // returns an int&, surprising!
}

Not only that, but thisIsBad() returns a reference to the local variable x, so you get undefined behavior! All because of an extra set of parentheses! I admit that return (x) looks kinda weird and I probably wouldn't write that code, but parentheses changing what type something is adds to my general skepticism about auto (see Item 5).
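The parentheses rule comes straight from decltype itself, which can be checked without returning anything dangling. A minimal sketch, assuming C++17 (gValue is a made-up name):

```cpp
#include <type_traits>

inline int gValue = 0;

// A bare name gives the declared type of the variable...
static_assert(std::is_same_v<decltype(gValue), int>);
// ...but a parenthesized name is an lvalue expression, so decltype adds a reference.
static_assert(std::is_same_v<decltype((gValue)), int&>);
```

decltype(auto) applies the same rule to whatever follows return, which is why return (x); ends up returning int&.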

Section 2 - auto

Item 5 says to prefer auto over explicit type declarations. I don't know how I feel about this. Here are things that seem fine:

auto x = Widget::make_a_widget();
auto y = std::make_shared<Frob>();

In each case, the type name is incredibly clear from the expression on the right.

I also like that you can use modifiers like const auto& and whatnot.

I also don't have a problem using it when it's clear what the type is by convention, like:

std::vector<int> values;
for (auto it = values.begin(); it != values.end(); ++it) {
  // some code that uses *it
}

Here auto saves me from writing out the type std::vector<int>::iterator. Although since C++11 you probably wouldn't want to do this since you can use the range-based for loop like:

std::vector<int> values;
for (int value : values) {
  // some code that uses value
}

but iterators are good for more than just for loops.

However, anything more than that starts to make me a bit uncomfortable. For example:

Widget w;
auto x = w.get_owner();
auto y = x.lines();

It's pretty unclear what types x and y are. The usual rebuttal to this is "but your IDE can easily tell you!" and, OK, true, although that usually involves hovering over the auto keyword which you have to do one at a time.

My real concern is that if the return type of Widget::get_owner() changes, this automatically changes the type of x. And if the new type also has a lines() method, then the meaning of this code has changed and it still compiles. Scary!

Unrelated to this but something surprising I wanted to mention is that auto can actually have better performance than not using it! Let's look at a lambda expression:

int captured = 0;
auto lambda1 = [captured](bool x) {
   return x ? captured : captured + 1;
};
std::function<int(bool)> lambda2 = [captured](bool x) {
   return x ? captured : captured + 1;
};

Here, lambda1 can be entirely stack-allocated since the actual type includes what data is stored in the closure. lambda2 may require a heap allocation for the captured variable that is stored in the closure, because std::function<int(bool)> has a fixed size that can't depend on the closure, since the type doesn't have the knowledge of what the captured variables are. (Implementations typically keep a small inline buffer, so whether a heap allocation actually happens depends on how big the closure is.)

I think I understand this correctly? I am pretty confident that auto uses less storage in this case, at least. This is discussed on pages 39-40.
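One way to poke at the storage question is to compare sizes. This is only a sketch: the size comparison holds on the mainstream standard libraries I know of, but the standard doesn't promise it, and demoClosureStorage is a made-up name.

```cpp
#include <cassert>
#include <functional>

inline void demoClosureStorage() {
  int captured = 41;
  auto lambda1 = [captured](bool x) { return x ? captured : captured + 1; };
  std::function<int(bool)> lambda2 = lambda1;

  // The closure only has to hold its one captured int, while std::function
  // is a fixed-size, type-erasing wrapper with extra bookkeeping.
  // Not guaranteed by the standard, but true in practice.
  static_assert(sizeof(lambda1) < sizeof(lambda2));

  // Both behave identically when called.
  assert(lambda1(true) == 41);
  assert(lambda2(false) == 42);
}
```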

Item 6 says hold on: sometimes auto doesn't do what you think! The most common case is with vector<bool>:

vector<bool> v(3); // three entries so that v[2] is valid
bool entry1 = v[2]; // entry1 is a bool, unsurprising
auto entry2 = v[2]; // entry2 is a vector<bool>::reference, surprising!

This is because of some peculiarities of vector<bool> - because it stores its entries packed in bits but it still needs to have things like v[2] = true; work, it has to do some trickery. vector<bool>::reference is that trickery, and it can implicitly convert to bool, which is why the entry1 line is fine.
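To see the proxy in action, here is a minimal sketch, assuming C++17 (demoVectorBool is a made-up name). It shows that the auto variable is still connected to the vector, while the bool is an independent copy.

```cpp
#include <cassert>
#include <type_traits>
#include <vector>

inline void demoVectorBool() {
  std::vector<bool> v(3, false);
  bool entry1 = v[2];  // converts the proxy to a plain bool
  auto entry2 = v[2];  // keeps the proxy object itself

  static_assert(std::is_same_v<decltype(entry2), std::vector<bool>::reference>);
  static_assert(!std::is_same_v<decltype(entry2), bool>);

  entry2 = true;       // writes through the proxy into the vector
  assert(v[2]);        // the vector changed...
  assert(!entry1);     // ...but the plain bool did not
}
```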

The proposed fix for this is the "explicitly typed initializer idiom", which looks like

auto entry3 = static_cast<bool>(v[2]); // entry3 is a bool again

which...is this really better? To me this looks like entry1 but needlessly verbose.

The book does give an example of a usage that I do think is nicer:

double getDouble();
auto f = static_cast<float>(getDouble());

Here it's obvious that we're deliberately reducing the precision of the result of getDouble(), as opposed to

float f = getDouble();

which might just be a mistake.

Section 3 - Moving to Modern C++ (Items 7-17)

Item 7 talks about uniform initialization, which is a fancy new way to initialize variables:

int i(0); // parentheses initialization, old way
int j = 0; // equals initialization, old way
int k { 0 }; // uniform initialization, new exciting way!

You can also use it to call constructors, like:

Widget w { 3, 4 }; // calls two-argument Widget constructor

Long story short, there are some places where you can use parentheses initialization and some places where you can use the equals initialization, but uniform initialization can be used everywhere.

Hence "uniform"!

Unfortunately, there are some surprising parts of this. The biggest is with vector, because:

vector<int> v1(5, 1); // create vector with 5 elements, all of which have value 1
vector<int> v2 { 5, 1 }; // create vector with 2 elements, 5 and 1

The reason has to do with those pesky initializer_lists that we mentioned back in Item 2. In fact, if a class has any constructor that takes an initializer_list, the compiler will prefer calling that constructor from a uniform initialization call site even if the types don't match very well. That isn't the issue here, but it's still enough to be very confusing!
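The vector example is easy to verify for yourself; a minimal sketch (demoVectorInit is a made-up name):

```cpp
#include <cassert>
#include <vector>

inline void demoVectorInit() {
  std::vector<int> v1(5, 1);  // (count, value) constructor
  std::vector<int> v2{5, 1};  // initializer_list constructor wins with braces

  assert(v1.size() == 5);
  assert(v1.front() == 1 && v1.back() == 1);

  assert(v2.size() == 2);
  assert(v2[0] == 5 && v2[1] == 1);
}
```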

Honestly, this whole initializer_list mess is enough to make me not want to use uniform initialization.

Also I think the braces look weird, but presumably I'd get used to that!

Sorry, I have to mention the wildest example of this problem (from page 53):

class Widget {
public:
  Widget(int i, bool b);
  // the usual move and copy constructors
  operator float() const; // implicit conversion to float
  Widget(initializer_list<double>); // danger ahead!
};
Widget w;
Widget w1(w); // calls the copy constructor, normal
Widget w2 { w }; // converts w to a float and calls the initializer_list constructor, what the heck??

Item 10 talks about scoped enums versus unscoped enums, with the upshot being that there's basically no reason to use the old unscoped enums versus the new scoped enums. Unscoped enums look like this:

enum Color { kBlack, kWhite, kRed}; // unscoped enum, not very type safe
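
This has a bunch of weird consequences, including that the enumerator names (kBlack, kWhite, kRed) leak into the enclosing scope and that Color values implicitly convert to integer types.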

All of these problems go away if you used a scoped enum, which is as easy as adding a "class":

enum class Color { kBlack, kWhite, kRed}; // scoped enum, more type safety!

This fixes the scoping issue, and Color values no longer implicitly convert to integer types. (Although you can of course static_cast them to int if you need to.)
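The conversion difference can be checked at compile time. A minimal sketch, assuming C++17; I renamed the unscoped enum and its values (OldColor, kOldBlack, ...) so both versions can live in the same file, and asInt is a made-up helper.

```cpp
#include <type_traits>

enum OldColor { kOldBlack, kOldWhite, kOldRed };  // unscoped
enum class Color { kBlack, kWhite, kRed };        // scoped

// Unscoped enumerators are visible without qualification and compare with ints.
static_assert(kOldRed == 2);
static_assert(std::is_convertible_v<OldColor, int>);

// Scoped enumerators need the Color:: prefix and never convert implicitly.
static_assert(!std::is_convertible_v<Color, int>);

// An explicit cast still works when you really need the number.
constexpr int asInt(Color c) { return static_cast<int>(c); }
static_assert(asInt(Color::kRed) == 2);
```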

Item 11 introduces the new syntax for deleting functions. Previously, if you wanted to make the compiler not automatically generate a "special member function" (usually this is used for removing copy constructor and copy assignment for things like a unique_ptr), you had to declare the method to be private and have no implementation, like:

class DontCopyMe {
  public:
    // methods and such
  private:
    // deliberately private with no implementation to break callers
    DontCopyMe(const DontCopyMe&);
    DontCopyMe& operator=(const DontCopyMe&);
};

The problem is that this is somewhat artificial. Also, code that tries to call the member gets either a confusing linker error (if the caller has access to DontCopyMe's private members) or an error that you're trying to call a private method, which can also be confusing.

Although in practice, once you've been writing C++ a while you get used to this.

Instead, in C++11 you can do:

class DontCopyMe {
  public:
    // methods and such
    DontCopyMe(const DontCopyMe&) = delete;
    DontCopyMe& operator=(const DontCopyMe&) = delete;
};

Much nicer syntax, and much clearer error messages to boot!

Less commonly you can use this for other kinds of functions; if you want to make a function that only takes an int and won't accept a char through an implicit conversion, you can do

void takeAnInt(int i);
void takeAnInt(char c) = delete;

and you can even do this for templated functions too:

template <typename T>
void processAnyPointerExceptAString(T* ptr);
template <>
void processAnyPointerExceptAString<char>(char* ptr) = delete;

although then you have to worry about const char* and volatile char* and such too.

Maybe you could work around this with std::remove_cv somehow?
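One possible answer to the std::remove_cv question, as a sketch rather than anything from the book: instead of deleting one specialization per qualifier, constrain the template with std::enable_if_t so it drops out whenever T is char with any combination of const and volatile. Note that this swaps = delete for SFINAE, so the error message changes to "no matching function". CanProcess is a detection helper I made up just to test the result; this assumes C++17.

```cpp
#include <type_traits>
#include <utility>

// Only participates in overload resolution when T, with const/volatile
// stripped, is not char.
template <typename T,
          typename = std::enable_if_t<
              !std::is_same_v<std::remove_cv_t<T>, char>>>
void processAnyPointerExceptAString(T* ptr) {
  (void)ptr;  // placeholder body for the sketch
}

// Hypothetical helper: true if calling the function with a P compiles.
template <typename P, typename = void>
struct CanProcess : std::false_type {};

template <typename P>
struct CanProcess<
    P, std::void_t<decltype(processAnyPointerExceptAString(std::declval<P>()))>>
    : std::true_type {};

static_assert(CanProcess<int*>::value);
static_assert(!CanProcess<char*>::value);
static_assert(!CanProcess<const char*>::value);
static_assert(!CanProcess<volatile char*>::value);
```

This still only covers char; other character types like wchar_t would need to be added to the condition.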

Item 12 talks about the override keyword, and I'm a big fan! It breaks the compile if the method marked override isn't actually overriding a virtual method. It's nice to detect typos in method names, but also lets you know when you rename a base class method that's overridden. I like ways to help the compiler detect problems, even at the cost of a little verbosity!

(Technically override isn't a keyword - since it wasn't added until C++11 they didn't want to break existing code that might already use override as a variable name. So override is a "specifier" that only means something when used after a member function declaration. This isn't a huge deal, but it is another example of how having a language that's been around for a long time makes changing things hard!)
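A minimal sketch of what the override check buys you. Shape, Triangle, and countSides are made-up names, and the commented-out line shows the kind of typo that would be rejected.

```cpp
#include <cassert>

struct Shape {
  virtual ~Shape() = default;
  virtual int sides() const { return 0; }
};

struct Triangle : Shape {
  int sides() const override { return 3; }
  // int side() const override { return 3; }
  // ^ would not compile: Shape has no virtual method named side().
  //   Without override it would silently declare a brand new method.
};

inline int countSides(const Shape& s) { return s.sides(); }

inline void demoOverride() {
  assert(countSides(Shape{}) == 0);
  assert(countSides(Triangle{}) == 3);  // dispatches to Triangle::sides()
}
```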

Item 14 talks about the new noexcept specifier versus the old throw(), and this is definitely a case where C++11 is trying to fix some mistakes made in C++98.

The short version is that functions marked noexcept are more optimizable than ones marked throw(), since the compiler isn't required to unwind the stack if an exception is thrown.

Thankfully they now mean the same thing and it looks like they actually got rid of throw() in C++20, which is impressive!

Item 15 deals with constexpr, which is pretty neat! You can use it on a variable:

constexpr int MAX_SIZE = 128;

which ensures that MAX_SIZE is a compile time constant and can be used in things like template arguments. But you can also use it on a function:

constexpr double average(double d1, double d2) {
  return (d1 + d2) / 2.0;
}

and if d1 and d2 are compile-time constants, then average(d1, d2) can be computed at compile time too! But if either d1 or d2 isn't known until runtime, you can still call average(d1, d2) and it will just return a normal runtime result! Neat!
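Both uses of the same function can be shown side by side. A small sketch; kMidpoint and demoAverage are names I made up.

```cpp
#include <cassert>

constexpr double average(double d1, double d2) {
  return (d1 + d2) / 2.0;
}

// Compile-time use: the arguments are constants, so the result is too.
constexpr double kMidpoint = average(1.0, 3.0);
static_assert(kMidpoint == 2.0);

// Runtime use: the same function works on values the compiler can't know.
inline void demoAverage(double a, double b) {
  double result = average(a, b);
  assert(result == (a + b) / 2.0);
}
```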

Item 16 discusses why const methods on classes should be thread-safe - it's not a language requirement, but clients will expect it. This can get you into trouble if you have a const method that modifies some mutable members.

The "classic" example is using a mutable member to cache an expensive computation, but you want the caching to be transparent to the caller so you want the method to be const.

Really this item just convinced me that mutable members are a bad idea!
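For the record, here is roughly what a thread-safe version of the caching pattern can look like, guarding the mutable members with a mutex. This is a sketch under my own assumptions: CachedAnswer, expensiveComputation(), and the compute counter are all made up for illustration.

```cpp
#include <cassert>
#include <mutex>
#include <optional>

class CachedAnswer {
 public:
  // const from the caller's point of view, but it fills in a cache.
  int value() const {
    std::lock_guard<std::mutex> guard(mMutex);  // serialize cache access
    if (!mCache) {
      mCache = expensiveComputation();
      ++mComputeCount;
    }
    return *mCache;
  }

  int computeCount() const {
    std::lock_guard<std::mutex> guard(mMutex);
    return mComputeCount;
  }

 private:
  static int expensiveComputation() { return 42; }  // placeholder work

  mutable std::mutex mMutex;
  mutable std::optional<int> mCache;
  mutable int mComputeCount = 0;
};

inline void demoCachedAnswer() {
  CachedAnswer answer;
  const CachedAnswer& view = answer;
  assert(view.value() == 42);
  assert(view.value() == 42);        // second call hits the cache
  assert(view.computeCount() == 1);  // computed only once
}
```

One side effect worth knowing: std::mutex can't be copied or moved, so adding it as a member makes the class non-copyable and non-movable too.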

Item 17 talks about special member function generation. I want to use this as an excuse to talk about std::move(). It's a very nice addition to the language and in retrospect I'm surprised it took so long to get added. But it also relies on the programmer to use it safely - once you've called std::move() on a variable and handed it off, the variable is left in a moved-from state, and using it without first assigning it a new value is a bug that can lead to undefined behavior, like this:

Widget w;
someMethod(std::move(w));
w.frob(); // w has been moved from - a bug, and possibly undefined behavior!

(And I'm surprised there aren't static analysis rules to at least try to catch this? Or at least Firefox builds with a lot of static analysis rules and I don't think this is one of them, as I'm pretty sure I've seen this mistake in code...)

This is one place where Rust shines - the Rust compiler ensures that you can't use a variable after its lifetime has ended, and std::move() just makes me miss that!
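What a moved-from object looks like depends on the type. Standard library types are generally left in a "valid but unspecified" state, but std::unique_ptr is one case where the result is guaranteed (it becomes null), which makes it handy for a sketch. demoMovedFrom is a made-up name.

```cpp
#include <cassert>
#include <memory>
#include <utility>

inline void demoMovedFrom() {
  auto p = std::make_unique<int>(5);
  auto q = std::move(p);  // ownership transfers to q

  assert(q != nullptr && *q == 5);
  assert(p == nullptr);   // guaranteed for std::unique_ptr specifically
  // Dereferencing p here would be a null dereference: undefined behavior.

  p = std::make_unique<int>(7);  // assigning a new value makes p usable again
  assert(*p == 7);
}
```

The compiler happily accepts all of this, including the dereference I commented out, which is exactly the problem.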

Section 4 - Smart Pointers (Items 18-22)

I skimmed over most of this section because Firefox has its own smart pointer types.

Item 19, though, describes std::shared_ptr, and I noticed some interesting differences between it and the Firefox smart pointer types mozilla::RefPtr and nsCOMPtr:

Item 21 talks about std::make_shared() and why you should always prefer it when constructing a std::shared_ptr. One reason that I hadn't thought of but resonates with me is that if you use it all the time, any usage of new T() will stick out like a sore thumb, and will prompt you to take a closer look to make sure it's being used correctly and not being leaked. Another nicety is that if you use std::make_shared() to construct the std::shared_ptr, it will allocate space for the object and the reference count in a single allocation! Neat!

Section 5 - Rvalue References, Move Semantics, and Perfect Forwarding (Items 23-30)

Most of the stuff in this chapter I already knew pretty well by this point (std::move()) or seemed pretty esoteric and not worth my time right now (std::forward()).

Item 27 bears some calling out, because it made me question the life choices that got me to this point. (Yes, this is the other long-awaited usage of "aaaaaaaa" from my notes!) It's about how to overload methods on universal references (i.e. T&&) and I'm not going to walk through the whole thing, but here's the code at the end that has the "correct" solution:

class Person {
public:
  template<
    typename T,
    typename = std::enable_if_t<
      !std::is_base_of<Person, std::decay_t<T>>::value
      &&
      !std::is_integral<std::remove_reference_t<T>>::value
    >
  >
  explicit Person (T&& n)     // constructor for std::strings and
  : name (std::forward<T>(n)) // args convertible to strings
  { ... }
  explicit Person(int idx)    // constructor for int args
  : name (nameFromIdx(idx))
  { ... }
  ...                         // copy and move constructors, etc.
private:
  std::string name;
};

I...just cannot.

Section 6 - Lambda Expressions (Items 31-34)

To start with, I really like lambda expressions! Being able to pass a lambda to std::find_if to find the first odd element with just:

std::find_if(v.begin(), v.end(), [](int i) { return i % 2 != 0; });

is very convenient, and sure beats the old way of:

bool isOdd(int i) { return i % 2 != 0; }
std::find_if(v.begin(), v.end(), isOdd);

Item 31 advises against using default capture modes. I knew that default by-reference capture (i.e. [&]) was dangerous, because if the lambda is going to be used outside of its enclosing scope, any captured local variables will be gone and trying to use them will lead to a crash (if you're lucky).

But the fact that default by-value capture ([=]) is also dangerous surprised me! Check this out:

class Widget {
public:
  // Filters is a std::vector<> of functions that
  // take an int and return a bool
  void addFilter(Filters& filters) const {
    filters.emplace_back(
      [=](int value) { return value > count; }
    );
  };
private:
  int count;
};

What actually happens here is that the this pointer gets captured by value! (Inside the lambda, the count expression is really this->count.) So if the Widget gets destroyed, running the lambda is going to crash (if you're lucky).

The best way to fix this is to explicitly capture count like:

class Widget {
public:
  // Filters is a std::vector<> of functions that
  // take an int and return a bool
  void addFilter(Filters& filters) const {
    filters.emplace_back(
      [count=count](int value) { return value > count; }
    );
  };
private:
  int count;
};
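Here is a runnable sketch of the fixed version, where the filter is called after the Widget is gone. The Filters alias follows the description in the comments above; the constructor and demoFilterOutlivesWidget are additions of mine so the example is self-contained.

```cpp
#include <cassert>
#include <functional>
#include <vector>

using Filters = std::vector<std::function<bool(int)>>;

class Widget {
 public:
  explicit Widget(int initialCount) : count(initialCount) {}

  void addFilter(Filters& filters) const {
    // Copies the int into the closure; this is not captured at all.
    filters.emplace_back(
        [count = count](int value) { return value > count; });
  }

 private:
  int count;
};

inline void demoFilterOutlivesWidget() {
  Filters filters;
  {
    Widget w(10);
    w.addFilter(filters);
  }  // w is destroyed here

  // Safe: the lambda owns its own copy of count.
  assert(filters[0](11));
  assert(!filters[0](10));
}
```

With [=] in place of [count = count], the two calls at the end would read through a dangling this pointer.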

Another thing to note is that default by-value capture does not capture static variables. This is fine, but it's one more reason why just sticking [=] in front of your lambda expression and assuming it's going to capture absolutely everything it needs is a bad idea.

Anyway, it can be a bit annoying to explicitly list everything you're capturing, but I think it's the right way to go. And I like that the C++ language makes you do this, as I've definitely written bugs in languages like JavaScript where variables get captured automatically!

Item 32 recommends using std::move() with explicit capturing for best performance, like:

std::vector<int> aBigVector;
auto countEntries = [aBigVector = std::move(aBigVector)]{ return aBigVector.size(); };

I would just warn that it's easy to overlook std::move() calls in the captures part of a lambda, and you have to remember that aBigVector is no longer usable after countEntries is declared.
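A self-contained sketch of the move capture (demoMoveCapture is a made-up name). I deliberately don't check the outer aBigVector afterwards, since the state of a moved-from vector isn't something to rely on.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

inline std::size_t demoMoveCapture() {
  std::vector<int> aBigVector{1, 2, 3};

  // The closure's own aBigVector is move-constructed from the outer one,
  // so the elements are not copied.
  auto countEntries = [aBigVector = std::move(aBigVector)] {
    return aBigVector.size();
  };

  // From here on, only the copy inside the closure should be used.
  std::size_t entries = countEntries();
  assert(entries == 3);
  return entries;
}
```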

Section 7 - The Concurrency API (Items 35-40)

Firefox doesn't use std::async and such - it has its own APIs for processes and threads and IPC, so I skipped over this section.

Section 8 - Tweaks (Items 41-42)

Item 41 is what a coworker told me about and prompted me to get serious about learning modern C++ and buy this book. (thanks Chris!) Let's say you have an API that takes in a string and needs to take ownership of it. Our first attempt at writing this might look like:

void Widget::setName(const std::string& s) {
  this->mName = s; // make a copy of s
}

But what if the caller is done with s? Then we could avoid the need to make a copy by taking in a std::string&& like this:

void Widget::setName(std::string&& s) {
  this->mName = std::move(s); // no copy needed!
}
// Example calling code
Widget w;
std::string s;
w.setName(std::move(s));

An improvement! But, if the caller is not done with the string it wants to pass, it has to do something like this:

// Example calling code 
Widget w;
std::string s; // caller needs to keep a copy of s
std::string tempString(s);
w.setName(std::move(tempString));

A little ugly, because now the caller has to make a copy of its string. So what if we had both versions of Widget::setName()?

void Widget::setName(const std::string& s) {
  this->mName = s; // make a copy of s
}
void Widget::setName(std::string&& s) {
  this->mName = std::move(s); // no copy needed!
}

Now regardless of whether the caller wants to std::move() the string parameter, we get the most efficient and convenient calling code!

But, one problem: what if there's a function that takes two strings? To get the fully optimized and convenient experience, you need four overloads of Widget::setName(). And if there are three strings...well, things get ugly at an exponential rate!

There is a better way. Here is a version that's easier to maintain and almost as efficient, at the cost of being a bit unintuitive if you've been writing C++ for a while:

void Widget::setName(std::string s) { // not a reference!
  this->mName = std::move(s);
}
Widget w;
std::string s;
// Example calling code if we're not done with s
w.setName(s);
// Example calling code if we are done with s
w.setName(std::move(s));

Yes, the rarely-seen by value std::string parameter! But it makes sense - regardless of whether the caller is done with s or not, this is almost as efficient as the previous solution and only adds one std::move() call, which is pretty insubstantial if the class supports an efficient std::move() like std::string does.

As I alluded to, the downside is that this looks really weird if you're used to C++ - I've had at least one code reviewer confused about this (although they were fine when I explained what was going on). But it really does give you the best of both worlds!
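To back up the "almost as efficient" claim, here is a sketch that counts copies and moves. Tracked is a made-up stand-in for std::string that only does the counting, and this Widget is a stripped-down version of the one above; it assumes C++17 for the inline static members.

```cpp
#include <cassert>
#include <utility>

// Hypothetical stand-in for std::string that counts copies and moves.
struct Tracked {
  static inline int copies = 0;
  static inline int moves = 0;

  Tracked() = default;
  Tracked(const Tracked&) { ++copies; }
  Tracked(Tracked&&) noexcept { ++moves; }
  Tracked& operator=(const Tracked&) { ++copies; return *this; }
  Tracked& operator=(Tracked&&) noexcept { ++moves; return *this; }
};

class Widget {
 public:
  void setName(Tracked s) {  // by value, not a reference!
    mName = std::move(s);
  }

 private:
  Tracked mName;
};

inline void demoPassByValue() {
  Widget w;
  Tracked name;

  Tracked::copies = Tracked::moves = 0;
  w.setName(name);  // caller keeps name: one copy plus one move
  assert(Tracked::copies == 1 && Tracked::moves == 1);

  Tracked::copies = Tracked::moves = 0;
  w.setName(std::move(name));  // caller is done: two moves, no copies
  assert(Tracked::copies == 0 && Tracked::moves == 2);
}
```

The two-overload version would do one copy in the first case and one move in the second, so the by-value version costs exactly one extra move each time.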

Greg's Random C++ Complaints

(I was hoping to fit these in elsewhere but there wasn't really a natural place for them)

On the use of explicit: It really bothers me that one-argument class constructors default to setting up an implicit conversion - for example:

class Widget {
public:
  Widget(float f) {}
};
float score = 5.0;
Widget w = score; // calls the Widget constructor!

I almost never want this! Of course you can prevent this by declaring the constructor explicit, but boy it would be nicer if you had to opt into this with a keyword like implicit. I'm guessing this decision was made a while ago and it can't be changed without breaking existing code. Sigh.
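The effect of explicit can be checked at compile time. A minimal sketch, assuming C++17; ImplicitWidget and ExplicitWidget are made-up names for the two flavors.

```cpp
#include <type_traits>

struct ImplicitWidget {
  ImplicitWidget(float) {}           // also acts as a float -> Widget conversion
};

struct ExplicitWidget {
  explicit ExplicitWidget(float) {}  // construction only, no conversion
};

// ImplicitWidget w = score; compiles...
static_assert(std::is_convertible_v<float, ImplicitWidget>);
// ...but ExplicitWidget w = score; does not.
static_assert(!std::is_convertible_v<float, ExplicitWidget>);
// Direct construction like ExplicitWidget w(score); still works.
static_assert(std::is_constructible_v<ExplicitWidget, float>);
```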

For some reason non-void functions that don't return a value compile with just a warning! Like, seriously, see:

int square(int num) {
  // Whoops, don't return anything
}
  
int main(int argc, char* argv[]) {
  return square(argc);
}  

This compiles and runs! (see it on Godbolt) If execution gets to the end of square() without returning, that's undefined behavior; in this build the compiler puts a ud2 instruction (a deliberately invalid opcode) at that point, so the program crashes when it gets there.

I know this because I have done this before! And while it is a pretty helpful warning, it can get lost in a sea of other warnings, and this is the sort of thing that really really feels like it should be an error. At least I've done this enough times now (*sob*) that I know when a program crashes with a ud2 instruction this is very likely the problem.