How to inspect your C++ code after preprocessing

In C++, the preprocessor is a tool that performs a series of transformations on the source code before it is compiled. It performs operations such as macro expansion, inclusion of header files, and conditional compilation.

One reason why it can be useful to see what the preprocessor produces in C++ is to understand how the code is transformed before it is compiled. The preprocessor can expand macros, substitute values, and perform other transformations that can significantly change the code structure. By examining the preprocessed output, developers can see how their code is transformed and check whether it is behaving as expected.

Additionally, examining the preprocessor output can help to diagnose errors and bugs in the code. By looking at the expanded code, developers can identify issues such as incorrect macro expansion, missing header files, or other problems that may arise during compilation.

Another reason to look at the preprocessor output is to tidy up the code. By examining the preprocessed code, developers can spot unnecessary header inclusions and macros that expand to redundant code. Trimming these mostly pays off in shorter compile times.

Overall, examining the preprocessor output in C++ can be a useful tool for understanding how the code is transformed before it is compiled, diagnosing errors and bugs, and trimming what the compiler has to process.

How?

So, sometimes it can be very useful to see what the preprocessor actually produces before handing the file over to the compiler. You can inspect the details and track down certain kinds of problems. There are a few ways to do this:

gcc -E my_class.cpp -o my_class.ii

or

cpp my_class.cpp > my_class.ii

Or if you work with qmake, you can use:

QMAKE_CXXFLAGS += -save-temps

In that case you will also get a *.ii file for each *.cpp file in your build folder. When we look inside such a file, we see a lot more than we wrote, for example linemarkers around the include statements:

# 1 "/usr/local/include/something.h" 1 3 4

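The first 1 is the line number, and the flags follow the file name: 1 marks the start of a new file, 3 means the text comes from a system header, and 4 means the text should be treated as wrapped in an implicit extern "C" block.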

This and what the other numbers mean is explained here: https://gcc.gnu.org/onlinedocs/cpp/Preprocessor-Output.html
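As a small illustration of why the expanded code is worth reading, here is a sketch with two hypothetical macros (the names SQUARE_UNSAFE and SQUARE are made up for this example). Running such a file through gcc -E would show the textual expansion noted in the comments, which makes the missing parentheses easy to spot:

```cpp
// square.cpp: a hypothetical file to run through `gcc -E square.cpp`.

// Unsafe: the argument is pasted in textually, without parentheses.
#define SQUARE_UNSAFE(x) x * x

// Safer: parenthesize the argument and the whole expression.
#define SQUARE(x) ((x) * (x))

int squareUnsafe(int a) {
    // Expands to: a + 1 * a + 1
    return SQUARE_UNSAFE(a + 1);
}

int squareSafe(int a) {
    // Expands to: ((a + 1) * (a + 1))
    return SQUARE(a + 1);
}
```

With a = 3, the unsafe version evaluates 3 + 1 * 3 + 1, which is 7, while the safe version yields 16.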

Happy Debugging!

Qt application on Windows not starting

When you click on an application's exe file and nothing happens, you might wonder what it does (if anything) and look for a way to debug what is happening. By default, the binary does not write any log output to the console. But there is a way to change that.

You can rebuild your Qt app using the console qmake flag in order to get some useful debugging output on the console.

CONFIG += console

either in your QtCreator Build Settings or as a command line argument to qmake. Then run the app exe from a Windows terminal such as PowerShell and you will see the same output you would see in QtCreator, including your qDebug outputs. You can also click on it in the explorer and it opens a standard terminal showing the console output.

This can be very useful, e.g. when debugging a sandbox created with windeployqt.

C++ operator[] mystic

Why do people use at() to get an item out of a std::map? Why don’t they use the [] operator like in other languages? The documentation on cppreference.com says something mystical like this:

“.. performing an insertion if such key does not already exist.” Why? Well, because it is meant to be used like this:

std::map<uint32_t, std::shared_ptr<PictureData>> pictures;
..
auto& picture = pictures[10];
if (!picture) {
    picture = std::make_shared<PictureData>("Picasso", "Las señoritas de Avignon", 1907);
}

If there is nothing stored under key 10, it inserts a default-constructed value. The mapped type (here std::shared_ptr<PictureData>, which defaults to a null pointer) has to be default constructible.

You can use at() instead. It throws an exception when the key does not exist.

at()

When using at(), a failed lookup throws a std::out_of_range exception: for std::map when the key does not exist, and for sequence containers such as std::vector when the index is out of bounds. This can help catch potential bugs and improve the safety of the code. operator[] behaves differently depending on the container: on a std::vector an out-of-bounds index is undefined behavior, which can lead to difficult-to-debug errors, while on a std::map a missing key is silently inserted.
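The difference on a std::map can be sketched as follows. The two helper functions are hypothetical and exist only to make the behavior observable:

```cpp
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>

// Returns the map size after looking up a missing key with operator[].
std::size_t sizeAfterBracketLookup() {
    std::map<int, std::string> names{{1, "one"}};
    names[2];            // key 2 is missing: inserts {2, ""}
    return names.size(); // the map has grown to 2 elements
}

// Returns true if at() throws std::out_of_range for a missing key.
bool atThrowsForMissingKey() {
    const std::map<int, std::string> names{{1, "one"}};
    try {
        (void)names.at(2); // at() also works on a const map
    } catch (const std::out_of_range&) {
        return true;
    }
    return false;
}
```

Note that at() can be called on a const map, whereas operator[] cannot, because it may have to insert.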

operator[]

On sequence containers such as std::vector, operator[] is slightly faster than at() because it does not perform bounds checking, but the difference is usually negligible and only matters in tight loops or with very large data structures. On a std::map both perform the same key lookup, so speed is rarely a reason to prefer one over the other.

Which one should I choose?

Choosing between operator[] and at() ultimately depends on the specific needs of the application. If safety and reliability are paramount, at() with its built-in checking is the better choice. If performance is a top priority and the code is carefully written to ensure that every index or key is valid, operator[] can provide a small speed advantage.

std::optional example

std::optional is a way of handling values in C++ where it is not clear if there actually will be a value: it represents an object that may have a value.

std::optional<std::string> value;

std::optional has been part of C++ since C++17 and improves the readability of code, as it explicitly expresses its intention.

How to use it

void updateProfile(uint64_t handle, std::optional<std::string> name) {
    if (name) {
        std::cout << name.value() << std::endl;
    }
}
updateProfile(12345, "Ernie");
// avoid making a copy:
std::string name{"Ernie"};
updateProfile(12345, 
    std::make_optional<std::string>(std::move(name)));

If you don’t want to call it with a name, you can do:

updateProfile(12345, {});

or

updateProfile(12345, std::nullopt);

Improved readability

There is no need for an extra if() or a ternary ?: expression: if name has no value, value_or() returns the alternative.

name.value_or("no name set");

The * and -> operators can be used to access the value, e.g.:

std::cout << *name << " " << name->size() << std::endl;
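Putting value_or() and the -> operator together, a minimal sketch could look like this. The functions greeting() and nameLength() are made up for illustration and assume a C++17 compiler:

```cpp
#include <cstddef>
#include <optional>
#include <string>

// Builds a greeting; falls back to a placeholder when no name is given.
std::string greeting(const std::optional<std::string>& name) {
    return "Hello, " + name.value_or("stranger");
}

// Returns the length of the name, or 0 when it is not set.
std::size_t nameLength(const std::optional<std::string>& name) {
    // -> must only be used after checking that a value is present.
    return name ? name->size() : 0;
}
```

Unlike value(), the * and -> operators do not check whether a value is present, so the check in nameLength() is what keeps the access safe.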

out parameter in C++

What is an out-parameter? An out-parameter is a non const reference (or pointer) passed to the function, which will then modify it by setting a value.

In C++, we usually pass arguments by reference to avoid copying the object, but what does that say about the behavior of the function taking these arguments?

Pass-by-value is clear: the arguments are inputs. Pass-by-reference arguments can be inputs, outputs or in-outs. This confuses the reader, who has to take extra steps to find out what the function does. These constructs are not self-documenting.

I still often see methods or functions that take one or several references in order to modify them. This is not intuitive and can lead to unexpected behavior in your C++ code.

If an object needs to be modified, a method on that object could be used instead. The modified object could also be returned, where it is clear that the return type is an ‘output’.

If you need to return several values, a std::tuple or std::pair can be used.
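To make the contrast concrete, here is a small sketch with two hypothetical functions that compute the same thing, once with out-parameters and once with a returned std::pair. Both assume a non-zero divisor:

```cpp
#include <utility>

// Out-parameter style: nothing at the call site shows that
// `quotient` and `remainder` are overwritten.
void divideOut(int dividend, int divisor, int& quotient, int& remainder) {
    quotient = dividend / divisor;
    remainder = dividend % divisor;
}

// Return-value style: inputs go in, outputs come out.
std::pair<int, int> divide(int dividend, int divisor) {
    return {dividend / divisor, dividend % divisor};
}
```

With C++17 structured bindings the call site stays readable: auto [quotient, remainder] = divide(17, 5); yields 3 and 2.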

std::atomic default value

std::atomic is a template that allows you to declare its contained value to be atomic: access to this value from different threads is well defined and does not lead to data races.

How to set values

std::atomic is not copyable and not movable.

Let's say we need an atomic value as a class member and want it to be set to a default value:

#include <atomic>

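class Foo {
    std::atomic<bool> _online = true; // intuitive, but fails to compile before C++17
};

Before C++17, that doesn’t compile, because copy-initialization needs the deleted copy constructor. Clang will tell us: "copying member subobject of type 'std::atomic_bool' (aka 'atomic<bool>') invokes deleted constructor." Since C++17, guaranteed copy elision makes this form valid, but brace initialization works with every standard from C++11 on.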

Use the initialization constructor

Default values can be assigned to atomic members using a brace initializer {} (C++11 brace-init):

std::atomic<bool> _online{true};

For the complete API, refer to https://www.cplusplus.com/reference/atomic/atomic/
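In the context of a whole class, this might look like the following sketch. The class Connection and its members are hypothetical and only serve to show the defaults in action:

```cpp
#include <atomic>

class Connection {
public:
    bool online() const { return _online.load(); }
    int retries() const { return _retries.load(); }

    void goOffline() { _online.store(false); }
    void countRetry() { ++_retries; } // atomic read-modify-write

private:
    std::atomic<bool> _online{true}; // default value via brace initialization
    std::atomic<int> _retries{0};
};
```

A freshly constructed Connection reports online() as true and retries() as 0, without any constructor having to be written.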

Aliases

And for those who want to improve readability a bit more, there are aliases we can use:

std::atomic_bool _online{true};

std::atomic_short
std::atomic_ushort
std::atomic_int
std::atomic_uint
std::atomic_long
std::atomic_ulong
std::atomic_llong
std::atomic_ullong
std::atomic_char8_t
std::atomic_char16_t
std::atomic_char32_t
std::atomic_wchar_t
std::atomic_int8_t
std::atomic_uint8_t
std::atomic_int16_t
std::atomic_uint16_t
std::atomic_int32_t
std::atomic_uint32_t
std::atomic_int64_t
std::atomic_uint64_t
std::atomic_int_least8_t
std::atomic_uint_least8_t
std::atomic_int_least16_t
std::atomic_uint_least16_t
std::atomic_int_least32_t
std::atomic_uint_least32_t
std::atomic_int_least64_t
std::atomic_uint_least64_t
std::atomic_int_fast8_t
std::atomic_uint_fast8_t
std::atomic_int_fast16_t
std::atomic_uint_fast16_t
std::atomic_int_fast32_t
std::atomic_uint_fast32_t
std::atomic_int_fast64_t
std::atomic_uint_fast64_t
std::atomic_intptr_t
std::atomic_uintptr_t
std::atomic_size_t
std::atomic_ptrdiff_t
std::atomic_intmax_t
std::atomic_uintmax_t

Tools

To detect data races you can use ThreadSanitizer, for example when you run your automated tests. It will tell you where unsynchronized accesses happen, so you know which data to protect.

clang++ -fsanitize=thread

And in cmake:

cmake -DCMAKE_CXX_FLAGS="-fsanitize=thread"

More about tsan here: https://clang.llvm.org/docs/ThreadSanitizer.html

pre-increment and post-increment in C++

Maybe you have wondered why there are 2 types of increment operators in C++ and why it matters in some situations.

Let's take a variable i of type int. Pre-increment is ++i, and post-increment is i++, as we learned in school.

Both increment the variable i by 1. The difference is the value of the expression itself, which shows when passing it along to a function:

void doSomething(int i) {
  std::cout << i << std::endl;
}

int i = 12;
doSomething(i++); // prints 12
// i is now 13
i = 12; // reset for the second example
doSomething(++i); // prints 13
// i is now 13

The explanation

Pre-increment increments the value of i and returns a reference to the now increased i.

Post-increment makes a copy of i, then increments i, but returns the copy.
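For a user-defined type, the two operators are written as separate overloads, which makes the extra copy visible. The class Counter below is a made-up example, not taken from a library:

```cpp
class Counter {
public:
    explicit Counter(int value) : _value(value) {}

    // Pre-increment: increment, then return a reference to *this.
    Counter& operator++() {
        ++_value;
        return *this;
    }

    // Post-increment: the unused int parameter marks this overload.
    Counter operator++(int) {
        Counter previous = *this; // copy the old state
        ++_value;
        return previous;          // return the copy
    }

    int value() const { return _value; }

private:
    int _value;
};
```

For an int the compiler will typically optimize the copy away, but for heavier types such as iterators, preferring ++i when the old value is not needed avoids creating the temporary.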

std::map and the 2 evils

The task: store pairs of keys and objects in a map. Sounds simple? Well. std::map offers several ways of accessing its members:

operator[key]
at(key)
find(key)

That all sounds good as long as the key exists in the map. If not, it can get tricky: operator[] inserts the key with a default-constructed element, which is often undesired, and it cannot be used on a const map.

And the at() method? It throws an exception. There is a chance you want neither of them.

What we still can try is getting an iterator using find():

auto it = myMap.find(9);

The iterator needs to be checked to not be end(), in case the element doesn’t exist in the map.

The value type of a map is std::pair<const Key, T>, so we can get the key and element from the iterator like so:

auto key = it->first;
auto value = it->second;
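Combining find() with the end() check, a lookup that neither inserts nor throws could be sketched like this. The helper lookup() is hypothetical and assumes C++17 for std::optional:

```cpp
#include <map>
#include <optional>
#include <string>

// Looks up `key` without inserting and without throwing.
std::optional<std::string> lookup(const std::map<int, std::string>& values, int key) {
    const auto it = values.find(key);
    if (it == values.end()) {
        return std::nullopt; // key not present
    }
    return it->second;       // it->first is the key, it->second the element
}
```

Because find() is a const member function, this also works on a const map, and the map is left unchanged when the key is missing.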

async unit tests in C++

Sometimes we are in the situation where we need to test the result of a callback that is invoked on another thread and we don’t know when this will happen.

The problem? Our test already runs out of scope before the callback gets called. In order to test such a scenario we’ll need the following:

Let's assume we have a downloader that takes a while to download something, and (hopefully) reports true when it was successful:

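class Downloader {
public:
    // Starts the download on a background thread and returns immediately.
    void asyncDownload(std::string url, std::function<void(bool)> callback) {
        _downloadThread = std::thread(&Downloader::download, this,
                                      std::move(url), std::move(callback));
    }

    // Joining here (instead of right after starting) keeps the call asynchronous.
    ~Downloader() {
        if (_downloadThread.joinable())
            _downloadThread.join();
    }

    void download(std::string url, std::function<void(bool)> callback) {

        // perform download which takes a while
        ...
        if (everythingWentWell)
            callback(true);
    }

private:
    std::thread _downloadThread;
};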

We could start writing our test case like this:

TEST(Downloader, testDownload) {
    auto callback = [](bool success) {
        EXPECT_TRUE(success);
    };
    Downloader downloader;
    downloader.asyncDownload("http://foo.bar/interesting-file.jpg", std::move(callback));
}

This will leave us in a situation where the test case finishes while the Downloader is still downloading. What we need now is a way for the test case to wait until the callback is invoked, with a maximum timeout.

The way to go here is a promise with a 30s max timeout:

std::promise<void> waitGuard;
...
    EXPECT_EQ(std::future_status::ready, waitGuard.get_future().wait_for(std::chrono::seconds(30)));

When the callback gets invoked, we inform the wait guard by setting a value. The test case now looks like this:

TEST(Downloader, testDownload) {
    std::promise<void> waitGuard;
    auto callback = [&](bool success) {
        // remember this gets called on the other thread
        EXPECT_TRUE(success);
        waitGuard.set_value();
    };
    Downloader downloader;
    downloader.asyncDownload("http://foo.bar/interesting-file.jpg", std::move(callback));
    EXPECT_EQ(std::future_status::ready, waitGuard.get_future().wait_for(std::chrono::seconds(30)));
}

Now, the test will wait until the download finishes and the callback gets called, but not more than 30s.
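The waiting pattern can also be tried without a test framework. The sketch below uses only the standard library. runLater() is a hypothetical stand-in for the asynchronous API, and the 50 ms delay is an arbitrary assumption:

```cpp
#include <chrono>
#include <functional>
#include <future>
#include <thread>
#include <utility>

// Hypothetical stand-in for an asynchronous API: invokes the callback
// on another thread after a short delay.
std::thread runLater(std::function<void(bool)> callback) {
    return std::thread([callback = std::move(callback)] {
        std::this_thread::sleep_for(std::chrono::milliseconds(50));
        callback(true);
    });
}

// Waits for the callback, but gives up after `timeout`.
std::future_status waitForCallback(std::chrono::milliseconds timeout) {
    std::promise<void> waitGuard;
    std::future<void> done = waitGuard.get_future();
    std::thread worker = runLater([&waitGuard](bool) { waitGuard.set_value(); });
    const std::future_status status = done.wait_for(timeout);
    worker.join(); // the promise must outlive the worker thread
    return status;
}
```

The join at the end matters: the callback captures waitGuard by reference, so the promise has to stay alive until the other thread is done with it, even when the wait timed out.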

std::expected

std::expected is a way of handling errors in C++. It presents an object which either has an expected value, or an error in case something unexpected happens.

std::expected<int, std::string>

std::expected started out as a proposal and has been part of the standard library since C++23 (header <expected>). With an older standard, you have to pick one of the existing implementations yourself.

In my example I chose the single-header implementation expected-lite. Installing it is very easy, you can just put the header next to your main.cpp, or as I did it, in a nonstd subdirectory.

And here is how to use it

In this simple example, we have a method doSomething() and we expect it to return a double if everything goes well.

In case of an error, we don’t really expect the return value to be correct, it might be corrupted or otherwise wrong. Instead, we’d like to know what happened. Correct? In that case we actually expect an error.

This behaviour is exactly what std::expected does. Have a look:

#include <iostream>
#include <string>
#include "nonstd/expected.hpp"

nonstd::expected<double, std::string> doSomething() {
  // here we do something ..
  double frequency = calculateFrequency();
  // and try to detect an error:
  if (!frequency) {
    return nonstd::make_unexpected("Could not calculate frequency");
  }
  return {frequency};
}
..
auto calculatedFrequency = doSomething();
if (calculatedFrequency) {
  // only dereference when a value is present
  std::cout << *calculatedFrequency << std::endl;
}

We wrap our return type into expected<> together with a string, which is our type for the error. In case of an error, call make_unexpected() with the error message.

Otherwise use the { value } syntax to pack the value if everything is fine.

It also works well with exceptions, where you can add the error message right from the exception:

nonstd::expected<int, std::string> queryDatabase() {
  try {
    return callIntoDatabase();
  } catch (const std::exception& exception) {
    return nonstd::make_unexpected(exception.what());
  }
}
..
auto record = queryDatabase();
if (!record) {
  std::cout << record.error() << std::endl;
}
