Tracing Code in C++

Josh Weinstein
Published in The Startup
7 min read · Mar 24, 2020

Credits to WoodWorkCareer

One difficult part of languages like C and C++ is the amount of freedom given to the developer. Nothing, or close to nothing, is hidden or masked away. You have full control over memory, threading, processes, file descriptors, and much more. You can even write new operating system kernels in C++.

Yet with more freedom comes less support. That freedom is achieved by leaving out built-in or prepackaged code that handles tasks like memory allocation and reallocation. There’s no universal, native way to track the use of new and delete, for example. As projects grow larger, misuses of memory become harder to detect. Some C++ code bases have multiple processes, multiple threads, and multiple data structures all allocating and freeing memory.

There are two ways of tracking memory usage in C++. One is heap profiling, which allows inspection of how the heap of the process grows or shrinks over time. This is a dynamic analysis method: it happens at run-time. The other is tracing, also called debug messaging. This approach embeds messages in the code that can be printed to a console or written to a file, giving information on how the code runs. Tracing happens at run-time, but it is configured and structured at compile time. In this article, I will explain the benefits of tracing.

The Fundamentals

The goal of tracing code in C++ is to write code that can emit useful information at run-time. However, this should be done as implicitly as possible: tracing one function should look no different from tracing any other. When done correctly, tracing allows code to be written as if it were never intended to be traced, while keeping the added benefit available if debugging is required.

However, there are limits to what can be traced in C++. Despite being a low-level language, it does not give complete access to everything. You also have to think carefully about whether the behavior or events you’re trying to trace involve a call that can be intercepted at all. A great example is trying to track when integers are added with the + operator. Let’s look at the following code:
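The original listing is not reproduced here, so what follows is only a minimal sketch of what such an attempt might look like. A plain `int operator+(int, int)` is rejected by the compiler outright, so this sketch assumes a template version, which compiles but is never selected for built-in types. The names `plus_trace_count` and `add_ints` are hypothetical and exist only for illustration.

```cpp
#include <cstdio>

// Hypothetical counter: records how often the traced operator actually runs.
int plus_trace_count = 0;

// Attempted "tracing" overload of +. It compiles as a template, but the
// compiler only considers user-defined operators when at least one operand
// has class or enumeration type.
template <typename T>
T operator+(const T& a, const T& b) {
    ++plus_trace_count;
    std::printf("operator+ called\n");
    (void)b;   // unused in this sketch
    return a;  // placeholder result; never reached for int operands
}

// Adding two ints resolves to the built-in addition, not the template above.
int add_ints(int a, int b) {
    return a + b;
}
```

Calling `add_ints(3, 4)` returns 7 without printing anything, and `plus_trace_count` stays at zero, since the built-in operator is used.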

The above attempts to overload the + operator and print a message whenever it is used. However, this program prints nothing. It does not work as intended at all. Why? Two reasons.

  1. Adding two integers simply translates to assembly instructions; it doesn’t use a function call or a new stack frame.
  2. You cannot overload infix operators for primitive types such as int in C++. At least one operand must be a class or enumeration type.

Tracing can still offer immense benefits; you just need to use it correctly. Next, let’s trace the most fundamental building block of C++, or of any programming language: functions.

Functions

The execution of most programs, regardless of programming language, can be traced through function calls. In C++, every program or running process begins with the main() function. The main() function makes calls to other functions, which eventually leads to the exit or termination of the process. A function could be traced by simply writing a print statement in its body:
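The original listing is missing here, so this is a minimal sketch of an explicit trace statement, assuming a simple `add` function chosen for illustration.

```cpp
#include <cstdio>

int add(int a, int b) {
    // Explicit trace statement: __func__ holds the name of the
    // enclosing function as a string ("add" here).
    std::printf("Called %s\n", __func__);
    return a + b;
}
```

Every call to `add` now prints one line before returning the sum.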

Note: __func__ is standard in C99 and C++11, but older compilers may not support it. Compiler-specific identifiers such as __FUNCTION__ are sometimes used instead.

This works, but not that well. Explicit print statements are not only tedious to write, they also make code harder to read and understand. Telling true error output apart from tracing output, for example, could prove difficult with explicit statements. Tracing a function shouldn’t make the body of the function look much different.

To trace without polluting function bodies, macros are needed. Macros are a good remedy for this scenario because they can hide statements from the source code. All the tracing statements are expanded by the preprocessor before compilation, so the code looks clean from the developer’s point of view. Consider the following:
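The original macro is not reproduced here. One way such a macro might be written, as a sketch only, is to have TRACED emit the function's name, parameter list, opening brace, and trace statement, leaving the return type and the rest of the body to the caller. That layout is an assumption, not necessarily the author's exact definition.

```cpp
#include <cstdio>

// Sketch of a TRACED macro (assumed layout): expands to the function's
// name, its parameter list, the opening brace, and a trace statement.
// The caller supplies the return type, the rest of the body, and the
// closing brace.
#define TRACED(name, ...) \
    name(__VA_ARGS__) {   \
        std::printf("Called %s\n", #name);

int TRACED(add, int a, int b)
    return a + b;
}
```

After preprocessing, `add` is an ordinary function whose first statement prints `Called add`, while the source shows only the arithmetic.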

Here, the macro TRACED takes the desired name of a function and its variable-length list of parameters. __VA_ARGS__ is a special identifier introduced in C99 (and adopted by C++11) that stands for a macro's variable arguments, similar in spirit to the variable arguments handled by the stdarg.h header of the C standard library. TRACED defines a function so that its first statement is always a printf of its own name, reporting that it is currently being called. This lets the actual tracing statement stay hidden from the source code. The # operator stringifies the name of the function, turning it into a string literal that can be printed.

Tracing is not intended to be used in all builds or runs of a program. In fact, it’s only needed when debugging or metrics are required beyond what is observable from the typical output of a program. So far, the approaches to tracing would have to be manually rewritten if tracing needed to be turned off. To change that, the #ifdef and #else preprocessor directives can be used to control whether the TRACED macro expands to a trace statement or not.
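As a sketch of that idea, the macro can be given two definitions selected by a WITH_TRACE symbol. The macro layout is the same assumed one as before, where TRACED opens the function body.

```cpp
#include <cstdio>

#ifdef WITH_TRACE
// Traced build: the function body starts with a trace statement.
#define TRACED(name, ...) \
    name(__VA_ARGS__) {   \
        std::printf("Called %s\n", #name);
#else
// Regular build: only the signature and the opening brace are emitted.
#define TRACED(name, ...) \
    name(__VA_ARGS__) {
#endif

int TRACED(add, int a, int b)
    return a + b;
}
```

The function returns the same result in both configurations; only the traced build prints anything.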

During compilation, if the -DWITH_TRACE flag is passed to the compiler, tracing will be enabled. Otherwise, the functions defined with TRACED will execute as normal. This allows two types of builds to be produced: a traced debugging build and a release build.

Counting Calls

Let’s say that simply printing every time a function is called is not sufficient for the desired level of debugging. There might be a program that calls several functions hundreds of thousands of times. You might want insight into how many times particular functions are called. While it’s possible to save the output of tracing to a file, then write a script to search and count how many times particular lines appear, that’s a lot of extra work.

Instead, the TRACED macro can be extended to track the number of times a particular function is called through a static counter. A static local variable in C++ is visible only inside its function, yet lives for the duration of the program. So a hidden static variable inside a function does not risk colliding with names outside it, and can keep track of the call count.
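A minimal sketch of such a counting macro follows, using the same assumed TRACED layout as before. The `_call_count` suffix is a hypothetical naming choice.

```cpp
#include <cstdio>

// Sketch: TRACED now also declares a static local counter whose name is
// built from the function name with ## (the suffix is an assumption).
#define TRACED(name, ...)                          \
    name(__VA_ARGS__) {                            \
        static unsigned long name##_call_count = 0; \
        ++name##_call_count;                       \
        std::printf("Called %s %lu times\n", #name, name##_call_count);

int TRACED(add, int a, int b)
    return a + b;
}
```

The increment is not atomic, so in a multithreaded program the counter would need a `std::atomic` or similar protection to stay accurate.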

This would lead to the following output:

Called add 1 times

Called add 2 times

In this example, the ## token-pasting operator is used to create a function-specific variable name, lessening the likelihood that another variable within the function body collides with it.

Memory Tracing

A common application of tracing in C++ is monitoring memory allocation. Monitoring memory allocation provides insight into how a particular program allocates and frees the memory it uses. This information can reveal issues such as a memory leak or memory growing too quickly.

In C++, new and delete are the most commonly used forms of memory allocation and freeing. They are both operators, rather than functions, and work differently than malloc and free. The new and delete operators can be overloaded, like so:
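The original listing is missing here. A minimal sketch of replacing the global operators might look like the following, assuming malloc and free as the underlying allocator and printf for output (printf is used because it does not itself go through operator new).

```cpp
#include <cstdio>
#include <cstdlib>
#include <new>

// Replacement for the global operator new: allocate with malloc and
// print the requested size and the resulting address.
void* operator new(std::size_t size) {
    if (size == 0) size = 1;  // operator new must return a unique non-null pointer
    void* ptr = std::malloc(size);
    if (ptr == nullptr) throw std::bad_alloc();
    std::printf("new: %zu bytes at %p\n", size, ptr);
    return ptr;
}

// Replacement for the global operator delete: print the address, then free it.
void operator delete(void* ptr) noexcept {
    std::printf("delete: %p\n", ptr);
    std::free(ptr);
}
```

With these definitions linked into a program, an expression such as `new int(5)` would normally go through the traced operator, although an optimizing compiler is allowed to elide some allocations entirely.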

This will print the size of every block of memory that is allocated, as well as the returned pointer’s address. It will also print the address of every deleted block of memory. However, globally overloading new and delete has some drawbacks.

Solving a memory problem or leak requires knowing not only that memory is not being freed, but also which part of the code is failing to free it. Global overloading does not point to the specific functions or classes responsible. Secondly, some compilers may declare the new or delete operators differently, leading to warnings about mixing explicit and implicit declarations.

To circumvent those disadvantages, both new and delete can be overloaded on a per-class basis. Class-specific operator new and operator delete are implicitly static, so they cannot read the instance’s members, but they do make it possible to tell which type the memory is being allocated or freed for. Lastly, tracing memory allocation on a per-class basis allows customized behavior for each class. One may or may not want to record memory usage for different classes or data structures.

Let’s look at an example of this with a very basic class, which only has a single property:
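The original class definition is not reproduced here, so this is a minimal sketch of what a class like this might look like. The member name `name_` and the accessor `name()` are hypothetical.

```cpp
#include <cstdio>
#include <cstdlib>
#include <new>

class A {
public:
    explicit A(const char* name) : name_(name) {}

    const char* name() const { return name_; }

    // Class-specific allocation: used only for `new A(...)`.
    static void* operator new(std::size_t size) {
        void* ptr = std::malloc(size);
        if (ptr == nullptr) throw std::bad_alloc();
        std::printf("A::operator new: %zu bytes at %p\n", size, ptr);
        return ptr;
    }

    // Class-specific deallocation: used only for `delete` on an A*.
    static void operator delete(void* ptr) noexcept {
        std::printf("A::operator delete: %p\n", ptr);
        std::free(ptr);
    }

private:
    const char* name_;  // points at a string literal; not owned by A
};
```

Only allocations of `A` are reported, so the trace output already says which type is responsible for each block.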

The above class, A, is constructed with a C-string literal. A’s new and delete operators are overloaded using malloc and free, and print related status information such as the size of the memory being allocated. The overloads could call the global new and delete instead, but those would behave differently.

new is not guaranteed to use the same memory allocation scheme as malloc. In fact, calling delete on a void pointer leads to a compiler warning on most platforms. That’s because new and delete are concerned with both the type and the size of the memory, while the C library memory functions are concerned only with size.

Overall, tracing is a powerful concept that can greatly enhance the safety and durability of C++ code bases. It gives developers more insight into the run-time behavior of their programs and stronger triage skills for memory leak situations.
