Valgrind is an instrumentation framework for building dynamic analysis tools that check C and C++ programs for errors. Memcheck is the default tool Valgrind uses when you don't ask it for another tool. Valgrind Memcheck can detect various memory leaks and keep track of whether memory is accessible and defined. But what if you have built your own memory manager? Memcheck keeps track of memory by observing the standard malloc/free and new/delete functions and the mmap/munmap system calls. But Memcheck doesn't know how a program subdivides that memory internally without a little help, and this article will show you how to provide that help through specialized code annotations.

Replacing malloc

If you simply replace the whole GNU C malloc implementation by defining your own functions with the same names (for example, by using tcmalloc or jemalloc), Valgrind will, since version 3.12.0, intercept all your replacement functions (such as malloc and free) unless you tell it not to. (See the --soname-synonyms option in the Valgrind manual for more on this.) Because your functions have these standard names, Memcheck can provide all its normal memory tracking, just as if you were using the system's malloc implementation.

The RUNNING_ON_VALGRIND macro

If you wrote your own allocator for more specialized use, Valgrind has a way for you to annotate your code so that tools such as Memcheck can keep track of the memory blocks you hand out to the rest of your program.

Make sure you have the valgrind-devel package installed to get access to this behavior. That package provides the /usr/include/valgrind/valgrind.h file, which defines some basic macros that annotate your code using instructions that look like they don't do anything. These instructions add near-zero overhead in normal use; but when run under Valgrind, they are recognized as "magic sequences" that instruct Valgrind to do something special at that place in the code.

The simplest macro is RUNNING_ON_VALGRIND, which is 0 if running natively and 1 when running under Valgrind. If you have defined your own allocation and deallocation functions, you could use the macro as follows:

#include <valgrind.h>
#include <stdlib.h>

/* Some global structures for the real allocator.  */

void *
my_alloc (size_t size)
{
  if (RUNNING_ON_VALGRIND)
    return malloc (size);

  void *p = NULL;
  /* The real allocator ... */
  return p;
}

void
my_free (void *ptr)
{
  if (RUNNING_ON_VALGRIND)
    {
      free (ptr);
      /* Don't fall through to the real deallocator.  */
      return;
    }

  /* The real deallocator ... */
}

Compile this code with:

$ gcc -I/usr/include/valgrind -O2 -g -c my_alloc.c

Now, when running under Valgrind (and only when running under Valgrind), your allocation functions can simply use malloc and free, and Memcheck can track all memory usage as normal.

This might look like cheating. You put a lot of work into your own allocator, which you believe to be way more efficient for your application than the GNU C library allocator. And now when running under Valgrind, all that work is thrown away by simply calling the malloc and free functions you were trying to avoid.

But if you are running under Valgrind, you don't do it for efficiency: You do it to catch memory issues. And now Memcheck can report memory leaks, use-after-free errors, undefined memory use, buffer overruns, and more for all blocks allocated through my_alloc and my_free without you having to add any extra instrumentation to your own allocator.

Tracking with BLOCK macros

If you have your own library for managing memory with unique function names, there is a different way to make Valgrind track memory while your own allocator hands out and retrieves blocks: the VALGRIND_MALLOCLIKE_BLOCK and VALGRIND_FREELIKE_BLOCK macros.

Like other valgrind.h macros, these do nothing when your program is not running under Valgrind. But when run under Valgrind, they let Memcheck track memory usage from your custom allocator.

VALGRIND_MALLOCLIKE_BLOCK takes four arguments: the starting address of the block, the size of the block in bytes, the size of the redzone (the padding around the block) in bytes, and whether the block's contents have been initialized (with zero or a known pattern). The redzone can be eliminated by designating its size as zero; but if your allocator can reserve bytes before and after the block (Memcheck by default uses 16 bytes when overriding malloc), those extra bytes help Valgrind detect reads and writes that overrun or underrun memory.

VALGRIND_FREELIKE_BLOCK takes two arguments: the starting address of the block and any redzone bytes added as padding.

#include <valgrind.h>
#include <stddef.h>

/* Some global structures for the real allocator.  */

void *
my_alloc (size_t size)
{
  void *p = NULL;
  /* The allocator ... */

  VALGRIND_MALLOCLIKE_BLOCK (p, size, 0, 0);
  return p;
}

void
my_free (void *ptr)
{
  VALGRIND_FREELIKE_BLOCK (ptr, 0);
  /* The deallocator ... */
}

Use VALGRIND_FREELIKE_BLOCK as early as possible in your deallocation function and VALGRIND_MALLOCLIKE_BLOCK as late as possible in your allocation function. This practice allows Memcheck to provide the most accurate warnings when it detects an issue. If those macros are embedded deeper in the allocator code, stack traces will include parts of your allocator internals, which can confuse users.
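
To make the redzone argument more concrete, here is a minimal sketch of a fixed-size pool that reserves padding on both sides of every block and reports it to Memcheck. The names pool_alloc and pool_free, the sizes, and the static arena are all made up for illustration. The no-op fallback macros are an assumption added so that the sketch also builds on machines without the Valgrind headers. The sketch includes the header as <valgrind/valgrind.h>, so it needs no extra -I option.

```c
#include <assert.h>
#include <stddef.h>

/* Use the real macros when the Valgrind header is installed.  The no-op
   stand-ins below are an assumption so the sketch builds anywhere.  */
#if defined (__has_include)
# if __has_include (<valgrind/valgrind.h>)
#  include <valgrind/valgrind.h>
#  define HAVE_VALGRIND_H 1
# endif
#endif
#ifndef HAVE_VALGRIND_H
# define VALGRIND_MALLOCLIKE_BLOCK(addr, size, rz, zeroed) do { } while (0)
# define VALGRIND_FREELIKE_BLOCK(addr, rz) do { } while (0)
#endif

/* Hypothetical layout: each slot is redzone + payload + redzone.  */
#define REDZONE 16
#define SLOT_PAYLOAD 64
#define SLOT_SIZE (REDZONE + SLOT_PAYLOAD + REDZONE)
#define SLOT_COUNT 8

static _Alignas (16) unsigned char arena[SLOT_COUNT * SLOT_SIZE];
static unsigned char slot_used[SLOT_COUNT];

void *
pool_alloc (size_t size)
{
  if (size == 0 || size > SLOT_PAYLOAD)
    return NULL;

  for (size_t i = 0; i < SLOT_COUNT; i++)
    if (!slot_used[i])
      {
        slot_used[i] = 1;
        /* The payload starts right after the leading redzone.  */
        void *p = arena + i * SLOT_SIZE + REDZONE;
        /* Report the block as late as possible, with its padding.  */
        VALGRIND_MALLOCLIKE_BLOCK (p, size, REDZONE, 0);
        return p;
      }

  return NULL;
}

void
pool_free (void *ptr)
{
  if (ptr == NULL)
    return;

  /* Report the free as early as possible, with the same padding.  */
  VALGRIND_FREELIKE_BLOCK (ptr, REDZONE);

  size_t offset = (size_t) ((unsigned char *) ptr - arena);
  assert (offset % SLOT_SIZE == REDZONE);
  slot_used[offset / SLOT_SIZE] = 0;
}
```

With a non-zero redzone, a write one byte past a 32-byte block returned by pool_alloc lands in the padding, which Memcheck should report as an invalid write instead of letting it silently touch the neighboring block.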

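Another macro, VALGRIND_RESIZEINPLACE_BLOCK, plays a role similar to the realloc function for a block that stays at the same address: it tells Memcheck that the block has grown or shrunk in place.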
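
As a rough illustration, the sketch below lets the most recently allocated block of a simple bump allocator grow or shrink without moving, and reports the change with VALGRIND_RESIZEINPLACE_BLOCK. That macro takes the block's address, its old size, its new size, and the redzone size. The names bump_alloc and bump_resize, the arena size, and the 16-byte rounding are hypothetical, and the no-op fallback macros are an assumption so that the sketch also builds without the Valgrind headers.

```c
#include <assert.h>
#include <stddef.h>

/* Use the real macros when the Valgrind header is installed.  The no-op
   stand-ins below are an assumption so the sketch builds anywhere.  */
#if defined (__has_include)
# if __has_include (<valgrind/valgrind.h>)
#  include <valgrind/valgrind.h>
#  define HAVE_VALGRIND_H 1
# endif
#endif
#ifndef HAVE_VALGRIND_H
# define VALGRIND_MALLOCLIKE_BLOCK(addr, size, rz, zeroed) do { } while (0)
# define VALGRIND_RESIZEINPLACE_BLOCK(addr, old_size, new_size, rz) \
  do { } while (0)
#endif

#define ARENA_SIZE 4096
#define ALIGN_UP(n) (((n) + 15) & ~(size_t) 15)

static _Alignas (16) unsigned char arena[ARENA_SIZE];
static size_t arena_top;            /* Offset of the first unused byte.  */
static unsigned char *last_block;   /* Most recent allocation.  */
static size_t last_size;            /* Size reported for last_block.  */

void *
bump_alloc (size_t size)
{
  if (size == 0 || size > ARENA_SIZE - arena_top)
    return NULL;

  last_block = arena + arena_top;
  last_size = size;
  arena_top += ALIGN_UP (size);
  assert (arena_top <= ARENA_SIZE);

  VALGRIND_MALLOCLIKE_BLOCK (last_block, size, 0, 0);
  return last_block;
}

/* Resize the most recent block without moving it.  Returns 1 on
   success and 0 when the block cannot be resized in place.  */
int
bump_resize (void *ptr, size_t new_size)
{
  if (ptr == NULL || ptr != last_block || new_size == 0)
    return 0;

  size_t start = (size_t) (last_block - arena);
  if (new_size > ARENA_SIZE - start)
    return 0;

  /* Same address, new size: tell Memcheck about the change.  */
  VALGRIND_RESIZEINPLACE_BLOCK (ptr, last_size, new_size, 0);
  last_size = new_size;
  arena_top = start + ALIGN_UP (new_size);
  return 1;
}
```

Only the last block can be resized here because nothing sits behind it in the arena. A real allocator that has to move the block would instead report the old block as freed and the new one as allocated.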

Beyond blocks: Accessible and defined memory

The Valgrind BLOCK macros in the previous section are all you need for a simple allocator that doesn't itself need to use (that is, to read or write) any of the memory blocks.

But in most cases, you need other annotations to tell Memcheck when the allocator itself is manipulating addresses inside these blocks. These more complex macros are defined in /usr/include/valgrind/memcheck.h.

The BLOCK macros treat the memory areas as either accessible or inaccessible, and mark them as needing to be tracked for memory leaks. And that goes for the whole application, including the allocator parts, which might need special access.

For example, if your allocator wants to clear the contents of a memory block before returning it, or if you embed some allocator meta-information in the memory block's redzone, you have to tell Valgrind you are allowed to write to that memory block even if it isn't fully defined yet (or if it has been freed before and is now being reused). Use VALGRIND_MAKE_MEM_DEFINED, VALGRIND_MAKE_MEM_UNDEFINED, and VALGRIND_MAKE_MEM_NOACCESS for these tasks.

All three MAKE_MEM macros take two arguments: a starting address and a size in bytes. The VALGRIND_MAKE_MEM_DEFINED and VALGRIND_MAKE_MEM_UNDEFINED macros mark the address range as accessible and the contents as defined or undefined, respectively.

VALGRIND_MAKE_MEM_NOACCESS marks the address range as inaccessible. But unlike the BLOCK macros, this one doesn't ask Valgrind to track the areas for memory leaks. The allocator can therefore open up a memory area while it works on it and then close the area off again, so that Memcheck reports any access to it by the rest of your application. (Do remember to flip the blocks you hand out back to being accessible before leaving the allocator code.)
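
One place where this comes up is an allocator that keeps its free list inside the freed blocks themselves. The sketch below assumes, as described above, that a block reported with VALGRIND_FREELIKE_BLOCK is inaccessible afterward, so the allocator opens up the link field before touching it and closes it again when done. The names chunk_alloc and chunk_free, the union layout, and the sizes are hypothetical, and the no-op fallback macros are an assumption so that the sketch also builds without the Valgrind headers.

```c
#include <assert.h>
#include <stddef.h>

/* Use the real macros when the Memcheck header is installed.  The no-op
   stand-ins below are an assumption so the sketch builds anywhere.  */
#if defined (__has_include)
# if __has_include (<valgrind/memcheck.h>)
#  include <valgrind/memcheck.h>
#  define HAVE_MEMCHECK_H 1
# endif
#endif
#ifndef HAVE_MEMCHECK_H
# define VALGRIND_MALLOCLIKE_BLOCK(addr, size, rz, zeroed) do { } while (0)
# define VALGRIND_FREELIKE_BLOCK(addr, rz) do { } while (0)
# define VALGRIND_MAKE_MEM_NOACCESS(addr, len) (0)
# define VALGRIND_MAKE_MEM_UNDEFINED(addr, len) (0)
# define VALGRIND_MAKE_MEM_DEFINED(addr, len) (0)
#endif

#define CHUNK_SIZE 64
#define CHUNK_COUNT 4

union chunk
{
  union chunk *next;                  /* Link, valid only while free.  */
  unsigned char payload[CHUNK_SIZE];  /* What the application sees.  */
};

static union chunk arena[CHUNK_COUNT];
static union chunk *free_list;   /* Freed chunks, most recent first.  */
static size_t fresh;             /* Chunks never handed out so far.  */

void *
chunk_alloc (void)
{
  union chunk *c;

  if (free_list != NULL)
    {
      /* The link was closed off in chunk_free.  It holds a value the
         allocator wrote, so reopen it as defined before reading.  */
      c = free_list;
      (void) VALGRIND_MAKE_MEM_DEFINED (c, sizeof c->next);
      free_list = c->next;
    }
  else if (fresh < CHUNK_COUNT)
    c = &arena[fresh++];
  else
    return NULL;

  VALGRIND_MALLOCLIKE_BLOCK (c, CHUNK_SIZE, 0, 0);
  return c;
}

void
chunk_free (void *ptr)
{
  if (ptr == NULL)
    return;
  VALGRIND_FREELIKE_BLOCK (ptr, 0);

  union chunk *c = ptr;
  assert (c >= arena && c < arena + CHUNK_COUNT);

  /* Open the link field, store the link, then close it again so the
     application cannot touch the freed chunk.  */
  (void) VALGRIND_MAKE_MEM_UNDEFINED (c, sizeof c->next);
  c->next = free_list;
  (void) VALGRIND_MAKE_MEM_NOACCESS (c, sizeof c->next);
  free_list = c;
}
```

The link is reopened with VALGRIND_MAKE_MEM_DEFINED rather than VALGRIND_MAKE_MEM_UNDEFINED on the allocation path because the allocator is about to read a pointer it stored earlier. Marking it undefined would make Memcheck complain about the allocator's own bookkeeping.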

These macros are also useful when you reuse data structures, or if a data structure is allocated using calloc or cleared with memset, but you want to mark parts of the structure or the whole structure as having undefined values that need to be tracked by Memcheck:

#include <string.h>
#include <memcheck.h>

/* The raw message as mapped in by frob ()  */
struct rawmsg
{
  int frob_count;
  long frob_flags;
  unsigned char frobs[128];
  /* lots of other fields...  */
};

struct message
{
  int seq_nr;
  struct rawmsg msg;
};

/* Cleans up the given message and returns a pointer to the next one.
   Don't forget to frob() the message before accessing the fields. */
struct message*
next_message (struct message *m)
{
  int next_seq_nr = m->seq_nr + 1;

  /* clean up the old message... */
  memset (m, 0, sizeof (struct message));
  m->seq_nr = next_seq_nr;

  /* Except for the sequence number, everything else is undefined till
     frob () is used on the message.  */
  VALGRIND_MAKE_MEM_UNDEFINED (&m->msg, sizeof (struct rawmsg));
 
  return m;
}

In this example, a message data structure is passed around that gets "refreshed" from time to time. The next_message function clears various fields under the assumption that those fields will get "real" values later. Normally Memcheck assumes that all fields have defined values, because it sees the memset function putting values into the message structure.

But the VALGRIND_MAKE_MEM_UNDEFINED macro lets Valgrind know that the values in those fields aren't really defined. Now Memcheck will warn when any field value is used before being properly set first. (Remember, the macro does nothing when not running under Valgrind.)

Learn more about Valgrind Memcheck

This article showed you how you can annotate your code so that you can use Valgrind Memcheck even when you've built your own memory manager. To learn more about the Valgrind client requests and the macros specific to Memcheck, see the Valgrind manual chapters on the client request mechanism and Memcheck client requests. And for more from Red Hat Developer on Valgrind's capabilities, check out the following articles: