gnu logoOne feedback I got from my blog post on Understanding malloc behavior using Systemtap userspace probes was that I should have included an example script to explain how this works. Well, better late than never, so here's an example script. This script prints some diagnostic information during a program run and also logs some information to print out a summary at the end. I'll go through the script a few related probes at a time.

global sbrk, waits, arenalist, mmap_threshold = 131072, heaplist

First, we declare some globals:

  • We record the process heap size in the sbrk variable
  • We count the number of times a thread had to wait for an arena lock in the waits variable
  • We record an association of threads and arena address spaces in the arenalist variable
  • We record the mmap threshold in the suitably named mmap_threshold variable. We initialize the value to the known initial value of 128K
  • We record the list of heaps in the heaplist variable. The address of the heap is the key to this associative array and the size is the value

Now, whenever the process heap is increased or decreased, we record the increment or decrement.

probe process("/lib64/libc.so.6").mark("memory_sbrk_more")
{
    sbrk += $arg2
}

probe process("/lib64/libc.so.6").mark("memory_sbrk_less")
{
    sbrk -= $arg2
}

Next, we log every time there is a change in the dynamic mmap and trim thresholds since those values affect how allocations are done. One could extend this by adding a probe on various allocation functions (malloc, calloc, etc.) and seeing how future allocations relate to the threshold in terms of size.

probe process("/lib64/libc.so.6").mark("memory_mallopt_free_dyn_thresholds")
{
    printf("%d: New thresholds:: mmap: %ld bytes, trim: %ld bytesn", tid(), $arg1,
               $arg2)
    mmap_threshold = $arg1
}

Next, we note every time a new arena is created and also record which thread is using it. At the same time, a thread may block to access the arena and we want to record that.

probe process("/lib64/libc.so.6").mark("memory_arena_new")
{
    printf ("%d: Created new arenan", tid())
        arenalist[$arg1, tid()] = 1
}

probe process("/usr/lib64/libc.so.6").mark("memory_arena_reuse_wait")
{
    waits[tid()]++
}

probe process("/usr/lib64/libc.so.6").mark("memory_arena_reuse")
{
    if ($arg2 != 0)
    {
        printf ("%d: failed to allocate on own arena, trying anothern", tid())
            arenalist[$arg1, tid()] = 1
    }
}

Next, the good case, where a thread picks up an arena from the free list. We update the association of that arena with a new thread.

probe process("/usr/lib64/libc.so.6").mark("memory_arena_reuse_free_list")
{
    arenalist[$arg1, tid()] = 1
}

And then, when a thread is destroyed, its arena is sent back to the free list. Find that arena in our records and mark it as unused.

probe process.thread.end
{
    /* Find the thread and mark its arena as unused.  */
    foreach ([a, t] in arenalist)
        if (t == tid())
            break
    arenalist[a, t] = 0
}

Next, we look at arena heap accounting. These probes give the current size of the heap at all times, so it is easy for us to record it.

probe process("/usr/lib64/libc.so.6").mark("memory_heap_new")
{
    printf("%d: New heapn", tid());
    heaplist[$arg1] = $arg2
}

probe process("/usr/lib64/libc.so.6").mark("memory_heap_more")
{
    heaplist[$arg1] = $arg2
}

probe process("/usr/lib64/libc.so.6").mark("memory_heap_less")
{
    heaplist[$arg1] = $arg2
}

probe process("/usr/lib64/libc.so.6").mark("memory_heap_free")
{
    heaplist[$arg1] = 0
}

Finally, at the end of the program, we print out a summary that tells us how many times a thread had to wait for a lock on an arena, an association of arenas with their active thread at the time of process exit, allocated heaps and their sizes, the total size of the process heap on exit and finally, the mmap threshold on exit.

probe end
{
        printf ("Contention: n")
        foreach (t in waits)
            printf ("t%d: %d waitsn", t, waits[t])

        print("Active arenas:n")
        foreach ([a, t] in arenalist)
            if (arenalist[a, t])
                printf ("t%d -> %pn", t, a)

        print ("Allocated heaps:n")
        foreach (h in heaplist)
            if (heaplist[h])
                printf ("t%p -> %ld bytesn", h, heaplist[h])

    printf ("Total sbrk: %ld bytesn", sbrk)
    printf ("Mmap threshold in the end: %ld kbn", mmap_threshold / 1024)
}

When I run this script on Red Hat Enterprise Linux 7 with gnome-help and click a couple of links, here is what I get as output:

# stap malloc.stp -c gnome-help
17319: New heap
17319: Created new arena
17320: New heap
17320: Created new arena
17321: New heap
17321: Created new arena
17324: New heap
17324: Created new arena
17325: New heap
17325: Created new arena
17326: New heap
17326: Created new arena
17315: New thresholds:: mmap: 565248 bytes, trim: 1130496 bytes
17329: New heap
17329: Created new arena
17336: New heap
17336: Created new arena
17337: New heap
17337: Created new arena
17335: New heap
17335: Created new arena
17338: New heap
17338: Created new arena
17334: New heap
17334: Created new arena
17315: New thresholds:: mmap: 925696 bytes, trim: 1851392 bytes
17339: New heap
17339: Created new arena
17341: New heap
17341: Created new arena
17342: New heap
17342: Created new arena
17315: New thresholds:: mmap: 1642496 bytes, trim: 3284992 bytes
17315: New thresholds:: mmap: 1642496 bytes, trim: 3284992 bytes
Contention:
Active arenas:
        17326 -> 0x7f8a80000020
        17328 -> 0x7f8ad4000020
        17360 -> 0x7f8b01742760
Allocated heaps:
        0x7f8ae4000000 -> 139264 bytes
        0x7f8ad8000000 -> 139264 bytes
        0x7f8ad0000000 -> 135168 bytes
        0x7f8ad4000000 -> 27246592 bytes
        0x7f8a88000000 -> 10854400 bytes
        0x7f8a80000000 -> 139264 bytes
        0x7f8a84000000 -> 135168 bytes
        0x7f8a78000000 -> 10919936 bytes
        0x7f8a70000000 -> 311296 bytes
        0x7f8a74000000 -> 540672 bytes
        0x7f8a68000000 -> 352256 bytes
        0x7f8a6c000000 -> 335872 bytes
        0x7f8a5c000000 -> 5074944 bytes
        0x7f8a54000000 -> 6815744 bytes
        0x7f8a58000000 -> 5279744 bytes
Total sbrk: 17633280 bytes
Mmap threshold in the end: 1604 kb
Last updated: April 5, 2018