Understanding malloc behavior using Systemtap userspace probes

 

October 2, 2014
Siddhesh Poyarekar
Related topics:
Security
Related products:
Developer Tools


    The malloc family of functions is critical for almost every serious application program. Its performance characteristics often have a big impact on application performance. Because the default malloc implementation needs to perform consistently across all general cases, it provides a number of tunables that developers can use to tweak its behavior to suit their programs.

    About two years ago I wrote an article on the Red Hat Customer Portal that described the high-level design of the GNU C Library memory allocator and introduced the reader to the various magic environment variables that malloc understands to change its behavior. The behavior documented in that article and the tricks to tweak malloc hold just as true for RHEL-7, which is based on upstream glibc 2.17, as they did for RHEL-6, which is based on upstream glibc 2.12.

    However, it can be pretty cumbersome for an application developer to find out exactly which aspect of the malloc implementation to tweak. Until now, the best available approach was to try tweaking all of the magic variables and stick with the combination that works best. This is very ad hoc, which is why we came up with the idea of adding Systemtap static probe points to malloc to make the process much more systematic. These malloc probes were included upstream in glibc 2.19 and have been backported to RHEL-7.

    The Userspace Probing chapter in the Systemtap Beginners' Guide on the Customer Portal is a good starting point if you've never used Systemtap for userspace probes before. The glibc manual (invoked using the info libc command) has details for each of the probe points. This post only intends to serve as a guideline to explain what hitting those probe points may mean for your application in terms of performance.

    Changes to the process heap

    There are two major events that have a performance penalty during changes made to the process heap, viz. growing and shrinking of the heap. These are indicated by the following probe points:

    memory_sbrk_more (void *$arg1, size_t $arg2)
    memory_sbrk_less (void *$arg1, size_t $arg2)

    As the names suggest, memory_sbrk_more is hit when the heap is grown using the sbrk system call and memory_sbrk_less is hit when the heap is shrunk using the same system call. $arg1 is the pointer that marks the new end of the process heap and $arg2 is the size by which the process heap was grown or shrunk. The penalty here is the invocation of a system call, which results in a context switch into the kernel.
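
    To get a feel for the heap growth that memory_sbrk_more reports without setting up Systemtap first, a C program can compare the program break before and after a burst of small allocations. The sketch below assumes glibc on Linux. break_growth_after is a hypothetical helper name, not part of any library, and whether the break moves at all depends on how much free space the heap already holds.

```c
#define _DEFAULT_SOURCE /* exposes sbrk() under strict -std= modes */
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: allocate `count` blocks of `size` bytes and report how
 * far the program break moved while doing so. Returns -1 if the bookkeeping
 * array itself cannot be allocated. */
ptrdiff_t break_growth_after(size_t count, size_t size)
{
    void **blocks = calloc(count ? count : 1, sizeof *blocks);
    if (blocks == NULL)
        return -1;

    char *before = sbrk(0);          /* sbrk(0) only reads the current break */
    for (size_t i = 0; i < count; i++)
        blocks[i] = malloc(size);
    char *after = sbrk(0);

    /* Only allocations ran between the two readings, and the allocator trims
     * the heap in free(), so the break should not have moved backwards. */
    assert(after >= before);

    for (size_t i = 0; i < count; i++)
        free(blocks[i]);
    free(blocks);
    return after - before;
}
```

    Calling break_growth_after(4096, 256) from the main thread and printing the result gives a rough idea of how many bytes the heap grew by. Each step of that growth is one sbrk call, which is what the probe would report.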

    The other important factor in allocator behavior is the choice between allocating a malloc request on the heap as opposed to using mmap to service the request. Larger requests are better off being mmapped since they have a tendency to create larger unused gaps in the heap. Allocating on the heap however is advantageous from a performance point of view since it does not involve a syscall. To strike a balance, the allocator maintains a dynamic threshold that is adjusted to service frequent requests to the heap to improve performance at the cost of higher potential space wastage. Requests smaller than the threshold are allocated on the heap while larger ones are serviced using mmap. Changes in this dynamic threshold are important because they can change the balance between heap wastage and speed of allocation.

    memory_mallopt_free_dyn_thresholds (int $arg1, int $arg2)

    This probe is triggered when the free function adjusts the dynamic threshold. Arguments $arg1 and $arg2 are the adjusted mmap and trim thresholds, respectively.
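
    The split between heap and mmap allocations can be observed from C as well. The following is a minimal sketch, assuming glibc on Linux: it pins the mmap threshold with mallopt (which, per the mallopt documentation, also turns off the dynamic adjustment) and then checks that a request above the threshold leaves the program break where it was. Both function names are hypothetical.

```c
#define _DEFAULT_SOURCE /* exposes sbrk() under strict -std= modes */
#include <assert.h>
#include <malloc.h>
#include <stddef.h>
#include <stdlib.h>
#include <unistd.h>

/* Pin the mmap threshold to `bytes`. Follows mallopt's own convention:
 * 1 on success, 0 on error. */
int pin_mmap_threshold(int bytes)
{
    return mallopt(M_MMAP_THRESHOLD, bytes);
}

/* Returns 1 if a request of `size` bytes succeeded without moving the program
 * break, which suggests it was serviced with mmap instead of the heap. */
int left_break_untouched(size_t size)
{
    assert(size > 0);

    /* Warm up first: the very first malloc in a thread may grow the heap for
     * the allocator's own bookkeeping. volatile keeps the compiler from
     * optimizing the malloc/free pairs away. */
    void *volatile warm = malloc(16);
    free(warm);

    void *before = sbrk(0);
    void *volatile p = malloc(size);
    void *after = sbrk(0);
    int ok = (p != NULL) && (before == after);

    free(p);
    return ok;
}
```

    For example, after pin_mmap_threshold(128 * 1024) succeeds, left_break_untouched((size_t)1 << 20) is expected to return 1 on a typical glibc system, because a 1 MiB request is above the pinned threshold.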

    Changes to arenas

    Multiple threads in a program cannot possibly scale in performance if they have to synchronize access to the process heap for memory allocation. For this reason, the allocator maintains multiple arenas so that threads attach themselves to their own arenas and hence don't have to contend with each other during malloc. Maintaining these arenas has some bottlenecks. The first major cost is the creation of an arena.

    memory_arena_new (void *$arg1, size_t $arg2)

    A new arena is typically created when a newly created thread calls malloc for the first time. This is an expensive event because it involves getting an address space mapping from the kernel. It is also expensive because it means additional address space utilization by the process. $arg1 is the address space returned for the arena and $arg2 is the size of the arena.

    Of course, every new thread doesn't result in creation of a new arena. When a thread exits, it may leave behind an arena that is available for reuse in a free list. Hitting the memory_arena_reuse_free_list probe point is an indicator that this may have happened. This is a good sign since arena reuse for exclusive usage means that resources are being optimally used.

    Alternatively, a thread may fail to get an existing free arena and may also be unable to create one because of the limit on the number of arenas that can be created in a process. This limit is determined either by the M_ARENA_MAX mallopt parameter or by the number of available CPU cores. Once this limit is reached, threads have to share arenas with other threads, which is when the following probe point is hit:

    memory_arena_reuse (void *$arg1, void *$arg2)

    $arg1 is the arena that's about to be reused and $arg2 is the arena that the thread failed to allocate space on. If $arg2 is NULL, the calling thread has invoked malloc for the first time and could not get an arena of its own because the limit on the maximum number of arenas has been reached. This is an indicator that your application may experience lock contention when allocating memory.

    On the other hand, if $arg2 is not NULL, it means that the calling thread may have failed to allocate space on the arena at $arg2.

    If the memory_arena_reuse probe is hit without hitting memory_arena_reuse_wait, then it means that the thread did not encounter any contention at that moment. This obviously does not mean that the thread will never see contention on this arena.

    memory_arena_reuse_wait (void *$arg1, void *$arg2, void *$arg3)

    If this probe is hit just before memory_arena_reuse, it means that the calling thread is about to enter a wait state on the lock for the arena. The lock address is in $arg1, $arg2 is the arena it is trying to acquire and $arg3 is the arena the thread failed to allocate memory on previously. As with $arg2 in memory_arena_reuse, if $arg3 is NULL, then the thread is calling malloc for the first time and was unable to secure an arena exclusively for itself.
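
    The arena limit mentioned above can be set from within a program through the M_ARENA_MAX mallopt parameter. The sketch below assumes glibc; cap_arena_count is a hypothetical wrapper, and the early return for non-positive values is a choice made for this example, not something mallopt requires. The same limit can also be set without code changes through the MALLOC_ARENA_MAX environment variable.

```c
#include <assert.h>
#include <malloc.h>

/* Hypothetical wrapper: cap the number of arenas the allocator may create.
 * Returns 1 on success and 0 if the value was rejected. */
int cap_arena_count(int max_arenas)
{
    /* A non-positive limit makes no sense, so reject it up front instead of
     * passing it on to the allocator. */
    if (max_arenas <= 0)
        return 0;

    int ok = mallopt(M_ARENA_MAX, max_arenas);
    assert(ok == 0 || ok == 1); /* mallopt documents 1 = success, 0 = error */
    return ok;
}
```

    Calling cap_arena_count(2) early in main, before threads are started, is one way to provoke arena sharing on purpose and then watch how often memory_arena_reuse and memory_arena_reuse_wait are hit.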

    Arena Heaps

    The process heap is easy to extend using the sbrk system call. Arenas however are allocated using mmap and it is not always possible to extend them contiguously. To work around this, the allocator implements the concept of an arena heap, which is an mmapped location chained on to the arena to extend the arena.

    memory_heap_new (void *$arg1, size_t $arg2)

    This probe is hit when a new heap is allocated for an arena. This is an expensive operation since it involves a system call to the kernel to get address space for the heap. $arg1 is the returned heap address and $arg2 is the size of the heap.

    On allocation, much of the heap has PROT_NONE permissions. In fact, this is true for the originally allocated arenas as well, and is done to reduce the actual commit charge for the process. Portions of the arena heaps are given permissions using the mprotect system call to give the effect of growing. Similarly, portions are given back to the system either by using the madvise system call or by using mprotect to give PROT_NONE permissions again. There are probe points to capture these events since they are again costly.

    memory_heap_more (void *$arg1, size_t $arg2)

    As the name suggests, this probe point tracks the growth of the arena heap, which is done using the mprotect syscall to give read+write permissions to appropriate blocks within the heap. $arg1 is the address of the heap and $arg2 is the new size of the heap.

    memory_heap_less (void *$arg1, size_t $arg2)

    This is exactly the opposite of memory_heap_more. The trailing portion of the heap is returned to the system using either the madvise or mprotect system call. $arg1 is the address of the heap and $arg2 is the new size of the heap.

    memory_heap_free (void *$arg1, size_t $arg2)

    When an arena heap is completely unused, it may be freed. This probe point tracks this operation since it involves calling the munmap syscall to return the arena heap to the system. $arg1 is the address of the arena heap and $arg2 is the size.
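
    The life cycle of an arena heap described in this section (reserve address space with no permissions, grow with mprotect, shrink again, and finally unmap) can be reproduced in isolation. The following is a sketch of that general technique using plain mmap, mprotect and munmap; it is not the allocator's actual code, and reserve_commit_release is a hypothetical name.

```c
#define _DEFAULT_SOURCE /* exposes MAP_ANONYMOUS under strict -std= modes */
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* Reserve `reserve_len` bytes with no access, make the first `commit_len`
 * bytes usable, write to them, revoke access again and unmap everything.
 * Returns 0 if every step succeeded and -1 otherwise. */
int reserve_commit_release(size_t reserve_len, size_t commit_len)
{
    if (commit_len == 0 || commit_len > reserve_len)
        return -1;

    /* Step 1: reserve address space only. PROT_NONE pages cannot be touched. */
    unsigned char *base = mmap(NULL, reserve_len, PROT_NONE,
                               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return -1;

    int rc = -1;
    /* Step 2: "grow" by granting read+write on the leading portion. */
    if (mprotect(base, commit_len, PROT_READ | PROT_WRITE) == 0) {
        memset(base, 0xA5, commit_len); /* the pages are usable now */
        assert(base[0] == 0xA5);

        /* Step 3: "shrink" by taking the permissions away again. */
        rc = mprotect(base, commit_len, PROT_NONE);
    }

    /* Step 4: return the whole mapping to the system. */
    if (munmap(base, reserve_len) != 0)
        rc = -1;
    return rc;
}
```

    Each step here is a separate system call, which is why the corresponding memory_heap_new, memory_heap_more, memory_heap_less and memory_heap_free events are worth watching in an allocation-heavy application.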

    Memory pressure in arenas

    memory_malloc_retry (size_t $arg1)
    memory_realloc_retry (size_t $arg1, void *$arg2)
    memory_memalign_retry (size_t $arg1, size_t $arg2)
    memory_calloc_retry (size_t $arg1)
    memory_arena_retry (size_t $arg1, void *$arg2)

    These probes are triggered when the corresponding functions fail to obtain the requested amount of memory from the arena in use. The memory_arena_retry probe is a catch-all for all of the individual probes, which is useful when one only wants to see cases where a thread had to change arenas due to resource limitations.

    These probes are an indication that another arena may be tried or that the allocation may fail. Usually this also results in arenas being shared among threads, which in turn increases contention between threads and affects performance. $arg1 is the user-requested size for all of the probes above. For probes that have $arg2 as a pointer, $arg2 is the old memory address. In memory_memalign_retry, $arg2 is the requested alignment.

    Looking out for mallopt

    Finally, you may want to see when your mallopt tweaks kicked in. Alternatively, libraries that the application depends on may do their own tweaking of malloc behavior, which will have consequences for application performance. There are probe points to indicate when such tweaking is done.

    memory_mallopt (int $arg1, int $arg2)

    This is a catch-all probe that is hit whenever an application calls the mallopt function. $arg1 and $arg2 are arguments passed to mallopt.

    memory_mallopt_mxfast (int $arg1, int $arg2)
    memory_mallopt_trim_threshold (int $arg1, int $arg2, int $arg3)
    memory_mallopt_top_pad (int $arg1, int $arg2, int $arg3)
    memory_mallopt_mmap_threshold (int $arg1, int $arg2, int $arg3)
    memory_mallopt_mmap_max (int $arg1, int $arg2, int $arg3)
    memory_mallopt_check_action (int $arg1, int $arg2)
    memory_mallopt_perturb (int $arg1, int $arg2)
    memory_mallopt_arena_test (int $arg1, int $arg2)
    memory_mallopt_arena_max (int $arg1, int $arg2)

    These are separate probes for each of the allowed mallopt parameters. $arg1 is the requested value and $arg2 is the previous value of this parameter.

    The third argument ($arg3) in some of the probe points above is nonzero if dynamic threshold adjustment was already disabled.

    Conclusion

    Hopefully this article has helped you gain a deeper understanding of how to tweak malloc behavior for your applications, or even of how the allocator works. There are a number of such static probe points throughout glibc in RHEL-7: in the dynamic linker, the pthreads implementation and even the math library. Watch out for information on static probe points in future posts.

    Last updated: February 7, 2024
