Dirty Tricks: Launching a helper process under memory and latency constraints (pthread_create and vfork)

August 19, 2015
Carlos O'Donell
Related topics:
Linux
Related products:
Red Hat Enterprise Linux


    You need to launch a helper process. Linux's fork is copy-on-write (COW), but the page tables still have to be duplicated, and for a process with a large virtual address space that can mean degraded performance or even running out of memory. A wide array of solutions is available, but one of them, namely vfork, is mostly avoided because of two difficult issues. First, vfork pauses the parent thread while the child executes, until the child eventually calls an exec family function; this is a huge latency problem for applications. Second, there are a great many considerations to take into account when using vfork in a threaded application, and missing any one of them can lead to serious problems.

    It should be possible for posix_spawn to do all of this work safely via POSIX_SPAWN_USEVFORK, but there is often quite a lot of "work" that needs to be done just before the helper calls an exec family function, and that has led to an ever-growing set of posix_spawn support functions like posix_spawn_file_actions_addclose, posix_spawn_file_actions_adddup2, posix_spawn_file_actions_destroy, posix_spawnattr_destroy, posix_spawnattr_getsigdefault, posix_spawnattr_getflags, posix_spawnattr_getpgroup, posix_spawnattr_getschedparam, posix_spawnattr_getschedpolicy, and posix_spawnattr_getsigmask. It might be simpler if the GNU C Library documented a small subset of functions you can safely call after vfork, which is in fact what the preceding functions are modelling. If you happen to select a set of operations that posix_spawn can't support with vfork, the implementation falls back to fork and you are not told why. Therefore it is hard to use posix_spawn robustly.
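As a rough illustration of the "work before exec" that the file-actions interface encodes, here is a minimal sketch that runs a command with its standard output redirected into a pipe. The helper name spawn_capture is hypothetical, and the sketch assumes file descriptors 0 to 2 are already open in the caller. POSIX_SPAWN_USEVFORK is a GNU extension that is only visible with _GNU_SOURCE, so it is guarded here; as far as I know, newer glibc releases accept the flag but ignore it.

```c
#include <assert.h>
#include <errno.h>
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/* Hypothetical helper: run argv[0] with stdout redirected into a pipe and
   copy up to len - 1 bytes of its output into buf (NUL-terminated).
   Returns the child's exit status, or -1 on error.  */
int
spawn_capture (char *const argv[], char *buf, size_t len)
{
  int pipefd[2];
  int ret, status;
  pid_t child;
  size_t used = 0;
  ssize_t n;
  posix_spawn_file_actions_t fa;
  posix_spawnattr_t attr;

  assert (argv != NULL && argv[0] != NULL && buf != NULL);
  if (len == 0 || pipe (pipefd) != 0)
    return -1;

  posix_spawn_file_actions_init (&fa);
  posix_spawnattr_init (&attr);

  /* The "work" done in the child just before exec: drop the read end,
     move the write end onto stdout, then drop the original write end.  */
  posix_spawn_file_actions_addclose (&fa, pipefd[0]);
  posix_spawn_file_actions_adddup2 (&fa, pipefd[1], STDOUT_FILENO);
  posix_spawn_file_actions_addclose (&fa, pipefd[1]);

#ifdef POSIX_SPAWN_USEVFORK
  /* Only defined when _GNU_SOURCE is in effect.  */
  posix_spawnattr_setflags (&attr, POSIX_SPAWN_USEVFORK);
#endif

  ret = posix_spawn (&child, argv[0], &fa, &attr, argv, environ);

  posix_spawn_file_actions_destroy (&fa);
  posix_spawnattr_destroy (&attr);
  close (pipefd[1]);

  if (ret != 0)
    {
      /* posix_spawn returns an error number rather than setting errno.  */
      close (pipefd[0]);
      errno = ret;
      return -1;
    }

  /* Read until EOF, an error, or the buffer is full.  */
  while (used < len - 1
         && (n = read (pipefd[0], buf + used, len - 1 - used)) > 0)
    used += (size_t) n;
  buf[used] = '\0';
  close (pipefd[0]);

  if (waitpid (child, &status, 0) != child || !WIFEXITED (status))
    return -1;
  return WEXITSTATUS (status);
}
```

Calling spawn_capture with an argument vector such as { "/bin/sh", "-c", "echo hello", NULL } should leave the command's output in the buffer. Anything the file-actions and attribute objects cannot express has to be done some other way, which is the limitation discussed above.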


    How do you overcome the limits of vfork without having to resort to the complexity and security considerations of IPC with a helper daemon that starts processes for you? Use pthread_create and vfork together to get the semantics you want. The combination gives you all the benefits of vfork (the shared page tables and the fast execution) coupled with the pauseless execution you need. Only the additional thread is paused when you vfork; the rest of the threads in the process continue executing.
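The claim that only the vfork-ing thread is paused can be checked with a small experiment. The sketch below is an assumption-laden demonstration, not production code: all names are hypothetical, the child deliberately lingers in nanosleep before calling _exit so the suspension is long enough to observe, and the signal handling a real launcher needs is omitted for brevity. It is Linux-specific and needs to be linked with pthread support.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

static atomic_ulong ticks;   /* Progress made by the worker thread.  */
static atomic_int stop;      /* Set to 1 to make the worker return.  */

/* Stands in for "the rest of the threads in the process".  */
static void *
worker (void *arg)
{
  (void) arg;
  while (!atomic_load (&stop))
    atomic_fetch_add (&ticks, 1);
  return NULL;
}

/* The dedicated launcher thread.  It is suspended inside vfork until the
   child calls _exit, roughly 200ms later.  */
static void *
launcher (void *arg)
{
  unsigned long *progress = arg;
  const struct timespec delay = { 0, 200 * 1000 * 1000 };
  unsigned long before;
  pid_t child;

  assert (progress != NULL);
  before = atomic_load (&ticks);
  child = vfork ();
  if (child == 0)
    {
      /* In the child: hold the launcher suspended for a while.  */
      nanosleep (&delay, NULL);
      _exit (0);
    }
  /* Back in the launcher: count what the worker did in the meantime.  */
  *progress = atomic_load (&ticks) - before;
  if (child > 0)
    waitpid (child, NULL, 0);
  return NULL;
}

/* Returns how many ticks the worker made while the launcher was inside
   vfork, or 0 if a thread could not be created.  */
unsigned long
ticks_during_vfork (void)
{
  pthread_t w, l;
  unsigned long progress = 0;

  atomic_store (&stop, 0);
  if (pthread_create (&w, NULL, worker, NULL) != 0)
    return 0;
  if (pthread_create (&l, NULL, launcher, &progress) == 0)
    pthread_join (l, NULL);
  atomic_store (&stop, 1);
  pthread_join (w, NULL);
  return progress;
}
```

A non-zero result from ticks_during_vfork indicates the worker kept running while the launcher thread was parked in vfork.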

    This is certainly a dirty trick, but as far as the author is concerned the example code either takes into account, or warns the reader about, every consideration needed to do this safely. Likewise, all actions that do not affect parent state are valid to execute after the vfork and before the exec family function call, and I argue that all C libraries should document such functions.
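To make "actions that do not affect parent state" concrete, here is a minimal sketch of a child that changes directory and redirects stdout before exec. On Linux the vfork child shares memory with the parent but has its own working directory and file descriptor table, so chdir, dup2, and close in the child do not leak back. The function name run_in_dir and the use of exit code 127 for child-side failures are my own conventions, and the signal blocking a threaded caller needs is left out to keep the sketch short.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] with working directory DIR and stdout
   redirected to OUT_FD.  Returns the child's exit status, 127 if the
   child could not chdir, dup2, or exec, and -1 if vfork or waitpid
   failed or the child was killed by a signal.  */
int
run_in_dir (const char *dir, int out_fd, char *const argv[],
            char *const envp[])
{
  int status;
  pid_t child;

  assert (dir != NULL && argv != NULL && argv[0] != NULL);

  child = vfork ();
  if (child == 0)
    {
      /* In the child.  Only plain system call wrappers are used here;
         they change the child's kernel state, not the memory it shares
         with the parent.  */
      if (chdir (dir) != 0)
        _exit (127);
      if (out_fd != STDOUT_FILENO)
        {
          if (dup2 (out_fd, STDOUT_FILENO) < 0)
            _exit (127);
          close (out_fd);
        }
      execve (argv[0], argv, envp);
      /* Only reached if execve failed.  */
      _exit (127);
    }
  if (child == (pid_t) -1)
    return -1;

  if (waitpid (child, &status, 0) != child)
    return -1;
  return WIFEXITED (status) ? WEXITSTATUS (status) : -1;
}
```

For instance, running { "/bin/sh", "-c", "pwd", NULL } with dir set to "/" and out_fd set to the write end of a pipe should make the root directory path readable from the other end, while the caller's own working directory and descriptors stay as they were.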

    Cheers,
    Carlos.


    /* Copyright (c) 2014 Red Hat Inc.
    
    Written by Carlos O'Donell <codonell@redhat.com>
    
    Permission is hereby granted, free of charge, to any person obtaining a copy
    of this software and associated documentation files (the "Software"), to deal
    in the Software without restriction, including without limitation the rights
    to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
    copies of the Software, and to permit persons to whom the Software is
    furnished to do so, subject to the following conditions:
    
    The above copyright notice and this permission notice shall be included in
    all copies or substantial portions of the Software.
    
    THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
    IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
    FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
    AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
    LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
    OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
    THE SOFTWARE. */
    
    /* Example: How to use vfork safely from a multi-threaded application.
    
    This example is intended to show the safe usage of vfork by a multi-threaded
    application. The example does not use any advanced features like clone
    without CLONE_VFORK to avoid parent suspension. The example can also be
    rewritten slightly to be used in a non-multithreaded environment and it still
    remains safe since the latter is just a degenerate case of the former with
    one main thread.
    
    The example is only valid on Linux with the GNU C Library as the core
    runtime. Other runtimes may require other actions to call vfork safely from
    a multi-threaded application.
    
    The inline comments in the code will explain each of the steps taken and
    why. Justification for some steps is rather complicated so please read it
    twice before asking questions.
    
    Any questions should go to libc-help@sourceware.org where the GNU C Library
    community can assist with interpretations of this code. */
    
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <signal.h>
    #include <pthread.h>
    #include <errno.h>
    #include <string.h>
    
    /* The helper thread executes this application. */
    const char *filename = "/bin/ls";
    char *const new_argv[2] = { "/bin/ls", NULL };
    char *const new_envp[1] = { NULL };
    int status;
    
    void *
    run_thread (void *arg)
    {
      int i, ret;
      pid_t child, waited;
      struct sigaction newsa, oldsa;
    
      /* Block all signals in the parent before calling vfork. This is
         for the safety of the child which inherits signal dispositions and
         handlers. The child, running in the parent's stack, may be delivered a
         signal. For example on Linux a killpg call delivering a signal to a
         process group may deliver the signal to the vfork-ing child and you want
         to avoid this. The easy way to do this is via: sigemptyset,
         sigaction, and then undo this when you return to the parent. To be
         completely correct the child should set all non-SIG_IGN signals to
         SIG_DFL and then restore the original signal mask, thus allowing the
         vforking child to receive signals that were actually intended for it, but
         without executing any handlers the parent had set up that could corrupt
         state. When using glibc and Linux these functions i.e. sigemptyset,
         sigaction, etc. are safe to use after vfork. */
      sigset_t signal_mask, old_signal_mask, empty_mask;
      sigfillset (&signal_mask);
    
      /* One might think we need to block SIGCANCEL (cancellation handling signal)
         and SIGSETXID (set*id handling signal). These signals are a hidden part of
         the implementation, and if delivered to the child would corrupt the parent
         state. The SIGSETXID signal is only sent to threads that the
         implementation knows about and the child of vfork is not known as a thread
         and thus safe from having a set*id handler run. This is a distinct issue
         from the one below regarding calling set*id functions. The SIGCANCEL
         signal is only sent in response to a pthread_cancel call, and since the
         child has no pthread_t it will not receive that signal by any ordinary
         means. Thus it would be undefined for anything to send SIGSETXID or
         SIGCANCEL to the child thread. If you suspect something like this is
         happening you might try adding this code:
    
         #define SIGCANCEL __SIGRTMIN
         #define SIGSETXID (__SIGRTMIN + 1)
         sigaddset (&signal_mask, SIGCANCEL);
         sigaddset (&signal_mask, SIGSETXID);
    
         This will prevent cancellation and set*id signals from being acted upon.
         Please report this problem to libc-alpha@sourceware.org if you encounter
         it since the child running either handler for those signals is an
         implementation defect. */
    
      pthread_sigmask (SIG_BLOCK, &signal_mask, &old_signal_mask);
    
      /* WARNING: Do not call setuid(2) or any other set*id(2) functions from
         other threads while vfork-ing. This could allow privilege escalation
         attacks.
    
         It is often assumed that vfork(2) stops the entire process but on many OS's
         it just suspends the thread which called vfork(2). Calling a setuid(2)
         function from another thread while vfork-ing could result in two threads
         with different UIDs or GIDs sharing the same memory space.
    
         As a concrete example a thread might be running as root, vfork a helper,
         and then proceed to setuid to an unprivileged user to run some untrusted
         code. In this case the root privilege thread shares the same address space
         as the unprivileged threads. One of the unprivileged threads could then
         remap parts of the address space to get the root privileged thread, which
         has not yet exec'd, to execute arbitrary code.
    
         Therefore you need to be careful about calling set*id() functions while
         vfork-ing. You avoid this problem by coordinating your credential
         transitions to happen after you know your vfork() is complete i.e. the
         parent is resumed telling you the child has completed exec-ing. If you
         can't coordinate the use of set*id() functions, then the only option left
         is to use the posix_spawn* interfaces which serialize set*id() transitions
         in glibc (Sourceware BZ #14750 and BZ #14749 must be fixed in your version
         of glibc for this to work properly). */
      child = vfork ();
    
      if (child == 0)
        {
          /* In the child. */
    
          /* We reset all signal dispositions that aren't SIG_IGN to SIG_DFL.
             This is done because the child may have a legitimate need to
             receive a signal and the default actions should be taken for
             those signals. Those default actions will not corrupt state in
             the parent. */
          newsa.sa_handler = SIG_DFL;
          if (sigemptyset (&empty_mask) != 0)
            _exit (1);
          newsa.sa_mask = empty_mask;
          newsa.sa_flags = 0;
          newsa.sa_restorer = 0;
          for (i = 0; i < NSIG; i++)
            {
              ret = sigaction (i, NULL, &oldsa);
              /* If the signal doesn't exist it returns an error and we skip it. */
              if (ret == 0
                  && oldsa.sa_handler != SIG_IGN
                  && oldsa.sa_handler != SIG_DFL)
                {
                  ret = sigaction (i, &newsa, NULL);
                  /* POSIX says:
                     It is unspecified whether an attempt to set the action for a
                     signal that cannot be caught or ignored to SIG_DFL is
                     ignored or causes an error to be returned with errno set to
                     [EINVAL].
    
                     Ignore errors if it's EINVAL since those are likely
                     signals we can't change. */
                  if (ret != 0 && errno != EINVAL)
                    _exit (2);
                }
            }
          /* Restore the old signal mask that we inherited from the parent. */
          pthread_sigmask (SIG_SETMASK, &old_signal_mask, NULL);
    
          /* At this point you carry out anything else you need to do before exec
             like changing directory etc. Signals are enabled in the child and
             will do their default actions, and the parent's handlers do not run.
             The caller has ensured not to call set*id functions. The only remaining
             general restriction is not to corrupt the parent's state by calling
             complex functions. The safe functions should be documented by glibc
             but aren't, please reach out to libc-alpha@sourceware.org to
             discuss. */
    
          /* ... */
    
          /* The last thing we do is execute the helper. */
          ret = execve (filename, new_argv, new_envp);
          /* Always call _exit in the event of a failure with exec functions. */
          _exit (3);
        }
      if (child == -1)
        {
          /* Restore the signal masks in the parent as quickly as possible to
             reduce signal handling latency. */
          pthread_sigmask (SIG_SETMASK, &old_signal_mask, NULL);
          perror ("vfork");
          exit (EXIT_FAILURE);
        }
      else
        {
          /* In the parent. At this point the child has either succeeded at the
             exec or _exit function call. The parent, this thread, which would
             have been suspended is resumed. */
    
          /* Restore the signal masks in the parent as quickly as possible to
             reduce signal handling latency. */
          pthread_sigmask (SIG_SETMASK, &old_signal_mask, NULL);
    
          /* Wait for the child to exit and then pass back the exit code. */
          waited = waitpid (child, &status, 0);
    
          if (waited == (pid_t) -1)
            {
              perror ("wait");
              exit (EXIT_FAILURE);
            }
          if (WIFEXITED (status))
            {
              printf ("Helper: Exited, status=%d\n", WEXITSTATUS (status));
            }
          else if (WIFSIGNALED (status))
            {
              printf ("Helper: Killed by signal %d\n", WTERMSIG (status));
            }
    
          return NULL;
        }
    }
    
    int
    main (void)
    {
      int ret;
      pthread_t thread;
    
      /* The application creates a thread from which to run other processes.
         The thread will immediately attempt to execute the helper process.
         On Linux the vfork system call suspends only the calling thread, not
         the entire process. Therefore it is still useful to use vfork over
         fork for performance, particularly as the process gets larger and
         larger the cost of fork gets more expensive as page table (not
         memory, since it's all copy-on-write) size grows. */
      ret = pthread_create (&thread, NULL, run_thread, NULL);
      if (ret != 0)
        {
          fprintf (stderr, "pthread_create: %s\n", strerror (ret));
          exit (EXIT_FAILURE);
        }
    
      /* Do some other work while the helper launches the application,
         waits for it, and sets the global status. */
    
      /* ... */
    
      /* Lastly, wait for the helper thread to terminate. */
      ret = pthread_join (thread, NULL);
      if (ret != 0)
        {
          fprintf (stderr, "pthread_join: %s\n", strerror (ret));
          exit (EXIT_FAILURE);
        }
      exit (EXIT_SUCCESS);
    }
    Last updated: August 30, 2016
