Quickly determine which instruction comes next with Branch Prediction

March 9, 2016
William Cohen

    A pipelined processor requires a steady stream of instructions to be fed into the pipeline. Any delay in feeding instructions into the pipeline hurts performance. For a sequence of instructions without branches, it is relatively easy to determine the next instruction to feed into the pipeline, particularly for processors with fixed-size instructions. Variable-size instructions might complicate finding the start of each instruction, but the instructions are still a contiguous, linear stream of bytes from memory.

    Keeping the processor pipeline filled when branches change the control flow of a program is more difficult. Function calls, function returns, conditional branches, and indirect jumps all affect which instruction comes next in the sequence. In the worst case the pipeline might have to sit idle until all prior instructions complete. Given how frequently control-flow instructions are executed in typical code, these waits can greatly reduce performance.

    To improve pipeline performance the processor attempts to predict what instructions to execute next. These mechanisms help the processor speculatively execute code past the branch instructions. If the prediction is wrong, the speculative work is discarded to preserve the programmer's simple execution model semantics.

    Static Branch Prediction

    For branches where the processor has no previous information about the instruction's behavior, it may resort to assumptions about what the result of the instruction will be. Below is an example list of predictions made by some processors:

    • unconditional branches are taken
    • backward conditional branches are taken
    • forward conditional branches are not taken

    When the compiler generates code, it attempts to produce sequences of instructions that follow those assumptions as much as possible. The compiler estimates the relative frequency with which conditional branches are taken. However, there are cases where the compiler does not have enough information to structure a conditional statement to take advantage of the processor's static predictions. For example, a branch in a function might be taken only under some rare condition based on a value passed into the function. The Linux kernel code uses GCC attributes to allow the programmer to provide hints on the likelihood of the condition being true. The hinting macros are in the Linux source code file include/linux/compiler.h:

    /*
     * Using __builtin_constant_p(x) to ignore cases where the return
     * value is always the same. This idea is taken from a similar patch
     * written by Daniel Walker.
     */
    # ifndef likely
    #  define likely(x)   (__builtin_constant_p(x) ? !!(x) : __branch_check__(x, 1))
    # endif
    # ifndef unlikely
    #  define unlikely(x) (__builtin_constant_p(x) ? !!(x) : __branch_check__(x, 0))
    # endif

    An example use is seen in this snippet from the Linux kernel's pipe_read function, where the vast majority of the time the number of bytes read is more than zero. The following C code hints to GCC to generate code in which the common non-zero total_len case takes advantage of the static prediction rules, making that case faster:

    static ssize_t
    pipe_read(struct kiocb *iocb, struct iov_iter *to)
    {
            size_t total_len = iov_iter_count(to);
            ...
            /* Null read succeeds. */
            if (unlikely(total_len == 0))
                    return 0;

    These static hints are most useful for code that runs infrequently. If the code runs frequently, such as in a loop, the processor can use other dynamic prediction techniques to better determine which instruction will come next.

    Dynamic Branch Prediction

    Dynamic branch prediction techniques use recent runtime history to predict whether a conditional branch will be taken or fall through, with the goal of being more accurate than simple static prediction. The dynamic prediction hardware also predicts the target addresses of branch instructions with computed destinations and of return instructions.

    Branch History Prediction

    When static branch prediction fails for a branch, the processor starts recording the history of outcomes for that particular branch to improve the prediction the next time that branch instruction is encountered. The number of branch instructions the processor can concurrently keep histories for ranges from a few hundred to several thousand. If many different conditional branch instructions are executed in the code, some of the previous predictions will be discarded to make room for more recent ones, yet another reason to avoid "spaghetti code".

    In some cases the randomness of a branch makes it impossible for the branch prediction hardware to accurately predict the outcome of the branch. The basic implementation of the absolute value function is one example of code with an unpredictable branch:

    int abs (int i) {
      return i < 0 ? -i : i;
    }

    As suggested by older AMD optimization manuals, the 32-bit absolute value function can be rewritten as straight-line code using bit manipulation like the following to avoid the unpredictable conditional branch:

    static inline int abs(int ecx) {
        int ebx = ecx;
        ecx = ecx >> 31;   /* arithmetic shift: all ones if negative, zero otherwise */
        ebx = ebx ^ ecx;   /* flip the bits only when the value was negative */
        ebx -= ecx;        /* subtracting -1 adds 1, completing the negation */
        return ebx;
    }

    Return Stack

    Return instructions can also benefit from specialized prediction hardware. A return instruction itself carries no information about its destination. Each time a call instruction is executed, the prediction hardware pushes the return address onto a return address prediction stack. When a return instruction is encountered, the top entry of that stack is used as the target of the next instruction fetch. This allows better prediction when the same function is called recursively or from different locations in the code.

    The depth of the return address prediction stack is limited. It might hold only the most recent eight or sixteen calls. Thus, if the call tree is very deep, some of the predictions might be lost, reducing performance. Inlining some functions can help by eliminating call/return pairs and limiting the depth of the call tree.

    Branch Target Buffers

    A Branch Target Buffer (BTB) is another mechanism for reducing the delay when branches and jumps are encountered in the code. It is a cache that stores the location of a branch instruction and its predicted taken-branch destination address, based on previous executions of that branch. Without the predicted address, the processor might have to wait for the computation of a relative branch destination, or for a register to be loaded for an indirect call such as a method dispatch in an object-oriented language. The BTB allows the processor to speculatively do work before that information is available. However, if the predicted target address is wrong, the speculative work must be discarded.

    The Branch Target Buffer can only store one address for each branch instruction. For a switch-case statement, a C compiler might generate a table with the address of the code for each case and index that table to get the address of the next instruction to execute. Each time the switch-case statement is executed a different case might run, causing the processor to mispredict the next instruction and performance to suffer. Object-oriented code can have a similar misprediction problem when polymorphism is used (the same call site invokes different implementations of a method for different classes). Each time the call dispatches to a different class's implementation there is a misprediction penalty.

    Investigating Branch Prediction Further

    Both Intel and AMD publish documentation on performance tuning for their processors, including suggestions on how to address some of these performance issues:

    • Advanced Micro Devices Software Optimization Guide for AMD Family 15h
    • Intel 64 and IA-32 Architectures Optimization Reference Manual
    Last updated: February 23, 2024
