Skip to main content
Redhat Developers  Logo
  • AI

    Get started with AI

    • Red Hat AI
      Accelerate the development and deployment of enterprise AI solutions.
    • AI learning hub
      Explore learning materials and tools, organized by task.
    • AI interactive demos
      Click through scenarios with Red Hat AI, including training LLMs and more.
    • AI/ML learning paths
      Expand your OpenShift AI knowledge using these learning resources.
    • AI quickstarts
      Focused AI use cases designed for fast deployment on Red Hat AI platforms.
    • No-cost AI training
      Foundational Red Hat AI training.

    Featured resources

    • OpenShift AI learning
    • Open source AI for developers
    • AI product application development
    • Open source-powered AI/ML for hybrid cloud
    • AI and Node.js cheat sheet

    Red Hat AI Factory with NVIDIA

    • Red Hat AI Factory with NVIDIA is a co-engineered, enterprise-grade AI solution for building, deploying, and managing AI at scale across hybrid cloud environments.
    • Explore the solution
  • Learn

    Self-guided

    • Documentation
      Find answers, get step-by-step guidance, and learn how to use Red Hat products.
    • Learning paths
      Explore curated walkthroughs for common development tasks.
    • Guided learning
      Receive custom learning paths powered by our AI assistant.
    • See all learning

    Hands-on

    • Developer Sandbox
      Spin up Red Hat's products and technologies without setup or configuration.
    • Interactive labs
      Learn by doing in these hands-on, browser-based experiences.
    • Interactive demos
      Click through product features in these guided tours.

    Browse by topic

    • AI/ML
    • Automation
    • Java
    • Kubernetes
    • Linux
    • See all topics

    Training & certifications

    • Courses and exams
    • Certifications
    • Skills assessments
    • Red Hat Academy
    • Learning subscription
    • Explore training
  • Build

    Get started

    • Red Hat build of Podman Desktop
      A downloadable, local development hub to experiment with our products and builds.
    • Developer Sandbox
      Spin up Red Hat's products and technologies without setup or configuration.

    Download products

    • Access product downloads to start building and testing right away.
    • Red Hat Enterprise Linux
    • Red Hat AI
    • Red Hat OpenShift
    • Red Hat Ansible Automation Platform
    • See all products

    Featured

    • Red Hat build of OpenJDK
    • Red Hat JBoss Enterprise Application Platform
    • Red Hat OpenShift Dev Spaces
    • Red Hat Developer Toolset

    References

    • E-books
    • Documentation
    • Cheat sheets
    • Architecture center
  • Community

    Get involved

    • Events
    • Live AI events
    • Red Hat Summit
    • Red Hat Accelerators
    • Community discussions

    Follow along

    • Articles & blogs
    • Developer newsletter
    • Videos
    • Github

    Get help

    • Customer service
    • Customer support
    • Regional contacts
    • Find a partner

    Join the Red Hat Developer program

    • Download Red Hat products and project builds, access support documentation, learning content, and more.
    • Explore the benefits

Algorithms != Programs and Programs are not "One size fits all"

February 12, 2019
William Cohen Ben Woodard

    You've probably been taught that picking an algorithm that has the best Big-O asymptotic cost will yield the best performance. You might be surprised to find that on current hardware, this isn't always the case. Much of algorithmic analysis assumes very simple costs where the order of operations doesn't matter. Memory access times are assumed to be the same. However, the difference between a cache hit (a few processor clock cycles) and a cache miss that requires access to main memory (a couple hundred cycles) is immense.

    This article series is the result of the authors (William Cohen and Ben Woodard) discussion that there is a disconnect on the typical ideas of algorithm efficiency taught in computer science and computer engineering versus what is currently encountered in actual computer systems.

    Introduction

    What is the best way to implement a sort? If you paid attention in your algorithms and data structures class you might quickly answer “Quicksort”. However, the correct answer might be more complicated and depend on specific details. No one would ever advocate using the Quicksort algorithm to sort an unordered list containing only two items. This could be more efficiently and simply done with a compare and conditional swap to reorder the elements if needed. The advantages of the Quicksort algorithm would only become apparent when the number of elements to sort is much larger. However, other sorting algorithms such as merge sort might be preferred over the Quicksort algorithm even though both have the same average complexity, O(n log n), due to merge sort having better real-world performance. Incorrect assumptions developers might make during the selection of algorithms and implementing them in programs that can adversely affect performance.

    Big-O notation has often been used as the measure of goodness for an algorithm and as a tool to predict how a part of the program will behave in the physical world where programs actually operate. However, the simple sequential models used for big-O analysis do not accurately represent current computer system costs. Modern computer hardware can overlap operations and individual memory access costs can vary immensely (by a factor of one hundred or more) depending on the memory access patterns. High-performance machines can perform multi-element, vector-style operations with much higher throughput than doing the equivalent individual operations separately. The differences between the abstract model and the physical hardware could lead to very noticeable differences in delivered performance. For example iterating through elements in a linked list and array are both O(n) complexity, but on modern processors hardware it is much slower iterating through the elements in a linked list because of the random memory access pattern than iterating through an equivalent one-dimensional array of elements with the regular, easily predicted memory access pattern.

    Big-O asymptotic algorithm analysis is showing how the computational work grows in proportional to the problem size growth. However, this does not indicate which algorithm is most efficient for a particular problem size. The lack of constants makes that comparison between algorithms working on a particular problem size difficult. An algorithm that has worse big-O asymptotic performance than another algorithm might have better performance for smaller problem sizes. Given that cache misses cost hundreds of cycles the range of problems sizes where the “worse” algorithm is faster may be surprisingly large. Algorithms such as Coppersmith-Winograd matrix multiplication algorithm (O(n^2.375477)) are theoretically more efficient than the naive implementation of matrix multiply (O(n^3)) but the problem size needs to be so large to provide an advantage that they are not used in practice.

    Those constants that are so often ignored for big-O analysis have significant impact on efficiency and power use. For a battery powered tablet playing a video they could make the difference between being able to keep yourself entertained with videos for an entire eight hour transatlantic flight and only being able watch an single two hour movie before the battery is dead. At the other extreme for a high-performance computer system using millions of dollars of electricity each year the power might be a significant portion of the operating cost and there is great pressure to minimize those costs.

    Many programmers attempt to use parallel processing to obtain faster performance. At best parallel computers will speed up the program proportional to the number of processors in the system. For parallel computer systems it makes little sense to run a program across multiple processors if it yields no performance improvement over running on a single processor. If care is not taken writing the code, it is quite possible that the overhead to make the program run across multiple processors ends up making the parallel version of the program slower than the basic serial version. You can use your favorite search engine and find many examples of "parallel program slower than sequential program".

    Programmers are much more familiar with the simple programming model similar to the ones used for big-O analysis where the order of operations is not important and the operations have the same low costs. For high-performance computers these assumptions do not hold. How operations are grouped can impact cost and there can be orders of magnitudes difference in the cost of operations.

    Coming articles

    We're starting a series of articles that will talk about some of the high performance computing issues such as data layout, task granularity, load balancing, and contention. The goal of these articles is to help eliminate some of the assumptions that might have based on traditional simplified program models and steer people toward writing more efficient code for modern computers.

    While these topics are typically considered to be in the domain of high performance computing (HPC), this discussion are generally applicable. Each year, features from HPC trickle down to more traditional environments. For example vector processing was only found on supercomputers in the 80's and 90's, but now every virtually every laptop, desktop, and server computer has vector processing support and there are software libraries to use these capabilities.

    Last updated: March 26, 2019

    Recent Posts

    • Preventing GPU waste: A guide to JIT checkpointing with Kubeflow Trainer on OpenShift AI

    • How to manage TLS certificates used by OpenShift GitOps operator

    • Configure a split disk on OpenShift Container Platform

    • Red Hat Enterprise Linux 10.2 and 9.8: Top features for developers

    • What GPU kernels mean for your distributed inference

    Red Hat Developers logo LinkedIn YouTube Twitter Facebook

    Platforms

    • Red Hat AI
    • Red Hat Enterprise Linux
    • Red Hat OpenShift
    • Red Hat Ansible Automation Platform
    • See all products

    Build

    • Developer Sandbox
    • Developer tools
    • Interactive tutorials
    • API catalog

    Quicklinks

    • Learning resources
    • E-books
    • Cheat sheets
    • Blog
    • Events
    • Newsletter

    Communicate

    • About us
    • Contact sales
    • Find a partner
    • Report a website issue
    • Site status dashboard
    • Report a security problem

    RED HAT DEVELOPER

    Build here. Go anywhere.

    We serve the builders. The problem solvers who create careers with code.

    Join us if you’re a developer, software engineer, web designer, front-end designer, UX designer, computer scientist, architect, tester, product manager, project manager or team lead.

    Sign me up

    Red Hat legal and privacy links

    • About Red Hat
    • Jobs
    • Events
    • Locations
    • Contact Red Hat
    • Red Hat Blog
    • Inclusion at Red Hat
    • Cool Stuff Store
    • Red Hat Summit
    © 2026 Red Hat

    Red Hat legal and privacy links

    • Privacy statement
    • Terms of use
    • All policies and guidelines
    • Digital accessibility

    Chat Support

    Please log in with your Red Hat account to access chat support.