
How to autoscale your SaaS application infrastructure

January 12, 2023
Michael Hrivnak
Related topics: Automation and management, Linux, Kubernetes
Related products: Red Hat OpenShift, Red Hat Enterprise Linux

    This article discusses both how and why to scale your infrastructure automatically so that you aren't paying for resources you don't need. This is the last installment in the SaaS architecture checklist series.

    If you are a Software-as-a-Service (SaaS) provider, it is important to manage operational expenses while ensuring that your platform's capacity can always meet the needs of your users. Whether the traffic to your SaaS peaks on a predictable schedule (for example, office hours on weekdays or seasonal shopping) or whether you are planning for growth, Kubernetes has features to make sure you have the right level of capacity at any given time.

    SaaS revenue typically comes through recurring fees that scale based on metrics such as the number of users, quantity of data stored or processed, access to advanced features, and other similar points of value. While each SaaS provider decides on their own pricing model and consumption metric, there is one common goal: The more your SaaS gets used, the more revenue you make.

    But that usage also incurs increased infrastructure demand and expense. Maintaining a healthy operating income depends on keeping your infrastructure expenses below the corresponding revenue.

    It pays to build on a compute platform that can scale in response to demand, with automation that scales the application in real time. This article covers basic techniques for automatically scaling a Kubernetes cluster and an application based on load by combining three components:

    • Cluster API to enable cluster resizing
    • Cluster Autoscaler to resize the cluster based on load
    • Horizontal Pod Autoscaler to scale an application based on load

    Kubernetes native infrastructure

    Kubernetes native infrastructure is a pattern for using the Kubernetes API to manage some or all of the compute and network infrastructure used by a cluster. This concept is the foundation for some of the scalability components described in the following sections.

    You can extend the Kubernetes API through custom resources, which allow a software developer to add a new API endpoint to a cluster, write a controller to implement the features of that endpoint, and thus create a Kubernetes-native API to manage almost anything. You might have heard of the Operator Pattern that uses this technique of extending the Kubernetes API, typically for the purpose of managing applications, but custom resources can also be used to manage infrastructure.
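    As a rough illustration, a custom resource type is registered with a CustomResourceDefinition; the hypothetical Widget type below adds a new API endpoint to the cluster that a controller could then reconcile (all names here are invented for illustration):

    apiVersion: apiextensions.k8s.io/v1
    kind: CustomResourceDefinition
    metadata:
      name: widgets.example.com        # hypothetical resource
    spec:
      group: example.com
      scope: Namespaced
      names:
        kind: Widget
        plural: widgets
        singular: widget
      versions:
      - name: v1
        served: true
        storage: true
        schema:
          openAPIV3Schema:
            type: object
            properties:
              spec:
                type: object
                properties:
                  size:
                    type: integer      # a controller would reconcile this field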

    Cluster API

    The Cluster API project is a popular example of Kubernetes-native infrastructure management. One of its core capabilities is adding and removing nodes in an existing cluster using a concept called Machines. A Machine resource represents the desire to have a node in the cluster. A MachineSet resource (similar in concept to a ReplicaSet) includes a desired size and a template for creating Machines.

    The Machine and MachineSet APIs let you resize a cluster just by changing an integer field to the number of nodes you desire. Figure 1 shows the API's basic behavior.

    Figure 1: The Machine API watches for changes and implements them using the platform-specific API.
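    For example, a Cluster API MachineSet might look like the sketch below. The names and the bootstrap and infrastructure references are illustrative (the exact template kinds depend on your provider), but changing spec.replicas is all it takes to resize the cluster, for instance with kubectl scale machineset/frontend-workers --replicas=5:

    apiVersion: cluster.x-k8s.io/v1beta1
    kind: MachineSet
    metadata:
      name: frontend-workers           # illustrative name
      namespace: default
    spec:
      clusterName: my-cluster          # illustrative cluster name
      replicas: 3                      # change this integer to resize the cluster
      selector:
        matchLabels:
          cluster.x-k8s.io/cluster-name: my-cluster
      template:
        metadata:
          labels:
            cluster.x-k8s.io/cluster-name: my-cluster
        spec:
          clusterName: my-cluster
          bootstrap:
            configRef:                 # provider-specific bootstrap template
              apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
              kind: KubeadmConfigTemplate
              name: frontend-workers
          infrastructureRef:           # provider-specific machine template (AWS here)
            apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
            kind: AWSMachineTemplate
            name: frontend-workers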

    At a very high level, cluster scaling using the Cluster API project works like this:

    • A person or automation decides to change the size of a MachineSet (more on this later).
    • A controller process either creates new Machine resources or deletes existing ones based on comparing the desired number of machines to the number that actually exists.
    • When a new Machine resource gets created, a platform-specific controller uses a cloud API to add a host to the cluster. For example, on AWS the controller would use the EC2 API to provision a new host.
    • When a Machine resource gets deleted, the platform-specific controller drains the corresponding node of workloads and then uses the platform-specific API to delete the host.

    With this simple approach, it's easy to scale a cluster up and down. The API works the same across different cloud providers, so scaling changes can be made without any cloud-specific knowledge.

    But how can the scaling decision to add and remove nodes be automated based on the real-time load? That's the responsibility of the Cluster Autoscaler.

    Cluster Autoscaler

    The Cluster Autoscaler uses the Machine API to add and remove nodes in a cluster based on its observations of workload scheduling pressure and node utilization. The Autoscaler adds nodes when pods cannot be scheduled due to resource constraints and removes underused nodes. Additional configuration is available, such as limits on how many nodes to create and settings that tune how aggressively the Autoscaler removes underused nodes.

    Combining the Machine API with the Cluster Autoscaler, you can have a self-scaling compute platform that responds to the resource needs of your SaaS application.

    To learn more about how Red Hat OpenShift enables machine management and scaling, including example resources, see the overview of machine management and the autoscaler section of the product documentation.
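    On OpenShift, for instance, autoscaling is typically enabled with a ClusterAutoscaler resource plus one MachineAutoscaler per MachineSet. The sketch below is a minimal, illustrative configuration; the names and limits are assumptions, not defaults:

    apiVersion: autoscaling.openshift.io/v1
    kind: ClusterAutoscaler
    metadata:
      name: default                    # this resource must be named "default"
    spec:
      resourceLimits:
        maxNodesTotal: 20              # illustrative upper bound on cluster size
      scaleDown:
        enabled: true                  # allow removal of underused nodes
        unneededTime: 10m              # how long a node must be underused first
    ---
    apiVersion: autoscaling.openshift.io/v1beta1
    kind: MachineAutoscaler
    metadata:
      name: frontend-workers           # illustrative name
      namespace: openshift-machine-api
    spec:
      minReplicas: 1
      maxReplicas: 12
      scaleTargetRef:
        apiVersion: machine.openshift.io/v1beta1
        kind: MachineSet
        name: frontend-workers         # the MachineSet to scale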

    But there is one more piece to the puzzle: How can you scale your application, adding and removing pods as its load changes over time? That is the job of the Horizontal Pod Autoscaler.

    Horizontal Pod Autoscaler

    Horizontal Pod Autoscaling adds and removes pods in response to changing load. Scaling decisions are based on resource metrics, most commonly CPU utilization and memory footprint. You can define targets so that pods are added or removed to keep per-pod resource utilization near desired values. In the common case where a workload is defined as a Deployment, this Autoscaler adjusts the Deployment's replicas field based on load.

    In the following example, a frontend-app Deployment is scaled between 10 and 100 pods. Crucially, the Deployment's pod template must include a CPU resource request. The Autoscaler uses each pod's actual CPU utilization to calculate a percentage of its CPU request, then uses the average across pods to decide whether to scale the application up or down.

    
    apiVersion: autoscaling/v1
    kind: HorizontalPodAutoscaler
    metadata:
      name: frontend-app
      namespace: default
    spec:
      maxReplicas: 100                 # never scale above 100 pods
      minReplicas: 10                  # never scale below 10 pods
      scaleTargetRef:                  # the workload whose replicas field is managed
        apiVersion: apps/v1
        kind: Deployment
        name: frontend-app
      targetCPUUtilizationPercentage: 75   # target average CPU, relative to each pod's request
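
    As noted above, this only works if the target Deployment's pod template declares a CPU request. A minimal, illustrative fragment is shown below; the image name is hypothetical:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: frontend-app
      namespace: default
    spec:
      replicas: 10                     # the Autoscaler takes over this value
      selector:
        matchLabels:
          app: frontend-app
      template:
        metadata:
          labels:
            app: frontend-app
        spec:
          containers:
          - name: frontend
            image: registry.example.com/frontend-app:1.0   # hypothetical image
            resources:
              requests:
                cpu: 500m              # required for CPU-based autoscaling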

    The newer autoscaling/v2 API lets you scale based on memory utilization and define more sophisticated policies. New policy options can configure how rapidly to make changes, including different policies that apply to scaling up versus scaling down. This API also supports the use of custom metrics.
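    For instance, a minimal autoscaling/v2 sketch might scale the same hypothetical Deployment on memory utilization while limiting how quickly pods are removed; the target and policy values here are illustrative, not recommendations:

    apiVersion: autoscaling/v2
    kind: HorizontalPodAutoscaler
    metadata:
      name: frontend-app
      namespace: default
    spec:
      minReplicas: 10
      maxReplicas: 100
      scaleTargetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: frontend-app
      metrics:
      - type: Resource
        resource:
          name: memory
          target:
            type: Utilization
            averageUtilization: 80     # target average memory, relative to requests
      behavior:
        scaleDown:
          stabilizationWindowSeconds: 300
          policies:
          - type: Percent
            value: 10                  # remove at most 10% of pods...
            periodSeconds: 60          # ...per minute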

    SaaS provisioning can be automated under Kubernetes

    Combining the Cluster API, Cluster Autoscaler, and Horizontal Pod Autoscaler, you can automatically scale your SaaS application up and down based on changes in utilization. The underlying infrastructure scales automatically to match your workloads' resource requirements, saving costs when utilization is low.

    Red Hat SaaS Foundations is a partner program designed for building enterprise-grade SaaS solutions on the Red Hat OpenShift or Red Hat Enterprise Linux platforms and deploying them across multiple cloud and non-cloud footprints. Email us to learn more about partnering with Red Hat to build your SaaS.

    Be sure to read all the articles in the SaaS architecture checklist series. Comment below if you have questions. We welcome your feedback. 

    Last updated: September 20, 2023

    Related Posts

    • A SaaS architecture checklist for Kubernetes

    • Approaches to implementing multi-tenancy in SaaS applications

    • SaaS security in Kubernetes environments: A layered approach
