PLPMTUD delivers better path MTU discovery for SCTP in Linux

May 23, 2022
Long Xin
Related topics:
Linux
Related products:
Red Hat Enterprise Linux

    What is path MTU discovery?

    A maximum transmission unit (MTU) is the largest packet that can be transmitted as a single entity over a network connection. A sending node determines the largest packet size that every link along the route to a destination can carry through a technique called path MTU discovery (PMTUD). The goal of PMTUD is to choose the most efficient packet size that will succeed in reaching the recipient. In this article, you'll learn how this process works in the Linux kernel's implementation of the Stream Control Transmission Protocol (SCTP).

    Linux SCTP uses an algorithm called Datagram Packetization Layer Path MTU Discovery (DPLPMTUD, or just PLPMTUD), which is described in RFC 8899. Unlike earlier forms of PMTUD, this method does not rely on reception and validation of Packet Too Big (PTB) ICMP messages. The new implementation is therefore more robust than the classical PMTUD.

    PLPMTUD for SCTP was implemented in the Linux kernel some months ago and will be supported on versions 8.6 and 9.0 of Red Hat Enterprise Linux.

    PLPMTUD runs in the packetization layer (PL), which in this case is the kernel's SCTP implementation. It sends probe packets of various sizes to determine the largest unfragmented datagram that can be sent over a network path. If a probe packet is successfully delivered (as determined by the PL), the PLPMTU is raised to the size of the successful probe. If a black hole is detected (that is, if packets of size PLPMTU are consistently not received), the method reduces the PLPMTU.

    Probes and acknowledgements

    RFC 8899 defines the contents of a probe packet and the acknowledgement (ACK) returned by the recipient.

    Each probe packet consists of an SCTP common header followed by a HEARTBEAT chunk and a PAD chunk. The HEARTBEAT chunk causes the recipient to send back a HEARTBEAT ACK packet to confirm that the probe was successful. The PAD chunk pads the packet out to the size being probed.

    The HEARTBEAT chunk also carries a Heartbeat Information parameter that includes the size of the probe packet in a PROBED_SIZE field. Because the recipient returns this value in its HEARTBEAT ACK packet, the sender can be certain as to what the size of the original probe is, and therefore what is safe to assign as a packet size for further traffic. If the PROBED_SIZE field in a successful exchange is bigger than the PLPMTU, the sender can start using the larger size.
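
    As a minimal sketch of that last rule (not the kernel code), the sender's reaction to a HEARTBEAT ACK might be modeled as follows. The function name and the use of plain integers are illustrative assumptions.

```python
def on_probe_ack(plpmtu: int, echoed_probed_size: int) -> int:
    """Return the PLPMTU to use after a HEARTBEAT ACK echoes a probe size.

    The echoed size is trusted because the peer copies it back unchanged
    from the Heartbeat Information parameter of the probe.
    """
    # A confirmed probe larger than the current PLPMTU raises it;
    # an ACK for the current size simply reconfirms the existing value.
    return max(plpmtu, echoed_probed_size)
```

    For example, an ACK that echoes 1232 while the PLPMTU is 1200 raises the PLPMTU to 1232.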

    The implementation on Linux uses the timers and variables described in the following subsections.

    Timers

    There is one timer per transport. The timer is used as the PMTU_RAISE_TIMER defined in the RFC when path MTU discovery is in the Search Complete state, and as the PROBE_TIMER in other states. (You'll get an outline of these various states later in this article.)

    The timer is started once PLPMTUD is enabled and path MTU discovery enters the Base state. The timer times out after every PROBE_INTERVAL, causing the node to resend any lost probe packets.

    When in the Search Complete state, the timer times out every 30 PROBE_INTERVAL periods. The node then goes back to the Search state, using the timer to trigger resends when necessary.
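
    The dual role of the single timer can be sketched roughly like this. The enum and function names are hypothetical; only the factor of 30 and the state names come from the description above.

```python
from enum import Enum, auto


class PlState(Enum):
    BASE = auto()
    SEARCH = auto()
    SEARCH_COMPLETE = auto()
    ERROR = auto()


RAISE_MULTIPLIER = 30  # Search Complete waits 30 PROBE_INTERVAL periods


def timer_period_ms(state: PlState, probe_interval_ms: int) -> int:
    """Period of the single per-transport timer in the given state."""
    if state is PlState.SEARCH_COMPLETE:
        # Here the timer acts as the PMTU_RAISE_TIMER.
        return RAISE_MULTIPLIER * probe_interval_ms
    # In every other state it acts as the PROBE_TIMER.
    return probe_interval_ms
```

    With a PROBE_INTERVAL of 5000 ms, this model waits 150000 ms (2.5 minutes) in Search Complete before searching again.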

    Variables

    The following variables track values and state in Linux path MTU discovery:

    • PLPMTU: Keeps the most recently confirmed PROBED_SIZE. The value equals path MTU - sizeof(IP/IPv6 header). When path MTU discovery enters the Search Complete state, the path MTU used for transmission is updated to the value PLPMTU + sizeof(IP/IPv6 header).
    • PROBE_COUNT: A count for the number of successive unsuccessful probe packets that have been sent. When a probe packet is acknowledged, the value is set to zero.
    • PROBE_INTERVAL: The time interval (in milliseconds) used to schedule the PLPMTUD probe timer. The timer expires if the node fails to receive an acknowledgement to a probe packet after this period. This variable is also the time interval between probes for the current path MTU when probe searching is done.
    • PROBED_SIZE: The size of the current probe packet as determined at the PL. This value is a tentative value for the PLPMTU, awaiting confirmation by an acknowledgement.
    • PTB/PTB_SIZE: As noted above, PTB stands for Packet Too Big; this is sometimes also called Fragmentation Needed. PTB_SIZE is related to the path MTU, but with the IP header length subtracted from the value.
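
    The relationship between the PLPMTU and the path MTU is simple arithmetic. The sketch below assumes a 20-byte IPv4 header without options and a 40-byte IPv6 header without extension headers; the helper names are made up for illustration.

```python
IPV4_HEADER_LEN = 20  # bytes, assuming no IP options
IPV6_HEADER_LEN = 40  # bytes, fixed header without extension headers


def plpmtu_from_path_mtu(path_mtu: int, ipv6: bool = False) -> int:
    """PLPMTU = path MTU - sizeof(IP/IPv6 header)."""
    header = IPV6_HEADER_LEN if ipv6 else IPV4_HEADER_LEN
    return path_mtu - header


def path_mtu_from_plpmtu(plpmtu: int, ipv6: bool = False) -> int:
    """Path MTU used for transmission = PLPMTU + sizeof(IP/IPv6 header)."""
    header = IPV6_HEADER_LEN if ipv6 else IPV4_HEADER_LEN
    return plpmtu + header
```

    Under these assumptions, a 1400-byte IPv4 path MTU corresponds to a PLPMTU of 1380, and a 1500-byte one to 1480.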

    Constants

    The following constants are used in making path MTU discovery decisions:

    • BASE_PLPMTU: A configured size that is expected to work for most paths, set to 1200. PLPMTUD starts its probe with this value as PROBED_SIZE. If the real PLPMTU turns out to be smaller, the kernel enters the Error state and allows IP fragmentation.
    • MAX_PROBES: The maximum value of the PROBE_COUNT counter. If consecutive probe attempts of any size exceed this value, the state can change and start to probe in a different rhythm. This constant is set to 3.
    • BIG_STEP, MIN_STEP: The increments added to PROBED_SIZE when a probe succeeds. BIG_STEP is used when the Search state starts. If a probe fails, MIN_STEP is used instead. BIG_STEP is set to 32 and MIN_STEP to 4.
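
    One way to picture how PROBE_COUNT and MAX_PROBES interact is the small counter below. It is a sketch, not the kernel's data structure, and it assumes a probe size is treated as failed once MAX_PROBES probes in a row have gone unanswered.

```python
MAX_PROBES = 3  # value given in the list above


class ProbeCounter:
    """Tracks PROBE_COUNT for the probe size currently being tested."""

    def __init__(self) -> None:
        self.count = 0

    def on_ack(self) -> None:
        # Any acknowledged probe resets the run of failures.
        self.count = 0

    def on_timeout(self) -> bool:
        """Record one unanswered probe; True means the size has failed."""
        self.count += 1
        return self.count >= MAX_PROBES
```

    Calling on_timeout() three times in a row without an intervening on_ack() reports the size as failed on the third call.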

    Commands to set options for path MTU discovery

    System administrators or developers can change runtime parameters through the mechanisms in this section.

    sysctl

    The sysctl command provides a default value for new sockets' PROBE_INTERVAL value:

    
      sysctl -w net.sctp.plpmtud_probe_interval=5000
    

    The parameter is set for the network namespace (netns). A new association takes the value from its socket, and a new transport takes the value from its association. Changes to this parameter affect only sockets that are created subsequently.
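
    A program can inspect the same default through procfs. The sketch below assumes the usual mapping of the sysctl name to /proc/sys/net/sctp/plpmtud_probe_interval, and it assumes (based on the kernel's sysctl documentation, which is worth checking for your kernel version) that 0 disables PLPMTUD and that any other value must be at least 5000. The function names are hypothetical.

```python
from pathlib import Path

# Standard procfs location of the net.sctp.plpmtud_probe_interval sysctl.
SYSCTL_PATH = Path("/proc/sys/net/sctp/plpmtud_probe_interval")
MIN_INTERVAL_MS = 5000  # assumption taken from the kernel sysctl documentation


def read_probe_interval(path: Path = SYSCTL_PATH) -> int:
    """Return the namespace default PROBE_INTERVAL in milliseconds."""
    return int(path.read_text().strip())


def is_valid_interval(value_ms: int) -> bool:
    """0 is assumed to disable PLPMTUD; other values must be >= 5000."""
    return value_ms == 0 or value_ms >= MIN_INTERVAL_MS
```

    The file exists only when the sctp kernel module is loaded, so read_probe_interval() raises FileNotFoundError on hosts without SCTP support.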

    setsockopt

    To configure the PROBE_INTERVAL on a fine-grained basis, the setsockopt() system call can change the value for a socket, an association, or even a transport:

    
      setsockopt(sd, IPPROTO_SCTP, SCTP_PLPMTUD_PROBE_INTERVAL, &params, sizeof(params));  /* params: struct sctp_probeinterval, spi_interval in ms */
    

    State machine and steps in path MTU discovery

    Figure 1 shows the stages in path MTU discovery and how the node moves between them.

    Figure 1: The stages in path MTU discovery.

    In the following subsections, we'll examine the most common state transitions.

    Base → Search → Search Complete

    A normal probe starts from the Base state and probes with a path MTU of BASE_PLPMTU (1200). Once a packet is acknowledged, path MTU discovery enters the Search state. The next probe starts by incrementing the PROBED_SIZE by BIG_STEP (32).

    This probe-ack-increment-probe sequence advances until a probe packet fails to be acknowledged. When that happens, the node resends the probe MAX_PROBES-1 more times (2 times, by default) with the same PROBED_SIZE following a wait of PROBE_INTERVAL milliseconds.

    If these probes do not succeed, the node starts to probe with the PLPMTU and increments it by MIN_STEP (4) each time. It continues until one probe fails in the same way as before. The node then assumes that the proper PLPMTU is found. The node updates the path MTU on transport and enters the Search Complete state.
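
    The coarse-then-fine search can be simulated in a few lines. This is a simplified model rather than the kernel algorithm: a probe counts as delivered whenever its size fits within a hypothetical path_plpmtu, a failure stands for all MAX_PROBES attempts being lost, and the reconfirmation probes are left out.

```python
from typing import List, Optional, Tuple

BASE_PLPMTU = 1200
BIG_STEP = 32
MIN_STEP = 4


def initial_search(path_plpmtu: int) -> Tuple[Optional[int], List[int]]:
    """Simulate Base -> Search -> Search Complete.

    Returns the confirmed PLPMTU (or None for the Error state) together
    with every size that was probed, in order.
    """
    probed = [BASE_PLPMTU]
    if BASE_PLPMTU > path_plpmtu:
        return None, probed  # Error state: even the base size is lost

    plpmtu = BASE_PLPMTU
    for step in (BIG_STEP, MIN_STEP):  # coarse pass, then fine pass
        while True:
            size = plpmtu + step
            probed.append(size)
            if size > path_plpmtu:
                break  # probe lost; fall back to the last confirmed size
            plpmtu = size
    return plpmtu, probed
```

    For a path whose real PLPMTU is 1380, initial_search(1380) settles on 1380 after the coarse pass overshoots and the fine pass closes the gap.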

    Search Complete → Search → Search Complete

    In the Search Complete state, the node waits for 30 intervals of PROBE_INTERVAL milliseconds unless it notices a data retransmission.

    When there is a data retransmission or the 30 PROBE_INTERVAL periods expire, the node enters the Search state. It starts probing with a probe of PROBED_SIZE+MIN_STEP. If this probe gets acknowledged, the next probe increments the size by BIG_STEP. When a new proper PLPMTU is found, the node updates the transport path MTU with that value and re-enters the Search Complete state.
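
    A rough model of this re-search is shown below. It assumes the currently confirmed PLPMTU still works (the path did not shrink) and, as before, treats a probe as delivered when it fits within a hypothetical path_plpmtu.

```python
BIG_STEP = 32
MIN_STEP = 4


def resume_search(plpmtu: int, path_plpmtu: int) -> int:
    """Simulate Search Complete -> Search -> Search Complete."""
    # The first probe only tests one MIN_STEP above the confirmed size.
    if plpmtu + MIN_STEP > path_plpmtu:
        return plpmtu  # nothing larger fits; keep the current value
    plpmtu += MIN_STEP

    # The path grew, so search upward coarsely, then finely.
    for step in (BIG_STEP, MIN_STEP):
        while plpmtu + step <= path_plpmtu:
            plpmtu += step
    return plpmtu
```

    Starting from 1380, resume_search(1380, 1480) ends at 1480, while resume_search(1380, 1380) stays at 1380 because the first small probe already fails.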

    Search/Search Complete → Base

    During a search, any probe that fails with the PROBED_SIZE set to the PLPMTU causes a "Black Hole detected" error, sending the node back to the Base state.

    If a probe fails even with the BASE_PLPMTU, the node enters the Error state, where IP fragmentation is allowed.
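
    These two failure rules can be written as a small transition function. The names are illustrative, and the sketch covers only the failure paths described here; the transition into Search Complete is outside its scope.

```python
from enum import Enum, auto


class PlState(Enum):
    BASE = auto()
    SEARCH = auto()
    SEARCH_COMPLETE = auto()
    ERROR = auto()


def on_probe_failed(state: PlState, probed_size: int, plpmtu: int) -> PlState:
    """Next state after a probe size has failed all of its retries."""
    if state is PlState.BASE:
        # Even BASE_PLPMTU does not get through: allow IP fragmentation.
        return PlState.ERROR
    if state in (PlState.SEARCH, PlState.SEARCH_COMPLETE) and probed_size == plpmtu:
        # The size that used to work is now lost: black hole detected.
        return PlState.BASE
    # A failed probe above the PLPMTU is a normal part of searching.
    return state
```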

    Packet Too Big or Fragmentation Needed

    During PTB packet processing, if the PTB_SIZE is between the PLPMTU and PROBED_SIZE, the next probe starts with the PROBED_SIZE set to PTB_SIZE. This could save some rounds of probing when finding the proper PLPMTU.
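
    The window check described above amounts to a single comparison. The function below is a sketch with a hypothetical name; PTB sizes outside the window are simply ignored here, although the kernel may handle them with other rules.

```python
def next_probe_after_ptb(plpmtu: int, probed_size: int, ptb_size: int) -> int:
    """Return the PROBED_SIZE to use after a PTB message is processed."""
    if plpmtu < ptb_size < probed_size:
        # The router's hint narrows the search window.
        return ptb_size
    # Hints outside the window are not used by this sketch.
    return probed_size
```

    For instance, with a PLPMTU of 1384 and an outstanding probe of 1416, a PTB_SIZE of 1408 makes 1408 the next size to probe.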

    Path MTU discovery example scenarios

    This section contains examples showing how PROBED_SIZE changes during PLPMTUD probing in different scenarios in Linux SCTP. The examples are based on the topology in Figure 2. The topology has two hosts (a client and a server) and a router with two interfaces:

    • A host (client) at link1_1 exchanges packets with the router at its link1_2 interface.
    • A host (server) at link2_2 exchanges packets with the router at its link2_1 interface.
    Figure 2: Network topology for our examples.

    We begin by starting an SCTP connection from client to server:

    
      sctp_darn -H 192.168.2.1 -P 8888 -l  # on Server
      sctp_darn -H 192.168.1.1 -P 8888 -h 192.168.2.1 -p 8888 -s  # on Client
    

    Each of the scenarios that follow shows system administration commands that trigger a path MTU discovery sequence change, and the steps followed by the kernel to set the path MTU.

    Basic sequence

    Many sequences in path MTU discovery return to the one in this section. The sequence can be triggered through the following commands:

    
      iptables -A INPUT -p icmp -j DROP  # on Client, disable the classical PMTUD
      ip link set link2_1 mtu 1400  # on Router
    

    Steps in path MTU discovery (Base → Search → Complete):

    1. Probed size: 1200 (Starts at BASE_PLPMTU (1200), tries to confirm)
    2. Probed size: 1200 (Confirmed and enters Search, increments by BIG_STEP)
    3. Probed size: 1232 → 1264 → ... → 1360
    4. Probed size: 1392 (3-time-rtx failed, goes back to 1360)
    5. Probed size: 1360 (increments by MIN_STEP) → 1364 → 1368 → ... → 1380
    6. Probed size: 1384 (3-time-rtx failed, tries to confirm 1380)
    7. Probed size: 1380 (confirmed, enters Complete and sets path MTU)
    8. Probed size: 1380 (raise-timer up, enters Search, increments by MIN_STEP)
    9. Probed size: 1384 (3-time-rtx failed, tries to confirm 1380)
    10. Probed size: 1380 (confirmed, enters Complete and sets path MTU)

    Other sample sequences

    In this section, you'll see the sequences that can be triggered by a number of specific commands.

    If this command is entered:

    
      ip link set link2_1 mtu 1500  # on Router
    

    Then these are the steps in path MTU discovery (Complete → Search → Complete):

    1. Probed size: 1380 (raise-timer up, tries to confirm 1380)
    2. Probed size: 1380 (confirmed, enters Search, increments by MIN_STEP)
    3. Probed size: 1384 (confirmed, increments by BIG_STEP) → 1416 → 1448 → ... → 1480
    4. Probed size: 1512 (3-time-rtx failed, goes back to 1480)
    5. Probed size: 1480 (increments by MIN_STEP)
    6. Probed size: 1484 (3-time-rtx failed, tries to confirm 1480)
    7. Probed size: 1480 (confirmed, enters Complete and sets path MTU)

    If this command is entered:

    
      ip link set link2_1 mtu 1400  # on Router
    

    Then these are the steps in path MTU discovery (Complete → Base → Search):

    1. Probed size: 1480 (raise-timer up, tries to confirm 1480)
    2. Probed size: 1480 (3-time-rtx failed, enters Base, goes back to 1200, sets path MTU)
    3. Probed size: 1200 (confirmed, enters Search, increments by BIG_STEP)
    4. Probed size: 1232
    5. Starts basic sequence

    If this command is entered:

    
      ip link set link2_1 mtu 1000  # on Router
    

    Then these are the steps in path MTU discovery (Complete → Search → Base → Error):

    1. Probed size: 1380 (raise-timer up, tries to confirm 1380)
    2. Probed size: 1380 (3-time-rtx failed, enters Base, goes back to 1200 and sets path MTU)
    3. Probed size: 1200 (3-time-rtx failed, enters Error, allows IP fragmentation)
    4. Probed size: 1200 (3-time-rtx failed, enters Error, allows IP fragmentation)
    5. Probed size: 1200 (...)

    If this command is entered:

    
      ip link set link2_1 mtu 1400  # on Router
    

    Then these are the steps in path MTU discovery (Error → Base → Search → Complete):

    1. Probed size: 1200 (confirmed, enters Base, tries to confirm 1200 again)
    2. Probed size: 1200 (confirmed, enters Search, increments by BIG_STEP)
    3. Probed size: 1232
    4. Starts basic sequence

    If this command is entered:

    
      ip link set link1_1 mtu 1500  # on Client
    

    Then these are the steps in path MTU discovery (Complete → Base):

    1. Probed size: 1380 (rtx-timer reset, enters Base, goes back to 1200)
    2. Probed size: 1200 (confirmed, enters Search, increments by BIG_STEP)
    3. Probed size: 1232
    4. Starts basic sequence

    If this command is entered:

    
      iptables -D INPUT -p icmp -j DROP  # on Client, enable the classical PMTUD
      ip link set link1_1 mtu 1430  # on Client
    

    Then these are the steps in path MTU discovery (Complete → Search → Complete):

    1. Probed size: 1380 (raise-timer up, tries to confirm 1380)
    2. Probed size: 1380 (confirmed, enters Search, increments by MIN_STEP)
    3. Probed size: 1384 (confirmed, increments by BIG_STEP)
    4. Probed size: 1416 (PTB received (path MTU == 1430), tries to confirm the path MTU from it)
    5. Probed size: 1408 (confirmed, increments by BIG_STEP)
    6. Probed size: 1440 (3-time-rtx failed, goes back to 1408)
    7. Probed size: 1408 (increments by MIN_STEP)
    8. Probed size: 1412 (3-time-rtx failed, tries to confirm 1408)
    9. Probed size: 1408 (enters Complete and sets path MTU)

    If this command is entered:

    
      ip link set link2_1 mtu 1400  # on Router
    

    Then these are the steps in path MTU discovery (Complete → Base):

    1. Probed size: 1408 (raise-timer up, tries to confirm 1408)
    2. Probed size: 1408 (PTB received (path MTU < 1408), enters Base, goes back to 1200 and sets path MTU)
    3. Probed size: 1200 (confirmed, enters Search, increments by BIG_STEP)
    4. Probed size: 1232
    5. Starts basic sequence

    If these commands are entered:

    
      iptables -A INPUT -p icmp -j DROP  # on Client, disable the classical PMTUD
      ip link set link2_1 mtu 1300  # on Router
      # on Client, input 1350 bytes data in sctp_darn
    

    Then these are the steps in path MTU discovery (Complete → Base):

    1. Probed size: 1380 (Data RTX happens, tries to confirm 1380)
    2. Probed size: 1380 (3-time-rtx failed, enters Base, goes back to 1200 and sets path MTU)
    3. Probed size: 1200 (confirmed, enters Search, increments by BIG_STEP)
    4. Probed size: 1232
    5. Starts basic sequence

    Conclusion

    ICMP PTB (Fragmentation Needed) messages are often dropped or disabled by routers and servers along the path. When that happens, classical PMTUD cannot determine the proper path MTU, which causes inefficient data transmission and even packet loss. PLPMTUD provides an effective way to overcome this. If you are an SCTP user, this article has shown you the details of how PLPMTUD works in SCTP, and how it can be used in your SCTP programs.

    Last updated: October 8, 2024
