Troubleshooting with fault tree analysis and PIOSEE

January 21, 2026
Francisco De Melo Junior, Alexander Barbosa Ayala
Related topics:
DevOps, Linux, Kubernetes, Spring Boot
Related products:
Red Hat build of OpenJDK, Red Hat OpenShift, RHEL UBI

    This article explains two methods that I use frequently to troubleshoot both Red Hat OpenShift and middleware problems. These two methods apply to different scenarios rather than the same problem:

    • Fault tree analysis (FTA) helps you find the root cause of a problem by eliminating components.
    • The PIOSEE framework (Problem, Information, Options, Select, Execute, Evaluate) is for critical production environments where you must take action immediately.

    Finally, this article explores how AI tools, such as gen AI, can help you apply these methods to speed up root cause analysis and decide on the best course of action. Although this goes beyond our typical technical discussions, these frameworks for fault scenarios remain useful even when you work with gen AI or agent-based AI tools.

    Fault tree analysis (FTA)

    Fault tree analysis, illustrated in Figure 1, is a top-down method for tracking system failures. You can use it to find application deployment issues in environments like OpenShift 4.

    The aviation industry commonly uses this method, and the Federal Aviation Administration (FAA) relies on it. Because the method is not specific to aviation, other industries also apply it, including IT teams that use it for troubleshooting.

    A fault tree analysis diagram showing an "Undesired Event" at the top, branching down into types of failures, failed components, and possible solutions.
    Figure 1: The hierarchical structure of a fault tree analysis.
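
    The hierarchy in Figure 1 can be sketched as a small tree that you prune as you eliminate components. This is a minimal illustration, not a standard FTA tool: the Node class, the remaining_suspects function, and the example tree are all hypothetical names and data chosen for this sketch.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    """One box in the fault tree: an event, a failure type, or a component."""
    name: str
    children: list["Node"] = field(default_factory=list)
    solution: Optional[str] = None  # only meaningful on leaf components


def remaining_suspects(node: Node, ruled_out: set[str]) -> list[Node]:
    """Walk top-down and return the leaf components not yet eliminated.

    Ruling out a branch (for example, a whole failure type) also rules
    out every component beneath it.
    """
    if node.name in ruled_out:
        return []
    if not node.children:
        return [node]
    suspects: list[Node] = []
    for child in node.children:
        suspects.extend(remaining_suspects(child, ruled_out))
    return suspects


# Hypothetical tree for illustration; the names are not from a real cluster.
tree = Node("Pod restarts repeatedly", [
    Node("Resource failure", [
        Node("Container memory limit", solution="Review limits and heap settings"),
        Node("Node memory pressure", solution="Inspect node metrics and evictions"),
    ]),
    Node("Image failure", [
        Node("Base image change", solution="Compare image digests between releases"),
    ]),
])

# The team has already eliminated one failure type and one component.
suspects = remaining_suspects(tree, ruled_out={"Image failure", "Node memory pressure"})
for suspect in suspects:
    print(f"{suspect.name}: {suspect.solution}")
```

    Each elimination shrinks the list of suspects, which is the core idea of working top-down from the undesired event.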

    Problematic scenarios

    These scenarios illustrate common challenges where applying a structured troubleshooting method can prevent small issues from becoming major outages.

    Ineffective diagrams for container image troubleshooting

    After identifying a container image change that caused a problem, the deployment and development teams discussed the root cause. They listed the events that contributed to the issue and realized their current diagrams did not help them analyze the problem.

    Memory fluctuations and cluster instability

    A new server deployment caused memory use to fluctuate, which affected the entire OpenShift cluster. In this complex scenario, the node is failing and might crash. This small problem could lead to a significant outage for other applications in the cluster.

    OOMKilled errors during OpenShift migration

    A Spring Boot application that uses the Red Hat Universal Base Image (UBI) for the Red Hat build of OpenJDK was moved from OpenShift 3 to OpenShift 4.14+. The application is now failing due to a cgroups OOMKilled error, even though the deployment has not changed since OpenShift 3.11. For example:

    Jun 08 21:58:58 dev-example kernel: Memory cgroup out of memory: Killed process 2826157 (vector) total-vm:1570664kB, anon-rss:1039172kB, file-rss:45312kB, shmem-rss:0kB, UID:0 pgtables:3116kB oom_score_adj:-997
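
    When you collect many of these kernel messages, it can help to extract the fields into a structure you can compare across nodes. The following sketch parses the sample line above with a regular expression. The function name parse_oom_line and the returned keys are assumptions made for this example, and the pattern is only written against the message format shown here, which may differ between kernel versions.

```python
import re
from typing import Optional

# Matches the "Memory cgroup out of memory" message format shown above.
OOM_PATTERN = re.compile(
    r"Memory cgroup out of memory: Killed process (?P<pid>\d+) \((?P<comm>[^)]+)\)"
    r" total-vm:(?P<total_vm_kb>\d+)kB,"
    r" anon-rss:(?P<anon_rss_kb>\d+)kB,"
    r" file-rss:(?P<file_rss_kb>\d+)kB"
)


def parse_oom_line(line: str) -> Optional[dict]:
    """Return the killed process and its memory figures, or None if no match."""
    match = OOM_PATTERN.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    anon_kb = int(fields["anon_rss_kb"])
    file_kb = int(fields["file_rss_kb"])
    return {
        "pid": int(fields["pid"]),
        "process": fields["comm"],
        "total_vm_kb": int(fields["total_vm_kb"]),
        "anon_rss_kb": anon_kb,
        "file_rss_kb": file_kb,
        "rss_kb": anon_kb + file_kb,  # resident memory: anonymous + file-backed
    }


sample = (
    "Jun 08 21:58:58 dev-example kernel: Memory cgroup out of memory: "
    "Killed process 2826157 (vector) total-vm:1570664kB, anon-rss:1039172kB, "
    "file-rss:45312kB, shmem-rss:0kB, UID:0 pgtables:3116kB oom_score_adj:-997"
)

info = parse_oom_line(sample)
print(info["process"], info["pid"], f"{info['rss_kb'] / 1024:.1f} MiB resident")
```

    Extracting the process name is useful on its own: in this sample, the process that the kernel killed is named vector.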

    When to use fault tree analysis

    While this method was not created for IT troubleshooting, it provides high-level guidance for investigations. It is straightforward to apply if:

    • The team understands the system overview.
    • The team includes experts to help.
    • The team knows which components of the deployment are problematic, which improves the chances of success.

    Fault tree analysis helps you solve problems using a structured approach. In the three scenarios described above, this method can reduce recovery time and help you find the specific root cause faster.

    PIOSEE framework introduction

    The PIOSEE method, originally from the aviation industry, helps you troubleshoot problems in production environments. PIOSEE stands for Problem, Information, Options, Select, Execute, and Evaluate, as described in the following table and illustrated in Figure 2. The steps follow a mandatory sequence, but you can move through them quickly, advancing to the next step as soon as the current one is complete.

    Step  Name         Action
    P     Problem      Diagnose the problem.
    I     Information  Gather as much information as possible.
    O     Options      Verify which options apply and consider the trade-offs.
    S     Select       Select an option based on time and trade-offs.
    E     Execute      Execute the option as closely as possible to the defined procedure.
    E     Evaluate     Evaluate the option based on the output.
    A diagram showing the six steps of the PIOSEE framework.
    Figure 2: The PIOSEE decision-making framework uses six steps to guide technical troubleshooting.

    Each step requires a specific action. For example, during the first step (Problem), you should verify the specific issue in the most rational and direct way possible.
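
    Because the steps must happen in order, a team could track an incident with a small helper that refuses to skip ahead. The sketch below is one possible way to model that, assuming a hypothetical PioseeRun class; the notes are illustrative and not taken from a real incident.

```python
from enum import IntEnum
from typing import Optional


class Step(IntEnum):
    PROBLEM = 1
    INFORMATION = 2
    OPTIONS = 3
    SELECT = 4
    EXECUTE = 5
    EVALUATE = 6


class PioseeRun:
    """Tracks one pass through PIOSEE and rejects out-of-order steps."""

    def __init__(self) -> None:
        self.notes: dict[Step, str] = {}

    @property
    def current(self) -> Optional[Step]:
        """The next step to complete, or None when all six are done."""
        for step in Step:
            if step not in self.notes:
                return step
        return None

    def complete(self, step: Step, note: str) -> None:
        """Record a step, but only if it is the next one in the sequence."""
        expected = self.current
        if expected is None:
            raise ValueError("all steps are already complete")
        if step != expected:
            raise ValueError(f"expected {expected.name}, got {step.name}")
        self.notes[step] = note


run = PioseeRun()
run.complete(Step.PROBLEM, "Pods restart with OOMKilled")
run.complete(Step.INFORMATION, "Collected kernel log and pod events")

try:
    run.complete(Step.EXECUTE, "Roll back the deployment")
except ValueError as err:
    print(err)  # expected OPTIONS, got EXECUTE
```

    The point of the guard is the same as in the table: you can move fast, but you do not execute before you have listed and selected an option.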

    Problematic scenarios

    These scenarios show how the PIOSEE framework helps you manage complex global handovers and technical ambiguity so that you can solve problems faster.

    Global handovers during regional outages

    Problem: After a new feature deployment, a Spring Boot application in OpenShift is stuck in a crash loop (CrashLoopBackOff).

    Root cause: The team later identified a cgroups OOMKilled error.

    How to use this method: Because the crash occurred across several regions, teams collaborated globally to find the cgroups root cause. Explaining the problem status and next steps between teams can be difficult, which slows down handovers. The PIOSEE framework helps the next shift understand the investigation's current stage and the required steps using clear language.
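
    One way to make the investigation stage visible to the next shift is a short, plain-text handover note organized by PIOSEE step. The following is a minimal sketch under that assumption. The handover_note function and the note contents are hypothetical and only loosely based on the scenario above.

```python
PIOSEE_STEPS = ["Problem", "Information", "Options", "Select", "Execute", "Evaluate"]


def handover_note(completed: dict[str, str], next_action: str) -> str:
    """Render a plain-text handover: finished steps, the current step, what remains."""
    lines = []
    current_found = False
    for step in PIOSEE_STEPS:
        if step in completed:
            lines.append(f"[done] {step}: {completed[step]}")
        elif not current_found:
            # The first unfinished step is where the next shift picks up.
            current_found = True
            lines.append(f"[now]  {step}: {next_action}")
        else:
            lines.append(f"[todo] {step}")
    return "\n".join(lines)


note = handover_note(
    completed={
        "Problem": "Spring Boot application is in a crash loop after a feature deployment",
        "Information": "Kernel logs show cgroup OOMKilled events in several regions",
    },
    next_action="List rollback and memory-tuning options with their trade-offs",
)
print(note)
```

    The receiving team can see at a glance that the Problem and Information steps are complete and that work resumes at Options.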

    Ambiguous next steps for production fixes

    Problem: After identifying the root cause of a production problem, the deployment team received clearance for the next update. However, during the review, the team lacked a clear next step. This ambiguity delayed the deployment even though the handover was complete.

    How to use this method: In this case, the PIOSEE framework clarifies the next step. If the current process is unclear, the team knows how to fall back in concrete deployment terms.

    Best practices for using PIOSEE

    Using the PIOSEE framework is an effective way to debug live problems and improve efficiency. However, keep a few points in mind:

    • Train the team on how to use the framework correctly.
    • Discussing the framework should not take more time than solving the actual problem. This includes the time spent on communication and tools.
    • Discuss the implementation openly rather than imposing it. Open communication and transparency allow teams to suggest other methods.

    The PIOSEE framework also helps you transition investigations between teams, and it provides a structure for post-mortem discussions, later reviews, and due diligence.

    Containment and contingency plans

    Containment plans and contingency plans are helpful strategies, though they are not heuristics or methodologies. You can use them with the PIOSEE phases to speed up the recovery process.

    A containment plan is a set of steps to isolate a problem and limit its impact.

    A contingency plan might include having a load balancer or a reliable version of your application ready to deploy immediately.
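
    Prepared plans fit naturally into the Options and Select steps, where you choose based on time and trade-offs. The sketch below shows one possible selection rule: pick the lowest-risk option that fits the time available. The rule itself, the Option fields, and the example options and estimates are all assumptions for illustration; your team would supply its own.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Option:
    name: str
    kind: str      # "containment", "contingency", or "fix"
    minutes: int   # estimated time to execute
    risk: int      # 1 (low) to 5 (high), based on the team's own judgment


def select_option(options: list[Option], minutes_available: int) -> Optional[Option]:
    """Return the lowest-risk option that fits the time budget (ties: fastest)."""
    feasible = [o for o in options if o.minutes <= minutes_available]
    if not feasible:
        return None
    return min(feasible, key=lambda o: (o.risk, o.minutes))


# Hypothetical options and estimates.
options = [
    Option("Scale the failing deployment to zero", "containment", minutes=5, risk=2),
    Option("Redeploy the last reliable version", "contingency", minutes=15, risk=1),
    Option("Adjust memory settings and redeploy", "fix", minutes=60, risk=3),
]

choice = select_option(options, minutes_available=20)
print(choice.name if choice else "No option fits the time available")
```

    With 20 minutes available, this rule picks the contingency plan. With only 10 minutes, it falls back to containment.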

    AI as a tool for troubleshooting

    AI tools, such as gen AI, can complement the methods described earlier and help you speed up the following tasks:

    • Suggesting high-level ideas and possible causes
    • Explaining low-level details and next steps
    • Working with documentation

    However, keep the following in mind when you use AI with either method:

    • Generic ideas from AI can distract from the investigation and delay finding the root cause.
    • Low-level details might be outdated or incorrect depending on the AI's sources.
    • AI might not cite sources, making it harder to verify information.

    These tools can produce useful results, but you must review the output carefully to stay on track.

    To keep this article concise, I will not go into detail here. I may cover this in a future article. AI tools can help with many tasks, such as scraping data, generating statistics and code, or reviewing documentation. Here, I wanted to highlight tips for combining AI with FTA and PIOSEE.

    Conclusion

    In summary, this article covers two troubleshooting methods: FTA and PIOSEE. Both help speed up your work and guide your next steps. FTA focuses on root cause analysis, while PIOSEE helps you take immediate action and find your way through a live problem.

    Finally, we looked at how AI tools can complement these methods. Due to the article's scope, we focused on specific uses within these frameworks.

    Additional resources

    For specific inquiries, open a case with Red Hat support. Our global team of experts can help you with any issues.

    Special thanks to Joshua Brandenburg for his leadership on these issues over the years.
