
Lessons on reproducing R1-like reasoning in small LLMs

Without using DeepSeek-R1-Zero (or its derivatives)

February 25, 2025
Akash Srivastava, Isha Puri, Kai Xu, Shivchander Sudalairaj, Mustafa Eyceoz, Oleg Silkin, Abhishek Bhandwaldar, Aldo Pareja, GX Xu
Related topics: Artificial intelligence
Related products: Red Hat AI


    Our latest research reveals novel inference scaling methods that boost reasoning ability in LLMs without additional training. Using techniques like particle filtering, our methods guide small models to discover their potential, improving performance and scaling up to rival larger models.

    The path from CoT to inference-time scaling

    Our journey toward R1-like reasoning starts with how Chain of Thought (CoT) data was synthesized to improve models.

    CoT

    The LLM community’s journey to inference scaling begins long, long ago with the arrival of CoT. In their 2022 paper, the team over at Google Brain discovered that just asking models to “think step-by-step” before answering a question boosted reasoning ability—simply by structuring the model’s response as a sequence of logical steps.

    Orca

    Researchers soon realized that prompting alone had its limits. While CoT-style reasoning helped models on some reasoning benchmarks, it wasn’t an inherent skill the models possessed—it was just an artifact of the inference time prompt. Microsoft’s Orca paper tackled this problem head on by distilling intermediate reasoning steps from GPT-4 into smaller models. Instead of just training a model to generate correct final answers, Orca explicitly supervised the reasoning process by encouraging models to imitate the structured, step-by-step reasoning of stronger models. 

    This created a fundamental shift: Instead of just CoT-style prompting, we were now doing CoT-style training, making reasoning an explicit learned objective. We also realized that smaller models could punch above their weight, so to speak, by mimicking the reasoning structures of more powerful models.

    Tree/Forest of Thought and Beam Search

    While CoT/Orca focused on single threads of reasoning, there was still a gap. Human problem solving is not linear! We explore multiple different solutions before deciding on just one! This insight leads us to Tree of Thought, where models generate multiple reasoning paths and explore different possible steps before committing to a final answer. Instead of solving problems in a single pass, Tree of Thought branches into multiple possibilities, evaluates and prunes bad reasoning paths, and in general allows for self-correction and more robust decision making. Forest of Thought extended this idea by allowing parallel trees to explore multiple reasoning paradigms at once.

    Process supervision

    At this stage, a crucial problem emerges. Even if a model generates a correct answer, its reasoning process may be unreliable or produce shortcut solutions. We need a way to validate the process of thinking itself. Indeed, this is what we learned in school—show your work! 

    This leads us to process supervision, where models are rewarded not just for getting the right answer, but also for following a reasoning path that aligns with human expectations. Unlike general outcome-based reward models, process supervision evaluates the logical correctness of each intermediate step. If a model follows a flawed but correct-looking reasoning chain, it is penalized even if the final answer is correct.

    This is where we introduce process reward models (PRMs), which learn to predict the likelihood of each reasoning step being correct. By using PRMs, we get more granular control over the logical consistency of a model’s reasoning steps. They offer a pathway toward allowing models to self-verify and guide their own reasoning trajectories.
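    To make this concrete, here is a minimal sketch of step-level scoring with a PRM. The `prm(question, steps)` helper is a hypothetical stand-in for an off-the-shelf process reward model that returns one correctness score per reasoning step; splitting on blank lines matches the step delimiter we use later.

```python
# Minimal sketch of step-level scoring with a process reward model (PRM).
# `prm(question, steps)` is a hypothetical helper standing in for an off-the-shelf
# PRM that returns one probability-of-correctness per reasoning step.
def score_reasoning(question, answer, prm):
    # Reasoning steps are delimited by blank lines ("\n\n").
    steps = [s for s in answer.split("\n\n") if s.strip()]
    step_scores = prm(question, steps)
    # One simple aggregate: the chain is only as strong as its weakest step.
    return min(step_scores), step_scores
```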

    Inference scaling

    We have arrived at the inference-time scaling stop on our journey through time. Inference scaling refers to methods that improve a model’s reasoning ability and performance at runtime, without requiring any additional training or fine-tuning.

    As shown by the Large Language Monkeys paper that came out of Stanford, many small, open source language models will eventually produce a correct answer to even challenging problems when prompted enough times. This suggests that smaller language models have far more capability locked up within them—we just have to coax it out of them!

    The question then becomes: How can we intelligently navigate the space of the LM’s possible answers to help it achieve its full potential?

    This is where our method comes in.

    Our inference scaling method (particle filtering)

    So what’s wrong with current inference scaling methods?

    Well, many current inference-time scaling methods “guide” their search process with Process Reward Models—off-the-shelf models that take a problem and a (partial or complete) answer and return a reward score. These methods (beam search, DVTS, etc.) take the “top-N” options at every step and explore those further.

    The problem, however, is that PRMs, as in the case of almost all Reward Models, are imperfect. They are often inadequate approximations of the ground truth, and following them leads to Reward Hacking, where the final output is optimized to score well according to the reward model but fails to be useful and/or correct.

    We ask the following: Can we frame inference-time scaling as a probabilistic inference task?

    What do we do differently? 

    Instead of allowing the PRM to completely determine which answers we unroll further, we do the following (a minimal code sketch follows the list):

    1. Initialize a set of “particles.” 
    2. Generate a “step” for each particle. (How an answer is broken into “steps” is determined by automatic delimiters—in this case, we use \n\n.)
    3. Use the PRM to give a score to each particle given the question and answer so far. 
    4. Convert this raw PRM score into a “weight” using softmax. 
    5. Resample the particles according to these weights—each particle in the next stage is drawn from the particles in the previous stage with probability equal to its softmax weight!
    6. Continue steps 2-5 until all the particles have generated complete answers!
    7. Pick your “final answer” by choosing whichever particle has the highest RM score. 
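    Here is a minimal sketch of that loop in Python. It assumes two hypothetical helpers: `llm_generate_step(question, partial)` extends a partial answer by one \n\n-delimited step, and `prm_score(question, partial)` returns a scalar reward for a partial or complete answer. It illustrates the procedure above rather than reproducing our exact implementation.

```python
import numpy as np

def particle_filter(question, llm_generate_step, prm_score, n_particles=8, max_steps=32):
    """PRM-guided particle filtering for inference-time scaling (illustrative sketch)."""
    particles = ["" for _ in range(n_particles)]   # 1. initialize a set of particles
    finished = [False] * n_particles

    for _ in range(max_steps):
        if all(finished):
            break
        # 2. extend each unfinished particle by one reasoning step
        for i in range(n_particles):
            if not finished[i]:
                step, done = llm_generate_step(question, particles[i])
                particles[i] += step + "\n\n"
                finished[i] = done
        # 3. score each partial answer with the PRM
        scores = np.array([prm_score(question, p) for p in particles])
        # 4. convert raw PRM scores into weights with a softmax
        weights = np.exp(scores - scores.max())
        weights /= weights.sum()
        # 5. resample particles (with replacement) according to those weights
        idx = np.random.choice(n_particles, size=n_particles, p=weights)
        particles = [particles[i] for i in idx]
        finished = [finished[i] for i in idx]

    # 7. pick the final answer: the particle with the highest reward-model score
    return max(particles, key=lambda p: prm_score(question, p))
```

    Note that resampling with replacement lets strong partial answers be duplicated while weak ones die off, which is what distinguishes this from simply keeping the deterministic top-N at every step.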

    The results

    So what are our results? We get some very cool numbers, as shown in Figures 1-3. 

    Our method scales 4-16x better than the deterministic inference scaling methods out there. On the MATH dataset, our method:

    • Can scale Qwen2.5 Math 1.5B Instruct to GPT-4o accuracy with only 4 rollouts.
    • Can scale Qwen2.5 Math 7B Instruct to o1-preview-level accuracy with only 32 rollouts.
    • Can scale the Llama 1B model to almost reach Llama 70B, and the Llama 8B model to reach GPT-4o.
    Table 1: Results of various LLMs on MATH500 and AIME 2024 where bold indicates the best in each category and italic indicates the overall best.
    | Model | Method | MATH500 | AIME 2024 |
    | --- | --- | --- | --- |
    | Closed source LLMs | | | |
    | GPT-4o | - | 76.2 | 13.3 |
    | o1-preview | - | 87.0 | 40.0 |
    | Claude3.5-Sonnet | - | 78.3 | 16.0 |
    | Open source LLMs | | | |
    | Llama-3.1-70B-Instruct | - | 65.7 | 16.6 |
    | Qwen2.5-Math-72B-Instruct | - | 82.0 | 30.0 |
    | Open source SLMs | | | |
    | Llama-3.2-1B-Instruct | Pass@1 | 26.8 | 0.0 |
    | | BoN | 46.6 | 3.3 |
    | | WBoN | 47.8 | 3.3 |
    | | DVTS | 52.8 | 6.6 |
    | | Ours - PF | 59.6 | 10.0 |
    | Llama-3.1-8B-Instruct | Pass@1 | 49.9 | 6.6 |
    | | BoN | 58.6 | 10.0 |
    | | WBoN | 59.0 | 10.0 |
    | | DVTS | 65.7 | 13.3 |
    | | Ours - PF | 74.4 | 16.6 |
    | Open-source Math SLMs | | | |
    | Qwen2.5-Math-1.5B-Instruct | Pass@1 | 70.0 | 10.0 |
    | | BoN | 82.6 | 13.3 |
    | | WBoN | 82.8 | 13.3 |
    | | DVTS | 83.4 | 16.6 |
    | | Ours - PF | 85.4 | 23.3 |
    | Qwen2.5-Math-7B-Instruct | Pass@1 | 79.6 | 16.6 |
    | | BoN | 83.0 | 20.0 |
    | | WBoN | 84.6 | 20.0 |
    | | DVTS | 85.4 | 20.0 |
    | | Ours - PF | 87.0 | 23.3 |
    Figure 1: Performance of different methods on Llama-3.2-1B-Instruct as budget increases (accuracy vs. budget; Ours - Particle Filtering performs best).
    Figure 2: Performance of different methods on Llama-3.1-8B-Instruct as budget increases (accuracy vs. budget; Ours - Particle Filtering performs best).
    Figure 3: Performance of different methods on Qwen2.5-Math-7B-Instruct as budget increases (accuracy vs. budget; Ours - Particle Filtering performs best).

    Why is this so cool?

    We do all of this—scaling small models to such fantastic numbers—without training anything at all! Our method is able to efficiently guide a small, open source off-the-shelf model to “discover its potential” and make truly staggering improvements, just by intelligently navigating the search space.

    Just to provide some comparison points, this recent work trains a base model on trajectories generated from Qwen-2.5-Math-7B-Instruct and achieves an accuracy of 85.6% on MATH500, while our method is able to get to an accuracy of 87% just by milking the Qwen 2.5 Math 7B Instruct model itself for all it’s worth. Our results underline the novelty and elegance of this inference scaling method and again point to the core idea: How can we guide a language model through a search space to help it reach its full potential?

    How we connect inference-time scaling to training

    Inference-time scaling is pretty wild—it lets us take a (relatively speaking) small model and boost its performance to rival some of the biggest names in mathematical reasoning benchmarks (OpenAI’s o1, for example). In simple terms, this means a tiny model (or for the RL folks, the policy) actually can reason mathematically, given the right conditions. But that raises the obvious question: How do we train a model to learn to inference-scale, instead of orchestrating it at runtime with a PRM?

    Let’s break it down. As we saw earlier, inference-time scaling needs a method—like best-of-N (BoN) or particle filtering—and a reward model. This setup can be super expensive to run. So, how do we make it cheaper? One way is to train the model to imitate the inference-time scaling method (learn to generate trajectories similar to particle filtering or beam search) and self-evaluate (to replace PRM/ORM) its own reasoning before locking in an answer. Basically, instead of paying the full cost every time we run inference, we get the model to internalize the process through training.

    Training models to learn inference-time scaling

    So how do we actually do this? Well, we already have most of the pieces in place: a policy, a reward model, and an exploration method. If we throw in a reinforcement learning approach—like PPO or GRPO—we can close the loop and teach the model to “think” or “reason” on its own.

    But that’s not the only approach. Another way is to use a small or a larger teacher model to generate trajectory data using inference time-scaling first. We can then fine-tune a smaller model with Supervised Fine-Tuning (SFT), and maybe follow it up with GRPO to reinforce the best reasoning strategies.

    Learning from R1: A model that knows when to backtrack

    Lately, I’ve been fascinated by R1’s approach. What stands out is how it generates multiple answers, evaluates them at intermediate steps or at the end, and then decides whether to backtrack, restart, or finalize an answer if it thinks it’s correct. This feels eerily similar to how a model under particle filtering would behave—exploring different paths before settling on the best one.

    Of course, the “oh wait” moments (aka reflections) aren’t as organic here because a reward model is guiding the process. But it’s still super impressive that DeepSeek managed to set up conditions where the model learned inference-time scaling just by using RL. The downside? Running that process requires a ton of compute, way more than we’d like to spend.

    So instead, we’re taking what we’ve learned—that “it’s likely” that training a model to learn inference-time scaling leads to reasoning-like abilities—and using it to make small models smarter.

    A live experiment blog

    The next section is where things get real. We’re going to document our experiments in real time—sharing our code, models, ideas, and results (including what didn’t work). The goal? Finding a more efficient way to make small models reason like o1 and R1—without the ridiculous compute costs. Let’s figure this out together. 

    Our recipe (a work in progress)

    Building on our findings about inference-time scaling methods, we came up with a recipe to bake R1-like reasoning ability into small LLMs efficiently.

    First of all, how is DeepSeek-R1 trained? DeepSeek-R1 is trained in a few phases:

    1. R1-Zero / reasoning data: DeepSeek built R1-Zero by taking its existing MoE model, DeepSeek-V3-base, and applying a fresh, large-scale reinforcement learning training approach. R1-Zero is then used to generate high-quality synthetic reasoning data, together with some other synthetic data approaches like few-shot CoT and reflection-verification prompting.
    2. Warm start: The synthetic data from the previous step is used to fine-tune DeepSeek-V3-base with standard instruction tuning, warming the model up so it is ready for reasoning RL (although it has a limited style of reasoning at this point). If you want to learn how to do full fine-tuning on LLMs, here is an ICLR paper from our group that’s practically a practitioner’s guide to SFT: https://arxiv.org/abs/2412.13337
    3. RL for reasoning: After that, the model goes through another round of RL-based training (just like in step 1) to move past those limitations and unlock new reasoning capabilities, boosting its reasoning skills even further.
    4. Rejection sampling + SFT: Up to this point, the model has mainly been focused on reasoning. Now, it’s trained on a broader set of skills, kind of like what we do in InstructLab.
    5. Preference tuning: Finally, the model goes through a final round of RL-based tuning to make sure it’s safe and aligned with human preferences.

    If you’re curious about steps 4 and 5, here’s a paper from our team explaining how to do this without human annotations.

    So now we are looking for a way to obtain R1-like reasoning in small LLMs. We came up with a recipe that does not use DeepSeek-R1 or its derivatives, and that is what we are working on at the moment. Here is a side-by-side comparison between DeepSeek’s approach and our recipe:

    | Phase | DeepSeek | Our recipe |
    | --- | --- | --- |
    | R1-Zero / Reasoning data | R1-Zero (???B), few-shot CoT, reflection-verification prompting | Phi-4 (14B) w/ inference-time scaling (BoN, Gibbs, PF) |
    | Warm start | DeepSeek-V3-base (???B) | Granite-3.1-Lab (8B), Llama-3.1-Instruct (8B), Phi-4 (14B) |
    | RL for reasoning | GRPO | GRPO |
    | General capability | Rejection sampling + SFT -> preference tuning | DPO w/ Dr. SoW |

    Here’s a breakdown:

    1. R1-Zero / reasoning data: Instead of training R1-Zero to obtain high-quality reasoning data, we instead use inference-time scaling methods to generate reasoning trajectories that can be used to synthesize high-quality reasoning data (detailed below).
    2. Warm start: With the high-quality reasoning data generated, we then do a similar SFT on the models we are interested in training.
    3. RL for reasoning: After warming up, we perform the same RL training with GRPO.
    4. General capability: Our approach to obtaining general capability is based on DPO with preference data annotated by a reward model. This reward model is based on recent work from our team, and it’s a human annotation-free method called Dr. SoW. It matches (or even outperforms) the state of the art. Check it out here: https://arxiv.org/abs/2411.02481.

    Results: What worked, what didn’t, and what’s next

    Disclaimer: These results are a messy, chaotic, absolutely massive work in progress, and we’ll update code and results as we continue to get them. We have not gotten very much sleep, and we sincerely apologize for any gaps in the results table. Please pray to the LLM gods with us to help populate it! Thank you for your cooperation. <3

    Bespoke dataset: A good start, but not enough

    Like pretty much everyone else, we kicked things off by working with the reasoning dataset created by Bespoke. From what we know, it’s distilled from R1, making it a solid starting point. But for our specific experiments, it didn’t quite hit the mark. Still, we wanted to see how far small models could go by just fine-tuning (SFT) on this dataset—and more importantly, whether training on a math-heavy reasoning dataset could help generalize the model’s reasoning ability to non-math tasks.

    Turns out, it kind of worked. Training on this dataset did help smaller Granite and Llama models mimic R1-like reasoning patterns. However, there was a catch: while they looked like they were reasoning, their actual benchmark performance didn’t improve after SFT.

    So, we took the best SFT checkpoint and ran GRPO on it, hoping that the reasoning skills bootstrapped during fine-tuning would become more refined. Once again… no major improvement in benchmarks for Llama or Granite.

    BoN with Phi-4: Prompting it to think

    After some quick brainstorming and research, we decided to give Phi-4 a shot. This model is seriously underrated—it performs ridiculously well on nearly every benchmark and even shows promising reasoning skills, at least in math. The best part? It’s only 14B parameters and fully open source. Huge shoutout to Microsoft Research for that one!

    Given our past work with InstructLab, we like to think we know a thing or two about synthetic data generation (SDG). So, it didn’t take long to figure out how to prompt Phi-4 into generating reasoning trajectories.

    Here’s what we did (a code sketch of this pipeline follows the list):

    1. For each question, we generated 64 samples.
    2. We used a verifier (from our inference-scaling paper) to pick the best trajectory—i.e., the one with the correct answer.
    3. We reformatted the data to match R1’s style, wrapping it in <thought> blocks.
    4. We used this dataset (D-verified) to fine-tune our small models, followed by another round of GRPO.
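    The sketch below illustrates that pipeline. The `generate(question, n)` and `verify(question, answer)` helpers are hypothetical stand-ins for sampling completions from Phi-4 and for the verifier from our inference-scaling paper; the field names are illustrative, not our exact data format.

```python
def build_verified_dataset(questions, generate, verify, n_samples=64):
    """Build D-verified: best-of-N samples from the teacher, kept only if verified."""
    dataset = []
    for q in questions:
        candidates = generate(q, n_samples)                # 1. sample 64 trajectories per question
        correct = [c for c in candidates if verify(q, c)]  # 2. keep the verified (correct) ones
        if not correct:
            continue
        reasoning, _, answer = correct[0].rpartition("\n\n")
        # 3. reformat to R1 style by wrapping the reasoning in <thought> blocks
        dataset.append({
            "question": q,
            "response": f"<thought>\n{reasoning}\n</thought>\n{answer}",
        })
    return dataset                                         # 4. used for SFT, then GRPO
```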

    This time, we saw some really interesting results:

    • Math scores improved for Llama.
    • AIME scores improved.
    • It even solved a problem posed by our chief scientist!

    What we learned: Two big gaps

    After looking at our results (and a bit of soul-searching), we realized two major issues in how our models reasoned compared to R1:

    • Short responses: Our model’s answers were too concise—R1-style reasoning tends to be longer.
    • No backtracking: The model wasn’t self-reflecting or revising its own answers. We had hoped GRPO would make this emerge naturally, but it didn’t.

    What’s next: Iterating on Phi-4 as teacher and student

    Back to the whiteboard we went! One key insight: Maybe it’s just too hard for an 8B model to develop reflection without being explicitly trained on reflection examples. R1 does this, so maybe we need to as well.

    Since R1’s own paper suggests that the model being trained with RL needs to be reasonably strong, we decided to keep using Phi-4—not just as the teacher, but also as the student in our next experiments. Stay tuned. 

    Teaching models to revise their own thinking

    Naïve stitch: Making mistakes on purpose (and fixing them)

    Our first shot at making the model reason more like R1 was pretty simple: force it to revise its own mistakes. We took our BoN-generated data and created what we’re calling D-stitch.

    Here’s how it works (a minimal sketch follows the list):

    1. For each question, we start with the thought process from a wrong sample.

    2. Then we add a transition phrase, like this: Hold on! I made a mistake. Let me try again.

    3. Finally, we append the thought process and solution from a verified correct sample.

    4. Bonus: We can add more than one wrong thought before the correct one!
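    Here is a minimal sketch of how a D-stitch example could be assembled from BoN samples; the transition phrase is the one above, and the field names are illustrative.

```python
TRANSITION = "Hold on! I made a mistake. Let me try again."

def make_stitch_example(question, wrong_thoughts, correct_thought, correct_solution):
    """Stitch one or more wrong thought processes before a verified correct one."""
    parts = []
    for wrong in wrong_thoughts:          # bonus: more than one wrong thought is allowed
        parts.append(wrong)
        parts.append(TRANSITION)          # the forced self-correction
    parts.append(correct_thought)         # the verified correct thought process
    response = "<thought>\n" + "\n\n".join(parts) + "\n</thought>\n" + correct_solution
    return {"question": question, "response": response}
```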

    The results? A slight improvement, but the real win was that the model actually started revising its full reasoning process. That’s a good sign! Encouraged by this, we decided to push further and generate even more R1-like training data.

    PF backtrack: Getting the model to doubt itself

    While revising an entire answer is nice, it’s still not quite the reasoning we’re after. What we really want is partial backtracking—where the model recognizes errors midway, doubts itself, and changes course like R1.

    This reminded us of something: particle filtering (or any tree search method). Algorithmically, this kind of reasoning looks a lot like pruning bad search paths in a tree. So, we decided to generate backtrack data using particle filtering.

    Here’s how we did it:

    • We ran our particle filtering method, recording all the “dead” particles at each step (basically, failed reasoning paths).

    • This gave us a search tree where we could verify the final solutions from the correct paths.

    • We then synthesized new reasoning data by intentionally stitching in incorrect branches before returning to a correct path.

    • Whenever the model backtracked, we added a phrase like Hold on! I made a mistake. Let me try again.

    We’re calling this dataset D-backtrack, and it’s designed to train models to doubt and backtrack at intermediate steps, not just at the end.
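    A minimal sketch of that synthesis, assuming we kept the particle filtering search tree around: `dead_branches` are pruned partial reasoning paths (lists of steps) and `correct_path` is a verified path to the final solution for the same question. The helper and field names are illustrative.

```python
TRANSITION = "Hold on! I made a mistake. Let me try again."

def make_backtrack_example(question, dead_branches, correct_path, k_wrong=1):
    """Stitch a few steps of failed branches before rejoining the verified path."""
    parts = []
    for branch in dead_branches[:k_wrong]:
        parts.extend(branch)              # wander partway down a failed branch...
        parts.append(TRANSITION)          # ...then backtrack at an intermediate step
    parts.extend(correct_path)            # continue along the verified path to the answer
    response = "<thought>\n" + "\n\n".join(parts) + "\n</thought>"
    return {"question": question, "response": response}
```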

    Gibbs with “but wait”: Inspired by S1

    While we were working on this, the S1 paper dropped, giving us even more ideas. Inspired by their approach, we created D-but-wait, a dataset designed to push the model toward deeper reasoning.

    Here’s how we built it:

    • When generating reasoning steps, we force the model to pause after completing the first thought.
    • Then, we append a phrase like But wait and force it to continue reasoning further before finalizing the solution.

    This setup encourages the model to naturally question its first thought, a bit like Gibbs sampling where you iterate until the solution stabilizes.
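    A minimal sketch of that forcing trick, assuming a hypothetical `generate(prompt, stop=None)` helper that samples from the model until an optional stop string:

```python
def but_wait_trajectory(question, generate, n_rounds=1):
    """Pause after the first thought, append a 'But wait' phrase, and keep reasoning."""
    prompt = question + "\n<thought>\n"
    thought = generate(prompt, stop="\n\n")                   # the first complete thought
    for _ in range(n_rounds):
        thought += "\n\nBut wait, "                           # force the model to question itself
        thought += generate(prompt + thought, stop="\n\n")    # continue reasoning further
    solution = generate(prompt + thought + "\n</thought>\n")  # finalize the solution
    return {"question": question, "response": "<thought>\n" + thought + "\n</thought>\n" + solution}
```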

    Next steps: Refining backtracking and doubt mechanisms

    With these different approaches—D-stitch, D-backtrack, and D-but-wait—we’re getting models that at least attempt to revise themselves. But there’s still more to do! We’re now exploring how to make backtracking even more structured and whether we need larger models to fully develop this behavior.

    Let’s see where this takes us.

    | Model | Dataset | Method | AIME 2024 (Pass@8) | MATH500 (Pass@8) |
    | --- | --- | --- | --- | --- |
    | Llama 3.1 8B Instruct | - | - | 6/30 | 73.6 |
    | | bespokelabs/Bespoke-Stratos-35k | SFT | ⏳ | ⏳ |
    | | Think-v1-13k (ours) | SFT + GRPO | 7/30 | 80.2 |
    | Phi-4 | - | - | 12/30 | 88.2 |
    | | Think-v1-13k (ours) | SFT + Dr. SoW | 10/30 | 90.8 |
    | | But-wait-10k (ours) | SFT + GRPO | 10/30 | 87.8 |
    | | Backtrack-22k (ours) | SFT + GRPO | 10/30 | ⏳ |

    Read the next part in this series here: How particle filtering makes small LLMs think big

    Last updated: May 15, 2025
