Mustafa Eyceoz's contributions
Article
Get started with language model post-training using Training Hub
Mustafa Eyceoz
Simplify LLM post-training with the Training Hub library, which provides a common, Pythonic interface for running language model post-training algorithms.
Article
Post-training methods for language models
Mustafa Eyceoz and 1 other
Dive into LLM post-training methods, from supervised fine-tuning and continual learning to parameter-efficient and reinforcement learning approaches.
Article
Async-GRPO: Open, fast, and performant
Aldo Pareja and 1 other
Discover Async-GRPO, a new library for reinforcement learning tasks that efficiently handles large models, eliminates bottlenecks, and accelerates experiments.
Article
Sculpting subspaces: How we solved continual learning in LLMs
Nikhil Shivakumar Nayak and 10 others
Discover how the adaptive SVD approach enables LLMs to continually learn and adapt without forgetting previously acquired knowledge.
Article
Lessons on reproducing R1-like reasoning in small LLMs
Akash Srivastava and 8 others
Learn about an efficient inference scaling method that can improve your model's reasoning ability and performance at runtime while saving on compute costs.
Article
On reasoning versus inference-time scaling
Akash Srivastava and 8 others
Progress in small LLM reasoning: Our Qwen-32B model, using particle filtering, now surpasses o1-preview on MATH500.
Article
Granite, LIMO, and small LLM reasoning
Akash Srivastava and 8 others
On reproducing R1-like reasoning in small LLMs: the LIMO dataset proves ineffective for Llama and Granite models; synthetic data generation shows promise, but fine-tuning remains tricky.
Article
How particle filtering makes small LLMs think big
Akash Srivastava and 8 others
An update on reproducing R1-like reasoning in small LLMs: Granite models show big gains with particle filtering, outperforming GPT-4o on benchmarks.