
The Llama 4 herd is here with Day 0 inference support in vLLM
Discover the new Llama 4 Scout and Llama 4 Maverick models from Meta, featuring a mixture-of-experts architecture, early-fusion multimodality, and Day 0 model support.
Explore how RamaLama makes it easier to share data with AI models using retrieval-augmented generation (RAG), a technique for enhancing large language models.
Learn how quantized vision-language models enable faster inference, lower costs, and scalable AI deployment without compromising capability.
This article provides automation strategies to help you scale smarter for better infrastructure with Ansible Automation Platform and AWS (part 1 of 3).
Explore inference performance improvements that help vLLM serve DeepSeek AI models more efficiently in this technical deep dive.
With the growth in the use of containers, the need to bundle your application
Explore new open source quantized reasoning models based on the DeepSeek-R1-Distill suite that deliver near-perfect accuracy and inference speed improvements.
Red Hat® Summit is where ideas and innovation come together to shape the future of enterprise IT. With a variety of offerings for this year’s event, you have the opportunity to shape conversations around open cloud technology, digital transformation, and much more.
Explore how vLLM's new multimodal AI inference capabilities enhance performance, scalability, and flexibility across diverse hardware platforms.
Explore multimodal model quantization in LLM Compressor, a unified library for optimizing models for deployment with vLLM.
This article explores how developing technical expertise can improve collaboration and enhance UX design outcomes.
This year's top articles on AI include an introduction to GPU programming, a guide to integrating AI code assistants, and the KServe open source project.
Explore the evolution and future of Quarkus, Red Hat’s next-generation Java framework designed to optimize applications for cloud-native environments.
This year's RustConf in Montreal hosted many good talks from the Rust Project.
The RamaLama project simplifies AI model management for developers by using OCI containers to automatically configure and run AI models.
This guide walks through how to create an effective qna.yaml file and context file for fine-tuning your personalized model with the InstructLab project.
Creating Grafana dashboards from scratch can be tedious. Learn how to create a premade dashboard when using the Performance Co-Pilot plug-in for Grafana.
The .NET 9 release is now available, targeting Red Hat Enterprise Linux (RHEL) 8.10, RHEL 9.5, RHEL 10, and Red Hat OpenShift. Here's a quick overview of what developers need to know about this new major release.
Red Hat was recently at NodeConf EU, held November 4-6, 2024.
Repo, Red Hat Developer's new mascot, is curious, helpful, and eager to guide you.
Consistently code, build, and monitor for a trusted software supply chain.
This article is mostly a story about my attempt to attract new contributors.
The brand-new Traces UI plug-in has been released as a technology preview in Cluster Observability Operator 0.4.0. Read more about the latest enhancements, charts, and improved user navigation.
A detailed look at the new functionalities of Observability signal correlation for Red Hat OpenShift in technology preview.