Agent Skills: Explore security threats and controls
Learn how to manage the security threats and access controls associated with adopting the new Agent Skills functionality.
Learn how to run high-performance computing workloads managed by Slurm within a containerized OpenShift environment using the Slinky operator.
Learn how to improve the performance of your vLLM deployments with a diagnostic workflow that isolates latency issues from server saturation. Discover the key metrics to monitor and techniques to alleviate memory pressure.
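For a first pass, it can help to poll the server's Prometheus metrics endpoint and watch queue depth alongside KV cache usage. A minimal polling sketch, assuming a local vLLM server on port 8000; the metric names follow vLLM's conventions but can vary across versions:

```python
import time
import requests

METRICS_URL = "http://localhost:8000/metrics"  # assumed local vLLM server

def sample(names):
    """Return the latest value reported for each requested metric."""
    text = requests.get(METRICS_URL, timeout=5).text
    values = {}
    for line in text.splitlines():
        if line.startswith("#"):
            continue  # skip HELP/TYPE comment lines
        for name in names:
            if line.startswith(name):
                values[name] = float(line.rsplit(" ", 1)[-1])
    return values

for _ in range(6):
    # A growing wait queue plus a near-full KV cache points to saturation
    # rather than a model-side latency problem.
    print(sample(["vllm:num_requests_waiting", "vllm:gpu_cache_usage_perc"]))
    time.sleep(10)
```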
Learn how the Responses API in Llama Stack automates complex tool calling while maintaining granular control over conversation flow for AI agents. Discover the benefits and implementation details.
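To make the pattern concrete, here is a sketch using the OpenAI Python client against a Llama Stack deployment; the base URL, model ID, and built-in `web_search` tool are assumptions for illustration:

```python
from openai import OpenAI

# Assumed local Llama Stack distribution exposing its OpenAI-compatible surface.
client = OpenAI(base_url="http://localhost:8321/v1/openai/v1", api_key="none")

# The Responses API carries tool definitions with the request and lets the
# server drive the tool-calling loop on the agent's behalf.
response = client.responses.create(
    model="llama3.2:3b",              # placeholder model ID
    input="What is the weather in Boston right now?",
    tools=[{"type": "web_search"}],   # assumed built-in tool
)
print(response.output_text)

# previous_response_id chains turns without resending the whole history,
# which is where the granular control over conversation flow comes from.
follow_up = client.responses.create(
    model="llama3.2:3b",
    previous_response_id=response.id,
    input="And tomorrow?",
)
print(follow_up.output_text)
```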
Learn the many ways you can interact with GPU-hosted large language models (LLMs).
Learn how to run OpenAI's Whisper model through vLLM on Apple Silicon, giving you an OpenAI-compatible endpoint on localhost. Then, discover how to take this architecture into production using Red Hat AI Inference Server.
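A minimal sketch of the localhost side, assuming the server was started with something like `vllm serve openai/whisper-large-v3` on port 8000; the audio file name is a placeholder:

```python
from openai import OpenAI

# vLLM speaks the OpenAI API, so the stock client works against localhost.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="empty")

with open("meeting.wav", "rb") as audio:  # placeholder recording
    transcript = client.audio.transcriptions.create(
        model="openai/whisper-large-v3",
        file=audio,
    )
print(transcript.text)
```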
Learn how to create a baseline RAG system using Apache Camel, PostgreSQL, and pgvector. This implementation demonstrates a 'boring' approach to RAG, making it easy to understand and debug. Discover the anatomy of a RAG pipeline, including indexing, retrieval, and answering.
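The article builds the pipeline as Camel routes; purely to illustrate the retrieval step, here is a Python sketch against pgvector, with placeholder connection details, table, and column names:

```python
import psycopg

def retrieve(query_embedding: list[float], k: int = 5) -> list[str]:
    """Fetch the k stored chunks nearest to the query embedding."""
    with psycopg.connect("dbname=rag user=rag") as conn:  # placeholder DSN
        rows = conn.execute(
            # <=> is pgvector's cosine-distance operator; ordering by it
            # returns the closest chunks first.
            "SELECT content FROM chunks ORDER BY embedding <=> %s::vector LIMIT %s",
            (str(query_embedding), k),
        ).fetchall()
    return [row[0] for row in rows]
```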
Discover how I used an AI assistant to develop a production-grade Ansible playbook that audits RHEL versions across a fleet of servers and generates a clean report.
Learn how to estimate memory requirements for your LLM fine-tuning experiments using Red Hat Training Hub's memory_estimator.py API. This guide covers the components of training memory, how to adjust training setups for specific GPU specifications, and how to use the memory estimator in your code. Streamline your model fine-tuning process with runtime estimates and automated hyperparameter suggestions.
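Independent of the Training Hub API itself, the back-of-envelope arithmetic behind such estimates is simple; a sketch assuming classic mixed-precision Adam, before counting activations:

```python
# Mixed-precision Adam holds roughly 16 bytes per parameter:
# 2 (bf16 weights) + 2 (grads) + 4 (fp32 master weights) + 4 + 4 (moments).
def training_state_gib(n_params: float, bytes_per_param: int = 16) -> float:
    return n_params * bytes_per_param / 2**30

# A 7B model needs ~104 GiB of weight/optimizer state alone, so full
# fine-tuning must shard across GPUs before activations even enter the picture.
print(f"{training_state_gib(7e9):.0f} GiB")
```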
Learn how to deploy and test an Earth and space model inference service on Red Hat AI Inference Server and Red Hat OpenShift AI. This article includes two self-contained activities: one deploys Prithvi using a traditional Deployment object, and the other serves the model using KServe while observing Knative scaling.
Understand the PyTorch autograd engine internals to debug gradient flows. Learn about computational graphs, saved tensors, and performance optimization techniques.
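A small self-contained example of both ideas, using the documented `_saved_*` introspection attributes on `grad_fn` nodes:

```python
import torch

x = torch.randn(3, requires_grad=True)
w = torch.randn(3, requires_grad=True)
y = (x * w).sum()

# grad_fn is the recorded graph node; next_functions walks the graph backward.
print(y.grad_fn)                  # SumBackward0
print(y.grad_fn.next_functions)   # edge to MulBackward0

# MulBackward0 saves both inputs, because each is needed to compute the
# other's gradient; that is what keeps these tensors alive until backward().
mul_node = y.grad_fn.next_functions[0][0]
print(mul_node._saved_self is x, mul_node._saved_other is w)  # True True

y.backward()
print(torch.allclose(x.grad, w))  # d(sum(x*w))/dx == w
```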
Optimize vLLM performance with practical tuning tips. Learn how to use GuideLLM for benchmarking, adjust GPU ratios, and maximize KV cache to improve throughput.
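For orientation, the main knobs look like this through vLLM's offline `LLM` API (the same options exist as `vllm serve` flags); the model and values are placeholders, not recommendations:

```python
from vllm import LLM, SamplingParams

llm = LLM(
    model="meta-llama/Llama-3.1-8B-Instruct",  # placeholder model
    gpu_memory_utilization=0.95,  # raise the fraction of VRAM vLLM may claim
    max_model_len=8192,           # cap context so the KV cache fits more sequences
    max_num_seqs=256,             # ceiling on concurrently scheduled requests
)
out = llm.generate(["Hello"], SamplingParams(max_tokens=32))
print(out[0].outputs[0].text)
```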
Learn how to fine-tune AI pipelines in Red Hat OpenShift AI 3.3. Use Kubeflow Trainer and modular components for reproducible, production-grade model tuning.
Build better RAG systems with SDG Hub. Generate high-quality question-answer-context triplets to benchmark retrievers and track LLM performance over time.
Explore big versus small prompting in AI agents. Learn how Red Hat's AI quickstart balances model capability, token costs, and task focus using LangGraph.
Learn how ATen serves as PyTorch's C++ engine, handling tensor operations across CPU, GPU, and accelerators via a high-performance dispatch system and kernels.
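You can see this boundary from Python, since every `torch` call lowers to an ATen operator reachable under `torch.ops.aten`:

```python
import torch

a, b = torch.randn(2), torch.randn(2)

# torch.mul and the raw ATen op route to the same kernel via the dispatcher.
print(torch.mul(a, b))
print(torch.ops.aten.mul.Tensor(a, b))

# The dispatcher selects the kernel by backend: moving the tensors re-routes
# the identical call to a CUDA implementation.
if torch.cuda.is_available():
    print(torch.ops.aten.mul.Tensor(a.cuda(), b.cuda()))
```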
Learn how integrating the Red Hat Lightspeed Model Context Protocol (MCP) server with the Red Hat Lightspeed advisor optimizes infrastructure health management.
Learn how vibe coding and spec-driven development are shaping the future of software development. Discover the benefits and challenges of each approach, and how to combine them sustainably.
Learn how to design agentic workflows, and how the Red Hat AI portfolio supports production-ready agentic systems across the hybrid cloud.
Automate Ansible error resolution with AI. Learn how to ingest logs, group templates, and generate step-by-step solutions using RAG and agentic workflows.
Learn how to integrate Model Context Protocol (MCP) servers for Red Hat Enterprise Linux and Red Hat Lightspeed into your IDE for data-driven troubleshooting and proactive analytics. Improve your development workflow with actionable intelligence from natural language queries.
One conversation across Slack and email, real tickets in ServiceNow. Learn how the multichannel IT self-service agent ties them together with CloudEvents + Knative.
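A sketch of the glue, assuming the Python CloudEvents SDK and the default in-cluster Knative broker URL; the event type, source, and payload are invented for illustration:

```python
import requests
from cloudevents.http import CloudEvent
from cloudevents.conversion import to_binary

attributes = {
    "type": "com.example.selfservice.request",  # placeholder event type
    "source": "slack-channel-adapter",          # placeholder source
}
event = CloudEvent(attributes, {"user": "alice", "text": "reset my VPN token"})

# Knative's broker routes on CloudEvent attributes, so the Slack and email
# adapters can emit the same event type toward the ServiceNow consumer.
headers, body = to_binary(event)
requests.post(
    "http://broker-ingress.knative-eventing.svc.cluster.local/default/default",
    data=body, headers=headers,
)
```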
Learn how to deploy Voxtral Mini 4B Realtime, a streaming automatic speech recognition model for low-latency voice workloads, using Red Hat AI Inference Server.
Headed to DevNexus? Visit the Red Hat Developer booth on-site to speak to our expert technologists.
Learn how to integrate OpenShift Lightspeed into an IDE using the MCP server to generate configurations and query cluster resources without leaving your IDE.