
Expand Model-as-a-Service for secure enterprise AI
Discover the comprehensive security and scalability measures for a Models-as-a-Service (MaaS) platform in an enterprise environment.
Learn how to overcome compatibility challenges when deploying OpenShift AI and OpenShift Service Mesh 3 on one cluster.
Harness Llama Stack with Python for LLM development. Explore tool calling, agents, and Model Context Protocol (MCP) for versatile integrations.
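To give a flavor of the tool-calling pattern the article covers, here is a minimal Python sketch against a Llama Stack server's OpenAI-compatible endpoint. The base URL, port, model identifier, and the get_weather tool are assumptions for illustration, not the article's exact code; adjust them to your deployment.

```python
# Minimal tool-calling sketch via an OpenAI-compatible endpoint.
# The base_url, model name, and tool are illustrative assumptions.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8321/v1/openai/v1",  # assumed local Llama Stack endpoint
    api_key="none",  # local servers typically ignore the key
)

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical tool for illustration
        "description": "Return the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

response = client.chat.completions.create(
    model="llama3.2:3b",  # assumed model identifier
    messages=[{"role": "user", "content": "What's the weather in Boston?"}],
    tools=tools,
)

# If the model decides to invoke the tool, the structured call
# arrives here instead of plain text content.
print(response.choices[0].message.tool_calls)
```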
This beginner's guide to Podman AI Lab walks through setting up Podman Desktop, installing the AI Lab extension, and launching your first RAG chatbot.
Get started with AI in Node.js. This cheat sheet covers selecting models, using servers like Ollama, and client libraries like LangChain.js for AI integration.
Ollama makes it easy for developers to get started with local model experimentation, while vLLM provides a path to reliable, efficient, and scalable deployment.
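As a taste of the local-experimentation side, here is a small Python sketch that queries Ollama's REST API on its default port. The model name is an assumption; pull it first with `ollama pull llama3.2`.

```python
# Query a locally running Ollama server (default port 11434).
import requests

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "llama3.2",  # assumed locally pulled model
        "prompt": "Explain KV caching in one sentence.",
        "stream": False,      # return one JSON object instead of a stream
    },
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["response"])
```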
Learn how to build a Models-as-a-Service (MaaS) platform with this simple demo. (Part 3 of 4)
Catch up on the most popular articles published on Red Hat Developer this year. Get insights on Linux, AI, Argo CD, virtualization, GCC 15, and more.
Learn how RamaLama's integration with libkrun and microVMs enhances isolation, security, and resource efficiency for AI model deployments.
Boost inference performance by up to 2.5X with vLLM's Eagle 3 speculative decoding integration. Discover how in this blog post.
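For context, a rough offline-inference sketch of what enabling EAGLE-style speculative decoding in vLLM can look like. This assumes a recent vLLM release that configures speculation through the speculative_config argument; the exact keys and the draft-model checkpoint vary by release and are assumptions to verify against the vLLM documentation.

```python
# Sketch: EAGLE 3 speculative decoding in vLLM (config keys and the
# draft model below are assumptions; check your vLLM version's docs).
from vllm import LLM, SamplingParams

llm = LLM(
    model="meta-llama/Llama-3.1-8B-Instruct",
    speculative_config={
        "method": "eagle3",
        "model": "yuhuili/EAGLE3-LLaMA3.1-Instruct-8B",  # assumed draft model
        "num_speculative_tokens": 4,  # draft tokens proposed per step
    },
)

outputs = llm.generate(
    ["Summarize speculative decoding in two sentences."],
    SamplingParams(max_tokens=96),
)
print(outputs[0].outputs[0].text)
```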
Explore the architecture of a Models-as-a-Service (MaaS) platform and how enterprises can create a secure and scalable environment for AI models. (Part 2 of 4)
Download the Red Hat Enterprise Linux 10 cheat sheet for a quick reference guide to essential commands, image building, and system management with RHEL.
Discover how to communicate with vLLM using the OpenAI spec as implemented by the SwiftOpenAI and MacPaw/OpenAI open source projects.
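The article's clients are Swift, but the same OpenAI-spec call works from any language. Here is the equivalent idea in Python against a vLLM server started with `vllm serve <model>` (default port 8000); the model name is an assumption and must match whatever the server is hosting.

```python
# Talk to vLLM's OpenAI-compatible server; local servers accept a
# placeholder API key.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

completion = client.chat.completions.create(
    model="Qwen/Qwen2.5-7B-Instruct",  # must match the served model
    messages=[{"role": "user", "content": "Say hello from vLLM."}],
)
print(completion.choices[0].message.content)
```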
Discover how model compression slashes LLM deployment costs for technical practitioners, covering quantization, pruning, distillation, and speculative decoding.
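To show one of those four techniques in miniature, here is a hedged sketch of post-training dynamic quantization using stock PyTorch on a toy model. Production LLM pipelines typically use dedicated tooling such as LLM Compressor instead; this only illustrates the idea of shrinking weights to int8.

```python
# Post-training dynamic quantization on a toy model: Linear weights
# are stored as int8 and dequantized on the fly during inference.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(4096, 4096), nn.ReLU(), nn.Linear(4096, 4096))

quantized = torch.ao.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

# The Linear modules are replaced by their dynamically quantized
# counterparts, cutting weight storage roughly 4x vs fp32.
print(quantized)
```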
This article introduces Models-as-a-Service (MaaS) for enterprises, outlining the challenges, benefits, key technologies, and workflows. (Part 1 of 4)
Learn how to evaluate the performance of your LLM deployments with the open source GuideLLM toolkit to optimize cost, reliability, and user experience.
RamaLama's new multimodal feature integrates vision-language models with containers. Discover how it helps developers download and serve multimodal AI models.
Integrate Red Hat AI Inference Server with LangChain to build agentic document processing workflows. This article presents a use case and Python code.
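A minimal sketch of the wiring involved: Red Hat AI Inference Server is vLLM-based, so it speaks the OpenAI spec, and LangChain can point at any OpenAI-compatible endpoint. The URL, model name, and invoice prompt below are illustrative assumptions, not the article's code.

```python
# Point LangChain at an OpenAI-compatible inference endpoint.
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate

llm = ChatOpenAI(
    base_url="https://inference.example.com/v1",  # hypothetical endpoint
    api_key="EMPTY",
    model="granite-3.1-8b-instruct",  # assumed served model
)

prompt = ChatPromptTemplate.from_template(
    "Extract the invoice number and total from this document:\n\n{doc}"
)
chain = prompt | llm  # LCEL: pipe the prompt into the model

result = chain.invoke({"doc": "Invoice #4821 ... Total due: $1,250.00"})
print(result.content)
```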
Learn how OpenShift Lightspeed performed when asked to handle complex OpenShift scenarios, such as application security and advanced configurations.
Explore OpenShift Lightspeed through a certification-like exercise, pitting the AI assistant against real-world OpenShift certification questions.
Discover how to deploy compressed, fine-tuned models for efficient inference with the new Axolotl and LLM Compressor integration.
Red Hat OpenShift 4.19 brings a new unified perspective, AI chat assistant OpenShift Lightspeed, simultaneous VM migrations, and other features for developers.
Learn how to run vLLM on CPUs with OpenShift using Kubernetes APIs and dive into performance experiments for LLM benchmarking in this beginner-friendly guide.
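As a flavor of driving this through the Kubernetes APIs, here is a hedged Python sketch that creates a minimal CPU-only vLLM Deployment with the official Kubernetes client (which also works against OpenShift). The image tag, model, and resource sizes are assumptions; a CPU-built vLLM image is required, since the default images target GPUs.

```python
# Create a minimal vLLM Deployment via the Kubernetes Python client.
# Image, model, and resources are illustrative assumptions.
from kubernetes import client, config

config.load_kube_config()  # or load_incluster_config() inside a pod

container = client.V1Container(
    name="vllm",
    image="quay.io/example/vllm-cpu:latest",  # hypothetical CPU build
    args=["--model", "facebook/opt-125m"],    # small model for CPU tests
    ports=[client.V1ContainerPort(container_port=8000)],
    resources=client.V1ResourceRequirements(
        requests={"cpu": "8", "memory": "16Gi"},
        limits={"cpu": "8", "memory": "16Gi"},
    ),
)

deployment = client.V1Deployment(
    metadata=client.V1ObjectMeta(name="vllm-cpu"),
    spec=client.V1DeploymentSpec(
        replicas=1,
        selector=client.V1LabelSelector(match_labels={"app": "vllm-cpu"}),
        template=client.V1PodTemplateSpec(
            metadata=client.V1ObjectMeta(labels={"app": "vllm-cpu"}),
            spec=client.V1PodSpec(containers=[container]),
        ),
    ),
)

client.AppsV1Api().create_namespaced_deployment(
    namespace="default", body=deployment
)
```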
Discover why Kafka is the foundation behind modular, scalable, and controllable AI automation.
Learn how a service mesh lets you secure, observe, and control AI models at scale without code changes, simplifying zero-trust deployments.