William Caban Babilonia's contributions
Article
Store immutable AI evaluation records with EvalHub and OCI
William Caban Babilonia and 1 other
Discover how to use EvalHub and OCI persistence to make your AI evaluation results immutable, content-addressable, and fully auditable.
Article
Add automated AI evaluations to your CI/CD pipeline
William Caban Babilonia and 2 others
Learn how to use the EvalHub CLI to automate AI evaluations in your CI/CD pipelines. Install the SDK, configure profiles, and set up a production gate.
Article
Bring your own evaluation framework to EvalHub
William Caban Babilonia and 2 others
Learn how to onboard a custom evaluation framework into EvalHub using one class, one method, and a container image. This guide covers the contract, data structures, and a complete minimal adapter.
Article
Understanding evaluation collections in EvalHub
William Caban Babilonia and 2 others
Learn how to read an existing system collection, understand its threshold logic, and build your own collection that encodes your actual measurement strategy with thresholds that mean something.
Article
Evaluation-driven development with EvalHub
William Caban Babilonia and 1 other
Learn how evaluation-driven development (EDD) turns AI optimization from an art into an engineering discipline with EvalHub.
Article
EvalHub: Because "looks good to me" isn't a benchmark
William Caban Babilonia and 1 other
Learn about the five primary structural challenges in enterprise AI evaluation and how EvalHub addresses them with a unified foundation for AI evaluation.
Article
How EvalHub manages two-layer Kubernetes control planes
William Caban Babilonia and 4 others
Learn how Red Hat AI 3.4 uses EvalHub to orchestrate AI evaluations on Kubernetes. Scale frameworks like Garak and LightEval with built-in MLflow tracking.
Article
Synthetic data for RAG evaluation: Why your RAG system needs better testing
Aditi Saluja and 2 others
Build better RAG systems with SDG Hub. Generate high-quality question-answer-context triplets to benchmark retrievers and track LLM performance over time.