Data scientists often use notebooks to explore data and to create and experiment with models. At the end of this exploratory phase comes the product-delivery phase: getting the final model into production. Serving a model in production is not a one-step final process, however. It is a continuous cycle of training, development, and data monitoring that is best captured, and automated, using pipelines. This brings us to a dilemma: How do you move code from notebooks to containers orchestrated in a pipeline, and schedule that pipeline to run on specific triggers such as time of day, the arrival of a new batch of data, or changes in monitoring metrics?
Continue reading From notebooks to pipelines: Using Open Data Hub and Kubeflow on OpenShift
The main goal of Kubernetes is to reach the desired state: to deploy our pods, set up the network, and provide storage. This paradigm extends to Operators, which use custom resources to define that state. When an Operator picks up its custom resource, it continually tries to reach the state the resource defines. That means that if we modify a resource managed by the Operator, the Operator will quickly revert our change to match the desired state.
Continue reading Open Data Hub and Kubeflow installation customization
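The reconciliation behavior described above can be sketched in a few lines of plain Python. This is an illustrative model only, not Operator SDK code: real Operators watch the Kubernetes API and act on live resources rather than comparing dictionaries.

```python
def reconcile(desired: dict, actual: dict) -> dict:
    """Return the changes needed to move `actual` toward `desired`.

    Any field of a managed resource that drifts from the desired state
    is scheduled to be overwritten -- which is why manual edits to
    Operator-managed resources do not stick.
    """
    actions = {}
    for key, value in desired.items():
        if actual.get(key) != value:
            actions[key] = value  # create or revert drifted fields
    return actions


# A manual edit (replicas scaled down by hand) is treated as drift:
desired = {"replicas": 3, "image": "odh/jupyterhub:v0.6"}
actual = {"replicas": 1, "image": "odh/jupyterhub:v0.6"}
print(reconcile(desired, actual))  # only the drifted field is reverted
```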
This article is about my experience installing Red Hat Data Grid (RHDG) on Red Hat CodeReady Containers (CRC) so that I could set up a local environment to develop and test a Quarkus Infinispan client. I started by installing CodeReady Containers and then installed Red Hat Data Grid. I am also on a learning path for Quarkus, so my last step was to integrate the Quarkus Infinispan client into my new development environment.
Initially, I tried connecting the Quarkus client to my locally running instance of Data Grid. Later, I decided I wanted to create an environment where I could test and debug Data Grid on Red Hat OpenShift 4. I tried installing Data Grid on OpenShift 4 in a shared environment, but maintaining that environment was challenging. Through trial and error, I found that it was better to install Red Hat Data Grid on CodeReady Containers and use that as my local development and testing environment.
In this quick tutorial, I guide you through setting up a local environment to develop and test a Quarkus client—in this case, Quarkus Infinispan. The process consists of three steps:
- Install and run CodeReady Containers.
- Install Data Grid on CodeReady Containers.
- Integrate the Quarkus Infinispan client into the new development environment.
Continue reading “Develop and test a Quarkus client on Red Hat CodeReady Containers with Red Hat Data Grid 8.0”
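Once Data Grid is running on CodeReady Containers, the Quarkus side of step three mostly comes down to client configuration. The fragment below is an illustrative `application.properties` for the `quarkus-infinispan-client` extension; the property names follow that extension's configuration, but exact names and credentials depend on your Quarkus version and Data Grid setup, so treat this as a sketch to verify, not a drop-in config. Port 11222 is Data Grid's default Hot Rod endpoint.

```properties
# Hypothetical example settings for the quarkus-infinispan-client extension;
# verify property names against your Quarkus version's documentation.
quarkus.infinispan-client.server-list=localhost:11222
quarkus.infinispan-client.auth-username=developer
quarkus.infinispan-client.auth-password=<your-password>
```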
It has been just a few short weeks since we released Open Data Hub (ODH) 0.6.0, which brought many changes to the underlying architecture along with some new features. We have since found a few issues in the new version with the Kubeflow Operator, as well as a few regressions introduced by the new JupyterHub updates. To make sure your experience with ODH 0.6 does not suffer because we chose to release early, we offer a new, mostly bugfix release: Open Data Hub 0.6.1.
Continue reading Open Data Hub 0.6.1: Bug fix release to smooth out redesign regressions
Open Data Hub (ODH) is a blueprint for building an AI-as-a-service platform on Red Hat’s Kubernetes-based OpenShift 4.x. Version 0.6 of Open Data Hub comes with significant changes to the overall architecture as well as component updates and additions. In this article, we explore these changes.
From Ansible Operator to Kustomize
If you follow the Open Data Hub project closely, you might be aware that we have been working on a major design change for a few weeks now. As we started working more closely with the Kubeflow community to get Kubeflow running on OpenShift, we decided to adopt Kubeflow as the Open Data Hub upstream and use its deployment tools, namely KfDef manifests and Kustomize, for deployment manifest customization.
Continue reading “Open Data Hub 0.6 brings component updates and Kubeflow architecture”
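To give a feel for the new deployment model, here is a schematic KfDef manifest: each application entry points Kustomize at a path within a manifests repository. This is an illustrative sketch only; the field values (names, paths, and the repository URI placeholder) are not taken from the actual ODH manifests.

```yaml
# Schematic KfDef sketch -- illustrative values, not the real ODH manifest.
apiVersion: kfdef.apps.kubeflow.org/v1
kind: KfDef
metadata:
  name: opendatahub
spec:
  applications:
    - name: jupyterhub
      kustomizeConfig:
        repoRef:
          name: manifests
          path: jupyterhub/jupyterhub
  repos:
    - name: manifests
      uri: <URL of the manifests repository>
```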
With the rise of social networks, and with people having more free time due to isolation, it has become common to see maps and graphs built from big spatial data that explain how COVID-19 is spreading, why it spreads faster in some countries, and how we can stop it.
Continue reading Working with big spatial data workflows (or, what would John Snow do?)
Red Hat Data Grid helps applications access, process, and analyze data at in-memory speed. Red Hat Data Grid 8.0, included in the latest update to Red Hat Runtimes, provides a distributed, in-memory NoSQL datastore. This release includes a new Operator for handling complex applications, a new server architecture that reduces memory consumption and increases security, a faster REST API with new features, a new CLI, and compatibility with a variety of observability tools.
Continue reading Red Hat Data Grid 8.0 brings new server architecture, improved REST API, and more
Most programs need data in order to work. Sometimes this data is provided to the program when it runs, and sometimes the data is built into the program. In this article, I’ll explain how to store large amounts of data inside a program so that it is there when the program runs.
Continue reading “How to store large amounts of data in a program”
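One common way to build data into a program, sketched here in Python under the assumption that the article's technique is along these lines, is to embed the bytes in the source itself as an encoded text literal. Base64 keeps arbitrary binary data printable in source code; the payload is decoded once at import time.

```python
import base64

# A small binary payload embedded directly in the source as base64 text.
# Larger payloads work the same way, just with a longer literal.
_ENCODED = "SGVsbG8sIGVtYmVkZGVkIGRhdGEh"
DATA = base64.b64decode(_ENCODED)


def payload_text() -> str:
    """Return the embedded payload decoded as UTF-8 text."""
    return DATA.decode("utf-8")


if __name__ == "__main__":
    print(payload_text())  # prints "Hello, embedded data!"
```

The same idea appears in other languages, for example as byte-array literals generated by tools like `xxd -i` in C.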