Stream Processing

Build a data streaming pipeline using Kafka Streams and Quarkus

Build a data streaming pipeline using Kafka Streams and Quarkus

In typical data warehousing systems, data is first accumulated and then processed. But with the advent of new technologies, it is now possible to process data as and when it arrives. We call this real-time data processing. In real-time processing, data streams through pipelines; i.e., moving from one system to another. Data gets generated from static sources (like databases) or real-time systems (like transactional applications), and then gets filtered, transformed, and finally stored in a database or pushed to several other systems for further processing. The other systems can then follow the same cycle—i.e., filter, transform, store, or push to other systems.

Continue reading Build a data streaming pipeline using Kafka Streams and Quarkus

Share
New language support features in Apache Camel VS Code extension 0.0.27

New language support features in Apache Camel VS Code extension 0.0.27

In this article, I share several new language support features in the recently released Language Support for Apache Camel VS Code extension 0.0.27. Before I discuss these improvements, please note that updates to the VS Code extension are available in other IDEs that support the Camel Language Server, including Eclipse IDE, Eclipse Che, and more. It is simply easier to focus on one IDE for my demonstrations, so I’ve chosen VS Code.

Note: Apache Camel is a versatile open source integration framework based on known enterprise integration patterns.

Continue reading “New language support features in Apache Camel VS Code extension 0.0.27”

Share
Kubernetes-native Apache Kafka with Strimzi, Debezium, and Apache Camel (Kafka Summit 2020)

Kubernetes-native Apache Kafka with Strimzi, Debezium, and Apache Camel (Kafka Summit 2020)

Apache Kafka has become the leading platform for building real-time data pipelines. Today, Kafka is heavily used for developing event-driven applications, where it lets services communicate with each other through events. Using Kubernetes for this type of workload requires adding specialized components such as Kubernetes Operators and connectors to bridge the rest of your systems and applications to the Kafka ecosystem.

In this article, we’ll look at how the open source projects Strimzi, Debezium, and Apache Camel integrate with Kafka to speed up critical areas of Kubernetes-native development.

Note: Red Hat is sponsoring the Kafka Summit 2020 virtual conference from August 24-25, 2020. See the end of this article for details.

Continue reading “Kubernetes-native Apache Kafka with Strimzi, Debezium, and Apache Camel (Kafka Summit 2020)”

Share
Introduction to Strimzi: Apache Kafka on Kubernetes (KubeCon Europe 2020)

Introduction to Strimzi: Apache Kafka on Kubernetes (KubeCon Europe 2020)

Apache Kafka has emerged as the leading platform for building real-time data pipelines. Born as a messaging system, mainly for the publish/subscribe pattern, Kafka has established itself as a data-streaming platform for processing data in real-time. Today, Kafka is also heavily used for developing event-driven applications, enabling the services in your infrastructure to communicate with each other through events using Apache Kafka as the backbone. Meanwhile, cloud-native application development is gathering more traction thanks to Kubernetes.

Thanks to the abstraction layer provided by this platform, it’s easy to move your applications from running on bare metal to any cloud provider (AWS, Azure, GCP, IBM, and so on) enabling hybrid-cloud scenarios as well. But how do you move your Apache Kafka workloads to the cloud? It’s possible, but it’s not simple. You could learn all of the Apache Kafka tools for handling a cluster well enough to move your Kafka workloads to Kubernetes, or you could leverage the Kubernetes knowledge you already have using Strimzi.

Note: Strimzi will be represented at the virtual KubeCon Europe 2020 conference from 17-20 August 2020. See the end of the article for details.

Continue reading “Introduction to Strimzi: Apache Kafka on Kubernetes (KubeCon Europe 2020)”

Share
HTTP-based Kafka messaging with Red Hat AMQ Streams

HTTP-based Kafka messaging with Red Hat AMQ Streams

Apache Kafka is a rock-solid, super-fast, event streaming backbone that is not only for microservices. It’s an enabler for many use cases, including activity tracking, log aggregation, stream processing, change-data capture, Internet of Things (IoT) telemetry, and more.

Red Hat AMQ Streams makes it easy to run and manage Kafka natively on Red Hat OpenShift. AMQ Streams’ upstream project, Strimzi, does the same thing for Kubernetes.

Setting up a Kafka cluster on a developer’s laptop is fast and easy, but in some environments, the client setup is harder. Kafka uses a TCP/IP-based proprietary protocol and has clients available for many different programming languages. Only the JVM client is on Kafka’s main codebase, however.

Continue reading “HTTP-based Kafka messaging with Red Hat AMQ Streams”

Share
Choosing the right asynchronous-messaging infrastructure for the job

Choosing the right asynchronous-messaging infrastructure for the job

The term asynchronous means “not occurring at the same time.” In the context of distributed systems and messaging, this term implies that request processing will occur at an arbitrary point in time. Asynchronous interactions hold many advantages over synchronous ones, but they also introduce new challenges. In this article, we will focus on specific considerations for choosing the asynchronous-messaging infrastructure for your event-driven systems.

Continue reading Choosing the right asynchronous-messaging infrastructure for the job

Share
Build a simple cloud-native change data capture pipeline

Build a simple cloud-native change data capture pipeline

Change data capture (CDC) is a well-established software design pattern for a system that monitors and captures data changes so that other software can respond to those events. Using KafkaConnect, along with Debezium Connectors and the Apache Camel Kafka Connector, we can build a configuration-driven data pipeline to bridge traditional data stores and new event-driven architectures.

This article walks through a simple example.

Continue reading “Build a simple cloud-native change data capture pipeline”

Share
Tracking COVID-19 using Quarkus, AMQ Streams, and Camel K on OpenShift

Tracking COVID-19 using Quarkus, AMQ Streams, and Camel K on OpenShift

In just a matter of weeks, the world that we knew changed forever. The COVID-19 pandemic came swiftly and caused massive disruption to our healthcare systems and local businesses, throwing the world’s economies into chaos. The coronavirus quickly became a crisis that affected everyone. As researchers and scientists rushed to make sense of it, and find ways to eliminate or slow the rate of infection, countries started gathering statistics such as the number of confirmed cases, reported deaths, and so on. Johns Hopkins University researchers have since aggregated the statistics from many countries and made them available.

In this article, we demonstrate how to build a website that shows a series of COVID-19 graphs. These graphs reflect the accumulated number of cases and deaths over a given time period for each country. We use the Red Hat build of Quarkus, Apache Camel K, and Red Hat AMQ Streams to get the Johns Hopkins University data and populate a MongoDB database with it. The deployment is built on the Red Hat OpenShift Container Platform (OCP).

Continue reading “Tracking COVID-19 using Quarkus, AMQ Streams, and Camel K on OpenShift”

Share
Event streaming and data federation: A citizen integrator’s story

Event streaming and data federation: A citizen integrator’s story

Businesses are seeking to benefit from every customer interaction with real-time personalized experience. Targeting each customer with relevant offers can greatly improve customer loyalty, but we must first understand the customer. We have to be able to draw on data and other resources from diverse systems, such as marketing, customer service, fraud, and business operations. With the advent of modern technologies and agile methodologies, we also want to be able to empower citizen integrators (typically business users who understand business and client needs) to create custom software. What we need is one single functional domain where the information is harmonized in a homogeneous way.

Continue reading Event streaming and data federation: A citizen integrator’s story

Share