Using GeoJSON with Apache Camel K for spatial data transformation

November 23, 2020
Maria Arias de Reyna Dominguez
Related topics:
Java, Artificial intelligence, Containers, Microservices, Kubernetes

    In this article, we will define and run a workflow that demonstrates how Apache Camel K interacts with spatial data in the standardized GeoJSON format. While the example is simplified, you can use the same workflow to handle big data and more complex data transformations.

    You will learn how to use Camel K to transform data in common formats like XML and JSON. You will also see how to connect to a database and extract the data that you want from it. After we've defined the workflow, we'll run the integration on Red Hat OpenShift.

    Note: Please see the extended example for step-by-step instructions to set up a demonstration environment with Visual Studio Code, OpenShift, and Apache Camel K.

    Working with geospatial formats

    There are many ways to encode geographic information. In a geospatial format, we want to capture both the data and the location context, usually in geographic coordinates. Stored images, such as photos from a satellite, are known as raster data. This type of image usually contains metadata attributes with the coordinates defining where the photo was taken.

    When we want to represent alphanumeric data, we use vector data. Vector data comprises features: objects or elements that can be placed at or associated with spatial coordinates. A feature could be a bank, a city, a road, and so on. Each of these elements has an associated point, line, or polygon defining its geographic location.

    Two of the most-used formats for vector data are Keyhole Markup Language (KML), based on XML; and GeoJSON, based on JSON.

    GeoJSON

    GeoJSON is a well-known standard for encoding geographic data structures. Any application or library that understands JSON can read it. GeoJSON includes specific attributes for the geospatial context. For example, the following snippet defines a point:

    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [125.6, 10.1]
      },
      "properties": {
        "name": "Dinagat Islands",
        "country": "Philippines",
        "type": "Province"
      }
    }

    The example includes a JSON attribute, type, which tells us that we are looking at a feature representing an object. Another attribute, geometry, defines a point in space. There is also an object containing properties, which are the attributes associated with the given point in space. According to the properties, the point described is located in a province called Dinagat Islands in the Philippines. GeoJSON does not define the semantics for the properties; that depends on each use case.
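
    To make the structure concrete, here is a minimal sketch in plain Java that assembles the same kind of feature as a string. The class name GeoJsonPointSketch and its helper methods are hypothetical and only for illustration; a real integration would normally rely on a JSON library. Note that GeoJSON positions list longitude first, then latitude.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper for illustration; not part of the original integration.
public class GeoJsonPointSketch {

    // Escapes backslashes and double quotes only (a minimal escape, not full JSON escaping).
    static String quote(String value) {
        return "\"" + value.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // Builds a Feature with a Point geometry. GeoJSON positions are [longitude, latitude].
    static String pointFeature(double lon, double lat, Map<String, String> properties) {
        String props = properties.entrySet().stream()
                .map(e -> quote(e.getKey()) + ":" + quote(e.getValue()))
                .collect(Collectors.joining(","));
        return String.format(Locale.ROOT,
                "{\"type\":\"Feature\",\"geometry\":{\"type\":\"Point\","
                        + "\"coordinates\":[%s,%s]},\"properties\":{%s}}",
                lon, lat, props);
    }

    public static void main(String[] args) {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("name", "Dinagat Islands");
        props.put("country", "Philippines");
        props.put("type", "Province");
        System.out.println(pointFeature(125.6, 10.1, props));
    }
}
```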

    I've given a brief overview of working with geospatial data and GeoJSON. Now, let's explore our data transformation workflow.

    The data transformation workflow

    We start by reading a CSV file and looping over each row independently. We'll query an XML API to add more data to each element in each row. Finally, we'll collect and aggregate all of the rows to build a GeoJSON file and store it in a PostgreSQL database. While it's not demonstrated, we're using the PostGIS extension to add spatial capabilities to the PostgreSQL database. Figure 1 describes the workflow.

    Figure 1: The spatial data transformation workflow.

    After we've defined the workflow, we'll use a Java file to implement it with Camel K. Note that Camel K supports many languages besides Java. You could use any of the supported languages to implement this workflow.

    Step 1: Start the data transformation workflow

    For simplicity, we'll start with a timer. A timer is a type of step that triggers a workflow at a fixed interval. For example, we could run this workflow every 120 seconds:

    from("timer:java?period=120s")

    We might replace this timer with a trigger like a Kafka or RabbitMQ message in a real-life workflow. Using a message as a trigger ensures that the data is processed the moment that it is updated.

    Step 2: Process the original data set in a CSV file

    Next, we read the CSV file. For this example, we're using a small subset of an air quality dataset maintained by the European Environment Agency. We collect the data, which is hosted on a remote server, and process it as a CSV file:

    .to("{{source.csv}}")
    .unmarshal("customCSV")
    

    This command tells Camel K to convert the CSV data to a list of elements, with one element per row. Each element is a map whose keys are column names. We want to iterate over each row, so we split the body and start streaming it to the rest of the workflow:

    .split(body()).streaming()

    The split is how we process each row in the CSV file. After the split, each row travels through the rest of the workflow as its own message.
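
    Outside of Camel K, the shape of the data after unmarshalling and splitting can be sketched in plain Java. This is an illustration only: the class CsvRowsSketch and the column names are assumptions, and the parser ignores quoted fields, which the customCSV data format in the route would handle.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the unmarshal and split steps; not the Camel implementation.
public class CsvRowsSketch {

    // Parses simple CSV text (no quoted fields) into one map per row, keyed by column name.
    static List<Map<String, String>> parse(String csv) {
        String[] lines = csv.trim().split("\\R");
        String[] header = lines[0].split(",");
        List<Map<String, String>> rows = new ArrayList<>();
        for (int i = 1; i < lines.length; i++) {
            String[] cells = lines[i].split(",", -1);
            Map<String, String> row = new LinkedHashMap<>();
            for (int c = 0; c < header.length && c < cells.length; c++) {
                row.put(header[c].trim(), cells[c].trim());
            }
            rows.add(row);
        }
        return rows;
    }

    public static void main(String[] args) {
        // Assumed column names; the real data set has its own header.
        String csv = "station,lat,lon,pollutant\nA1,40.4,-3.7,NO2\nB2,41.4,2.2,O3\n";
        // Iterating over the list corresponds to handling one split message at a time.
        for (Map<String, String> row : parse(csv)) {
            System.out.println(row.get("station") + " -> " + row.get("pollutant"));
        }
    }
}
```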

    Step 3: Extract the data

    At each step, we want to extract the data that interests us. The best way to do this is to use a processor that extracts the values. We start with a CSV processor:

    .process(processCsv)

    Remember that a map represents each element. For each element, we call a Nominatim service to look up the address where the measurement was taken:

    .setBody().constant("")
    .setHeader(Exchange.HTTP_METHOD, constant("GET"))
    .setHeader(Exchange.HTTP_QUERY, simple("lat=${exchangeProperty.lat}&lon=${exchangeProperty.lon}&format=xml"))
    .to("https://nominatim.openstreetmap.org/reverse")
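
    To see what request these headers produce, the following plain Java sketch builds the equivalent URL. The class name ReverseQuerySketch and the sample coordinates are assumptions; in the route, the values come from the lat and lon exchange properties.

```java
import java.net.URI;
import java.util.Locale;

// Hypothetical helper that mirrors the HTTP_QUERY header set in the route.
public class ReverseQuerySketch {

    // Builds the reverse-geocoding URL with lat, lon, and format=xml.
    static URI reverseUri(double lat, double lon) {
        String query = String.format(Locale.ROOT, "lat=%s&lon=%s&format=xml", lat, lon);
        return URI.create("https://nominatim.openstreetmap.org/reverse?" + query);
    }

    public static void main(String[] args) {
        // Sample coordinates chosen for illustration only.
        System.out.println(reverseUri(40.4168, -3.7038));
    }
}
```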
    

    The response is in XML format, which we can unmarshal for easier processing:

    .unmarshal().jacksonxml()

    Figure 2 shows the Nominatim response in XML.

    Figure 2: The Nominatim example response in XML.

    We can now use another processor to extract more of the data that interests us and add it to our stored data collection. This time, we'll use an XML processor:

    .process(processXML)
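
    The body of processXML is not shown here. As a rough idea of what such an extraction step does, this sketch pulls single values out of an XML document with the JDK's DOM parser instead of the Jackson XML data format used in the route. The class name, the sample document, and the element names are assumptions for illustration, not a captured Nominatim response.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

// Hypothetical extraction helper; the article's processXML may work differently.
public class ReverseXmlSketch {

    // Returns the text of the first element with the given tag name, or null if absent.
    static String firstText(String xml, String tag) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList nodes = doc.getElementsByTagName(tag);
            return nodes.getLength() == 0 ? null : nodes.item(0).getTextContent();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Illustrative sample only; element names are assumed.
        String xml = "<reversegeocode><addressparts>"
                + "<city>Madrid</city><country>Spain</country>"
                + "</addressparts></reversegeocode>";
        System.out.println(firstText(xml, "city") + ", " + firstText(xml, "country"));
    }
}
```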
    

    We can also query a PostgreSQL database for even more data:

    .setBody().simple("SELECT info FROM descriptions WHERE id like '${exchangeProperty.pollutant}'")
    .to("jdbc:postgresBean?readSize=1")
    

    We'll use another processor to extract the data that we want from the database:

    .process(processDB)

    At this point, we have processed each row of the original CSV file and added data from both a remote service and a database.

    Step 4: Aggregate the data into a single GeoJSON file

    Now, we will aggregate all of the rows to make a single GeoJSON file. This file contains a feature collection, where each feature is built from a row of the original CSV file:

    .aggregate(constant(true), aggregationStrategy)
    .completionSize(5)
    .process(buildGeoJSON)
    .marshal().json(JsonLibrary.Gson)
    

    We store the GeoJSON result in our PostgreSQL database:

    .setBody(simple("INSERT INTO measurements (geojson) VALUES (ST_GeomFromGeoJSON('${body}'))"))
    .to("jdbc:postgresBean")
    

    Step 5: Run the integration with Camel K

    So far, we have defined a workflow, but we haven't yet run it. This is where the magic starts.

    To deploy our integration, we need a Kubernetes environment like OpenShift. Assuming we have the OpenShift environment set up, we can use the kamel client to launch our integration:

    kamel run OurIntegration.java
    

    When we run the integration, our workflow is deployed in a container and starts running.

    Step 6: View the output

    As our last step, we can query the data stored in the database and visualize it on a map, as shown in Figure 3. Because we are using a standard format (GeoJSON), we can use it in any geospatial application. In this case, we are using QGIS, a free and open source desktop geographic information system.

    Figure 3: QGIS querying the database with GeoJSON.

    Conclusion

    This article has presented a brief introduction to GeoJSON and shown how to use Camel K to transform spatial data into that format. The same pattern of splitting, enriching, and aggregating applies to larger data sets and more complex transformations.

    Last updated: November 19, 2020
