
The growth of cloud-native applications has driven an explosion of east-west network traffic within the datacenter, where applications can create hundreds of thousands of network connections among virtual machines and containers. As a consequence, the task of tracking, monitoring, and securing a datacenter in a timely manner has grown beyond what any individual or team can handle, requiring the help of AI and machine learning (AI/ML) to enable ITOps, infrastructure security, and DevSecOps teams to manage the complexity of modern cloud-native applications and the underlying platforms.

Red Hat and NVIDIA have been working together to bring the security analytics capabilities of the NVIDIA Morpheus AI application framework to Red Hat infrastructure platforms for cybersecurity developers. This article provides a set of configuration instructions to Red Hat developers working on applications that use the NVIDIA Morpheus AI application framework and NVIDIA BlueField data processing units (DPUs) to secure interservice communication.

Prerequisites

Architecture overview

This architecture consists of several software and hardware components.

NVIDIA's Morpheus AI Cybersecurity Framework is at the heart of the system. The software runs on Red Hat OpenShift and uses AI/ML to continuously inspect network and server telemetry at scale. As a quick refresher, the NVIDIA Morpheus AI application framework is built on the RAPIDS libraries, deep learning frameworks, and NVIDIA Triton Inference Server. It can run on site or in the cloud, and simplifies the analysis of logs and telemetry to help detect and mitigate security threats so that developers can create and deploy AI-powered security solutions more quickly. The framework helps clean, filter, and pre-process the telemetry data before sending it to the NVIDIA Triton server for inference. The framework also post-processes inference results before returning them to a client application such as a monitoring dashboard.

The telemetry is gathered by the NVIDIA NetQ Agent running on the NVIDIA BlueField-2 DPU installed on the local server and running Red Hat Enterprise Linux. The agent captures network packets from the DPU on the server and sends them to the NVIDIA Morpheus AI framework for inspection.

An Apache Kafka broker is used for streaming telemetry data from data sources to the NVIDIA Morpheus AI engine and sending inspection output from the engine, via a Kafka output topic, to a client application (e.g., monitoring dashboard).

Note: This guide does not cover a client application for consuming inspection output. Instead, we will be directly monitoring the Kafka output topic.

Collecting telemetry with the NVIDIA NetQ Agent requires the NVIDIA Endpoint Gateway (also part of the NVIDIA Morpheus AI framework), which receives telemetry samples on a gRPC endpoint and forwards them to the input topic of the Apache Kafka broker.

The architecture is depicted at a high level in Figure 1.

Diagram of the NVIDIA Morpheus AI architecture.
Figure 1: The high-level architecture.

It is possible to deploy NVIDIA Morpheus AI with Red Hat OpenShift on both physical and cloud infrastructures. However, to simplify access for a wide variety of developers, we are using AWS infrastructure to host the NVIDIA Morpheus AI software and selecting an EC2 instance type that includes NVIDIA GPU resources. Additionally, we are going to assume that the developer has access to a physical server running Red Hat Enterprise Linux with an NVIDIA BlueField-2 DPU installed.

Installing and running NVIDIA Morpheus AI on Red Hat OpenShift

Now that you have all the prerequisites and requirements, let's get going! In the next few steps, you will:

  • Install a Red Hat OpenShift cluster on AWS, including configuration of an AWS account, connecting the installation program to AWS, and customizing installation files.
  • Install the NVIDIA GPU Operator on the OpenShift cluster. You will also obtain a cluster entitlement and install the Node Feature Discovery (NFD) Operator as part of this procedure.
  • Install the NVIDIA Morpheus AI Engine with an Apache Kafka broker and Endpoint Gateway, and deploy the sensitive information detection (SID) ML model.
  • Configure required Kafka topics.
  • Configure and install the NVIDIA Morpheus SDK CLI.
  • Install Red Hat Enterprise Linux on the NVIDIA BlueField-2 DPU on a local server.
  • Install and configure NVIDIA NetQ Agent on the DPU.
  • Simulate a Web server that receives sensitive data over HTTP, and observe how the NVIDIA Morpheus AI Framework performs sensitive information detection.

Installing Red Hat OpenShift cluster

First, we are going to install an OpenShift cluster on AWS using a preselected EC2 instance type with appropriate NVIDIA GPU resources.

  1. To prepare for installing Red Hat OpenShift on AWS, follow the Preparing to install on AWS and Configuring an AWS account guides.
  2. Proceed to Installing a cluster on AWS with customizations. In step 2 of Creating the installation configuration file, modify the worker instance type and replicas count in the generated install-config.yaml as follows:
    compute:
    - architecture: amd64
      hyperthreading: Enabled
      name: worker
      platform: 
        aws:
          type: g4dn.xlarge
      replicas: 1
    controlPlane:
  3. Deploy the cluster.
  4. Note the location of kubeconfig once the cluster deployment has completed. Example:
    INFO To access the cluster as the system:admin user when using 'oc', run 'export KUBECONFIG=/home/iris/iris-20210825/auth/kubeconfig'
  5. Install the OpenShift CLI client.
  6. Verify that the cluster is up and running. Run:
    export KUBECONFIG=<path to your kubeconfig file> 

    Then, run:

    oc get clusterversion
    NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
    version   4.8.10    True        False         20m     Cluster version is 4.8.10
  7. Verify that the cluster has a single worker:
    oc get node --selector=node-role.kubernetes.io/worker
    NAME                                           STATUS   ROLES    AGE   VERSION
    ip-10-0-133-217.eu-central-1.compute.internal  Ready    worker   42m   v1.21.1+9807387

Installing the NVIDIA GPU Operator

The NVIDIA GPU Operator automates the management of all NVIDIA software components needed to provision GPUs within OpenShift or Kubernetes. Follow the NVIDIA Cloud Native Technologies documentation to install the NVIDIA GPU Operator and its prerequisites on the Red Hat OpenShift cluster, and verify the installation.
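
The operator installation can take several minutes. As a quick sanity check before moving on, you can confirm that the operator pods are running and that the worker node now advertises the nvidia.com/gpu resource. The nvidia-gpu-operator namespace below is the default used by the OperatorHub installation; adjust it if you installed the operator into a different namespace.

# Check that the GPU Operator pods have started (default OperatorHub namespace assumed).
oc get pods -n nvidia-gpu-operator

# Check that the GPU-enabled worker node advertises the nvidia.com/gpu resource.
oc describe node --selector=node-role.kubernetes.io/worker | grep 'nvidia.com/gpu'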

Deploying the NVIDIA Morpheus AI framework

As described in the Architecture overview section, NVIDIA's Morpheus AI Framework has multiple components that need to be set up and configured. The NVIDIA Morpheus AI Engine is installed using a Helm chart that contains NVIDIA Triton Inference Server, a Kafka broker, and an Endpoint Gateway.

Note: In this example, the cluster must have a single GPU-enabled worker node for the NVIDIA Morpheus AI Engine to function properly. Also, both the NFD Operator and NVIDIA GPU Operator must already be installed, according to instructions provided in the previous section.

Here are the steps:

  1. Run:
    export KUBECONFIG=<path you copied in the cluster installation step>
  2. Assuming you have registered and have been selected to participate in the NVIDIA Morpheus Early Access program (EAP), generate an API key as explained in Generating Your NGC API Key. If you already have an API key, you can copy it from https://ngc.nvidia.com/setup.
  3. Run:
    export API_KEY=<your API key>
  4. Install Helm. At the time of writing, version 3.7.0 is available, but any 3.x version should work.
  5. Download the NVIDIA Morpheus AI Engine Helm chart:
    helm fetch https://helm.ngc.nvidia.com/ea-nvidia-morpheus/charts/morpheus-ai-engine-0.1.14.tgz --username='$oauthtoken' --password=$API_KEY --untar
  6. Create a morpheus-ai namespace:
    oc create namespace morpheus-ai
  7. Switch to the morpheus-ai project:
    oc project morpheus-ai
  8. Deploy the chart:
    helm install --set ngc.apiKey="$API_KEY",serviceAccount.create=true,platform.openshift=true,gateway.enabled=true,loadBalancer.enabled=true morpheus-ai-engine ./morpheus-ai-engine

Note: Setting gateway.enabled=true tells Helm to include the Endpoint Gateway in the installation. As already mentioned, the gateway is needed in this example so that the NVIDIA NetQ Agent can send traffic samples to the NVIDIA Morpheus AI framework. For the gateway to have a publicly accessible DNS name on AWS, we specify loadBalancer.enabled=true in the preceding command. Without this argument, or with loadBalancer.enabled=false, the gateway will be exposed via a NodePort.
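
If you want to confirm that the load balancer was provisioned, list the services in the morpheus-ai project. With loadBalancer.enabled=true, the gateway service (named netq-endpoint-gateway, the same service used later in this guide) should be of type LoadBalancer and show an AWS ELB hostname under EXTERNAL-IP:

oc get svc -n morpheus-ai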

  9. Verify the deployment. The engine might take several minutes to start, so use the watch command:
    watch -n5 oc get pod
    
    NAME                                READY   STATUS    RESTARTS   AGE
    ai-engine-7c694bc64c-hgnxl          1/1     Running   0          62m
    broker-9f8bf48b8-ph9qs              1/1     Running   0          62m
    endpoint-gateway-5cdb9f8f9d-qshht   1/1     Running   0          62m
    zookeeper-54f5cc7d8c-frgsj          1/1     Running   0          62m

Configuring Kafka topics

Apache Kafka is installed as a part of the Helm chart that also installs the NVIDIA Morpheus AI framework and is used to send input data (telemetry) and receive output (inspection results).

To use the Kafka broker with this use case, you need to create the input and output topics as follows:

  • netq_message_Packet_raw is read from by the NVIDIA Morpheus SDK CLI, and written to by the endpoint gateway.
  • morpheus-output is used by the NVIDIA Morpheus SDK CLI to stream post-processed inference output that contains the results of sensitive information detection.
export KAFKA_POD=$(oc get pods -l=app.kubernetes.io/name=broker -o jsonpath='{.items[0].metadata.name}')
oc exec $KAFKA_POD -- kafka-topics \
    --create \
    --bootstrap-server broker:9092 \
    --replication-factor 1 \
    --partitions 1 \
    --topic netq_message_Packet_raw
oc exec $KAFKA_POD -- kafka-topics \
    --create \
    --bootstrap-server broker:9092 \
    --replication-factor 1 \
    --partitions 1 \
    --topic morpheus-output

Verify that Kafka topics have been successfully created by running:

oc exec $KAFKA_POD -- kafka-topics --list --bootstrap-server broker:9092
morpheus-output
netq_message_Packet_raw

You can then monitor the output topic in a new terminal window:

  • export KUBECONFIG=<path to the kubeconfig>
  • export KAFKA_POD=$(oc get pods -l=app.kubernetes.io/name=broker -o jsonpath='{.items[0].metadata.name}')
  • oc exec $KAFKA_POD -- kafka-console-consumer --topic morpheus-output --bootstrap-server broker:9092 --from-beginning

Deploying SID model

The SID model is one of the pre-trained models that comes with the NVIDIA Morpheus AI framework. In this example, we need the model to detect sensitive data in packets captured by the NVIDIA NetQ Agent. The model must be loaded into the NVIDIA Triton Server and requires a GPU to run successfully. Here is how you can deploy the model:

  1. Install NGC CLI as described in NVIDIA NGC CLI Install.
  2. Run ngc config set and follow the prompt.
  3. Create a directory for ML models: mkdir -p models
  4. Run:
    ngc registry resource download-version "ea-nvidia-morpheus/sid_bert_triton_package:sid-minibert-20211002-t4"
  5. Then, run:
    unzip sid_bert_triton_package_vsid-minibert-20211002-t4/sid-minibert-trt-t4.zip -d models
  6. Verify that the target directory now contains the SID model: ls -R models
    models:
    sid-minibert-trt
    
    models/sid-minibert-trt:
    1  config.pbtxt
    
    models/sid-minibert-trt/1:
    README.md  sid-minibert-trt_b1-8_b1-16_b1-32.engine
  7. Copy the SID model to Triton's models repository:
    • MORPHEUS_POD=$(oc get pods -l=app.kubernetes.io/name=ai-engine -o jsonpath='{.items[0].metadata.name}')
    • oc rsync models/sid-minibert-trt $MORPHEUS_POD:/common/triton-model-repo
  8. List all available models:
    oc exec -ti $MORPHEUS_POD -- curl -f -X POST http://localhost:8000/v2/repository/index | jq
    
    [
      {
        "name": "sid-minibert-trt"
      }
    ]
  9. Load the SID model into Triton:
    oc exec -ti $MORPHEUS_POD -- curl -f -X POST http://localhost:8000/v2/repository/models/sid-minibert-trt/load
    
  10. Verify that the model has been successfully loaded:
    oc exec -ti $MORPHEUS_POD -- curl -f -X POST http://localhost:8000/v2/repository/index | jq
    [
      {
        "name": "sid-minibert-trt",
        "version": "1",
        "state": "READY"
      }
    ]
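
Optionally, you can also query Triton's per-model readiness endpoint, which is part of the standard KServe v2 HTTP API exposed by Triton. It returns HTTP 200 only when the model is ready to serve inference requests:

oc exec -ti $MORPHEUS_POD -- curl -sf -o /dev/null -w '%{http_code}\n' \
    http://localhost:8000/v2/models/sid-minibert-trt/ready
# Expected output: 200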

Deploying NVIDIA Morpheus SDK CLI with SID pipeline

The NVIDIA Morpheus SDK CLI pre-processes input data before sending it for inference to the NVIDIA Triton Inference Server and is installed using a separate Helm chart. The Helm chart is instructed to deploy the pipeline on the same OpenShift worker node as the NVIDIA Morpheus AI engine because they share GPU memory for better inference performance. One of the pipeline types supported by the SDK is Natural Language Processing (NLP). It is used here with the SID model.

  1. Download the Helm chart to install and configure NVIDIA Morpheus SDK CLI:
    helm fetch https://helm.ngc.nvidia.com/ea-nvidia-morpheus/charts/morpheus-sdk-client-0.1.8.tgz --username='$oauthtoken' --password=$API_KEY --untar
    
  2. Deploy the Helm chart:
    helm install --set ngc.apiKey="$API_KEY" --set sdk.args="morpheus --debug --log_level=DEBUG run --num_threads=8 --pipeline_batch_size=1024 --model_max_batch_size=32 pipeline-nlp --model_seq_length=256 from-kafka --input_topic netq_message_Packet_raw --bootstrap_servers broker:9092 buffer deserialize preprocess --vocab_hash_file=./data/bert-base-uncased-hash.txt --do_lower_case=True monitor --description='Preprocessing rate' buffer inf-triton --force_convert_inputs=True --model_name=sid-minibert-trt --server_url=ai-engine:8001 monitor --description='Inference rate' --smoothing=0.001 --unit inf add-class filter serialize --exclude '^ts_' to-kafka --output_topic morpheus-output --bootstrap_servers broker:9092",platform.openshift=true morpheus-sid-pipeline morpheus-sdk-client

Let’s take a closer look at the arguments used in sdk.args. The pipeline runs NLP using the sid-minibert-trt model (--model_name). It reads input from the netq_message_Packet_raw Kafka topic (--input_topic), and calls the NVIDIA Triton Inference Server at ai-engine:8001 (--server_url). Finally, it writes output into the morpheus-output Kafka topic (--output_topic).
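
For readability, here is the same pipeline definition from sdk.args with each stage on its own line. This is only an annotated view of the single-line value passed with --set in the preceding command; keep the original single-line form when running helm install.

morpheus --debug --log_level=DEBUG run \
    --num_threads=8 --pipeline_batch_size=1024 --model_max_batch_size=32 \
  pipeline-nlp --model_seq_length=256 \
  from-kafka --input_topic netq_message_Packet_raw --bootstrap_servers broker:9092 \
  buffer \
  deserialize \
  preprocess --vocab_hash_file=./data/bert-base-uncased-hash.txt --do_lower_case=True \
  monitor --description='Preprocessing rate' \
  buffer \
  inf-triton --force_convert_inputs=True --model_name=sid-minibert-trt --server_url=ai-engine:8001 \
  monitor --description='Inference rate' --smoothing=0.001 --unit inf \
  add-class \
  filter \
  serialize --exclude '^ts_' \
  to-kafka --output_topic morpheus-output --bootstrap_servers broker:9092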

  3. Wait until the pipeline pod has started:
    watch -n5 oc get pods -l=app.kubernetes.io/name=sdk-cli-morpheus-sid-pipeline
    NAME                                             READY   STATUS    RESTARTS   AGE
    sdk-cli-morpheus-sid-pipeline-69b49c654d-qrc9d   1/1     Running   0          9m20s
  4. Verify that there are no errors in the pipeline log:
    oc logs $(oc get pods -l=app.kubernetes.io/name=sdk-cli-morpheus-sid-pipeline -o jsonpath='{.items[0].metadata.name}')
    Configuring Pipeline via CLI
    ...
    Added stage: <to-kafka-11; WriteToKafkaStage(bootstrap_servers=broker:9092, output_topic=morpheus-output)>
      └─ List[str] -> List[str]
    ====Building Pipeline Complete!====
    ====Starting Pipeline====
    ...

Setting up NVIDIA NetQ agent on NVIDIA BlueField-2 DPU

Now we'll walk through the steps to set up the NVIDIA NetQ agent.

Installing Red Hat Enterprise Linux on NVIDIA BlueField-2 DPU

In our testing, we used a Dell EMC PowerEdge R740 server in the following configuration:

  • 40 core Intel Xeon Silver 4210 CPU at 2.20GHz
  • 92GB memory
  • NVIDIA BlueField-2 DPU (model MT42822)

To install Red Hat Enterprise Linux on both the server and the DPU, the respective installation images must be obtained using a free Red Hat Developer subscription, as detailed in this guide.

Note: The provisioning scripts have only been tested with Red Hat Enterprise Linux 8.4, so we recommend that you use this version.

To install Red Hat Enterprise Linux on the Dell PowerEdge R740 server, follow this installation guide.

To start installing Red Hat Enterprise Linux on the NVIDIA BlueField-2 card, run these commands:

  • git clone https://github.com/kwozyman/rhel-on-bf2
  • cd rhel-on-bf2
  • export RHEL_ISO=/path/to/redhat_iso_file
  • bash bluefield_provision.sh -a

The preceding commands will, in order:

  • Prepare the host system and install dependencies.
  • Update BlueField-2 firmware.
  • Enable SR-IOV on the host.
  • Prepare and boot PXE.

This README file contains important information on installing Red Hat Enterprise Linux on DPUs.

After dependencies are installed, you will see this output for the firmware update:

Complete!
=== STATUS === Performing firmware update
--2021-05-20 10:22:18--  https://www.mellanox.com/downloads/BlueField/BlueField-3.5.1.11601/BlueField-3.5.1.11601_install.bfb
Resolving www.mellanox.com (www.mellanox.com)... 151.101.2.133
Connecting to www.mellanox.com (www.mellanox.com)|151.101.2.133|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://content.mellanox.com/BlueField/BlueField-3.5.1.11601/BlueField-3.5.1.11601_install.bfb [following]
--2021-05-20 10:22:19--  https://content.mellanox.com/BlueField/BlueField-3.5.1.11601/BlueField-3.5.1.11601_install.bfb
Resolving content.mellanox.com (content.mellanox.com)... 107.178.241.102
Connecting to content.mellanox.com (content.mellanox.com)|107.178.241.102|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 528205736 (504M) [application/octet-stream]
Saving to: ‘BlueField-3.5.1.11601_install.bfb’
 
BlueField-3.5.1.11601_install.bf 100%[==========>] 503.74M 97.6MB/s in 6.9s
 
2021-05-20 10:22:27 (72.9 MB/s) - ‘BlueField-3.5.1.11601_install.bfb’ saved [528205736/528205736]
 
=== STATUS === Sending firmware to BF2. Please wait.

You might see the following message for more than 10 seconds; stand by:

Welcome to minicom 2.7.1
OPTIONS: I18n
Compiled on Aug 13 2018, 16:41:28.
Port /dev/rshim0/console, 10:22:16
 
Press CTRL-A Z for help on special keys

Yocto boot will commence:

Welcome to minicom 2.7.1
OPTIONS: I18n
Compiled on Aug 13 2018, 16:41:28.
Port /dev/rshim0/console, 10:22:16
 
Press CTRL-A Z for help on special keys
 
[5.228955] mlxbf_gige MLNXBF17:00 oob_net0: renamed from eth0
[5.295481] virtio_net virtio1 tmfifo_net0: renamed from eth0
[5.370656] mlx5_core 0000:03:00.0: Rate limit: 127 rates are supported, range: 0Mbps to 97656Mbps
[5.388735] mlx5_core 0000:03:00.0: E-Switch: Total vports 18, per vport: max uc(1024) max mc(16384)
[5.410144] mlx5_core 0000:03:00.0: Port module event: module 0, Cable plugged
[5.424763] mlx5_core 0000:03:00.0: mlx5_pcie_event:286:(pid 7): PCIe slot power capability was not advertised.
[5.431846] mlx5_core 0000:03:00.1: enabling device (0400 -> 0402)
[5.457855] mlx5_core 0000:03:00.1: firmware version: 24.29.2002
[5.469980] mlx5_core 0000:03:00.1: 252.048 Gb/s available PCIe bandwidth (16 GT/s x16 link)
[5.736384] mlx5_core 0000:03:00.1: Rate limit: 127 rates are supported, range: 0Mbps to 97656Mbps
[5.754472] mlx5_core 0000:03:00.1: E-Switch: Total vports 18, per vport: max uc(1024) max mc(16384)
[5.776826] mlx5_core 0000:03:00.1: Port module event: module 1, Cable unplugged
[5.791805] mlx5_core 0000:03:00.1: mlx5_pcie_event:286:(pid 266): PCIe slot power capability was not advertised.
[5.799161] mlx5_core 0000:03:00.0: MLX5E: StrdRq(1) RqSz(8) StrdSz(2048) RxCqeCmprss(0)
[6.084276] mlx5_core 0000:03:00.1: MLX5E: StrdRq(1) RqSz(8) StrdSz(2048) RxCqeCmprss(0)
[7.074338] random: dd: uninitialized urandom read (512 bytes read)
[7.104375] random: rngd: uninitialized urandom read (16 bytes read)
[7.117171] random: rngd: uninitialized urandom read (4 bytes read)
[9.211564] urandom_read: 1 callbacks suppressed
[9.211568] random: rngd: uninitialized urandom read (2500 bytes read) 
[9.234163] random: crng init done

The firmware upgrade will take place automatically, and the PXE server setup will follow:

************************************************************
***                                                      ***
***         Platform firmware updates complete.          ***
***                                                      ***
************************************************************
 
root@localhost:~# /lib/firmware/mellanox/mlxfwmanager_sriov_dis_aarch64_41686
Querying Mellanox devices firmware ...
 
Device #1:
Device Type: BlueField2
Part Number: MBF2H516A-CEEO_Ax_Bx
Description: BlueField-2 DPU 100GbE Dual-Port QSFP56; PCIe Gen4 x16; Crypto; 16GB on-board DDR; 1GbE OOB management; FHHL
PSID: MT_0000000702
PCI Device Name: 0000:03:00.0
Base GUID: b8cef603003dd01e
Base MAC: b8cef63dd01e
Versions: Current        Available
       FW 24.29.2002     24.29.2002
     NVMe 20.1.0003      20.1.0003
      PXE 3.6.0204       3.6.0204
     UEFI 14.22.0016     14.22.0016
 
Status: Up to date
 
 
=== STATUS === Checking usability of SRIOV for PCI 0000:af:00.0
SRIOV enabled
EMBEDDED_CPU mode enabled
=== STATUS === Checking usability of SRIOV for PCI 0000:af:00.1
SRIOV enabled
EMBEDDED_CPU mode enabled
=== STATUS === Setting up PXE environment
--2021-05-20 10:25:17--  http://download.eng.bos.redhat.com/released/rhel-8/RHEL-8/8.4.0-Beta-1/BaseOS/aarch64/iso/RHEL-8.4.0-20210309.1-aarch64-dvd1.iso
Resolving download.eng.bos.redhat.com (download.eng.bos.redhat.com)... 10.19.43.4
Connecting to download.eng.bos.redhat.com (download.eng.bos.redhat.com)|10.19.43.4|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 7368329216 (6.9G) [application/octet-stream]
Saving to: '/tmp/RHEL-8.4.0-20210309.1-aarch64-dvd1.iso'
 
.1-aarch64-dvd1.iso      40%[======            ]2.77G 77.8MB/s eta 61s

Then BlueField-2 DPU will PXE boot from the network interface:

Done
Next step: PXE boot from target (make sure to select the correct port!)
=== INFO === The BF2 is about to be rebooted and minicom console
=== INFO === started. You must manually select the PXE boot device.
=== INFO === This can't be fully automated because the list of
=== INFO === options is not consistent.
=== INFO ===
=== INFO === ACTION: When you see the "Boot Option Menu" select the
=== INFO === option with following device path (and press enter):
=== INFO === MAC(001ACAFFFF01,0x1)/
=== INFO === IPv4(0.0.0.0)
=== INFO === In most cases, this is "EFI NETWORK 4". After that the
=== INFO === automation picks up again. Let it take over. The console
=== INFO === and reboot are slow. Have patience.
=== INFO ===
=== INFO === Press enter when you're ready.

Note: The device name can vary in your particular case, but it will follow the naming convention of EFI NETWORK X (e.g., EFI NETWORK 4).

Press Enter and wait until you see the screen shown in Figure 2.

Screenshot of the NVIDIA BlueField-2 Boot Manager menu.
Figure 2: NVIDIA BlueField-2 Boot Manager menu.

You should shortly see a successful PXE boot:

»Start PXE over IPv4.
  Station IP address is 172.31.100.10
 
  Server IP address is 172.31.100.1
  NBP filename is /BOOTAA64.EFI
  NBP filesize is 857984 Bytes
 Downloading NBP file...
 
  NBP file downloaded successfully.
Fetching Netboot Image
=== INFO === The RHEL install has been started. This is the end of the automation.
=== INFO === I will reattach the minicom console to see the install progress.
=== INFO === You can drop it anytime with key sequence: ctrl-a X
=== INFO ===
=== INFO === Press enter when you're ready.

You can exit the script at this point or watch the installation take place by pressing Enter.

There are two ways to get the DPU IP address:

  1. Log in via console from the host to the DPU:
    minicom --color on --baudrate 115200 --device /dev/rshim0/console

    Log in as the root user, using the password defined in the installation kickstart.

    You can now simply use Linux standard tools, for example:

    ip a

    To exit the console, you can press Ctrl-A and then Q.

  2. Examine DHCPD lease files on the host:

    cat /var/lib/dhcpd/dhcpd.leases

    Be sure to look for the lease that is in binding state "active" and has the hardware address of 00:1a:ca:ff:ff:01.
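
If the lease file contains many entries, a one-liner such as the following can pull out the addresses that were handed to that MAC (a convenience sketch; the last address printed corresponds to the most recent lease):

# Print the IP address of every lease recorded for the DPU's MAC address.
awk '/^lease/ {ip=$2} /hardware ethernet 00:1a:ca:ff:ff:01/ {print ip}' /var/lib/dhcpd/dhcpd.leases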

Configuring NVIDIA NetQ agent for packet mirroring

Next, we will install an NVIDIA NetQ agent on the DPU, configure it to capture packets on a network interface, and mirror them to the Morpheus endpoint gateway.

Open a terminal session to the DPU, either via SSH or minicom console.

  1. Install the ARM64 version of the NGC CLI as described in the documentation.
  2. Configure the CLI by running:
    ngc config set
  3. Download an RPM of the NVIDIA NetQ Agent for ARM64:
    ngc registry resource download-version "ea-nvidia-morpheus/morpheus_telemetry:netq4-rhel8"
  4. Install the RPM:
    rpm -i morpheus_telemetry_vnetq4-rhel8/netq-agent-4.0.0-rh8u34~1626785319.908c36d5.aarch64.rpm
  5. Create /etc/netq/pcap.yml using the name of your DPU's network interface, e.g., eth0:
    netq-pcap-config:
      interface: <network-interface> 
      filter: "tcp port 80"
  6. Find out the DNS name of the endpoint gateway:
    oc get svc netq-endpoint-gateway
    NAME                    TYPE           CLUSTER-IP       EXTERNAL-IP              PORT(S)           AGE
    netq-endpoint-gateway   LoadBalancer   172.30.254.220   a0...elb.amazonaws.com   31980:31357/TCP   128m
  7. Modify the /etc/netq/netq.yml configuration file to point to the endpoint gateway. For instance:
    netq-agent:
      port: 31980
      server: a0...elb.amazonaws.com
      vrf: default
      is-pcap-enabled: True
      suppress-sub-agents: all
  8. Start the agent:
    sudo systemctl start netq-agent
  9. Verify that the agent is running and can connect to the endpoint gateway:
    sudo systemctl status -l netq-agent
    Aug 26 05:54:10 agent.redhat.com netq-go-agent[160240]: INFO: Initializing Client Connection to gateway.example.com:31980
    Aug 26 05:54:10 agent.redhat.com netq-agent[160211]: 2021/08/26 05:54:10.550318 agent.redhat.com netq-go-agent[160240]: INFO: Reading Live Capture Started
    Aug 26 05:54:10 agent.redhat.com netq-go-agent[160240]: INFO: Reading Live Capture Started
    Aug 26 05:54:10 agent.redhat.com netq-agent[160211]: 2021/08/26 05:54:10.722425 agent.redhat.com netq-go-agent[160240]: INFO: Connection Successful to OPTA
    Aug 26 05:54:10 agent.redhat.com netq-go-agent[160240]: INFO: Connection Successful to OPTA
    Aug 26 05:54:10 agent.redhat.com netq-agent[160211]: Successfully Connected to gateway.example.com:319802021/08/26 05:54:10.722519 agent.redhat.com netq-go-agent[160240]: INFO: starting stream to gateway.example.com:31980
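
Optionally, before relying on the agent, you can confirm on the DPU that the chosen interface and the BPF filter from pcap.yml actually see the expected traffic. A short capture with tcpdump (which may need to be installed separately) should print packets whenever HTTP requests reach the DPU:

# Capture five packets on the DPU interface using the same filter as /etc/netq/pcap.yml.
sudo tcpdump -nn -c 5 -i <network-interface> 'tcp port 80'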

Simulating an application service

Let's simulate an application that runs on the DPU and exchanges sensitive data in JSON payloads over HTTP. For this purpose, we will use a small Python HTTP server (built on the standard http.server module) that echoes the headers and body of incoming HTTP requests to stdout.

  1. Open a new terminal session to the DPU because the server will run in the foreground.
  2. Change to your home directory: cd $HOME
  3. Create a file named echo-server.py with the following content:
    #!/usr/bin/env python3
    
    from http.server import BaseHTTPRequestHandler, HTTPServer
    import logging
    
    class PostHandler(BaseHTTPRequestHandler):
    
        def do_POST(self):
            self.send_response(200)
            self.send_header('Content-Type', 'application/json')
            self.end_headers()
            logging.info(f'-- Headers --:\n{self.headers}')
            content_len = int(self.headers.get('Content-Length'))
            logging.info(f'-- Body --:\n{self.rfile.read(content_len)}')
            return
    
    logging.basicConfig(level=logging.INFO)
    server = HTTPServer(('', 80), PostHandler)
    server.serve_forever()
  4. Make the file executable:
    sudo chmod u+x echo-server.py
  5. Open a new terminal window (the server runs in the foreground).
  6. Execute:
    sudo ./echo-server.py

Detecting sensitive information through ML inference

Finally, to test sensitive information detection, we will send a request that contains sensitive data using a cURL command (you can send the request from the host to the DPU if the DPU does not have an external IP address):

curl -X POST http://agent.redhat.com -H 'Content-Type: application/json' --data-binary '{"user": "Teresa Bruce"}'

See inference results on the output topic (the messages have been formatted for readability):

{"timestamp": 1633448662, "host_ip": "agent.redhat.com", "data_len": 183, "data": "POST / HTTP/1.1Host: agent.redhat.comUser-Agent: curl/7.69.1Accept: */*Content-Type: application/jsonContent-Length: 24{\"user\": \"Teresa Bruce\"}", "src_mac": "14:58:d0:58:95:92", "dest_mac": "3c:94:d5:4f:d9:c1", "protocol": "TCP", "src_ip": "10.8.2.97", "dest_ip": "10.35.206.159", "src_port": 80, "dest_port": 39932, "flags": 32784, "si_address": false, "si_bank_acct": false, "si_credit_card": false, "si_email": false, "si_govt_id": false, "si_name": true, "si_password": false, "si_phone_num": false, "si_secret_keys": false, "si_user": false}

As you can see in the preceding output, Teresa Bruce has been correctly identified as a person's name ("si_name": true). While this is a relatively simple example, the model is continuously refined to improve the accuracy of detection.
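
Raw messages on the output topic can be hard to scan. If jq is available on the machine where you run oc, a small filter like the one below (an optional convenience, not part of the framework) prints only the endpoints involved and the si_* flags that evaluated to true:

oc exec $KAFKA_POD -- kafka-console-consumer --topic morpheus-output \
    --bootstrap-server broker:9092 --from-beginning \
  | jq '{host_ip, src_ip, dest_ip} + with_entries(select((.key | startswith("si_")) and .value))'

For the example request above, this prints the host and IP addresses together with "si_name": true.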

Try sending other types of sensitive data to the server to see SID in action. For example:

curl -X POST http://agent.redhat.com -H 'Content-Type: application/json' --data-binary '{"email": "tbruce@acme.com"}'
curl -X POST http://agent.redhat.com -H 'Content-Type: application/json' --data-binary '{"phone": "+1-609-361-2775x8385"}'
curl -X POST http://agent.redhat.com -H 'Content-Type: application/json' --data-binary '{"home": "9951 Miller Street Apt. 113, Laurenshire, IA 67656"}'

And more than one sensitive item:

curl -X POST http://agent.redhat.com -H 'Content-Type: application/json' --data-binary '{"user": "Teresa Bruce", "email": "tbruce@acme.com", "home": "9951 Miller Street Apt. 113, Laurenshire, IA 67656", "phone": "+1-609-361-2775x8385"}'

Deployment cleanup

Follow the instructions at Uninstalling a cluster on AWS to delete the cluster and all associated AWS resources.
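
If you installed the cluster with the installer-provisioned workflow described earlier, cleanup typically amounts to running the destroy command from the same machine and installation directory used for the install (shown here as a reminder; the linked documentation remains the authoritative reference):

# Destroy the cluster and the AWS resources created by the installer.
./openshift-install destroy cluster --dir <installation_directory> --log-level info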

How to avoid common issues

A complex installation and configuration process that involves multiple software and infrastructure components leaves plenty of room for things not to work right away. We ran into several such errors in our own deployment and list them here so that you can avoid these issues and get up and running sooner.

  • The Triton server might complain that the model requires a GPU while the node does not have one. This could be because the operator has not fully started yet. Just wait for the NVIDIA GPU Operator to complete initialization, then delete the ai-engine pod and let OpenShift recreate it.
    Poll failed for model directory 'sid-minibert-trt': instance group sid-minibert-trt_0 of model sid-minibert-trt has kind KIND_GPU but no GPUs are available
  • You might see a message that the pipeline cannot connect to Kafka. This could be because Kafka service is not up yet, the topics were not created, or their names were misspelled.
  • If there are no messages on the Kafka input topic, that is likely because the topic name was misspelled.
  • If the pipeline complains about the model not being available, the model might not have been loaded yet, or failed to load.
    Traceback (most recent call last):
      File "/opt/conda/envs/morpheus/lib/python3.8/threading.py", line 932, in _bootstrap_inner
        self.run()
      File "/opt/conda/envs/morpheus/lib/python3.8/threading.py", line 870, in run
        self._target(*self._args, **self._kwargs)
      File "/opt/conda/envs/morpheus/lib/python3.8/site-packages/morpheus/pipeline/inference/inference_triton.py", line 615, in main_loop
        self.init(loop)
      File "/opt/conda/envs/morpheus/lib/python3.8/site-packages/morpheus/pipeline/inference/inference_triton.py", line 459, in init
        assert self._triton_client.is_model_ready(self._model_name), \
    AssertionError: Triton model sid-minibert-trt is not ready
  • Another common issue is when the NVIDIA NetQ agent cannot connect to the endpoint gateway, or the agent tries to initialize the client connection but never reaches Connection Successful to OPTA. This can happen when the gateway specified in /etc/netq/netq.yml is unreachable (e.g., no route to the gateway's host, the gateway is down, or the port is incorrect or blocked by a firewall). A quick reachability check is sketched after this list.
    Aug 26 06:00:37 agent.redhat.com netq-go-agent[160446]: INFO: Initializing Client Connection to gateway.example.com:31980
    Aug 26 06:00:37 agent.redhat.com netq-agent[160415]: 2021/08/26 06:00:37.670319 agent.redhat.com netq-go-agent[160446]: INFO: Reading Live Capture Started
    Aug 26 06:00:37 agent.redhat.com netq-go-agent[160446]: INFO: Reading Live Capture Started
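
As a basic reachability test from the DPU, you can check that the gateway's host and port accept TCP connections before restarting the agent. The sketch below uses bash's built-in /dev/tcp redirection; substitute the placeholders with the server and port values from your /etc/netq/netq.yml:

# Attempt a TCP connection to the endpoint gateway (5-second timeout).
timeout 5 bash -c '</dev/tcp/<gateway-host>/<gateway-port>' && echo "gateway reachable" || echo "gateway unreachable"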

Conclusion

Congratulations—if you are reading this, then you are likely to have successfully set up, configured, and tested the NVIDIA Morpheus AI application framework and NVIDIA BlueField-2 data processing units (DPUs) with Red Hat OpenShift and Red Hat Enterprise Linux.

The model supplied with the Morpheus SDK for testing purposes is just one example of how to build an application for a particular use case using this powerful AI-enabled software. Additional models are available from NVIDIA, and Morpheus is designed to make it easy for customers to modify GPU-powered data science models or create new ones.

Combining the advanced Kubernetes capabilities of OpenShift, the trusted capabilities of Red Hat Enterprise Linux, and the data sampling power of the BlueField DPU helps you develop and deploy custom cybersecurity applications tailored to your particular organization's needs.

We hope that by going through this set of instructions, you now feel more confident about getting further involved in real-world projects. If you have encountered any issues while working with this configuration guide, please leave a comment on this article.
