Model deployment and troubleshooting with RHEL AI

With our enhanced model ready, we'll deploy it using RHEL AI for initial testing and validation.
Prerequisites:
To install RHEL AI on AWS, you must have:
- An active AWS account with the proper permissions.
- A Red Hat subscription to access RHEL AI downloads.
- The AWS CLI installed and configured with your access key ID and secret access key.
- Sufficient AWS resources: a VPC, subnet, security group, and SSH key pair.
- Storage: a minimum of 1TB for the /home directory and 120GB for the / path.
In this lesson, you will:
- Set up RHEL AI on AWS by creating S3 buckets, configuring IAM roles, and converting raw images to AMIs.
- Deploy a GPU-enabled EC2 instance with RHEL AI for optimal model serving performance.
- Transfer your trained model from a local development environment to the cloud-based RHEL AI instance.
- Configure InstructLab for GGUF models by switching from vLLM to the llama-cpp backend for compatibility.
- Serve your model in production using RHEL AI's built-in capabilities and test interactive functionality.
- Troubleshoot common deployment issues, including taxonomy validation errors, environment setup problems, and AWS-specific challenges.
- Review performance considerations for AWS RHEL AI, including instance types, storage, security, and cost management.
- Validate model functionality in a cloud environment before scaling to production deployment.
Phase 2: Model deployment with RHEL AI
RHEL AI provides a bootable image with everything needed to run and serve our model.
Set up RHEL AI on AWS
For scalable deployment and testing, we'll use RHEL AI on AWS.
Note: Before starting the RHEL AI installation on AWS, ensure you have satisfied the prerequisites.
Follow these steps to install RHEL AI:
Install and configure the AWS CLI (if not already installed):
# Download and install AWS CLI v2
curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"
unzip awscliv2.zip
sudo ./aws/install

# Configure AWS CLI
aws configure
Create and set up the necessary environment variables:
export BUCKET=<custom_bucket_name>
export RAW_AMI=rhel-ai-nvidia-aws-1.5-1747399384-x86_64.raw
export AMI_NAME="rhel-ai"
export DEFAULT_VOLUME_SIZE=1000  # Size in GB
Create an S3 bucket and the IAM resources needed for image conversion:
aws s3 mb s3://$BUCKET
Create a trust policy file for VM import:
printf '{ "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Principal": { "Service": "vmie.amazonaws.com" }, "Action": "sts:AssumeRole", "Condition": { "StringEquals":{ "sts:Externalid": "vmimport" } } } ] }' > trust-policy.json # Create the IAM role aws iam create-role --role-name vmimport --assume-role-policy-document file://trust-policy.json
Create a role policy for S3 bucket access:
printf '{ "Version":"2012-10-17", "Statement":[ { "Effect":"Allow", "Action":[ "s3:GetBucketLocation", "s3:GetObject", "s3:ListBucket" ], "Resource":[ "arn:aws:s3:::%s", "arn:aws:s3:::%s/*" ] }, { "Effect":"Allow", "Action":[ "ec2:ModifySnapshotAttribute", "ec2:CopySnapshot", "ec2:RegisterImage", "ec2:Describe*" ], "Resource":"*" } ] }' $BUCKET $BUCKET > role-policy.json aws iam put-role-policy --role-name vmimport --policy-name vmimport-$BUCKET --policy-document file://role-policy.json
Download and convert the RHEL AI image:
- Go to the Red Hat Enterprise Linux AI download page.
- Download the RAW image file (rhel-ai-nvidia-aws-1.5-1747399384-x86_64.raw).
- Upload it to your S3 bucket:
aws s3 cp rhel-ai-nvidia-aws-1.5-1747399384-x86_64.raw s3://$BUCKET/
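The RAW image is large, so it is worth confirming the upload before starting the import. A quick check, using the $BUCKET and $RAW_AMI variables set earlier:
# List the uploaded object to confirm the copy completed and check its size
aws s3 ls s3://$BUCKET/$RAW_AMI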
Create the import configuration and convert the image:
# Create import configuration
printf '{
  "Description": "RHEL AI Image",
  "Format": "raw",
  "UserBucket": {
    "S3Bucket": "%s",
    "S3Key": "%s"
  }
}' $BUCKET $RAW_AMI > containers.json

# Start the import process
task_id=$(aws ec2 import-snapshot --disk-container file://containers.json | jq -r .ImportTaskId)

# Monitor import progress
aws ec2 describe-import-snapshot-tasks --filters Name=task-state,Values=active
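If you prefer not to re-run the monitoring command by hand, the sketch below polls the task until it finishes. It assumes the $task_id captured above; the 60-second interval is arbitrary:
# Poll the import task until it reports "completed"
while true; do
  state=$(aws ec2 describe-import-snapshot-tasks --import-task-ids $task_id \
    | jq -r '.ImportSnapshotTasks[0].SnapshotTaskDetail.Status')
  echo "Import status: $state"
  [ "$state" = "completed" ] && break
  sleep 60
done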
Wait for the import to complete, then register the AMI:
# Get snapshot ID from completed import
snapshot_id=$(aws ec2 describe-import-snapshot-tasks --import-task-ids $task_id | jq -r '.ImportSnapshotTasks[0].SnapshotTaskDetail.SnapshotId')

# Tag the snapshot
aws ec2 create-tags --resources $snapshot_id --tags Key=Name,Value="$AMI_NAME"

# Register AMI from snapshot
ami_id=$(aws ec2 register-image \
  --name "$AMI_NAME" \
  --description "$AMI_NAME" \
  --architecture x86_64 \
  --root-device-name /dev/sda1 \
  --block-device-mappings "DeviceName=/dev/sda1,Ebs={VolumeSize=${DEFAULT_VOLUME_SIZE},SnapshotId=${snapshot_id}}" \
  --virtualization-type hvm \
  --ena-support \
  | jq -r .ImageId)

# Tag the AMI
aws ec2 create-tags --resources $ami_id --tags Key=Name,Value="$AMI_NAME"
Set up instance configuration variables:
instance_name=rhel-ai-instance
ami=$ami_id                  # From previous step
instance_type=g4dn.xlarge    # GPU-enabled instance for AI workloads
key_name=<your-key-pair-name>
security_group=<your-sg-id>
subnet=<your-subnet-id>
disk_size=1000               # GB
Launch the RHEL AI instance:
aws ec2 run-instances \
  --image-id $ami \
  --instance-type $instance_type \
  --key-name $key_name \
  --security-group-ids $security_group \
  --subnet-id $subnet \
  --block-device-mappings DeviceName=/dev/sda1,Ebs='{VolumeSize='$disk_size'}' \
  --tag-specifications 'ResourceType=instance,Tags=[{Key=Name,Value='$instance_name'}]'
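To find the public IP used in the next step, you can query EC2 by the Name tag. This assumes the instance receives a public IP, that is, it sits in a public subnet with auto-assign enabled:
# Look up the running instance's public IP by its Name tag
aws ec2 describe-instances \
  --filters "Name=tag:Name,Values=$instance_name" "Name=instance-state-name,Values=running" \
  --query 'Reservations[].Instances[].PublicIpAddress' \
  --output text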
Connect to your instance:
ssh -i your-key.pem cloud-user@<instance-public-ip>
Verify RHEL AI installation:
# Verify InstructLab tools
ilab --help

# Initialize InstructLab (first time)
ilab config init
Transfer your enhanced model
After training a model with InstructLab on your local development machine, you need to transfer it to your RHEL AI AWS instance as follows:
Export your model:
# On your local development machine where you trained the model
# Identify the model path (look for the converted GGUF model)
ls ./instructlab-granite-7b-lab-trained/

# Archive the model for transfer
tar -czvf telecom-model.tar.gz ./instructlab-granite-7b-lab-trained/
Transfer to the RHEL AI AWS instance:
# Using scp to transfer to AWS instance
scp -i your-key.pem telecom-model.tar.gz cloud-user@<instance-public-ip>:/home/cloud-user/
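Because the archive can be several gigabytes, a resumable transfer can save time if the connection drops. An alternative sketch using rsync over SSH, with the same key and destination as above:
# rsync resumes partial transfers and shows progress
rsync -avP -e "ssh -i your-key.pem" telecom-model.tar.gz cloud-user@<instance-public-ip>:/home/cloud-user/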
Extract on RHEL AI:
# SSH into your RHEL AI AWS instance
ssh -i your-key.pem cloud-user@<instance-public-ip>

# Extract the model
mkdir -p ~/models
tar -xzvf telecom-model.tar.gz -C ~/models
Deploy the model with RHEL AI built-in capabilities
RHEL AI comes with InstructLab pre-installed and includes support for serving models. To utilize these capabilities:
Configure InstructLab for GGUF models:
# Check current InstructLab configuration
ilab config show

# Edit the configuration file to use llama-cpp backend
vi ~/.config/instructlab/config.yaml
Update the serve section to use the llama-cpp backend:
serve:
  backend: llama-cpp  # Change from vllm to llama-cpp
In the first terminal, start the model server:
# Start the model server
ilab model serve --model-path ~/models/instructlab-granite-7b-lab-trained/instructlab-granite-7b-lab-Q4_K_M.gguf
The server will start and display messages indicating it's ready to accept connections. Keep this terminal open and running.
In a second terminal, connect to the served model:
# Open a new terminal and SSH into your RHEL AI instance again
ssh -i your-key.pem cloud-user@<instance-public-ip>

# Connect to the served model for interactive chat
ilab model chat
You can now interact with your fine-tuned telecom support assistant to test it. Try asking questions like:
- "What is fiber optic internet?"
- "How does 5G compare to 4G?"
- "What are the benefits of VoIP?"
Performance considerations for AWS RHEL AI
When running AI models on AWS RHEL AI, consider the following:
- Instance types: Choose GPU-enabled instances (g4dn, p3, p4d) for optimal AI workload performance. The g4dn.xlarge instance type provides a good balance of cost and performance for testing.
- Storage: RHEL AI requires a minimum of 1TB for the /home directory (InstructLab data) and 120GB for the / path (system updates). The AWS setup above configures appropriate storage automatically.
- Security: Configure security groups to allow the necessary ports (an example rule follows this list):
- Port 22 for SSH access.
- Port 8000 for the model server (if exposing it externally).
- Cost management: Monitor AWS costs as GPU instances can be expensive. Consider using spot instances for development and testing to reduce costs.
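For example, if you do expose the model server, you might open port 8000 only to a known address rather than to the whole internet. A sketch using the security group variable from the launch step; the CIDR here is a placeholder, so substitute your own admin IP:
# Allow inbound TCP 8000 from a single trusted IP only
aws ec2 authorize-security-group-ingress \
  --group-id $security_group \
  --protocol tcp \
  --port 8000 \
  --cidr 203.0.113.10/32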
This setup provides a production-ready environment for testing your model's responses and making iterative improvements.
Common issues and troubleshooting
When working with InstructLab and RHEL AI, you might encounter some common issues.
Taxonomy validation errors
InstructLab has strict requirements for taxonomy files:
- No trailing spaces: Make sure there are no spaces at the end of the lines in your YAML files.
- Taxonomy version: Use version 3 for knowledge taxonomies.
- Required fields: Ensure all required fields (domain, document, questions_and_answers) are present.
- Minimum examples: Knowledge taxonomies require at least five seed examples.
- Repository references: The document section must reference a valid GitHub repository.
To check for these issues, run:
ilab taxonomy diff
If you encounter validation errors, carefully review the error messages and fix each issue.
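As a reference point, here is a rough skeleton of a knowledge qna.yaml that satisfies the rules above. The field values are illustrative placeholders rather than files from this project, and the exact schema can vary between InstructLab releases, so always confirm with ilab taxonomy diff:
version: 3
domain: telecommunications
created_by: <your-github-username>
seed_examples:
  - context: |
      Fiber optic internet transmits data as pulses of light through glass fibers.
    questions_and_answers:
      - question: What medium does fiber optic internet use?
        answer: Pulses of light carried over glass fiber.
  # ...at least five seed examples in total, with no trailing spaces
document:
  repo: https://github.com/<your-org>/<your-docs-repo>
  commit: <commit-sha>
  patterns:
    - "*.md"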
Environment setup issues
If you're missing tools like yq, run:
# For macOS
brew install yq
# For Ubuntu/Debian
sudo apt-get install yq
# For Fedora/RHEL
sudo dnf install yq
Model training performance
For better performance during model training:
- Use GPU acceleration when available.
- Start with smaller datasets for initial testing.
- Consider using OpenShift AI for distributed training at scale (in future deployments).
macOS model compatibility
If you encounter errors about vLLM not supporting your platform during local development, remember to convert your model to GGUF format:
ilab model convert --model-dir ~/.local/share/instructlab/checkpoints/YOUR_MODEL
AWS RHEL AI deployment issues
Common issues when deploying on AWS, and how you might solve them, include:
- Import task fails: Check IAM permissions and S3 bucket access.
- AMI registration fails: Verify snapshot completed successfully.
- Instance launch fails: Check VPC, subnet, and security group configurations.
- Connection issues: Verify security group allows SSH (port 22) from your IP.
- Model serving issues: Ensure GPU drivers are properly configured and model paths are correct.
RHEL AI model serving issues
You may encounter issues with model serving on RHEL AI:
- vLLM compatibility errors: GGUF files are not compatible with vLLM. Always configure InstructLab to use the llama-cpp backend for GGUF models by editing ~/.config/instructlab/config.yaml.
- Model server fails to start: Check the server logs for specific error messages.
Common issues include the following (a few quick checks appear after this list):
- Insufficient GPU memory.
- Port already in use.
- Incorrect model file path.
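These can be checked quickly from the instance, assuming the default port and the model path used earlier:
# Check that the GPU is visible and has free memory
nvidia-smi

# See whether another process is already bound to the serving port
ss -ltnp | grep 8000

# Confirm the GGUF file exists at the path passed to --model-path
ls -lh ~/models/instructlab-granite-7b-lab-trained/instructlab-granite-7b-lab-Q4_K_M.gguf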
Having deployed our model on RHEL AI, examined possible issues we may face, and learned how to troubleshoot them, we are ready to productionize and scale the model with OpenShift AI.