Running HPC workloads across multiple architectures with Red Hat Enterprise Linux

In this article, I want to provide some background on our recently developed demonstration video, “Running Game of Life across multiple architectures with Red Hat Enterprise Linux”.

This video shows the Game of Life running in a heterogeneous environment using three 64-bit hardware architectures: aarch64 (ARM v8-A), ppc64le (IBM Power little endian) and x86_64 (Intel Xeon). If you are not familiar with the rules of this cellular automaton, they are worth checking out via the reference above.

In this multi-architecture demonstration, all systems were located in different test labs, were running Red Hat Enterprise Linux and used Open MPI over Gigabit Ethernet to communicate among themselves. The same program source code was compiled for each of the three architectures and the respective binaries were run on remote systems.

Since this demo was built as a proof of concept, Open MPI communication between systems in remote labs was expected to be rather slow. In addition, the variable latency visible in the video was introduced by the remote GUI display.

The Game of Life calculations used a total of 12 MPI processes, with 4 processes running on each physical system. The Life matrix size was determined by the size of the GUI display. For each new generation, the master MPI process sends the matrix dimensions to the slave MPI processes, followed by several rows of the current generation matrix. This information is sufficient for a particular slave process to compute and send back the new generation of the rows assigned to it.

Interestingly, the Open MPI tools and libraries used in this demonstration were based on the OpenHPC repository (the 1.3.1 branch). Red Hat recently contributed changes to the OpenHPC build procedures to simplify the process of specifying alternate or newer compilers for the build. The OpenHPC packages were built for all involved architectures (aarch64, ppc64le and x86_64). This is the first proof-of-concept demonstration of an OpenHPC-based software stack running on top of Red Hat Enterprise Linux.

In addition, Red Hat enables developers to access the latest stable open source C and C++ compilers, along with complementary development and performance-profiling tools, via Red Hat Developer Toolset. It is now available across multiple architectures with the following Red Hat Enterprise Linux subscriptions:

  • Red Hat Enterprise Linux on x86 systems (Intel and AMD)
  • Red Hat Enterprise Linux for IBM Power
  • Red Hat Enterprise Linux for IBM z Systems
  • Red Hat Enterprise Linux Server for ARM Developer Preview

Finally, I’ve used Ansible for easier and faster installation across the three systems. Thanks to Ansible’s system-dependent facts, the same playbook could be used for all systems.

The first step is to list all involved systems in the Ansible hosts (inventory) file:
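The actual lab hostnames are not reproduced in this article; a minimal inventory sketch defining the `game` group (with placeholder names, not the real lab systems) could look like this:

```ini
# Ansible inventory (e.g. /etc/ansible/hosts); hostnames are placeholders
[game]
x86-node.example.com
power-node.example.com
arm-node.example.com
```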


This is the group of hosts used in the following Ansible playbook. In the first part of the playbook, the necessary repositories are defined and the needed packages are installed:

- name: game
  hosts: game
  user: root
  tasks:

    - name: copy OpenHPC repository file
      copy:
        src: files/ohpc/ohpc-{{ ansible_machine }}.repo
        dest: /etc/yum.repos.d/ohpc-{{ ansible_machine }}.repo

    - name: install OpenHPC packages
      action: package name={{ item }} state=latest
      with_items:
        - lmod-ohpc
        - openmpi-gnu-ohpc

After Ansible has installed the necessary packages, the next step is to make sure that the compiler and MPI environment modules are loaded:

    - name: automatically load OpenHPC compiler and Open MPI
      copy:
        dest: /etc/profile.d/
        content: "module load gnu openmpi\n"

The next step is to create a hostfile for Open MPI. It should contain the names of the systems involved in the calculation and how many processes should be spawned on each of them:

    - name: create Open MPI hostfile
      copy:
        dest: /home/test/hostfile
        content: "{% for host in groups['game'] %}{{ host }} slots=4\n{% endfor %}"
      become: yes
      become_user: test
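For three systems, the generated hostfile would look something like this (hostnames are placeholders, not the actual lab machines):

```
x86-node.example.com slots=4
power-node.example.com slots=4
arm-node.example.com slots=4
```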

Initially, I wanted to use ansible_processor_vcpus, but there seems to be a bug somewhere in Ansible that makes it report the wrong number of CPUs on aarch64. Because of that, I am hardcoding the number of CPUs available to Open MPI on aarch64 to be 4.

The last step of my Ansible playbook is the download and compilation of the actual Game of Life implementation using mpicc. Note that no source changes are necessary to compile on the three different host architectures. Although not required, each of the binaries used in this demonstration was compiled in exactly the same way, including the GUI code that actually runs only in the master MPI process. That was done to demonstrate that the same source code compiles and runs in the same way with Red Hat Enterprise Linux on each of the three architectures:

    - name: download Game of Life
      # the download module and source URL were elided here; get_url is an assumption
      get_url:
        url: "{{ source_base_url }}/{{ item }}"
        dest: /home/test/MPI-Game-of-Life
      become: yes
      become_user: test
      with_items:
        - mpi_life.c


    - name: build Game of Life
      shell: >
        cd /home/test/MPI-Game-of-Life;
        mpicc -Wall -g -DMASTER_GUI mpi_life.c
        -o mpi_life `pkg-config --cflags gtk+-2.0`
        `pkg-config --libs gtk+-2.0`

At this point, the demo is ready, and the Game of Life can be started just as seen in the video:

mpirun --hostfile hostfile MPI-Game-of-Life/mpi_life

In this demonstration, I’ve run “top” in the terminal window to show four MPI processes running on each of the three systems. You can observe the actual workload running across the testbed, as many new cell generations are being calculated and displayed during this experiment.

I hope this video piqued your interest and demonstrated how you can run HPC-focused workloads and packages, including OpenHPC, across multiple architectures with Red Hat Enterprise Linux.

Join the Red Hat Developer Program (it’s free) and get access to related cheat sheets, books, and product downloads.

Take advantage of your Red Hat Developers membership and download RHEL today at no cost.