In this article, I want to provide some background on our recently developed demonstration video, “Running Game of Life across multiple architectures with Red Hat Enterprise Linux”.
This video shows the Game of Life running in a heterogeneous environment using three 64-bit hardware architectures: aarch64 (ARMv8-A), ppc64le (IBM Power little endian) and x86_64 (Intel Xeon). If you are not familiar with the rules of this cellular automaton, they are worth checking out via the reference above.
In this multi-architecture demonstration, all systems were located in different test labs, were running Red Hat Enterprise Linux and used Open MPI over Gigabit Ethernet to communicate among themselves. The same program source code was compiled for each of the three architectures and the respective binaries were run on remote systems.
Since this demo was built as a proof-of-concept, the inter-system communications to remote system labs via Open MPI were expected to be rather slow. In addition, variable latency visible in the video was introduced by the remote GUI display.
These Game of Life calculations used a total of 12 MPI processes, with 4 processes running on each physical system. The Life matrix size was determined by the size of the GUI display. For each new generation, the master MPI process sends the matrix dimensions to the slave MPI processes, followed by several rows of the current generation matrix. This information is sufficient for a particular slave process to compute and send back the new generation of the rows assigned to it.
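The row-based split described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the code of mpi_life.c: there is no MPI here, the function names (`split_rows`, `rows_for_worker`, `next_generation`, `step`) are hypothetical, and it assumes that each worker receives its own rows plus one neighbouring row above and below, and that cells outside the matrix count as dead.

```python
from typing import List

Grid = List[List[int]]  # 1 = live cell, 0 = dead cell


def split_rows(n_rows: int, n_workers: int) -> List[range]:
    """Divide row indices 0..n_rows-1 into contiguous, nearly equal bands."""
    base, extra = divmod(n_rows, n_workers)
    bands, start = [], 0
    for w in range(n_workers):
        size = base + (1 if w < extra else 0)
        bands.append(range(start, start + size))
        start += size
    return bands


def rows_for_worker(grid: Grid, band: range) -> Grid:
    """Rows a master could send: the band plus one neighbour row on each side.

    Assumption of this sketch: rows outside the matrix are all dead.
    """
    dead = [0] * len(grid[0])
    above = grid[band.start - 1] if band.start > 0 else dead
    below = grid[band.stop] if band.stop < len(grid) else dead
    return [above] + [grid[r] for r in band] + [below]


def next_generation(rows: Grid) -> Grid:
    """Work of one worker: new state of every row except the two border rows."""
    width = len(rows[0])
    result = []
    for r in range(1, len(rows) - 1):
        new_row = []
        for c in range(width):
            # Count the eight neighbours, skipping columns outside the matrix.
            neighbours = sum(
                rows[r + dr][c + dc]
                for dr in (-1, 0, 1)
                for dc in (-1, 0, 1)
                if (dr, dc) != (0, 0) and 0 <= c + dc < width
            )
            alive = rows[r][c] == 1
            new_row.append(1 if neighbours == 3 or (alive and neighbours == 2) else 0)
        result.append(new_row)
    return result


def step(grid: Grid, n_workers: int) -> Grid:
    """One generation: split, compute each band independently, reassemble in order."""
    new_grid: Grid = []
    for band in split_rows(len(grid), n_workers):
        new_grid.extend(next_generation(rows_for_worker(grid, band)))
    return new_grid
```

Because every band carries its own border rows, the result does not depend on how many workers share the matrix: `step(grid, 1)` and `step(grid, 11)` return the same generation.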
Interestingly, the Open MPI tools and libraries used in this demonstration came from the OpenHPC repository (using the 1.3.1 branch). Red Hat recently contributed changes to the OpenHPC build procedures to simplify the process of specifying alternate or newer compilers for the build process. The OpenHPC packages were built for all involved architectures (aarch64, ppc64le and x86_64). This is the first-ever proof-of-concept demonstration of an OpenHPC-based software stack running on top of Red Hat Enterprise Linux.
In addition, Red Hat gives developers access to the latest stable open source C and C++ compilers, along with complementary development and performance profiling tools, via Red Hat Developer Toolset. It is now available across multiple architectures with the following Red Hat Enterprise Linux subscriptions:
- Red Hat Enterprise Linux on x86 systems (Intel and AMD)
- Red Hat Enterprise Linux for IBM Power
- Red Hat Enterprise Linux for IBM z Systems
- Red Hat Enterprise Linux Server for ARM Developer Preview
Finally, I used Ansible for easier and faster installation across the three systems. Thanks to Ansible’s system-dependent variables, the same playbook was used for all systems.
The first step is to list all involved systems in the Ansible hosts file:
[game]
aarch64.system.name
ppc64le.system.name
x86_64.system.name
This is the group of hosts used in the following Ansible playbook. In the first part of the playbook, the necessary repositories are defined and the needed packages are installed:
- name: game
  hosts: game
  user: root
  tasks:
    - name: copy OpenHPC repository file
      copy:
        src: files/ohpc/ohpc-{{ ansible_machine }}.repo
        dest: /etc/yum.repos.d/ohpc-{{ ansible_machine }}.repo
    - name: install OpenHPC packages
      action: package name={{ item }} state=latest
      with_items:
        - lmod-ohpc
        - openmpi-gnu-ohpc
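To illustrate how one playbook can serve all three architectures, here is a small Python sketch of the substitution that `{{ ansible_machine }}` performs in the copy task. The fact holds the machine hardware name of the target (what `uname -m` prints); the helper name `repo_paths` is hypothetical and exists only for this example.

```python
import platform
from typing import Tuple


def repo_paths(machine: str) -> Tuple[str, str]:
    """Return (src, dest) of the OpenHPC repo file for a given machine name."""
    name = f"ohpc-{machine}.repo"
    return (f"files/ohpc/{name}", f"/etc/yum.repos.d/{name}")


# The three architectures used in the demo each resolve to their own repo file.
for machine in ("aarch64", "ppc64le", "x86_64"):
    print(repo_paths(machine))

# On the local system, the machine name can be read in a similar way.
print(platform.machine())
```

The only per-architecture input is the repo file itself, so nothing in the playbook needs a conditional.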
After Ansible installed the necessary packages, the next step is to make sure that the compiler and MPI are loaded:
    - name: automatically load OpenHPC compiler and Open MPI
      copy:
        dest: /etc/profile.d/ohpc.sh
        content: "module load gnu openmpi\n"
The next step is to create a hostfile for Open MPI. It should contain names of systems involved in the calculation and how many processes should be spawned on each of them:
    - name: create Open MPI hostfile
      copy:
        dest: /home/test/hostfile
        content: "{% for host in groups['game']%}
          {{ host }} slots=4\n{% endfor %}"
      become: yes
      become_user: test
Initially, I wanted to use ansible_processor_vcpus, but there seems to be a bug somewhere in Ansible that makes it report the wrong number of CPUs on aarch64. Because of that, I hardcoded the number of slots available to Open MPI to 4, which the template above applies to every host.
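As a rough check of what the hostfile template produces, the following Python sketch renders the same content for the three inventory hosts and adds up the slots. The function names `render_hostfile` and `total_slots` are hypothetical, and counting a host without a `slots=` entry as one slot is an assumption of this sketch rather than a statement about Open MPI.

```python
from typing import List


def render_hostfile(hosts: List[str], slots: int = 4) -> str:
    """Produce one 'hostname slots=N' line per host, like the template above."""
    return "".join(f"{host} slots={slots}\n" for host in hosts)


def total_slots(hostfile_text: str) -> int:
    """Sum the slots= values over all non-empty, non-comment lines."""
    total = 0
    for line in hostfile_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and padding
        if not line:
            continue
        slots = 1  # assumption of this sketch when no slots= field is present
        for field in line.split()[1:]:
            if field.startswith("slots="):
                slots = int(field.split("=", 1)[1])
        total += slots
    return total


hosts = ["aarch64.system.name", "ppc64le.system.name", "x86_64.system.name"]
text = render_hostfile(hosts)
print(text, end="")
print(total_slots(text))
```

Three hosts with four slots each give the 12 MPI processes mentioned earlier.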
The last step of my Ansible playbook is the download and compilation of the actual Game of Life implementation using mpicc. Note that no source changes are necessary to compile on the three different host architectures. Although not required, each of the binaries used in this demonstration was compiled in exactly the same way, including the GUI code that actually runs only in the master MPI process. That was done to demonstrate that the same source code compiles and runs in the same way with Red Hat Enterprise Linux on each of the three architectures:
    - name: download Game of Life
      get_url:
        url: http://some.host/life/{{item}}
        dest: /home/test/MPI-Game-of-Life
      become: yes
      become_user: test
      with_items:
        - mpi_life.c
    - name: build Game of Life
      shell: "cd /home/test/MPI-Game-of-Life;
        mpicc -Wall -g -DMASTER_GUI mpi_life.c
        -o mpi_life `pkg-config --cflags gtk+-2.0`
        `pkg-config --libs gtk+-2.0`"
At this point, the demo is ready to run, and the Game of Life can be started just as seen in the video:
mpirun --hostfile hostfile MPI-Game-of-Life/mpi_life
In this demonstration, I ran “top” in the terminal window to show four MPI processes running on each of the three systems. You can observe the actual workload running across the testbed, as many new cell generations are being calculated and displayed during this experiment.
I hope this video piqued your interest and demonstrated how you can run HPC-focused workloads and packages, including OpenHPC, across multiple architectures with Red Hat Enterprise Linux.