The Benefits of Red Hat Enterprise Linux for Real Time
To deliver the best predictability for real-time workloads, Red Hat Enterprise Linux for Real Time brings state-of-the-art determinism to the bullet-proof RHEL (Red Hat Enterprise Linux) platform. The availability of this product raises some questions, such as: Do I need a real-time operating system? What are the benefits and drawbacks of running RHEL for Real Time? This article aims to clarify how a real-time operating system can contribute to the success of your business, and which kinds of workloads and industries can benefit from RT.
Linux is the best choice for High-Performance Computing (HPC) due to years of Linux kernel optimization focused on delivering high average throughput for a vast number of different workloads. Being optimized for throughput means that the algorithms for processing data are geared towards processing the largest amount of data in the least amount of time. Examples of throughput-oriented metrics are the number of megabytes per second transferred over a network connection, or the amount of data read from or written to a storage medium in a given time. These optimizations are the basis for the success of RHEL on servers and in HPC environments.
Nevertheless, these optimizations for high throughput can be a drawback for other, more specific workloads. For example, the RHEL kernel uses a busy-wait loop to avoid scheduling overhead in some mutual exclusion methods, like spin locks and read/write locks. While busy waiting to enter a critical section of code, the waiting task delays the scheduling of other, potentially higher priority tasks on the same CPU. As a result, the higher priority task cannot be scheduled and executed until the waiting task has completed, which delays the high priority task's response.
Although this delay is acceptable for the majority of common workloads, it is not acceptable for the class of tasks whose correctness depends on meeting timing deadlines. Tasks in this class, often called real-time tasks, have strict timing constraints: an answer must be delivered within a certain time period, and a late answer is a wrong answer or a failure.
For example, processing a 30 frames per second video requires the ability to deliver one frame every 33 milliseconds. If the system fails to deliver a frame every 33 milliseconds, the video processing will be not only late, but also wrong. It is natural to think, then, that real-time means delivering a quick response to an event, and to assume that real-time can be achieved only by making the system run faster. This assumption is a misconception, however. For instance, if the above-mentioned system runs fast enough to deliver a frame every 16.6 ms (60 frames per second), the video will be reproduced twice as fast. A faster response is not the expected behavior for processing this video, so the system will deliver results that are not only early but also wrong. Hence, real-time systems are those that deliver predictable timing behavior instead of just trying to deliver faster results.
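The arithmetic behind the video example can be checked with a few lines of Python. This is only an illustrative sketch; the function names are made up for this example and are not part of any RHEL tool.

```python
def frame_period_ms(fps: float) -> float:
    """Return the time budget for one frame, in milliseconds."""
    return 1000.0 / fps


def playback_speed(target_fps: float, delivered_fps: float) -> float:
    """Return how fast the video plays relative to its intended rate."""
    return delivered_fps / target_fps


if __name__ == "__main__":
    for fps in (30, 60):
        print(f"{fps} fps: one frame every {frame_period_ms(fps):.2f} ms")
    # Delivering 60 frames per second for a 30 fps video doubles the speed.
    print(f"relative playback speed: {playback_speed(30, 60):.1f}x")
```

A playback speed different from 1.0 is a timing error in either direction, which is why predictability, not raw speed, is the goal.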
To provide an enterprise environment for real-time workloads on Linux, Red Hat offers the RHEL for Real Time product. RHEL for Real Time is composed of the RHEL kernel optimized for determinism, along with a set of integrated tuning tools, to provide state-of-the-art determinism on Linux. Deterministic timing behavior depends both on the determinism of the application's own algorithms and on the determinism of the Linux kernel in managing the system's shared resources.
RHEL and RHEL for Real Time: A Quick Comparison
Of all the resources managed by any Linux kernel, CPU time is the principal one. CPU time is shared between applications according to a scheduling algorithm that uses priorities and real-time policies to control execution. Under the fixed-priority real-time scheduler, tasks can be assigned one of 99 priorities, from 1 (lowest) to 99 (highest). The CPU is granted to the highest priority task that is able to run. When two tasks that are ready to run share the same priority, the scheduling behavior depends on the tasks' real-time policy. If the tasks are classified under the SCHED_FIFO (First In First Out) policy, the first awakened task is scheduled and runs until it blocks, yields, or finishes executing, and only then is the CPU granted to the other task. On the other hand, if two tasks that share the same priority are classified under the SCHED_RR (Round Robin) policy, each task runs only for a predetermined amount of time before relinquishing the CPU.
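The selection rules described above can be sketched as a toy single-CPU simulation. It is a simplified model, not the kernel's scheduler: all tasks are assumed to be ready at time zero, FIFO tasks never block, and the `Task` class, the `run` function, and the time slice of 2 units are hypothetical names and values chosen for illustration.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    priority: int   # 1 (lowest) .. 99 (highest)
    policy: str     # "FIFO" or "RR"
    remaining: int  # time units of work left


def run(tasks, rr_slice=2):
    """Simulate one CPU and return a list of (task name, units run)."""
    # One ready queue per priority, kept in wake-up order.
    queues = {}
    for task in tasks:
        queues.setdefault(task.priority, deque()).append(task)

    timeline = []
    while queues:
        prio = max(queues)  # highest priority with a ready task
        queue = queues[prio]
        task = queue.popleft()
        # FIFO keeps the CPU until done; RR gets at most one time slice.
        if task.policy == "FIFO":
            units = task.remaining
        else:
            units = min(rr_slice, task.remaining)
        task.remaining -= units
        timeline.append((task.name, units))
        if task.remaining > 0:
            queue.append(task)  # unfinished RR task goes to the back
        if not queue:
            del queues[prio]
    return timeline


if __name__ == "__main__":
    demo = [
        Task("low", 10, "FIFO", 3),
        Task("rr_a", 50, "RR", 3),
        Task("rr_b", 50, "RR", 3),
        Task("high", 90, "FIFO", 2),
    ]
    for name, units in run(demo):
        print(f"{name} runs for {units} unit(s)")
```

In this model the priority 90 task always runs first, the two SCHED_RR tasks at priority 50 alternate in slices, and the priority 10 task runs only when nothing else is ready.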
Although the RHEL and RHEL for Real Time kernels use the same scheduler, they differ in how much preemption they allow and in how they avoid priority inversions. The RHEL for Real Time kernel has changed kernel locking so that interrupts and preemption stay enabled for a much larger share of the time than on the RHEL kernel. Keeping interrupts and preemption enabled creates more opportunities for a scheduling decision to be made, which reduces the time between an event occurring and the thread that responds to that event being scheduled (the scheduling latency).
A priority inversion takes place any time a high priority task is forced to wait because a lower priority task holds a resource it needs. A priority inversion is said to be bounded when it is limited to the time it takes for the low priority task to complete the critical section and release the resource. An unbounded priority inversion can also happen, which means the high-priority task is further delayed for an unknown amount of time.
Bounded priority inversion is inherent to mutual exclusion methods. The RHEL for Real Time kernel reduces the length of priority inversion sections by changing the Linux preemption model and locking methods, along with fine-grained locking control in many subsystems. The RHEL kernel is susceptible to unbounded priority inversion on preemptive lock mechanisms, whereas the RHEL for Real Time kernel is not. The RHEL for Real Time kernel prevents unbounded priority inversion by implementing the priority inheritance protocol on preemptive locking methods.
These improvements, namely increased preemption and the avoidance and bounding of priority inversions, give the RHEL for Real Time kernel the predictability needed to run time-constrained real-time tasks on Linux. RHEL for Real Time is certified to deliver the needed system resources to the high priority real-time task within 150 microseconds (µs). In comparison, latencies of many milliseconds (ms) are common on the RHEL kernel, which is optimized for throughput instead of determinism. Such latencies are not acceptable for every workload, however. For example, a few milliseconds of latency can make the difference between a trade and a lost customer in the highly competitive financial services marketplace, where consistent response times are a huge competitive advantage.
To better illustrate the difference between RHEL and RHEL for Real Time, a real-time task was run on both kernels on the same hardware. The real-time task was activated periodically, once every millisecond. Each of the task's jobs must finish before the next activation. Therefore, each job has a 1 ms relative deadline. The task was the only real-time user task and ran under the FIFO scheduler with priority 95. In both cases, the same non-real-time workload was dispatched to simulate a production environment.
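A periodic task of this kind can be approximated in user space as shown below. This is a rough sketch for experimentation, not the tool used for the measurements in this article: the function name `measure_latencies` and its parameters are invented here, and Python adds overhead of its own that a dedicated measurement tool written in C would not have. The latency recorded for each activation is the difference between the time the job was expected to wake up and the time it actually did.

```python
import os
import time


def measure_latencies(activations=100, period_ns=1_000_000, realtime=False):
    """Wake up once per period and return each wake-up latency in µs."""
    if realtime:
        # Optional: request SCHED_FIFO priority 95 as in the experiment.
        # This needs root (or CAP_SYS_NICE) and a Linux system, so any
        # failure is reported and the default policy is kept.
        try:
            os.sched_setscheduler(0, os.SCHED_FIFO, os.sched_param(95))
        except (AttributeError, OSError) as err:
            print(f"could not switch to SCHED_FIFO: {err}")

    latencies_us = []
    next_activation = time.monotonic_ns() + period_ns
    for _ in range(activations):
        # Sleep until the absolute time of the next expected activation.
        delay_ns = next_activation - time.monotonic_ns()
        if delay_ns > 0:
            time.sleep(delay_ns / 1e9)
        woke_at = time.monotonic_ns()
        latencies_us.append((woke_at - next_activation) / 1000)
        # Advance by one period from the expected time, not the actual
        # one, so that a late wake-up does not shift later activations.
        next_activation += period_ns
    return latencies_us


if __name__ == "__main__":
    samples = measure_latencies(activations=1000)
    print(f"max latency: {max(samples):.1f} µs")
    print(f"avg latency: {sum(samples) / len(samples):.1f} µs")
```

Computing each activation time from the previous expected time, rather than from the moment the task actually woke up, keeps the period from drifting when a wake-up is late.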
The latency, i.e., the delay between the expected and the actual activation time of a task's job, was recorded for 10000 activations (a 10-second run), and a histogram of both systems' latencies was plotted in Figure 1:
While RHEL for Real Time showed a maximum latency of 6 µs, well within the 150 µs boundary, the RHEL kernel often exceeded this limit. In the worst case, an activation was delayed by 3391 µs, which is a serious problem. This problem is better illustrated in Figure 2:
Figure 2 shows the percentage of jobs that were served within given thresholds. While 100% of the jobs running on RHEL for Real Time were served within the 150 µs boundary, 44 RHEL jobs crossed the 150 µs barrier. As the real-time application has a 1 ms deadline, surpassing 150 µs is not in itself a failure. However, exceeding 1 ms of latency is a failure, because such a delay makes a job miss its deadline. In this 10-second experiment, 8 activations suffered latencies longer than 1 ms, each resulting in a failure. Furthermore, in the worst case of 3.3 ms latency, not only did one job miss its deadline, but the next two jobs also missed their deadlines as a side effect of the latency: a domino effect. It is important to note that there is no guarantee on maximum latency on a RHEL kernel. Hence, it is not possible to define beforehand the maximum latency that a task can suffer.
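The domino effect can be reproduced with a small model of the periodic task. The 1 ms period and the 3391 µs worst-case latency come from the experiment; the 100 µs execution time per job is an assumption made only for this sketch, since the article does not state how long each job runs, and `missed_deadlines` is a hypothetical helper name.

```python
def missed_deadlines(latency_us, period_us=1000, exec_us=100, jobs=10):
    """Return the indexes of the jobs that finish after their deadline.

    Job k is released at k * period_us and must finish before the next
    release. Only job 0 suffers the activation latency; later jobs are
    late only because the task is still busy with earlier jobs.
    exec_us is an assumed per-job execution time.
    """
    missed = []
    task_free_at = 0
    for k in range(jobs):
        release = k * period_us
        own_latency = latency_us if k == 0 else 0
        # A job starts when it is activated and the previous job is done.
        start = max(release + own_latency, task_free_at)
        finish = start + exec_us
        task_free_at = finish
        if finish > release + period_us:
            missed.append(k)
    return missed


if __name__ == "__main__":
    for latency in (6, 150, 3391):
        print(f"latency {latency} µs -> missed jobs: {missed_deadlines(latency)}")
```

Under these assumptions, a single 3391 µs latency makes the delayed job and the two jobs after it miss their deadlines, while latencies of 6 µs or 150 µs cause no misses. A longer assumed execution time would extend the chain of missed deadlines.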
To be able to deliver such guarantees, many RHEL for Real Time kernel subsystems were either tuned or modified. Some of these modifications are presented in the next section.
RHEL for Real Time Internals
On the RHEL for Real Time kernel, once a high priority task is awakened, the kernel schedules it immediately, as long as preemption is not explicitly disabled. In contrast, the RHEL kernel may delay the scheduling of the highest priority task until it reaches a point in the code where it is known that scheduling another task will not impact the system's throughput.
For instance, suppose that a low priority task is writing data to a disk when a high priority real-time task arrives. On the RHEL for Real Time kernel, this high priority task will preempt the low priority task and start to run. In contrast, the RHEL kernel may delay the scheduling of the high priority task until the low priority task finishes writing all the data to the disk. The RHEL kernel assumes that it is better to let the low priority task finish and take advantage of the cache memory already allocated for the disk operations, increasing average throughput, than to preempt the low priority task to reduce the system's latency.
Technically speaking, the two kernels differ in their preemption model. While the RHEL kernel voluntarily chooses the points at which a low priority task can be preempted, the RHEL for Real Time kernel is fully preemptible. This means that the RHEL for Real Time kernel will always preempt a low priority task to run a high priority task, with the exception of a few explicitly non-preemptible sections. These non-preemptible sections are bounded to 150 µs, however.
In order to bound the sections in which preemption and interrupts are disabled, the way the kernel handles hard and soft interrupts was modified on the RHEL for Real Time kernel. On the RHEL kernel, all interrupts are handled in a special context, the interrupt context. The interrupt context is not schedulable. Hence, once it starts, it disables process scheduling until the handler finishes. The advantage of this approach is reduced scheduling overhead, which increases system throughput, but the drawback is increased OS latency. On RHEL for Real Time kernels, all device and soft IRQ handlers were converted to preemptible kernel threads. Although the interrupt context still exists, it no longer handles the device IRQs itself; it only wakes up the threads that will handle them. This reduces the interference of the interrupt context on processes, reducing the system latency.
Although it is possible to enable threaded interrupt handlers on the RHEL kernel, that kernel still uses busy-loop mutual exclusion methods that rely on disabling preemption, which can potentially cause long latencies. In contrast, the threaded interrupt handlers in the RHEL for Real Time kernel are allowed to use another type of lock, called the real-time mutex or rt mutex.
The main change and feature of RT is the conversion of most spin locks to rt mutex locks. An rt mutex is a sleepable lock, meaning that a task blocked on an rt mutex goes to sleep while waiting for the lock to be released. Although it resembles the traditional mutex implementation, it has a feature that is fundamental for real-time.
With sleepable locks, such as the traditional kernel mutex implementation, there is a condition under which an Unbounded Priority Inversion can take place. It requires three or more threads operating at different real-time priorities, at least two of which share a lock. A low-priority thread claims the lock, and a high-priority thread then wants it but cannot run because the lock is held by the low-priority thread. The delay arises if threads with priorities between those of the low- and high-priority threads are ready to run. The scheduler (quite properly) allows these threads to run because 1) they have higher priority than the low-priority thread holding the lock and 2) the high-priority thread cannot run because it is blocked on the lock. Running the intermediate-priority threads means the low-priority thread cannot run and release the lock. The high-priority thread is thus delayed indefinitely by this Unbounded Priority Inversion.
To avoid this anomaly, the rt mutex implements the mathematically proven Priority Inheritance Protocol. The rt mutex lock has a priority-inheritance chain that allows the holder of the lock to be boosted in priority to the same level as a requester, thereby allowing the scheduler to schedule the holder so that it can make progress and release the lock. This avoidance of Unbounded Priority Inversion is the main feature of a RHEL for Real Time kernel. Without it, Unbounded Priority Inversion is possible in the kernel, and it is a main contributor to decreased real-time predictability.
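The effect of priority inheritance can be illustrated with a toy model of the three-thread scenario. It is a sketch under simplifying assumptions, not the kernel's rt mutex code: time advances in whole units on a single CPU, the low-priority thread needs 2 units inside the critical section, and the high- and medium-priority threads both become ready at time 1. The function name and the priority values 10, 50, and 90 are chosen only for illustration.

```python
def high_task_wait(priority_inheritance: bool, medium_work: int = 5) -> int:
    """Return how many time units the high-priority thread waits for the lock."""
    LOW, MEDIUM, HIGH = 10, 50, 90
    low_cs_left = 2            # units the lock holder still needs in the critical section
    medium_left = 0            # the medium-priority thread is not ready yet
    high_blocked_since = None
    now = 0
    while True:
        if now == 1:
            # The high-priority thread wakes up, finds the lock taken and
            # blocks; the medium-priority thread becomes ready as well.
            high_blocked_since = now
            medium_left = medium_work
        # With inheritance, the holder runs at the waiter's priority.
        boosted = priority_inheritance and high_blocked_since is not None
        low_prio = HIGH if boosted else LOW
        if medium_left > 0 and MEDIUM > low_prio:
            medium_left -= 1   # the medium thread preempts the lock holder
        else:
            low_cs_left -= 1   # the lock holder makes progress
            if low_cs_left == 0:
                # The lock is released at the end of this time unit.
                return now + 1 - high_blocked_since
        now += 1


if __name__ == "__main__":
    for work in (5, 50):
        without = high_task_wait(False, work)
        with_pi = high_task_wait(True, work)
        print(f"medium work {work}: wait {without} without PI, {with_pi} with PI")
```

Without inheritance, the wait grows with the amount of work the medium-priority thread has to do, which is what makes the inversion unbounded. With inheritance, the wait depends only on the length of the critical section.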
These differences allow Red Hat Enterprise Linux for Real Time to deliver the best predictability, transforming the rock-solid Red Hat Enterprise Linux into an enterprise-grade real-time operating system.
Note: The non-real-time workload used in the experiment is provided by the rteval tool, and it is the same workload used for RHEL for Real Time certification. More information about RHEL for Real Time certification can be found here.
About the author:
Daniel Bristot de Oliveira is a Senior Software Maintenance Engineer at Red Hat, working on Red Hat Enterprise Linux for Real Time, mainly with the real-time kernel. He is also a Ph.D. student at the Federal University of Santa Catarina, in Brazil, where he researches real-time operating systems and real-time Linux. He is a reviewer of the IEEE Latin America Transaction Magazine.