In our previous article about Stack Clash, we covered the basics of the Stack Clash vulnerability. To summarize, an attacker first uses various means to bring the heap and stack close together. A large stack allocation is then used to "jump the stack guard." Subsequent stores into the stack may modify objects in the heap or vice versa. This, in turn, can be used by attackers to gain control over applications.
GCC has a capability (-fstack-check), which looked promising for mitigating Stack Clash attacks. This article will cover how -fstack-check works and why it is insufficient for mitigating Stack Clash attacks.
Background
GCC has a flag, -fstack-check, that is used to probe stack allocations. Carefully probing stack allocations can prevent the stack and heap from colliding and thus protect against Stack Clash attacks. So, can we use -fstack-check to prevent the stack and heap from colliding? We must understand the design/implementation decisions for -fstack-check before determining whether it's appropriate for Stack Clash mitigation.
Ada programs enable -fstack-check to detect infinite recursion and stack overflows between different threads. Detection of either condition must result in the program reporting an error (typically via a userspace signal handler).
Requiring the program to run a signal handler on error implies that enough stack is always available for the signal handler to execute. Thus, -fstack-check must probe beyond the current function's actual stack requirements to ensure stack space is available for the signal handler.
Also, the entire program is assumed to be compiled with -fstack-check (a reasonable assumption if you're writing Ada code).
The combination of those two properties is critical. Because each function probes 1-3 pages beyond its current need, any functions that are subsequently called can skip the first 1-3 pages when they probe. For example, consider this code:
extern void bar (char *);
void
foo(void)
{
  char z[8192];
  bar (z);
}
Compiling this with -O2 -fstack-check results in:
  subq $12328, %rsp
  orq $0, 4104(%rsp)
  orq $0, 8(%rsp)
  orq $0, (%rsp)
  addq $4128, %rsp
The first instruction allocates 12328 bytes of stack space (again, more than the function needs, so that there is always sufficient stack to run a userspace signal handler). At that point, we've already lost, because that allocation could jump the stack guard and subsequent stores could write into the heap. The probes also do not touch every allocated page. Finally, you'll note that the orq instructions (the probes) write to addresses beyond the currently allocated stack. This causes significant problems for critical tools such as Valgrind.
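To make the point about untouched pages concrete, here is a minimal sketch in C that maps the probes from the listing onto pages. It assumes 4 KiB pages; the function name count_unprobed_pages and the stack address used in the note below are hypothetical and chosen only for illustration, not taken from GCC.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 4096u   /* assumption: 4 KiB pages */

/* Count the pages overlapping [sp, sp + alloc_size) that no probe touches.
   sp is the stack pointer after the allocation; probe_offsets[] are byte
   offsets from sp, as in "orq $0, 4104(%rsp)". */
size_t
count_unprobed_pages(uintptr_t sp, size_t alloc_size,
                     const size_t *probe_offsets, size_t nprobes)
{
    assert(alloc_size > 0);

    uintptr_t first_page = sp / PAGE_SIZE;
    uintptr_t last_page = (sp + alloc_size - 1) / PAGE_SIZE;
    size_t unprobed = 0;

    for (uintptr_t page = first_page; page <= last_page; page++) {
        int touched = 0;
        for (size_t i = 0; i < nprobes; i++) {
            if ((sp + probe_offsets[i]) / PAGE_SIZE == page)
                touched = 1;
        }
        if (!touched)
            unprobed++;
    }
    return unprobed;
}
```

With a hypothetical page-aligned stack pointer of 0x7ffff000 on entry, the 12328-byte allocation overlaps four pages. The three probe offsets from the listing (4104, 8, and 0) touch only two of them, so the function returns 2. Under the design assumptions of -fstack-check, those remaining pages are expected to have been probed already by the caller.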
Using -fstack-check is insufficient for mixed environments
Outside the Ada world, we must assume a mixed environment. The most common scenario would be to have key libraries (e.g., glibc) provided by an OS vendor interacting with userspace code provided by an ISV or customer.
In this case, the OS vendor may have compiled the system libraries with stack checking, but the OS vendor has no control over whether or not ISVs or customers compile their code with stack checking. Let's consider what can happen in that scenario.
To begin, let's assume the customer code does not have any large stack allocations (perhaps that's why they compiled without -fstack-check). However, the customer code has a memory leak. Assume the customer code calls into one of the glibc routines that have large stacks, and that glibc was compiled with -fstack-check.
This seems like a safe combination, but it is not.
An attacker can exploit the memory leak to bring the stack and heap close (perhaps within a page) to each other, and then cause the program to call a glibc routine with a large stack. If the glibc routine were compiled with -fstack-check, then it would skip probing the first 1-3 pages (due to the design decisions/assumptions of -fstack-check). The stack pointer would now point into the heap, and stores into the stack would actually modify the heap. The stack and heap have clashed, and now there is a reasonable chance an attacker could build an exploit to gain complete control of the program.
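The scenario can be sketched as a small address-arithmetic model in C. This is an illustration under stated assumptions, not a description of what any particular glibc routine does: the function name probes_hit_guard, the guard size, and all addresses are hypothetical, and the stack is assumed to grow toward lower addresses with the heap directly below the guard.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Return 1 if any probe lands inside the guard region
   [guard_lo, guard_lo + guard_size), and 0 if every probe misses it.
   probe_depths[] are distances in bytes below the stack pointer the
   function had on entry. */
int
probes_hit_guard(uintptr_t sp_on_entry, uintptr_t guard_lo, size_t guard_size,
                 const size_t *probe_depths, size_t nprobes)
{
    for (size_t i = 0; i < nprobes; i++) {
        assert(probe_depths[i] <= sp_on_entry);
        uintptr_t addr = sp_on_entry - probe_depths[i];
        if (addr >= guard_lo && addr - guard_lo < guard_size)
            return 1;   /* the probe faults, so the overflow is detected */
    }
    return 0;           /* every probe missed the guard: it was jumped */
}
```

As an illustration, take the probe depths implied by the earlier listing (8224, 12320, and 12328 bytes below the incoming stack pointer), a single 4 KiB guard page, and an incoming stack pointer 256 bytes above the guard. No probe lands in the guard, so the function returns 0, and the new stack pointer sits below the guard in what this model treats as heap.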
More issues with -fstack-check
On some targets, the current -fstack-check implementation allocates stack space all at once, then probes at page intervals within the just allocated stack. So, what happens if the program has a signal handler installed and receives an asynchronous signal between the allocation of the stack space and probing of the pages?
In that case, the stack pointer could be pointing beyond the guard into the heap. The signal arrives and the kernel transfers control to the registered signal handler. That signal handler is then running while its stack is pointing into the heap. Thus, the attacker has clashed the stack and heap, and there's a reasonable chance they can gain control over the program.
To exploit this scenario, the signal delivery must occur at the right point (after stack allocation, but before probing). This further illustrates the attention to detail that is needed to protect systems from Stack Clash style attacks.
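The timing window can be modeled with a short C sketch that steps through a prologue one operation at a time. Everything here is an assumption made for illustration: the op structure, the function name count_unsafe_signal_windows, the addresses, and the simplification that a signal can arrive only between operations and that the handler's frame is placed at the current stack pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum op_kind { OP_ALLOC, OP_PROBE };

struct op {
    enum op_kind kind;
    size_t arg;   /* OP_ALLOC: bytes subtracted from sp;
                     OP_PROBE: byte offset from the current sp */
};

/* Count the boundaries between operations at which the stack pointer is
   already below the guard (in the modeled heap) and no probe has faulted
   yet. A signal delivered at such a boundary would run its handler with
   a stack that points into the heap. */
size_t
count_unsafe_signal_windows(uintptr_t sp, uintptr_t guard_lo,
                            size_t guard_size,
                            const struct op *ops, size_t nops)
{
    size_t windows = 0;

    for (size_t i = 0; i < nops; i++) {
        if (ops[i].kind == OP_ALLOC) {
            assert(ops[i].arg <= sp);
            sp -= ops[i].arg;
        } else {
            uintptr_t addr = sp + ops[i].arg;
            if (addr >= guard_lo && addr - guard_lo < guard_size)
                break;   /* probe faults on the guard; execution stops */
        }
        if (sp < guard_lo)
            windows++;   /* a signal here would use a stack in the heap */
    }
    return windows;
}
```

For a hypothetical prologue that allocates 16384 bytes at once and then probes at page intervals, starting 256 bytes above a 4 KiB guard page, the function reports one window: the boundary right after the allocation and before the first probe faults.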
So what should we do? Stay tuned for more details in future articles.
Last updated: April 26, 2019