Following a discussion on the Linux Kernel Mailing List, and further discussion about the design, we added a new feature to the extended inline assembly supported by GCC.
The problem was that there was no way to tell GCC that the inline assembly produces useful information in the eflags register. To work around this, programs must either copy the data from the eflags register to a general register, or re-test whatever condition was contained within the flags.
For instance, the Linux kernel has the following function:
    int variable_test_bit(long n, volatile const unsigned long *addr)
    {
        int oldbit;

        asm volatile("bt %2,%1\n\t"
                     "sbb %0,%0"
                     : "=r" (oldbit)
                     : "m" (*(unsigned long *)addr), "Ir" (n));
        return oldbit;
    }
Here, the bt instruction tests whether the nth bit beginning at addr is set; it copies that bit to the carry bit in eflags. The sbb instruction then copies the carry bit to oldbit via borrow out of the subtract.
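For readers less familiar with these two instructions, their combined effect can be approximated in portable C. This is only an illustrative model, not kernel code: the name test_bit_model is hypothetical, and it assumes n is non-negative (the real bt instruction also accepts negative bit offsets with a memory operand).

```c
#include <limits.h>

/* Illustrative model of the bt/sbb pair (hypothetical helper, not kernel code).
   Assumes n >= 0. */
int test_bit_model(long n, const volatile unsigned long *addr)
{
    const unsigned long bits_per_long = sizeof(unsigned long) * CHAR_BIT;

    /* bt: pick the word that holds bit n, then isolate that bit.
       On real hardware this bit lands in the carry flag. */
    unsigned long word = addr[(unsigned long)n / bits_per_long];
    int carry = (int)((word >> ((unsigned long)n % bits_per_long)) & 1UL);

    /* sbb %0,%0 computes x - x - carry, so the result is 0 or -1. */
    return -carry;
}
```

Note that the sbb idiom yields 0 or -1 rather than 0 or 1, so callers should only test the result for zero or nonzero.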
The inefficiency comes when the surrounding code goes to use that result. For instance,

    if (variable_test_bit(n, addr))
        do_stuff();

compiles to
    // eax = n, edx = addr, ecx = oldbit
        bt   %eax, (%edx)
        sbb  %ecx, %ecx
        test %ecx, %ecx
        jz   .L1
        call do_stuff
    .L1:
which is two more instructions than is ideal.
With the new asm flags feature, this function can be rewritten as
    int variable_test_bit(long n, volatile const unsigned long *addr)
    {
        int oldbit;

        asm volatile("bt %2,%1"
                     : "=@ccc" (oldbit)
                     : "m" (*(unsigned long *)addr), "Ir" (n));
        return oldbit;
    }
This tells GCC that oldbit is a boolean value that can be formed by testing the carry bit of the eflags register. Now GCC can optimize the previous fragment to
        bt   %eax, (%edx)
        jnc  .L1
        call do_stuff
    .L1:
All of the conditions described in the i386 manual for conditional branches are supported:
    a       “above” or unsigned greater than
    ae      “above or equal” or unsigned greater than or equal
    b       “below” or unsigned less than
    be      “below or equal” or unsigned less than or equal
    c       carry flag set
    e, z    “equal” or zero flag set
    g       signed greater than
    ge      signed greater than or equal
    l       signed less than
    le      signed less than or equal
    o       overflow flag set
    p       parity flag set
    s       sign flag set
    na, nae, nb, nbe, nc, ne, ng, nge, nl, nle, no, np, ns, nz
            “not” flag, or inverted versions of those above
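As a rough illustration of how the unsigned and signed conditions differ, the sketch below reads the "a" and "l" conditions after a cmp instruction. The function names unsigned_above and signed_less are hypothetical, the code uses __asm__ so that it also builds in strict ISO modes, and it falls back to plain C when the compiler or target does not provide flag outputs.

```c
/* Hypothetical helpers showing two of the condition codes listed above. */

/* Nonzero when a > b, comparing as unsigned values. */
int unsigned_above(unsigned long a, unsigned long b)
{
#if defined(__GCC_ASM_FLAG_OUTPUTS__) && (defined(__i386__) || defined(__x86_64__))
    int above;
    /* AT&T syntax: cmp %2,%1 computes %1 - %2 and sets the flags.
       "=@cca" reads the "above" condition (CF = 0 and ZF = 0). */
    __asm__("cmp %2,%1" : "=@cca" (above) : "r" (a), "r" (b));
    return above;
#else
    return a > b;   /* plain C fallback */
#endif
}

/* Nonzero when a < b, comparing as signed values. */
int signed_less(long a, long b)
{
#if defined(__GCC_ASM_FLAG_OUTPUTS__) && (defined(__i386__) || defined(__x86_64__))
    int less;
    /* Same cmp, but "=@ccl" reads the signed "less" condition (SF != OF). */
    __asm__("cmp %2,%1" : "=@ccl" (less) : "r" (a), "r" (b));
    return less;
#else
    return a < b;   /* plain C fallback */
#endif
}
```

Neither asm statement lists "cc" as a clobber, since the flags are the output here.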
The presence of the asm flags feature may be detected by the built-in preprocessor define __GCC_ASM_FLAG_OUTPUTS__. It is being used in mainline MinGW-w64, and while it isn't yet used in the Linux kernel, a patch has been submitted.
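One way the define might be used is to select between the old and new forms at compile time. The following is a minimal sketch under stated assumptions, not the patch submitted to the kernel: the name test_bit_compat is hypothetical, the bit index uses a plain "r" constraint instead of "Ir" to keep the sketch simple, and the sbb result is normalized so that both paths return 0 or 1.

```c
#include <limits.h>

/* Hypothetical wrapper: uses flag outputs when available, the sbb form
   on older x86 compilers, and plain C on other targets. */
int test_bit_compat(long n, const volatile unsigned long *addr)
{
#if defined(__i386__) || defined(__x86_64__)
    int oldbit;
# ifdef __GCC_ASM_FLAG_OUTPUTS__
    /* New form: GCC reads the carry flag directly. */
    __asm__ volatile("bt %2,%1"
                     : "=@ccc" (oldbit)
                     : "m" (*(unsigned long *)addr), "r" (n));
    return oldbit;
# else
    /* Old form: materialize the carry flag with sbb (0 or -1). */
    __asm__ volatile("bt %2,%1\n\t"
                     "sbb %0,%0"
                     : "=r" (oldbit)
                     : "m" (*(unsigned long *)addr), "r" (n));
    return oldbit != 0;
# endif
#else
    /* Non-x86 fallback, assuming n >= 0. */
    const unsigned long bits_per_long = sizeof(unsigned long) * CHAR_BIT;
    unsigned long word = addr[(unsigned long)n / bits_per_long];
    return (int)((word >> ((unsigned long)n % bits_per_long)) & 1UL);
#endif
}
```

Callers can then write if (test_bit_compat(n, addr)) without caring which path was compiled in.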
GCC 6 has not yet been released, but is expected within the first half of 2016.