The LLVM packaging team recently ran into a compiler problem. A build of the LLVM package with Clang, with link-time optimization activated, failed validation. This article steps through how we explored, identified, and ultimately fixed the problem.
The LLVM package build
Building the package with Clang on Fedora Rawhide should be an easy change at the packaging level:
diff --git a/llvm.spec b/llvm.spec
(...)
@@ -1,3 +1,5 @@
+%global toolchain clang
(...)
+BuildRequires: clang
Now, let's introduce the characters in the mystery to follow.
Disabling runtime type information
By default, LLVM compiles with the -fno-rtti option, which disables runtime type information (RTTI). According to the LLVM coding standard, RTTI is disabled to reduce code and executable size.
Yet, sometimes, a type must be associated with a unique identifier. One example involves llvm::Any, the LLVM version of std::any. A typical implementation of std::any involves typeid, as showcased by the libcxx version.
The typeid operator cannot be used with the -fno-rtti option, so we had to find an alternative. The current implementation of llvm::Any and std::any can be mocked by the following snippet, which uses the address &TypeId<MyType>::Id as a unique identifier for MyType:
template <typename T> struct TypeId { static const char Id; };
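To make the idea concrete, here is a minimal sketch of how an address-based identifier can replace typeid in a type-erased wrapper. AnyRef and demoAnyRef are hypothetical names invented for this illustration; this is not the actual llvm::Any code, only the same tag-comparison principle, and it compiles with -fno-rtti.

```cpp
#include <cassert>

// Address-based type identifier, as in the article: each instantiation
// TypeId<T> owns one static byte, and its address stands in for typeid(T).
template <typename T> struct TypeId { static const char Id; };
template <typename T> const char TypeId<T>::Id = 0;

// Hypothetical, non-owning stand-in for llvm::Any (illustration only).
class AnyRef {
  const void *Ptr = nullptr; // the referenced object
  const void *Tag = nullptr; // &TypeId<T>::Id of the referenced type

public:
  template <typename T>
  explicit AnyRef(const T &Value) : Ptr(&Value), Tag(&TypeId<T>::Id) {}

  // True when the stored tag is the tag of T.
  template <typename T> bool is() const { return Tag == &TypeId<T>::Id; }

  // Returns the object when the tags match, nullptr otherwise.
  template <typename T> const T *get() const {
    return is<T>() ? static_cast<const T *>(Ptr) : nullptr;
  }
};

// Usage example: the tag comparison replaces a typeid comparison.
inline bool demoAnyRef() {
  int Value = 42;
  AnyRef Ref(Value);
  assert(Ref.is<int>());
  assert(!Ref.is<double>());
  return Ref.get<int>() != nullptr && *Ref.get<int>() == 42;
}
```

The whole scheme relies on &TypeId<T>::Id being the same address everywhere in the program for a given T, which is the property the rest of this article examines.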
Hidden visibility
Parts of LLVM are compiled with -fvisibility=hidden. This option sets the default visibility of all symbols to hidden, which prevents them from being visible across library boundaries. Hiding symbols offers better control over which symbols a shared library exports.
What happens when the TypeId construct from the previous section is combined with hidden visibility? Let's compile two shared libraries out of the same code:
template <typename T> struct TypeId { static const char Id; };
template <typename T> const char TypeId<T>::Id = 0;
#ifdef FOO
const char* foo() { return &TypeId<int>::Id; }
#else
const char* bar() { return &TypeId<int>::Id; }
#endif
We compile two shared libraries, one with FOO defined and one without, to exercise both #ifdef branches:
> clang++ -DFOO foo.cpp -shared -o libfoo.so
> clang++ foo.cpp -shared -o libbar.so
Without hidden visibility, both libraries place our Id at the same address:
> llvm-nm -C libfoo.so
(...)
0000000000000679 V TypeId<int>::Id
> llvm-nm -C libbar.so
(...)
0000000000000679 V TypeId<int>::Id
The V in the output indicates that the symbol is a weak object, which means that when several definitions are visible, the linker picks a single one. The symbol and its address therefore stay unique across compilation units and libraries.
But when compiled with -fvisibility=hidden, the symbols are no longer weak:
> clang++ -fvisibility=hidden foo.cpp -shared -o libbar.so
> llvm-nm -C libbar.so
(...)
0000000000000629 r TypeId<int>::Id
The lowercase r next to the address means that the symbol is local and lives in a read-only data section. The symbol is not exported for dynamic linking (as we can confirm from the output of llvm-nm -D), so libfoo and libbar each keep their own copy at their own address. In short, uniqueness is not preserved.
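The consequence for an address-based type check can be imitated inside a single program. The sketch below is a simulation only: the namespaces libfoo_copy and libbar_copy, and the helper names, are hypothetical stand-ins for the private copies that each shared library receives under hidden visibility. It does not load any real library.

```cpp
#include <cassert>

// Simulation only: each namespace stands in for one shared library that
// ended up with its own private copy of TypeId<T>::Id.
namespace libfoo_copy {
template <typename T> struct TypeId { static const char Id; };
template <typename T> const char TypeId<T>::Id = 0;
// Tag that code inside "libfoo" would attach to an int.
inline const void *tagOfInt() { return &TypeId<int>::Id; }
} // namespace libfoo_copy

namespace libbar_copy {
template <typename T> struct TypeId { static const char Id; };
template <typename T> const char TypeId<T>::Id = 0;
// Tag that code inside "libbar" would expect for an int.
inline const void *tagOfInt() { return &TypeId<int>::Id; }
} // namespace libbar_copy

// A type check performed across the simulated library boundary.
inline bool tagsAgreeAcrossLibraries() {
  return libfoo_copy::tagOfInt() == libbar_copy::tagOfInt();
}

// Usage example: the check succeeds within one "library" and fails across two.
inline void demoMismatch() {
  assert(libfoo_copy::tagOfInt() == libfoo_copy::tagOfInt());
  assert(!tagsAgreeAcrossLibraries());
}
```

Both tags describe int, yet they compare unequal because they are distinct objects. A value tagged in one library would therefore fail a type check performed in the other.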
Fine-grained control over symbol visibility
A straightforward fix for the incompatibility we've uncovered is to explicitly state that TypeId::Id must always have the default visibility. We can make this change as follows:
template <typename T> struct __attribute__((visibility("default"))) TypeId { static const char Id; };
Let's check that the fix works:
> clang++ -fvisibility=hidden foo.cpp -shared -o libbar.so
> llvm-nm -C libbar.so
(...)
0000000000000659 V TypeId<int>::Id
The V for a weak symbol has returned, but that's not the end of the story.
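In a larger code base, the attribute is often wrapped in a macro so that compilers without GCC-style attributes still accept the code. The following is a sketch under that assumption; EXAMPLE_DEFAULT_VISIBILITY and typeTag are names made up for this illustration, not taken from the LLVM sources.

```cpp
#include <cassert>

// Hypothetical helper macro: expands to the GCC/Clang visibility attribute
// where it is available and to nothing elsewhere.
#if defined(__GNUC__)
#define EXAMPLE_DEFAULT_VISIBILITY __attribute__((visibility("default")))
#else
#define EXAMPLE_DEFAULT_VISIBILITY
#endif

// Same TypeId as in the article, with the attribute spelled via the macro.
template <typename T> struct EXAMPLE_DEFAULT_VISIBILITY TypeId {
  static const char Id;
};
template <typename T> const char TypeId<T>::Id = 0;

// Returns the identifier address for T.
template <typename T> const void *typeTag() { return &TypeId<T>::Id; }

// Usage example: one tag per type, stable across calls.
inline void demoTypeTag() {
  assert(typeTag<int>() == typeTag<int>());
  assert(typeTag<int>() != typeTag<long>());
}
```

The macro changes nothing about the identifier scheme itself; it only controls whether the symbol is exported when the translation unit is built with -fvisibility=hidden.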
Instead of parameterizing TypeId by int, let's parameterize it by a HiddenType class that has hidden visibility:
struct HiddenType {};
template <typename T> struct __attribute__((visibility("default"))) TypeId { static const char Id; };
template <typename T> const char TypeId<T>::Id = 0;
const char* foo() { return &TypeId<HiddenType>::Id; }
When compiling this code with -fvisibility=hidden, where does TypeId<HiddenType>::Id end up?
> clang++ -fvisibility=hidden foo.cpp -shared -o libbar.so
> llvm-nm -CD libbar.so | grep -c 'TypeId<HiddenType>::Id'
0
Fascinating! This exercise shows that a class template with default visibility, instantiated with a type of hidden visibility, ends up with hidden visibility. Indeed, flagging HiddenType with __attribute__((visibility("default"))) restores the expected behavior.
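As a sketch of that last point, the snippet below places the two cases side by side. VisibleType, hiddenTag, and visibleTag are hypothetical names introduced for comparison. The visibility difference only shows up in the symbol table of a library built with -fvisibility=hidden; inside a single program, both tags behave the same.

```cpp
#include <cassert>

// TypeId is explicitly exported, as in the article.
template <typename T> struct __attribute__((visibility("default"))) TypeId {
  static const char Id;
};
template <typename T> const char TypeId<T>::Id = 0;

// Under -fvisibility=hidden this type is hidden, which also hides
// TypeId<HiddenType>::Id despite the attribute on TypeId.
struct HiddenType {};

// Hypothetical counterpart: exporting the type argument as well lets
// TypeId<VisibleType>::Id keep its default visibility.
struct __attribute__((visibility("default"))) VisibleType {};

inline const char *hiddenTag() { return &TypeId<HiddenType>::Id; }
inline const char *visibleTag() { return &TypeId<VisibleType>::Id; }

// Usage example: within one program each type still gets its own stable tag.
inline void demoVisibleTag() {
  assert(visibleTag() == visibleTag());
  assert(visibleTag() != hiddenTag());
}
```

Building this as a shared library with -fvisibility=hidden and inspecting it with llvm-nm -CD should list only the VisibleType instantiation among the dynamic symbols.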
Where theory meets LLVM
Once we isolated the behavior described in the preceding section, we could easily provide the relevant patches in LLVM:
- Force visibility of llvm::Any to external
- Fine grain control over some symbol visibility
- Add extra check for llvm::Any::TypeId visibility
These patches fix the build issue mentioned at the beginning of the article and ensure that it won't recur, which is the kind of outcome programmers always look for.
Acknowledgments
The author would like to thank Béatrice Creusillet, Adrien Guinet, and the editorial team for their help on this article.
Last updated: August 14, 2023