
The LLVM packaging team recently ran into a compiler problem. A build of the LLVM package with Clang, with link-time optimization activated, failed validation. This article steps through how we explored, identified, and ultimately fixed the problem.

The LLVM package build

Building the LLVM package with Clang on Fedora Rawhide should be an easy task at the packaging level:

diff --git a/llvm.spec b/llvm.spec
(...)
@@ -1,3 +1,5 @@
+%global toolchain clang
(...)
+BuildRequires: clang

Now, let's introduce the characters in the mystery to follow.

Disabling runtime type information

By default, LLVM is compiled with the -fno-rtti option, which disables runtime type information (RTTI). According to the LLVM coding standard, the project disables RTTI to reduce code and executable size.

Yet, sometimes, a type must be associated with a unique identifier. One example involves llvm::Any, the LLVM version of std::any. A typical implementation of std::any involves typeid, as showcased by the libcxx version.

The typeid operator cannot be used with the -fno-rtti option, so we had to find an alternative. The current implementation of llvm::Any and std::any can be modeled by the following snippet, in which the address &TypeId<MyType>::Id serves as a unique identifier for MyType:

template <typename T> struct TypeId { static const char Id; };
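To make the idea concrete, here is a minimal sketch of how such an address can stand in for typeid when RTTI is disabled. The names typeIdOf and TinyAny are hypothetical and exist only for this illustration. This is not the actual llvm::Any implementation, and the out-of-line definition of Id is added so that the snippet links on its own.

```cpp
#include <utility>

// Each instantiation TypeId<T> owns one static byte; its address identifies T.
template <typename T> struct TypeId { static const char Id; };
template <typename T> const char TypeId<T>::Id = 0;

// Hypothetical helper (not an LLVM API): returns the identifier for T.
template <typename T> const void *typeIdOf() { return &TypeId<T>::Id; }

// Hypothetical, heavily simplified stand-in for an Any-like container.
class TinyAny {
  const void *Tag = nullptr;          // identifier of the stored type
  void *Storage = nullptr;            // heap copy of the stored value
  void (*Deleter)(void *) = nullptr;  // knows how to destroy the stored value

public:
  template <typename T>
  explicit TinyAny(T Value)
      : Tag(typeIdOf<T>()), Storage(new T(std::move(Value))),
        Deleter([](void *P) { delete static_cast<T *>(P); }) {}

  TinyAny(const TinyAny &) = delete;
  TinyAny &operator=(const TinyAny &) = delete;

  ~TinyAny() {
    if (Deleter)
      Deleter(Storage);
  }

  // True when the stored value has exactly type T (no typeid involved).
  template <typename T> bool holds() const { return Tag == typeIdOf<T>(); }

  // Returns the stored value if it has type T, or nullptr otherwise.
  template <typename T> const T *get() const {
    return holds<T>() ? static_cast<const T *>(Storage) : nullptr;
  }
};
```

For example, after TinyAny A(42), the call A.holds<int>() compares two addresses and succeeds, while A.get<double>() returns nullptr. The whole scheme relies on &TypeId<T>::Id being the same address everywhere in the program, which is exactly the property examined in the rest of this article.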

Hidden visibility

Parts of LLVM are compiled with -fvisibility=hidden. This option forces the default visibility of all symbols to be hidden, which prevents them from being visible across library boundaries. Hiding symbols offers better control over exported symbols in a shared library.

What happens when the TypeId construct from the previous section is combined with hidden visibility? Let's compile two shared libraries out of the same code:

template <typename T> struct TypeId { static const char Id; };
template <typename T> const char TypeId<T>::Id = 0;
#ifdef FOO
    const char* foo() { return &TypeId<int>::Id; }
#else
    const char* bar() { return &TypeId<int>::Id; }
#endif

We compile two binaries, one with FOO defined and one without, to exercise both #ifdef branches:

> clang++ -DFOO foo.cpp -shared -o libfoo.so
> clang++       foo.cpp -shared -o libbar.so

Without hidden visibility, both libraries place our Id at the same address:

> llvm-nm -C libfoo.so
(...)
0000000000000679 V TypeId<int>::Id
> llvm-nm -C libbar.so
(...)
0000000000000679 V TypeId<int>::Id

The V in the output indicates that the symbol is a weak object. When both libraries are loaded into the same process, every reference to the symbol is resolved to a single definition. Therefore, the symbol and its address remain unique across compilation units and libraries.

But when compiled with -fvisibility=hidden, the symbols are no longer weak:

> clang++ -fvisibility=hidden foo.cpp -shared -o libbar.so
> llvm-nm -C libbar.so
(...)
0000000000000629 r TypeId<int>::Id

The r next to the address means that the symbol is a local symbol in a read-only data section. The symbol is not exported through the dynamic symbol table (as we can confirm from the output of llvm-nm -D), so each library keeps its own copy, and the copies get different addresses in libfoo and libbar. In short, uniqueness is not preserved.
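One way to observe this at run time rather than through llvm-nm is a small driver that loads both libraries and compares the pointers returned by foo() and bar(). The following is only a sketch under several assumptions: callIdGetter and compareIds are hypothetical helper names, the mangled names _Z3foov and _Z3barv assume the Itanium C++ ABI, and foo() and bar() themselves must remain exported (for instance by giving them default visibility), which the commands above do not do when -fvisibility=hidden is passed.

```cpp
#include <dlfcn.h>
#include <cstdio>

// Loads Library, looks up MangledName, calls it as `const char *()`, and
// returns the result. Returns nullptr if loading or symbol lookup fails.
// Handles are intentionally never closed so returned pointers stay valid.
const char *callIdGetter(const char *Library, const char *MangledName) {
  // RTLD_GLOBAL makes the library's exported symbols visible to libraries
  // loaded later, so weak definitions can be shared between them.
  void *Handle = dlopen(Library, RTLD_NOW | RTLD_GLOBAL);
  if (!Handle)
    return nullptr;
  void *Symbol = dlsym(Handle, MangledName);
  if (!Symbol)
    return nullptr;
  using Getter = const char *(*)();
  return reinterpret_cast<Getter>(Symbol)();
}

// Returns 1 if both libraries report the same address for TypeId<int>::Id,
// 0 if the addresses differ, and -1 if a library or symbol is missing.
int compareIds(const char *FooLib, const char *BarLib) {
  // Assumption: Itanium ABI mangling of `const char *foo()` and `bar()`.
  const char *FromFoo = callIdGetter(FooLib, "_Z3foov");
  const char *FromBar = callIdGetter(BarLib, "_Z3barv");
  if (!FromFoo || !FromBar)
    return -1;
  std::printf("foo: %p\nbar: %p\n", static_cast<const void *>(FromFoo),
              static_cast<const void *>(FromBar));
  return FromFoo == FromBar ? 1 : 0;
}
```

Calling compareIds("./libfoo.so", "./libbar.so") from a main function gives a quick check: following the reasoning in this section, the two pointers should match when Id is a weak, exported symbol and differ when each library keeps a private copy. On older glibc versions the driver may need to be linked with -ldl.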

Fine-grained control over symbol visibility

A straightforward fix for the incompatibility we've uncovered is to explicitly state that TypeId::Id must always have the default visibility. We can make this change as follows:

template <typename T> struct __attribute__((visibility("default"))) TypeId { static const char Id; };

Let's check that the fix works:

> clang++ -fvisibility=hidden foo.cpp -shared -o libbar.so
> llvm-nm -C libbar.so
(...)
0000000000000659 V TypeId<int>::Id

The V for a weak symbol has returned, but that's not the end of the story.

Instead of parameterizing TypeId by int, let's parameterize it by a HiddenType class declared with hidden visibility:

struct HiddenType {};
template <typename T> struct __attribute__((visibility("default"))) TypeId { static const char Id; };
template <typename T> const char TypeId<T>::Id = 0;
const char* foo() { return &TypeId<HiddenType>::Id; }

When compiling this code with -fvisibility=hidden, where does TypeId<HiddenType>::Id end up?

> clang++ -fvisibility=hidden foo.cpp -shared -o libbar.so
> llvm-nm -CD libbar.so | grep -c 'TypeId<HiddenType>::Id'
0

Fascinating! This exercise shows that a class template with default visibility, instantiated with a type of hidden visibility, yields a specialization with hidden visibility. Indeed, flagging HiddenType with __attribute__((visibility("default"))) restores the expected behavior.
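The combination that restores the expected behavior can be sketched as follows. SKETCH_EXPORT and VisibleType are hypothetical names chosen for this illustration, and exporting foo() is an extra assumption made so that the function stays callable from outside the library when the file is built with -fvisibility=hidden.

```cpp
// Hypothetical macro, used only in this sketch, to request default visibility.
#define SKETCH_EXPORT __attribute__((visibility("default")))

// The type used as a template argument is explicitly given default
// visibility, so it no longer drags the specialization down to hidden.
struct SKETCH_EXPORT VisibleType {};

// The identifier holder is given default visibility as well.
template <typename T> struct SKETCH_EXPORT TypeId { static const char Id; };
template <typename T> const char TypeId<T>::Id = 0;

// Exported accessor returning the identifier of VisibleType.
SKETCH_EXPORT const char *foo() { return &TypeId<VisibleType>::Id; }
```

With both attributes in place, running llvm-nm -CD on the resulting library should list TypeId<VisibleType>::Id again. Removing the attribute from VisibleType while keeping -fvisibility=hidden brings back the situation shown above.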

Where theory meets LLVM

Once we isolated the behavior described in the preceding section, we could easily provide the relevant patches in LLVM:

These patches fix the build issue mentioned at the beginning of the article and ensure that it won't recur, which is the kind of outcome programmers always look for.

Acknowledgments

The author would like to thank Béatrice Creusillet, Adrien Guinet, and the editorial team for their help on this article.

Last updated: August 14, 2023