Debugging software is something akin to an art form but, regardless of the approach you prefer, having good information on what's happening in your application is key.
ltrace is one tool you may wish to add to your belt: a debugging tool that attaches to a running process and prints to the terminal or a log file the library calls and/or system calls made by that process. In both its mode of operation and command line interface, ltrace is similar to strace. strace only works for system calls, however; ltrace has no such restriction.
ltrace is useful for roughly pinning down program crashes. It puts things into context, providing the sequence of events that led to the crash. It's also a useful tool for introspection: for figuring out what the application does, what files it reads, what environment variables it inspects, and whether it makes certain calls. One could use ltrace to figure out whether some plugin is called, for example, and if so, with what result, or to check whether the application responds to configuration the expected way.
As a small example, let's take a look at what library calls a simple invocation of "echo" entails:
$ ltrace -n2 echo
__libc_start_main([ "echo" ] <unfinished ...>
  getenv("POSIXLY_CORRECT") = nil
  strrchr("echo", '/') = nil
  setlocale(LC_ALL, "") = "cs_CZ.UTF-8"
  bindtextdomain("coreutils", "/usr/share/locale") = "/usr/share/locale"
  textdomain("coreutils") = "coreutils"
  __cxa_atexit(0x402310, 0, 0, 0x736c6974756572) = 0
  __overflow(0x7f3d476b18e0, 10, 0x7f3d476b25f0, 0) = 10
  exit(0 <unfinished ...>
    __fpending(0x7f3d476b18e0, 0, 0x7f3d476b25f0, 4) = 0
    fclose(0x7f3d476b18e0) = 0
    __fpending(0x7f3d476b1800, 0, 0x7f3d476b3150, 0) = 0
    fclose(0x7f3d476b1800) = 0
+++ exited (status 0) +++
ltrace uses the Linux kernel interface called ptrace, which allows one process to meddle with another process's innards. ptrace provides valuable service to both strace and ltrace in that it can report when a traced process exits, forks or clones, or performs a system call. Library calls, however, take place purely in user space, so ptrace can't report them directly. Instead, ltrace leverages two tools to achieve that end. The first is the mechanism used to implement calls to shared libraries: procedure linkage tables, or PLTs. Ian Lance Taylor published a good treatment of the way dynamic linking works; for our purposes, the important point is that inter-library calls aren't direct, but are instead routed via the PLT (each module, meaning the main binary and any shared library, has one).
The second tool is breakpoints. ltrace loads the ELF file corresponding to each module, finds its PLT, and puts a breakpoint on each entry. If a program runs into one of these breakpoints, it stops, and ltrace is notified by ptrace. From the program counter, ltrace can deduce which PLT entry was hit, and from that which function was called.
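The breakpoint bookkeeping described above can be sketched roughly as follows. This is a simplified simulation, not ltrace's actual code: the PltTracer class, the memory byte array and the PLT addresses are all hypothetical, and the sketch assumes x86, where a software breakpoint is the one-byte int3 opcode (0xCC) and the program counter reported after the stop points one byte past it.

```python
INT3 = 0xCC  # x86 one-byte software breakpoint opcode


class PltTracer:
    """Toy model of PLT breakpoint bookkeeping (hypothetical, not ltrace code)."""

    def __init__(self, memory, plt_entries):
        self.memory = memory                  # simulated process memory
        self.plt_entries = dict(plt_entries)  # PLT entry address -> symbol name
        self.saved = {}                       # original bytes under breakpoints

    def insert_breakpoints(self):
        # Remember the original byte of each PLT entry, then overwrite it.
        for addr in self.plt_entries:
            self.saved[addr] = self.memory[addr]
            self.memory[addr] = INT3

    def on_stop(self, pc):
        # Assumption: x86 reports a PC one byte past the int3 that was hit.
        return self.plt_entries.get(pc - 1)


memory = bytearray(b"\x90" * 64)  # pretend code, filled with NOPs
tracer = PltTracer(memory, {16: "getenv", 32: "strrchr"})
tracer.insert_breakpoints()
print(tracer.on_stop(17))  # the stop reported for the entry at address 16
```

A real tracer would also have to restore the saved byte and single-step over it before letting the process continue, which is one of the details the sketch leaves out.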
This essentially simple abstract view of
ltrace operation is complicated by a number of details, but the fundamental mode of operation has remained as stated above since before Red Hat Enterprise Linux.
When printing out a call, it would be very nice to print its arguments as well (just like strace does). The set of system calls is finite, fairly constant and, at least on Linux, very stable, so it's possible to encode the knowledge of those into strace itself. But there is an essentially infinite number of library calls, so that approach can't work for ltrace.
One possibility is to decode argument types from the DWARF debug information. DWARF isn't always available however, and it isn't always practical to install either. Besides, DWARF won't tell us whether a char* is a character pointer or a string, that an int-typed member somewhere happens to be a conceptual enum, or which pointers are output arguments. DWARF support would still be nice to have as a backup plan, and indeed work is being done upstream to add that support, but it can't be the full story.
ltrace supports something called prototype libraries. Those are plain-text files with descriptions of function prototypes. Each prototype library is named after the shared library that it describes. For example, ltrace itself ships prototype libraries such as libacl.so.conf. These files are stored in the directory /usr/share/ltrace, and any library developer can ship such files with their library. Apart from /usr/share/ltrace/, there are a number of other places that ltrace looks into, among them $HOME/.ltrace.conf. A user can also specify a location or a prototype library to load using the -F command line switch. This is all described in more detail in ltrace(1).
A prototype library definition could look like this:
# From actual libm.so.conf:
void sincos(double, +double*, double*);

# From nascent libgo.so.0.conf:
typedef __go_string = struct(string(array(char, elt2)*), hide(int));
ulong __go_type_hash_string(__go_string*, hide(ulong));
For simple functions, the syntax may seem (almost) like that of C, but the latter, more involved example shows that it's not C at all. The syntax is described in detail in ltrace.conf(5).
With this information in hand,
ltrace still needs to figure out where the parameter values actually reside (e.g. in registers and which ones, on stack and where exactly). For this it has backends that know about parameter passing conventions on a given architecture.
If a prototype couldn't be discovered, ltrace assumes it's a function with four long parameters, returning a long. For example, the __fpending calls in the example above clearly use this default prototype.
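As a rough illustration of the last two points, the sketch below fetches the arguments of the default prototype and renders a call line. It is a hedged model, not ltrace's implementation: it assumes the x86-64 System V calling convention (the first integer arguments travel in rdi, rsi, rdx and rcx), the register values are made up to reproduce one line of the echo example, and the rule of printing small values in decimal and large ones in hexadecimal is an assumption chosen to resemble that output.

```python
# x86-64 System V ABI: integer/pointer arguments go in these registers, in order.
SYSV_INT_ARG_REGS = ["rdi", "rsi", "rdx", "rcx", "r8", "r9"]
DEFAULT_ARG_COUNT = 4  # default prototype: four long parameters


def fetch_default_args(regs):
    """Read the four 'long' arguments of the default prototype from registers."""
    return [regs[name] for name in SYSV_INT_ARG_REGS[:DEFAULT_ARG_COUNT]]


def format_long(value):
    # Assumed display rule: small magnitudes in decimal, the rest in hex.
    return str(value) if -1000000 < value < 1000000 else hex(value)


def format_call(symbol, regs, retval):
    """Render 'symbol(arg, arg, arg, arg) = retval' for the default prototype."""
    args = ", ".join(format_long(v) for v in fetch_default_args(regs))
    return f"{symbol}({args}) = {format_long(retval)}"


# Made-up register snapshot taken at the entry to __fpending.
regs = {"rdi": 0x7f3d476b18e0, "rsi": 0, "rdx": 0x7f3d476b25f0,
        "rcx": 4, "r8": 0, "r9": 0}
print(format_call("__fpending", regs, 0))
```

With a proper prototype the backend would instead know that __fpending takes a single pointer, and the three trailing register values (which are just leftovers) would not be printed.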
Today, an average application is linked to dozens of shared libraries, and not all of them are necessarily interesting. The overhead imposed by context switching due to
ptrace is fairly significant, so it is best if the set of traced symbols is as small as possible. So
ltrace makes it possible to select what symbols a user wishes to trace. There are two options for filtering PLT slots:
If you are interested in calls related to a certain shared library, use -l with the SONAME of that library as the argument. For example, date is linked to librt.so.0. We can request tracing of calls into that library like this:
$ ltrace -l librt* date
date->clock_gettime(0, 0x7fff0d00a090, 0x7f80980c21b4, 0) = 0
St čen 4 16:26:59 CEST 2014
+++ exited (status 0) +++
Note that, as before, ltrace doesn't really know whether the call to clock_gettime ends up being serviced by a routine in librt.so.0. The call can end up somewhere else due to symbol interposition, e.g. if you use LD_PRELOAD. The important thing is that the function being called is one exported from the library selected with -l.
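Symbol interposition can be pictured with a small model. The sketch below is a deliberate simplification of dynamic symbol lookup: it assumes that modules are searched in a fixed order, with preloaded libraries placed right after the main binary, and that the first module exporting the name wins. The export sets and the preloaded library libpreload_demo.so are hypothetical.

```python
def resolve(symbol, search_order, exports):
    """Return the first module in search order that exports the symbol."""
    for module in search_order:
        if symbol in exports.get(module, set()):
            return module
    return None


# Hypothetical export tables, keyed by module name.
exports = {
    "libpreload_demo.so": {"clock_gettime"},  # made-up LD_PRELOAD library
    "librt.so.0": {"clock_gettime", "timer_create"},
    "libc.so.6": {"free", "malloc"},
}

normal_order = ["date", "librt.so.0", "libc.so.6"]
preload_order = ["date", "libpreload_demo.so", "librt.so.0", "libc.so.6"]

print(resolve("clock_gettime", normal_order, exports))
print(resolve("clock_gettime", preload_order, exports))
```

In both runs the call leaves date through the same PLT slot, which is why the -l filter selects it either way even though a different module ends up servicing it.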
The other filtering option is
-e, and it makes it possible to explicitly control which PLT slots of which PLT tables get selected. For example, to trace all calls to a function named "free" (i.e. to select "free" slots in all PLT tables), one would do:
$ ltrace -e free ls
libc.so.6->free(0x1de2030) = <void>
libc.so.6->free(nil) = <void>
libc.so.6->free(0x1de7c30) = <void>
ls->free(0x1de7c10) = <void>
ls->free(nil) = <void>
ls->free(0x1de7be0) = <void>
libselinux.so.1->free(0x1de2010) = <void>
libselinux.so.1->free(nil) = <void>
libselinux.so.1->free(nil) = <void>
libselinux.so.1->free(nil) = <void>
[... snip ...]
On the other hand, to show only calls from
libselinux.so.1 (i.e. to select all slots, but only in the PLT table of
libselinux.so.1), one would do:
$ ltrace -e '@libselinux*' ls
libselinux.so.1->free(0x1a62010) = <void>
libselinux.so.1->free(nil) = <void>
libselinux.so.1->free(nil) = <void>
[...]
libselinux.so.1->free(nil) = <void>
libselinux.so.1->__cxa_finalize([...]) = 0x7f5d8c0cb9d0
+++ exited (status 0) +++
Both of the above syntaxes can be used together, e.g.
The selector syntax is actually richer—above we've used globs, but regular expressions can be used as well. One can also construct the set of traced symbols incrementally, e.g.
'f*-foo*' denotes that
ltrace should trace calls to all functions that start with "f" except those that start with "foo". This is all described in ltrace(1).
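The way such a chain of rules selects symbols can be modeled in a few lines. The sketch below is a simplified approximation, not ltrace's parser: it assumes each rule consists of an optional sign, a symbol glob and an optional @ part naming the module whose PLT is selected, that an unsigned first rule adds to the set, and that the last matching rule wins. It handles globs only (no regular expressions), and its naive splitting would misparse patterns that themselves contain + or -. The function names are made up for illustration.

```python
import fnmatch
import re


def parse_filter(expr):
    """Split e.g. 'f*-foo*' into [('+', 'f*', '*'), ('-', 'foo*', '*')]."""
    rules = []
    # Each rule is an optional sign followed by text up to the next sign.
    for sign, body in re.findall(r"([+-]?)([^+-]+)", expr):
        symbol, _, module = body.partition("@")
        # A missing symbol or module pattern is treated as "match anything".
        rules.append((sign or "+", symbol or "*", module or "*"))
    return rules


def is_traced(rules, symbol, caller_module):
    """Apply the rules left to right; the last matching rule decides."""
    traced = False
    for sign, symbol_glob, module_glob in rules:
        if (fnmatch.fnmatchcase(symbol, symbol_glob)
                and fnmatch.fnmatchcase(caller_module, module_glob)):
            traced = (sign == "+")
    return traced


rules = parse_filter("f*-foo*")
for name in ("free", "fclose", "foobar", "malloc"):
    print(name, is_traced(rules, name, "ls"))
```

The same two helpers also cover the earlier examples: 'free' selects that one slot in every module, while '@libselinux*' selects every slot, but only in modules whose name matches the glob.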
Tracing Symbol Entry Points
There's one additional filtering option, and that is -x. It works very similarly to -e, except that it applies not to PLT slots but directly to symbols in the ELF symbol table, i.e. to the entry points of those functions. To illustrate the difference, consider this example:
$ ltrace -e textdomain -x textdomain ls
ls->textdomain("coreutils" <unfinished ...>
textdomain@libc.so.6("coreutils") = "coreutils"
<... textdomain resumed> ) = "coreutils"
The first message shows that a call to textdomain was made from the main binary. The second shows that the call actually landed in
libc.so.6. If there were more libraries that implement textdomain, this would show us which of them actually services the symbol.
Tracing System Calls
While the main task of
ltrace is showing function calls (whether they are local or library calls), it can trace system calls as well. The command line switch that enables this is
-S. You can also use
ltrace to trace purely the system calls by additionally passing
-L, which switches off tracing of library calls:
$ ltrace -S -L date >/dev/null
brk@SYS(nil) = 0xd68000
mmap@SYS(nil, 4096, 3, 34, -1, 0) = 0x7ffb1b4ac000
access@SYS("/etc/ld.so.preload", 04) = -2
open@SYS("/etc/ld.so.cache", 0, 01) = 3
[...]
write@SYS(1, "Wed Jun 18 11:27:45 CEST 2014\n", 30) = 30
Wed Jun 18 11:27:45 CEST 2014
close@SYS(1) = 0
munmap@SYS(0x7ffb1b4ab000, 4096) = 0
close@SYS(2) = 0
exit_group@SYS(0 <no return ...>
+++ exited (status 0) +++
ltrace doesn't have the same introspective power as
strace when it comes to system call parameters—e.g. it can't factor out individual flags from a bitfield, doesn't show errno identifiers for negative return values, and possibly more.
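To make the difference concrete, here is a hedged sketch of the kind of decoding that strace performs and ltrace does not, applied to raw values from the trace above (the 3 and 34 passed to mmap, and the -2 returned by access). The flag tables hold Linux x86-64 values for a handful of mmap constants only, and the helper names are made up for illustration.

```python
import errno

# A few Linux mmap constants (values as on x86-64; not an exhaustive list).
PROT_FLAGS = {0x1: "PROT_READ", 0x2: "PROT_WRITE", 0x4: "PROT_EXEC"}
MAP_FLAGS = {0x01: "MAP_SHARED", 0x02: "MAP_PRIVATE",
             0x10: "MAP_FIXED", 0x20: "MAP_ANONYMOUS"}


def decode_flags(value, table):
    """Render a bitfield as FLAG_A|FLAG_B, keeping unknown bits as hex."""
    names = []
    for bit, name in sorted(table.items()):
        if value & bit:
            names.append(name)
            value &= ~bit  # clear the bit so only unknown bits remain
    if value:
        names.append(hex(value))
    return "|".join(names) if names else "0"


def decode_errno(retval):
    """Map a negative raw syscall return value to its errno identifier."""
    if retval < 0:
        return errno.errorcode.get(-retval, str(retval))
    return None


print(decode_flags(3, PROT_FLAGS))   # third argument of the mmap call
print(decode_flags(34, MAP_FLAGS))   # fourth argument of the mmap call
print(decode_errno(-2))              # return value of the access call
```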
System call prototypes are described in syscalls.conf, which is shipped with ltrace. The above-mentioned switch -F can be used to point at a directory with an alternative syscalls.conf that ltrace will use instead of the system one, so it's possible to tweak and tune the prototypes.
What's New in Red Hat Enterprise Linux 7?
Between Red Hat Enterprise Linux releases 6 and 7,
ltrace has seen many changes and improvements to its core functionality. Red Hat also hopes to include an updated version of
ltrace as part of a future release of Red Hat Developer Toolset, which should bring newly-capable
ltrace to Red Hat Enterprise Linux 6 developers as well.
The changes are in particular:
- Calls made from shared libraries couldn't be traced by the ltrace shipped with Red Hat Enterprise Linux 6. Limited support for tracing calls into libraries opened with dlopen was present, but in Red Hat Enterprise Linux 7, the filtering options such as -l all work consistently irrespective of whether the module in question is the main binary, a linked-in shared library or a shared library opened with dlopen.
- Support for calling conventions was vastly improved. Previous versions couldn't handle passing structures by value, passing double on 32-bit architectures and many other edge cases. The whole parameter-fetching logic was rewritten from scratch. One remaining limitation is lack of support for "long long".
- Richer configuration language. It's possible to express NULL or NUL-terminated arrays, to hide certain parameters, convert values to hexadecimal or octal representation, and more.
- ltrace can show a stack trace of each displayed event (use -w N to enable, where N is the number of stack frames to show).
- What was said about prototype libraries above is actually new to ltrace as shipped with Red Hat Enterprise Linux 7.
There are still some obvious improvements to be made to
ltrace. The emerging DWARF support has already been mentioned above. Besides, it would make sense to support tracing systemtap probe points. But there are also more fundamental issues—lack of support for long long has been limiting proper support of some interfaces.
The manual pages shipped with
ltrace (ltrace(1) and ltrace.conf(5)) provide a fair amount of detail about odds and ends of
ltrace. What's above are perhaps the most interesting or useful functions, but there's quite a bit more.
In conclusion, we would like to hear from you! If this topic is interesting to you, we can provide more in-depth, or perhaps more user-centric articles in future. If
ltrace is interesting to you, we would like to hear how. Please let us know in comments or directly.