Since our first blog post on how to retrieve packet drop reasons in the Linux kernel, upstream development of the feature has continued and new additions have been made. Drop reasons can be retrieved manually, but they are also used by an increasing number of utilities such as the Network Observability operator for Red Hat OpenShift Container Platform, which can report packets being dropped with their reasons.
Let's look at what has changed recently in the drop reason space of the Linux kernel and how to avoid pitfalls, especially across kernel versions. Tools built on top of drop reasons, like the operator mentioned above, already do the right thing and need no special care. But as we saw in the previous article, drop reasons can also be retrieved manually when debugging networking issues, and this is error prone without a good understanding of how they work or without the right tools.
Non-core drop reasons
In addition to core drop reasons, discussed in the previous blog post and defined in enum skb_drop_reason, support for registering non-core drop reasons was added. This allows other parts of the Linux networking stack to register their own drop reasons to improve visibility into why packets are being dropped there.
At the time of writing, two non-core parts of the Linux networking stack register their own drop reasons: the IEEE 802.11 stack (mac80211) and Open vSwitch.
This works by registering an additional set of drop reasons at runtime, which virtually extends the core definition. Since all drop reasons, core and non-core, have a unique value and can be used in the same core functions, current tools and facilities need no modification to report the raw values of the new drop reasons. However, converting those values to text is not supported everywhere, as we'll see below.
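To make the "virtual extension" more concrete, here is a minimal Python sketch that splits a raw drop reason into a subsystem identifier and a value within that subsystem. It assumes the layout used by the upstream kernel at the time of writing, where the upper 16 bits identify the registering subsystem (0 for core) and the lower 16 bits the reason itself. The function names are ours, and the layout is a kernel-internal detail that may change.

```python
# Assumed layout (upstream kernel at the time of writing, not a stable ABI):
# upper 16 bits = subsystem id (0 means core), lower 16 bits = reason within it.
SUBSYS_SHIFT = 16
SUBSYS_MASK = 0xFFFF0000


def split_drop_reason(raw: int) -> tuple[int, int]:
    """Return (subsystem id, reason value within that subsystem)."""
    subsys = (raw & SUBSYS_MASK) >> SUBSYS_SHIFT
    value = raw & ~SUBSYS_MASK & 0xFFFFFFFF
    return subsys, value


def is_core(raw: int) -> bool:
    """Core drop reasons have a subsystem id of 0."""
    return split_drop_reason(raw)[0] == 0
```

Under this assumption, a raw value such as 0x10002 splits into subsystem 1 and value 2, which marks it as a non-core reason. Which subsystem an identifier maps to still has to come from the running kernel.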
Drop reasons pitfalls
As we just saw, converting drop reasons to text, especially non-core ones, is not always built in. But that is not the biggest pitfall. Drop reasons are defined in kernel enums and are not part of a stable ABI. This means their raw values can change between kernel releases, for example when a new reason is added between existing ones or when reasons are rearranged, and this has already happened a few times. Because of this, different versions of the Linux kernel, including those shipped in Red Hat Enterprise Linux (RHEL), might report different raw values for the same drop reason.
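The effect of inserting a reason in the middle of an enum can be shown with a small Python sketch. The reason names below are hypothetical and only mimic how C enum values are assigned in order; they are not the kernel's definitions.

```python
from enum import IntEnum

# Hypothetical reason names, for illustration only.
OlderKernel = IntEnum("OlderKernel", ["REASON_A", "REASON_B", "REASON_C"], start=0)
# A newer release inserts REASON_NEW between two existing reasons.
NewerKernel = IntEnum(
    "NewerKernel", ["REASON_A", "REASON_NEW", "REASON_B", "REASON_C"], start=0
)

raw = int(OlderKernel.REASON_B)   # raw value captured on the older kernel
misread = NewerKernel(raw).name   # same number decoded with the newer table
```

Here the value captured for REASON_B on the older kernel decodes to REASON_NEW with the newer table, which is exactly the kind of silent misreading to watch out for.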
This is not an issue for tools that convert the raw value to a text representation, but not all tools perform this translation. When only a raw value is available, it should be checked against the definitions of the running kernel. Of course, there are better ways.
There are two ways to convert a raw drop reason to text so that the result matches the running kernel version: using an in-kernel conversion, or inspecting the internal definitions of the running kernel and using those.
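As an illustration of reading the running kernel's own table, the sketch below parses the format description of the skb:kfree_skb tracepoint, which lists the value and name pairs the kernel uses for its in-kernel conversion. The tracefs path and the exact shape of the entries are assumptions about a typical system, and the helper names are ours.

```python
import re
from pathlib import Path

# Assumed tracefs location; some systems mount it under /sys/kernel/debug/tracing.
FORMAT_FILE = Path("/sys/kernel/tracing/events/skb/kfree_skb/format")

# Matches entries such as: { 2, "NO_SOCKET" }
PAIR_RE = re.compile(r'\{\s*(\d+)\s*,\s*"([A-Za-z0-9_]+)"\s*\}')


def parse_reason_table(format_text: str) -> dict[int, str]:
    """Build a raw value -> name map from the tracepoint format description."""
    return {int(value): name for value, name in PAIR_RE.findall(format_text)}


def running_kernel_reasons() -> dict[int, str]:
    """Read the table for the running kernel (usually requires root)."""
    return parse_reason_table(FORMAT_FILE.read_text())
```

Because the table is read from the running kernel, a lookup such as running_kernel_reasons().get(raw) stays correct across kernel versions, although it can only know the reasons the tracepoint itself can translate.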
Below we'll look at three tools you can use to inspect drop reasons that (mostly) meet this requirement.
By adding a probe on the skb:kfree_skb tracepoint, we can use its in-kernel translation of drop reasons. However, at the time of writing, this implementation does not support converting non-core drop reasons to a text representation.
While this is not perfect, using perf on the above tracepoint is a good way of reporting drop reasons when inspecting drops happening in the core networking stack. It is also a very simple way of getting this information, as perf is widely available.
$ perf record -e skb:kfree_skb sleep 10
$ perf script
curl 103998  40186.014474: skb:kfree_skb: [...] reason: NO_SOCKET
curl 103998  40186.014555: skb:kfree_skb: [...] reason: NO_SOCKET
irq/178-iwlwifi 1289  44222.379744: skb:kfree_skb: [...] reason: 0x10002
In the above example we can see two packets being dropped because no matching socket was found and one packet dropped with a raw drop reason, 0x10002. This drop reason is a non-core one and on the machine used it corresponds to a mac80211 drop reason, namely
dropwatch uses the kernel dropmon infrastructure which is, at the time of writing, the only in-kernel implementation that reports non-core drop reasons as text. Because of this, using dropwatch is one of the preferred ways of inspecting drops in the kernel with their associated reasons.
For an example of how to use dropwatch, see the previous blog post on drop reasons.
Last but not least, a new kernel packet inspection tool was developed recently that supports collecting packets in various places of the Linux networking stack: Retis. When asked to report drop reasons, Retis converts them to a text representation at runtime by inspecting the internal definitions of the running kernel using a technology called BPF Type Format (BTF). This means it always has the correct raw-to-text translation of drop reasons, regardless of the kernel version running on the system.
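The idea of deriving the table from BTF can be approximated by hand. The following Python sketch is not how Retis is implemented; it simply asks bpftool to dump the running kernel's BTF as C and extracts the members of enum skb_drop_reason. It assumes bpftool is installed and that the kernel exposes /sys/kernel/btf/vmlinux. The helper names are ours, and non-core reasons, which live in other definitions, are not covered.

```python
import re
import subprocess

# Body of "enum skb_drop_reason { ... };" in a C-formatted BTF dump.
ENUM_RE = re.compile(r"enum skb_drop_reason\s*\{(.*?)\};", re.DOTALL)
# Entries such as: SKB_DROP_REASON_NO_SOCKET = 2,
ENTRY_RE = re.compile(r"(\w+)\s*=\s*(-?\d+)")


def parse_core_reasons(c_dump: str) -> dict[int, str]:
    """Map raw values to names for the core drop reason enum."""
    match = ENUM_RE.search(c_dump)
    if match is None:
        return {}
    return {int(value): name for name, value in ENTRY_RE.findall(match.group(1))}


def dump_vmlinux_btf() -> str:
    """Dump the running kernel's BTF as C (assumes bpftool is available)."""
    result = subprocess.run(
        ["bpftool", "btf", "dump", "file", "/sys/kernel/btf/vmlinux", "format", "c"],
        check=True, capture_output=True, text=True,
    )
    return result.stdout
```

Calling parse_core_reasons(dump_vmlinux_btf()) on the machine being debugged yields a table that matches that exact kernel, which is the property that makes BTF-based tools robust against renumbering.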
Retis is highly configurable but provides sane built-in defaults, such as its drop monitoring profile:
$ retis -p dropmon collect
16:52:39 [INFO] Applying profile dropmon: Default
16:52:39 [INFO] 4 probe(s) loaded
40648351222101 [curl] 104769 [tp] skb:kfree_skb drop (NO_SOCKET)
if 1 (lo) rxif 1 ::1.52414 > ::1.80 ttl 64 label 0x98864 len 40 proto TCP (6) flags [S] seq 2567277025 win 33280
In the above example, we can see that an IPv6 packet to [::1]:80 was dropped because no socket was listening for that flow. Retis also reported detailed information about the packet itself, as well as a stack trace.
Thanks to its automatic translation of drop reasons and because it offers flexibility and additional features (probing in many places of the stack in parallel, packet tracking, conntrack and Open vSwitch support, post-processing capabilities, etc.), Retis is a good choice for tracking dropped packets as well as inspecting the Linux networking stack in general. A packet can be seen not only when it is dropped, but also tracked throughout the networking stack.
Kernel support for drop reasons is increasing over time, now offering drop reasons from non-core parts of the Linux networking stack. This is very good news, as it improves visibility and gives more insight into why some packets are being dropped. While retrieving and making sense of drop reasons can be tricky due to their implementation, it's easy to avoid pitfalls by understanding how drop reasons work and by using the right tools. Non-core drop reasons are available in recent RHEL 9.2 releases and in RHEL 9.3.
Last updated: January 29, 2024