May 10-12, 2023
Vancouver, British Columbia, Canada + Virtual
Note: The schedule is subject to change.

The Sched app allows you to build your schedule but is not a substitute for your event registration. You must be registered for Open Source Summit North America 2023 to participate in the sessions. If you have not registered but would like to join us, please go to the event registration page to purchase a registration.

This schedule is automatically displayed in Pacific Daylight Time (UTC/GMT -7). To see the schedule in your preferred timezone, please select from the drop-down menu to the right, above "Filter by Date."

IMPORTANT NOTE: Timing of sessions and room locations are subject to change.

Thursday, May 11 • 6:00pm - 6:45pm
All Your Queues Are Belong to Us: The Hunt for a Network Bug in the Kernel - Laurent Bernaille & Eric Mountain, Datadog

A few months back, we discovered that some applications had lower performance on a few Kubernetes clusters on a specific provider. After some investigation, we found that network throughput was lower than expected because we were using a single transmit queue on the external interface of the instance. Probably just a configuration issue, either at the operating system level or in our CNI plugin, right? This is when things started to get really interesting: we could not find any significant differences between the clusters that were affected and those that were not. At that point, we had no choice but to dive deeper, trace packet paths within Linux itself, and read kernel code. In this talk, we will focus on one of the most complex network performance issues we have faced in our Kubernetes environment. We will go through the debugging steps in detail and explain how we tracked the issue using bpftrace and ultimately uncovered a small kernel bug. Finally, we will discuss the upstream kernel fix and the eBPF-based mitigation we were able to deploy quickly with the help of the Cilium team.

Speakers

Laurent Bernaille

Principal Engineer, Datadog
Laurent Bernaille worked for several years as a consultant specializing in cloud, containers, and automation, helping organizations migrate to the public cloud and adopt containers. He is now Principal Engineer at Datadog and works closely with infrastructure teams, which are responsible...

Eric Mountain

Senior Software Development Engineer, Datadog
Eric Mountain began working with Kubernetes in 2014, migrating applications built in a custom middleware ecosystem to container and cloud technology. Eric is now a Senior Engineer in Datadog’s Compute team, providing large-scale Kubernetes to our internal users. Eric enjoys debugging...



Thursday May 11, 2023 6:00pm - 6:45pm PDT
118 (Level 1)