Jitter in low latency environments

Not long ago, if you had asked me to define jitter, I would have spoken about it purely as a networking construct. To me, jitter was simply the variance in packet arrival times between a source and a destination.

But my understanding of system performance has changed significantly.

Now, when I think about jitter in a high-performance environment, I don’t just look at the network path. I understand it as three distinct components: the network path, the server hardware, and how software interacts with that hardware. To achieve true determinism, all of these components must be tuned. Stock hardware and default operating systems are engineered to be good at a lot of general tasks, but they are not optimized for ultra-low latency out of the box.

A fantastic video on this concept is the Jane Street presentation, “System Jitter and Where to Find It: A Whack-a-Mole Experience.” The talk focuses on the server and software side of their trading stack rather than line-rate networking, but the engineering principles it highlights match what is required to build predictable, low-latency systems and mirror what I’ve seen in similar environments.

Here are my main takeaways from Jane Street’s perspective on tracking down system jitter, combined with a few lessons I’ve picked up along the way.

1. Bare Metal vs. Virtualization

Jane Street points out that running virtual machines (VMs) in latency-sensitive environments is a poor choice. A standard OS kernel is tuned by default to handle many tasks simultaneously. When you layer a hypervisor, a host OS, and multiple guest VMs on top of each other, the extra context switches and the risk of “noisy neighbors” stealing CPU cycles make determinism very hard to achieve.

If you want to eliminate virtualization jitter, the cleanest move is to drop the VM entirely and run directly on bare metal hardware.
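If you are stuck on a VM for now, one rough way to see whether neighbors are taking cycles is the steal counter that Linux exposes in `/proc/stat`. The sketch below compares two snapshots of the aggregate `cpu` line. The sample numbers are made up for illustration, and the function name is my own.

```python
def steal_percent(before, after):
    """Percent of CPU time reported as stolen between two 'cpu' lines of /proc/stat."""
    def fields(line):
        return [int(x) for x in line.split()[1:]]

    b, a = fields(before), fields(after)
    # Field order: user nice system idle iowait irq softirq steal guest guest_nice.
    # Only the first eight are summed; guest time is already counted inside user/nice.
    total = sum(a[:8]) - sum(b[:8])
    steal = a[7] - b[7]
    return 100.0 * steal / total if total else 0.0


# Hypothetical snapshots taken a few seconds apart inside a guest.
snap_before = "cpu  1000 0 500 8000 100 0 50 20 0 0"
snap_after = "cpu  1100 0 550 8800 110 0 55 85 0 0"
print(f"steal: {steal_percent(snap_before, snap_after):.1f}%")
```

On bare metal the steal column should simply stay at zero, which is the point of the section above.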

2. Network Interrupts and Kernel Bypass

By default, standard Linux systems rely on hardware interrupts when packets arrive at a Network Interface Card (NIC). When a packet arrives, the NIC raises an interrupt, forcing a CPU core to pause whatever it is running (possibly your user space application loop), switch into kernel space, and process the packet.

This constant shifting between user space and kernel space introduces jitter. To achieve highly predictable processing, low-latency systems use kernel bypass technologies (like Solarflare’s OpenOnload or DPDK). These let the application, running in user space, poll packets directly from the NIC ring buffers, keeping the kernel out of the data path.
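Before reaching for kernel bypass, it helps to know which cores are actually absorbing NIC interrupts. Here is a minimal sketch that sums per-CPU interrupt counts for one device from text in the `/proc/interrupts` layout. The sample text and the `eth0` queue names are invented for the example; on a real box you would read the file itself.

```python
def irq_counts_per_cpu(proc_interrupts_text, device_substring):
    """Sum interrupt counts per CPU for lines mentioning the given device."""
    lines = proc_interrupts_text.strip().splitlines()
    cpus = lines[0].split()  # header row: CPU0 CPU1 ...
    totals = [0] * len(cpus)
    for line in lines[1:]:
        if device_substring not in line:
            continue
        parts = line.split()
        # parts[0] is the IRQ label ("24:"), followed by one counter per CPU.
        for i in range(len(cpus)):
            totals[i] += int(parts[1 + i])
    return dict(zip(cpus, totals))


# Made-up sample in the same layout as /proc/interrupts.
sample = """
           CPU0       CPU1       CPU2       CPU3
 24:       1500          0          0          0  IR-PCI-MSI  eth0-rx-0
 25:          0       2300          0          0  IR-PCI-MSI  eth0-rx-1
 26:         10          0          0          0  IR-PCI-MSI  ahci
"""
print(irq_counts_per_cpu(sample, "eth0"))
```

Running it twice a few seconds apart and comparing the counts shows where the interrupt load lands, which is useful when deciding which cores to keep for the application.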

3. Core Isolation via `isolcpus`

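The Linux kernel scheduler natively strives for “fairness,” dynamically moving tasks across available CPU cores to balance the load. This load balancing introduces latency, because a migrated thread has to resume on a core whose caches hold none of its data.

By using the kernel boot parameter `isolcpus` (e.g., `isolcpus=1,3,5,7`), you can explicitly remove specific cores from the general kernel scheduler. Once isolated, the scheduler will not place ordinary tasks on them unless they are explicitly pinned there, which keeps random background processes off those cores. Per-CPU kernel threads and hardware interrupts can still land on them, so isolation reduces the noise rather than eliminating it.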
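After a reboot it is worth confirming that the isolation actually took effect. Linux reports the isolated set in `/sys/devices/system/cpu/isolated`, using the same list syntax as the boot parameter. The helper names below are my own, and the sketch assumes a plain CPU list with no extra flags.

```python
from pathlib import Path


def parse_cpu_list(text):
    """Turn a kernel CPU list such as '1,3,5,7' or '2-4,8' into a set of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus


def isolated_cpus(path="/sys/devices/system/cpu/isolated"):
    """Return the isolated CPUs reported by the kernel, or an empty set."""
    p = Path(path)
    return parse_cpu_list(p.read_text()) if p.exists() else set()


print(sorted(parse_cpu_list("1,3,5,7")))
print("isolated on this machine:", sorted(isolated_cpus()))
```

An empty result on a machine where you expected isolation usually means the parameter never made it onto the kernel command line.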

4. Application Pinning via `taskset`

Isolating your cores with `isolcpus` is only half the battle; the kernel scheduler will no longer place tasks on those cores by itself, which also means it won’t automatically assign your application to them.

By using CPU affinity utilities like `taskset`, you manually pin your latency-sensitive application to the isolated cores you set aside. This gives your application those cores to itself, and because isolated cores are excluded from the scheduler’s load balancing, your threads will not be migrated to another core while they run. For the same reason, it is safest to pin each latency-critical thread to its own core, since threads handed several isolated cores are typically not spread across them automatically.
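The same pinning can be done from inside the program. Below is a minimal, Linux-only sketch using Python’s `os.sched_setaffinity`, roughly what `taskset -c <cpu>` does from the shell. Picking the highest allowed CPU is just a stand-in so the example runs anywhere; in practice you would pass one of your isolated cores.

```python
import os


def pin_current_process(cpu):
    """Restrict the caller to a single CPU and return the resulting affinity set."""
    os.sched_setaffinity(0, {cpu})  # pid 0 means "the caller"
    return os.sched_getaffinity(0)


allowed = os.sched_getaffinity(0)
target = max(allowed)  # stand-in choice; use an isolated core on a tuned box
print("pinned to:", pin_current_process(target))
```

Pinning from inside the application has the advantage that different threads can be sent to different cores, which a single `taskset` invocation on the whole process does not give you.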

5. Stopping the Clock Tick with `nohz_full`

Even if a core is isolated and your application is pinned, the kernel normally fires a periodic timer tick on every core to do scheduler housekeeping and check whether other tasks need attention. To silence this, you can set the `nohz_full` kernel boot parameter for those cores (the kernel must be built with support for it). As long as there is only one runnable task on the isolated core, `nohz_full` stops the periodic tick or, depending on the kernel version, reduces it to a rare residual one, allowing your application to run almost continuously without the OS interrupting it to ask for a status update.
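A simple way to see what these interruptions look like from the application’s side is to spin on a clock and record every unusually long gap between two consecutive reads. The sketch below does that in Python. The 50 µs threshold and the function name are arbitrary choices of mine, and the interpreter adds noise of its own, so treat the output as illustrative; for real measurements you would want the same loop in C.

```python
import time


def find_gaps(duration_s=0.2, threshold_ns=50_000):
    """Busy-loop for duration_s and return every clock gap above threshold_ns."""
    gaps = []
    end = time.perf_counter_ns() + int(duration_s * 1e9)
    prev = time.perf_counter_ns()
    while prev < end:
        now = time.perf_counter_ns()
        if now - prev > threshold_ns:
            # Something (tick, interrupt, preemption) took the core away from us.
            gaps.append(now - prev)
        prev = now
    return gaps


gaps = find_gaps()
print(f"{len(gaps)} gaps over threshold, worst {max(gaps, default=0)} ns")
```

Running it once on an ordinary core and once pinned to an isolated, tickless core gives a quick before-and-after comparison.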

6. Hyperthreading

Hyperthreading (Intel’s name for Simultaneous Multithreading) is an incredible feature for general-purpose computing. Each physical core presents two hardware threads that share its execution units and caches, so when one thread stalls while fetching data, the core can keep itself busy executing instructions from the other.

While great for throughput and heavy multitasking, its net effect is the exact opposite of what you want in a deterministic system. If a background OS task lands on the sibling hardware thread, it competes with your application for the same core resources and can lead to unpredictable execution times. For low-latency stacks, disabling Hyperthreading in the BIOS is a standard prerequisite.
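To check whether a core still has an active sibling, Linux exposes the topology under sysfs. The sketch below reads `thread_siblings_list` for a CPU and the global `smt/control` state. The helper names are mine, and the fallbacks are there because not every kernel or container exposes these files.

```python
from pathlib import Path

SYSFS_CPU = "/sys/devices/system/cpu"


def parse_cpu_list(text):
    """Turn a kernel CPU list such as '0,4' or '0-1' into a set of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus


def smt_siblings(cpu, root=SYSFS_CPU):
    """Hardware threads sharing a physical core with `cpu` (including itself)."""
    p = Path(root) / f"cpu{cpu}" / "topology" / "thread_siblings_list"
    return parse_cpu_list(p.read_text()) if p.exists() else {cpu}


def smt_control(root=SYSFS_CPU):
    """Kernel-reported SMT state such as 'on' or 'off', or 'unknown' if absent."""
    p = Path(root) / "smt" / "control"
    return p.read_text().strip() if p.exists() else "unknown"


print("SMT state:", smt_control())
print("siblings of cpu0:", sorted(smt_siblings(0)))
```

If a core you plan to isolate reports more than one sibling, Hyperthreading is still on, and the sibling needs to be disabled or isolated as well.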

My observations as a network engineer

As a network engineer, it is easy to default to looking at packet delivery whenever the word “jitter” comes up. But digging into these system-level tweaks has changed my perspective.

Ultimately, achieving true determinism isn’t just about efficient routing. It requires looking past the network layer and understanding how your packets, your OS, and your hardware interact under the hood.
