io_uring for Systems Engineers

My former colleagues and I wrote a detailed paper[1] to better understand io_uring. This post is a concise, high-level overview aimed at systems engineers: it should help you decide whether io_uring is worth integrating into your system.

io_uring is a relatively new Linux kernel API for high-performance I/O. At its core, it is an asynchronous, batched system call interface.

Let’s unpack that.

At a high level, io_uring consists of two lock-free ring buffers shared between user space and the kernel:

  • the Submission Queue (SQ), where user space places requests.
  • the Completion Queue (CQ), where the kernel places responses.

The application issues requests by writing entries into the submission queue, and later collects results from the completion queue. This mechanism decouples issuing work from waiting for it to complete, allowing the application to perform other tasks while the kernel processes multiple operations in the background.
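To make the two-ring flow concrete, here is a minimal toy model in Python. It is not the io_uring API: `Ring`, `kernel_process`, and `demo` are hypothetical names invented for illustration, and the "kernel" is an ordinary function. The sketch only mirrors the shape of the mechanism: power-of-two rings with head and tail indices, requests tagged with a `user_data` value, and completions that carry the same tag back.

```python
class Ring:
    """Fixed-size single-producer/single-consumer ring (toy model only)."""

    def __init__(self, entries):
        # A power-of-two size lets an index wrap with a cheap bit mask.
        assert entries > 0 and entries & (entries - 1) == 0
        self.slots = [None] * entries
        self.mask = entries - 1
        self.head = 0  # advanced only by the consumer
        self.tail = 0  # advanced only by the producer

    def __len__(self):
        return self.tail - self.head

    def push(self, item):
        if len(self) == len(self.slots):
            return False  # ring is full
        self.slots[self.tail & self.mask] = item
        self.tail += 1
        return True

    def pop(self):
        if self.head == self.tail:
            return None  # ring is empty
        item = self.slots[self.head & self.mask]
        self.head += 1
        return item


def kernel_process(sq, cq):
    """Toy stand-in for the kernel: drain the SQ, post one completion each.

    Assumes the CQ has room for every completion (the demo sizes it so).
    """
    processed = 0
    while len(sq) > 0:
        user_data, op, payload = sq.pop()
        # Pretend every "write" succeeds and returns the byte count.
        result = len(payload) if op == "write" else 0
        cq.push((user_data, result))
        processed += 1
    return processed


def demo():
    sq, cq = Ring(4), Ring(8)
    # Application: issue three requests without waiting for any of them.
    for user_data, payload in enumerate([b"a", b"bb", b"ccc"]):
        sq.push((user_data, "write", payload))
    # ... the application could do unrelated work here ...
    kernel_process(sq, cq)
    # Application: collect results later, matched to requests via user_data.
    results = []
    while len(cq) > 0:
        results.append(cq.pop())
    return results


if __name__ == "__main__":
    print(demo())  # [(0, 1), (1, 2), (2, 3)]
```

The point of the model is the separation of the two steps: `demo` fills the submission ring first and reads the completion ring later, and nothing in between blocks on an individual operation.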

Batching is one of the main ways io_uring improves performance. Instead of making one system call (syscall) per operation, applications can queue many operations into the submission ring. Then, they can use a single syscall to instruct the kernel to process all operations in one go. This approach amortizes the syscall overhead over multiple operations, reducing context switches and syscall entry/exit costs. For workloads with many small operations, this can make a significant difference.
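The amortization argument can be written as a back-of-the-envelope cost model. The sketch below is an assumption-laden illustration, not a measurement: `cpu_ns_per_op` is a hypothetical helper, and the default costs (200 ns of fixed syscall overhead, 300 ns of actual work per operation) are made-up numbers chosen only to show the shape of the curve.

```python
def cpu_ns_per_op(batch_size, syscall_ns=200.0, work_ns=300.0):
    """Per-operation CPU cost when one syscall submits `batch_size` operations.

    syscall_ns: assumed fixed cost of entering and leaving the kernel once.
    work_ns:    assumed cost of the operation itself, which batching cannot remove.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    # The fixed syscall cost is shared by the whole batch; the work is not.
    return syscall_ns / batch_size + work_ns


if __name__ == "__main__":
    for batch in (1, 8, 32):
        print(batch, cpu_ns_per_op(batch))
    # 1 500.0
    # 8 325.0
    # 32 306.25
```

Under these assumed numbers, most of the gain arrives with small batches, and the per-operation cost approaches `work_ns` as the batch grows. That is also why batching helps little when the operation itself dominates the cost.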

io_uring serves as a unified interface for a wide range of operations, including file I/O, networking, timeouts, and other syscalls, such as madvise. This is an important difference compared to older, specialized interfaces. For example, libaio focuses solely on disk/block I/O, and epoll is used mainly for network I/O. With io_uring, applications can use a single interface for various types of I/O, thereby avoiding the need to juggle multiple APIs for different subsystems.

Now that we have a high-level understanding of what io_uring is, let us look at when it is useful.

You might want to consider io_uring if:

  • System calls are consuming a lot of CPU time. Profiling shows a noticeable percentage of cycles spent in syscall overhead or context switching. Most of io_uring’s benefit comes from amortizing CPU overhead, so if your profile shows a lot of time spent crossing the kernel boundary, it is a good candidate.
  • Your workload is I/O-heavy (network or SSD) and latency-sensitive. You are pushing high IOPS or many small messages and need to squeeze more throughput out of the same hardware.
  • You want a unified interface for different I/O types[2]. Instead of mixing epoll, read/write, send/recv, and various asynchronous libraries, you would like a single, coherent API.
  • You care about future kernel optimizations. Many new I/O-related features and optimizations are being developed with io_uring in mind, making it a more future-proof choice to adopt.
  • You are memory-bandwidth-bound by I/O copies. io_uring features, such as registered buffers, can help reduce copy overhead in certain scenarios.

You might not want to use it if:

  • You do not benefit from asynchrony or batching. Your workload is simple, blocking, and latency is dominated by something other than syscalls (for example, waiting for a single disk read).
  • You only perform one type of I/O and are already well served. For example, a disk-only workload where the performance of libaio is enough, or large, sequential reads/writes where syscall overhead is negligible.[3]
  • Portability is a hard requirement. io_uring is a Linux-specific feature that requires relatively recent kernels. If you need to support other OSes, container environments such as Docker (whose default seccomp profile may block io_uring syscalls), or kernels older than 5.1, abstracting it or avoiding it might be simpler.

In summary, our evaluation suggests that io_uring is the best option for new systems and is also worth considering for many existing ones, as illustrated by PostgreSQL’s recent adoption of io_uring.


  1. Do not be afraid if you are not building a DBMS; the insights are applicable to many applications.

  2. This is a big reason, besides performance, why we at TigerBeetle like io_uring so much.

  3. A few percent can still be gained by using io_uring.
