Hands-on Performance Comparison of NVMe I/O Completion Modes
2026-1 Systems Technology
You must have already completed the platform-specific QEMU installation and Ubuntu VM setup.
Check the Guest VM kernel version (uname -r → 5.18.0-051800-generic).
The polling lab issues polled synchronous I/O with preadv2(RWF_HIPRI).
This feature was removed starting from Linux kernel 5.19, so 5.18 is the last kernel that supports it.
Since the QEMU Guest VM runs kernel 5.18, the polling lab works regardless of the host OS kernel version.
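As a quick sanity check of the version requirement above, the following sketch compares a kernel release string against 5.18. The function name `supports_hipri_polling` is hypothetical, and the check only encodes the upper bound stated in this document (5.18 is the last supporting kernel); it does not probe the kernel itself.

```bash
#!/bin/bash
# Returns 0 if the given kernel release is 5.18 or older, i.e. it predates
# the removal described above. Hypothetical helper, upper bound only.
supports_hipri_polling() {
    local release="$1" major rest minor
    major="${release%%.*}"          # "5.18.0-dpas" -> "5"
    rest="${release#*.}"            # "5.18.0-dpas" -> "18.0-dpas"
    minor="${rest%%.*}"             # "18.0-dpas"   -> "18"
    minor="${minor%%[!0-9]*}"       # strip suffixes such as "18-rc1"
    [ "$major" -lt 5 ] || { [ "$major" -eq 5 ] && [ "$minor" -le 18 ]; }
}

if supports_hipri_polling "$(uname -r)"; then
    echo "kernel $(uname -r): within the range this lab expects"
else
    echo "kernel $(uname -r): newer than 5.18, run the lab inside the 5.18 guest VM"
fi
```

Run it inside the Guest VM; on the host it simply reports the host kernel, which does not matter for the lab.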
Install the DPAS-patched kernel using the provided deb packages.
A .deb file is a software installation package used in Ubuntu/Debian-based Linux distributions.
It serves the same role as .msi installers on Windows or .pkg on macOS.
Install with sudo dpkg -i file.deb, which automatically places system files such as kernel images and headers in the correct locations.
Run uname -m inside the Guest VM to check the architecture.
| Host Environment | uname -m Output | Required Package |
|---|---|---|
| Apple Silicon Mac (M1/M2/M3/M4) | aarch64 | ARM64 |
| Intel Mac / Linux PC / Windows (WSL2) | x86_64 | x86_64 |
Download the package matching your VM's CPU architecture.
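The table lookup can be scripted as below. This is a small sketch: `package_for_arch` is a hypothetical helper name, and the strings it prints are the package labels from the table, not actual file names.

```bash
#!/bin/bash
# Map `uname -m` output to the "Required Package" column of the table above.
package_for_arch() {
    case "$1" in
        aarch64) echo "ARM64" ;;
        x86_64)  echo "x86_64" ;;
        *)       echo "unsupported"; return 1 ;;
    esac
}

arch="$(uname -m)"
if pkg="$(package_for_arch "$arch")"; then
    echo "Guest architecture $arch: download the $pkg package"
else
    echo "Guest architecture $arch is not covered by this lab" >&2
fi
```

Run this inside the Guest VM, not on the host: an Apple Silicon host reports a different string than the aarch64 guest.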
Transfer the downloaded deb files from the host to the VM.
# Run on the host terminal
scp -P 2222 linux-image-5.18.0-dpas_*.deb <username>@localhost:~/
scp -P 2222 linux-headers-5.18.0-dpas_*.deb <username>@localhost:~/
# Run inside the VM
sudo dpkg -i ~/linux-image-5.18.0-dpas_*.deb ~/linux-headers-5.18.0-dpas_*.deb
sudo reboot
uname -r
# 5.18.0-dpas
# If uname -r still shows the old kernel, select the DPAS kernel for the next boot
sudo grub-reboot "Advanced options for Ubuntu>Ubuntu, with Linux 5.18.0-dpas"
sudo reboot
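Beyond uname -r, one way to confirm that the DPAS kernel is really running is to look for the sysfs files that this document lists under /sys/block/nvme0n1/queue/. The sketch below assumes those file names and the device name nvme0n1; `check_dpas_queue` is a hypothetical helper.

```bash
#!/bin/bash
# check_dpas_queue DIR: report which DPAS knobs exist under DIR.
# Returns 0 only if all of them are present.
check_dpas_queue() {
    local dir="$1" knob missing=0
    for knob in pas_enabled pas_adaptive_enabled switch_enabled switch_stat; do
        if [ -e "$dir/$knob" ]; then
            echo "ok      $knob"
        else
            echo "missing $knob"
            missing=1
        fi
    done
    return "$missing"
}

echo "running kernel: $(uname -r)"
# nvme0n1 is the device name used in this lab's examples; adjust if yours differs.
if check_dpas_queue /sys/block/nvme0n1/queue; then
    echo "DPAS knobs found"
else
    echo "DPAS knobs not found: select the DPAS kernel with grub-reboot and reboot"
fi
```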
Create the following 4 scripts inside the VM. Usage: ./script.sh <device> <cpu> <numjobs>
#!/bin/bash
# fioint.sh: INT mode (interrupt-driven completion, no poll queues)
sudo modprobe -r nvme; sudo modprobe nvme poll_queues=0
fio --filename=/dev/$1 --size=100m --direct=1 --bs=4k --ioengine=pvsync2 \
--iodepth=1 --rw=randread --runtime=10 --numjobs=$3 --time_based \
--group_reporting --name=test --eta-newline=1 --cpus_allowed=$2 \
--nice=-10 --prioclass=2 --prio=0
#!/bin/bash
# fiocp.sh: CP mode (continuous polling via --hipri)
sudo modprobe -r nvme; sudo modprobe nvme poll_queues=2
fio --filename=/dev/$1 --ramp_time=3 --size=100m --direct=1 --bs=4k \
--ioengine=pvsync2 --iodepth=1 --rw=randread --runtime=10 \
--numjobs=$3 --time_based --group_reporting --name=test \
--eta-newline=1 --cpus_allowed=$2 --nice=-10 --prioclass=2 \
--prio=0 --hipri
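#!/bin/bash
# fiopas.sh: PAS mode
sudo modprobe -r nvme; sudo modprobe nvme poll_queues=2
# Write through sudo tee: a plain "echo > file" redirect is opened by the
# calling shell and fails with "Permission denied" when not running as root.
echo 0 | sudo tee /sys/block/$1/queue/io_poll_delay > /dev/null
echo 1 | sudo tee /sys/block/$1/queue/pas_enabled > /dev/null
echo 1 | sudo tee /sys/block/$1/queue/pas_adaptive_enabled > /dev/null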
fio --filename=/dev/$1 --direct=1 --bs=4k --ioengine=pvsync2 \
--iodepth=1 --rw=randread --runtime=10 --numjobs=$3 --time_based \
--group_reporting --name=test --eta-newline=1 --cpus_allowed=$2 \
--nice=-10 --prioclass=2 --prio=0 --hipri
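#!/bin/bash
# fiodpas.sh: DPAS mode (PAS plus mode switching)
sudo modprobe -r nvme; sudo modprobe nvme poll_queues=2
# Write through sudo tee: a plain "echo > file" redirect is opened by the
# calling shell and fails with "Permission denied" when not running as root.
echo 0 | sudo tee /sys/block/$1/queue/io_poll_delay > /dev/null
echo 1 | sudo tee /sys/block/$1/queue/pas_enabled > /dev/null
echo 1 | sudo tee /sys/block/$1/queue/pas_adaptive_enabled > /dev/null
echo 1 | sudo tee /sys/block/$1/queue/switch_enabled > /dev/null
echo 10 | sudo tee /sys/block/$1/queue/switch_param1 > /dev/null
echo 10 | sudo tee /sys/block/$1/queue/switch_param2 > /dev/null
echo 10 | sudo tee /sys/block/$1/queue/switch_param3 > /dev/null
echo 1 | sudo tee /sys/block/$1/queue/switch_param4 > /dev/null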
fio --filename=/dev/$1 --ramp_time=3 --size=100m --direct=1 --bs=4k \
--ioengine=pvsync2 --iodepth=1 --rw=randread --runtime=10 \
--numjobs=$3 --time_based --group_reporting --name=test \
--eta-newline=1 --cpus_allowed=$2 --nice=-10 --prioclass=2 \
--prio=0 --hipri
cat /sys/block/$1/queue/switch_stat
# sudo is needed when kernel.dmesg_restrict is enabled
sudo dmesg | tail -10
Grant execute permissions:
chmod +x fioint.sh fiocp.sh fiopas.sh fiodpas.sh
Pin to CPU 0 and compare the 4 modes while varying the number of jobs.
# Example: INT mode, 1 job
./fioint.sh nvme0n1 0 1
# Repeat for each mode with jobs = 1, 2, 4, 8
# INT: ./fioint.sh nvme0n1 0 {1,2,4,8}
# CP: ./fiocp.sh nvme0n1 0 {1,2,4,8}
# PAS: ./fiopas.sh nvme0n1 0 {1,2,4,8}
# DPAS: ./fiodpas.sh nvme0n1 0 {1,2,4,8}
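The sixteen runs listed above can be driven by one loop. The sketch below assumes the four script names from the chmod step; DEVICE, CPU, LOGDIR, and DRY_RUN are illustrative variables introduced here, not part of the lab scripts. With DRY_RUN=1 (the default in this sketch) it only prints the commands so the sweep can be reviewed first.

```bash
#!/bin/bash
# Run every mode script for every job count and keep one log per run.
DEVICE="${DEVICE:-nvme0n1}"
CPU="${CPU:-0}"
LOGDIR="${LOGDIR:-./results}"
DRY_RUN="${DRY_RUN:-1}"

run_sweep() {
    local mode jobs log
    for mode in int cp pas dpas; do
        for jobs in 1 2 4 8; do
            log="$LOGDIR/${mode}_j${jobs}.log"
            if [ "$DRY_RUN" = "1" ]; then
                # Preview only: show what would be executed
                echo "./fio${mode}.sh $DEVICE $CPU $jobs > $log"
            else
                mkdir -p "$LOGDIR"
                "./fio${mode}.sh" "$DEVICE" "$CPU" "$jobs" > "$log" 2>&1
            fi
        done
    done
}

run_sweep
```

Set DRY_RUN=0 to actually execute the runs. Each script reloads the nvme module, so the runs must stay sequential.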
Record the IOPS and avg latency from the fio output of each run.
| Mode | Jobs=1 | Jobs=2 | Jobs=4 | Jobs=8 |
|---|---|---|---|---|
| INT | | | | |
| CP | | | | |
| PAS | | | | |
| DPAS | | | | |
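If each run's fio output is redirected to a file, the two values for the table can be pulled out with sed. This is a sketch that assumes fio's default human-readable output, with a line of the form `read: IOPS=..., BW=...` and a total-latency line of the form `lat (usec): min=..., max=..., avg=..., stdev=...`; the function names are hypothetical, and the parsing would need adjusting for JSON or terse output.

```bash
#!/bin/bash
# extract_iops FILE: print the IOPS field of the read line, as fio wrote it.
extract_iops() {
    sed -n 's/^ *read: IOPS=\([^,]*\),.*/\1/p' "$1" | head -n 1
}

# extract_avg_lat FILE: print "<avg> <unit>" from the total-latency line.
# The pattern skips the "clat"/"slat" lines and the latency-distribution lines.
extract_avg_lat() {
    sed -n 's/^ *lat (\([a-z]*\)): .*avg= *\([0-9.]*\),.*/\2 \1/p' "$1" | head -n 1
}

# Usage, assuming a log saved from one run (the path is illustrative):
#   extract_iops    results/int_j1.log
#   extract_avg_lat results/int_j1.log
```

fio abbreviates large IOPS values (for example with a `k` suffix), so the extracted string is copied as-is rather than converted.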
After running DPAS, check the switch_stat and dmesg output:
| MODE Value | Meaning |
|---|---|
| 0 | INT (Interrupt) |
| 1 | CP (Continuous Polling) |
| 2 | PAS (Initial State) |
| 3 | OL (Overloaded) |
| Jobs | Expected Final Mode | Reason |
|---|---|---|
| 1 | CP (MODE 1) | QD=1 → PAS→CP transition, lowest latency via polling |
| 2+ | INT (MODE 0) | QD>1 → PAS→OL→INT transition, yields CPU via interrupts |
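The two tables above can be encoded as small lookup helpers, which is handy when annotating results. The function names are hypothetical; the mappings are exactly those in the tables, and the sketch does not parse switch_stat itself because its output format is not specified here.

```bash
#!/bin/bash
# mode_name N: translate a MODE value into its name (first table above).
mode_name() {
    case "$1" in
        0) echo "INT" ;;
        1) echo "CP" ;;
        2) echo "PAS" ;;
        3) echo "OL" ;;
        *) echo "unknown" ;;
    esac
}

# expected_final_mode JOBS: MODE value the second table predicts.
expected_final_mode() {
    if [ "$1" -eq 1 ]; then echo 1; else echo 0; fi
}

for jobs in 1 2 4 8; do
    m="$(expected_final_mode "$jobs")"
    echo "jobs=$jobs -> expected MODE $m ($(mode_name "$m"))"
done
```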
Is the pas io count always less than the polled io count in switch_stat?
The DPAS kernel provides the following parameters under /sys/block/nvme0n1/queue/.
| Parameter | Description | Default |
|---|---|---|
| switch_enabled | Enable DPAS mode switching | 0 |
| switch_stat | Per-CPU mode statistics output (read-only) | n/a |
| pas_enabled | Enable PAS mode | 0 |
| pas_adaptive_enabled | Enable adaptive sleep | 0 |
| switch_param1 | PAS→OL threshold (tf > param1) | 0 |
| switch_param2 | OL→PAS threshold (avg QD ≤ param2) | 10 |
| switch_param3 | OL→INT threshold (avg QD > param3) | 10 |
| switch_param4 | Enable PAS→CP transition (0/1) | 1 |
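The average-QD thresholds are stored scaled by 10: for example, param2=10 means average QD ≤ 1.0.
This design provides one decimal digit of precision using only integer arithmetic, which suits kernel code where floating point is avoided.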
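The scaled-integer comparison can be illustrated in shell arithmetic, which is also integer-only. This is an illustration of the idea under the assumption that the threshold equals ten times the QD limit; it is not the kernel's code, and `qd_leq_threshold` is a hypothetical name.

```bash
#!/bin/bash
# qd_leq_threshold SUM SAMPLES PARAM
# True when the average QD (SUM / SAMPLES) is <= PARAM / 10, using integers only.
qd_leq_threshold() {
    local sum="$1" samples="$2" param="$3"
    # Cross-multiply instead of dividing:
    #   sum / samples <= param / 10   <=>   sum * 10 <= param * samples
    [ $(( sum * 10 )) -le $(( param * samples )) ]
}

# 8 samples with a total QD of 8 -> average 1.0, which satisfies param=10
if qd_leq_threshold 8 8 10; then echo "avg QD 1.0 <= 1.0: condition holds"; fi
# 8 samples with a total QD of 12 -> average 1.5, which does not
if ! qd_leq_threshold 12 8 10; then echo "avg QD 1.5 >  1.0: condition fails"; fi
```

Cross-multiplying avoids the truncation that an integer division of 12 by 8 would introduce.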
Instead of using deb packages, apply the DPAS patch directly to the kernel source and build it yourself.
sudo apt install -y build-essential libncurses-dev bison flex libssl-dev \
libelf-dev bc pahole dwarves zstd
cd ~
wget https://cdn.kernel.org/pub/linux/kernel/v5.x/linux-5.18.tar.xz
tar xf linux-5.18.tar.xz
cd linux-5.18
# After transferring the patch from host to VM
patch -p1 < ~/dpas.patch
echo "-dpas" > localversion
cp /boot/config-$(uname -r) .config
yes '' | make localmodconfig
Building with the full distribution config compiles far more modules and can fill the VM disk (No space left on device).
localmodconfig includes only currently loaded modules, drastically reducing build time and disk usage.
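Before starting the build, it may help to confirm that enough disk space is free. The sketch below uses the 15 GB figure from the troubleshooting table in this document as its threshold; the helper names are hypothetical.

```bash
#!/bin/bash
# free_gb PATH: available space, in whole GiB, on the filesystem holding PATH.
free_gb() {
    # -P gives one line per filesystem, -k reports 1024-byte blocks;
    # column 4 is the available space.
    df -Pk "$1" | awk 'NR==2 { print int($4 / 1024 / 1024) }'
}

# enough_space AVAILABLE_GB REQUIRED_GB
enough_space() {
    [ "$1" -ge "$2" ]
}

avail="$(free_gb "$HOME")"
if enough_space "$avail" 15; then
    echo "${avail} GiB free: enough for the kernel build"
else
    echo "${avail} GiB free: below 15 GiB, free up space or extend the disk first"
fi
```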
make -j$(nproc) bindeb-pkg
Upon successful build, linux-image-*.deb and linux-headers-*.deb files are generated in the parent directory of the source tree, which is the home directory here.
sudo dpkg -i ~/linux-image-5.18.0-dpas_*.deb ~/linux-headers-5.18.0-dpas_*.deb
sudo reboot
| Symptom | Cause | Solution |
|---|---|---|
| No space left on device | Build artifacts exceed disk space | make clean, delete tarball, use localmodconfig |
| Makefile Hunk #1 FAILED | Patch version mismatch | Ignore, use localversion instead |
| Kernel unchanged after reboot | GRUB default boot kernel | Select DPAS kernel with grub-reboot |
| Multiple hunk failures | Patch re-applied to already-patched source | Re-extract the source and try again |
| 15GB+ required for build | Insufficient free disk space | Check df -h /, extend LVM or clean up files |
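The "Multiple hunk failures" case can be detected before touching the tree by using patch's --dry-run option, first forward and then reversed. This is a sketch: `patch_state` is a hypothetical helper, the patch file must be given as an absolute path, and the paths in the usage lines are the ones used in this lab.

```bash
#!/bin/bash
# patch_state SRC_DIR PATCH_FILE
# Prints "applicable", "already-applied", or "mismatch" without modifying SRC_DIR.
patch_state() {
    local src="$1" patchfile="$2"
    if (cd "$src" && patch -p1 --dry-run -s -f < "$patchfile") >/dev/null 2>&1; then
        echo "applicable"            # applies cleanly to the current tree
    elif (cd "$src" && patch -p1 -R --dry-run -s -f < "$patchfile") >/dev/null 2>&1; then
        echo "already-applied"       # reverses cleanly, so it was applied before
    else
        echo "mismatch"              # neither direction works: wrong source version
    fi
}

# Usage with the paths from this lab:
if [ -d "$HOME/linux-5.18" ] && [ -f "$HOME/dpas.patch" ]; then
    patch_state "$HOME/linux-5.18" "$HOME/dpas.patch"
fi
```

A result of already-applied means the source should be re-extracted before patching again. A result of mismatch can also appear when only the Makefile hunk fails, which the table above says can be ignored.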
| Item | QEMU | Real Hardware |
|---|---|---|
| CP vs INT comparison | Valid (1.7-1.8x IOPS difference) | Valid |
| PAS adaptive sleep | Inaccurate timers, tf noise | Accurate |
| DPAS mode transition | Works, but absolute performance is for reference only | Works with default parameters |
| Multi-core scaling | Distorted by emulation overhead | Near-linear |
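To compare your own measurements against the CP vs INT row, the ratio can be computed with awk, since shell arithmetic has no floating point. This is a sketch: `iops_ratio` is a hypothetical helper, and it expects plain numbers, so an abbreviated fio value such as one with a `k` suffix must be expanded by hand first.

```bash
#!/bin/bash
# iops_ratio CP_IOPS INT_IOPS: print CP/INT with two decimals.
iops_ratio() {
    awk -v cp_iops="$1" -v int_iops="$2" 'BEGIN {
        if (int_iops == 0) { print "n/a"; exit }
        printf "%.2f\n", cp_iops / int_iops
    }'
}

# Usage: pass the IOPS you recorded for the same job count, for example
#   iops_ratio "$CP_IOPS" "$INT_IOPS"
```

A ratio that falls well outside the range in the table may indicate that polling was not actually enabled for the CP run.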