Per-I/O Adaptive Sleep — A New Approach to I/O Completion
2026-1 Systems TechnologyAfter an application sends an I/O request to storage, the mechanism that determines how to wait until the device finishes processing. This "waiting strategy" significantly affects I/O latency and CPU utilization.
After issuing I/O, the CPU yields and sleeps. The device wakes it via interrupt on completion.
Pro: CPU efficient — other tasks can run during sleep
Con: Interrupt handling overhead, context switch delay
Interrupt-basedAfter issuing I/O, the CPU continuously checks for completion (busy-wait).
Pro: Minimum latency — detects completion immediately
Con: 100% CPU usage, no other work possible
Polling-basedSleeps for a predicted duration, then polls. Adjusts sleep time per epoch.
Pro: CPU yield during sleep + fast detection via polling
Con: Epoch-based → slow adaptation, latency shelving problem
HybridAdjusts sleep duration immediately via binary feedback on every I/O.
Pro: Fast convergence, per-I/O adaptation, no shelving
Con: Can be sensitive to individual I/O noise
Hybrid (Per-I/O)
LHP measures average I/O latency per epoch
and sets the next epoch's sleep time as average_latency × attenuation.
When oversleep occurs (sleep time > actual device completion time), the application-observed I/O latency becomes "sleep time + context switch." This inflated latency feeds back into the next epoch's sleep calculation, creating a positive feedback loop:
oversleep → observed latency increases → sleep time increases → more oversleep → ...
This phenomenon is called latency shelving. Once shelving begins, it does not recover until the epoch ends, and performance can degrade to interrupt levels or worse during this period.
After each I/O completion, PAS checks one condition:
wake_time = sleep_duration + context_switch_overhead
if (device_completion_time > wake_time)
result = UNDER // woke up too early → busy-wait occurred
else
result = OVER // woke up too late → unnecessary waiting
Based on this UNDER/OVER verdict, sleep duration is adjusted on every I/O.
PAS uses the combination of the previous and current I/O results (2-bit history)
to determine the adjustment factor (adj).
Sleep duration is computed as dur = dur_last × adj:
| Previous | Current | Interpretation | adj Update |
|---|---|---|---|
| UNDER | UNDER | Consecutive undersleep → sleep is clearly insufficient | Accumulate: adj += UP |
| UNDER | OVER | Direction change → near optimal | Reset + slight decrease: adj = 1 − DN |
| OVER | UNDER | Direction change → near optimal | Reset + slight increase: adj = 1 + UP |
| OVER | OVER | Consecutive oversleep → sleep is clearly excessive | Accumulate: adj −= DN |
Key design principles:
adj accumulates UP/DN. The longer the streak, the faster the adjustment.adj resets near 1. The optimal point has been reached, so large adjustments are unnecessary.// Pseudocode
wake_time = dur + cs_overhead
if (device_latency > wake_time) sr = UNDER
else sr = OVER
switch (sr_prev, sr_curr):
(UNDER, UNDER): adj += UP // accumulate: 1+UP, 1+2UP, 1+3UP, ...
(UNDER, OVER): adj = 1 - DN // reset (slight decrease)
(OVER, UNDER): adj = 1 + UP // reset (slight increase)
(OVER, OVER): adj -= DN // accumulate: 1-DN, 1-2DN, 1-3DN, ...
dur_next = dur_current * adj
Now that you understand the algorithm, observe its behavior in the simulator and emulator.
Part 2: PAS Simulator / Emulator → Compare simulator (tracking) and emulator (feedback loop) to observe PAS behaviorPaper: DPAS: A Prompt, Accurate and Safe I/O Completion Method for SSDs (USENIX FAST '26, Seo et al.)