What are the granularity guarantees for delay base...
# coroutines
k
What are the granularity guarantees for delay based suspends? The docs only mention that it's an implementation detail of the dispatcher that's used. On the JVM with
Dispatchers.Default
on a linux machine, I'm seeing delays resumed before I'm expecting them to very frequently. The following code:
Copy code
public suspend fun Clock.delayUntil(instant: Instant) {
    val expectedDuration = instant - now()
    val actualDuration = measureTime { delay(duration = expectedDuration) }
    if (actualDuration < expectedDuration) {
        println("Supposed to delay for $expectedDuration, delayed for $actualDuration.")
    }
}
Produces the following logs when run iteratively over ~1 minutes, so this is a common occurrence.
Copy code
Supposed to delay for 998.612601ms, delayed for 998.557767ms.
Supposed to delay for 998.891639ms, delayed for 998.538202ms.
Supposed to delay for 998.744499ms, delayed for 998.735252ms.
Supposed to delay for 998.929048ms, delayed for 998.500404ms.
Supposed to delay for 998.832210ms, delayed for 998.619967ms.
Supposed to delay for 998.589884ms, delayed for 998.523756ms.
Supposed to delay for 998.921973ms, delayed for 998.628101ms.
Supposed to delay for 998.890982ms, delayed for 998.699278ms.
Supposed to delay for 998.776822ms, delayed for 998.580276ms.
Supposed to delay for 998.765377ms, delayed for 998.495635ms.
Supposed to delay for 998.642242ms, delayed for 998.615320ms.
Supposed to delay for 998.962181ms, delayed for 998.829443ms.
Supposed to delay for 998.788745ms, delayed for 998.536458ms.
I am personally surprised by this. I would have expected delays to always suspend for an amount of time equal to or greater than the passed in duration, but never less than. Is that an unreasonable expectation?
Some context: I'm playing with implementing a delay based implementation of a
cron
-like API to schedule jobs.
The symptom of this issue is that some delay resumes before it's supposed to and the next iteration of a schedule occurs almost immediately because the next job iteration schedules itself in the next ~1-5ms.
I recognize that delay based behaviors quickly bottom out in platform timing guarantees but I suspect that this common of an occurrence is not ideal?
Executing on
linuxX64
K/N seems even less accurate.
Copy code
Supposed to delay for 998.913685ms, delayed for 998.282396ms, difference of 631.289us.
    Supposed to delay for 999.682193ms, delayed for 999.291051ms, difference of 391.142us.
    Supposed to delay for 998.704951ms, delayed for 998.309983ms, difference of 394.968us.
    Supposed to delay for 998.844689ms, delayed for 998.286725ms, difference of 557.964us.
    Supposed to delay for 998.911791ms, delayed for 998.283234ms, difference of 628.557us.
Is potentially resuming a delay for upwards of half a ms before intended considered expected behavior?
JVM seems to be off by up to ~200us for comparison.
the
runBlocking
event loop seems to suffer from the same problem on the JVM.
Copy code
Supposed to delay for 997.837087ms, delayed for 997.760069ms, difference of 77.018us.
Supposed to delay for 998.845540ms, delayed for 998.381947ms, difference of 463.593us.
Supposed to delay for 998.618864ms, delayed for 998.390328ms, difference of 228.536us.
Supposed to delay for 998.847573ms, delayed for 998.409186ms, difference of 438.387us.
s
All of these differences look like sub-millisecond, so I wonder if it could just be due to precision. For example,
Instant.now()
only has millisecond precision on the JVM I'm looking at right now.
k
Perhaps I should limit the scope of the loop to be just
delay(Long)
to narrow it down
One moment.
Perhaps you're correct! Thanks for the tip. I think I'm unable to replicate without the integration of
Instant
. Seems the bug is mine after all 😄
Actually, just kidding. Not solved. The following code
Copy code
@Test fun foo() = runTest(timeout = 1.days) {
        withContext(Dispatchers.Default) {
            while (true) {
                val expectedDuration = 998.milliseconds + 902752.nanoseconds
                val actualDuration = measureTime { delay(expectedDuration) }
                val difference = actualDuration - expectedDuration
                println("Supposed to delay for $expectedDuration, delayed for $actualDuration, difference of $difference.")
            }
        }
    }
Sometimes does not delay for the required duration. Here are the following logs for JVM on my Linux box.
Copy code
Supposed to delay for 998.902752ms, delayed for 1.000313484s, difference of 1.410732ms.
Supposed to delay for 998.902752ms, delayed for 998.778686ms, difference of -124.066us.
Supposed to delay for 998.902752ms, delayed for 999.019637ms, difference of 116.885us.
Supposed to delay for 998.902752ms, delayed for 998.660794ms, difference of -241.958us.
Supposed to delay for 998.902752ms, delayed for 998.949447ms, difference of 46.695us.
Supposed to delay for 998.902752ms, delayed for 998.573213ms, difference of -329.539us.
Supposed to delay for 998.902752ms, delayed for 998.600730ms, difference of -302.022us.
Supposed to delay for 998.902752ms, delayed for 999.076977ms, difference of 174.225us.
Supposed to delay for 998.902752ms, delayed for 998.584946ms, difference of -317.806us.
Anything negative resumes prior to when it's expected.
And here are the logs for K/N linuxX64
Copy code
Supposed to delay for 998.902752ms, delayed for 998.751622ms, difference of -151.13us.
    Supposed to delay for 998.902752ms, delayed for 998.394454ms, difference of -508.298us.
    Supposed to delay for 998.902752ms, delayed for 998.351222ms, difference of -551.53us.
    Supposed to delay for 998.902752ms, delayed for 998.766148ms, difference of -136.604us.
    Supposed to delay for 998.902752ms, delayed for 998.434123ms, difference of -468.629us.
    Supposed to delay for 998.902752ms, delayed for 998.680873ms, difference of -221.879us.
    Supposed to delay for 998.902752ms, delayed for 998.431539ms, difference of -471.213us.
    Supposed to delay for 998.902752ms, delayed for 998.816225ms, difference of -86.527us.
Oh, I think I know the issue.
Delays are only ever supposed to have millisecond precision. Duration can represent nanoseconds, but when calling
delay(Duration)
the value is converted to milliseconds and delegated to
delay(Long)
. This loses the nanosecond granularity because the only coercion done is to ensure that
Duration.inWholeMilliseconds
doesn't produce a zero value.
Copy code
internal fun Duration.toDelayMillis(): Long =
    if (this > Duration.ZERO) inWholeMilliseconds.coerceAtLeast(1) else 0
I think what this should really be doing is rounding nanoseconds up to the nearest millisecond so as to not lose nanoseconds which are important. Logs which show this:
Copy code
Supposed to delay for 992.759049ms, delayed for 992.691033ms, difference of 68.016us.
    now: 2023-10-21T15:31:06.007240951Z
    delayUntil: 2023-10-21T15:31:07Z
    delayMillis: 992
If I change a local implementation of that coercion to:
Copy code
private fun Duration.myToMillis(): Long {
    val millis = inWholeMilliseconds
    return if (millis * 1_000_000 < inWholeNanoseconds) millis + 1 else millis
}
Which rounds up the nanosecond, everything works as expected.
@Dmitry Khalanskiy [JB] Sorry to tag directly, but does this seem like a reasonable change? I'm happy to make a PR if so.
Going to open a github issue rather than continue this in Slack.