MrPowerGamerBR
06/16/2019, 8:51 PMdelay(...)
stops working, no matter what dispatcher I use (Dispatchers.Default
, <http://Dispatchers.IO|Dispatchers.IO>
, custom dispatcher from a custom executor, etc). It just hangs on the delay
call and never continues.
The dispatcher/thread pool isn't overloaded, because the application is still alive and processing every request without any delay... except when there is a delay
call somewhere, then the code hangs)
Restarting the JVM magically fixes the issue, until after a few days it stops working again. Using JVM 12, Kotlin 1.3.21 + Coroutines 1.2.1 (Yes, I know, it is outdated, I didn't update yet because I couldn't get Kotlin's script engine working on 1.3.31... if reeeally needed I can update, but I decided to ask here before I do it to figure out if anyone knows what could cause this issue (or if it was already fixed))delay(...)
, no matter what dispatcher I use.
https://cdn.discordapp.com/attachments/393332226881880074/589923294916902961/SayRQ1OGN5.gif▾
https://cdn.discordapp.com/attachments/392476688614948897/589958622000185549/unknown.png▾
https://cdn.discordapp.com/attachments/392476688614948897/589958706410553356/unknown.png▾
delay
seems to have that strange issue. (even suspending functions works fine, except if they call delay
)elizarov
06/16/2019, 11:34 PMdelay
schedule. It is called “Default Executor”.MrPowerGamerBR
06/17/2019, 12:52 AMval coroutineExecutor = createThreadPool("Coroutine Executor Thread %d")
val coroutineDispatcher = coroutineExecutor.asCoroutineDispatcher()
fun createThreadPool(name: String): ExecutorService {
return Executors.newCachedThreadPool(ThreadFactoryBuilder().setNameFormat(name).build())
}
elizarov
06/17/2019, 12:54 AMdelay
in the way you describe.MrPowerGamerBR
06/17/2019, 12:55 AMsuspend
methods, etc), but not delayelizarov
06/17/2019, 12:58 AMdelay
its thread is automatically shutsdown, so next time you need to delay it needs to create thread……. But at this thread dump I see the thread up & runningMrPowerGamerBR
06/17/2019, 1:03 AMelizarov
06/17/2019, 1:03 AMjmap
?jmap -dump:live,format=b,file=heap.bin <pid>
MrPowerGamerBR
06/17/2019, 1:04 AMelizarov
06/17/2019, 1:06 AMMrPowerGamerBR
06/17/2019, 1:07 AMelizarov
06/17/2019, 1:08 AMMrPowerGamerBR
06/17/2019, 1:11 AMelizarov
06/17/2019, 1:12 AMdelay
in your code?MrPowerGamerBR
06/17/2019, 1:17 AMdelay
)elizarov
06/17/2019, 1:17 AMdelay(Long.MAX_VALUE)
to wait forever?MrPowerGamerBR
06/17/2019, 1:24 AMnow - 100 years in milliseconds
)elizarov
06/17/2019, 1:27 AMMrPowerGamerBR
06/17/2019, 1:31 AMelizarov
06/17/2019, 1:32 AMMrPowerGamerBR
06/17/2019, 1:33 AMelizarov
06/17/2019, 1:34 AMSystem.nanoTime()
(and jumps back and forth and something like that), but that kind of broken nano-time would wreak a lot of havoc elsewhere. I doubt a running system can have a problem with it.MrPowerGamerBR
06/17/2019, 1:34 AMelizarov
06/17/2019, 1:38 AMMrPowerGamerBR
06/17/2019, 1:42 AMelizarov
06/17/2019, 1:43 AMdelay
after all, but those threads were not available (missing) at the time?MrPowerGamerBR
06/17/2019, 1:48 AMExecutors.newCachedThreadPool(ThreadFactoryBuilder().setNameFormat(name).build())
Only Kotlin Coroutines uses that executor, so it can't be something else.
Also, wouldn't GC remove cached thread (that are unused) after 60s, causing them to create new threads (which will have a new ID assigined to them)?
And if it was the dispatcher, wouldn't using a different dispatcher work? (like Dispatchers.Default
or <http://Dispatchers.IO|Dispatchers.IO>
)elizarov
06/17/2019, 3:36 AMGlobalScope.launch(Dispatchers.Unconfined) {
println("(0) at ${Instant.now()}")
delay(1)
println("(1) at ${Instant.now()}")
}
Do you have console output of that process redirected somewhere?MrPowerGamerBR
06/17/2019, 10:03 AM(0)
line but never displays the (1)
elizarov
06/17/2019, 3:25 PMdelay
uses.MrPowerGamerBR
06/17/2019, 3:40 PM-Dkotlinx.coroutines.debug
to the JVM arguments
6. Added a task that keeps creating coroutines, waits 60s, then prints how much time it took and logs it.
7. Added a way to dump currently running coroutines (with the DebugProbe)
8. Enabled remote debugging
I know some of the changes doesn't have any relation to the issue, but I was like "let's try changing everything and see what sticks"elizarov
06/17/2019, 11:04 PMDefaultExecutor
object instance.MrPowerGamerBR
06/18/2019, 12:34 AMelizarov
06/19/2019, 9:01 PMMrPowerGamerBR
06/19/2019, 11:31 PMelizarov
06/20/2019, 3:42 AMdelay()
? In theory it should not be a problem, since delay
queue uses O(log n) algorithm.MrPowerGamerBR
06/20/2019, 1:26 PMDefaultExecutor
, while most of them I was able to, getting the _queue
and _delayed
values from the EventLoopImplBase
(which I guess are the ones that you will need) is kinda difficult, they don't show up on the declaredMembers
list, only on the declaredFields
list, which then returns a AtomicReferenceFieldUpdater
on the _queue$FU
field... but what are the variable's types? In Kotlin it is a atomic<Any?>(null)
with AtomicFu, but what would that be in Java itself?elizarov
06/20/2019, 1:29 PMObject
in Java. Atomics are erased during compilation.MrPowerGamerBR
06/20/2019, 5:08 PMAtomicReferenceFieldUpdater
has two types? <*, *>
... well, actually I don't need to know the two types because you can just cast it with <*, *>
, however I wanted to know what value should be used on the _queue.get(SomeObjectHere)
. I tried looking at the Kotlin Coroutines code but because it uses AtomicFu, it is just a _queue.value
callelizarov
06/20/2019, 7:30 PM_queue.value
is actually stored in _queue
field.MrPowerGamerBR
06/20/2019, 8:54 PM_delayed
and _queue
fields were null, but that's because I never spawned any coroutine task.
Anyway, here's some of the dumped variables from the EventLoopImplBase
EventLoopImplBase:
_queue = kotlinx.coroutines.internal.LockFreeTaskQueueCore@44170e37
Size: 0
_queue$FU = java.util.concurrent.atomic.AtomicReferenceFieldUpdater$AtomicReferenceFieldUpdaterImpl@671fab50
_delayed = kotlinx.coroutines.internal.ThreadSafeHeap@3d30a5ae
Size: 3300
_delayed$FU = java.util.concurrent.atomic.AtomicReferenceFieldUpdater$AtomicReferenceFieldUpdaterImpl@3ad4a630
isCompleted = false
_delayed
ThreadSafeHeap, but maybe that's a normal size if you are running a big application (my test instance returns 9
for that same var)elizarov
06/20/2019, 9:05 PMa
inside ThreadSafeHeap
instance. It contains instance of DelayedTask
. Can you dump all of them? I’d need index
and nanoTime
for each one to see if it is consistentMrPowerGamerBR
06/20/2019, 10:05 PMnanoTime
, there doesn't seem to be anything wrongelizarov
07/01/2019, 4:53 PMSystem.nanoTime()
is not monothonic at your system.delay(Long.MAX_VALUE)
, weren’t you?