Julia Samól
10/31/2023, 9:18 AM
Is it possible that `DefaultDispatcher-worker` threads would keep getting spawned without any limits?
I’ve been debugging an Android app that is supposed to run in the background and periodically execute tasks that could be async (the async code uses coroutines). After running for a long time, the app appears to get bloated with `DefaultDispatcher-worker` threads (I can see an increasing number of threads in the Profiler and increasing thread ids when I print their names) and eventually crashes with the OutOfMemory error (it happened once in a coroutine running on a `DefaultDispatcher-worker-739` thread - yes, 739).
The coroutine handling in the app is quite poor, yes - that’s what I’m trying to fix, among other things - and we mostly use the IO dispatcher. The only dependency with a suspend interface that I’m aware of is Ktor. There’s also a place where we create a `newSingleThreadContext` and run a coroutine on that single thread. The single thread doesn’t seem to be an issue, though: the context gets closed once it’s no longer needed, and I don’t see any zombie threads from that part of the code stacking up over time.
So at this moment it looks like whatever is spawning `DefaultDispatcher-worker` threads is causing the problem. I thought, however, that both `Dispatchers.Default` and `Dispatchers.IO` are built on a limited thread pool, so I can’t understand how hundreds of such threads could get spawned.
Dmitry Khalanskiy [JB]
10/31/2023, 9:43 AM
`Dispatchers.Default` has as many threads as there are CPU cores, and `Dispatchers.IO` has 64 threads + (at most) as many threads as `limitedParallelism` calls require. For example, this program uses at most 70 threads (64 + 2 + 4) for `Dispatchers.IO`:
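import kotlinx.coroutines.*

val worker1 = Dispatchers.IO.limitedParallelism(2)
val worker2 = Dispatchers.IO.limitedParallelism(4)

suspend fun main(): Unit = coroutineScope {
    repeat(1000) {
        // Busy loops never suspend, so each running coroutine pins one thread.
        launch(worker1) { while (true) { } }          // at most 2 threads
        launch(worker2) { while (true) { } }          // at most 4 threads
        launch(Dispatchers.IO) { while (true) { } }   // at most 64 threads
    }
}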
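A minimal sketch for checking such a bound from inside a running process: count the live JVM threads whose names start with a given prefix. The helper name `countThreadsWithPrefix` is hypothetical (it is not part of kotlinx.coroutines) and relies only on the standard `Thread.getAllStackTraces()`. The `DefaultDispatcher-worker` prefix is the thread name reported in the question above; the count covers workers running `Dispatchers.Default` and `Dispatchers.IO` tasks alike, so it can exceed the `Dispatchers.IO` bound on its own.

```kotlin
// Hypothetical helper (not a kotlinx.coroutines API): counts live threads by name prefix.
fun countThreadsWithPrefix(prefix: String): Int =
    Thread.getAllStackTraces().keys.count { it.name.startsWith(prefix) }

fun main() {
    // Thread name prefix taken from the report above.
    val workers = countThreadsWithPrefix("DefaultDispatcher-worker")
    println("Live DefaultDispatcher-worker threads: $workers")
}
```

Logging this number periodically should show whether the worker count plateaus or keeps growing.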
Dmitry Khalanskiy [JB]
10/31/2023, 9:48 AM
• `kotlinx.coroutines.io.parallelism` could be set to 100000, then threads would spawn effectively unrestricted. I'd suggest running `git grep` for this property in the project.
• `limitedParallelism` could in theory be misused. This code has a bug:
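// Intended to use at most two threads in total for all calls:
fun runInATwoThreadedWorker(block: suspend () -> Unit) {
    // Bug: every call creates a new view of Dispatchers.IO with its own limit of 2.
    CoroutineScope(Dispatchers.IO.limitedParallelism(2)).launch {
        block()
    }
}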
Calling `runInATwoThreadedWorker` will create arbitrarily many threads. So, I would also audit all `limitedParallelism` usages.
Julia Samól
10/31/2023, 10:12 AM
I wasn’t aware that `limitedParallelism` worked like this 🤔 Thanks for the tip, I’ve just learnt something new 😄 However, I haven’t found any usages of it or of `kotlinx.coroutines.io.parallelism` in the project. If these are the only ways to configure the dispatchers to different limits, it could mean that the issue comes from the dependencies, couldn’t it?
Dmitry Khalanskiy [JB]
10/31/2023, 10:18 AM
> I wasn’t aware that `limitedParallelism` worked like this 🤔
It's a special property of `Dispatchers.IO` called elasticity: https://kotlinlang.org/api/kotlinx.coroutines/kotlinx-coroutines-core/kotlinx.coroutines/-dispatchers/-i-o.html
> the issue may come from the dependencies
The buggy usage of `limitedParallelism` could happen in them, yes.
I'd try logging the value of these system properties during execution:
• kotlinx.coroutines.io.parallelism
• kotlinx.coroutines.scheduler.core.pool.size
• kotlinx.coroutines.scheduler.max.pool.size
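A small sketch of that logging step. The helper name `coroutineSchedulerProperties` is hypothetical; the three property names are the ones listed above, read with the standard `System.getProperty`. A `null` value only means that the property was not set explicitly.

```kotlin
// Property names as listed above; the helper itself is hypothetical.
val schedulerPropertyNames = listOf(
    "kotlinx.coroutines.io.parallelism",
    "kotlinx.coroutines.scheduler.core.pool.size",
    "kotlinx.coroutines.scheduler.max.pool.size",
)

// Returns each property name mapped to its current value, or null if it is not set.
fun coroutineSchedulerProperties(): Map<String, String?> =
    schedulerPropertyNames.associateWith { System.getProperty(it) }

fun main() {
    coroutineSchedulerProperties().forEach { (name, value) ->
        println("$name = ${value ?: "<not set>"}")
    }
}
```

Since a dependency could in principle set these properties at runtime, it may be worth logging them late in the app's lifetime rather than only at startup.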
Julia Samól
10/31/2023, 10:27 AM
> I’d try logging the value of these system properties during execution
All three of them are `null`. Is that expected, or did I just look for the wrong system properties? 😅
Dmitry Khalanskiy [JB]
10/31/2023, 10:34 AM
[…] `limitedParallelism`. In this case, I'd expect `kotlinx.coroutines.internal.LimitedDispatcher.Worker#run` in the stack traces of these threads.
Julia Samól
10/31/2023, 10:39 AM
Dmitry Khalanskiy [JB]
10/31/2023, 10:40 AM
CLOVIS
10/31/2023, 6:56 PM
Julia Samól
11/01/2023, 8:25 AM
> Please let us know when you discover the cause of the problem.
I think it might have been Ktor using `limitedParallelism` on its own after all. I did find `kotlinx.coroutines.internal.LimitedDispatcher$Worker.run` together with some `io.ktor` entries in our stack traces and noticed we were using an older version of Ktor (2.0.2). After I updated it to the latest (2.3.5), I no longer saw the worrying numbers of `DefaultDispatcher-worker` threads.
Julia Samól
11/01/2023, 8:43 AM
If there wasn’t a new version of Ktor, would setting the `kotlinx.coroutines.io.parallelism` property in our app overrule any unexpected use of `limitedParallelism` in the dependency?
It looks like it’s a really delicate API that might lead to quite erroneous situations, especially if used in dependencies that are out of developer’s control. It would be good to know if there is a way to have the last word in the end product and be able to set a hard limit to simply ignore any misuse in dependencies.
Dmitry Khalanskiy [JB]
11/02/2023, 9:04 AM
> If there wasn’t a new version of Ktor, would setting the `kotlinx.coroutines.io.parallelism` property in our app overrule any unexpected use of `limitedParallelism` in the dependency?
It wouldn't. The whole point of `Dispatchers.IO` elasticity is to be able to do things like
val databaseConnection = Dispatchers.IO.limitedParallelism(200)
val fileReading = Dispatchers.IO.limitedParallelism(10)
and not have these different views fight with one another for threads. `limitedParallelism` is supposed to be used instead of manually created thread pools.
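A sketch of that pattern, assuming kotlinx.coroutines is on the classpath: the view is created once and every call reuses it, so the limit of two applies across all calls rather than per call. The names `sharedTwoThreadedWorker` and `runInSharedTwoThreadedWorker` are hypothetical. On library versions where `limitedParallelism` is still marked experimental, the compiler may emit an opt-in warning.

```kotlin
import kotlinx.coroutines.CoroutineDispatcher
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.async
import kotlinx.coroutines.awaitAll
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.withContext

// Created once: all callers share this single view of Dispatchers.IO.
val sharedTwoThreadedWorker: CoroutineDispatcher = Dispatchers.IO.limitedParallelism(2)

// Hypothetical helper: runs the block on the shared view instead of creating a new view per call.
suspend fun <T> runInSharedTwoThreadedWorker(block: suspend () -> T): T =
    withContext(sharedTwoThreadedWorker) { block() }

fun main() = runBlocking {
    // 100 concurrent calls, but at most two of them run in parallel on the view.
    val results = (1..100).map { i ->
        async { runInSharedTwoThreadedWorker { i * 2 } }
    }.awaitAll()
    println(results.sum())
}
```

Keeping the view in a top-level `val` (or another long-lived holder) is the design choice that matters here: a view created inside the function body would bring its own limit with every call.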
> It looks like it’s a really delicate API that might lead to quite erroneous situations, especially if used in dependencies that are out of developer’s control.
The same can be said about any usage of thread pools.
> It would be good to know if there is a way to have the last word in the end product and be able to set a hard limit to simply ignore any misuse in dependencies.
The JVM itself doesn't allow this in general: no one can prevent a buggy dependency from spawning a thousand threads using `thread { }` or from forgetting to close thread pools. Likewise, no one can prevent a buggy dependency from hogging all CPU time by flooding `Dispatchers.Default` with nonsense. Likewise, a buggy library could have a memory leak. Dependencies do inevitably have a lot of ways to cause resource starvation. I think adding a workaround for the specific issue of buggy usage of `limitedParallelism` would add a lot of cognitive load around it, but wouldn't make a dent in this general class of bugs.
Julia Samól
11/02/2023, 9:23 AM