# coroutines
j
Hi! Is there a reason why `DefaultDispatcher-worker` threads would keep getting spawned without any limits? I've been debugging an Android app which is supposed to run in the background and periodically execute tasks that could be async (the async code uses coroutines). After running the app for a long time, it gets bloated with `DefaultDispatcher-worker` threads (I can see an increasing number of threads in the Profiler, and increasing ids when I print their names) and eventually crashes with an `OutOfMemoryError` (it happened once in a coroutine running on a `DefaultDispatcher-worker-739` thread; yes, 739). The coroutine handling in the app is quite poor, yes, and that's what I'm trying to fix, among other things; we mostly use the IO dispatcher. The only dependency we use that comes with a suspend interface, as far as I'm aware, is Ktor. There's also a place where we create a `newSingleThreadContext` and run a coroutine on that single thread. The single thread doesn't seem to be an issue, though: the context gets closed once it's no longer needed, and I don't see any zombie threads from that part of the code stacking up over time. So at this moment it looks like whatever is spawning `DefaultDispatcher-worker` threads is causing the problem. I thought, however, that both `Dispatchers.Default` and `Dispatchers.IO` are built on a limited thread pool, so I can't understand how hundreds of such threads could get spawned.
d
Your understanding is correct: by default, `Dispatchers.Default` has as many threads as there are CPU cores, and `Dispatchers.IO` has 64 threads, plus (at most) as many extra threads as `limitedParallelism` calls require. For example, this program uses at most 70 threads for `Dispatchers.IO`:
```kotlin
// 64 (default IO limit) + 2 + 4 = at most 70 threads in total.
val worker1 = Dispatchers.IO.limitedParallelism(2)
val worker2 = Dispatchers.IO.limitedParallelism(4)

suspend fun main() = coroutineScope {
  repeat(1000) {
    launch(worker1) { while (true) { } }
    launch(worker2) { while (true) { } }
    launch(Dispatchers.IO) { while (true) { } }
  }
}
```
There are caveats, though, as always.
• There are system properties that affect this. For example, `kotlinx.coroutines.io.parallelism` could be set to 100000, and then threads would spawn effectively unrestricted. I'd suggest running `git grep` for this property in the project.
• `limitedParallelism` could in theory be misused. This code has a bug:
```kotlin
// Intended to use at most two threads in total for all calls, but each
// invocation creates a *fresh* limited view of Dispatchers.IO, so the
// limits don't add up across calls.
fun CoroutineScope.runInATwoThreadedWorker(block: suspend () -> Unit) {
  launch(Dispatchers.IO.limitedParallelism(2)) {
    block()
  }
}
```
Calling `runInATwoThreadedWorker` repeatedly will create arbitrarily many threads, because every call gets its own two-thread allowance. So, I would also audit all `limitedParallelism` usages.
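A minimal sketch of the fixed pattern (the names here are illustrative, not from the thread): hoist the limited view into a single `val` so that every call shares the same two-thread budget.

```kotlin
import kotlinx.coroutines.*

// Create the limited view once; all launches through it share a single
// two-thread budget instead of each call getting a fresh one.
@OptIn(ExperimentalCoroutinesApi::class)
val twoThreadedWorker = Dispatchers.IO.limitedParallelism(2)

fun CoroutineScope.runInATwoThreadedWorker(block: suspend () -> Unit) {
    launch(twoThreadedWorker) { block() }
}
```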
j
Oh, I wasn't aware that `limitedParallelism` worked like this 🤔 Thanks for the tip, I've just learnt something new 😄 However, I haven't found any usages of it, nor of `kotlinx.coroutines.io.parallelism`, in the project. If these are the only ways to configure the dispatchers to different limits, it could mean that the issue may come from the dependencies, couldn't it?
d
> I wasn't aware that `limitedParallelism` worked like this 🤔

It's a special property of `Dispatchers.IO` called elasticity: https://kotlinlang.org/api/kotlinx.coroutines/kotlinx-coroutines-core/kotlinx.coroutines/-dispatchers/-i-o.html

> the issue may come from the dependencies

The buggy usage of `limitedParallelism` could happen in them, yes. I'd try logging the value of these system properties during execution:
• `kotlinx.coroutines.io.parallelism`
• `kotlinx.coroutines.scheduler.core.pool.size`
• `kotlinx.coroutines.scheduler.max.pool.size`
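A minimal way to read them (the function name is made up; `System.getProperty` returns `null` for properties no one has set):

```kotlin
// Sketch: collect the scheduler-related system properties so they can
// be logged at app startup. Unset properties come back as null.
fun coroutineSchedulerProperties(): Map<String, String?> =
    listOf(
        "kotlinx.coroutines.io.parallelism",
        "kotlinx.coroutines.scheduler.core.pool.size",
        "kotlinx.coroutines.scheduler.max.pool.size",
    ).associateWith { System.getProperty(it) }
```

For example, `coroutineSchedulerProperties().forEach { (k, v) -> println("$k = $v") }` at startup would make an unexpected override visible immediately.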
j
> I'd try logging the value of these system properties during execution

All three of them are `null`. Is that expected, or did I just look for the wrong system properties? 😅
d
If you checked the properties correctly, it just means no one set them, which is good, as it rules them out as culprits. So, I don't see any options except for someone leaking threads via `limitedParallelism`. In that case, I'd expect to see `kotlinx.coroutines.internal.LimitedDispatcher.Worker#run` in the stack traces of these threads.
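A quick way to inspect those stacks from inside the app (the helper name is made up; it relies only on the JDK's `Thread.getAllStackTraces`):

```kotlin
// Sketch: print the stack of every DefaultDispatcher-worker thread so
// frames like kotlinx.coroutines.internal.LimitedDispatcher$Worker.run
// become visible. Returns the matching thread names for convenience.
fun dumpDispatcherWorkers(): List<String> {
    val workers = Thread.getAllStackTraces()
        .filterKeys { it.name.startsWith("DefaultDispatcher-worker") }
    for ((thread, frames) in workers) {
        println(thread.name)
        frames.take(8).forEach { println("    at $it") }
    }
    return workers.keys.map { it.name }
}
```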
j
Ok, I'll focus more on the external dependencies we're using and try to look for clues in the stack traces. Thanks for your help, it's much appreciated!
d
Sure thing. Please let us know when you discover the cause of the problem.
c
I've been following this thread silently, and it was very interesting. Are the tips listed here available somewhere else for future reference? (blog posts, documentation…)
j
> Please let us know when you discover the cause of the problem.

I think it might have been Ktor setting `limitedParallelism` on its own after all. I did find `kotlinx.coroutines.internal.LimitedDispatcher$Worker.run` together with some `io.ktor` entries in our stack traces and noticed we used an older version of Ktor (2.0.2). After I updated it to the latest (2.3.5), I no longer saw the worrying amounts of `DefaultDispatcher-worker` threads.
If there wasn't a new version of Ktor, would setting the `kotlinx.coroutines.io.parallelism` property in our app overrule any unexpected use of `limitedParallelism` in the dependency? It looks like a really delicate API that might lead to quite erroneous situations, especially if used in dependencies that are out of the developer's control. It would be good to know if there is a way to have the last word in the end product and be able to set a hard limit to simply ignore any misuse in dependencies.
d
> If there wasn't a new version of Ktor, would setting the `kotlinx.coroutines.io.parallelism` property in our app overrule any unexpected use of `limitedParallelism` in the dependency?

It wouldn't. The whole point of `Dispatchers.IO` elasticity is to be able to do things like

```kotlin
val databaseConnection = Dispatchers.IO.limitedParallelism(200)
val fileReading = Dispatchers.IO.limitedParallelism(10)
```

and not have these different views fight with one another for threads. `limitedParallelism` is supposed to be used instead of manually created thread pools.
> It looks like it's a really delicate API that might lead to quite erroneous situations, especially if used in dependencies that are out of the developer's control.

The same can be said about any usage of thread pools.

> It would be good to know if there is a way to have the last word in the end product and be able to set a hard limit to simply ignore any misuse in dependencies.

The JVM itself doesn't allow this in general: no one can prevent a buggy dependency from spawning a thousand threads using `thread { }`, or from forgetting to close thread pools. Likewise, no one can prevent a buggy dependency from hogging all CPU time by flooding `Dispatchers.Default` with nonsense, and a buggy library could simply have a memory leak. Dependencies inevitably have many ways to cause resource starvation. I think adding a workaround for the specific issue of buggy `limitedParallelism` usage would add a lot of cognitive load around it, but wouldn't make a dent in this general class of bugs.
j
Ok, these are fair points indeed 👍