I'd love this community's feedback on my guidance document f... kotlinlang #coroutines

Don Mitchell

12/21/2024, 5:56 AM

I'd love this community's feedback on my guidance document for how to use JVM coroutines in batch processing. A lot of this guidance comes from having made wrong assumptions about how coroutines work, esp. vis-a-vis error handling, the number of stacks being worked on, and nested launches. It's intended for my department, where we do a lot of batch processing, and it assumes the reader knows that. What I'm mostly trying to find out is where I'm still wrong.

CLOVIS

12/21/2024, 9:28 AM

> Coroutines using Dispatchers.IO will switch out the running process and reactivate a ready one whenever your code does IO

That's not accurate. `Dispatchers.IO` does nothing fancy (unlike Loom), it's just a thread pool where it's not important if stuff blocks.

👀 1

CLOVIS

12/21/2024, 9:30 AM

To rate limit, you can also just use a `Semaphore(100)`, which will let through 100 requests maximum at a time, and is very cheap (it's essentially just an atomic integer)
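
A minimal sketch of the semaphore idea, assuming kotlinx.coroutines' `Semaphore` and `withPermit`. The names `fetchAccount` and `fetchAll` are hypothetical stand-ins for a real remote call. Note that this caps how many calls are in flight at once, not how many are made per second.

```kotlin
import kotlinx.coroutines.async
import kotlinx.coroutines.awaitAll
import kotlinx.coroutines.coroutineScope
import kotlinx.coroutines.delay
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.sync.Semaphore
import kotlinx.coroutines.sync.withPermit

// Hypothetical stand-in for a real remote call.
suspend fun fetchAccount(id: Int): String {
    delay(10) // simulates latency by suspending, not by blocking a thread
    return "account-$id"
}

// Runs fetchAccount for every id, with at most maxConcurrent calls in flight.
suspend fun fetchAll(ids: List<Int>, maxConcurrent: Int): List<String> = coroutineScope {
    val permits = Semaphore(maxConcurrent)
    ids.map { id ->
        async {
            // Suspends here while no permit is free; the permit is released afterwards.
            permits.withPermit { fetchAccount(id) }
        }
    }.awaitAll() // results come back in the same order as ids
}

fun main() = runBlocking {
    val results = fetchAll((1..1000).toList(), maxConcurrent = 100)
    println(results.size) // 1000
}
```

Starting 1000 coroutines is cheap; the semaphore only limits how many of them are inside `fetchAccount` at the same moment.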

✅ 1

Sam

12/21/2024, 9:32 AM

> When making API or db calls, write your code so that it doesn't just hang waiting for return values from remote services.

I think this risks confusing your reader. The point of coroutines is that you can write code that waits around for stuff, so long as you do so by suspending rather than blocking. If I've understood what you're trying to say, it's actually something more like this: "When your code suspends to wait for something, Kotlin will look for other coroutines to fill the waiting time, so make sure it finds some. These could be other coroutines within the current function, or they could be entirely separate tasks elsewhere in the application."

> Do not nest runBlocking on the stack.

Yes, great advice 👍. You have a function in the adjacent code example that's not quite right, though.

> private suspend fun grandchildMethod(bar: BarThing) {
>   coroutineScope {
>     launch(Dispatchers.IO) {
>       doItInParallel(bar)
>     }
>   }
> }

There are a couple of problems here.
• You mention using `coroutineScope` to "get the containing scope". That's not totally accurate. What it really does is create a new scope as a child of the current coroutine. It's a subtle difference, but it has important implications when it comes to things like error handling and cancellation.
• You're creating a `coroutineScope` to launch just one coroutine inside it. That's not going to be useful: `launch` runs code in the background, and `coroutineScope` waits for it to complete, so the net result is the same as if neither function was there. If you need to switch dispatchers, just use `withContext`. But in fact, you probably don't need to switch dispatchers; more on that in a moment.

> If launching coroutines within coroutines […], ensure they have separate coroutine pools so they don't deadlock.

Thankfully, this isn't true! Coroutines are absolutely designed to be used within coroutines. Deadlocks are caused by blocking threads inside coroutines, not by launching coroutines inside coroutines. You already covered the most common culprit with your advice against nested `runBlocking`. There's no need to switch to a new dispatcher when starting a new coroutine. (To me, `limitedParallelism` is an advanced topic: I've never had an occasion to use it in my many years of coroutining.) Dispatchers aside, your `doit` function in the subsequent code example is an excellent example of concurrent decomposition 👍.

> If you're trying to control external resource use (e.g., to not exceed an API rate limit), you probably want a semaphore and `withPermit`

Well, yes, you can do that, but I feel like you already gave a better solution in your linked blog post about actors and queues. I'd much rather use an actor than a mutex to manage a shared resource. If you do choose to recommend both solutions, it might be helpful to also talk about how they compare.

> If you don't want an exception in one process to cancel the others, you need to either catch all exceptions within or below the launch or async or use `supervisorScope` and a `CoroutineExceptionHandler`.

Broadly true, but perhaps leading the reader down a wrong path. There are occasionally valid reasons to avoid cancelling child coroutines immediately on failure, but I'd avoid encouraging it in the general case. I'd also avoid encouraging the suppression of exceptions. A supervisor scope is actually not designed for use with `launch`, for that exact reason. It's only supposed to be used with `async`, where errors can ultimately be rethrown by a later `await`. For your `accounts`-processing code example, you already gave the correct solution: catch and ignore the error inside the `launch` block, if ignoring the error is really what you want to do. (As an aside, rethrowing an exception from a `CoroutineExceptionHandler` has no effect, as you're already at the top of the coroutine stack, though of course you're right that rethrowing cancellation exceptions is important everywhere else.)
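
A minimal sketch of two of the points above: `coroutineScope` earns its keep when it launches several children, and a failure can be contained by catching it inside the `launch` block while still rethrowing cancellation. The names `processAccount` and `processAll` are hypothetical, and account 3 fails on purpose for illustration; this is not the code from the guidance document.

```kotlin
import kotlinx.coroutines.CancellationException
import kotlinx.coroutines.coroutineScope
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking
import java.util.concurrent.ConcurrentLinkedQueue

// Hypothetical per-account work; account 3 fails to show the isolation.
suspend fun processAccount(id: Int) {
    if (id == 3) throw IllegalStateException("account $id failed")
}

// Processes accounts concurrently and returns the ids that failed.
suspend fun processAll(ids: List<Int>): List<Int> {
    val failed = ConcurrentLinkedQueue<Int>()
    coroutineScope {
        for (id in ids) {
            launch {
                try {
                    processAccount(id)
                } catch (e: CancellationException) {
                    throw e // never swallow cancellation
                } catch (e: Exception) {
                    failed.add(id) // record and carry on; siblings keep running
                }
            }
        }
    } // returns only after every launched child has finished
    return failed.sorted()
}

fun main() = runBlocking {
    println(processAll(listOf(1, 2, 3, 4))) // [3]
}
```

Because the exception never escapes the `launch` block, the scope is not cancelled and no `supervisorScope` or `CoroutineExceptionHandler` is needed.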

👍 1

Don Mitchell

12/23/2024, 1:43 PM

Wow, thanks so much for the clarifications. One thing I'm not grokking is @Sam's statement about not using `coroutineScope`. How can I call `launch` without having a `CoroutineScope` upon which to call it? Do I need to pass the `CoroutineScope` down the stack? I guess I mistakenly believed `coroutineScope` was merely acting like nested `Database.withTransaction()` by searching the stack for the parent `CoroutineScope` and making that available locally. So, how could I rewrite the `grandchildMethod` to use `launch` and have it be a direct child of the grandparent method's `runBlocking` `this: CoroutineScope`?

CLOVIS

12/23/2024, 1:59 PM

Passing the `CoroutineScope` down to children is one option, but it can easily be confusing; `suspend fun` means "when I'm done, all my child operations are also done", whereas if you pass a `CoroutineScope`, the function you're calling could create a coroutine that you're responsible for.
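
A minimal sketch of that contrast, with hypothetical names (`refreshAll`, `startRefresh`, and a shared `log` list used only to observe ordering). It assumes everything runs on `runBlocking`'s single thread, which is why a plain mutable list is safe here.

```kotlin
import kotlinx.coroutines.CoroutineScope
import kotlinx.coroutines.Job
import kotlinx.coroutines.coroutineScope
import kotlinx.coroutines.delay
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking

// Only touched from runBlocking's thread in this sketch.
val log = mutableListOf<String>()

// suspend fun: by the time this returns, all the work it started is done.
suspend fun refreshAll(ids: List<Int>) {
    coroutineScope {
        ids.forEach { id ->
            launch {
                delay(10)
                log.add("refreshed $id")
            }
        }
    }
}

// CoroutineScope extension: returns immediately; the caller's scope owns the new coroutine.
fun CoroutineScope.startRefresh(id: Int): Job = launch {
    delay(10)
    log.add("refreshed $id in background")
}

fun main() = runBlocking {
    refreshAll(listOf(1, 2))
    println(log.size) // 2: both children finished before refreshAll returned

    val job = startRefresh(3)
    println(log.size) // still 2: the coroutine was started but has not run yet
    job.join()        // the caller has to wait for it, or let its scope do so
    println(log.size) // 3
}
```

The signature tells the caller which contract applies: a `suspend fun` cleans up after itself, while a function that takes or extends `CoroutineScope` leaves a running coroutine behind.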

Don Mitchell

12/23/2024, 3:32 PM

Yeah, that makes sense and is what I was intentionally doing: having subordinate methods figure out the work the top `runBlocking` is responsible for. I guess the paradigmatic coroutine pattern is that the `launch` is directly in the `runBlocking` block, not in any subordinate methods (methods further down the stack); so, if we want the "subject" of a coroutine to be each of our 2000 accounts (one account per suspendable stack), we should just wrap the account iteration with `runBlocking` and immediately `launch` per account under that. Right now, we make a lot of decisions as to what processing we need depending on the account type and status. Originally we were passing the `CoroutineScope` down the stack and calling `launch` on it once we figured out what (if any) processing the account needed; thus, we were doing the process determination on the `main` thread. I suppose doing the `launch` is simpler and makes sense.

Don Mitchell

12/23/2024, 3:38 PM

Given that clarification, I want to get back to clarifying `Dispatchers.IO` and `suspend`. First on `Dispatchers.IO`, I had the belief that kotlin's processor wrapped every http send w suspend if done within a `Dispatchers.IO` coroutine w resumption after response on the first available coroutine. You're saying I'm wrong and that the only suspensions will be specific calls to `yield` or something else and the only distinction in the dispatchers is the pool?

CLOVIS

12/23/2024, 3:41 PM

> First on `Dispatchers.IO`, I had the belief that kotlin's processor wrapped every http send w suspend if done within a `Dispatchers.IO` coroutine w resumption after response on the first available coroutine.

No, the Kotlin compiler does no such thing. Some frameworks (including Ktor) will internally use `withContext(Dispatchers.IO)`, just like you would in your own code. There is no magic here. Also, `Dispatchers.IO` isn't special, it's just a built-in thread pool where it's not important if threads are blocked by I/O. The only "magic" is that `Dispatchers.IO` and `Dispatchers.Default` share some threads.
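
A minimal sketch of what "just like you would in your own code" can look like. `blockingLookup` is a hypothetical stand-in for a JDBC query or a blocking HTTP client; `lookup` is likewise an illustrative name.

```kotlin
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.withContext

// Hypothetical blocking call, standing in for JDBC or a blocking HTTP client.
fun blockingLookup(id: Int): String {
    Thread.sleep(50) // blocks whichever thread runs it; nothing suspends here
    return "row-$id"
}

// Moves the blocking call onto the IO pool so the caller's thread stays free.
suspend fun lookup(id: Int): String = withContext(Dispatchers.IO) {
    blockingLookup(id)
}

fun main() = runBlocking {
    println(lookup(1)) // row-1
}
```

The IO thread is still blocked for the full 50 ms; the only thing that changes is which thread pays for it. The caller suspends at `withContext` and resumes once the block completes.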

CLOVIS

12/23/2024, 3:42 PM

> You're saying I'm wrong and that the only suspensions will be specific calls to `yield` or something else and the only distinction in the dispatchers is the pool?

I'm not sure what you're asking here… Suspension always happens on a `suspendCoroutine` call, which usually happens deep inside the internal machinery of kotlinx.coroutines. There are many ways users can trigger this: essentially all `suspend` functions deep down are a way to call that method.
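
A minimal sketch of a suspension point written by hand, assuming a callback-style API. `fetchAsync` is a hypothetical function that answers on another thread; `fetch` bridges it into a `suspend` function with the standard library's `suspendCoroutine`.

```kotlin
import kotlinx.coroutines.runBlocking
import kotlin.concurrent.thread
import kotlin.coroutines.resume
import kotlin.coroutines.suspendCoroutine

// Hypothetical callback-style API that answers on another thread.
fun fetchAsync(id: Int, onResult: (String) -> Unit) {
    thread {
        Thread.sleep(20)
        onResult("value-$id")
    }
}

// The coroutine suspends inside suspendCoroutine and continues when resume is called.
suspend fun fetch(id: Int): String = suspendCoroutine { cont ->
    fetchAsync(id) { value -> cont.resume(value) }
}

fun main() = runBlocking {
    println(fetch(5)) // value-5
}
```

While `fetch` is suspended, the calling thread is free to run other coroutines. For code that must react to cancellation, kotlinx.coroutines also provides `suspendCancellableCoroutine`, which is not shown here.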

CLOVIS

12/23/2024, 3:46 PM

Maybe you're confusing `Dispatchers.IO` with the JDK's Project Loom, which can magically transform blocking calls into asynchronous ones?

👍 1

Don Mitchell

12/24/2024, 3:10 PM

I found https://github.com/kotlin-orm/ktorm/discussions/537 as a pattern for suspending on IO. Is this correct? Are there better patterns? Can the coroutine scheduler decide to interrupt any `suspend` block at any time to switch to another (akin to OS context switching) or does it need a specific trigger like `suspendCoroutine, async, withContext, ...`? I know I've seen coroutine blocks suspend and then resume on another thread without doing any of those things. I thought it was on IO, but given this discussion I must be wrong or just "lucky".

CLOVIS

12/24/2024, 3:40 PM

> I found https://github.com/kotlin-orm/ktorm/discussions/537 as a pattern for suspending on IO. Is this correct? Are there better patterns?

It seems to me that they are just defining a custom function over `withContext`, no? If so, yeah, you can do that if you prefer, it's a taste thing.

CLOVIS

12/24/2024, 3:42 PM

> Can the coroutine scheduler decide to interrupt any `suspend` block at any time to switch to another (akin to OS context switching) or does it need a specific trigger like `suspendCoroutine, async, withContext, ...`?

Coroutines are not preemptable: a coroutine can only be suspended at specific suspension points, meaning when `suspendCoroutine` is called, which usually happens deep within `async`, `delay`, `yield`…

> I thought it was on IO, but given this discussion I must be wrong or just "lucky".

Well, it depends on the IO. If you're using Ktor, all IO is coroutines-aware, so all Ktor methods can suspend and resume on another thread. If it's something else, it really depends.
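
A minimal sketch of the "not preemptable" point, assuming two coroutines that share `runBlocking`'s single thread. `trace` is a hypothetical helper that records the order in which the coroutines' steps run, with and without an explicit `yield()`.

```kotlin
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.yield

// Runs two coroutines on runBlocking's single thread and records the order of their steps.
fun trace(cooperative: Boolean): List<String> {
    val steps = mutableListOf<String>()
    runBlocking {
        for (name in listOf("A", "B")) {
            launch {
                repeat(2) { i ->
                    steps.add("$name$i")
                    if (cooperative) yield() // explicit suspension point
                }
            }
        }
    }
    return steps
}

fun main() {
    println(trace(cooperative = false)) // [A0, A1, B0, B1]
    println(trace(cooperative = true))  // [A0, B0, A1, B1]
}
```

Without a suspension point, coroutine A runs to completion before B gets the thread at all; nothing interrupts it. With `yield()`, each coroutine hands the thread back after every step, so the two interleave.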
