# coroutines
d
I'd love this community's feedback on my guidance document for how to use JVM coroutines in batch processing. A lot of this guidance comes from having made wrong assumptions about how coroutines work, especially regarding error handling, the number of stacks being worked on, and nested launches. It's intended for my department, where we do a lot of batch processing, and it assumes the reader knows that. What I'm mostly trying to find out is where I'm still wrong.
c
> Coroutines using Dispatchers.IO will switch out the running process and reactivate a ready one whenever your code does IO

That's not accurate. `Dispatchers.IO` does nothing fancy (unlike Loom), it's just a thread pool where it's not important if stuff blocks.
To rate limit, you can also just use a `Semaphore(100)`, which will let through 100 requests maximum at a time, and is very cheap (it's essentially just an atomic integer).
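A minimal sketch of the semaphore approach, assuming kotlinx.coroutines is on the classpath. The names `fetchAccount` and `fetchAll` are hypothetical, and the `delay` stands in for a real remote call. Note that a semaphore caps how many calls are in flight at once; it does not by itself enforce a requests-per-second limit.

```kotlin
import kotlinx.coroutines.async
import kotlinx.coroutines.awaitAll
import kotlinx.coroutines.coroutineScope
import kotlinx.coroutines.delay
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.sync.Semaphore
import kotlinx.coroutines.sync.withPermit
import java.util.concurrent.atomic.AtomicInteger

// Hypothetical stand-in for a remote API call.
suspend fun fetchAccount(id: Int): String {
    delay(10)
    return "account-$id"
}

// Starts one coroutine per id, but lets at most maxConcurrent of them
// run fetchAccount at the same time. Returns the results (in input order)
// and the highest number of simultaneous calls that was observed.
suspend fun fetchAll(ids: List<Int>, maxConcurrent: Int): Pair<List<String>, Int> =
    coroutineScope {
        val semaphore = Semaphore(maxConcurrent)
        val inFlight = AtomicInteger(0)
        val peak = AtomicInteger(0)
        val results = ids.map { id ->
            async {
                // Suspends (does not block a thread) until a permit is free.
                semaphore.withPermit {
                    val now = inFlight.incrementAndGet()
                    peak.updateAndGet { maxOf(it, now) }
                    try {
                        fetchAccount(id)
                    } finally {
                        inFlight.decrementAndGet()
                    }
                }
            }
        }.awaitAll()
        results to peak.get()
    }

fun main() {
    val (results, peak) = runBlocking { fetchAll((1..20).toList(), maxConcurrent = 5) }
    println("fetched ${results.size} accounts, peak concurrency $peak")
}
```

The `inFlight` and `peak` counters exist only to make the limit observable; real code would keep just the `withPermit` wrapper.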
s
> When making API or db calls, write your code so that it doesn't just hang waiting for return values from remote services.

I think this risks confusing your reader. The point of coroutines is that you can write code that waits around for stuff, so long as you do so by suspending rather than blocking. If I've understood what you're trying to say, it's actually something more like this: "When your code suspends to wait for something, Kotlin will look for other coroutines to fill the waiting time, so make sure it finds some. These could be other coroutines within the current function, or they could be entirely separate tasks elsewhere in the application."

> Do not nest runBlocking on the stack.

Yes, great advice 👍. You have a function in the adjacent code example that's not quite right, though.
```kotlin
private suspend fun grandchildMethod(bar: BarThing) {
  coroutineScope {
    launch(Dispatchers.IO) {
      doItInParallel(bar)
    }
  }
}
```
There are a couple of problems here.
• You mention using `coroutineScope` to "get the containing scope". That's not totally accurate. What it really does is create a new scope as a child of the current coroutine. It's a subtle difference, but it has important implications when it comes to things like error handling and cancellation.
• You're creating a `coroutineScope` to launch just one coroutine inside it. That's not going to be useful: `launch` runs code in the background, and `coroutineScope` waits for it to complete, so the net result is the same as if neither function was there. If you need to switch dispatchers, just use `withContext`. But in fact, you probably don't need to switch dispatchers; more on that in a moment.

> If launching coroutines within coroutines […], ensure they have separate coroutine pools so they don't deadlock.

Thankfully, this isn't true! Coroutines are absolutely designed to be used within coroutines. Deadlocks are caused by blocking threads inside coroutines, not by launching coroutines inside coroutines. You already covered the most common culprit with your advice against nested `runBlocking`. There's no need to switch to a new dispatcher when starting a new coroutine. (To me, `limitedParallelism` is an advanced topic: I've never had an occasion to use it in my many years of coroutining.) Dispatchers aside, your `doit` function in the subsequent code example is an excellent example of concurrent decomposition 👍.

> If you're trying to control external resource use (e.g., to not exceed an API rate limit), you probably want a semaphore and `withPermit`

Well, yes, you can do that, but I feel like you already gave a better solution in your linked blog post about actors and queues. I'd much rather use an actor than a mutex to manage a shared resource. If you do choose to recommend both solutions, it might be helpful to also talk about how they compare.

> If you don't want an exception in one process to cancel the others, you need to either catch all exceptions within or below the launch or async or use `supervisorScope` and a `CoroutineExceptionHandler`.

Broadly true, but perhaps leading the reader down a wrong path. There are occasionally valid reasons to avoid cancelling child coroutines immediately on failure, but I'd avoid encouraging it in the general case. I'd also avoid encouraging the suppression of exceptions. A supervisor scope is actually not designed for use with `launch`, for that exact reason. It's only supposed to be used with `async`, where errors can ultimately be rethrown by a later `await`. For your `accounts`-processing code example, you already gave the correct solution: catch and ignore the error inside the `launch` block, if ignoring the error is really what you want to do. (As an aside, rethrowing an exception from a `CoroutineExceptionHandler` has no effect, as you're already at the top of the coroutine stack, though of course you're right that rethrowing cancellation exceptions is important everywhere else.)
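As a rough sketch of the "catch inside the `launch` block" approach, assuming kotlinx.coroutines: `processAccount` and `processAll` are hypothetical names, and the failure of account 3 is contrived for illustration. The failure is recorded rather than silently dropped, and cancellation exceptions are rethrown.

```kotlin
import kotlinx.coroutines.CancellationException
import kotlinx.coroutines.coroutineScope
import kotlinx.coroutines.delay
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking
import java.util.concurrent.ConcurrentLinkedQueue

// Hypothetical per-account work; account 3 fails on purpose.
suspend fun processAccount(id: Int) {
    delay(5)
    if (id == 3) throw IllegalStateException("account $id failed")
}

// Processes every account concurrently. A failure in one account is
// recorded and does not cancel the sibling coroutines.
suspend fun processAll(ids: List<Int>): Pair<List<Int>, List<Int>> {
    val succeeded = ConcurrentLinkedQueue<Int>()
    val failed = ConcurrentLinkedQueue<Int>()
    coroutineScope {
        for (id in ids) {
            launch {
                try {
                    processAccount(id)
                    succeeded.add(id)
                } catch (e: CancellationException) {
                    throw e // cancellation must always propagate
                } catch (e: Exception) {
                    failed.add(id) // the exception never leaves this coroutine
                }
            }
        }
    } // returns only after every child has finished
    return succeeded.sorted() to failed.sorted()
}

fun main() {
    val (ok, bad) = runBlocking { processAll(listOf(1, 2, 3, 4)) }
    println("succeeded=$ok failed=$bad")
}
```

Because the exception is handled inside the child, the enclosing `coroutineScope` never sees a failure, so no `supervisorScope` or `CoroutineExceptionHandler` is involved.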
d
Wow, thanks so much for the clarifications. One thing I'm not grokking is @Sam’s statement about not using `coroutineScope`. How can I call `launch` without having a `CoroutineScope` upon which to call it? Do I need to pass the `CoroutineScope` down the stack? I guess I mistakenly believed `coroutineScope` was merely acting like nested `Database.withTransaction()` by searching the stack for the parent `CoroutineScope` and making that available locally. So, how could I rewrite the `grandchildMethod` to use `launch` and have it be a direct child of the grandparent method's `runBlocking` `this: CoroutineScope`?
c
Passing the `CoroutineScope` down to children is one option, but it can easily be confusing; `suspend fun` means "when I'm done, all my child operations are also done", whereas if you pass a `CoroutineScope`, the function you're calling could create a coroutine that you're responsible for.
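The two conventions can be sketched side by side. This is only an illustration, assuming kotlinx.coroutines; the names `processInline`, `processInBackground`, and `demo` are invented for the example.

```kotlin
import kotlinx.coroutines.CoroutineScope
import kotlinx.coroutines.Job
import kotlinx.coroutines.coroutineScope
import kotlinx.coroutines.delay
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking

// Convention 1: a suspend fun. It may run children concurrently,
// but it returns only after all of them have finished.
suspend fun processInline(log: MutableList<String>) {
    coroutineScope {
        launch { delay(20); log.add("inline: slow child done") }
        launch { delay(10); log.add("inline: fast child done") }
    }
    log.add("inline: returned")
}

// Convention 2: a function that takes a scope. It starts a coroutine and
// returns immediately; whoever owns the scope is now responsible for it.
fun CoroutineScope.processInBackground(log: MutableList<String>): Job =
    launch { delay(10); log.add("background: child done") }

// Everything here runs on runBlocking's single thread, so a plain list is safe.
fun demo(): List<String> {
    val log = mutableListOf<String>()
    runBlocking {
        processInline(log)
        processInBackground(log)
        log.add("background: returned")
    } // runBlocking waits here for the background child
    return log
}

fun main() {
    demo().forEach(::println)
}
```

With the first convention the "returned" entry comes after both children; with the second it comes before the child has finished, which is the responsibility shift described above.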
d
Yeah, that makes sense and is what I was intentionally doing: having subordinate methods figure out the work the top `runBlocking` is responsible for. I guess the paradigmatic coroutine pattern is that the `launch` is directly in the `runBlocking` block, not in any subordinate methods (methods further down the stack); so, if we want the "subject" of a coroutine to be each of our 2000 accounts (one account per suspendable stack), we should just wrap the account iteration with `runBlocking` and immediately `launch` per account under that. Right now, we make a lot of decisions about what processing we need depending on the account type and status. Originally we were passing the `CoroutineScope` down the stack and calling `launch` on it once we figured out what (if any) processing the account needed; thus, we were doing the process determination on the `main` thread. I suppose doing the `launch` is simpler and makes sense.
Given that clarification, I want to get back to clarifying `Dispatchers.IO` and `suspend`. First on `Dispatchers.IO`, I had the belief that kotlin's processor wrapped every http send with suspend if done within a `Dispatchers.IO` coroutine, with resumption after response on the first available coroutine. You're saying I'm wrong and that the only suspensions will be specific calls to `yield` or something else and the only distinction in the dispatchers is the pool?
c
> First on `Dispatchers.IO`, I had the belief that kotlin's processor wrapped every http send with suspend if done within a `Dispatchers.IO` coroutine, with resumption after response on the first available coroutine.

No, the Kotlin compiler does no such thing. Some frameworks (including Ktor) will internally use `withContext(Dispatchers.IO)`, just like you would in your own code. There is no magic here. Also, `Dispatchers.IO` isn't special, it's just a built-in thread pool where it's not important if threads are blocked by I/O. The only "magic" is that `Dispatchers.IO` and `Dispatchers.Default` share some threads.
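A small sketch of what "just like you would in your own code" can look like, assuming kotlinx.coroutines. `blockingLookup` is a hypothetical stand-in for any blocking call (JDBC, a blocking HTTP client), and `lookup` is an invented wrapper name.

```kotlin
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.withContext

// Hypothetical blocking call: it parks whichever thread runs it.
fun blockingLookup(id: Int): String {
    Thread.sleep(50)
    return "row-$id"
}

// The caller suspends while a thread from the IO pool blocks on its behalf.
// Nothing is rewritten by the compiler; the hop to the IO pool is explicit.
// Returns the row together with the name of the thread that did the blocking.
suspend fun lookup(id: Int): Pair<String, String> = withContext(Dispatchers.IO) {
    blockingLookup(id) to Thread.currentThread().name
}

fun main() {
    val (row, worker) = runBlocking { lookup(7) }
    println("$row was fetched on $worker, caller is ${Thread.currentThread().name}")
}
```

The blocking call still blocks a thread; the only change is which pool that thread comes from.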
> You're saying I'm wrong and that the only suspensions will be specific calls to `yield` or something else and the only distinction in the dispatchers is the pool?

I'm not sure what you're asking here… Suspension always happens on a `suspendCoroutine` call, which usually happens deep inside the internal machinery of KotlinX.Coroutines. There are many ways users can trigger this: essentially all `suspend` functions deep down are a way to call that method.

Maybe you're confusing `Dispatchers.IO` with the JDK's Project Loom, which can magically transform blocking calls into asynchronous ones?
d
I found https://github.com/kotlin-orm/ktorm/discussions/537 as a pattern for suspending on IO. Is this correct? Are there better patterns? Can the coroutine scheduler decide to interrupt any `suspend` block at any time to switch to another (akin to OS context switching) or does it need a specific trigger like `suspendCoroutine, async, withContext, ...`? I know I've seen coroutine blocks suspend and then resume on another thread without doing any of those things. I thought it was on IO, but given this discussion I must be wrong or just "lucky".
c
> I found https://github.com/kotlin-orm/ktorm/discussions/537 as a pattern for suspending on IO. Is this correct? Are there better patterns?

It seems to me that they are just defining a custom function over `withContext`, no? If so, yeah, you can do that if you prefer, it's a taste thing.
> Can the coroutine scheduler decide to interrupt any `suspend` block at any time to switch to another (akin to OS context switching) or does it need a specific trigger like `suspendCoroutine, async, withContext, ...`

Coroutines are not preemptable: a coroutine can only be suspended at specific suspension points, meaning when `suspendCoroutine` is called, which usually happens deep within `async`, `delay`, `yield`…
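To make the lack of preemption concrete, here is a small sketch assuming kotlinx.coroutines (the function name `interleaving` is made up). Two coroutines share `runBlocking`'s single thread; they only take turns when they reach a suspension point such as `yield()`.

```kotlin
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.yield

// Runs coroutines "A" and "B" on runBlocking's single thread and records
// the order of their steps. With cooperative = false there is no suspension
// point, so nothing can interrupt A before it finishes.
fun interleaving(cooperative: Boolean): List<String> {
    val events = mutableListOf<String>()
    runBlocking {
        for (name in listOf("A", "B")) {
            launch {
                repeat(2) { step ->
                    events.add("$name$step")
                    if (cooperative) yield() // explicit suspension point
                }
            }
        }
    }
    return events
}

fun main() {
    println(interleaving(cooperative = false)) // [A0, A1, B0, B1]
    println(interleaving(cooperative = true))  // [A0, B0, A1, B1]
}
```

On a multi-threaded dispatcher the two coroutines could run in parallel on different threads, but each one would still only give up its thread at a suspension point.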
> I thought it was on IO, but given this discussion I must be wrong or just "lucky".

Well, it depends on the IO. If you're using Ktor, all IO is coroutines-aware, so all Ktor methods can suspend and resume on another thread. If it's something else, it really depends.