
DALDEI

10/03/2021, 3:56 AM
A few years into Kotlin and coroutines (starting in backend/server, now also in Android), I still struggle with a fundamental design problem: how to "cross the coroutine divide". It's both a simple and a complex problem.

1. Given: one shouldn't make all functions suspend. Therefore you will have a mix of suspend and non-suspend functions calling each other.
2. Given: in my current domains, the majority of the application is not coroutine based; it's "traditional". Coroutines are added as needed, but the scale of existing code (custom and 3rd party, Java/JVM and Kotlin) is such that it's not practical (yet) to have a coherent coroutine architecture and convention that everything follows.
3. Given: one should "avoid GlobalScope", and it's not trivial to create app coroutine scopes such that they are all self-contained. Real-world code is not that pretty.
4. Given: it is not obvious or easy to tell whether a non-suspend function was called by a suspend function, or is "running in a coroutine", without passing down a CoroutineScope or context to every function (see 2).
5. Given: "traditional" non-coroutine-aware functions expect functions to return only when complete. Quite often you have to (and should) scope coroutine calls so they are blocking from the perspective of the parent non-coroutine-aware caller.

This all leads to the need for a good convention to follow for:
• How to create blocking scopes from within non-coroutine-aware functions (without GlobalScope?).
• How to know whether a non-suspend function is in fact in the call chain of a suspend function or coroutine, so that, for example, you could call a suspend function without having to launch a new coroutine. In what scope? (Presuming the caller is non-coroutine-aware, it will not generally pass in a coroutine scope for you to use.)

A few examples.

Android: in onCreate() {} I need to call a suspend function (maybe it pre-existed as suspend; even if I don't need suspension for this call, 'suspend' is in its signature). But I should not return from onCreate until all its work is done. How?

<what scope>?.<what builder> { call suspend function }
return // Do NOT return until above is complete

If I have a scope (like lifecycleScope), life is much easier. If not, I have to make or borrow one, and then somehow arrange that its scope/lifecycle is well behaved. I could use GlobalScope. Probably shouldn't, but maybe "this time"? What builder? Both launch and async are asynchronous, so I have to wait for them. I can't do that outside the scope (at the return statement), so where? Punt to yet another scope (as seen in sample code):

val job = <scope>.launch { .. }
<????> { job.join() } // This solves nothing -- now I need to know how to wait for ???

Is runBlocking the answer? It's "to be avoided", but precisely under which conditions?

------
Server side: a web request is processed and calls some "doGET" method, from legacy Java code expecting blocking behaviour.

fun doGET() {
  // same question .. what scope? launch/async? runBlocking?
}

-----
The BIGGER problem

In comes a "typical mature application" of some thousands of functions, written over a decade by different people long gone. Kotlin/coroutines was introduced by someone a while ago, haphazardly. Now we have a mess, because it's not at all clear which non-suspend functions are called by suspend functions upstream:

top() {
  module()
  library(callback)
  callback() --->| back to module
  onCallback() -> library()
  subsystem() -> Fancy Coroutine Stuff over blocking IO --> callback() // Developer comment about #@#%% old code

and so on. At any given point, maybe a suspend fun, maybe not, with no easy/quick way of telling whether it was or could be called by a coroutine or suspend fun, or whether downstream there is some 5-level-deep suspend fun calling back into the module without its context to hang on to. Can one block the thread? What other choice is there? I end up having to solve these one by one, ad hoc, with no real global architecture or understanding of what could go wrong. E.g., how bad is this in a library function:

fun someFunHasToCallASuspendFun() {
  GlobalScope.runBlocking { aQuickSuspendFunc() }
}

To avoid that, I've had to make massive, invasive changes to pass in the appropriate scope from 5 levels up, and that can have a huge blast radius (let alone the ire of my teammates for such a huge change for a "simple fix").

Alternatively, I've added a few 'suspend' modifiers to the fun and its parents etc. But the effect is very much like C++ "const rot": once you add one suspend, its parent has to be suspend, then everything that calls it has to be as well, then their parents. It can and does explode quickly, until often ALL methods have to be suspend just to call one function 10 levels deep.

Another idiom: try to establish some well-named "global scopes" that are not GlobalScope. At some point this just seems silly: inventing all these scopes just to avoid one, when they do nothing different. It also requires opening up all libraries and modules to more dependencies, to the point sometimes of having to combine libraries just to share a common scope.

The closest to "ideal" I have found is Android only, and UI thread only:

<scope>.launch(Dispatchers.Main.immediate) {}

If I'm really careful I don't have to wait for this; it inherently blocks the caller, and with some care even child coroutines are managed. But it seems so wrong that this is subtle and complicated. Surely this is the key difficulty in integrating coroutines, yet most of the docs are written in a way that presumes everything is already coroutine aware, leaving the "corner cases" as minor issues someone else can work out.

Any commentary welcome. Am I missing something big here? Surely there's a good strategy to manage this problem.

Satyam Agarwal

10/03/2021, 5:41 AM
Hi David, I have far too little experience to give answers to your questions, but I found your post interesting. One thing that keeps recurring in the post is how to know whether non-suspending functions are called from a coroutine-scoped function. Shouldn't IDEs like IntelliJ help you with that? Or did you mean at runtime? (Which is an even more curious case.) Nonetheless, my understanding is that marking a function suspend doesn't make it async. It gives the function the capability to achieve async and parallel behaviour. Let's say you have the top function main, which is not marked as suspend. In an app running on the JVM, the main function will run on the main thread, and all the functions inside it will also run on the same thread unless stated otherwise. Now you can call your suspend function by wrapping it in runBlocking, without "GlobalScope". If you haven't specified a different coroutine context anywhere else in the call chain, everything will run on the main thread.
For your Android example (though I have experience in android), if you simply wrap the suspend function in runBlocking, the program will block the thread where onCreate() is running. Once the suspend function is finished, control moves on to return. So you don't really need to know whether the function is finished or not. The situation is different if you are in a coroutine scope. You may have several functions launched from async or launch builders that might be working in parallel; then you have the await and join APIs to wait for them to finish before moving on to the next thing.

ephemient

10/03/2021, 5:45 AM
if you must block non-suspending code from returning, your only choice is runBlocking or equivalent
be extra careful because runBlocking { ... withContext(Dispatchers.Main) { ... } } will deadlock (when called on the main thread)
using Dispatchers.Main.immediate will run until the first suspension point, but that is hardly any guarantee
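The "runs until the first suspension point" behaviour can be observed off Android with a small sketch. It assumes Dispatchers.Unconfined as a stand-in for Dispatchers.Main.immediate (both start the coroutine body in the caller's thread); the function name and event strings are made up for illustration.

```kotlin
import java.util.concurrent.CopyOnWriteArrayList
import kotlinx.coroutines.CoroutineScope
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.delay
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking

// Returns the events recorded at the moment launch() handed control back to the caller.
fun eventsWhenLaunchReturns(): List<String> {
    val events = CopyOnWriteArrayList<String>()
    // Assumption: Unconfined stands in for Main.immediate, which needs an Android main looper.
    val scope = CoroutineScope(Dispatchers.Unconfined)

    val job = scope.launch {
        events += "before first suspension" // runs synchronously inside launch()
        delay(50)                           // first suspension point: launch() returns here
        events += "after suspension"        // runs later, after the caller has moved on
    }
    events += "launch returned"

    val snapshot = events.toList()
    runBlocking { job.join() } // tidy up so the sketch leaves nothing running
    return snapshot
}
```

The caller only "blocks" for the part of the body before the first real suspension; anything after it happens once the non-suspending caller has already continued.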

Satyam Agarwal

10/03/2021, 5:53 AM
You also mention which scope to use and where. First of all, you don't need to pass the scope down at all. That's the beauty of this framework: the child coroutine is aware of its parent's coroutine context and runs in the same one unless stated otherwise. So unless you have changed the coroutine context somewhere down the call chain, all the coroutines should run in the same context. So let's say your parent function is running on Dispatchers.IO: all the functions called by the parent will run on Dispatchers.IO. If you change context somewhere down the call chain, that part will run in the given context but eventually return to its parent's context.

ephemient

10/03/2021, 5:56 AM
I would split this answer into two parts:
• on Android, where you do have things you need to dispatch to the main UI thread, you simply must make everything coroutine-aware. yes, this means you can't wait for suspending calls from framework methods like onCreate, and that may require restructuring your app so that you can start the work but finish it later. but the alternative is painful debugging when you do end up deadlocking.
• on the server side, most existing frameworks are thread-per-request. in that case runBlocking to adapt between the suspending and non-suspending worlds is safer. your suspend funs should withContext to move themselves into an appropriate dispatcher (e.g. Dispatchers.IO or Dispatchers.Default) instead of pushing that knowledge into the caller.
once you're in the suspending world, I would say to avoid calling back into any non-suspending function that uses runBlocking, as much as possible. this will impose some order on how you convert an existing codebase to coroutines in a way that is tractable, IMO.
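A minimal sketch of the server-side shape described above, assuming a thread-per-request framework that calls a blocking handler. The names doGet, loadRow and readRowBlocking are hypothetical and do not come from any real framework.

```kotlin
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.async
import kotlinx.coroutines.awaitAll
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.withContext

// Hypothetical legacy call that blocks its thread (think JDBC or file IO).
fun readRowBlocking(id: Int): String {
    Thread.sleep(10)
    return "row-$id"
}

// The suspend function moves itself onto a suitable dispatcher,
// so callers do not need to know that it blocks internally.
suspend fun loadRow(id: Int): String = withContext(Dispatchers.IO) {
    readRowBlocking(id)
}

// Hypothetical request entry point: the single runBlocking in the request path.
// Everything below it can stay in suspend functions.
fun doGet(ids: List<Int>): List<String> = runBlocking {
    ids.map { id -> async { loadRow(id) } }.awaitAll()
}
```

The request thread is blocked for the duration of the request either way in this model, which is why a single runBlocking at this boundary is usually considered acceptable.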

CLOVIS

10/03/2021, 6:38 AM
If you have a function that suspends and appears to block for the caller, use coroutineScope

Adam Powell

10/03/2021, 3:05 PM
+1 to what @ephemient said above. runBlocking {} is something that should appear at very coarse-grained boundaries and very infrequently. Usually these are entry points, like
fun main() = runBlocking { ... }
or
@Test
fun myCodeBehavesAsExpected() = runBlocking {
  // test code
}
and yes, sometimes you might have exactly one runBlocking right at the beginning of handling a request for a thread-per-request server framework.
On Android the Jetpack libraries have extensive support for giving you appropriate scopes for activities, fragments, etc. If you're in an activity onCreate you can use lifecycleScope.launch { ... } which will automatically cancel when the activity is destroyed. You can use things like
lifecycleScope.launch {
  repeatOnLifecycle(STARTED) {
    someDataFlow.collect {
      someView.text = "New value is $it"
    }
  }
}
and repeatOnLifecycle will subscribe and unsubscribe to the flow each time the activity is started/stopped, respectively.
If you're in a non-suspending function, you don't care what coroutine scope you may or may not have been called from. By definition. A non-suspending function doesn't suspend, and generally it won't be launching more coroutines of its own either.
If you have non-coroutines code that supports callbacks, use suspendCancellableCoroutine to write a suspend adapter for it so that you can keep callers thinking in suspend - they'll already bring appropriate scopes with them.
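As a rough illustration of such an adapter, here is a sketch around a made-up callback API. LegacyClient, Callback and Request are hypothetical stand-ins for whatever legacy code is being wrapped; only suspendCancellableCoroutine, resume, resumeWithException and invokeOnCancellation are real library calls.

```kotlin
import kotlin.coroutines.resume
import kotlin.coroutines.resumeWithException
import kotlinx.coroutines.suspendCancellableCoroutine

// Hypothetical callback-style API standing in for existing non-coroutine code.
interface Callback {
    fun onSuccess(value: String)
    fun onError(error: Throwable)
}

class LegacyClient {
    // Handle that lets a caller abandon a pending request.
    class Request(private val worker: Thread) {
        fun cancel() = worker.interrupt()
    }

    fun fetch(key: String, callback: Callback): Request {
        val worker = Thread {
            try {
                Thread.sleep(10) // simulated remote call
                callback.onSuccess("value-for-$key")
            } catch (e: InterruptedException) {
                callback.onError(e)
            }
        }
        worker.start()
        return Request(worker)
    }
}

// The suspend adapter: callers see an ordinary suspend function.
suspend fun LegacyClient.fetchSuspending(key: String): String =
    suspendCancellableCoroutine { cont ->
        val request = fetch(key, object : Callback {
            override fun onSuccess(value: String) = cont.resume(value)
            override fun onError(error: Throwable) = cont.resumeWithException(error)
        })
        // If the calling coroutine is cancelled, cancel the underlying request too.
        cont.invokeOnCancellation { request.cancel() }
    }
```

A caller that is already in a coroutine can then write client.fetchSuspending("k") without touching scopes at all.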
If you have non-coroutines code that blocks the calling thread on external IO, you can wrap them in suspend using withContext(Dispatchers.IO) to dedicate a temporary thread to the blocking call.
It's only if you need to satisfy a blocking interface that you are implementing that you should consider something like runBlocking {}. And even then, something like Android's onCreate is not an example of this; onCreate shouldn't block. If you need to do something that runs long enough that it was written as a suspend call, use something like lifecycleScope.
"Blocking interface" in this case means something where the API contract demands that some result be complete by the time the call returns, like the thread-per-request client handling described above. Maybe the interface requires you to return a computed value. Activity.onCreate is, again, not an example of this.
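One way to picture a "blocking interface" is java.util.concurrent.Callable, whose call() must return the finished value. The sketch below assumes a hypothetical suspend function computeReport and shows runBlocking used only to satisfy that contract.

```kotlin
import java.util.concurrent.Callable
import kotlinx.coroutines.delay
import kotlinx.coroutines.runBlocking

// Hypothetical suspend function standing in for existing suspending code.
suspend fun computeReport(): String {
    delay(10) // simulated asynchronous work
    return "report"
}

// Callable.call() has to hand back the computed value, so the contract itself is blocking.
// runBlocking sits exactly at that boundary and nowhere deeper.
class ReportTask : Callable<String> {
    override fun call(): String = runBlocking { computeReport() }
}
```

An executor or any other Java API that accepts a Callable can run ReportTask without knowing coroutines are involved.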

DALDEI

10/03/2021, 11:05 PM
Appreciate the responses! Thanks! (but ...) I don't understand most of them, as they simply "do not work", at least as I understand them. Example:
If you have non-coroutines code that blocks the calling thread on external IO, you can wrap them in suspend using withContext(Dispatchers.IO) to dedicate a temporary thread to the blocking call.
Um, no, that does not work. withContext() can NOT be called from "non-coroutine code". Please prove I'm wrong, that would make my day x 10000.
fun nonCoroutineCode() {
  withContext(Dispatchers.IO) { // ---- DOES NOT COMPILE
  }
}
And that is the crux of the problem: all these great methods only work if you're within a suspend fun.
Another example:
That the beauty of this framework, that the child coroutine is aware of parents coroutine context and runs in the same unless stated otherwise.
So unless you have changed the coroutine context somewhere down the call chain, all the coroutines should run in the same context.
Um, no.
fun coroutineCode() {
  scope.launch { fun1() }
}
---- some other module
fun fun1() { fun2() }
fun fun2() {
  launch { } // DOES NOT COMPILE (or any other builder)
}
If you leave the lexical scope of an in-scope CoroutineScope, or you are in a non-suspend fun, you're out of luck: no coroutines for you unless you acquire a scope somewhere (which is not hard, but I'd LIKE it to be the CALLER'S scope, and it can't be without passing it down, because it's not a suspend fun so no "scope is in scope").
----------
This answer is the only one that I believe is correct:
if you must block non-suspending code from returning, your only choice is runBlocking or equivalent
And with that come all the side effects as discussed here and elsewhere.
And yes, thanks for the tips about lifecycleScope, which is AWESOME, but way too infrequently available.
Regarding the (correct!) comment about onCreate not having a contract for blocking: I mentioned this case, for one, because that was the most recent time I hit this general problem, but more so because it demonstrates the difference and nuance between theory and practice. The "practice" (in the apps I've had to work in) is that the convention documented for onCreate is implicitly assumed to be "contract". That is, one puts "initialization" work in onCreate precisely so the rest of the app doesn't have to concern itself with whether it was done or not. So if I do some work like "create bootstrap data set" in onCreate, but do it in launch {} and return before it completes, then something is highly likely to break, or at least likely enough that I need to go hunting through every possible place that might, to prove otherwise. This seems to be a prevailing issue when working on an application that was not previously written with coroutines in mind: you not only have to be aware of all the coroutine conventions and contracts, but also try to think "what would the other developers have thought before coroutines?" and work within those conventions as well, until you can rewrite the entire app, which is not likely soon, and less likely if you can't figure out a good set of conventions to get there.

ephemient

10/04/2021, 1:07 AM
this is the sort of thing you had to have thought of before too, though? if you were using Rx or Loaders for async, blocking in onCreate is still a no-go, even if the compiler wasn't preventing you from doing so

Adam Powell

10/04/2021, 2:38 PM
It sounds like the missing piece for your setup here is that when you're working with suspend, you generally jump into suspending code early at something resembling an entry point and stay there to manage things that need to happen over time. To use the onCreate example, launch into your lifecycleScope, then manage sequence and contract in suspend-land. For example,
override fun onCreate(savedInstanceState: Bundle?) {
  super.onCreate(savedInstanceState)
  lifecycleScope.launch {
    performSuspendingInit()
    doThingsThatRequireSuspendingInitToBeComplete()
  }
}
If you are in a suspend function and you need to launch something, use coroutineScope {} instead of going hunting for another external scope to launch into:
suspend fun myFunction() {
  coroutineScope {
    launch { doThingOne() }
    launch { doThingTwo() }
  }
}
structured concurrency means async behavior is always opt-in; launching into your caller's scope is an antipattern because it "leaks" running operations into your caller. By writing it as the above, the caller can wrap it in their own launch {} call if they want it to happen concurrently with something else they're doing, and they can use exactly the same coroutineScope {} pattern to get a scope.
If you have a non-suspend function and you need to do something over time, then you need to think through whether you want to launch the operation into a scope or if you want to send a message to a suspend function that's already running via a Channel or similar mechanism.
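The Channel option might look roughly like the sketch below. The Worker class, its method names and the uppercase "work" are invented for illustration; the point is that the non-suspending submit() only hands a message to a coroutine that is already running in a scope someone else owns.

```kotlin
import java.util.concurrent.CopyOnWriteArrayList
import kotlinx.coroutines.CoroutineScope
import kotlinx.coroutines.channels.Channel
import kotlinx.coroutines.delay
import kotlinx.coroutines.launch

// Hypothetical worker: one long-running coroutine consumes requests from a channel.
class Worker(scope: CoroutineScope) {
    private val requests = Channel<String>(Channel.UNLIMITED)
    val processed = CopyOnWriteArrayList<String>()

    // The consumer lives in the scope supplied by whoever owns the worker.
    private val job = scope.launch {
        for (request in requests) {
            handle(request)
        }
    }

    // Non-suspending entry point: trySend neither blocks nor suspends.
    fun submit(request: String): Boolean = requests.trySend(request).isSuccess

    // Stop accepting requests; the consumer drains what is queued and then finishes.
    fun close() {
        requests.close()
    }

    suspend fun awaitDone() = job.join()

    private suspend fun handle(request: String) {
        delay(1) // simulated suspending work
        processed += request.uppercase()
    }
}
```

Code deep inside non-suspending call chains can call submit() freely, while lifecycle, cancellation and error handling stay with the scope that created the Worker.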

DALDEI

10/05/2021, 3:32 AM
Very good suggestions, thanks. As mentioned, this isn't unique to coroutines; the problem (for me) tends to come up most when augmenting existing code that was not (or only poorly) coroutine aware. "Crossing the boundary" between coroutines and non-coroutines may really just be another face of crossing the boundary between two entirely different architectural patterns. So in a real sense (and I believe intentionally) the specific rules of Kotlin coroutines have been designed and tuned over time to make it hard to screw up, by making it for the most part a compile-time error to "cross the boundary" easily. It may be a fundamental design anti-pattern to mix these two styles other than "from the edges", where the behaviour is well defined. This gets massively harder when you "swerve over the lines" rather than "cross" them. E.g., this pattern is very hard to get right. Imagine these functions in different compilation units (so they do NOT share the same "scope" in any sense). Also imagine they were written at different times by different people.
fun main() {
  normalFunction()
}
fun normalFunction() {
  someScope.launch {
    callSuspend()
  }
  callNonSuspend()
}
suspend fun callSuspend() {
  callNonSuspend()
}
// ----------------------------->>> HERE
fun callNonSuspend() {
  // Imagine today I want to ADD a call to some suspend fun
  whatToDoToCallSuspend()
}
------------
Certainly there are many ways to do this, but the problem isn't so much how as what. It makes a difference whether callNonSuspend was called "from a coroutine" or not: things like exception handling (should exceptions be propagated to the caller's scope or not?), who is going to manage a coroutine that is launched but not waited for, and whether it's "OK to block the thread", which can lead to deadlock, not just inefficiency. These are not fundamentally different from other architectural questions, but they add a whole new dimension to "WTF". If you're adding coroutines to an existing program, it's unlikely the previous authors thought about coroutines at all, or in the same way. The variety and effect of interactions between otherwise unrelated code is substantial when you add coroutines.

However, one other factor makes this more specific to both Kotlin and coroutines: the heavy use of scope conventions. By this I mean the kind of "DSL scope" that coroutine builders use, similar to the more idiomatic scope functions (let, with, use, apply etc.). These leverage the compiler in ways that are profoundly powerful, but they only work when the entire call chain is in scope. E.g., this works:
MyScope.launch {
  async {
    coroutineScope {
      launch {
        async { ... }
      }
    }
  }
}
All of these very nicely and cleanly hide the implicit scope, so they "just work". But when the exact same code gets distributed across files and modules, it falls apart dramatically. There is no implicit scope, or it's the wrong one. What was enforced/supplied by the compiler is gone. You have to explicitly recreate the implicit scope and carry it through everything. This is no different from other Kotlin scope usage, yet fundamentally different in that the current version of the coroutine APIs requires it: you end up having to explicitly add the receivers to every level of function call just to use the scope once, 10 levels deeper. All the hard work of the library authors vanishes the moment you leave the lexical scope. Building the scaffolding back up is not trivial.

An approach I have not yet tried, but may, is using the newly supported ThreadLocal coroutine context. This could help a LOT, but I haven't fully walked through all the details.
https://kotlin.github.io/kotlinx.coroutines/kotlinx-coroutines-core/kotlinx.coroutines/-thread-context-element/index.html
https://kotlin.github.io/kotlinx.coroutines/kotlinx-coroutines-core/kotlinx.coroutines/as-context-element.html
Basic concept:
fun CoroutineScope.topLevel() {
  // put "this" into a thread local
  fun2()
}
....
fun fun100() {
  // Look for scope or context from thread local
  with(myThreadLocalScope) {
    // Yea, life is restored
  }
}
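For concreteness, the thread-local idea could be sketched with ThreadLocal.asContextElement as below. This is only one possible reading of the concept: currentScope, topLevel, middle and deep are hypothetical names, and the sketch carries the same trade-offs as handing a scope to callees explicitly.

```kotlin
import kotlinx.coroutines.CoroutineScope
import kotlinx.coroutines.asContextElement
import kotlinx.coroutines.coroutineScope
import kotlinx.coroutines.launch
import kotlinx.coroutines.withContext

// Hypothetical holder for "the scope of whoever called us".
val currentScope = ThreadLocal<CoroutineScope?>()

suspend fun topLevel(log: MutableList<String>) {
    coroutineScope {
        val scope = this
        // While this block runs, currentScope.get() returns `scope` on whichever
        // thread executes it; the previous value is restored afterwards.
        withContext(currentScope.asContextElement(value = scope)) {
            middle(log)
        }
    } // coroutineScope returns only after children launched into `scope` finish
}

// Non-suspending layers in between know nothing about coroutines.
fun middle(log: MutableList<String>) = deep(log)

// Non-suspending leaf: picks the scope up from the thread local.
fun deep(log: MutableList<String>) {
    val scope = currentScope.get() ?: error("deep() was not reached via topLevel()")
    scope.launch { log += "child ran in the caller's scope" }
}
```

This only works while the non-suspending chain runs synchronously on the coroutine's thread; a callee that hops to another thread or stores the scope for later would see nothing, or a stale scope.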

ephemient

10/05/2021, 7:41 AM
step in the wrong direction. you do not want to be passing the scope around like that, which enables callees to break out of structured concurrency.

DALDEI

10/18/2021, 11:17 PM
Yes, I get that, and have been bitten. BUT what else can be done if suspend F1() calls non-suspend F2() ... F10()? What options do you have to run a coroutine in F10() (assuming F10 is not in the same compilation unit, or inline, or otherwise does not have direct access to the scoping functions)?
A) F10 can call runBlocking() -- uses the thread global scope
B) F10 can have its own created-on-the-fly scope
C) F10 can use the scope of its caller (F1), passed to it somehow
What else is there? All of them suffer from "breaking out of structured concurrency", except C, where at least F10 would share the same structured scope as F1 and thus the same cancellable lifecycle. What is the "right direction" if these are "the wrong one"? That is my still-unanswered puzzle. It's easy to say "don't do that", not so easy to say "do THIS instead".
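Option C, written out with an explicit parameter, might look like this sketch. The function names follow the F1/F2/F10 naming of the question, the log parameter is added only so the effect is observable, and the layers between f2 and f10 are left out.

```kotlin
import kotlinx.coroutines.CoroutineScope
import kotlinx.coroutines.Job
import kotlinx.coroutines.coroutineScope
import kotlinx.coroutines.launch

// Non-suspending intermediate layer: it only forwards the scope it was given.
fun f2(scope: CoroutineScope, log: MutableList<String>): Job = f10(scope, log)

// Non-suspending leaf: starts a coroutine in the scope handed down by the caller.
fun f10(scope: CoroutineScope, log: MutableList<String>): Job = scope.launch {
    log += "f10 work finished"
}

// Suspending top of the chain: coroutineScope supplies the scope and waits for
// every child launched into it, so nothing outlives f1.
suspend fun f1(log: MutableList<String>) {
    coroutineScope {
        f2(this, log)
    }
    log += "f1 returned"
}
```

The cost is the one described in the thread: every function between f1 and f10 has to carry the extra parameter (or receiver), even if it never uses it.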

ephemient

10/19/2021, 12:58 AM
don't make both F1() and F10() suspend until you've converted the interior calls - either start from the top and work your way down the call tree, or maybe vice versa

DALDEI

10/29/2021, 1:23 AM
At the end, the requirement is that all function calls between any 2 calls using coroutines must be suspend calls OR pass the scope or context. The std lib has plenty of cases (and docs that suggest it) where passing a CoroutineContext or CoroutineScope as either a receiver or a parameter is a reasonable alternative to using suspend (in some cases preferred). Given a complex app where coroutines are being introduced, and where it is not viable to rewrite all the paths between any 2 calls, using the coroutine-supported version of thread local accomplishes the exact same thing as passing a context or coroutine scope as a parameter.
suspend fun f(c: CoroutineScope) {
  // put c in coroutine impl of thread local
  f1()
}
fun f1() { f2() }
.....
fun f10() {
  val scope: CoroutineScope = /* get from thread local */
  scope.launch {
    // This uses the same context safely as f()
  }
}
My understanding/interpretation: the key to this is understanding that the coroutine "builder" DSL relies heavily on Kotlin's lexical scoping, whereas the underlying coroutine API does not. Almost all of the (recent) docs focus entirely on the builders, for good reason, as that is the preferred and clean way to do it. However, that only works when the code is within the same lexical scope. It breaks down when you cross compilation units or modules where no such scope is available. That doesn't stop coroutines from working, but it does stop the conventions that rely on Kotlin scope from working, which means all of the structured coroutine builders, until you can re-establish the scope. How to bridge 2 disjoint Kotlin scopes such that you can use the coroutine builder DSL correctly is a topic that, as far as I can find, is not discussed at all. Hence discussing it here. So far I have not found a better solution, or even an equivalent one, to the above. Still very much open to discovering one.