A few years into Kotlin and Coroutines starting in backend s kotlinlang #coroutines


DALDEI

10/03/2021, 3:56 AM

A few years into Kotlin and coroutines -- starting in backend/server, now also in Android -- I still struggle with a fundamental design problem: how to 'cross the coroutine divide'. It's a simple and complex problem.

1. Given: one shouldn't make all functions suspend, therefore you will have a mix of suspend and non-suspend functions calling *each other*.
2. Given: in my current domains, the 'majority/main' application is not 'coroutine based', it's "traditional". Coroutines are added as needed, but the scale of existing code (custom and 3rd party, Java/JVM and Kotlin) is such that it's not practical (yet) to have a coherent coroutine architecture and convention that everything follows.
3. Given: one should 'Avoid GlobalScope', and it's not trivial to create app coroutine scopes such that they are all self-contained. Real-world code is not that pretty.
4. Given: it is not obvious or easy to tell if a non-suspend function was called by a suspend function, or is 'running in a coroutine', without passing down a CoroutineScope or context to every function (see 2.)
5. Given: 'traditional' non-coroutine-aware functions expect functions to only return when complete. Quite often you have to (and should) scope coroutine calls so they are blocking from the perspective of the parent non-coroutine-aware caller.

This all leads to the need for a good convention to follow for:

1. How to create blocking scopes from within non-coroutine-aware functions (without GlobalScope?).
2. How to know if a non-suspend function is in fact in the call chain of a suspend function or coroutine, so that, for example, you could call a suspend function without having to launch a new coroutine. In what scope? (Presuming the caller is non-coroutine-aware, it will not generally pass in a coroutine scope for you to use.)

A few examples:

Android: in onCreate() {} I need to call a suspend function (maybe it pre-existed as a suspend fun; even if I don't need suspension for this call, 'suspend' is in its signature). But I should not return from onCreate until all its work is done. How?

<what scope>?.<what builder> { call suspend function }
return // Do NOT return until above is complete

If I have a scope (like lifecycleScope) life is much easier, but if not, then I have to make or borrow one, and then arrange somehow that its scope/lifecycle is well behaved. Could use GlobalScope. Probably shouldn't, but maybe 'this time'? What builder? Both launch and async are asynchronous, so I have to wait for them. I can't do that outside the scope (the return statement), so where? Punt to yet another scope (as seen in sample code):

val job = <scope>.launch { .. }
<????> { job.join() } // This solves nothing -- now I need to know how to wait for ???

Is 'runBlocking' the answer? It's 'to be avoided', but precisely under which conditions?

------

Server side: a web request is processed and calls some "doGET" method from legacy Java code expecting blocking behaviour.

fun doGET() {
  // same question .. what scope ? launch/async ? runBlocking?
}

-----

The BIGGER problem

In comes the 'typical mature application' of some thousands of functions written over a decade by different people long gone. Kotlin/coroutines was introduced by some a while ago, haphazardly. Now we have a mess because it's not at all clear which non-suspend functions are called by suspend functions upstream:

top() {
module()
library(callback)
callback() --->| back to module
onCallback() -> library()
subsystem() -> Fancy Coroutine Stuff over blocking IO --> callback() // Developer comment about #@#%% old code

and so on. At any given point it may be a suspend fun, maybe not, with no easy/quick way of telling if it was or could be called by a coroutine or suspend fun. Or downstream: is there some 5-level-deep suspend fun calling back into the module without its context to hang on to? Can one block the thread? What other choice is there? I end up having to solve these one by one, ad hoc, with no real global architecture or understanding of what could go wrong. E.g. how bad is this in a library function:

fun someFunHasToCallASuspendFun() {
  GlobalScope.runBlocking{ aQuickSuspendFunc() }
}

To avoid that I've had to make massive invasive changes to pass in the appropriate scope from 5 levels up, and that can have a huge blast radius (let alone the ire of my teammates for such a huge change for a 'simple fix').

Alternatively, I've added a few 'suspend' modifiers to the fun and its parents etc. But the effect is very much like C++ "const rot": once you add one suspend, its parent has to be suspend, then everything that calls it has to be as well, then their parents. It can and does explode quickly, until often ALL methods have to be suspend just to call one suspend fun 10 levels deep.

Another idiom: try to establish some well-named 'global scopes' that are not 'GlobalScope'. At some point this just seems silly, inventing all these scopes just to avoid one, when they do nothing different. It also requires opening up all libraries and modules to more dependencies, sometimes to the point of having to combine libraries just to share a common scope.

The closest 'to ideal' I have found is Android only, and UI thread only:

<scope>.launch( Dispatchers.Main.immediate ){}

If I'm really careful I don't have to wait for this, it inherently blocks the caller, and with some care even child coroutines are managed. But it seems so wrong that this is subtle and complicated. Surely this is the key difficulty in integrating coroutines, yet most of the docs are written in a way that presumes everything is already coroutine-aware, leaving the 'corner cases' as minor issues someone else can work out.

Any commentary welcome. Am I missing something big here? Surely there's a good strategy to manage this problem.

🧵 6

Satyam Agarwal

10/03/2021, 5:41 AM

Hi David, I am far too inexperienced to give answers to your questions, but I found your post interesting. One thing that keeps occurring in the post is how to know whether non-suspending functions are called from a coroutine-scoped function. Shouldn't IDEs like IntelliJ help you with that? Or did you mean at runtime? (Which is an even more curious case.) Nonetheless, my understanding is that marking a function suspend doesn't make it async. It gives the function the capability to achieve async and parallel behaviour. Let's say you have the top function main, which is not marked as suspend. In an app running on the JVM, the main function will run on the main thread, and all the functions inside it will also run on the same thread unless stated otherwise. Now you can call your suspend function by wrapping it in a runBlocking scope, without "GlobalScope". If you haven't specified a different coroutine context anywhere else in the call chain, everything will run on the main thread.

Satyam Agarwal

10/03/2021, 5:45 AM

For your Android example (though I have experience in android), if you simply wrap the suspend function in runBlocking, the program will block the thread where onCreate() is running. Once the suspend function has finished, control moves on to return. So you don't really need to check whether the function has finished or not. The situation is different if you are in a coroutine scope. You may have several functions launched from async or launch builders that might be working in parallel; then you have the await and join APIs to wait for them to finish before moving on to the next thing.

ephemient

10/03/2021, 5:45 AM

if you must block non-suspending code from returning, your only choice is runBlocking or equivalent

ephemient

10/03/2021, 5:47 AM

be extra careful because runBlocking { ... withContext(Dispatchers.Main) { ... } } will deadlock

ephemient

10/03/2021, 5:48 AM

using Dispatchers.Main.immediate will run until the first suspension point, but that is hardly any guarantee

Satyam Agarwal

10/03/2021, 5:53 AM

You also mention which scope to use and where. First of all, you don't need to pass the scope down at all. That's the beauty of this framework: the child coroutine is aware of its parent's coroutine context and runs in the same one unless stated otherwise. So unless you have changed the coroutine context somewhere down the call chain, all the coroutines should run in the same context. Let's say your parent function is running on Dispatchers.IO: all the functions called by the parent will run on Dispatchers.IO. If you change the context somewhere down the call chain, that part will run in the given context but eventually return to its parent's context.

ephemient

10/03/2021, 5:56 AM

I would split this answer into two parts:
• on Android, where you do have things you need to dispatch to the main UI thread, you simply must make everything coroutine-aware. yes, this means you can't wait for suspending calls from framework methods like onCreate, and that may require restructuring your app so that you can start the work but finish it later. but the alternative is painful debugging when you do end up deadlocking.
• on the server side, most existing frameworks are thread-per-request. in that case runBlocking to adapt between the suspending and non-suspending worlds is safer. your suspend funs should withContext to move themselves into an appropriate dispatcher (e.g. Dispatchers.IO, Dispatchers.Default) instead of pushing that knowledge into the caller.
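A minimal sketch of the server-side advice, assuming a hypothetical thread-per-request handler `doGet` and a hypothetical blocking helper `loadGreetingBlocking` (neither comes from the thread). The suspend function chooses its own dispatcher, and a single `runBlocking` sits at the blocking boundary:

```kotlin
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.withContext

// Hypothetical legacy call that blocks its thread (stands in for JDBC, file I/O, ...).
fun loadGreetingBlocking(name: String): String {
    Thread.sleep(10) // simulates blocking I/O
    return "hello, $name"
}

// The suspend function moves itself onto Dispatchers.IO,
// so callers do not need to know that it blocks internally.
suspend fun loadGreeting(name: String): String = withContext(Dispatchers.IO) {
    loadGreetingBlocking(name)
}

// Hypothetical thread-per-request entry point: the framework expects the
// result to be ready on return, so one runBlocking adapts the two worlds here.
fun doGet(name: String): String = runBlocking {
    loadGreeting(name)
}
```

Because the request thread is dedicated to this one request, blocking it in `runBlocking` costs no more than the legacy blocking code did; the same call on an Android main thread would be a different story.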

➕ 1

ephemient

10/03/2021, 5:59 AM

once you're in the suspending world, I would say to avoid calling back into any non-suspending function that uses runBlocking, as much as possible. this will impose some order on how you convert an existing codebase to coroutines in a way that is tractable, IMO.

CLOVIS

10/03/2021, 6:38 AM

If you have a function that suspends and appears to block for the caller, use coroutineScope
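As a sketch of that suggestion, assuming hypothetical functions `loadPart` and `loadTotal`: the suspend function runs two children concurrently inside `coroutineScope` and only returns once both have finished, so from the caller's point of view it behaves like an ordinary sequential call.

```kotlin
import kotlinx.coroutines.async
import kotlinx.coroutines.coroutineScope
import kotlinx.coroutines.delay
import kotlinx.coroutines.runBlocking

// Hypothetical suspending unit of work.
suspend fun loadPart(value: Int): Int {
    delay(10) // stands in for real suspending work
    return value
}

// coroutineScope does not return until every child started inside it is done,
// so no coroutine outlives this call and no external scope is needed.
suspend fun loadTotal(): Int = coroutineScope {
    val first = async { loadPart(1) }
    val second = async { loadPart(2) }
    first.await() + second.await()
}

fun main() {
    println(runBlocking { loadTotal() })
}
```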

Adam Powell

10/03/2021, 3:05 PM

+1 to what @ephemient said above. runBlocking {} is something that should appear at very coarse-grained boundaries and very infrequently. Usually these are entry points, like

fun main() = runBlocking { ... }


@Test
fun myCodeBehavesAsExpected() = runBlocking {
  // test code
}

and yes, sometimes you might have exactly one runBlocking right at the beginning of handling a request for a thread-per-request server framework.

Adam Powell

10/03/2021, 3:08 PM

On Android the Jetpack libraries have extensive support for giving you appropriate scopes for activities, fragments, etc. If you're in an activity onCreate you can use lifecycleScope.launch { ... } which will automatically cancel when the activity is destroyed. You can use things like


lifecycleScope.launch {
  repeatOnLifecycle(STARTED) {
    someDataFlow.collect {
      someView.text = "New value is $it"
    }
  }
}

and repeatOnLifecycle will subscribe and unsubscribe to the flow each time the activity is started/stopped, respectively.

Adam Powell

10/03/2021, 3:09 PM

If you're in a non-suspending function, you don't care what coroutine scope you may or may not have been called from. By definition. A non-suspending function doesn't suspend, and generally it won't be launching more coroutines of its own either.

Adam Powell

10/03/2021, 3:12 PM

If you have non-coroutines code that supports callbacks, use suspendCancellableCoroutine to write a suspend adapter for it so that you can keep callers thinking in suspend - they'll already bring appropriate scopes with them.
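A minimal sketch of such an adapter. The callback API here (`Callback`, `LegacyClient`, `fetch`) is hypothetical and only stands in for existing non-coroutine code; the adapter itself uses `suspendCancellableCoroutine` from kotlinx.coroutines:

```kotlin
import kotlin.concurrent.thread
import kotlin.coroutines.resume
import kotlin.coroutines.resumeWithException
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.suspendCancellableCoroutine

// Hypothetical callback-style API standing in for legacy code.
interface Callback {
    fun onSuccess(value: String)
    fun onError(error: Throwable)
}

class LegacyClient {
    // Does its work on a background thread and reports through the callback.
    // Returns the worker thread so that the work can be interrupted.
    fun fetch(key: String, callback: Callback): Thread = thread {
        try {
            Thread.sleep(10) // simulates slow work
            callback.onSuccess("value-for-$key")
        } catch (e: InterruptedException) {
            callback.onError(e)
        }
    }
}

// The suspend adapter: callers see an ordinary suspend function.
suspend fun LegacyClient.fetchSuspending(key: String): String =
    suspendCancellableCoroutine { cont ->
        val worker = fetch(key, object : Callback {
            override fun onSuccess(value: String) = cont.resume(value)
            override fun onError(error: Throwable) = cont.resumeWithException(error)
        })
        // If the calling coroutine is cancelled, stop the underlying work too.
        cont.invokeOnCancellation { worker.interrupt() }
    }

fun main() {
    println(runBlocking { LegacyClient().fetchSuspending("a") })
}
```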

Adam Powell

10/03/2021, 3:13 PM

If you have non-coroutines code that blocks the calling thread on external IO, you can wrap them in suspend using withContext(Dispatchers.IO) to dedicate a temporary thread to the blocking call.

Adam Powell

10/03/2021, 3:14 PM

It's only if you need to satisfy a blocking interface that you are implementing that you should consider something like runBlocking {}. And even then, something like Android's onCreate is not an example of this; onCreate shouldn't block. If you need to do something long-running enough that it was written as a suspend call, use something like lifecycleScope

Adam Powell

10/03/2021, 3:16 PM

"Blocking interface" in this case means something where the API contract demands that some result be complete by the time the call returns, like the thread-per-request client handling described above. Maybe the interface requires you to return a computed value.

Activity.onCreate is, again, not an example of this.

DALDEI

10/03/2021, 11:05 PM

Appreciate the response ! Thanks! (but ...) I dont understand most of them as they simply 'do not work' -- or as I understand them. Example:

If you have non-coroutines code that blocks the calling thread on external IO, you can wrap them in suspend using withContext(Dispatchers.IO) to dedicate a temporary thread to the blocking call.

Um, no, that does not work. withContext() can NOT be called from "non-coroutine code". Please prove I'm wrong, that would make my day x 10000.

fun nonCoroutineCode() {
  withContext( Dispatchers.IO ) { // ---- DOES NOT COMPILE
  }
}

And that is the crux of the problem: all these great methods only work if you're within a suspend fun.

Another example:

That the beauty of this framework, that the child coroutine is aware of parents coroutine context and runs in the same unless stated otherwise.

So unless you have changed the coroutine context somewhere down the call chain, all the coroutines should run in the same context.

Um, no.

fun coroutineCode() {
  scope.launch { fun1() }
}
---- some other module
fun fun1() { fun2() }
fun fun2() {
  launch { } // DOES NOT COMPILE (or any other builder)
}

If you leave the lexical scope of an in-scope CoroutineScope, or you are in a non-suspend fun, you're out of luck: no coroutines for you unless you acquire a scope somewhere (which is not hard, but I'd LIKE it to be the CALLER's scope, and it can't be without passing it down, because it's not a suspend fun so no 'scope is in scope').

----------

This answer is the only one that I believe is correct:

if you must block non-suspending code from returning, your only choice is runBlocking or equivalent

DALDEI

10/03/2021, 11:05 PM

And with that come all the side effects as discussed here and elsewhere --

DALDEI

10/03/2021, 11:06 PM

And yes, thanks for the tips about lifecycleScope -- which is AWESOME -- but -- way too infrequently available --

DALDEI

10/04/2021, 12:45 AM

Regarding the (correct!) comment about onCreate not having a contract for blocking: I mentioned this case partly because that was the most recent time I hit this general problem, but more so because it demonstrates the difference and nuance between theory and practice. The 'practice' (in the apps I've had to work in) is that the convention documented in onCreate is implicitly assumed to be 'contract'. That is, one puts 'initialization' work in onCreate precisely so the rest of the app doesn't have to concern itself with whether it was done or not. So if I do some work like 'create bootstrap data set' in onCreate, but do it in launch {} and return before it completes, then something is highly likely to break, or at least likely enough that I need to go hunting through every place that might break to prove otherwise. This seems to be a prevailing issue when working on an application that was not previously written with coroutines in mind. You not only have to be aware of all the coroutine conventions and contracts, but also try to think 'What would the other developers have thought before coroutines?' and work within those conventions as well, until you can rewrite the entire app, which is not likely soon, and less likely if you can't figure out a good set of conventions to get there.

ephemient

10/04/2021, 1:07 AM

this is the sort of thing you would have had to think about before too, though? if you were using Rx or Loaders for async, blocking in onCreate was still a no-go, even if the compiler wasn't preventing you from doing so

Adam Powell

10/04/2021, 2:38 PM

It sounds like the missing piece for your setup here is that when you're working with suspend, you generally jump into suspending code early at something resembling an entry point and stay there to manage things that need to happen over time. To use the onCreate example, launch into your lifecycleScope, then manage sequence and contract in suspend-land. For example,


override fun onCreate(savedInstanceState: Bundle?) {
  super.onCreate(savedInstanceState)
  lifecycleScope.launch {
    performSuspendingInit()
    doThingsThatRequireSuspendingInitToBeComplete()
  }
}

If you are in a suspend function and you need to launch something, use coroutineScope {} instead of going hunting for another external scope to launch into:


suspend fun myFunction() {
  coroutineScope {
    launch { doThingOne() }
    launch { doThingTwo() }
  }
}

structured concurrency means async behavior is always opt-in; launching into your caller's scope is an antipattern because it "leaks" running operations into your caller. By writing it as the above, the caller can wrap it in their own launch {} call if they want it to happen concurrently with something else they're doing, and they can use exactly the same coroutineScope {} pattern to get a scope.

Adam Powell

10/04/2021, 2:50 PM

If you have a non-suspend function and you need to do something over time, then you need to think through whether you want to launch the operation into a scope or if you want to send a message to a suspend function that's already running via a Channel or similar mechanism.
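A rough sketch of the Channel option, with a hypothetical `Uploader` class. The non-suspend `enqueue` only posts a message; the suspending work happens in `run()`, which is launched once near an entry point and owns the loop:

```kotlin
import kotlinx.coroutines.channels.Channel
import kotlinx.coroutines.delay
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking

// Hypothetical component: a worker coroutine owns the suspending work,
// and plain non-suspend code only sends messages to it.
class Uploader {
    private val requests = Channel<String>(Channel.UNLIMITED)
    val uploaded = mutableListOf<String>()

    // Non-suspend entry point. trySend never suspends or blocks; with an
    // unlimited buffer it only fails once the channel has been closed.
    fun enqueue(item: String): Boolean = requests.trySend(item).isSuccess

    fun close() {
        requests.close()
    }

    // Long-running suspend function that processes messages until close().
    suspend fun run() {
        for (item in requests) {
            upload(item)
        }
    }

    private suspend fun upload(item: String) {
        delay(10) // stands in for real suspending work
        uploaded += item
    }
}

fun demo(): List<String> {
    val uploader = Uploader()
    runBlocking {
        val worker = launch { uploader.run() }
        uploader.enqueue("a") // could be called from any non-suspend code path
        uploader.enqueue("b")
        uploader.close()      // buffered items are still delivered before the loop ends
        worker.join()
    }
    return uploader.uploaded
}
```

With this shape the non-suspend caller never needs a scope of its own; the trade-off is that it cannot wait for the result without some further reply mechanism.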

DALDEI

10/05/2021, 3:32 AM

Very good suggestions, thanks. As mentioned, this isn't unique to coroutines; the problem (for me) tends to come up most when augmenting existing code that was not (or only poorly) 'coroutine aware'. "Crossing the boundary" between coroutines and non-coroutines may really just be another face of 'crossing the boundary' between 2 entirely different architectural patterns. So in a real sense (and I believe intentionally) the specific 'rules' of Kotlin coroutines have been designed and tuned over time to make it hard to screw up, by making it for the most part a compile-time error to 'cross the boundary' easily. It may be a fundamental design anti-pattern to mix these 2 styles other than 'from the edges', where the behaviour is well defined. This gets massively harder when you 'swerve over the lines' vs 'cross'. E.g. this pattern is very hard to get right. Imagine these functions in different compilation units (so they do NOT share the same 'scope' in any meaning). Also imagine they were written at different times by different people.

main() {
  normalFunction()
}

fun normalFunction() {
  someScope.launch {
    callSuspend()
  }
  callNonSuspend()
}

suspend fun callSuspend() {
  callNonSuspend()
}

// ----------------------------->>> HERE
fun callNonSuspend() {
  // Imagine today I want to ADD a call to some suspend fun --
  whatToDoToCallSuspend()
}

------------

Certainly there are many ways to do this, but the problem isn't so much how, but what. It makes a difference whether 'callNonSuspend' was called 'from a coroutine' or not. Things like exception handling (should exceptions be propagated to the caller's scope or not?), who is going to manage a coroutine that is launched but not waited for, and whether it's 'OK to block the thread', which can lead to deadlock, not just inefficiency. These are not fundamentally different from other architectural questions, but they add a whole new dimension to 'WTF'. If you're adding coroutines to an existing program, then it's unlikely the previous authors thought about coroutines at all, or in the same way. The variety and effect of interactions between otherwise unrelated code is substantial when you add coroutines.

However, one other factor makes this more specific to both Kotlin and coroutines: the heavy use of scope conventions. By this I mean the kind of "DSL scope" that coroutine builders use, similar to the more idiomatic scope functions (let, with, use, apply etc). These leverage the compiler in ways that are profoundly powerful, but *only* work when the entire call chain is in scope. E.g. this works:

MyScope.launch { async { coroutineScope { launch { async { ... } } } } }

All of these very nicely and cleanly hide the implicit scope so they 'just work'. But when the same exact code gets distributed across files and modules, it falls apart dramatically. There is no implicit scope, or it's the wrong one. What was enforced/supplied by the compiler is gone. You have to explicitly recreate the implicit scope and carry it through everything. That is no different from other Kotlin scope usage, yet fundamentally different in that the current version of the coroutine APIs requires it: you end up having to explicitly add the receivers to every level of function call just to use the scope 1 time 10 levels deeper. All the hard work of the library authors vanishes the moment you leave the lexical scope. Building the scaffolding back up is not trivial.

An approach I have not yet tried, but may, is using the newly supported ThreadLocal coroutine context. This could help a LOT, but I haven't fully walked through all the details.
https://kotlin.github.io/kotlinx.coroutines/kotlinx-coroutines-core/kotlinx.coroutines/-thread-context-element/index.html
https://kotlin.github.io/kotlinx.coroutines/kotlinx-coroutines-core/kotlinx.coroutines/as-context-element.html

Basic concept:

fun CoroutineScope.topLevel() {
  // put "this" into a thread local
  fun2()
}
....
fun fun100() {
  // Look for scope or context from thread local
  with( myThreadLocalScope ) {
    // Yea Life is restored
  }
}

ephemient

10/05/2021, 7:41 AM

step in the wrong direction. you do not want to be passing the scope around like that, which enables callees to break out of structured concurrency.

☝️ 1

DALDEI

10/18/2021, 11:17 PM

Yes, I get that, and have been bitten. BUT what else can be done if suspend F1() calls non-suspend F2() ... F10()? What options do you have to run a coroutine in F10()? (Assuming F10 is not in the same compilation unit, or inline, or otherwise does not have direct access to the scoping functions.)

A) F10 can call runBlocking() -- uses the thread global scope
B) F10 can have its own created-on-the-fly scope
C) F10 can use the scope of its caller (F1), passed to it somehow

What else is there? All of them suffer from 'breaking out of structured concurrency', except C, where at least F10 would share the same structured scope as F1 and thus the same cancellable lifecycle. What is the 'right direction' if these are 'the wrong one'? That is my still-unanswered puzzle. Easy to say 'don't do that', not so easy to say 'do THIS instead'.

ephemient

10/19/2021, 12:58 AM

don't make both F1() and F10() suspend until you've converted the interior calls - either start from the top and work your way down the call tree, or maybe vice versa

DALDEI

10/29/2021, 1:23 AM

In the end, the requirement is that all function calls between any 2 calls using coroutines must be suspend calls OR pass the scope or context. The std lib has plenty of cases (and docs that suggest it) where passing a CoroutineContext or CoroutineScope as either a receiver or a parameter is a reasonable alternative to using suspend (in some cases preferred). Given a complex app, where coroutines are being introduced and where it is not viable to rewrite all the paths between any 2 calls, using the coroutine-supported version of thread local accomplishes the exact same thing as passing a context or coroutine scope as a parameter.

suspend fun f( c: CoroutineScope ) {
  // put c in coroutine impl of thread local
  f1()
}
fun f1() { f2() }
.....
fun f10() {
  val scope : CoroutineScope = // get from thread local
  scope.launch {
    // This uses the same context safely as f()
  }
}
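One way this thread-local idea might look with `ThreadLocal.asContextElement`, as a sketch only. The names `currentScope`, `log`, `f`, `f1` and `f10` are hypothetical, the intermediate non-suspend layers are collapsed into `f1`, and the objection raised earlier in the thread still applies: this hands callees a way to launch work the caller did not ask for.

```kotlin
import java.util.concurrent.CopyOnWriteArrayList
import kotlinx.coroutines.CoroutineScope
import kotlinx.coroutines.asContextElement
import kotlinx.coroutines.coroutineScope
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.withContext

// Hypothetical holder; null means no coroutine has published a scope on this thread.
val currentScope = ThreadLocal<CoroutineScope?>()

// Records what happened, for illustration only.
val log = CopyOnWriteArrayList<String>()

suspend fun f() = coroutineScope {
    val scope: CoroutineScope = this
    // Publish the scope while this block runs; the context element sets the
    // thread local on whichever thread the coroutine resumes on and restores it after.
    withContext(currentScope.asContextElement(value = scope)) {
        f1()
    }
    // coroutineScope waits here for anything f10() launched into `scope`.
}

// Stands in for the non-suspend layers between f() and f10().
fun f1() = f10()

fun f10() {
    val scope = currentScope.get()
    if (scope == null) {
        log += "no scope available"
    } else {
        scope.launch { log += "launched in caller scope" }
    }
}

fun main() {
    runBlocking { f() }
    println(log)
}
```

The thread local is only populated while the publishing coroutine is running on the current thread, so `f10()` must be reached synchronously from inside the `withContext` block; a call that hops to another thread on its own will not see the scope.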

DALDEI

10/29/2021, 1:30 AM

My understanding/interpretation: the key to this is understanding that the coroutine 'builder' DSL relies heavily on the use of Kotlin semantic scope, whereas the underlying coroutine API does not. Almost all of the (recent) docs focus entirely on the builders, for good reason, as that is the preferred and clean way to do it. However, that only works when the code is within the same semantic scope. It breaks down when you cross compilation units or modules where there is no such scope available. That doesn't stop coroutines from working, but it does stop the conventions that rely on Kotlin scope from working, which is all of the structured coroutine builders, until you can re-establish the scope. How to bridge 2 disjoint Kotlin scopes such that you can use the coroutine builder DSL correctly is a topic that, as far as I can find, is not discussed at all, hence discussing it. So far I have not found a better solution, or even an equivalent one, to the above. Still very much open to discovering one.
