# coroutines
a
We have an argument in our team regarding cache coherency and how a coroutine context switch affects it. Imagine the situation:
1. Coroutine C modifies object O
2. Coroutine C yields, and the thread switches to another coroutine
3. Coroutine C is resumed by another thread, uses data from O and updates object O again
4. The process repeats; context switches happen a lot, usually within a few milliseconds
Note: there is no concurrent access to object O, only a single coroutine works with O at any point in time.
Is there a chance that updates made by one thread in iteration N are not visible to another thread in iteration N+1? In other words, what happens with the CPU cache when coroutines are switched very quickly? Is it coherent no matter what?
e
context switching machinery inside coroutines should have all the necessary memory barriers to ensure writes from one thread are published when it suspends and are visible to other threads when they resume
not to say there can't be bugs - Scala had an issue in their equivalent, https://www.youtube.com/watch?v=EFkpmFt61Jo - but if that doesn't work it's definitely a bug
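A minimal sketch of the scenario from the question, using only the Kotlin standard library so that it runs without `kotlinx.coroutines`. The names `HoppingAction`, `hop` and `runDemo` are made up for illustration, and the hand-rolled `hop` merely stands in for a real dispatcher. It relies on one documented fact: handing a task to a `java.util.concurrent` executor establishes happens-before between the submitting thread and the thread that runs the task, so the plain `state` field is visible after every resume.

```kotlin
import java.util.concurrent.CountDownLatch
import java.util.concurrent.Executors
import kotlin.coroutines.Continuation
import kotlin.coroutines.EmptyCoroutineContext
import kotlin.coroutines.resume
import kotlin.coroutines.startCoroutine
import kotlin.coroutines.suspendCoroutine

// Hypothetical stand-in for the Action class discussed in the thread.
class HoppingAction {
    private var state = 0L                  // plain field, deliberately not @Volatile
    val threadsSeen = HashSet<String>()     // also plain, touched only by the coroutine

    suspend fun action(iterations: Int, hop: suspend () -> Unit): Long {
        repeat(iterations) {
            state += 1                                    // read-modify-write of a plain field
            threadsSeen.add(Thread.currentThread().name)
            hop()                                         // suspend, resume on another thread
        }
        return state
    }
}

fun runDemo(iterations: Int): Pair<Long, Int> {
    val pools = listOf(
        Executors.newSingleThreadExecutor(),
        Executors.newSingleThreadExecutor()
    )
    var next = 0
    // Suspend the coroutine and resume it on the next pool in round-robin order.
    val hop: suspend () -> Unit = {
        suspendCoroutine<Unit> { cont ->
            pools[next++ % pools.size].execute { cont.resume(Unit) }
        }
    }

    val action = HoppingAction()
    val done = CountDownLatch(1)
    var total = -1L
    val body: suspend () -> Long = { action.action(iterations, hop) }
    body.startCoroutine(Continuation(EmptyCoroutineContext) { result ->
        total = result.getOrDefault(-1L)
        done.countDown()                    // publishes total to the waiting thread
    })
    done.await()
    pools.forEach { it.shutdown() }
    return total to action.threadsSeen.size
}

fun main() {
    val (total, threads) = runDemo(10_000)
    println("total=$total, distinct threads=$threads")
}
```

Because only one coroutine ever touches `state` and each hop goes through an executor, no increment should be lost, even though no field is volatile.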
c
Are you passing the object around or is object O some kind of global?
a
class Action {
private var state = ...

suspend fun action() { ... }
}
Action is invoked by another facility
e
if it's possible that `action()` could be invoked by multiple coroutines at once, then you may have an issue
otherwise, if there's only one running coroutine at a time that is touching state, it should look the same as unthreaded code. if it doesn't then there's a bug in coroutines, because a goal is that straight-line code still acts the same.
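If `action()` can in fact be entered by several coroutines at once, one common option is to serialize access with a `Mutex` from `kotlinx.coroutines`. The sketch below assumes that library is on the classpath; `GuardedAction`, `current` and `runGuarded` are hypothetical names, and the `yield()` inside the critical section only simulates a suspension point in the middle of an update.

```kotlin
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.coroutineScope
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.sync.Mutex
import kotlinx.coroutines.sync.withLock
import kotlinx.coroutines.yield

// Hypothetical variant of Action that tolerates concurrent callers.
class GuardedAction {
    private val mutex = Mutex()
    private var state = 0               // plain field, guarded by mutex

    suspend fun action() = mutex.withLock {
        val seen = state
        yield()                         // suspension point in the middle of the update
        state = seen + 1
    }

    suspend fun current(): Int = mutex.withLock { state }
}

fun runGuarded(callers: Int, perCaller: Int): Int = runBlocking {
    val guarded = GuardedAction()
    coroutineScope {
        repeat(callers) {
            // Dispatchers.Default is multi-threaded, so callers really do overlap.
            launch(Dispatchers.Default) {
                repeat(perCaller) { guarded.action() }
            }
        }
    }
    guarded.current()
}

fun main() {
    println(runGuarded(callers = 100, perCaller = 100))
}
```

Without the lock, the read of `state` before `yield()` and the write after it could interleave between callers and lose updates; with it, each update completes before the next one starts.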
c
i don’t know enough about your implementation, but i’d be concerned about safe publication of both the field and the values inside it. (state is a `var` so you could be changing both).
a
State does not leave Action until the computation is (partially) completed. It's basically an accumulator, but the code is somewhat algorithmically complex. The developer who wrote the code doesn't know where the bug is and is trying to blame the CPU cache (the current workaround involves copying the state at certain safepoints), saying coroutines do not guarantee a HB (happens-before) relationship across a context switch, while another developer and I hold the opposite view.
e
coroutines do guarantee it, and if you can come up with a case where they don't, file a bug
t
private var state = ...
this can always have different values on different threads due to the way modern CPUs work (in particular, different threads can have different on-die caches), regardless of whether you use coroutines or not. On the JVM you can use `@Volatile` to ensure that does not happen. Since you explicitly say you switch to a different thread and are not doing this, CPU cache is a very likely cause.
never rule out programmer error though 🙂
e
the atomics inside coroutines are supposed to do the equivalent of publishing a volatile. if they don't, they should be fixed (just like the Scala issue in the video above)
t
But they’re not using coroutines to access data, it’s just a `var` somewhere. Kotlin doesn’t even make assumptions about the underlying memory model (in this case the JVM's) or provide an abstraction for libraries (such as coroutines) to deal with this. Also it’d be pretty terrible if coroutines made all your writes go straight to main memory; your program would be an order of magnitude slower just for using coroutines. I think maybe you mean a var inside a suspend method, which effectively lives in a context under the control of coroutines itself, like:
class Action {
   suspend fun action() {
      var state = ...   // local variable; `private` is not allowed on locals
      ...
   }
}
e
on jvm, a volatile write also publishes all other writes up to that point, and conversely on read
atomics have an equivalent effect as volatile
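The piggybacking described here can be sketched with plain threads, assuming nothing beyond the Kotlin standard library. `Box` and `publishAndRead` are hypothetical names. Only `ready` is volatile, yet the Java Memory Model guarantees that a reader which observes `ready == true` also observes the earlier plain write to `payload`.

```kotlin
import kotlin.concurrent.thread

// Hypothetical holder: one plain field, one volatile flag.
class Box {
    var payload = 0                 // plain field, no @Volatile
    @Volatile var ready = false     // the volatile write/read pair carries happens-before
}

fun publishAndRead(): Int {
    val box = Box()
    var seen = -1
    val reader = thread {
        while (!box.ready) Thread.yield()   // volatile read; spins until the flag flips
        seen = box.payload                  // plain read, ordered after the volatile read
    }
    box.payload = 42                // plain write, ordered before the volatile write
    box.ready = true                // volatile write publishes everything written before it
    reader.join()                   // join makes the reader's write to seen visible here
    return seen
}

fun main() {
    println(publishAndRead())
}
```

The same reasoning is why a plain `var` touched by a single coroutine does not need its own `@Volatile`, provided the suspend/resume handoff itself goes through a volatile or atomic operation.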
a
@Tijl
this can always have different values on different threads due to the way modern CPUs work (in particular, different threads can have different on-die caches), regardless of whether you use coroutines or not.
for JVM you can use `@Volatile` to ensure that does not happen. Since you explicitly say you switch to a different thread and are not doing this, CPU cache is a very likely cause.
I've spent a few hours refreshing my knowledge of the JMM and related topics, and it seems there is no need for volatile, because the coroutines' atomics will put a fence in place during suspend anyway (it'll flush everything). And because there is no data race (i.e. only a single thread runs an action at a time), there is no need to use atomics either.
By the way, thanks for the video @ephemient, it was very interesting. I'd also recommend the series of blog posts at https://preshing.com/ on memory model topics, and also https://shipilev.net/blog/2014/jmm-pragmatics/