@elizarov do you know if the ForkJoinPool has any odd scheduling behavior?
This isn't really specific to coroutines.
In C# and Go, I can spin up Ping-Pong actor pairs equal to the number of cores my machine has,
and I get maximum throughput there. Adding more pairs does nothing; the throughput stays the same.
On the JVM, with both my own ProtoActor (on top of coroutines) and Scala Akka (which also uses a custom fork-join pool),
I have to spin up many more pairs, e.g. core count * 10, to max out throughput.
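
For reference, here is a rough sketch of the kind of test I mean, in plain Java on a `ForkJoinPool`. It is not the ProtoActor or Akka code: there are no mailboxes, each "message" is just a task resubmitted to the pool, and the names `PingPong`, `run` and `hop` are made up for illustration. It also does no JIT warm-up, so treat the numbers as indicative only.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.atomic.LongAdder;

public class PingPong {
    // Runs `pairs` ping-pong pairs, each doing `roundTrips` round trips
    // (two messages per round trip), and returns the total messages delivered.
    static long run(ForkJoinPool pool, int pairs, int roundTrips) throws InterruptedException {
        LongAdder delivered = new LongAdder();
        CountDownLatch done = new CountDownLatch(pairs);
        for (int i = 0; i < pairs; i++) {
            pool.execute(() -> hop(pool, 2 * roundTrips, delivered, done));
        }
        done.await(); // wait until every pair has finished its exchange
        return delivered.sum();
    }

    // One message delivery: count it, then send the reply as a new pool task.
    static void hop(ForkJoinPool pool, int remaining, LongAdder delivered, CountDownLatch done) {
        delivered.increment();
        if (remaining == 1) {
            done.countDown(); // last message of this pair
        } else {
            pool.execute(() -> hop(pool, remaining - 1, delivered, done));
        }
    }

    public static void main(String[] args) throws InterruptedException {
        int cores = Runtime.getRuntime().availableProcessors();
        ForkJoinPool pool = new ForkJoinPool(cores);
        int roundTrips = 200_000;
        // Compare one pair per core against ten pairs per core.
        for (int pairs : new int[] {cores, cores * 10}) {
            long start = System.nanoTime();
            long messages = run(pool, pairs, roundTrips);
            double seconds = (System.nanoTime() - start) / 1e9;
            System.out.printf("pairs=%d messages=%d msg/s=%.0f%n",
                    pairs, messages, messages / seconds);
        }
        pool.shutdown();
    }
}
```

If the pool scheduled like the C# and Go runtimes do for me, I'd expect the two msg/s figures to come out roughly equal.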