# random
m
@orangy imagine if the JVM could resize stacks on the fly as it garbage collects, so millions of thread stacks weren't unreasonable, and reify them onto the heap on demand. Would there still be a need for language/frontend-compiler level coroutines?
e
mikehearn: How else could you do JS-style dispatch of many-many coroutines onto a single UI thread (event loop)? That is the dream of every UI programmer — full asynchrony without ever having to worry about shared mutable state.
It is not just about stack reification. It is about cooperative multitasking (having explicit control over where you can switch from one coroutine to another) — a much easier model to program in when you are dealing with lots and lots of shared state.
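As an aside, the stdlib `sequence { }` builder is itself a (restricted) coroutine, and it can illustrate the point about explicit switch points: two coroutines interleave on a single thread, and control can only transfer at the `yield()` calls, so no locking is ever needed. A minimal sketch (names are hypothetical):

```kotlin
// Two coroutines interleaved on one thread; yield() marks the only
// points where control can switch, so shared state needs no locks.
fun interleaved(): List<String> {
    val a = sequence { yield("A1"); yield("A2") }.iterator()
    val b = sequence { yield("B1"); yield("B2") }.iterator()
    val log = mutableListOf<String>()
    while (a.hasNext() || b.hasNext()) {
        if (a.hasNext()) log += a.next()   // resume coroutine a until its next yield
        if (b.hasNext()) log += b.next()   // then resume coroutine b
    }
    return log
}
```

Calling `interleaved()` produces `[A1, B1, A2, B2]`: each coroutine runs exactly up to its next suspension point before the other gets the thread.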
k
The answer is: TeaVM 🧌
m
Yes, co-operative vs pre-emptive multi-tasking is a good answer.
I was thinking more about the server context where fibers/continuations often seem to be used for performance and to work around large thread stacks
but for the UI case where you aren't trying to improve performance, but rather get better control over scheduling, it's much harder to get that with a few VM tweaks
u
@elizarov you say, without ever having to worry about shared mutable state. I don't quite see why we don't have to worry about it any more. I think the opposite is true. In addition to traditional races, we now potentially also have reentrancy issues. Hidden well enough that the average UI programmer won't even know they are there, waiting for them. Don't get me wrong. I love coroutines. But I guess they need a lot of education to prevent a whole new class of bugs.
k
I use TeaVM to create a single-page application web framework. The UI is based on co-operative multithreading (backed by VM-supported coroutines). I didn't notice any problems with the approach. The only thing I introduced is a special annotation which prohibits turning a specific method into a coroutine.
e
@mikehearn The idea that fibers/continuations improve performance in a server context is a myth. See this account by Netflix, for example: http://techblog.netflix.com/2016/09/zuul-2-netflix-journey-to-asynchronous.html However, they definitely help to improve stability of large enterprise apps, because in a blocking world one slow 3rd party service can exhaust all worker threads. That is where some JVM support for “green threads” might help (e.g. you just allocate A LOT of worker threads if they are cheap), but that is not going to be as efficient as really-really light-weight coroutines anyway.
m
Right - their results are what I'd expect, improved memory usage/scalability, but not much speed improvement if you're already CPU saturated. By performance I meant for the case where you need prohibitive amounts of RAM to keep your cores pegged because your threads spend most of their time waiting. Good article though, thanks for the link.
k
RAM is where they help the most, if node.js is any indication
And very true about stack traces... In C#, most of the async/await stack traces were useless... You got one line of your method listed, with the rest of the stack in C# framework code. At least with Rx, you can get something meaningful by wrapping the exception...
e
@kenkyee Can you please elaborate on Rx solution to stacktraces?
k
Rx traces are useless unless you handle the error case in your subscribe call... Wrapping the exception with "new Exception(t)" is enough to get a useful stack trace
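The trick works because a freshly constructed exception captures the stack at the point where your handler runs, while the original throwable only shows the async machinery's frames. A stdlib-only sketch of the idea (the `failingAsyncCall` callback API is hypothetical, standing in for an Rx `subscribe` error handler):

```kotlin
// Hypothetical async source: the failure originates on a worker thread,
// so its own stack trace never mentions the subscriber's code.
fun failingAsyncCall(onError: (Throwable) -> Unit) {
    val t = Thread { onError(RuntimeException("backend failed")) }
    t.start()
    t.join()
}

fun rawError(): Throwable {
    var seen: Throwable? = null
    failingAsyncCall { seen = it }            // raw error: async-internals trace only
    return seen!!
}

fun wrappedError(): Throwable {
    var seen: Throwable? = null
    failingAsyncCall { seen = Exception(it) } // new Exception(t): captures the handler's frame,
                                              // keeps the original failure as the cause
    return seen!!
}
```

The wrapped exception's own trace identifies which handler saw the failure, and `getCause()` still carries the original error.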
👍 1
m
The new IntelliJ has some support for useful stack traces in async computations.
k
What did the support do?
o
@elizarov " [...] because in a blocking world one slow 3rd party service can exhaust all worker threads." Can you explain this problem a bit more in depth, and how coroutines help to solve it?
e
@okkero So you have this big enterprise system (I was writing those for 10+ years) and you have literally hundreds of various services provided by it, used by many, many customers. Some business methods have very complex logic and will sometimes invoke other (remote) services. So you have 100-1000 worker threads. All works fine. However, if one of the remote services you occasionally use becomes slow (answers in seconds instead of its usual milliseconds) you can end up with all your worker threads just waiting on the response from that remote service, preventing any useful work from being done (all incoming requests just queue up until there is an available worker thread).
How do you solve it with coroutines? First of all, with coroutines you don't have to block a thread while you wait. However, a realistic system will have some blocking calls that you cannot get rid of. Coroutines help here, too. With coroutines you can have a separate worker pool for each of those blocking services. Jumping between contexts (thread pools) is natural (easy) with coroutines — just
run(AnotherPool) { … }
and your original thread is free to do any other work while the code is running in another pool.
So coroutines give you failure isolation. If one of your subsystems fails, it only affects the requests that involve that subsystem, but does not touch anything else.
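The pool-isolation idea can be sketched with just the Kotlin stdlib coroutine primitives (no kotlinx.coroutines): suspend the caller, run the blocking call on a pool dedicated to that subsystem, and resume with the result. All names here (`onPool`, `remotePool`, `handleRequest`) are hypothetical, a stdlib-only stand-in for what `run(AnotherPool) { … }` does:

```kotlin
import java.util.concurrent.CountDownLatch
import java.util.concurrent.ExecutorService
import java.util.concurrent.Executors
import kotlin.coroutines.*

// Dedicated pool for one blocking subsystem; a slow service can only tie up
// these 4 threads, never the shared request-dispatching threads.
val remotePool: ExecutorService =
    Executors.newFixedThreadPool(4) { r -> Thread(r).apply { isDaemon = true } }

// Suspend the calling coroutine, run the blocking block on the given pool,
// and resume the coroutine with the result (or the exception).
suspend fun <T> onPool(pool: ExecutorService, block: () -> T): T =
    suspendCoroutine { cont ->
        pool.execute {
            try { cont.resume(block()) } catch (e: Throwable) { cont.resumeWithException(e) }
        }
    }

fun handleRequest(): String {
    var result = ""
    val done = CountDownLatch(1)
    // Start a coroutine; the starting thread is released as soon as it suspends.
    suspend {
        val payload = onPool(remotePool) { Thread.sleep(50); "remote-ok" } // blocking call, isolated
        result = "handled: $payload"
    }.startCoroutine(Continuation(EmptyCoroutineContext) { done.countDown() })
    done.await() // only for this demo; a real server would not block here
    return result
}
```

The coroutine suspends at `onPool`, the blocking sleep happens entirely inside `remotePool`, and the continuation resumes there when the "remote call" completes.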
o
But couldn't you have a separate worker pool regardless of whether you use coroutines or not?
e
It does not help without coroutines. When a business method invocation comes in from a user, you usually don't know in advance what services it will invoke. Will it need to go to a DB (and which DB), or will it all be served from cache? There is no way to know until you actually start executing it (parse the request, check permissions, etc). So you are forced to schedule all user requests onto the shared worker thread pool. If it happens to invoke a slow operation during execution, then you cannot release that worker pool thread, unless you have coroutines that let you suspend execution at that point and release the thread for other tasks.
m
It only converts failure modes: doing it that way means pending requests will still pile up in RAM instead of in thread stacks. Eventually you will run out of RAM no matter what, unless you push the backpressure upstream or start rejecting requests.
e
You need to support proper request cancellation, too. E.g. when a user abandons their attempt to get an answer (cancels the request or closes the connection — whatever way your transport supports) you should remove that user's queued requests and release the memory.
m
Yeah, but if you can do that you can also interrupt threads and free them up. The primary benefit in that sort of situation is that suspended continuation state is usually smaller than a thread stack.
e
You could, if only blocking APIs were interruptible. Most of them are not. What is worse, lots of libs that you'll end up using in your large project just eat up and lose the interruption status (try { … } catch (InterruptedException e) { /* ignore */ })
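A small sketch of what "losing the interruption status" means in practice (function names are hypothetical): `Thread.sleep` clears the interrupt flag when it throws, so swallowing the exception erases the interruption entirely; the standard fix is to call `Thread.currentThread().interrupt()` in the catch block so callers can still observe it.

```kotlin
// Anti-pattern: the catch block swallows the interrupt, and the cleared
// flag is never restored, so the interruption simply vanishes.
fun sleepSwallowing(): Boolean {
    try { Thread.sleep(10_000) } catch (e: InterruptedException) { /* ignore: flag is lost */ }
    return Thread.currentThread().isInterrupted  // false
}

// Correct handling: re-set the flag so code further up the stack can react.
fun sleepRestoring(): Boolean {
    try { Thread.sleep(10_000) } catch (e: InterruptedException) {
        Thread.currentThread().interrupt()
    }
    return Thread.currentThread().isInterrupted  // true
}

// Run body on a thread that gets interrupted immediately; report what it observed.
// (sleep throws InterruptedException at once if the flag is already set.)
fun probe(body: () -> Boolean): Boolean {
    var observed = false
    val t = Thread { observed = body() }
    t.start()
    t.interrupt()
    t.join()
    return observed
}
```

`probe(::sleepSwallowing)` comes back false while `probe(::sleepRestoring)` comes back true: only the second version lets the rest of the call stack see that an interrupt ever happened.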
m
Yeah, but most libraries aren't designed for coroutines either 🙂 Which brings us neatly back to the start of the thread, how lovely.
e
You don’t need them to. That is the beauty. With coroutines I can have a separate thread pool for each of my blocking subsystems (DB ops, remote calls — any legacy blocking stuff). Let them exhaust those pools — it affects only them. But then I can dispatch my business requests as coroutines to a very small worker pool that is never blocked — they suspend while waiting for blocking operations to complete in their own pools.
k
Is there a way to get health metrics on how many coroutines are suspended?
e
Not yet. We have not figured out the API for that yet. Will likely be available post-1.1 only.