# coroutines
m
In a product of ours we have had many performance issues connected to the network latency between the application and the database server. There are many trivial SQL queries executed in sequence, and each query costs a network round trip. Critically, the database only permits a single query at a time on a given connection/transaction, so I can't hide the latency by running the queries in parallel. Instead of "real" parallel execution, I had an idea to develop a coroutine dispatcher that allows each coroutine to issue a query, but queues the queries up and sends them in a single batch (one call to Statement.execute() with multiple statements, or using executeBatch()). Each query is associated with the continuation that will process its results, so once the result sets come in I can dispatch them all through multiple resumeWith() calls. My problem is: how do I know when to stop waiting for more queries to be queued up and actually make the JDBC call? Can the coroutine dispatcher somehow detect when await() is called (or there is an implicit wait) on any of its queued continuations? Basically, I want to keep queuing queries up until the originating control flow enters a waiting state.
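A minimal sketch of the shape being described, with hypothetical names (`QueryBatcher`, `flush`): each caller suspends in `execute()` and its continuation is stored next to its SQL, and `flush()` makes the single JDBC round trip and resumes everyone. It uses executeBatch()/update counts for brevity (real result-set batching would go through Statement.execute() and getMoreResults()), assumes single-threaded confinement, and leaves open exactly the question asked above: who calls `flush()`, and when.

```kotlin
import java.sql.Connection
import kotlin.coroutines.Continuation
import kotlin.coroutines.resume
import kotlin.coroutines.suspendCoroutine

// Hypothetical sketch, not an existing API. Assumes all calls happen on one
// thread; real code would need synchronization and error handling.
class QueryBatcher(private val connection: Connection) {
    private data class Pending(val sql: String, val cont: Continuation<Int>)
    private val pending = mutableListOf<Pending>()

    // Callers suspend here; the continuation that will process the result
    // is queued alongside the query itself.
    suspend fun execute(sql: String): Int = suspendCoroutine { cont ->
        pending += Pending(sql, cont)
    }

    // One network round trip for everything queued so far.
    fun flush() {
        val batch = pending.toList().also { pending.clear() }
        connection.createStatement().use { stmt ->
            batch.forEach { stmt.addBatch(it.sql) }
            val counts = stmt.executeBatch() // one update count per statement, in order
            batch.forEachIndexed { i, p -> p.cont.resume(counts[i]) }
        }
    }
}
```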
a
The dispatcher is the wrong unit to try to manipulate; you can do what you're describing through your suspending query function
m
So the suspending query function... queues up the query, right? But then remains the question, how to know when to send the batch?
Do I yield and then signal the shared queue that it should execute the batch?
a
how does a caller signal to you that they've enqueued the full batch?
and does it matter?
how long are you willing to wait to form a batch vs. sending queries serially?
m
I'm imagining that I have a block of code (a scope in a method) that fires up multiple coroutines. Eventually execution enters a point when all coroutines that were started, including the top coroutine, are all suspended, all waiting for some query to finish. That's when I want to fire off the batch and subsequently release all the waiting continuations when the results come back. I'm willing to wait as long as any coroutine is still running and able to queue up more queries, but only within this particular scope of execution (say, a REST query being serviced).
There will likely be internally dependent queries so once you resume the initial continuations there will be a bunch of new ones being queued up.
a
the coroutines machinery works in layers where lower layers by design don't have knowledge of the semantics of layers above. Dispatchers/ContinuationInterceptors only know how to modify the way that continuations resume; they know nothing about why a coroutine suspended or when/why one will resume. Jobs sort of know about vague structural dependencies between coroutines but again, no concept of why, and there aren't useful intermediate non-terminal states that you can use to represent, "I am still running but I am waiting for a very specific kind of result"
from either of those layers you can't know whether something is suspended waiting for a query vs. suspended waiting on a `delay` or similar vs. suspended waiting on some aggregated result of several queries running in different coroutines
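To make the layering point concrete, this is roughly all a ContinuationInterceptor ever sees (a minimal sketch; the logging is illustrative): it can wrap how continuations resume, but there is no callback for "a coroutine just suspended, and here's why".

```kotlin
import kotlin.coroutines.AbstractCoroutineContextElement
import kotlin.coroutines.Continuation
import kotlin.coroutines.ContinuationInterceptor

class LoggingInterceptor : AbstractCoroutineContextElement(ContinuationInterceptor),
    ContinuationInterceptor {
    override fun <T> interceptContinuation(continuation: Continuation<T>): Continuation<T> =
        object : Continuation<T> {
            override val context get() = continuation.context
            override fun resumeWith(result: Result<T>) {
                // The interceptor only observes resumption. Whether the coroutine
                // had been waiting on a query, a delay, or an await is invisible.
                println("resuming with $result")
                continuation.resumeWith(result)
            }
        }
}
```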
m
Ok. So you don't see any way of using a coroutine based abstraction to hide the details of the batching needed to remove latency then?
a
I didn't say that, I said that you don't have enough info to be able to do it from a dispatcher 🙂
or from other notions of local "idleness" because they suffer from the same limitations
so either you need to accept that batching is going to be its own thing that determines when a batch is ready to go on its own without trying to monitor idleness of related operations, or you'll need to have the client give an explicit signal of some sort. That explicit signal might come from something like a dsl-scoped block of code reaching the end, but it's still "explicit" from the standpoint of client layering
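A sketch of that second option, under assumed names (`batchScope`/`BatchScope`, not an existing API): the block only enqueues queries as Deferreds, and reaching the end of the block is the caller's explicit signal that the batch is complete. The block is deliberately non-suspending, so a result can't be awaited before the batch has been flushed.

```kotlin
import kotlinx.coroutines.CompletableDeferred
import kotlinx.coroutines.Deferred

class BatchScope {
    private val pending = mutableListOf<Pair<String, CompletableDeferred<Int>>>()

    // Enqueue only; the result is not available until the block ends.
    fun query(sql: String): Deferred<Int> =
        CompletableDeferred<Int>().also { pending += sql to it }

    internal fun flush(executeBatch: (List<String>) -> IntArray) {
        val batch = pending.toList().also { pending.clear() }
        val counts = executeBatch(batch.map { it.first }) // one round trip
        batch.forEachIndexed { i, (_, d) -> d.complete(counts[i]) }
    }
}

// Reaching the end of the block is the explicit "batch complete" signal.
fun <T> batchScope(executeBatch: (List<String>) -> IntArray, block: BatchScope.() -> T): T {
    val scope = BatchScope()
    val result = scope.block() // non-suspending: nothing can await mid-batch
    scope.flush(executeBatch)
    return result
}
```

Dependent queries would then need a second `batchScope` round, which is the multi-level problem raised next.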
m
Yeah the problem is that it's hard to explicitly say when it's time to run the batch since there are multiple levels in the call chain and each level can have queries that are not dependent on the other's results (and can hence run in the same batch). The model I have in my head is that of a build system such as GNU Make: I feed it a dependency graph, and from that it implicitly figures out what it can run in parallel (i.e. in a batch) and what needs to wait because it depends on results from the previous tasks. And I was hoping that the coroutine machinery would have access to such a dependency tree for jobs based on which coroutine waits on what coroutine. But I guess I could make a DSL with lambda functions for something more akin to a build system instead.
a
it has no such dependency tree and really, it probably shouldn't. Relying on such a thing is always going to be fragile since it's so easy to construct a scenario where 3rd party code can suspend in such a way that there's a semantic dependency that isn't represented structurally within the system. Expanding the structure to be able to model all possible use cases would make the whole system unwieldy and possibly perform badly
m
I see. Thanks for taking the time to explain.
a
I think that as you work through this you might find that creating a precise dependency tracking setup doesn't perform any better than accumulating queries from a channel and then sending the whole batch after some short time delay
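For reference, a sketch of that channel-based approach; the `Request` type, the 5 ms window, and the `executeBatch` parameter are arbitrary assumptions. The channel should be created with `Channel(Channel.UNLIMITED)` so senders never suspend during the window.

```kotlin
import kotlinx.coroutines.CompletableDeferred
import kotlinx.coroutines.CoroutineScope
import kotlinx.coroutines.channels.Channel
import kotlinx.coroutines.delay
import kotlinx.coroutines.launch

data class Request(val sql: String, val reply: CompletableDeferred<Int>)

fun CoroutineScope.timedBatcher(
    requests: Channel<Request>,               // create with Channel(Channel.UNLIMITED)
    executeBatch: (List<String>) -> IntArray,
) = launch {
    while (true) {
        val batch = mutableListOf(requests.receive()) // wait for the first query
        delay(5)                                      // short window for more to arrive
        while (true) {                                // drain whatever accumulated
            batch += requests.tryReceive().getOrNull() ?: break
        }
        val counts = executeBatch(batch.map { it.sql }) // one round trip
        batch.forEachIndexed { i, r -> r.reply.complete(counts[i]) }
    }
}
```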
m
Perhaps. Or I could issue the first query right away and while waiting for the results of that I keep queuing up subsequent queries.
👍 3
a
yeah, that's another idea too
it may even end up performing worse depending on how subsequent/otherwise unrelated queries get stacked up
play with it and profile
j
@Mattias Flodin The last thing you suggested is what is sometimes referred to as "natural batching" - not time-based, not size-based. When the "actor" (the DB) is ready, it takes all available elements from the queue and that's your new batch. While the actor is working on a batch, all new queries are enqueued, waiting for it to be ready. That might work well enough for you
👍 3
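A sketch of that natural batching loop, reusing the same hypothetical `Request` shape as above: no timer and no size limit. The actor suspends only when nothing is queued, drains everything that is, and queries arriving mid-batch become the next batch.

```kotlin
import kotlinx.coroutines.CompletableDeferred
import kotlinx.coroutines.CoroutineScope
import kotlinx.coroutines.channels.Channel
import kotlinx.coroutines.launch

// Same shape as the earlier sketch.
data class Request(val sql: String, val reply: CompletableDeferred<Int>)

fun CoroutineScope.naturalBatcher(
    requests: Channel<Request>,               // create with Channel(Channel.UNLIMITED)
    executeBatch: (List<String>) -> IntArray,
) = launch {
    for (first in requests) {                 // suspend only when nothing is queued
        val batch = mutableListOf(first)
        while (true) {                        // take everything already waiting
            batch += requests.tryReceive().getOrNull() ?: break
        }
        // While this round trip is in flight, new queries pile up in the
        // channel and become the next batch -- no timer needed.
        val counts = executeBatch(batch.map { it.sql })
        batch.forEachIndexed { i, r -> r.reply.complete(counts[i]) }
    }
}
```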
m
Ah yes, I've had to deal with that issue with SharedFlow previously and solved it in a pretty roundabout way by keeping a separate event history that is checked at the same time the flow is polled for an event. Probably not a good solution for the general case.
m
Hey there, I just found this thread as I’m trying to solve basically the same problem. I’m trying to build Facebook’s DataLoader in Kotlin using coroutines. DataLoader is made for Node.js and basically uses the end of the current event loop cycle to dispatch a batch of “load” events. The goal is to use it in a GraphQL project to batch database queries that are executed in parallel when resolving fields. Since coroutines aren’t event loop-based, this poses quite a challenge. The only idea I have so far that will likely work is to dispatch a batch after a certain delay, plus optionally a manual trigger. That certainly adds a performance penalty of at least 1 ms per batch – maybe less if I implement an alternative to `delay` that supports sub-millisecond delays. @Mattias Flodin basically described what would be my ideal scenario too:
> Eventually execution enters a point when all coroutines that were started, including the top coroutine, are all suspended, all waiting for some query to finish. That’s when I want to fire off the batch and subsequently release all the waiting continuations when the results come back.
Calling `DataLoader.load(…)` makes it explicit that we’re waiting for something, and once a batch is dispatched all loads are combined into one query. @Adam Powell I don’t fully understand your point about how layering makes this impractical or impossible. What exactly is a layer? Are lower layers opaque to higher layers? Is there a good source to read about the architecture? Is it not possible to wait for all execution within that scope to be suspended, then execute some logic (dispatch a batch, which adds another suspended execution), and only then allow all executions to resume again? It doesn’t matter if a suspension is a query, a delay, or anything else. “Wait for all execution within that scope to be suspended” would just be the equivalent of Node.js’ end of the current event loop cycle, and allowing the execution to resume the equivalent of resuming the event loop.
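A rough sketch along those lines (the names and the 1 ms figure come from this message; this is not Facebook's actual DataLoader API, and it assumes a single-threaded dispatcher, so no locking): loads accumulate, a scheduled delay stands in for Node's end-of-tick, and `dispatch()` doubles as the optional manual trigger.

```kotlin
import kotlinx.coroutines.CompletableDeferred
import kotlinx.coroutines.CoroutineScope
import kotlinx.coroutines.Job
import kotlinx.coroutines.delay
import kotlinx.coroutines.launch

class DataLoader<K, V>(
    private val scope: CoroutineScope,
    private val batchLoad: suspend (List<K>) -> List<V>,
) {
    private val pending = mutableListOf<Pair<K, CompletableDeferred<V>>>()
    private var scheduled: Job? = null

    suspend fun load(key: K): V {
        val deferred = CompletableDeferred<V>()
        pending += key to deferred
        // Stand-in for "end of the current event loop cycle": dispatch soon,
        // unless a dispatch is already scheduled.
        if (scheduled == null) scheduled = scope.launch {
            delay(1) // the ~1 ms penalty discussed above
            scheduled = null
            dispatch()
        }
        return deferred.await()
    }

    // Also callable directly as the optional manual trigger.
    suspend fun dispatch() {
        scheduled?.cancel() // a manual call supersedes the pending timer
        scheduled = null
        if (pending.isEmpty()) return
        val batch = pending.toList().also { pending.clear() }
        val values = batchLoad(batch.map { it.first }) // one combined query
        batch.forEachIndexed { i, (_, d) -> d.complete(values[i]) }
    }
}
```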