# server
n
Hello. Do you have any insights into my problem? I have a backend with many cores, C10k traffic, and a DB. The DB is the slowest part. Of course, I am using Ktor. Question: which DB driver performs better (assume the server hits 100% load)? - A sync driver. - An async driver + coroutines, where a thread is able to process another request while a DB query is in flight. I ran some benchmarks on an artificial system and found that the sync driver produced ~10% more reqs/sec. Also, the async driver takes much more memory under the same load. I assume coroutines have some overhead. I even tried limiting IO bandwidth inside Docker to emulate an IO cap. The sync driver was still faster. However, I am not sure my benchmark was good. So, my question: why should I use async + coroutines? Do you have any insights from real-world applications?
o
I believe in a real app you should have some cache between the app and the DB, so not all requests will hit the DB. Those requests will be able to complete without waiting for the DB, and overall time-to-respond will be better. If you make a synthetic benchmark where all requests hit the DB, you won’t get any benefit, unless the async driver implements pipelining too.
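A minimal sketch of that idea, in plain Java with `CompletableFuture` instead of coroutines to keep it dependency-free. `CachedLookup` and `dbQuery` are hypothetical names for illustration, not part of Ktor or any driver. Repeated keys are answered from memory, so only cache misses reach the DB.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical read-through cache in front of an async DB lookup.
class CachedLookup<K, V> {
    private final Map<K, CompletableFuture<V>> cache = new ConcurrentHashMap<>();
    // Assumed to start the query and return a future without blocking.
    private final Function<K, CompletableFuture<V>> dbQuery;

    CachedLookup(Function<K, CompletableFuture<V>> dbQuery) {
        this.dbQuery = dbQuery;
    }

    // The first call per key runs the DB query; later calls reuse its future.
    // Sketch only: no eviction, and a failed lookup stays cached as a failure.
    CompletableFuture<V> get(K key) {
        return cache.computeIfAbsent(key, dbQuery);
    }
}
```

In a benchmark where every request uses a fresh key, this cache never hits, which matches the point above about synthetic load.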
b
Synchronous should be faster. There are overheads in switching threads, allocating threads, and managing thread-safe references. Async isn't meant to improve the performance of your database, though.
o
@bdawg.io once a connection to the DB is acquired from the pool, there shouldn’t be any multithreading cost involved (if you don’t use JDBC or any other driver with thread locals). Coroutines do indeed add some cost in allocations of state objects. Note that for maximum performance the pool should be coroutine-based too, so that you don’t block a thread when no connection is available. E.g. HikariCP is a blocking pool, so it won’t work well with async code.
If you check www.techempower.com/benchmarks/#section=data-r17&hw=ph&test=db you will see that vert.x with async and pipelined db drivers beats everyone else.
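To illustrate the pool point, here is a rough sketch of a pool whose `acquire()` returns a future instead of parking the caller. It is plain Java with `CompletableFuture`, not a coroutine-based pool, and `AsyncPool` is a hypothetical class, not the API of HikariCP or any real library.

```java
import java.util.ArrayDeque;
import java.util.Collection;
import java.util.Queue;
import java.util.concurrent.CompletableFuture;

// Hypothetical pool: callers never wait on a thread for a free connection.
class AsyncPool<C> {
    private final Queue<C> idle = new ArrayDeque<>();
    private final Queue<CompletableFuture<C>> waiters = new ArrayDeque<>();

    AsyncPool(Collection<C> connections) {
        idle.addAll(connections);
    }

    // Returns at once; the future completes when a connection becomes free.
    synchronized CompletableFuture<C> acquire() {
        C conn = idle.poll();
        if (conn != null) {
            return CompletableFuture.completedFuture(conn);
        }
        CompletableFuture<C> waiter = new CompletableFuture<>();
        waiters.add(waiter);
        return waiter;
    }

    // Hands the connection to the oldest live waiter, or marks it idle.
    void release(C conn) {
        while (true) {
            CompletableFuture<C> waiter;
            synchronized (this) {
                waiter = waiters.poll();
                if (waiter == null) {
                    idle.add(conn);
                    return;
                }
            }
            // Completed outside the lock because callbacks run on this thread.
            // complete() returns false for a cancelled waiter, so try the next.
            if (waiter.complete(conn)) {
                return;
            }
        }
    }
}
```

The lock is held only for queue bookkeeping, so a request thread is free to serve other work while its future is pending. A blocking pool would hold that thread for the whole wait.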
n
In my case there is no threading overhead. With the sync driver it is obvious: number of cores == number of threads. With async: coroutines use the same thread pool. And the async driver is implemented using reactive streams, so it also creates a few threads for connections to the DB.
But coroutines have to put their state somewhere, and that is overhead, I heard.
o
Those few threads for reactive streams, and rescheduling jobs onto them, are what make things slower.
n
It means that unless the async DB driver is implemented with Kotlin coroutines, it will always be slower compared to sync. Am I correct?
o
Not necessarily. Coroutines are just a way to simplify writing async code (and reduce errors). If you use the same thread pool for both, you should be fine. Unfortunately, I don’t know much about reactive streams implementations (not to mention there are several), so I can’t advise here. We are prototyping an async SQL client for Postgres, but it’s far from usable.
n
To be concrete, I am using MongoDB, and the official driver uses vanilla reactive-streams. However, under the hood it is still the old async driver, which uses its own thread pool.
So, in the current state, I see that coroutines would be beneficial if many requests do not use the DB at all, so they could be processed while other coroutines are waiting for the DB.
o
Right. If all your code eventually goes to the DB, async doesn’t make anything faster by magic; your bottleneck is not there.
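A rough back-of-the-envelope bound, assuming one DB query per DB-bound request, at most $N$ queries in flight at the DB (for example, the pool size), a mean query time $W$, and a fraction $h$ of requests answered without the DB. All of these symbols are introduced here for illustration. By Little's law, the DB side caps total throughput $\lambda$ at:

$$\lambda \le \frac{N}{(1-h)\,W}$$

With $h = 0$ the cap is $N / W$ whether the driver is sync or async. The cap only moves when $h$ grows (caching, requests that skip the DB) or when $N$ or $W$ change on the DB side.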
n
Thanks for the discussion!
o
We are using and maintaining jasync-sql: https://github.com/jasync-sql/jasync-sql (it also has vertx support). As mentioned here, if the DB itself is slow, async will not solve it. It can help in handling more concurrent requests, though.
s
@neworldlt For good performance, as suggested by @orangy, you need pooling (and maybe caching). The MongoDB reactive-streams driver does not provide a pool, so don't expect huge performance from using it directly. We are working on a Reactor-based pool that will allow a much better level of performance and will be integrated into Spring Data Reactive support.