# spring
a
Hi Team, I was recently doing stress testing on my Spring Boot non-blocking microservice, developed using Kotlin coroutines. I am connecting to a Postgres 16 Docker image using R2DBC. I created a simple write endpoint and stress tested it with k6. The experiment shows a very high spike in database CPU when trying to insert at 130 req/second: the CPU spiked up to 57%. The application uses R2DBC with a connection pool at the default initial and max sizes. I then compared the same scenario against a blocking Spring Boot application with a HikariCP pool and JDBC; there the CPU only spiked up to 4%. Below is my analysis: https://medium.com/@arjunac009/stress-testing-reactive-springboot-application-with-kotlin-coroutine-jooq-postgresql-61aabe970b29. I know some of you may be really experienced with R2DBC. It would be really great if you could share your thought process.
j
You might try setting https://postgresqlco.nf/doc/en/param/log_min_duration_statement/ really low and see if the same sql is being executed. I'd also recommend logging connections and disconnections. Then lower your benchmarking speed and see what is different. Could be reconnecting a lot. Could be something with prepared statements.
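As a concrete starting point, the logging suggestions above could look like the following, run as a superuser against the Docker Postgres instance (the zero threshold is illustrative; it logs every statement with its duration, which is only sensible for a test instance):

```sql
-- Log every statement (duration threshold of 0 ms) together with its runtime
ALTER SYSTEM SET log_min_duration_statement = 0;
-- Log each connection attempt and each disconnection
ALTER SYSTEM SET log_connections = on;
ALTER SYSTEM SET log_disconnections = on;
-- Reload the configuration without restarting the server
SELECT pg_reload_conf();
```

With this in place, a flood of connection/disconnection lines in the Postgres log during the benchmark would point at pool churn rather than query cost.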
👍 1
The CPU usage on the db should be the same if they're both running at the same speed.
r2dbc and hikari should look identical from the database's perspective if you're really running apples to apples queries at the same rate. There's no async stuff in postgres that I know of that would change the performance. Your app CPU and memory usage should vary, not the DB until r2dbc starts sending way more queries than hikari does
a
Hi @Jared Schlicht I was also thinking the same way. From my understanding, the connection pool plays an important role in database performance, since creating connections is an expensive operation. Here both Hikari and R2DBC use the same default configuration: • initial size = 10 • max size = 20. We are triggering 130 req/second, so it challenges pool management. I have tested the application using k6. Attaching screenshots of DB CPU in both cases: r2dbc — spike went up to 57%; hikari — 4%.
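One way to sanity-check whether 130 req/s should really stress a 20-connection pool is Little's Law: the average number of busy connections equals the arrival rate times the average query time. A rough sketch (the 50 ms average insert latency is an assumed figure for illustration, not measured from this test):

```kotlin
// Little's Law: mean concurrent connections = arrival rate * mean service time.
// If individual inserts are fast, 130 req/s needs far fewer than 20 connections.
fun connectionsNeeded(reqPerSec: Double, avgQuerySeconds: Double): Double =
    reqPerSec * avgQuerySeconds

fun main() {
    // 130 requests/second with an assumed 50 ms average insert latency
    println(connectionsNeeded(130.0, 0.05)) // ~6.5 busy connections on average
}
```

So if the pool itself were the bottleneck at this rate, it would suggest queries are taking far longer than expected, which is what the statement-duration logging should reveal.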
j
Alright, here's some feedback... Don't use scaling-size connection pools. Just use a fixed number of connections per pool to reduce any weirdness in tests. This is also what Hikari recommends in production (a smaller pool with a fixed size, so it doesn't have to constantly spin up new connections). For the Hikari pool, I'd recommend this as your full config:
minimumIdle: 20
maximumPoolSize: 20
For the r2dbc pool, I'd do this:
pool:
  enabled: true
  initial-size: 20
  max-size: 20
(You had a bit of a bug here: R2DBC got an idle timeout of 10 minutes, while Hikari got a timeout of 10 seconds. It doesn't matter in this test, but you shouldn't have different settings. A fixed connection pool means it won't try to idle connections out anyway.)
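Put together, the two fixed-size pool configs above might look like this in each project's application.yml (property names follow Spring Boot's `spring.datasource.hikari` and `spring.r2dbc.pool` namespaces; treat the exact keys as a sketch to verify against your Spring Boot version):

```yaml
# Blocking project: fixed-size HikariCP pool
spring:
  datasource:
    hikari:
      minimum-idle: 20
      maximum-pool-size: 20
---
# Reactive project: fixed-size R2DBC pool
spring:
  r2dbc:
    pool:
      enabled: true
      initial-size: 20
      max-size: 20
```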
Problem 2: r2dbc doesn't seem to actually be creating as many connections as you specify with initial size. For the hikari/sync implementation with the numbers above, the postgres container is running 26 PIDs/processes. This is what you'd expect, as there's a process per connection in postgres, plus a handful for the server itself. For the r2dbc/async implementation, I can't get it to go above 6 PIDs. So it's not actually bothering to make as many connections as it should. You can see the numbers if you run "docker stats" while running; whatever you used to view those graphs might show it too. PIDs == process count, I believe.
Problem 3... They're running very different SQL. r2dbc is running a bunch of weird metadata queries and it *isn't issuing any transaction begin/commit statements at all*. So either looking up info about the extensions is slow, looking up the isolation level is slow, or postgres is performing better with the typical begin; do stuff; commit pattern. Don't know. I'd never write data to the disk without wrapping it in transactions, so I think you should rewrite your r2dbc jooq code to do a transaction
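A minimal sketch of wrapping the insert in an explicit transaction, assuming Spring's reactive transaction support is configured (this uses `TransactionalOperator` and the `executeAndAwait` coroutine extension from spring-tx; `ProductRepository` and its `insert` method are hypothetical stand-ins for the jOOQ code):

```kotlin
import org.springframework.stereotype.Service
import org.springframework.transaction.reactive.TransactionalOperator
import org.springframework.transaction.reactive.executeAndAwait

@Service
class ProductService(
    private val txOperator: TransactionalOperator,
    private val repository: ProductRepository, // hypothetical jOOQ-backed repository
) {
    // Runs the insert inside an explicit BEGIN ... COMMIT block;
    // an exception inside the lambda triggers a rollback instead of a partial write.
    suspend fun createProduct(name: String) {
        txOperator.executeAndAwait { repository.insert(name) }
    }
}
```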
You can see the parallelism going on like we'd expect from a normal sync webapp. The begins all get interleaved, then the inserts, then the commits.
I have no idea what r2dbc is doing here.
Here's what I did to have postgres log the SQL statements. Repeat for both projects:
services:
  postgres:
    image: postgres:16-alpine3.17
    command: ["postgres", "-c", "log_statement=all"]
    environment:
I'm tapping out from here, as r2dbc is doing stuff I'd never want it to in production. Also, with Virtual Threads stabilized, async and coroutines are dead to me, so I don't personally plan to use r2dbc... Ever...
But let me know what you find out, I'd be curious.
If none of the rest of this works, switch from Alpine to regular Postgres. Alpine is cursed by weird bugs. I doubt this is your problem, but it's a quick thing to try.
a
Thanks @Jared Schlicht for taking the time to debug this. Really helpful, and it's given me a fresh perspective. Let me try the things you've suggested; I will get back to you.
Regarding the weird SQL, I found the issue. Even though we are using the
@Transactional
annotation in the product service, Spring was unable to apply the transaction. • Since we are using a customised configuration, we also need to create an R2DBC transaction manager bean with the connection factory. • Added it in the commit below • https://github.com/arjunbalussery/kotlin-async/commit/018dd66affdb41d2b036fc2adfc742e4d3c3afd7 • Now we are able to see BEGIN and COMMIT
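The fix described above (registering a reactive transaction manager alongside the custom `ConnectionFactory`) might look roughly like this; a sketch assuming Spring's spring-r2dbc module, not the exact code from the linked commit:

```kotlin
import io.r2dbc.spi.ConnectionFactory
import org.springframework.context.annotation.Bean
import org.springframework.context.annotation.Configuration
import org.springframework.r2dbc.connection.R2dbcTransactionManager
import org.springframework.transaction.ReactiveTransactionManager

@Configuration
class TransactionConfig {
    // Without a ReactiveTransactionManager bean, @Transactional on
    // reactive/coroutine methods can be silently ignored when a custom
    // ConnectionFactory configuration is in place.
    @Bean
    fun transactionManager(connectionFactory: ConnectionFactory): ReactiveTransactionManager =
        R2dbcTransactionManager(connectionFactory)
}
```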
kotlin-postgres-1  | 2023-12-24 09:59:38.809 UTC [65] LOG:  statement: SHOW TRANSACTION ISOLATION LEVEL
kotlin-postgres-1  | 2023-12-24 09:59:38.810 UTC [65] LOG:  statement: SELECT oid, typname FROM pg_catalog.pg_type WHERE typname IN ('hstore','geometry')
kotlin-postgres-1  | 2023-12-24 09:59:38.813 UTC [65] LOG:  statement: BEGIN READ WRITE
kotlin-kotlin-1    | ********* reached********
kotlin-postgres-1  | 2023-12-24 09:59:38.825 UTC [66] LOG:  statement: SHOW TRANSACTION ISOLATION LEVEL
kotlin-postgres-1  | 2023-12-24 09:59:38.826 UTC [66] LOG:  statement: SELECT oid, typname FROM pg_catalog.pg_type WHERE typname IN ('hstore','geometry')
kotlin-postgres-1  | 2023-12-24 09:59:38.829 UTC [66] LOG:  execute S_0/B_3: insert into "public"."product" ("id", "name") values (cast($1 as uuid), $2) returning "public"."product"."id"
kotlin-postgres-1  | 2023-12-24 09:59:38.829 UTC [66] DETAIL:  parameters: $1 = 'aa3e94e5-440b-4ba6-8699-6596ee9db992', $2 = 'bus123'
kotlin-kotlin-1    | ********* completed********
kotlin-postgres-1  | 2023-12-24 09:59:38.831 UTC [65] LOG:  statement: COMMIT
• It seems performance has decreased again: now even at 100 parallel requests we are only getting a 91–93% success rate. Regarding the number of PIDs, I will have a look.
I don't know how it persisted in the DB without the commit statement. Since I saw the record, I was under the impression that the transaction was happening as expected.
j
If you aren't in a BEGIN block, every insert is auto-committed after each statement
👍 2
Works fine in a benchmark, but dangerous in real-life situations where you probably have more than one DML statement in certain controllers
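To illustrate the point about multiple DML statements (table and column names here are made up):

```sql
-- Autocommit: if the second insert fails, the first row is already durable,
-- leaving an order with no items.
INSERT INTO orders (id) VALUES (1);
INSERT INTO order_items (order_id, sku) VALUES (1, 'abc');

-- Explicit transaction: both rows commit together or not at all.
BEGIN;
INSERT INTO orders (id) VALUES (2);
INSERT INTO order_items (order_id, sku) VALUES (2, 'def');
COMMIT;
```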
a
Totally agree..
@Jared Schlicht Hope you had a nice vacation. I have now added changes to do the following: • A new transaction is started for each request • Reactive transaction management • The connection pool now scales up: • you can see the PID count reach up to 27 • then come back down to 7 once the requests are processed. Once we corrected the bug related to the transaction, performance decreased further. Now, say we set max connections = 20: any parallelism >= 20 requests starts erroring out. Latest commit: https://github.com/arjunbalussery/kotlin-async/commit/b375e478caeec6e6bd4ca1d64e75b1b362410d50
Added it as a discussion point in r2dbc-postgresql: • https://github.com/pgjdbc/r2dbc-postgresql/issues/625