Alexandre Brown

11/27/2021, 1:53 PM
Hello, what are the differences (pros/cons) of a Ktor backend on Kotlin Native vs GraalVM Native Image? (Assume Ktor 2.0) Thanks
1
Big Chungus

11/27/2021, 2:45 PM
Ktor native doesn't require any extra configs, I guess? The con is that you're locked out of the JVM ecosystem.
👍 2
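[Editor's note: for context on the "extra configs" point, GraalVM Native Image does a closed-world analysis, so classes accessed via reflection must be listed ahead of time in a JSON config. A minimal sketch of a reflect-config.json, with a hypothetical class name:]

```json
[
  {
    "name": "com.example.ApplicationKt",
    "allDeclaredMethods": true,
    "allDeclaredFields": true
  }
]
```

Kotlin/Native compiles ahead of time from the same source, so no such config is needed there.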
Alexandre Brown

11/27/2021, 2:53 PM
Thanks @Big Chungus, I'd be curious to find a comparison of startup time and performance of Kotlin Native vs a GraalVM native image. Also, do you know if Kotlin Native is stable/production ready (e.g. coroutine support beyond a single thread, etc.)? Thanks again
napperley

11/27/2021, 10:42 PM
Presumably Kotlin Native would also have these advantages over GraalVM:
• Lower memory usage
• Smaller binary size
• Quicker startup time
• No need to fine-tune the GC
• High level of platform integration/interop
• Easier platform integration/interop
• Access to the C/platform ecosystem (eg on Linux that means access to POSIX and Linux kernel userspace libraries like V4L - https://en.wikipedia.org/wiki/Video4Linux )
• No Virtual Machine performance overhead/baggage
• Low reliance on meta programming (eg no need to worry about Annotitus - annotation overuse, which is prevalent in the JVM ecosystem)
👍 1
Whether Kotlin Native is stable/production ready or not will heavily depend on what you are trying to do; aka an "it depends" answer 😄.
If you intend to do some native development then access to the JVM ecosystem isn't a big loss. At the end of the day ecosystem size isn't everything.
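[Editor's note: the C/platform ecosystem point can be illustrated with a small sketch. On Kotlin/Native, POSIX functions are callable directly from the `platform.posix` package with no JNI bridge; this only compiles for native targets, not the JVM:]

```kotlin
import platform.posix.getpid
import platform.posix.getppid

fun main() {
    // Direct POSIX calls; on Kotlin/Native these compile down to plain
    // C calls, with no virtual machine or JNI layer in between.
    println("pid: ${getpid()}, parent pid: ${getppid()}")
}
```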
Alexandre Brown

11/27/2021, 11:03 PM
Thanks for the great addition @napperley. I will be creating a Kotlin Multiplatform library that can be deployed as a library or as a Ktor web service. The Ktor web service will receive images, perform some network calls, some business logic, some preprocessing/postprocessing (with heavy use of coroutines!), and return the images. I still have to check if all my dependencies are available outside of the JVM world, but first I'm curious about the coroutines support of Kotlin Native.
nordiauwu

11/28/2021, 12:56 PM
I got a little bit curious about this question, so I decided to run some benchmarks and see how K/N, JVM, and GraalVM perform.
Environment: Premium AMD DigitalOcean droplet (1 GB RAM / 1 AMD CPU)
OS: Ubuntu 21.10 x64
JRE: openjdk-16-jre-headless
Benchmark results:
JVM
• Peak throughput: ~3400 RPS
• Peak memory usage: ~62 MB
• Executable size: 9.36 MB
K/N
• Peak throughput: ~125 RPS
• Peak memory usage: ~233 MB
• Executable size: 4.14 MB
GraalVM
• Doesn't compile. See

https://i.imgur.com/ZAeQEJj.png

P.S. I couldn't find a reliable way to test memory consumption, so it might be a little bit biased.
Conclusion: I see no use case for Ktor on K/N besides FaaS.

https://www.youtube.com/watch?v=BXQABPmWC70
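[Editor's note: the actual benchmarked source isn't shown above, but a hello-world Ktor server of the kind such a benchmark targets typically looks like this sketch, assuming the Ktor 2.x embeddedServer/CIO API:]

```kotlin
import io.ktor.server.application.*
import io.ktor.server.cio.*
import io.ktor.server.engine.*
import io.ktor.server.response.*
import io.ktor.server.routing.*

fun main() {
    // Minimal hello-world server; CIO is the coroutine-based engine
    // that also works on Kotlin/Native in Ktor 2.x.
    embeddedServer(CIO, port = 8080) {
        routing {
            get("/") { call.respondText("Hello, world!") }
        }
    }.start(wait = true)
}
```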

👍 1
Alexandre Brown

11/28/2021, 1:17 PM
Wow thanks a lot @nordiauwu! It's sad that your GraalVM build didn't compile; maybe try using the agent to generate the config file. But yeah, K/N is not looking good. https://www.graalvm.org/reference-manual/native-image/Agent/ Thanks for sharing.
Or maybe use the Ktor GraalVM sample from the docs and use Locust to generate traffic. https://locust.io/ Anyway, these are just ideas, thanks again!
nordiauwu

11/28/2021, 1:23 PM
Yep, I did use the reflection.json and build script from their sample, but no success =(
👍 1
napperley

11/28/2021, 11:50 PM
Still way too early to involve a Kotlin Native program in benchmarks. The Kotlin Native version of Ktor lacks maturity and hasn't been heavily optimised for memory usage. Kotlin Native's memory model (the new one) isn't mature yet. Also, Kotlin Native doesn't have any dedicated benchmark tools yet; presumably the tools are still being worked on by the Kotlin team. Which Kotlin Native memory model was used in the benchmark? I would expect the Kotlin Native program to have lower memory usage than the JVM one; however, Kotlin Native and Ktor need to mature first before that occurs.
Oleg Yukhnevich

11/29/2021, 6:02 AM
@nordiauwu can you pls share the code? BTW, the K/N server should work much better after coroutines 1.6 with new MM support is integrated into Ktor, to support a multi-threaded dispatcher inside it; this will also remove the overhead of supporting the old MM inside Ktor (atomics just to support mutation, a custom list implementation that supports freezing, and so on).
👍 3
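[Editor's note: the multi-threaded dispatcher support mentioned above can be illustrated with a small coroutines sketch. This already runs on the JVM today; on K/N it needs coroutines 1.6 with the new MM:]

```kotlin
import kotlinx.coroutines.*

fun main() = runBlocking {
    // Fan work out across a multi-threaded dispatcher. Under the old K/N MM,
    // objects crossing threads had to be frozen; the new MM lifts that.
    val squares = (1..4).map { n ->
        async(Dispatchers.Default) { n * n }
    }.awaitAll()
    check(squares == listOf(1, 4, 9, 16))
    println(squares) // [1, 4, 9, 16]
}
```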
nordiauwu

11/29/2021, 6:02 AM
Apparently I used the old MM, gotta struggle a bit more with GraalVM and redo the benchmark. Thanks for the feedback @napperley
Oleg Yukhnevich

11/29/2021, 6:03 AM
also, current K/N GC with new MM isn't production ready, so we should also wait for new one 🙂
nordiauwu

11/29/2021, 6:09 AM
e5l

11/29/2021, 10:46 AM
btw, our native tests are running about 2 times faster than jvm. I guess the startup time and workloads without warm-up should be better.
I don't have any measurements or numbers on that, and we also don't know the reasons right now, so we'd really appreciate your feedback.
We also have a direct feedback loop with the compiler developers, and if you report slow workloads, we will make them run much faster and better.
Anyway, @nordiauwu thank you for the comparison.
nordiauwu

11/29/2021, 10:57 AM
@e5l thanks to your help we're not done yet! I was able to solve the GraalVM issue by upgrading to the latest EE release. I'll post new results shortly.
e5l

11/29/2021, 11:09 AM
btw, we would be happy to accept your contributions here: https://github.com/ktorio/ktor-benchmarks
nordiauwu

11/29/2021, 11:41 AM
K/N with the new MM looks really promising; I'd even use it in production already to save on memory somewhere 🙂 GraalVM is also super awesome with its instant warm-up, while for the JVM ten ab runs aren't enough to get fully warmed up.
JVM
• Peak throughput: ~2700 RPS
• Peak memory usage: ~59 MB
• Executable size: 10 MB
K/N (new MM)
• Peak throughput: ~1400 RPS
• Peak memory usage: ~16 MB
• Executable size: 5 MB
GraalVM
• Peak throughput: ~5000 RPS
• Peak memory usage: ~89 MB
• Executable size: 38 MB
I also took a look at locust.io as @Alexandre Brown suggested; it's clearly a great tool for simulating real user behavior but a bit overkill for our "Hello World" application, so I ended up running ab -n 1000 -c 10 -k 10 times for each platform. The source code is the same for all platforms - https://pastebin.com/VvZ9UnzY
👍 2
:tnx: 1
e5l

11/29/2021, 11:46 AM
@svyatoslav.scherbina
nordiauwu

11/29/2021, 11:47 AM
@e5l sure, but setting up automated benchmarks for all 3 platforms we discussed will be pretty tough, especially for GraalVM, which requires a good amount of RAM to compile (1 GB wasn't enough when I tested). Maybe I'll get back to it later and see what we can do.
Alexandre Brown

11/29/2021, 1:09 PM
@nordiauwu Thanks a lot for your efforts! I'd expect the GraalVM memory footprint to be considerably lower than the JVM's when using GraalVM Native Image (no JVM). Were your GraalVM tests done with Native Image or HotSpot? I'd argue the real question is whether K/N outperforms GraalVM Native Image, as both of these are no-JVM options.
nordiauwu

11/29/2021, 4:01 PM
@Alexandre Brown, I'm not sure what you mean by HotSpot, but I was able to run the executable produced by native-image without Java installed, so I guess we can call it native.
Here it is, you can test it yourself if you want :)
Alexandre Brown

11/29/2021, 4:42 PM
@nordiauwu if you use GraalVM without Native Image then it uses the Java HotSpot VM (which is a little different from the standard JRE). If you run a Native Image build then it does not use any VM.
nordiauwu

11/29/2021, 4:44 PM
Thanks for the clarification, I'm pretty new to GraalVM so... :)
Alexandre Brown

11/29/2021, 4:46 PM
Ok no worries, did you use "native-image" to build your Graal binary?
nordiauwu

11/29/2021, 4:47 PM
Yes
Alexandre Brown

11/29/2021, 4:47 PM
Did you use the "--static" parameter as well?
Can you share your git repo so I can check 😄
nordiauwu

11/29/2021, 4:49 PM
Here is what I used: https://github.com/ktorio/ktor-samples/blob/main/graalvm/build.sh The only thing I changed is the path to Main class.
👍 2
Alexandre Brown

11/29/2021, 4:49 PM
Because if you don't use the --no-fallback parameter then it might still use the HotSpot VM even if you used the native-image command
nordiauwu

11/29/2021, 4:50 PM
There's pretty much nothing to show, really.
Alexandre Brown

11/29/2021, 4:53 PM
If you try to add --static, this will create a fully static native image which does not need anything to run (as opposed to a mostly-static image) and could reduce the size of the executable. I'd be curious to see the benchmark results with a --static image.
But the results are still very surprising; seeing a native image outperform the JVM is something I didn't expect, since the JVM has JIT optimizations that the AOT compiler from native-image usually has trouble matching. Thanks for sharing.
The advantage of --static is not negligible, because it means you can create a Docker image using FROM scratch instead of a base image with an OS etc.
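[Editor's note: the FROM scratch idea above can be sketched as a two-stage Dockerfile. The builder image tag, jar path, and binary name are placeholders, not from the discussion:]

```dockerfile
# Stage 1: build a fully static native image (assumes the project already
# produces a runnable jar and that native-image is on the builder's PATH)
FROM ghcr.io/graalvm/native-image:ol8-java17 AS build
COPY . /app
WORKDIR /app
RUN native-image --static --no-fallback -jar build/libs/app.jar server

# Stage 2: no OS, no libc - just the static binary
FROM scratch
COPY --from=build /app/server /server
ENTRYPOINT ["/server"]
```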
nordiauwu

11/29/2021, 4:59 PM
Thanks, I'll try --static in the near future and let you know
👍 1
Alexandre Brown

11/29/2021, 5:03 PM
Thanks again for the benchmarks, really nice of you @nordiauwu
nordiauwu

11/29/2021, 5:03 PM
You're welcome 🙂
napperley

11/29/2021, 8:17 PM
Throughput with the new Kotlin Native MM looks much better than with the old one. As expected, Kotlin Native comes out on top in the benchmark for peak memory usage and executable size, but is the worst for throughput (which comes as no surprise). From the benchmark, Kotlin Native would perform much better for serverless than for microservices. No surprises there.
nordiauwu

12/01/2021, 12:25 PM
@Alexandre Brown, it increased the executable size by 200 KB
kevin.cianfarini

12/02/2021, 7:22 PM
@e5l what I/O multiplexing is being used under the hood for Ktor native? On suitable Linux kernels, is it by any chance io_uring?
e5l

12/03/2021, 8:44 AM
Right now it's a simple select, which is compatible with almost every platform.
👍 1
I'm not sure if different multiplexing would give much of a performance boost at this level of load, but we can accept a PR with a different implementation
Elena Lepilkina

12/03/2021, 11:31 AM
@nordiauwu could you please provide more details about the options you used? Did I understand right that you used binaryOptions["freezing"] = "disabled" to build and run with the new K/N MM? Or did you build ktor locally? What ktor version did you use?
nordiauwu

12/03/2021, 12:59 PM
Alexandre Brown

12/03/2021, 1:05 PM
@nordiauwu I was confused by the results because I was comparing executable sizes. Yes, the JVM executable is smaller than the GraalVM one, but if you take Docker image size into account it's more realistic, as the JVM app will need a JVM and be ~120 MB compared to the unchanged ~30 MB of a GraalVM static image. Maybe something to keep in mind, as this is a more real-world scenario/comparison.
👍 1
Elena Lepilkina

12/03/2021, 1:11 PM
Thank you @nordiauwu, but why did you disable EscapeAnalysis? It seems unneeded; I didn't see any failures with EA when I reproduced it. Was it because of the default memory limit?
nordiauwu

12/03/2021, 1:13 PM
@Elena Lepilkina I was getting some weird error earlier asking me to disable it; maybe it's fixed in the new MM or something
Nope, it's still there
Elena Lepilkina

12/03/2021, 1:24 PM
It's the Gradle default memory limitation; the problem is described at the bottom. Just add memory by adding this to gradle.properties:
org.gradle.jvmargs=-Xmx2048m
👍 1
kevin.cianfarini

12/03/2021, 3:54 PM
I'm not sure if different multiplexing would give much of a performance boost at this level of load, but we can accept a PR with a different implementation
@e5l can you clarify? If I/O isn't the bottleneck in Ktor native, what do you think is limiting throughput?
e5l

12/03/2021, 5:09 PM
It depends on the workload; right now IO takes about 5%.
IO will become the bottleneck when you have a lot of connections.
kevin.cianfarini

12/03/2021, 6:38 PM
Thanks for clarifying!
Elena Lepilkina

12/13/2021, 2:52 PM
A bit more information for people interested in this benchmark: the next release will have some performance improvements for the new MM. Current measurements on an early version of 1.6.20 show that this sample should be ~15% faster with the same ktor version and freezing disabled. Also, 1.6.20 will include one more flag you can try, -Xgc=cms, which enables a concurrent mark-and-sweep strategy that should also speed up this sample a bit (3-5%). Please note that results may differ from platform to platform, so the improvement numbers can differ.
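[Editor's note: the options discussed in this thread might be wired into a build roughly like this. A Gradle Kotlin DSL sketch; the target name and exact DSL placement are assumptions, while binaryOptions["freezing"] and -Xgc=cms come from the messages above:]

```kotlin
// build.gradle.kts (sketch)
kotlin {
    linuxX64("native") {
        binaries {
            executable {
                // run with the new MM, freezing disabled, as in the benchmark
                binaryOptions["freezing"] = "disabled"
                // concurrent mark-and-sweep GC, behind a flag from 1.6.20
                freeCompilerArgs += "-Xgc=cms"
            }
        }
    }
}
```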
👍 2
napperley

12/13/2021, 10:18 PM
Does the ~15% refer to throughput?
One thing that wasn't mentioned: what is the default GC used by K/N's new MM?
1
Elena Lepilkina

12/14/2021, 8:17 AM
Does the ~15% refer to throughput?
Yes
What is the default GC used by the K/N's new MM?
Currently, stop-the-world mark and sweep. But later, the stop-the-world mark + concurrent sweep that is now under a flag should become the default.
👍 1