# k2-adopters
y
Has anyone else seen a significant performance regression with K2? I’m seeing the following from build scans comparing 1.9.23 with Beta5 for our Android project:
• KotlinCompile totalling 30m 17.942s -> 46m 7.145s
• KaptGenerateStubsTask totalling 8m 32.377s -> 23m 35.496s
• KaptWithoutKotlincTask totalling 9m 56.293s -> 11m 14.002s
• KspTaskJvm totalling 6m 19.561s -> 22m 56.832s
Looking at the individual modules / tasks, the ones that are significantly slower with K2 seem quite random. E.g. while some bigger modules are 40% faster with K2, many tiny modules’ KotlinCompile tasks are an order of magnitude slower with K2, e.g. 0.5s -> 33s. I’ll dig a bit more and create an issue on YouTrack.
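For anyone who wants to turn those build-scan totals into slowdown factors, here is a minimal Python sketch. The helper names (`parse_duration`, `slowdown`) are made up for illustration; the durations are the ones quoted in the message above.

```python
import re

def parse_duration(text: str) -> float:
    """Convert a build-scan duration such as '30m 17.942s' into seconds."""
    match = re.fullmatch(r"(?:(\d+)m\s*)?(\d+(?:\.\d+)?)s", text.strip())
    if match is None:
        raise ValueError(f"unrecognised duration: {text!r}")
    minutes = int(match.group(1) or 0)
    return minutes * 60 + float(match.group(2))

def slowdown(before: str, after: str) -> float:
    """Ratio after/before; values above 1.0 mean the task got slower."""
    return parse_duration(after) / parse_duration(before)

# Task totals from the message: (Kotlin 1.9.23, Kotlin 2.0.0-Beta5).
TASK_TOTALS = {
    "KotlinCompile": ("30m 17.942s", "46m 7.145s"),
    "KaptGenerateStubsTask": ("8m 32.377s", "23m 35.496s"),
    "KaptWithoutKotlincTask": ("9m 56.293s", "11m 14.002s"),
    "KspTaskJvm": ("6m 19.561s", "22m 56.832s"),
}

for task, (k1, k2) in TASK_TOTALS.items():
    print(f"{task}: {slowdown(k1, k2):.2f}x")
```

The same helper also handles the per-module case mentioned above, e.g. `slowdown("0.5s", "33s")`.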
t
hmm, it could be a case of memory pressure in Kotlin daemon on parallel compilation. Could you validate that you have enough memory defined for Kotlin daemon?
j
Is there any tool that tells us how much memory we should provide? I find it a bit annoying that there is no “automatic tool” which can detect how much memory we should set for Java/Kotlin/Gradle, even if that tool is just a wrapper around Gradle Profiler that finds those values by “brute force”.
t
can't say for Gradle daemon memory requirements, but for the Kotlin daemon it correlates with LoC + the number of symbols in external dependencies. Thus this value will constantly need to be re-evaluated when you change your codebase or add/remove dependencies 🙂 We are working on a documentation page on the Kotlin daemon that should describe it in more detail. Also we are thinking about how the Kotlin build metrics report may help here, but no exact plans so far.
👍 2
ah yeah - the case becomes more problematic with parallel compilation in a multi-module project. If you have a few but really huge modules, the Kotlin daemon may require more memory to run compilation than for many small modules
why it could be memory pressure in the Kotlin daemon - in Beta4 we changed the default GC in the Kotlin daemon to parallel GC. Parallel GC brings a performance improvement of around 10%, but manages memory slightly differently from the default G1 GC. With G1 you may also have quite a lot of GC runs, but it was not as prominent as with parallel GC
y
It does seem to be related to memory pressure as this is a CI build running in a docker container. I remember benchmarking with gradle profiler locally (64g RAM) with the early 2.0 Betas last year and there were some perf improvements
I’m setting the jvmargs for the Kotlin daemon explicitly with kotlin.daemon.jvmargs, and also using Parallel GC as it has better throughput. I just did another run giving the Kotlin daemon more RAM but I’m not seeing any difference.
🤔 1
t
Kotlin daemon prints some GC logs into log file (in system tmp dir) - could you check it?
theoretically it could also be some memory leak - then we need a repro project to investigate
or, at least, heap dump
y
I’ll try to get them from circleci
does code cache matter for ephemeral CI build with fresh compiler daemon?
t
What do you mean by code cache? 🤔
btw how many lines of code does your KotlinCompile task compile?
y
I mean setting ReservedCodeCacheSize and UseCodeCacheFlushing as mentioned in the issue above
IIRC it only matters for incremental build?
s
Code cache (the amount of asm that the JIT can emit) is more or less constant, based on the size of the compiler JARs. Usually, a lot of classes are already JIT-ed by the end of the build (when it is long enough). Also, when the Kotlin daemon is used, the code cache fills up over time. Our benchmarks show that the JVM is likely to run out of code cache with default settings.
Java HotSpot(TM) 64-Bit Server VM warning: CodeCache is full. Compiler has been disabled.
Java HotSpot(TM) 64-Bit Server VM warning: Try increasing the code cache size using -XX:ReservedCodeCacheSize=
👍 1
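A quick way to check whether a build hit this is to scan the captured JVM output for the warning. The Python sketch below assumes you already have that output as text; where it ends up (daemon log, CI console) depends on your setup and is not something this thread specifies. The function name and the first and last sample lines are made up for illustration.

```python
# Warning text as printed by HotSpot in the message above.
CODE_CACHE_WARNING = "CodeCache is full. Compiler has been disabled."

def find_code_cache_warnings(log_text: str) -> list[int]:
    """Return 1-based line numbers on which the HotSpot warning appears."""
    return [
        number
        for number, line in enumerate(log_text.splitlines(), start=1)
        if CODE_CACHE_WARNING in line
    ]

# Hypothetical log excerpt: the two warning lines are from the message above,
# the surrounding lines are placeholders.
SAMPLE_LOG = """\
some unrelated daemon output
Java HotSpot(TM) 64-Bit Server VM warning: CodeCache is full. Compiler has been disabled.
Java HotSpot(TM) 64-Bit Server VM warning: Try increasing the code cache size using -XX:ReservedCodeCacheSize=
more unrelated daemon output
"""

hits = find_code_cache_warnings(SAMPLE_LOG)
print("code cache exhausted on lines:", hits)
```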
t
@Yang could you also create Kotlin build report file?
we would be interested to look into it
y
yup I’ll try kotlin build report next
btw how many lines of code does your KotlinCompile task compile?
736,914 lines of Kotlin code excluding tests in the codebase; the build probably compiles a bit less as a few modules are not part of the build
Here is the top section of the Kotlin build report metrics:
K1
Time metrics:
  Total Gradle task time: 6,414.13 s
  Spent time before task action: 105.78 s
  Task action before worker execution: 46.98 s
  Run compilation in Gradle worker: 2,210.44 s
    Clear jar cache: 0.01 s
    Clear output: 0.01 s
    Connect to Kotlin daemon: 4.84 s
    Run compilation: 1,944.98 s
      Non incremental compilation in daemon: 259.38 s
      Incremental compilation in daemon: 1,941.22 s
        Update caches: 11.33 s
        Sources compilation round: 1,842.27 s
          Compiler initialization time: 27.49 s
          Compiler code analysis: 1,283.67 s
          Compiler code generation: 289.86 s
          Compiler IR translation: 239.95 s
          Compiler IR lowering: 125.25 s
          Compiler IR generation: 164.45 s
        Shrink and save current classpath snapshot after compilation: 21.55 s
          Shrink current classpath snapshot non-incrementally: 20.52 s
            Load current classpath snapshot: 4.24 s
          Save shrunk current classpath snapshot: 0.39 s
  Start gradle worker: 6.12 s
  Classpath entry snapshot transform: 15.76 s
    Load classes (paths only): 0.13 s
    Snapshot classes: 14.28 s
      Load contents of classes: 3.39 s
      Snapshot Kotlin classes: 4.22 s
      Snapshot Java classes: 3.15 s
    Save classpath entry snapshot: 0.77 s

Size metrics:
  Total size of the cache directory: 28.8 KB
    ABI snapshot size: 22.1 KB
  Increase memory usage: 56.6 GB
  Total memory usage at the end of build: 2,261.9 GB
  Total compiler iteration: 461
    Number of lines analyzed: 2872517
    Number of lines for code generation: 1660623
    Analysis lines per second: 842447
    Code generation lines per second: 1481243
    Compiler IR translation line number: 1660623
    Compiler IR lowering line number: 1660623
    Compiler IR generation line number: 1660623
  Number of times 'ClasspathEntrySnapshotTransform' ran: 1058
    Size of jar classpath entry: 859.1 MB
    Size of jar classpath entry's snapshot: 185.2 MB
    Size of directory classpath entry's snapshot: 8 B
  Number of times classpath snapshot is shrunk and saved after compilation: 461
    Number of classpath entries: 51167
    Size of classpath snapshot: 4.5 GB
    Size of shrunk classpath snapshot: 110.5 MB
  Number of times classpath snapshot is loaded: 461
    Number of cache hits when loading classpath entry snapshots: 49951
    Number of cache misses when loading classpath entry snapshots: 1216
  Start time of task action: 35489-01-24T03:32:46

Build attributes:
  REBUILD_REASON:
    Incremental compilation is not enabled(157)
    Unknown Gradle changes(461)
K2
Time metrics:
  Total Gradle task time: 9,387.18 s
  Spent time before task action: 128.09 s
  Task action before worker execution: 722.05 s
  Run compilation in Gradle worker: 4,451.58 s
    Clear jar cache: 719.40 s
    Connect to Kotlin daemon: 558.17 s
    Calculate output size: 0.41 s
    Run compilation: 863.79 s
      Non incremental compilation in daemon: 1,071.88 s
      Incremental compilation in daemon: 859.37 s
        Update caches: 3.81 s
        Sources compilation round: 810.02 s
          Compiler initialization time: 8.99 s
          Compiler code analysis: 344.75 s
          Compiler code generation: 252.20 s
          Compiler IR translation: 201.40 s
          Compiler IR lowering: 118.02 s
          Compiler IR generation: 134.03 s
        Write history file: 0.01 s
        Shrink and save current classpath snapshot after compilation: 13.69 s
          Shrink current classpath snapshot non-incrementally: 13.00 s
            Load current classpath snapshot: 3.21 s
          Save shrunk current classpath snapshot: 0.26 s
  Start gradle worker: 20.65 s

Size metrics:
  Total size of the cache directory: 355.7 MB
    ABI snapshot size: 15.2 KB
  Increase memory usage: -8572176912 B
  Total memory usage at the end of build: 1,993.0 GB
  Total compiler iteration: 317
    Number of lines analyzed: 1682046
    Number of lines for code generation: 1682046
    Analysis lines per second: 1134781
    Code generation lines per second: 1680855
    Compiler IR translation line number: 1682046
    Compiler IR lowering line number: 1682046
    Compiler IR generation line number: 1682046
  Number of times classpath snapshot is shrunk and saved after compilation: 317
    Number of classpath entries: 32662
    Size of classpath snapshot: 2.8 GB
    Size of shrunk classpath snapshot: 73.4 MB
  Number of times classpath snapshot is loaded: 317
    Number of cache hits when loading classpath entry snapshots: 31015
    Number of cache misses when loading classpath entry snapshots: 1647
  Start time of task action: 35706-01-09T22:38:56

Build attributes:
  REBUILD_REASON:
    Incremental compilation is not enabled(305)
    Unknown Gradle changes(317)
I’m not familiar with this report but these look very suspicious
Clear jar cache: 719.40 s
Connect to Kotlin daemon: 558.17 s
🤔 2
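To spot such outliers without eyeballing two long reports, one could diff the "Time metrics" sections mechanically. Below is a minimal Python sketch, assuming the report keeps the `name: 1,234.56 s` line shape seen in this thread; the function names are made up, and the two excerpts contain only a few lines copied from the K1 and K2 (Beta5) reports above.

```python
import re

# Matches lines such as "    Run compilation: 1,944.98 s".
TIME_LINE = re.compile(r"^\s*(?P<name>[^:]+):\s*(?P<secs>[\d,]+\.\d+) s\s*$")

def parse_time_metrics(report: str) -> dict[str, float]:
    """Collect every 'name: N s' line of a build report into a dict of seconds."""
    metrics = {}
    for line in report.splitlines():
        match = TIME_LINE.match(line)
        if match:
            seconds = float(match.group("secs").replace(",", ""))
            metrics[match.group("name").strip()] = seconds
    return metrics

def biggest_regressions(before: dict[str, float], after: dict[str, float], top: int = 3):
    """Metrics present in both reports, sorted by largest absolute slowdown."""
    deltas = [(name, after[name] - before[name]) for name in before.keys() & after.keys()]
    return sorted(deltas, key=lambda item: item[1], reverse=True)[:top]

K1_EXCERPT = """
  Task action before worker execution: 46.98 s
    Clear jar cache: 0.01 s
    Connect to Kotlin daemon: 4.84 s
    Run compilation: 1,944.98 s
"""

K2_BETA5_EXCERPT = """
  Task action before worker execution: 722.05 s
    Clear jar cache: 719.40 s
    Connect to Kotlin daemon: 558.17 s
    Run compilation: 863.79 s
"""

k1 = parse_time_metrics(K1_EXCERPT)
k2 = parse_time_metrics(K2_BETA5_EXCERPT)
for name, delta in biggest_regressions(k1, k2):
    print(f"{name}: +{delta:.2f} s")
```

Note that the metrics are nested (e.g. "Clear jar cache" is part of "Run compilation in Gradle worker"), so the deltas should not be summed.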
t
Connect to Kotlin daemon indeed looks suspicious
y
out of curiosity, is “parallel compilation” in K2 equivalent to setting the experimental -Xbackend-threads in K1?
t
no, not related. It refers to multiple compilation tasks running at the same time
y
I see
d
Let me add my 2 cents. Our project wasn't compiling against Beta4, but we do have some benchmarks on our CI and local machines using Gradle Profiler. I've just made a clean run on my developer machine, and it indeed performs worse than 1.9.23. Some inputs: we are an Android, mainly Kotlin project with 3% Java, a few modules with legacy KAPT and most modules using KSP. We have 571 Gradle modules and we are running (almost) the latest build tools: AGP 8.3.1, Gradle 8.6, JDK 17 (latest OpenJDK 17). Hardware: M2 Pro 12-core machine with 32 GB of RAM, fans at full speed, on macOS 14.4.1, constantly powered. JVM args:
org.gradle.jvmargs=-Xmx12g -XX:MaxMetaspaceSize=1g -XX:+UseParallelGC -XX:+HeapDumpOnOutOfMemoryError -Dfile.encoding=UTF-8
Task measured against:
./gradlew appassembleDebug --no-build-cache --rerun-tasks
Interesting finding: task count went down, but configuration time is slightly up, and the whole run takes longer too. I'm attaching measured numbers here:
I really want to dig into which tasks are taking longer, but my guess is that it might be caused by KAPT + KSP running in K1 mode(?). But it's odd that the overall task count went down while the overall execution time went up.
t
@dniHze please compare Kotlin build reports between 1.9.23 and Beta5
d
Sure. Let me try this one and get back to you when I have these done.
👍 1
@tapchicoma alright, looks like I have them: K1:
Time metrics:
  Total Gradle task time: 1,951.88 s
  Spent time before task action: 43.75 s
  Task action before worker execution: 66.76 s
  Run compilation in Gradle worker: 1,265.35 s
    Clear jar cache: 0.37 s
    Clear output: 1.08 s
    Connect to Kotlin daemon: 20.64 s
    Calculate output size: 0.52 s
    Run compilation: 701.88 s
      Non incremental compilation in daemon: 539.84 s
      Incremental compilation in daemon: 690.11 s
        Store build info: 0.19 s
        Clear outputs on rebuild: 0.15 s
        Update caches: 2.40 s
        Sources compilation round: 436.16 s
          Compiler initialization time: 29.50 s
          Compiler code analysis: 210.98 s
          Compiler code generation: 102.16 s
          Compiler IR translation: 91.21 s
          Compiler IR lowering: 37.09 s
          Compiler IR generation: 64.80 s
        Write history file: 0.09 s
        Shrink and save current classpath snapshot after compilation: 10.80 s
          Shrink current classpath snapshot non-incrementally: 8.43 s
            Load current classpath snapshot: 2.32 s
          Save shrunk current classpath snapshot: 1.64 s
  Start gradle worker: 0.91 s
  Classpath entry snapshot transform: 0.80 s
    Load classes (paths only): 0.00 s
    Snapshot classes: 0.58 s
      Load contents of classes: 0.03 s
      Snapshot Kotlin classes: 0.35 s
      Snapshot Java classes: 0.02 s
    Save classpath entry snapshot: 0.13 s

Size metrics:
  Total size of the cache directory: 182.9 KB
    ABI snapshot size: 25.5 KB
  Increase memory usage: 67.5 GB
  Total memory usage at the end of build: 1,438.9 GB
  Total compiler iteration: 532
    Number of lines analyzed: 372887
    Number of lines for code generation: 348003
    Analysis lines per second: 792399
    Code generation lines per second: 1845299
    Compiler IR translation line number: 348003
    Compiler IR lowering line number: 348003
    Compiler IR generation line number: 348003
  Number of times 'ClasspathEntrySnapshotTransform' ran: 117
    Size of jar classpath entry: 6.1 MB
    Size of jar classpath entry's snapshot: 4.8 MB
  Number of times classpath snapshot is shrunk and saved after compilation: 532
    Number of classpath entries: 40328
    Size of classpath snapshot: 2.4 GB
    Size of shrunk classpath snapshot: 56.9 MB
  Number of times classpath snapshot is loaded: 532
    Number of cache hits when loading classpath entry snapshots: 38604
    Number of cache misses when loading classpath entry snapshots: 1724
  Start time of task action: 51219-05-29T15:21:28

Build attributes:
  REBUILD_REASON:
    Incremental compilation is not enabled(376)
    Unknown Gradle changes(532)

Total time for Kotlin tasks: 848.11 s (43.4 % of all tasks time)
K2:
Time metrics:
  Total Gradle task time: 1,832.97 s
  Spent time before task action: 39.77 s
  Task action before worker execution: 132.85 s
  Run compilation in Gradle worker: 1,150.10 s
    Clear jar cache: 98.55 s
    Clear output: 0.81 s
    Connect to Kotlin daemon: 60.17 s
    Calculate output size: 0.67 s
    Run compilation: 409.01 s
      Non incremental compilation in daemon: 339.85 s
      Incremental compilation in daemon: 402.02 s
        Store build info: 0.11 s
        Clear outputs on rebuild: 0.07 s
        Update caches: 0.82 s
        Sources compilation round: 295.71 s
          Compiler initialization time: 14.49 s
          Compiler code analysis: 125.92 s
          Compiler code generation: 74.05 s
          Compiler IR translation: 78.73 s
          Compiler IR lowering: 32.72 s
          Compiler IR generation: 41.06 s
        Write history file: 0.02 s
        Shrink and save current classpath snapshot after compilation: 7.39 s
          Shrink current classpath snapshot non-incrementally: 5.90 s
            Load current classpath snapshot: 1.70 s
          Save shrunk current classpath snapshot: 0.89 s
  Start gradle worker: 1.27 s

Size metrics:
  Total size of the cache directory: 297.1 MB
    ABI snapshot size: 25.5 KB
  Increase memory usage: 35.4 GB
  Total memory usage at the end of build: 1,664.2 GB
  Total compiler iteration: 532
    Number of lines analyzed: 378782
    Number of lines for code generation: 353898
    Analysis lines per second: 1469265
    Code generation lines per second: 2615244
    Compiler IR translation line number: 353898
    Compiler IR lowering line number: 353898
    Compiler IR generation line number: 353898
  Number of times classpath snapshot is shrunk and saved after compilation: 532
    Number of classpath entries: 40328
    Size of classpath snapshot: 2.4 GB
    Size of shrunk classpath snapshot: 140.4 MB
  Number of times classpath snapshot is loaded: 532
    Number of cache hits when loading classpath entry snapshots: 39006
    Number of cache misses when loading classpath entry snapshots: 1322
  Start time of task action: 51219-05-22T06:24:44

Build attributes:
  REBUILD_REASON:
    Incremental compilation is not enabled(376)
    Unknown Gradle changes(532)

Total time for Kotlin tasks: 815.52 s (44.5 % of all tasks time)
t
@dniHze it would be interesting to see the full report, but it looks related to the performance issue linked in this thread above. Please try to remeasure with the 2.0.0-RC1 release
d
Is there an expected timeline when the first RC should drop? 👀 And yeah, more than happy to do the measurements once that is ready.
t
around next week
👌 1
fyi: RC1 was just released
y
Thanks, will test soon.
And looks like compose compiler is now part of Kotlin!
t
yes, more details in this issue. Also check the linked issues for known bugs - we are planning to fix them in RC2
d
Just to check, does it only replace the JB Compose Compiler, or is it going to replace Google's Compose Compiler too? 👀
t
it replaces the Google Compose compiler plugin - basically the Compose compiler plugin was moved into kotlin.git
d
No more waiting for alpha builds of google's compose compiler. neat. So now effectively it comes bundled with the language version.
👌 1
y
Just did another run with 2.0.0-RC1 and I’m seeing big improvements 🎉
| Task                      | Kotlin 1.9.23   | Kotlin 2.0.0-Beta5 | Kotlin 2.0.0-RC1 |
|---------------------------|-----------------|--------------------|------------------|
| **KotlinCompile**         | 30m 17.942s     | 46m 7.145s         | 20m 48.478s      |
| **KaptGenerateStubsTask** | 8m 32.377s      | 23m 35.496s        | 18m 14.579s      |
| **KaptWithoutKotlincTask**| 9m 56.293s      | 11m 14.002s        | 12m 12.981s      |
| **KspTaskJvm**            | 6m 19.561s      | 22m 56.832s        | 7m 10.953s       |
kapt stub generation is still taking much longer than K1, I wonder if it’s due to kapt.use.k2=true
New Kotlin build report:
Time metrics:
  Total Gradle task time: 6,517.78 s
  Spent time before task action: 153.94 s
  Task action before worker execution: 55.37 s
  Run compilation in Gradle worker: 2,323.84 s
    Clear jar cache: 0.52 s
    Connect to Kotlin daemon: 8.09 s
    Calculate output size: 0.45 s
    Run compilation: 1,011.91 s
      Non incremental compilation in daemon: 1,301.95 s
      Incremental compilation in daemon: 1,008.65 s
        Clear outputs on rebuild: 0.01 s
        Update caches: 4.70 s
        Sources compilation round: 953.53 s
          Compiler initialization time: 13.59 s
          Compiler code analysis: 392.68 s
          Compiler code generation: 308.58 s
          Compiler IR translation: 235.84 s
          Compiler IR lowering: 149.26 s
          Compiler IR generation: 159.16 s
        Write history file: 0.01 s
        Shrink and save current classpath snapshot after compilation: 15.05 s
          Shrink current classpath snapshot non-incrementally: 14.10 s
            Load current classpath snapshot: 3.02 s
          Save shrunk current classpath snapshot: 0.52 s
  Start gradle worker: 6.26 s

Size metrics:
  Total size of the cache directory: 359.5 MB
    ABI snapshot size: 15.4 KB
  Increase memory usage: 38.7 GB
  Total memory usage at the end of build: 2,220.4 GB
  Total compiler iteration: 321
    Number of lines analyzed: 1711070
    Number of lines for code generation: 1711070
    Analysis lines per second: 1041549
    Code generation lines per second: 1512965
    Compiler IR translation line number: 1711070
    Compiler IR lowering line number: 1711070
    Compiler IR generation line number: 1711070
  Number of times classpath snapshot is shrunk and saved after compilation: 321
    Number of classpath entries: 33004
    Size of classpath snapshot: 2.9 GB
    Size of shrunk classpath snapshot: 75.0 MB
  Number of times classpath snapshot is loaded: 321
    Number of cache hits when loading classpath entry snapshots: 31884
    Number of cache misses when loading classpath entry snapshots: 1120
  Start time of task action: 36057-03-25T08:49:43

Build attributes:
  REBUILD_REASON:
    Incremental compilation is not enabled(307)
    Unknown Gradle changes(321)

Total time for Kotlin tasks: 2,256.78 s (34.6 % of all tasks time)
Some observations:
• Total time for Kotlin tasks (2,256.78 s) is now on par with K1 (2,184.79 s), was 4,082.17 s in Beta5
• Clear jar cache (0.52s) is on par with K1 (0.00s), was 791.4s in Beta5
• Connect to Kotlin daemon (8.09s) still takes longer than K1 (4.59s), but much improved from Beta5 (558.17s)
• Non incremental compilation in daemon (1,301.95s) is still much longer than K1 (260.73s), and also slightly longer than Beta5 (1,071.88s)
Our overall build time is only ~1 minute slower than K1 now. Great work @tapchicoma!
t
kapt stub generation is still taking much longer than K1, I wonder if it’s due to kapt.use.k2=true
Is it a clean build?
y
yep
I disabled kapt.use.k2 which seems to fix the regression
| Task                      | 1.9.23       | 2.0.0-RC1 (KAPT K2 On) | 2.0.0-RC1 (KAPT K2 Off) |
|---------------------------|--------------|------------------------|-------------------------|
| **KaptGenerateStubsTask** | 8m 32.377s   | 18m 14.579s            | 8m 16.309s              |
| **KaptWithoutKotlincTask**| 9m 56.293s   | 12m 12.981s            | 10m 10.441s             |
t
interesting, will try to clarify it
👍 1
so kapt stub generation could run slower due to using the completely new Analysis API instead of compiler internals. Kapt performance itself should not differ, though. Unfortunately at the moment we don't write any kapt-related info into the Kotlin build metrics report, but we plan to add such metrics (issue). There is ongoing work to improve kapt/K2 in the 2.0.20 release and add incremental compilation support for it.
👍 1
d
As promised, I've done some measurements on our side:
Mean K1: 281,820.32 ms
Mean RC1: 269,031.82 ms
Diff: -4.5%
Finally, for the first time, we see build time improvements after all the betas. Great job getting these improvements! Unfortunately, we still can't confirm with K2/Kapt due to Dagger issues.
👍 2
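For reference, the -4.5% figure follows directly from the two means reported above. A minimal Python sketch of the arithmetic (the function name is made up for illustration):

```python
def percent_change(baseline_ms: float, candidate_ms: float) -> float:
    """Relative change of candidate versus baseline, in percent (negative = faster)."""
    return (candidate_ms - baseline_ms) / baseline_ms * 100.0

# Mean build times from the message above, in milliseconds.
MEAN_K1_MS = 281_820.32
MEAN_RC1_MS = 269_031.82

change = percent_change(MEAN_K1_MS, MEAN_RC1_MS)
print(f"RC1 vs K1: {change:+.1f}%")
```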