# announcements
t
Has anybody dug into any details around what makes Kotlin use a lot of memory when compiling? I've got a project that consists of a mix of generated and handwritten code and I've had to bump the compiler to 6GB memory to see it complete.. and I am curious. The project is only about 48000 lines (with whitespace) and 160-170 classes (+ a few data classes here and there). The largest kotlin class is 2600 lines long (with whitespace). (targeting jvm -- server style application)
d
Hello. Can you share your project, if it's open source, so we can investigate this problem?
l
The compiler usually takes a long time and uses a lot of memory when I'm using a lot of lambdas and higher-order functions
At #CT0G9SD7Z some classes take 40s to show syntax errors in IntelliJ
So I'd guess that the compiler is not very optimized when dealing with big classes and functions
t
@dmitriy.novozhilov Sadly, no, it is a closed source project 😞 During an earlier run, I used jmap a bit to see what consumed memory, and I was slightly surprised by the large increase in character array usage. Over a few snapshots, the "leader board" on memory usage seemed pretty stable. Attached is a jmap dump from when it was hitting the VM ceiling in an earlier run. I should rerun it now that the limit has gone from 4GB to 6GB and see if it is similar.
This code doesn't really use that many advanced concepts, since a lot of it is highly boilerplated serialization/deserialization (but due to the lack of a helpful object model, it is cumbersome to do it through reflection etc., and instead code generation has been a good way to ensure we get typed interfaces with the data).
(this is a kotlin-maven project, btw, if that would make any difference -- the excessive memory usage is visible both when building it from Intellij and straight maven -- not that this is surprising)
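For comparing the jmap "leader board" between snapshots, a minimal Kotlin sketch follows. It assumes the usual text layout of `jmap -histo` output (rank, instance count, bytes, class name); the names `HistoRow`, `parseHisto` and `topGrowth` are hypothetical, and the numbers in `main` are made up for illustration, not taken from the project discussed here.

```kotlin
// One data row of `jmap -histo` output: instance count, shallow bytes, class name.
data class HistoRow(val instances: Long, val bytes: Long, val className: String)

// Matches rows such as "   1:        123456       98765432  [C"; the header line does not match.
private val rowPattern = Regex("""^\s*\d+:\s+(\d+)\s+(\d+)\s+(\S+)""")

// Parse a whole histogram into a map keyed by class name.
fun parseHisto(text: String): Map<String, HistoRow> =
    text.lineSequence()
        .mapNotNull { rowPattern.find(it) }
        .map { m ->
            val (instances, bytes, name) = m.destructured
            HistoRow(instances.toLong(), bytes.toLong(), name)
        }
        .associateBy { it.className }

// Classes whose byte totals grew the most between two snapshots.
fun topGrowth(
    before: Map<String, HistoRow>,
    after: Map<String, HistoRow>,
    limit: Int = 5
): List<Pair<String, Long>> =
    after.values
        .map { it.className to (it.bytes - (before[it.className]?.bytes ?: 0L)) }
        .sortedByDescending { it.second }
        .take(limit)

fun main() {
    // Illustrative values only.
    val earlier = """
         num     #instances         #bytes  class name
           1:          1000         200000  [C
           2:           500          12000  java.lang.String
    """.trimIndent()
    val later = """
         num     #instances         #bytes  class name
           1:          4000         900000  [C
           2:           600          14400  java.lang.String
    """.trimIndent()

    topGrowth(parseHisto(earlier), parseHisto(later))
        .forEach { (name, delta) -> println("$name: $delta bytes") }
}
```

The histogram reports shallow sizes only, so a growing `[C` (char[]) row shows what is growing but not what is holding on to it; that needs a heap dump and a retained-size analysis.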
m
Just a guess, does your code contain many warnings?
Also, maybe you can provide just a part of your generated code? Without its actual semantics / naming, just to see which constructs you use
s
@Thorkild What version of Kotlin compiler are you using?
t
@sdeleuze 1.3.72
l
@Thorkild And about the first question of Mikhail at least… many code warnings or not?
t
Working on checking. Maven shows none, but I find it highly unlikely that there isn't a single one, so I am checking if it is swallowing the warnings. I do use suppress annotations due to casting (unchecked_cast), and I am wondering whether that generates a warning internally, with the suppress just stopping it from being output.
l
So, the Kotlin compilation is started from a maven build, not a Gradle one, right?
t
Yes. The same problem exists when I run it from IntelliJ, though, but I am unsure if it runs it through maven or not.
310 unchecked cast suppressions in the code. I removed some of the suppressions, and maven then told me about the warnings, so I do not seem to have any warnings other than the ones suppressed. I tend to prefer to treat warnings as errors, and avoid getting into the habit of seeing warnings as normal, so I tend to jump on warnings. So it seems that, if the problem is the number of warnings, the challenge is that the suppress doesn't make the compiler ignore them completely.
I now have to use 7GB to compile it, since I am now using more of those classes, so I am guessing it escalates every time I use the classes containing the suppressed warnings.
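For readers unfamiliar with the construct, here is a minimal sketch of a suppressed unchecked cast. `RawRecord` and `read` are hypothetical names standing in for the closed-source code; the sketch only shows what the suppression looks like and does not settle whether the compiler still computes the diagnostic internally before hiding it.

```kotlin
// Hypothetical untyped store, standing in for the closed-source data layer.
class RawRecord(private val values: Map<String, Any?>) {
    // The cast to T cannot be verified at runtime because of type erasure,
    // so the compiler reports UNCHECKED_CAST here unless it is suppressed.
    @Suppress("UNCHECKED_CAST")
    fun <T> read(fieldName: String): T = values[fieldName] as T
}

fun main() {
    val record = RawRecord(mapOf("name" to "example", "count" to 3))
    // T is inferred from the declared type on the left-hand side.
    val name: String = record.read("name")
    val count: Int = record.read("count")
    println("$name $count")
}
```

Removing the `@Suppress` line makes the warning visible again in the build output, which matches the experiment described above.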
m
No, 370 or 1000 warnings isn't bad at all, it'd be bad if there were several thousands of them
m
Since you have generated code, one thing you may try to relieve the compilation burden is to package the generated sources into a separate module and import that as a dependency, so that you don’t have to recompile those files every time. A simple maven multi-module build will do the trick.
Moreover, your numbers for LoC and classes don’t seem too problematic, so the culprit of the problem must be somewhere else. Meanwhile, you can try to use this maven extension to enable incremental compilation and further reduce build time: https://github.com/takari/takari-lifecycle
t
I have split out part of the code, but for this one it isn't the best thing to do (but I will have to if I have to). I dumped the heap of it and looked at the object memory usage now (through visualvm), but I dumped it too early so I think I did it before the stage where it really goes off the rails, so I am doing it again.
m
(sorry about Takari, I used it in the past but didn’t think it was only for Java)
t
After giving up on visualvm's calculations (it ran for 24 hours without finishing), I tried Memory Analyzer (MAT) from Eclipse (I have this feeling of déjà vu of doing the exact same order of tests before for other things..). The main culprit is char[], but that's not the root cause of course; the "leak" is divided across 66000 entries of org.jetbrains.kotlin.load.java.structure.impl.classFiles.BinaryJavaMethod . Trying to dig down a bit further, but on a random sample, they all retain the same amount of bytes, 99% of which is in the returnType "org.jetbrains.kotlin.load.java.structure.impl.classFiles.PlainJavaClassifierType".
I think the leak analysis might disregard the char[] referenced in this part of the leak
I have thought through what is "special" with the code I have, and one thing that might be a problem (though I have not found evidence for it so far) is that this is an API that tries to give a typed layer on an underlying structure where field names etc. are text strings. So it has a very simple data class with two fields (field name and class-reference).. and there are a little over 5000 instances of it spread across companion objects across the 130-160 classes
so if there is an assumption about few companion objects per class or some caching of that.... that might be a killer here
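To make the pattern concrete, here is a minimal sketch of a typed layer over string-keyed data. Every name in it (`FieldRef`, `Customer`, `NAME`, `AGE`) is hypothetical, and it assumes the "class-reference" is a `KClass`; the real generated code may look different.

```kotlin
import kotlin.reflect.KClass

// Hypothetical descriptor: a field name plus the class of its value.
data class FieldRef<T : Any>(val name: String, val type: KClass<T>)

// Hypothetical generated wrapper around an untyped, string-keyed structure.
class Customer(private val raw: Map<String, Any?>) {
    // In the project described above, descriptors like these reportedly add up
    // to a little over 5000 across the companion objects of 130-160 classes.
    companion object {
        val NAME = FieldRef("name", String::class)
        val AGE = FieldRef("age", Int::class)
    }

    // Typed lookup: the descriptor carries the type, so callers get T? back.
    // javaObjectType yields the boxed class (e.g. Integer for Int), and
    // Class.cast returns null when the stored value is null or missing.
    fun <T : Any> get(field: FieldRef<T>): T? =
        field.type.javaObjectType.cast(raw[field.name])
}

fun main() {
    val customer = Customer(mapOf("name" to "Ada", "age" to 36))
    val name: String? = customer.get(Customer.NAME)
    val age: Int? = customer.get(Customer.AGE)
    println("$name, $age")
}
```

This sketch uses a checked cast through the class reference; the thread mentions suppressed unchecked casts instead, so treat the lookup as one possible shape rather than the project's actual approach.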
m
Does your project use kotlinx.html? Could be the cause with compiler 1.3.72 https://github.com/Kotlin/kotlinx.html/issues/147