# compiler
y
After digging a bit more it seems like this isn't possible, at least JVM-wise, because the binaries are distributed as jars instead of as IR like JS and Native are. So my question now is: can I access the provided source code of a library (i.e. the .kt file of a specific function like `let` from the standard library), compile that as part of the normal code compilation, and then access its body in IR? I just need a way to inspect the code of a function coming from a different compilation unit as IR instead of having to look at its bytecode. Even if it is a copy of the function and not 100% the function itself, that's still absolutely fine, because I just need to read through its IR without editing it, so I'm happy to take its source files and compile just the ones containing the functions whose IR I need to read. So, is there any way to get a library's included sources and compile them?
s
I don't think the compiler deserializes IR bodies or reads sources from other modules; it would slow down compilation way too much. You can ask the compiler to create IR from a certain set of sources, however, but that will slow down compilation 3-5x at least. The "right" way to approach this is to add an annotation/field/file during the compilation of the other module to keep the information you need.
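For instance, a minimal sketch of that approach; the annotation name and payload here are hypothetical, not an existing API:

```kotlin
// Hypothetical marker a plugin could attach to declarations it compiles, so a
// later compilation of a consumer module can recover the information it needs
// without deserializing IR bodies or re-reading sources.
@Retention(AnnotationRetention.BINARY)
@Target(AnnotationTarget.FUNCTION)
annotation class OptimizableLambdaBody(val payload: String)
```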
y
Well, my idea is to pull just the files that contain the functions I need to look into. The "right" way won't really work here because I'm relying on getting the body from sources I have absolutely no control over (including the stdlib). So the question is: how do I get the source of a library included from Gradle, and then how do I tell the compiler to compile a file from those sources? What I'm thinking is that I could even convert it to PSI first, then pick out the function that I need. I think the Kotlin metadata of the function, or the line numbers, might include the file name, so I could use that to find the specific file I need, parse it into PSI, choose the function I need, possibly change its name so that I can compile it into IR against the actual binary of the library (and so that only that one single function needs to be compiled, not the whole library), and then finally use its IR. Just to explain what I'm actually doing: I'm performing inlining on my own at the IR level so I can then do additional optimisations with the knowledge of what that inlined body contained, which is why I need the whole body in IR form.
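Something like this is what I have in mind for the PSI part, assuming the plugin has access to a `Project` (e.g. from a `KotlinCoreEnvironment`) and that the library's source text has already been obtained somehow; the names here are just illustrative:

```kotlin
import com.intellij.openapi.project.Project
import org.jetbrains.kotlin.psi.KtNamedFunction
import org.jetbrains.kotlin.psi.KtPsiFactory

// Parse a single library source file into PSI and pick out one named function,
// so only that declaration would need to be fed back into compilation.
fun findFunction(project: Project, sourceText: String, name: String): KtNamedFunction? {
    val ktFile = KtPsiFactory(project).createFile("library-snippet.kt", sourceText)
    return ktFile.declarations
        .filterIsInstance<KtNamedFunction>()
        .firstOrNull { it.name == name }
}
```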
Also, I think the compiler does know the IR of other modules in one way or another, because IIRC inlining on JS and Native is done at the IR layer and not the "bytecode" (for lack of a better word) layer, which means the compiler somehow has the IR of those external declarations.
s
On retrieving sources: as I said, it is possible, but it is extremely slow. No one forbids you from doing it anyway 😅 About inlining: the compiler does it in an interesting way AFAIK: when you compile intermediate libraries, it uses only the "headers" of the functions, and all inlining happens during the "linking" step, but it does inline IR directly, you are right. What are you trying to achieve, btw? Maybe I could help more if I knew why you need the bodies of these functions :)
y
Well, basically I'm working on a compiler plugin that optimises lambdas returned by inlined functions and stored in local `val`s so that they never need to become actual `Function` objects. Part of that is, for example, allowing a `let` call on a lambda. This (and any other inline function that generically uses an object) currently results in the lambda being "boxed" (if you will). The plan, which already works for functions in the same compilation unit, is to inline any inline function where, at the call site, a lambda is passed in the place of a generic parameter, so that my other plugin code can take that inlined body, replace every `IrGet` on the parameter with the copied `IrFunctionExpression` that is the lambda, and then let the platform's respective compiler do the rest of the inlining for me (I also have special cases for calling `invoke` on the lambda directly, but whatever, that's just irrelevant detail). The body of the function is also needed if the function receives the lambda as an extension receiver because, for some reason, the compiler just doesn't optimise that by default. The future plan is to also support inlining `@JvmInline value class`es with a lambda backing field, and possibly then even transforming normal classes into lambda-backed classes, with the ultimate goal of providing something similar to the idea of structs but much more powerful, with great performance. More info about the optimisation itself is available at KT-44530 and the plugin itself is open-source on GitHub.
I did some more digging, by the way, and apparently KLibs are just serialised IR, so I wonder if I can somehow get the KLib of a desired library (like the stdlib) and then deserialise only the parts that I need to use.
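If a .klib for the library can be located on disk at all, something along these lines might open it; this assumes the compiler's klib tooling API (`resolveSingleFileKlib` from kotlin-util-klib) is on the plugin's classpath, and it stops well short of actually deserializing bodies, which would need the backend's linker machinery on top:

```kotlin
import org.jetbrains.kotlin.konan.file.File as KFile
import org.jetbrains.kotlin.library.KotlinLibrary
import org.jetbrains.kotlin.library.resolveSingleFileKlib

// Open a .klib from a known path; the path itself is a placeholder here.
// Reading IR out of it would be a separate (and much bigger) step.
fun openKlib(path: String): KotlinLibrary =
    resolveSingleFileKlib(KFile(path))
```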
The best that I can get so far in terms of the KLibs route is `jvmResolveLibraries`, which errors out when I try to get it to give me a result for the `stdlib`.
It seems like the KLib route is the right direction here, but 1) it doesn't seem like any klib files get generated for JVM compilations, and 2) I can't figure out how to even find the klib files for a specific library.
s
On the KLIBs, you are right, and there is a way in the compiler to get another module with deserialized inline function bodies, but it will only work for JS and Native, because, as you already noticed, on the JVM a module gets compiled to a jar directly, so you'll have to operate on the bytecode layer, which is also possible with a compiler plugin. You could also recompile some of the libraries to get IR directly, but I don't think that's your goal :) By the way, it feels like this logic belongs in the inlining phase of the compiler rather than in a plugin. A plugin is quite limited in these cases, so maybe forking the compiler (and then contributing back) is a slightly better way to go, so I would recommend looking at that 😅 What you mentioned, however, is that you don't want to alter the other module's klib code; does that mean that you want to retrieve the IR and inline it manually? It could work, but I feel like it will cause a lot of problems with compatibility between compiler versions.
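As an illustration of what the bytecode layer looks like, a small sketch using ASM's tree API to pull a method out of a library class file; the class bytes would come from the jar on the compile classpath, and the names here are placeholders:

```kotlin
import org.objectweb.asm.ClassReader
import org.objectweb.asm.tree.ClassNode
import org.objectweb.asm.tree.MethodNode

// Parse a class file from a library jar and return one method's bytecode,
// which is the level you would have to analyse on the JVM instead of IR.
fun readMethod(classBytes: ByteArray, name: String, descriptor: String): MethodNode? {
    val node = ClassNode()
    ClassReader(classBytes).accept(node, ClassReader.SKIP_FRAMES)
    return node.methods.firstOrNull { it.name == name && it.desc == descriptor }
}
```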
y
I'm not inlining it manually per se; I took the `FunctionInlining` lowering from JS and Native and rewrote it to work with an `IrPluginContext`, and it works flawlessly for anything in the same compilation unit. I don't want to fork the compiler because the plan is to release this as a compiler plugin, at least until (hopefully) this optimisation gets implemented in the main Kotlin compiler itself. I know I could probably do this as a bytecode transformation, but then there's a million other headaches to deal with, like `ObjectRef`s and undoing some optimisations the compiler made, such as converting capturing lambdas into classes-with-fields, and a load of other things that would be automatically taken care of at the IR level. I think I might try to recompile some of the libraries, or possibly I can resort to not optimising inline functions from libraries that weren't compiled with my plugin; in my plugin I could perhaps create a klib for the JVM compilation of a library and place it inside the jar so that my plugin can pick it up and deserialise it in consumers of that library. I might also just try to recompile part of the library like you said, and perhaps that can be a fallback when a library wasn't compiled with my plugin, as a kind of compile-time-performance for run-time-performance trade-off (for example, parts of the stdlib would be recompiled and optimised so that at runtime you get the most performance possible). Well, down another rabbit hole I go, I guess lol!
s
I am sure that you could also contribute this to the compiler yourself, instead of waiting for it to be implemented, if the Kotlin team is happy with the scope of the changes. To me it seems like a small optimization which doesn't change much on the surface or in binary compatibility, but I am not an expert in these things, to be completely honest 🙂 Other than that, I would explore the possibility of copying these inline functions to help with the cases you mentioned. E.g. in the modules compiled with your plugin, you could have the original function with `irGet` and a copied one with `irFunctionExpression`. Then in another module, you could choose a function based on the type argument and "just" replace a call.
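Purely as an illustration of that idea (all names here are made up): the module compiled with the plugin would carry both the original declaration and a pre-specialized copy, and a consumer module would only need to retarget the call.

```kotlin
// Original inline function: in IR, its body reads `block` (an IrGetValue).
inline fun <T, R> T.pipe(block: (T) -> R): R = block(this)

// Hypothetical copy the plugin could emit for a known lambda shape: the body
// already contains the lambda itself (an IrFunctionExpression), so a consumer
// module only has to swap the call target based on the type arguments.
fun Int.pipeToString(): String = { x: Int -> x.toString() }.invoke(this)
```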