# compiler
t
Hiya folks! I’m curious about some of K2’s implementation details. For context, I work on rust-analyzer (a language server for the Rust programming language) and we were discussing K2. I had a question about it: how does K2 do invalidations, and at what level of granularity? As I understand it, K2 does resolution on a per-declaration basis, so I’m guessing invalidations are similarly done on a per-declaration basis (as opposed to, say, per-file). In rust-analyzer, we have an extremely fine-grained invalidation scheme powered by a library called Salsa, where we invalidate analysis whenever an item (a function, body, trait definition, etc.; a declaration, in Kotlin’s terminology, I think?) changes. The mechanism by which we accomplish this invalidation/incrementality is best described in the Rust compiler’s documentation. Even though we’re tracking lots of stuff, we’re able to minimize how much work we need to redo on a per-edit basis.
y
Look into the Analysis API. I think it has info about what parts are calculated. I think the dependencies between declarations are mostly handled by some IntelliJ PSI system
t
hmm, i can take a look, and while i can see references to phases and caches in the source, it’s a bit harder for me to discern when said caches are invalidated. i also assume the intellij PSI system isn’t used in the standalone compilation mode?
y
Oh sorry, are you on about incremental compilation? I believe that is on a per function basis yes.
d
Incremental compilation is implemented at the file level. Build tools (Gradle or the IntelliJ build system) pass the changed files to the compiler as sources and all previously compiled files as an additional classpath. After that, an iterative algorithm in the build tools determines which files might be affected by the previous changes and passes the sources of these files to the next IC round, until it reaches a fixed point. To determine which files might become dirty after changes in other files, the compiler records so-called lookups: pieces of information of the form "class/function/etc. name `X` was referenced during the analysis of file `Y`". On the build tools side, we compare metadata dumps of the current and previous compilation to find all declarations which actually changed.
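To make the fixed-point iteration over lookups more concrete, here is a minimal sketch. Everything in it is hypothetical: `Lookup`, `dirtyFiles`, and the `changedNamesOf` callback are illustration-only names, not the build tools' real API. In particular, the real implementation finds changed declarations by comparing metadata dumps, while the usage example below simply assumes every declaration in a recompiled file has changed.

```kotlin
// Hypothetical model: "name was referenced during the analysis of file".
data class Lookup(val name: String, val file: String)

// Returns every file that has to be recompiled, starting from the changed ones.
// `changedNamesOf` stands in for "compile these files and diff the metadata".
fun dirtyFiles(
    initiallyChanged: Set<String>,
    lookups: List<Lookup>,
    changedNamesOf: (Set<String>) -> Set<String>,
): Set<String> {
    // Index lookups by referenced name: name -> files that referenced it.
    val filesByName = lookups.groupBy({ it.name }, { it.file })
    val dirty = initiallyChanged.toMutableSet()
    var round: Set<String> = initiallyChanged
    while (round.isNotEmpty()) {
        // One IC round: find the names whose declarations changed in this round.
        val changedNames = changedNamesOf(round)
        // Files that looked up one of those names and are not dirty yet.
        val next = changedNames
            .flatMap { filesByName[it].orEmpty() }
            .filterTo(mutableSetOf<String>()) { it !in dirty }
        dirty += next
        round = next // an empty round means the fixed point is reached
    }
    return dirty
}

fun main() {
    val lookups = listOf(Lookup("A", "b.kt"), Lookup("B", "c.kt"), Lookup("D", "e.kt"))
    // Assumption for the demo: every declaration of a recompiled file changed.
    val declared = mapOf("a.kt" to setOf("A"), "b.kt" to setOf("B"), "c.kt" to setOf("C"))
    val result = dirtyFiles(setOf("a.kt"), lookups) { files ->
        files.flatMap { declared[it].orEmpty() }.toSet()
    }
    println(result)
}
```

In this toy setup, editing `a.kt` pulls in `b.kt` (which looked up `A`) and then `c.kt` (which looked up `B`), while `e.kt` is left alone because nothing it looked up changed.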
So, as you can see, in compilation the granularity of the process is quite coarse, as the smallest input for the compiler is a single file. The IDE is a different story. There, small changes in various places happen all the time, and reanalyzing whole files each time would be too time-consuming. To handle that, the Analysis API implements a smart cache-dropping algorithm, which drops analysis results for specific declarations (e.g. only for one function). Also, AFAIK, the AA team is working on an even smarter algorithm, which will allow us to reanalyze only part of a function body on a change inside it:
fun test() {
    // 1000 lines of code
    foo {
        println("hello") // change it to println("world")
    }
}
If I got the concept right, on a change like the one in the example, only the `foo` call will be reanalyzed, and the results for the "1000 lines of code" will be taken from the cache. @dimonchik0036 correct me if I'm wrong.
d
If we are talking about regular typing, the Analysis API reacts to it via the LLFirDeclarationModificationService. Basically, we have in-block modifications (modifications which can affect only a predictable number of declarations, for instance typing inside a function body, provided the function has an explicit return type) and out-of-block modifications (everything else). In-block modifications usually invalidate only a part of some particular declaration (e.g., they drop the function's body). Out-of-block modifications, on the other hand, usually invalidate caches for the entire module and all dependent modules.
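A rough sketch of that in-block / out-of-block split, for illustration only. None of these types exist in the Analysis API: `FunctionDecl`, `Edit`, `Invalidation`, and `classify` are hypothetical names, and the real LLFirDeclarationModificationService handles many more cases than the two modelled here.

```kotlin
// Hypothetical model; not the Analysis API's actual types.
data class FunctionDecl(val name: String, val hasExplicitReturnType: Boolean)

sealed interface Edit
data class BodyEdit(val function: FunctionDecl) : Edit      // typing inside a body
data class SignatureEdit(val function: FunctionDecl) : Edit // e.g. changing a parameter

sealed interface Invalidation
data class DropBody(val functionName: String) : Invalidation // in-block result
object DropModuleAndDependents : Invalidation                // out-of-block result

fun classify(edit: Edit): Invalidation = when (edit) {
    is BodyEdit ->
        // With an explicit return type the body cannot change what callers see,
        // so only that function's body needs to be dropped.
        if (edit.function.hasExplicitReturnType) DropBody(edit.function.name)
        // An inferred return type depends on the body, so the edit can leak out.
        else DropModuleAndDependents
    is SignatureEdit -> DropModuleAndDependents
}
```

The point of the explicit-return-type check is that the edit's effect has to be predictable from the outside: once the return type is inferred from the body, an edit inside the body may change the declaration's type, so it is treated as out-of-block.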
> the AA team works on even smarter algorithm, which will allow us to reanalyze only part of function body on change inside it:
This is about KT-72357, and for now, we have plans only for partial analysis, not for partial invalidation, so your example transforms to
fun test() {
    foo {
        println("hello") // want to check whether `println` points to the `kotlin.io.println`
    }
    // 1000 lines of code
}
so we can analyze only the first statement from the function and leave the remaining body untouched
t
thanks for explaining and sorry for the delay! I don’t usually have slack open; i don’t use it at work. the distinction between the ide and build tool cases makes sense to me. is the analysis API’s smart cache-dropping algorithm the one implemented in https://github.com/JetBrains/kotlin/blob/bb8023fa29794f6b08aaa315ac963d9d8187cd6c/analysis/low-level-api-fir/src/org/jetbrains/kotlin/analysis/low/level/api/fir/file/structure/LLFirDeclarationModificationService.kt#L58?