Hi. What I meant were more general approaches that could be taken, rather than point optimizations. With the implementation I presented I only explored what could easily be improved without moving too many things around; it still doesn't address many of the current issues.

For example, memory usage has been mentioned. The IR still keeps the whole code of a module in memory at once, so, while not without some pain, the footprint can probably be reduced by a few, maybe a few dozen, percent, but not more. To actually make a difference, the IR has to be somehow sliced and processed in chunks.

Secondly, while I reduced it quite a bit, most of the time is still spent searching the tree for an opportunity for a transformation rather than performing it (except for inlining), likely also because of data locality. This, along with optimizing for the code cache, is a broad topic, but it can be improved upon in several ways: replacing the by-class cache with a by-feature cache, preventing each phase from traversing the whole tree, or specifically applying code to data that is already local.

Then comes multithreading. It is even harder, if not impossible, to do with what I changed, but it has strong potential to come quite naturally along with some other approach to lowering. And then, not least important, comes integration with the outside world. Wow, that's long.
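To illustrate the by-feature cache idea, here is a minimal sketch (all names are hypothetical, not the real compiler API): one traversal up front records which nodes exhibit which feature, so a lowering phase can jump straight to its candidates instead of each phase re-walking the whole tree.

```java
import java.util.*;

// Hypothetical features a lowering phase might look for.
enum Feature { HAS_INLINE_CALL, HAS_LOCAL_CLASS }

// Toy stand-in for an IR tree node.
class IrNode {
    final String name;
    final Set<Feature> features;
    final List<IrNode> children;
    IrNode(String name, Set<Feature> features, List<IrNode> children) {
        this.name = name;
        this.features = features;
        this.children = children;
    }
}

class FeatureCache {
    private final Map<Feature, List<IrNode>> index = new EnumMap<>(Feature.class);

    // A single full traversal populates the index for every feature at once,
    // so individual phases no longer need to scan the whole tree themselves.
    void build(IrNode root) {
        for (Feature f : root.features)
            index.computeIfAbsent(f, k -> new ArrayList<>()).add(root);
        for (IrNode child : root.children) build(child);
    }

    // A phase asks only for the nodes it can actually transform.
    List<IrNode> nodesWith(Feature f) {
        return index.getOrDefault(f, Collections.emptyList());
    }
}
```

A real implementation would of course have to keep the index valid as transformations create and delete nodes, which is where most of the complexity would live; this only shows the access pattern.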
I would even suggest that importing the changes I sent, just as they are, is not worth it, at least not without first considering those new approaches, to avoid rewriting everything twice. I also think it would be better to do something like what was done with FIR: fork the backend code and keep both implementations for a while, so that more experiments can be made without affecting the existing code.
I have more concrete ideas about how it can be done and, as discussed with @Ilmir Usmanov [JB], I might have a chance to discuss or implement them further.