# kotlin-native
m
After adding all the darwin targets (`macosArm64`, `iosArm64`, `watchosArm64`, `watchosSimulatorArm64`, `tvosArm64`, `tvosX64`, `tvosSimulatorArm64`, ...), our CI build times took a 40min hit šŸ˜•. Is there any chance the compiler could optimize "something" down there? From a distance, it looks like it's doing the same thing over and over again.
šŸ‘€ 1
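For context, a minimal sketch of declaring this darwin target matrix with the Kotlin Multiplatform Gradle plugin (the target names come from the message above; everything else is illustrative):

```kotlin
// build.gradle.kts — sketch only, assuming the kotlin("multiplatform") plugin
kotlin {
    macosArm64()
    iosArm64()
    watchosArm64()
    watchosSimulatorArm64()
    tvosArm64()
    tvosX64()
    tvosSimulatorArm64()
    // ...every target declared here gets its own compile and link tasks,
    // which is where the extra CI time comes from
}
```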
r
m
Thanks! Optimizing Gradle task execution is definitely going to help
I think optimizing the Kotlin compiler/linker would be nice too
Like having another "IR" that the linker could re-use between targets sharing the same API or so... But maybe that's another range of complexity
ā˜šŸ¼ 1
a
Yeah, build speeds are rough with so many targets šŸ˜” I'd love to start with either:
• allow parallelization for linking using gradle workers, or
• if stuck linking one at a time, at least use all available cores when no other tasks need to be run
l
@svyatoslav.scherbina Would sharing any of the linking parts be possible, to improve multi-target build times as suggested above?
s
I think optimizing the Kotlin compiler/linker would be nice too
That’s basically what we are doing all the time, see https://youtrack.jetbrains.com/issue/KT-42294
Would sharing any of the linking parts be possible, to improve multi-target build times as suggested above?
What do you mean by ā€œsharing any of the linking partsā€? Which suggestion do you refer to?
l
I'm referring to the last message from @mbonnin in this thread, the one I ā˜šŸ¼ reacted to.
s
Like having another ā€œIRā€ that the linker could re-use between targets sharing the same API or so... But maybe that’s another range of complexity
What are ā€œtargets sharing the same APIā€?
l
The 2 iOS simulators and iosArm64, or modules that only use common Apple APIs (which could be detected via dependencies + inlining tracking, and whether there are target/OS-specific source files).
m
I'm not really sure I used the correct terminology above. What I was trying to say is that, as a user, I have exactly the same code for all the "darwin" targets, and it feels weird that the compiler compiles/links that code N times, where N is the number of "darwin" targets
There are certainly very good reasons it is so, but it feels weird nonetheless, and it would be awesome if that use case could be optimized somehow
āž• 1
s
The 2 iOS simulators and iosArm64,
They don’t have the same API. Even the SDKs in Xcode are different, and some APIs are in fact different. Most of the platform API is the same, but not all.
I have exactly the same code for all the ā€œdarwinā€ targets, and it feels weird that the compiler compiles/links that code N times, where N is the number of ā€œdarwinā€ targets
Even if the code is exactly the same, in some cases it has to be compiled multiple times. For example, consider that your Obj-C library has `size_t foo(void);`. Then, in Kotlin, you have `foo() + 1`. Depending on the platform, it can be `Int` or `Long`.
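To make the point concrete, here is a minimal sketch (not from the thread) using the `platform.posix` bindings that ship with Kotlin/Native for Apple targets; the exact per-target mapping of `size_t` is an assumption to verify against your Kotlin version:

```kotlin
import platform.posix.size_t

fun main() {
    // size_t is a typealias whose underlying Kotlin type cinterop picks per
    // target: roughly a 32-bit type (UInt) on 32-bit targets and a 64-bit
    // type (ULong) on 64-bit ones. Because the resolved type differs, the
    // same Kotlin source has to be compiled separately for each target.
    println("size_t is ${size_t.SIZE_BYTES} bytes on this target")
}
```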
There are certainly very good reasons it is so, but it feels weird nonetheless, and it would be awesome if that use case could be optimized somehow
There is no easy way to optimize this. In any case, right now we are focused on optimizing the scenario of running an app or tests on a single target during development. What is your use case? Do you build binaries for all these targets? Why do you need this?
m
I'm building a lib so it needs to ship all the targets
āž• 2
Right now, I'm skipping some targets in CI to run the tests only on macOS, which somewhat alleviates the issue, but that means we won't catch any bug from a different `size_t` type, for example.
a
Yeah, we also do something similar - we skip some targets on checks that block PRs, and then run a more thorough check on every merge to `main`
edit: I think we also use `debug` instead of `release` in PRs because it's significantly faster
šŸ¤ 1
āž• 1
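A hedged sketch of that PR-vs-main split (the aggregate task name is made up; `iosSimulatorArm64Test` follows the multiplatform plugin's `<target>Test` naming and assumes an `iosSimulatorArm64` target is declared):

```kotlin
// build.gradle.kts — illustrative only
tasks.register("prCheck") {
    // PR CI links and runs only the debug test binary for one simulator
    // target; merges to main would run the full `check` across all targets.
    dependsOn("iosSimulatorArm64Test")
}
```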
r
It won't catch all API differences, but you could address number size differences by testing one 32-bit and one 64-bit target
šŸ‘€ 2
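One hypothetical way to wire that up (the `ciPr` property name and the exact target choice are illustrative, not from the thread):

```kotlin
// build.gradle.kts — sketch: declare a reduced matrix on PR builds only
kotlin {
    if (project.hasProperty("ciPr")) {
        watchosArm32()  // one 32-bit target, to surface size_t-style bugs
        macosArm64()    // one 64-bit target, which also runs the tests
    } else {
        macosArm64(); iosArm64(); iosSimulatorArm64()
        watchosArm32(); watchosArm64(); watchosSimulatorArm64()
        tvosArm64(); tvosX64(); tvosSimulatorArm64()
    }
}
```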
m
we skip some targets on checks that block PRs, and then run a more thorough check on every merge
Same here. We've managed to stay on the GitHub Actions free tier so far, but we're really borderline, and on busy days the CI can get behind quite a lot.
I think we also use `debug` instead of `release` in PRs because it's significantly faster
Ooh interesting šŸ‘€ . I'll look into this.
It won't catch all API differences, but you could address number size differences by testing one 32-bit and one 64-bit target
Indeed that'd work for some classes of errors, maybe not all
(Also, because this happens for macOS targets, which are the most expensive, the issue is somewhat more important. If it were only Linux, we could maybe work around it by parallelizing all targets.)
l
@svyatoslav.scherbina Could we imagine having the compiler know/track the API differences, so it can only recompile what's needed for the other, similar targets? For example, it'd start by building for `iosArm64()`, and if only iOS common APIs are used, the output is just shared for the simulator. Otherwise, an incremental compilation recompiles the smallest parts required that are using platform-specific APIs (be it from the platform or used libraries). Same for the scenario of compiling for `macosArm64()` with APIs common to iOS/iPadOS, watchOS, tvOS, and macOS of course. That'd be very helpful for platform-agnostic libraries.
āž• 1
a
@svyatoslav.scherbina I definitely understand the desire to focus on single-target iteration speeds. That's probably the task that the majority of KMM developers do most frequently -- especially in newer, greenfield projects, and making a good first impression is important! TBH, at Quizlet our single-target local test/debug loop speeds (with the `debug` variant) are pretty acceptable. The area where K/N build speed hurts us the most is in how long it takes us to actually ship. In order to have PR turnaround times that make sense, we have to intentionally be less safe by skipping targets and using debug instead of release -- even though those checks are important to our use cases. When we want to actually consume a new version of our shared library (or fix an error that our less-safe PR checks missed!), it takes quite a while to build a release version because all the link tasks are run serially. I get that there may not be much to be gained from trying to share intermediate state for these Apple targets -- or that the work needed to represent all of the minor differences between them may be too high for too little payoff. In that case, I think bringing the Kotlin/Native compile and link tasks up to date with newer Gradle features like Workers and Configuration Cache would go a long way. Not only would it drastically improve things for folks like us who have their loops blocked by verifying multiple targets, by allowing us to parallelize these steps, but it would also probably improve iteration speeds for common single-target development use cases!
s
Ā I think we also useĀ 
debug
Ā instead ofĀ 
release
Ā in PRs because it’s significantly faster
Yes. While I have your attention, let me also remind you of the document that contains this and probably other useful tips for improving compilation time: https://kotlinlang.org/docs/native-improving-compilation-time.html
@mbonnin
I’m building a lib so it needs to ship all the targets
Do you have any binaries declared? Is it possible that you build not only klibs, but also a lot of binaries?
@louiscad
Could we imagine having the compiler know/track the API differences, so it can only recompile what’s needed for the other, similar targets?
Although one could imagine having this, we are not working on this at the moment, and don’t have short-term plans for this. And this would require quite a lot of work. After all, we still have more doable things to optimize.
For example, it’d start by building for `iosArm64()`, and if only iOS common APIs are used, the output is just shared for the simulator.
The output can’t be just shared for the simulator, because these are different native targets which require different machine code generated.
@ankushg
In that case, I think bringing the Kotlin/Native compile and link tasks up to date with newer Gradle features like Workers and Configuration Cache would go a long way.
This is way more feasible and implementable. Unfortunately, all folks working on Gradle support are somewhat busy with stabilizing things for KMM Beta, so I can’t promise you any ETA here. But I personally consider this as one of the next steps in improving the compilation time.
šŸ‘ 2
šŸ‘šŸ¼ 1
l
The output can’t be just shared for the simulator, because these are different native targets which require different machine code generated.
Not when it comes to the Apple Silicon simulator though?
s
Not when it comes to the Apple Silicon simulator though?
In this case too. The iOS simulator on Apple Silicon is different from an iOS device with arm64. It might be possible to reuse something, but the binaries still aren’t identical. Otherwise we wouldn’t have had to add a whole bunch of new targets for Apple Silicon šŸ™‚ See e.g. https://youtrack.jetbrains.com/issue/KT-43667
šŸ‘šŸ¼ 1
m
Do you have any binaries declared
I'll double-check, but my understanding is that tests need linking anyway, so those binaries will need to be linked?
s
Yes, running tests requires building test binaries. But there are tasks for building test binaries even for targets that we can’t run tests on, e.g. iosArm64. So such binaries are pretty much useless. IIRC, these tasks shouldn’t be invoked when running `build`, `assemble`, or `check`. But please ensure that you don’t run these tasks some other way. Generally, it might be a good idea to start the investigation by taking a look at a Gradle build scan.
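If a build scan does show them running, one hedged way to rule them out (task names follow the plugin's `linkDebugTest<Target>` convention; adjust to the targets you actually declare):

```kotlin
// build.gradle.kts — sketch: device-test binaries can't run on any host,
// so make sure nothing links them by accident
listOf("linkDebugTestIosArm64", "linkDebugTestWatchosArm64", "linkDebugTestTvosArm64")
    .forEach { name ->
        tasks.matching { it.name == name }.configureEach {
            enabled = false
        }
    }
```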
a
@svyatoslav.scherbina
Do you have any binaries declared? Is it possible that you build not only klibs, but also a lot of binaries?
Many folks (like us and maybe @mbonnin šŸ™ƒ) are intentionally building binaries like XCFrameworks as part of their CI steps, because the projects are intended to eventually be consumed from Swift/Obj-C codebases, outside of KMP. Or sometimes, the library also includes samples that folks want to make sure can compile their binaries.
> In that case, I think bringing the Kotlin/Native compile and link tasks up to date with newer Gradle features like Workers and Configuration Cache would go a long way.
This is way more feasible and implementable.
Unfortunately, all folks working on Gradle support are somewhat busy with stabilizing things for KMM Beta, so I can’t promise you any ETA here. But I personally consider this as one of the next steps in improving the compilation time.
This is good to hear! Hopefully it's not too complicated, because it'll be a big help for us!
šŸ‘ 2
p
Apple Silicon has broken our CI pipeline. The Bitrise premium machines time out at 90 minutes, and building a release framework for all targets takes between 80 and 100 minutes.
a
Same here. It would be very beneficial to run compilation and linking in parallel on multiple cores. This is especially important when splitting targets between hosts, e.g. compiling only Apple targets on a macOS machine and everything else on a cheaper Linux machine. Since Apple targets are all native, the whole process doesn't utilize cores well.