# gradle
r
Can anyone advise on how Kotlin incremental compilation integrates with Gradle? Specifically, if I want to benefit from incremental compilation in a CI environment, which directories (if any) under `$PROJECT_DIR/build` do I need to cache between runs? Or can I just maintain the build cache (`~/.gradle/caches/build-cache-1`) and still benefit from incremental compilation? Testing suggests not... This part of the question is not really Kotlin-specific (and I have asked it on the Gradle Slack), but if you maintain the whole of `$PROJECT_DIR/build` between runs, is there any point in maintaining the build cache (`~/.gradle/caches/build-cache-1`), or indeed using the build cache at all?
d
Doesn't the build cache feature allow artifacts to be stored in a centralized repository (Gradle Enterprise?), letting a CI worker farm look up and retrieve cached artifacts across multiple CI invocations/runs? I believe this is the purpose of using it at scale; if you have a single CI system it may not seem as useful as more manual copy/restore approaches
Sorry, I don't know about incremental compilation; it's not something I have considered in CI myself, as usually you want repeatable builds from a known initial state, and usually the cost of all the CI machinery in resources/time is far more than compiling a few hundred extra classes each time. Many people use `git`, and it does not preserve timestamps; the key to incremental compilation is going to be preservation of timestamps as well as data.
m
There are 3 concepts:
1. up-to-date checks: do not run a task if its inputs did not change
2. build cache: copy the task output from a previous run that used the same inputs
3. incremental tasks: only run a portion of a task because only a portion of its inputs changed
Usually "Kotlin incremental compilation" refers to 3 (doc), but this is super confusing because in Gradle terms, "incremental" refers to 1 (doc). So incremental builds are quite different from incremental tasks.
Not sure what folders you need to keep around but I'd recommend using the remote cache. This way, you end up downloading only those outputs that you can actually reuse.
➕ 1
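As a concrete illustration of concept 2, here is a minimal `settings.gradle.kts` sketch enabling both the local and a remote build cache; the URL is a placeholder, and the push condition is just one common convention, not anything prescribed by the thread:

```kotlin
// settings.gradle.kts -- minimal build cache setup (concept 2 above).
// The remote URL is a placeholder; point it at your own cache server.
buildCache {
    local {
        // The on-disk cache (~/.gradle/caches/build-cache-1 by default).
        isEnabled = true
    }
    remote<HttpBuildCache> {
        url = uri("https://cache.example.com/cache/")
        // A common convention: only CI populates the remote cache,
        // developer machines only read from it.
        isPush = System.getenv("CI") == "true"
    }
}
```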
t
Remote cache per se will not help with incremental compilation (IC) on CI. Indeed, the new JVM approach to IC has fixed the issue of incremental compilation after a cache hit in a dependency sub-project. But to be able to use IC you need to have some previous state of task execution. If you want to use IC on CI, probably the easiest solution would be to save the state of the repo after the build is finished and, on the next run, restore this state, apply the git commit, and run the build. You could try to save state only for Kotlin compilation tasks, but we don't provide any guarantees on path stability or which paths need to be saved.
s
@mbonnin / @Rob Elliot -- not sure if this can help, as it's quite new, but at Buildless (https://less.build) we just added a GitHub Action; it's a drop-in remote cache based on Cloudflare
it scales to zero and may be useful since it eliminates the slow up-front download pattern used by GHA
very nice 1
we've had good success with it internally; if anybody here would like to try it, let me know and i can provide keys
c
How does it work technically? Is it just a SaaS Build Cache server?
If that's the case, for everyone reading it: the Build Cache server is free to self-host: https://docs.gradle.com/build-cache-node/ (Docker Compose & Kubernetes configs are provided)
👀 1
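If you do self-host a cache node, the client side is plain Gradle config; a hedged sketch for `settings.gradle.kts`, where the host and the credential environment variable names are placeholders for your own deployment:

```kotlin
// settings.gradle.kts -- pointing Gradle at a self-hosted build cache node.
// Host and env var names below are placeholders for your deployment.
buildCache {
    remote<HttpBuildCache> {
        url = uri("https://build-cache.example.internal/cache/")
        credentials {
            username = System.getenv("BUILD_CACHE_USER")
            password = System.getenv("BUILD_CACHE_PASSWORD")
        }
        // Only CI writes; server-side auth is configured on the node itself.
        isPush = System.getenv("CI") == "true"
    }
}
```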
s
It's a SaaS build cache server, written from scratch 🙂 with support for many tools, not just Gradle. It's also a local Agent, so you can deploy it to your own machine. We're working on other deployment options as well (private cloud, self-host). The Agent is free forever, and the Cloud is also free during beta.
Naturally the build cache itself is written in Gradle/Kotlin, so we actually cache Buildless with Buildless, and that has afforded a lot of testing.
Especially over HTTP/3, and with smarter connection management/compression, build caching really flies. Our new GitHub Action also extends these benefits to CI. So yes, the build cache node Docker image is free, and totally has a place. It's useful, but to a limit; at least when I used it last, managing it was rather tough. Keeping space available, but not too much, while keeping it online, wasn't our core business; now it is ours, so it doesn't have to be yours.
@mbonnin if you're using the Build Cache node and have any needs that aren't being met, let us know 🙂 so long as your build is fast then we're happy lol
c
@Samuel Gammon can you generate read-only API keys? For my projects, only CI on the `main` branch and on tags is allowed to write to the remote cache, but everyone (including external contributors) should be able to pull from it
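That policy can also be expressed with plain Gradle config; a sketch for `settings.gradle.kts`, assuming GitHub Actions conventions (`GITHUB_REF`, `CI`) and a placeholder cache URL:

```kotlin
// settings.gradle.kts -- remote cache is read-only for everyone except
// CI builds of the main branch or tags. GITHUB_REF and CI are GitHub
// Actions conventions; adjust for your CI provider.
val ref = System.getenv("GITHUB_REF").orEmpty()
val trustedRef = ref == "refs/heads/main" || ref.startsWith("refs/tags/")

buildCache {
    remote<HttpBuildCache> {
        url = uri("https://cache.example.com/cache/") // placeholder
        isPush = System.getenv("CI") == "true" && trustedRef
    }
}
```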
s
Yes! You can, using a new feature landing soon called Cache Projects. It lets you segment your cache objects, and those can be set to a public mode, which doesn't require an API key at all. We also have read-only API keys.
(But for OSS projects, the ability to distribute objects publicly is awesome)
c
Nice, the Gradle plugin README doesn't mention how to set it up 🙂
s
Ah, good catch 😅 The plugin is just a configuration client in this case, which is either detecting and using the Agent, locally, or the Cloud, if you have `BUILDLESS_APIKEY` set in your env
You can find the full doc here https://docs.less.build/docs/gradle
👀 1
we'll make sure to make that more prominent
c
Does this mean the Gradle plugin doesn't work if the user hasn't installed the Agent?
s
The plugin is designed to be inert if the Cloud or Agent are both unavailable or disabled, in which case it falls back to Gradle's built-in caching
the Agent is nice because you get a local in-memory cache, and it also helps handle network blips
if you're on a higher latency connection, it's also helpful that it defers uploads to the cloud (so uploads are always fast during your builds)
c
Ah, that's a shame; I would have expected the Gradle plugin to bundle the Agent and be able to start it automatically, rather than it being a separate install
s
That's great feedback and we could probably do that 🙂 I'll keep that in mind
c
> it's also helpful that it defers uploads to the cloud
can't that cause ordering issues if you have a task that tries to close a Sonatype repository?
s
I'm not sure I follow?
c
If the upload is deferred, a subsequent request (which depends on the previous one having finished) may fail?
s
We aren't actually taking control of any Sonatype publishing or any other uploads for that matter
c
Oh, it's specifically build cache uploads?
s
the Agent, even if you use it with newer dependency acceleration features, is only about downloads and build caching
Yes, sorry, build cache uploads are deferred
Between the local agent and cloud
c
I see, that's a great feature
s
Why, thank you 🙂
it's so helpful when people contribute to the cache, but unfair to ask people behind higher-latency links to do so all the time
this helps even that divide; sorry, I should have been clearer 😅
we are definitely still learning and figuring out where we can be most impactful
c
Do you expect all devs to contribute to the cache?
s
oh, no, it's something any individual dev can disable -- there are several switches (env, config, gradle script, etc)
c
At the moment I don't consider anything other than the main branch and tags to be trustworthy, to ensure I will never cache some broken code somehow
πŸ‘ 1
s
we do enable it by default, though, and hope to optimize it enough that they keep it on
we've had that concern too but it hasn't yet materialized
c
^ that is, the cache is always read-only on devs' machines
I see, good
s
not that it couldn't, i don't know if i fully trust gradle's keys
as in, we do see errant cache misses but i've never seen it mixup two pieces of code
c
Well, I much prefer a cache miss to it caching the wrong stuff
s
of course, and in some ways buildless is designed to be a database, but with relaxed requirements for consistency
that's what enables that deferred upload feature, for example
we prefer a cache miss over a slow large hit and download, for example, and sensible object caps are set for higher latency connections
in any case, build caching is still sort of half art and half science; thankfully Gradle's reports help, and our view is you should get all those options 🙂 but with sensible defaults that make it a little less thorny / miss-prone.
c
You mentioned the Agent is a GraalVM image, I think?
s
yes 😄 we are big graalvm fans as well
c
If I were to create a project that would benefit from build caching, should I include the Agent as a library, or separately start it as its own process and communicate with it however the Gradle plugin currently does?
("project" as in "something that looks like a build tool")
s
normally, the agent runs as a background service; you can start/stop/manage it with the same command line that hosts the agent itself
so it's a CLI and an agent: `buildless agent start` starts it, etc etc, and you can get stats with `buildless agent stats`
this is, effectively, a little web server, doing the caching and/or proxying; it's exposing JSON endpoints, which we plan to document
[Screenshot: Screenshot 2023-12-21 at 2.18.50 AM.png]
quick run of `buildless --help` on latest
c
Do you have stats on the Agent vs local Gradle build cache? When not connected to the cloud at all, is the agent still faster?
s
we don't have super clear data yet, the agent is pretty new -- we're on `rc2`, that's why we're looking for beta users 🙂
truthfully, i don't know where the agent will sit as compared to local on-disk caching; but, the drawback of on-disk is that it can never be shared
πŸ‘ 1
soon, you'll be able to use the agent entirely for free, then upload to a free-tier cloud account, etc., and share it with friends within reasonable limits
so at least there is an escape hatch there, you know? but yeah, in both cases you're bound by local resources and it's disk vs disk anyway, since the agent will try to use a unix socket where it can
soon we can support HTTP/2 or HTTP/3 from the gradle plugin itself, which might impact that scenario
c
Do you expect to have a tier with unlimited read-only access for unauthenticated users? (for OSS contributors)
s
but it requires much deeper changes to gradle
hm, "expect" is hard to promise, but that's what we're shooting for 🙂
πŸ‘ 1
if we got the adoption and people were into it, certain companies or investors are already interested
so... call it a, idk, fuzzy yes? a hopeful yes 😄
c
Makes sense, thanks for the info 🙂
s
of course 🙂 thanks for asking, it really helps to hear where people's thoughts are
c
Personally, I have two kinds of projects:
• OSS: it's important that it also works for contributors, even if they haven't set up anything
• proprietary: a SaaS cache is a tough sell…
If it works out, I'll probably end up trying it with OSS projects, and then giving some kind of presentation at my day job
s
okay! 😄 thanks, that would be huge! on both points. Cloud is free during beta, so if you want keys to try it for OSS projects, drop me a line at sam@less.build and we can get them provisioned for you. re/security, I totally understand the concern there but we want to get it right.
build cache objects, i think, can be intrinsically trusted in many cases, because it's a content-addressable hash under the hood which is, in essence, self-verifying
that's a qualified sentence as in, it may or may not apply to this or that tool
😂 1
not calling out names lol
but, you know, we want to leverage that and make sure things are completely encrypted, verified, signed, etc, and transparent
c
Sorry, I wasn't clear 😅 My worry was more about external actors accessing the codebase/artifacts than corrupting them
s
yes, they should be encrypted so we cannot read them
we actually don't need to; any telemetry readable within the gradle cache blob could be transmitted otherwise
and we don't want to lol
πŸ‘ 1
we can do all sorts of compression, or what not, without ever understanding or decrypting those blobs.
(so long as we pick algorithms correctly)
but, i digress; the point is, that part needs to be carefully done so it's trustable.
c
Do you have a worry that people will start using your service to upload anything they want?
s
let me give one example: our dependency proxy will (hopefully) soon gain sigstore verification support
hm, i mean that's always a risk with UGC for sure; we apply reasonable limits and protections where we can, like any saas business, and writes are always identified anyway
πŸ‘ 1
if someone wanted to distribute something nasty, they would probably have an easier time doing it elsewhere; but, again, we're always working to strengthen our posture there
πŸ‘ 1
(for instance, we can speak various protocols, but we deliberately refuse to serve to browsers)