Hi, folks! I have a not-so-great update about our ...
# apollo-kotlin
e
Hi, folks! I have a not-so-great update about our usage of data builders 😞 While they are great at reducing the amount of code we need to maintain for tests, we just noticed that the build times have increased by ~40% by enabling them 😞 Is this expected? More specifically, Android Lint tasks take considerably longer (this I don't really understand, Lint should ignore generated sources), as well as compilation tasks (which is more understandable, but it still feels like the jump in compilation time is too high). You can see build scans side-by-side comparison of longest tasks in the screenshot 🧵
Before and after:
m
This is unexpected. I would expect the data builder to generate less code... To be more precise, there's an initial cost of data builders because they generate builders for used schema types but the marginal cost as you add more operation should be negligeable
Test builders on the other hand generate their models for each and every operation so the initial cost is low but it grows with the sum of size of all the possible responses of your queries (that can be pretty large)
While data builders are basically o(size of schema)
I'll try to reproduce on a sample project. Thanks for providing the build scans, that helps a ton 🙏
e
thanks for looking into it!
our GQL schema seems to have 588 types, is that considered large?
m
Sounds reasonnable to me!
It should generate only those types that are actually used, maybe I missed something there
will double check
Out of curiosity, is it an auto-generated schema from a SQL database (hasura maybe?) or something like this? We've seen edge cases with those in the past
e
no, they come from our Python backend that uses Graphene
m
Gotcha, thanks for the info
e
hmm, just by looking at some random types, I can see even ones that are not used in the app have builders generated
m
Alright, it's coming from here
I wonder if it wouldn't just be easier to embed the SDL schema and parse it at runtime time...
What's happening here is that data builders need
Schema.possibleTypes()
to loop through possible interface types
And that in turn generates a small class containing the name of the type and its kind (object, interface, union, etc...)
And also the builder because we put it in the same file 🤔
e
oh, okay. if I am looking at this correctly,
possibleTypes()
is only used by the
DefaultFakeResolver
in
resolveTypename()
. wouldn't it be feasible to have only the used types then? or am I missing something?
m
I'm pretty sure we can only generate the used types. The only thing is if anyone is relying on the current behaviour of generating everything
But they would have a simple enough workaround (using
alwaysGenerateTypesMatching
). I'll make a PR to remove this
e
big disclaimer about breaking changes in the release notes? data builders are experimental either way, so such a change would be acceptable imo 😄
m
Yea, agreed. A note in the release notes should do it 👍
e
not sure how familiar you are with Android and Android Lint, but do you have any hunch why linting tasks were also taking longer, although generated code should be ignored?
m
I'm not really sure. Maybe it doesn't ignore generated code?
e
yeah, could be. thanks for the quick fix!
m
Sure thing. Thanks for reporting this!
e
Hmm, in anticipation of the above PR getting shipped, I tried the proposed fix locally, but it doesn't seem to work.
Schema
still contains all the types, including the ones that are not used. Not sure why, but it looks like this won't help 😞
m
I'm guessing you have a
Node
interface or so that almost every object implements?
e
we do have a
Node
interface, but only 7 of our types implement it
m
Maybe another interface or union that has a lot of implementations?
The problem is when accessing an interface like this:
Copy code
{ 
  node {
    id
    # more fields
  }
}
e
no, not really, we have very few types that implement anything.
ah, let me see
m
If you do the query above, then we have to generate builders for all implementations because you could do
Copy code
Data {
  node = buildCat {
  }
  // or
  node = buildDog {
  }
  // etc, for all types
}
But if you use very little interfaces, it might be something else...
e
could it be because the
Mutation
and
Query
types basically include all the operations with all the types?
m
Shouldn't be. It's only based on the operations so if you never query these root fields, they shouldn't be generated
e
I can see
Query.type
being used in the
Query<>
classes.
Copy code
public override fun rootField(): CompiledField = CompiledField.Builder(
    name = "data",
    type = com.wave.backend.type.Query.type
  )
We use the
compat
codegen method, in case it makes a difference.
m
We use the
compat
codegen method, in case it makes a difference.
it shouldn't
I can see
Query.type
being used in the
Query<>
classes
Query
and
Mutation
will always be generated
e
oh, I wonder how could I debug what gets added and why, thanks for that link!
So you were spot on
That's 100% because the
Mutation
and
Query
types basically include all the operations with all the types? (e
Damn
We would need to track field usages and not only types...
Out of curiosity, how many operations do you have in your codebase?
e
our schema has 237 mutations and 59 queries, and in our 4 apps (which are under a single Gradle project) we have 101 mutation and 81 query .graphql files, with many overlaps
m
Thanks! So now we generate 588 data builder classes (without nested classes), to be compared with ~296 test builder classes (with nested classes)
I see no way around tracking field usage to solve this how my...
It's getting late here but I'll take a look tomorrow!
e
of course! thanks for spending the time to diagnose this!
m
Forgot to post the link here. Stab at this is there: https://github.com/apollographql/apollo-kotlin/pull/4472
I've added more tests for common use cases like unused type, unused field, field on interface type but the test schema is nowhere near complete as yours so would be awesome if you could try in your environment
It would also tell us how much time that saves, hopefully that makes it on par with the test builders or even better
Also, if you're open to it, we can add your schema and operations to our test suite (either public or internal one)
e
Hey! I saw the PR, I was testing this as you wrote 😄 Number of generated types got down to a third of what it was before 😄
In terms of build time, not sure if I can test this reliably without a snapshot build published on sonatype, so I can test on the CI. my laptop has very different build times depending on how much thermal throttling it suffers 🙃
m
Makes sense. I'll merge this once our own CI is
It's merged, now we need CI#2 to kickoff and publish to the SNAPSHOTs 😅
Sorry for the delays. Compiling native and running Gradle tests is 🐌
e
no worries, there's no rush on my end 😄
however, I just realised now, I think that there is another problem
now I get a bunch of "unresolved reference" errors in the other modules
m
Ah
Multi modules
e
our project has 4 application modules, and one shared library module
it seems like only the types used in the shared modules are generated, but not the ones only used in the app ones
m
I see
e
sorry for making things so complicated xD
m
No worries!
Multi module is indeed a bit complicated, I always forget this case
The problem being. If you're using a
User
in the
schema
module and using
User.firstName
there
But only using
User.lastName
in the
feature1
module...
We now need to track field usages accross all modules...
e
yes, then
UserBuilder
doesn't contain a
lastName
field 🙃
m
Exactly
track field usages accross all modules...
That means we configure and run codegen in all modules. Kindof defeats the isolation benefit of modules 😕
e
is that even possible? I mean, the app modules depend on the shared library one that has the schema, but I don't know how Apollo works
m
I guess it's possible but not very pretty as changing a thing in
feature1
might trigger a recompilation in
feature2
I'll try something, might be useful for other multi-modules scenarios
Might break the project isolation thingie completely but there could be a fallback
Small update there: looks like it's going to work but is a significant change to how the plugin is wired so might take a bit of time. I'll post here the updates
e
Thanks for the update! Let me know if there's anything I can do to support. Cheers!
Hi, @mbonnin! I saw that there is a new version of Apollo 👏 I wanted to ask you, would the new "usedCoordinates auto detection" feature help with generating just what's needed in the schema when using data builders?
m
Yes! Sorry I meant to reach out about this. There are two things that could help you in this release: • https://github.com/apollographql/apollo-kotlin/pull/4494 -> to detect all used types automatically • https://github.com/apollographql/apollo-kotlin/pull/4486 -> to prevent lint for scanning all the generated source
The (data builders reusing is still in the works)
I'm not 100% happy with the way the
usedCoordinates
turned out. It's a lot of manual Gradle configuration. I'm currently looking into ways to make this process a bit more smooth (you currently have to doubly link the schema and feature modules)
But it should make it possible to detect all the used types and fields (coordinates) at the price of some Gradle config
Let me know what you think!
e
This looks great! Yeah, all the manual config might not scale well for some people (I know companies with dozens of modules – on the other hand, they usually have custom plugins that you apply once per module, taking care of all the config), but for us, with only a few modules, it's totally fine. Thank you so much, I will give this a try later today! 🙂