Hi folks I have a not so great update about our usage of dat kotlinlang #apollo-kotlin

Eduard Boloș

10/20/2022, 10:38 AM

Hi, folks! I have a not-so-great update about our usage of data builders 😞 While they are great at reducing the amount of code we need to maintain for tests, we just noticed that our build times have increased by ~40% after enabling them 😞 Is this expected? More specifically, Android Lint tasks take considerably longer (this I don't really understand, Lint should ignore generated sources), as do compilation tasks (which is more understandable, but the jump in compilation time still feels too high). You can see a side-by-side build scan comparison of the longest tasks in the screenshot 🧵

Eduard Boloș

10/20/2022, 10:38 AM

Before and after:

mbonnin

10/20/2022, 10:45 AM

This is unexpected. I would expect the data builders to generate less code... To be more precise, there's an initial cost to data builders because they generate builders for used schema types, but the marginal cost as you add more operations should be negligible

mbonnin

10/20/2022, 10:46 AM

Test builders on the other hand generate their models for each and every operation so the initial cost is low but it grows with the sum of size of all the possible responses of your queries (that can be pretty large)

mbonnin

10/20/2022, 10:47 AM

While data builders are basically O(size of schema)

mbonnin

10/20/2022, 10:47 AM

I'll try to reproduce on a sample project. Thanks for providing the build scans, that helps a ton 🙏

Eduard Boloș

10/20/2022, 10:48 AM

thanks for looking into it!

Eduard Boloș

10/20/2022, 10:48 AM

our GQL schema seems to have 588 types, is that considered large?

mbonnin

10/20/2022, 10:49 AM

Sounds reasonable to me!

mbonnin

10/20/2022, 10:49 AM

It should generate only those types that are actually used, maybe I missed something there

mbonnin

10/20/2022, 10:49 AM

will double check

mbonnin

10/20/2022, 10:50 AM

Out of curiosity, is it an auto-generated schema from a SQL database (hasura maybe?) or something like this? We've seen edge cases with those in the past

Eduard Boloș

10/20/2022, 10:51 AM

no, they come from our Python backend that uses Graphene

mbonnin

10/20/2022, 10:52 AM

Gotcha, thanks for the info

Eduard Boloș

10/20/2022, 10:53 AM

hmm, just by looking at some random types, I can see even ones that are not used in the app have builders generated

mbonnin

10/20/2022, 10:59 AM

Alright, it's coming from here

mbonnin

10/20/2022, 11:00 AM

I wonder if it wouldn't just be easier to embed the SDL schema and parse it at runtime...

mbonnin

10/20/2022, 11:01 AM

What's happening here is that data builders need Schema.possibleTypes() to loop through possible interface types

mbonnin

10/20/2022, 11:02 AM

And that in turn generates a small class containing the name of the type and its kind (object, interface, union, etc...)

mbonnin

10/20/2022, 11:02 AM

And also the builder because we put it in the same file 🤔
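To illustrate the possibleTypes() lookup described above, here is a minimal Kotlin sketch. SchemaType, SimpleSchema and this resolveTypename are hypothetical stand-ins, not Apollo Kotlin's actual classes, and unions are left out for brevity. It only shows why a schema-wide lookup needs metadata (name and kind) for every type, including ones no operation touches.

```kotlin
// Hypothetical, simplified model; not Apollo Kotlin's real classes.
enum class Kind { OBJECT, INTERFACE }

data class SchemaType(
    val name: String,
    val kind: Kind,
    val implements: List<String> = emptyList(),
)

class SimpleSchema(private val types: List<SchemaType>) {
    // Every object type that can stand in for `typeName` at runtime.
    fun possibleTypes(typeName: String): List<SchemaType> {
        val type = types.first { it.name == typeName }
        return when (type.kind) {
            Kind.OBJECT -> listOf(type)
            Kind.INTERFACE -> types.filter { it.kind == Kind.OBJECT && typeName in it.implements }
        }
    }
}

// A fake resolver needs a concrete __typename for an abstract field,
// so it picks one of the possible types (here: the first one by name).
fun resolveTypename(schema: SimpleSchema, fieldType: String): String =
    schema.possibleTypes(fieldType).minOf { it.name }

fun main() {
    val schema = SimpleSchema(
        listOf(
            SchemaType("Node", Kind.INTERFACE),
            SchemaType("Cat", Kind.OBJECT, implements = listOf("Node")),
            SchemaType("Dog", Kind.OBJECT, implements = listOf("Node")),
            SchemaType("Unused", Kind.OBJECT), // still listed, even if no operation uses it
        )
    )
    println(schema.possibleTypes("Node").map { it.name }) // [Cat, Dog]
    println(resolveTypename(schema, "Node"))              // Cat
}
```

In this model the lookup scans the full type list, which is the reason every type ends up with a generated class when its builder lives in the same file.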

Eduard Boloș

10/20/2022, 12:19 PM

oh, okay. if I am looking at this correctly, possibleTypes() is only used by the DefaultFakeResolver resolveTypename(). wouldn't it be feasible to have only the used types then? or am I missing something?

mbonnin

10/20/2022, 12:20 PM

I'm pretty sure we can only generate the used types. The only thing is if anyone is relying on the current behaviour of generating everything

mbonnin

10/20/2022, 12:21 PM

But they would have a simple enough workaround (using alwaysGenerateTypesMatching). I'll make a PR to remove this
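For anyone relying on the generate-everything behaviour, the workaround mentioned above could look roughly like this in a Gradle Kotlin DSL build file. This is a sketch assuming the Apollo Kotlin 3.x Gradle plugin is applied; the service name and package name are placeholders, not values from this thread.

```kotlin
// build.gradle.kts (sketch; "service" and "com.example" are placeholders)
apollo {
  service("service") {
    packageName.set("com.example")
    generateDataBuilders.set(true)
    // Regexes matched against schema type names; ".*" keeps every type.
    alwaysGenerateTypesMatching.set(listOf(".*"))
  }
}
```

A narrower regex list would keep only the extra types that are actually needed.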

Eduard Boloș

10/20/2022, 12:21 PM

big disclaimer about breaking changes in the release notes? data builders are experimental either way, so such a change would be acceptable imo 😄

mbonnin

10/20/2022, 12:22 PM

Yea, agreed. A note in the release notes should do it 👍

Eduard Boloș

10/20/2022, 12:25 PM

not sure how familiar you are with Android and Android Lint, but do you have any hunch why linting tasks were also taking longer, although generated code should be ignored?

mbonnin

10/20/2022, 12:51 PM

I'm not really sure. Maybe it doesn't ignore generated code?

mbonnin

10/20/2022, 12:51 PM

PR: https://github.com/apollographql/apollo-kotlin/pull/4471

Eduard Boloș

10/20/2022, 12:56 PM

yeah, could be. thanks for the quick fix!

mbonnin

10/20/2022, 12:57 PM

Sure thing. Thanks for reporting this!

Eduard Boloș

10/20/2022, 3:13 PM

Hmm, in anticipation of the above PR getting shipped, I tried the proposed fix locally, but it doesn't seem to work. Schema still contains all the types, including the ones that are not used. Not sure why, but it looks like this won't help 😞

mbonnin

10/20/2022, 4:03 PM

I'm guessing you have a Node interface or so that almost every object implements?

Eduard Boloș

10/20/2022, 4:08 PM

we do have a Node interface, but only 7 of our types implement it

mbonnin

10/20/2022, 4:09 PM

Maybe another interface or union that has a lot of implementations?

mbonnin

10/20/2022, 4:10 PM

The problem is when accessing an interface like this:

{ 
  node {
    id
    # more fields
  }
}

Eduard Boloș

10/20/2022, 4:10 PM

no, not really, we have very few types that implement anything.

Eduard Boloș

10/20/2022, 4:10 PM

ah, let me see

mbonnin

10/20/2022, 4:11 PM

If you do the query above, then we have to generate builders for all implementations because you could do

Data {
  node = buildCat {
  }
  // or
  node = buildDog {
  }
  // etc, for all types
}

mbonnin

10/20/2022, 4:12 PM

But if you use very few interfaces, it might be something else...

Eduard Boloș

10/20/2022, 4:15 PM

could it be because the Mutation and Query types basically include all the operations with all the types?

mbonnin

10/20/2022, 4:16 PM

Shouldn't be. It's only based on the operations so if you never query these root fields, they shouldn't be generated

Eduard Boloș

10/20/2022, 4:18 PM

I can see Query.type being used in the Query<> classes.

public override fun rootField(): CompiledField = CompiledField.Builder(
    name = "data",
    type = com.wave.backend.type.Query.type
  )

We use the compat codegen method, in case it makes a difference.

mbonnin

10/20/2022, 4:19 PM

"We use the compat codegen method, in case it makes a difference."

it shouldn't

mbonnin

10/20/2022, 4:21 PM

The code that adds all the implementations is there: https://github.com/apollographql/apollo-kotlin/blob/05f2a3295517fb7a64c2259c38b453[…]/main/kotlin/com/apollographql/apollo3/compiler/ir/IrBuilder.kt

mbonnin

10/20/2022, 4:22 PM

"I can see Query.type being used in the Query<> classes"

Query and Mutation will always be generated

Eduard Boloș

10/20/2022, 4:24 PM

oh, I wonder how could I debug what gets added and why, thanks for that link!

mbonnin

10/20/2022, 4:26 PM

Well a few lines below is this: https://github.com/apollographql/apollo-kotlin/blob/05f2a3295517fb7a64c2259c38b453[…]/main/kotlin/com/apollographql/apollo3/compiler/ir/IrBuilder.kt

mbonnin

10/20/2022, 4:26 PM

So you were spot on

mbonnin

10/20/2022, 4:26 PM

That's 100% because the Mutation and Query types basically include all the operations with all the types

mbonnin

10/20/2022, 4:26 PM

Damn

mbonnin

10/20/2022, 4:28 PM

We would need to track field usages and not only types...
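The difference between tracking types and tracking fields can be sketched with a toy model. Everything below is hypothetical (the SchemaFields alias, both functions, and the sample schema) and is not how IrBuilder.kt is written; it only shows why an always-generated Query root drags in the whole schema under type-level tracking.

```kotlin
// Hypothetical model: each type maps its field names to the type they return.
typealias SchemaFields = Map<String, Map<String, String>>

// Type-level tracking: once a type is used, all of its fields count as used,
// so everything reachable from it gets generated as well.
fun reachableByType(schema: SchemaFields, roots: Set<String>): Set<String> {
    val seen = mutableSetOf<String>()
    val queue = ArrayDeque(roots)
    while (queue.isNotEmpty()) {
        val type = queue.removeFirst()
        // Skip scalars (not in the map) and types already visited.
        if (type !in schema || !seen.add(type)) continue
        schema.getValue(type).values.forEach { queue.addLast(it) }
    }
    return seen
}

// Field-level tracking: only follow coordinates ("Type.field") that operations select.
fun reachableByField(schema: SchemaFields, usedCoordinates: Set<String>): Set<String> =
    usedCoordinates.flatMap { coordinate ->
        val (type, field) = coordinate.split(".")
        listOfNotNull(type, schema[type]?.get(field))
    }.filter { it in schema }.toSet()

fun main() {
    val schema: SchemaFields = mapOf(
        "Query" to mapOf("user" to "User", "order" to "Order", "invoice" to "Invoice"),
        "User" to mapOf("name" to "String"),
        "Order" to mapOf("id" to "ID"),
        "Invoice" to mapOf("id" to "ID"),
    )
    // Query is always generated, so type-level tracking reaches all 4 types.
    println(reachableByType(schema, setOf("Query")).size) // 4
    // The operations only select Query.user and User.name: 2 types.
    println(reachableByField(schema, setOf("Query.user", "User.name")).size) // 2
}
```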

mbonnin

10/20/2022, 4:29 PM

Out of curiosity, how many operations do you have in your codebase?

Eduard Boloș

10/20/2022, 4:33 PM

our schema has 237 mutations and 59 queries, and in our 4 apps (which are under a single Gradle project) we have 101 mutation and 81 query .graphql files, with many overlaps

mbonnin

10/20/2022, 4:36 PM

Thanks! So now we generate 588 data builder classes (without nested classes), to be compared with ~296 test builder classes (with nested classes)

mbonnin

10/20/2022, 4:37 PM

I see no way around tracking field usage to solve this how my...

mbonnin

10/20/2022, 4:46 PM

It's getting late here but I'll take a look tomorrow!

Eduard Boloș

10/20/2022, 4:47 PM

of course! thanks for spending the time to diagnose this!

mbonnin

10/21/2022, 12:06 PM

Forgot to post the link here. Stab at this is there: https://github.com/apollographql/apollo-kotlin/pull/4472

mbonnin

10/21/2022, 12:07 PM

I've added more tests for common use cases like unused type, unused field, field on interface type, but the test schema is nowhere near as complete as yours, so it would be awesome if you could try it in your environment

mbonnin

10/21/2022, 12:07 PM

It would also tell us how much time that saves, hopefully that makes it on par with the test builders or even better

mbonnin

10/21/2022, 12:11 PM

Also, if you're open to it, we can add your schema and operations to our test suite (either public or internal one)

Eduard Boloș

10/21/2022, 12:11 PM

Hey! I saw the PR, I was testing this as you wrote 😄 Number of generated types got down to a third of what it was before 😄

Eduard Boloș

10/21/2022, 12:13 PM

In terms of build time, not sure if I can test this reliably without a snapshot build published on sonatype, so I can test on the CI. my laptop has very different build times depending on how much thermal throttling it suffers 🙃

mbonnin

10/21/2022, 12:18 PM

Makes sense. I'll merge this once our own CI is ✅

mbonnin

10/21/2022, 12:47 PM

It's merged, now we need CI#2 to kickoff and publish to the SNAPSHOTs 😅

mbonnin

10/21/2022, 12:48 PM

Sorry for the delays. Compiling native and running Gradle tests is 🐌

Eduard Boloș

10/21/2022, 1:00 PM

no worries, there's no rush on my end 😄

Eduard Boloș

10/21/2022, 1:01 PM

however, I just realised now, I think that there is another problem

Eduard Boloș

10/21/2022, 1:01 PM

now I get a bunch of "unresolved reference" errors in the other modules

mbonnin

10/21/2022, 1:02 PM

Multi modules

Eduard Boloș

10/21/2022, 1:02 PM

our project has 4 application modules, and one shared library module

Eduard Boloș

10/21/2022, 1:02 PM

it seems like only the types used in the shared modules are generated, but not the ones only used in the app ones

mbonnin

10/21/2022, 1:03 PM

I see

Eduard Boloș

10/21/2022, 1:03 PM

sorry for making things so complicated xD

mbonnin

10/21/2022, 1:04 PM

No worries!

mbonnin

10/21/2022, 1:04 PM

Multi module is indeed a bit complicated, I always forget this case

mbonnin

10/21/2022, 1:05 PM

The problem being: if you're using a User in the schema module and using User.firstName there

mbonnin

10/21/2022, 1:05 PM

But only using User.lastName in the feature1 module...

mbonnin

10/21/2022, 1:07 PM

We now need to track field usages across all modules...

Eduard Boloș

10/21/2022, 1:07 PM

yes, then UserBuilder doesn't contain a lastName field 🙃

mbonnin

10/21/2022, 1:07 PM

Exactly

mbonnin

10/21/2022, 1:08 PM

"track field usages across all modules..."

That means we configure and run codegen in all modules. Kind of defeats the isolation benefit of modules 😕
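The User.firstName / User.lastName case above can be modelled in a few lines. This is only an illustrative sketch: builderFields and the per-module coordinate sets are made up for the example and do not reflect how the Gradle plugin passes data between modules.

```kotlin
// Hypothetical helper: which fields a generated builder for `type` would expose,
// given the coordinates ("Type.field") that the codegen knows to be used.
fun builderFields(usedCoordinates: Set<String>, type: String): Set<String> =
    usedCoordinates
        .filter { it.substringBefore('.') == type }
        .map { it.substringAfter('.') }
        .toSet()

fun main() {
    // Coordinates selected by the operations of each Gradle module (made-up data).
    val usedByModule = mapOf(
        "schema" to setOf("Query.user", "User.firstName"),
        "feature1" to setOf("Query.user", "User.lastName"),
    )

    // Codegen that only sees the schema module's own operations:
    println(builderFields(usedByModule.getValue("schema"), "User")) // [firstName]

    // Codegen that sees the union of every module's usages:
    val allUsed = usedByModule.values.flatten().toSet()
    println(builderFields(allUsed, "User")) // [firstName, lastName]
}
```

The union is what makes the schema module's output depend on every feature module, which is the isolation cost discussed in the thread.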

Eduard Boloș

10/21/2022, 1:08 PM

is that even possible? I mean, the app modules depend on the shared library one that has the schema, but I don't know how Apollo works

mbonnin

10/21/2022, 1:10 PM

I guess it's possible but not very pretty as changing a thing in feature1 might trigger a recompilation in feature2

mbonnin

10/21/2022, 1:31 PM

I'll try something, might be useful for other multi-modules scenarios

mbonnin

10/21/2022, 1:32 PM

Might break the project isolation thingie completely but there could be a fallback

mbonnin

10/24/2022, 2:02 PM

Small update there: looks like it's going to work, but it's a significant change to how the plugin is wired so it might take a bit of time. I'll post updates here

Eduard Boloș

10/25/2022, 9:52 AM

Thanks for the update! Let me know if there's anything I can do to support. Cheers!

Eduard Boloș

11/10/2022, 10:09 AM

Hi, @mbonnin! I saw that there is a new version of Apollo 👏 I wanted to ask you, would the new "usedCoordinates auto detection" feature help with generating just what's needed in the schema when using data builders?

mbonnin

11/10/2022, 10:19 AM

Yes! Sorry I meant to reach out about this. There are two things that could help you in this release:
• https://github.com/apollographql/apollo-kotlin/pull/4494 -> to detect all used types automatically
• https://github.com/apollographql/apollo-kotlin/pull/4486 -> to prevent lint from scanning all the generated source

mbonnin

11/10/2022, 10:20 AM

(The data builders reusing is still in the works)

mbonnin

11/10/2022, 10:21 AM

I'm not 100% happy with the way the usedCoordinates turned out. It's a lot of manual Gradle configuration. I'm currently looking into ways to make this process a bit more smooth (you currently have to doubly link the schema and feature modules)

mbonnin

11/10/2022, 10:22 AM

But it should make it possible to detect all the used types and fields (coordinates) at the price of some Gradle config

mbonnin

11/10/2022, 10:22 AM

Let me know what you think!

Eduard Boloș

11/10/2022, 10:29 AM

This looks great! Yeah, all the manual config might not scale well for some people (I know companies with dozens of modules – on the other hand, they usually have custom plugins that you apply once per module, taking care of all the config), but for us, with only a few modules, it's totally fine. Thank you so much, I will give this a try later today! 🙂
