# compose
q
I'm experimenting with a use of the compose runtime. For a quick viability check, I whipped up a test scenario similar to the https://github.com/krausest/js-framework-benchmark tests. If I try to add 10,000 items (with each item being a tree of 8 compose nodes) the compose runtime spends a very long time in Compose:applyChanges, around 57 seconds (see attached flame graph). Trying with only 1,000 items the runtime is much reduced, at less than a second, leading me to believe I might be running into a performance cliff of the slot table implementation.
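For context, a minimal stdlib-only sketch of what such a benchmark dataset might look like; `Item` and `buildData` are illustrative names, not from the original code, mirroring the way js-framework-benchmark builds its rows:

```kotlin
// Hypothetical model of one benchmark row (names are mine, not from the
// original code). The real test renders each item as a tree of 8 compose nodes.
data class Item(val id: Int, var label: String)

// Build the flat list of rows the iteration loop below walks over.
fun buildData(count: Int): List<Item> =
    (1..count).map { Item(id = it, label = "row $it") }

fun main() {
    val data = buildData(10_000)
    println("built ${data.size} items, first = ${data.first()}")
}
```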
I've found a possibly related bug report here: https://issuetracker.google.com/issues/246751843
j
Super interested in this. We're running Compose with a custom tree in a JS VM for work. One of the things I've done around the Compose runtime is switch to JS-native collections. The Kotlin collections, and collections written in Kotlin around things like Kotlin's Array (even though it bottoms out in a JS array), are pretty slow.
q
Very interesting. When I change my main iteration loop from
```kotlin
for (item in data) {
    key(item.id) {
        Row(item.id == selected, item, { selected = item.id }, {
            data.removeAt(data.indexOfFirst { it.id == item.id })
        })
    }
}
```
to a chunked one, like this
```kotlin
for (chunk in data.chunked(100)) {
    for (item in chunk) {
        key(item.id) {
            Row(item.id == selected, item, { selected = item.id }, {
                data.removeAt(data.indexOfFirst { it.id == item.id })
            })
        }
    }
}
```
the performance is vastly improved (from 80 seconds down to 2 seconds)
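As a quick sanity check that the two loops emit identical rows and only the grouping differs, `chunked(100)` wraps the same 10,000 items in 100 groups of 100 without reordering anything:

```kotlin
// Stdlib-only check: chunking changes the grouping, not the content.
fun main() {
    val data = (1..10_000).toList()
    val chunks = data.chunked(100)
    println("${chunks.size} chunks of ${chunks.first().size}")
    check(chunks.flatten() == data)  // same rows, same order
}
```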
s
interesting, I wouldn't expect these two to have a significant difference, as it results in the same number of rows in the end. Could you capture a trace of both?
also, looking at the trace in the original comment, it looks like a lot of the time is spent in a binary search over `ArrayList`, which might not be optimized for this in the JS implementation.
q
Here's a flamegraph for the fast version; while most of the time is still spent in `SlotWriter.moveFrom`, the whole applyChanges pass is now just 1200ms.
s
oh wait, it is the Java impl 😄
q
Also not sure if it was clear, but this behavior is observed running on the JVM, not with JS
s
my bad 🙂
yeah, this is surprising in general, i'd expect these two to be equivalent in perf. Most probably there's a different slot table structure generated by the compiler which makes it perform better.
q
Decompiling the class file, it seems that in the chunked version there is an extra startReplaceableGroup/endReplaceableGroup wrapping each chunk. Could it be that the case where a node has a huge number of children is not well optimized, leading to some n^2 or worse perf?
s
I think some node pointer tracking doesn't scale well in that case. Feel free to file a bug or comment with your use case on the bug you referenced above, and I'll share it internally.
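A toy cost model of that intuition (purely illustrative, not the actual slot table code): if per-child bookkeeping in a group with n children also has to touch the remaining siblings, total work grows quadratically in n, while chunking into groups of k bounds each group's work by roughly k²:

```kotlin
// Toy model, NOT the real SlotTable: assume handling each child costs as
// much as shifting all remaining siblings (like repeated removeAt(0) on
// an ArrayList of that size).
fun shiftCost(children: Int): Long {
    var cost = 0L
    for (i in 0 until children) cost += (children - i).toLong()
    return cost
}

fun main() {
    val flat = shiftCost(10_000)        // one group with 10,000 children
    val chunked = 100 * shiftCost(100)  // 100 groups of 100 children each
    println("flat=$flat chunked=$chunked")  // flat is ~99x the chunked cost
}
```

Under this (assumed) model, the 10,000-child group costs about 50M shift steps versus about 0.5M for the chunked layout, which is at least the right shape for an 80s-to-2s improvement.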
q
Will do. If I create even more nested loops (chunking by 1000/100/10) the performance is again improved, now spending most of the time in user code. Would it be helpful if I submitted the bug with sample code? If that's the case, I would also create a demo repository showing this behavior.
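The 1000/100/10 nesting described above can be sketched with stdlib calls only (`nestedChunks` is an illustrative name, not from the original code); each level caps how many direct children any single group has:

```kotlin
// Illustrative stdlib-only sketch of chunking by 1000/100/10: every group
// at every level ends up with at most 10 direct children.
fun nestedChunks(data: List<Int>): List<List<List<List<Int>>>> =
    data.chunked(1000).map { big ->
        big.chunked(100).map { mid -> mid.chunked(10) }
    }

fun main() {
    val data = (1..10_000).toList()
    val nested = nestedChunks(data)
    // 10 top-level groups, each holding 10 groups of 10 groups of 10 items
    println("${nested.size} x ${nested.first().size} x ${nested.first().first().size}")
    check(nested.flatten().flatten().flatten() == data)  // same rows, same order
}
```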
s
yeah, snippets or sample repo are definitely helpful
q