# io
z
Is there any special attention that should be paid to loading large files (200MB+) into the browser with `Buffer`? What I'm seeing is that, on the JVM, I can read the file in about 2.5 seconds. In the browser, running the same verbatim code takes about 40 seconds and has much higher/faster memory consumption.
It actually might be deserializing inline classes in JS that's taking so much time
f
Could you please clarify how you load the files? Initially, I thought about reading files using `SystemFileSystem`, and for JS it's indeed tremendously slow and could be improved. But `SystemFileSystem` is not supported in the browser.
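For reference, by reading with `SystemFileSystem` I mean something along these lines (a minimal sketch with a placeholder path), which on the JS target goes through Node and isn't available in the browser:

```kotlin
import kotlinx.io.buffered
import kotlinx.io.files.Path
import kotlinx.io.files.SystemFileSystem
import kotlinx.io.readByteArray

// Minimal sketch: open a file through SystemFileSystem and read it fully.
// "data.bin" is just a placeholder path.
fun readAllBytes(path: String): ByteArray {
    val source = SystemFileSystem.source(Path(path)).buffered()
    try {
        return source.readByteArray()
    } finally {
        source.close()
    }
}
```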
z
Got a couple parts at play. I have a Kotlin Multiplatform JS project using the React wrapper. This project has a React component, which boils down to an `<input type="file">`. It has an `onChange` listener that attaches a `FileReader`. The `FileReader` has an `onload` event that reads the file as an `ArrayBuffer`. After that, I make an `Int8Array(ArrayBuffer)` to get a plain `ByteArray`. Then, as a sanity check, I actually iterate through the 249MB file and `xor` every byte with the last. All of that is the fast part.
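That flow boils down to roughly this (a simplified sketch, not the exact component code; the final cast works because Kotlin/JS backs `ByteArray` with an `Int8Array`):

```kotlin
import org.khronos.webgl.ArrayBuffer
import org.khronos.webgl.Int8Array
import org.w3c.files.File
import org.w3c.files.FileReader

// Simplified sketch of the load path; the real code sits inside the React onChange handler.
// The unsafeCast avoids a copy because Kotlin/JS represents ByteArray as an Int8Array.
fun readFileAsByteArray(file: File, onLoaded: (ByteArray) -> Unit) {
    val reader = FileReader()
    reader.onload = {
        val buffer = reader.result as ArrayBuffer
        onLoaded(Int8Array(buffer).unsafeCast<ByteArray>())
    }
    reader.readAsArrayBuffer(file)
}
```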
Then I convert that `ByteArray` to a `ByteString` and perform the same `xor` test. Again, still pretty fast.
But then, for some reason, when I go to deserialize ~4 million 4-byte objects from this `ByteString`, it takes about 80 seconds. Iterating over every byte takes milliseconds. It's a custom format, so I'm sure it's something I'm doing, but running the same code on the JVM is, like, immediately faster in all ways.
In the browser, I get an output like this.
4387994, 50494, 869688, 3467813, 1m 20.358s, 457ms, 6.555s, 5.44s, 22.593s, 0s, 0s
• 4,387,994 4-byte inline strings
• 50,494 4-byte inline ints
• 869,688 different 4-byte inline ints
• 3,467,813 2-byte inline shorts
• 1m20s total time to read 4,387,994 (first)
• 6.555s total time to read 869,688 (third)
• 5.44s total time to read a different group of inline ints
• 22.593s total time to read 3,467,813 (fourth)
On the JVM, the results are
4387994, 50494, 869688, 3467813, 1.650348681s, 11.886447ms, 172.236873ms, 304.934099ms, 677.563268ms, 30.814us, 3.202us
According to the profiler in the browser, most of the time is being spent in Major GC or `TypedArraySpeciesConstructor`, coming from `source.readTo` calls. Gonna try making an implementation that doesn't depend on `readTo`.
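For context, the hot pattern is roughly this shape (an assumed sketch, not the real code; `readChunk` is an invented name): each chunk gets carved off with `readTo`, which copies its bytes into a fresh `Buffer` before anything is decoded from it.

```kotlin
import kotlinx.io.Buffer
import kotlinx.io.Source
import kotlinx.io.readIntLe

// Assumed shape of the hot path: readTo copies chunkLength bytes out of the
// source into a fresh Buffer, so doing this per record means millions of copies.
fun readChunk(source: Source, chunkLength: Long): Buffer {
    val chunk = Buffer()
    source.readTo(chunk, chunkLength)
    return chunk
}

// e.g. decoding one 4-byte little-endian value from its own chunk
fun readValue(source: Source): Int = readChunk(source, 4L).readIntLe()
```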
f
Maybe you could share a snippet of the code where you're experiencing the performance problem? Unfortunately, I'm struggling to reconstruct it from the description 😿
I don't know if it exists already, but what would be nice is a re-readable Buffer over a fixed ByteString. I didn't really want to write the little-endian code myself, especially once I found out the `.toInt()` methods on the primitives fill the remaining upper bits with the sign bit, which made `OR`-ing the values require a mask to get the correct result. Additionally, I still needed the stream-like interaction, reading from first to last in order. But I had to effectively split this very long ByteString into smaller chunks at certain points, and `readTo` would make a copy. So I tried to figure something out without using `readTo` or making a copy.