Hey folks is it possible to make use of any zero copy serial kotlinlang #webassembly


Alexey Zolotarev

02/18/2025, 8:07 AM

Hey folks, is it possible to make use of any zero-copy serialization protocols (like https://capnproto.org/) with Kotlin Wasm/Wasi target in context of exchanging data with the host? I guess it is not possible directly but maybe it would be possible with the use of cinterop somehow (not sure if it would work with Wasm)?

Michael Paus

02/18/2025, 9:28 AM

How is that supposed to cross the Wasm module sandbox boundary?

Alexey Zolotarev

02/18/2025, 9:35 AM

Through the linear memory

Michael Paus

02/18/2025, 9:41 AM

With zero copy?

Alexey Zolotarev

02/18/2025, 9:42 AM

Obviously data will be copied to the linear memory but deserialization can in theory be zero-copy

Michael Paus

02/18/2025, 9:42 AM

Ok, if you just mean the deserialization part.

Alexey Zolotarev

02/18/2025, 9:45 AM

I mean zero-copy serialization protocols in general. If it would be possible to allocate original data in the linear memory too instead of using WasmGC memory, that works too.
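To make "zero-copy deserialization" concrete, here is a minimal sketch in which fields are decoded in place, on access, from a byte buffer. The fixed layout (`x` at byte 0, `y` at byte 4, little-endian) and the names `readIntLe` and `PointView` are assumptions made up for illustration; this is not Cap'n Proto's wire format, and the `ByteArray` only stands in for linear memory so the example runs on any Kotlin target.

```kotlin
// Reads a little-endian 32-bit integer starting at [offset] through a byte accessor.
fun readIntLe(load: (Int) -> Byte, offset: Int): Int =
    (load(offset).toInt() and 0xFF) or
        ((load(offset + 1).toInt() and 0xFF) shl 8) or
        ((load(offset + 2).toInt() and 0xFF) shl 16) or
        ((load(offset + 3).toInt() and 0xFF) shl 24)

// Hypothetical fixed layout: x at byte 0, y at byte 4.
// "Deserializing" is just wrapping the base offset; fields are decoded on access.
class PointView(private val load: (Int) -> Byte, private val base: Int) {
    val x: Int get() = readIntLe(load, base)
    val y: Int get() = readIntLe(load, base + 4)
}

fun main() {
    // Stand-in for linear memory: x = 7, y = -2, both little-endian.
    val memory = byteArrayOf(7, 0, 0, 0, -2, -1, -1, -1)
    val point = PointView({ memory[it] }, base = 0)
    println("${point.x}, ${point.y}") // prints "7, -2"
}
```

The view never materializes a separate object graph, which is the property the thread is after; the open question in the rest of the discussion is whether Kotlin/Wasm lets such a view be used where standard types are expected.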

ephemient

02/18/2025, 9:57 AM

Kotlin types (`String` etc.) all live in GC memory

ephemient

02/18/2025, 9:58 AM

you could create a structure in linear memory and only interact with it through pointers but you'd have to copy to GC in order to use standard Kotlin functions

Michael Paus

02/18/2025, 10:01 AM

That’s what I anticipated. It does not seem to make sense to me.

Alexey Zolotarev

02/18/2025, 10:06 AM

> you could create a structure in linear memory and only interact with it through pointers but you'd have to copy to GC in order to use standard Kotlin functions

Yes, this I understand. I'm wondering if there might be some way (e.g. with cinterop) to implement wrappers around objects allocated in linear memory that would act as native Kotlin types. E.g. Go has `unsafe.Pointer`, which can be used for that purpose. Of course, with Go/Wasm, everything is allocated in linear memory.

Alexey Zolotarev

02/18/2025, 10:12 AM

For context, I'm experimenting with a small app that consists of several Wasm modules implemented using different stacks; each module needs access to a large UTF-8 string produced by a different Wasm module. With Kotlin this currently requires copying the data twice: once from linear memory into a `ByteArray`, and again to convert the byte array into a `String`. This becomes a problem when the data is large. There are some options, like implementing streaming with a smaller buffer or using WASI files, but they still require making copies.

Alexey Zolotarev

02/18/2025, 10:17 AM

There isn't even a way to create a `ByteArray` mapped to linear memory. Even JS has this option with `Uint8Array`.

ephemient

02/18/2025, 10:19 AM

you could create your own (as a pointer + length structure) but a "bytearray whose storage may be deallocated" cannot be a `ByteArray`
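A minimal sketch of the pointer + length idea, assuming a hypothetical `LinearMemory` interface that stands in for raw memory access. On Kotlin/Wasm that role would presumably be played by the unsafe memory API such as `kotlin.wasm.unsafe.Pointer`; here a `ByteArray`-backed fake keeps the example runnable anywhere. `LinearMemory`, `FakeLinearMemory`, and `ByteSpan` are illustrative names, not an existing API.

```kotlin
// Hypothetical stand-in for Wasm linear memory; not a real Kotlin/Wasm API.
interface LinearMemory {
    fun loadByte(address: Int): Byte
}

// Fake memory backed by a ByteArray so the sketch runs on any Kotlin target.
class FakeLinearMemory(private val bytes: ByteArray) : LinearMemory {
    override fun loadByte(address: Int): Byte = bytes[address]
}

// Read-only view over [length] bytes starting at [address]. It copies nothing,
// but it is not a ByteArray, so ByteArray-based functions cannot accept it.
class ByteSpan(
    private val memory: LinearMemory,
    private val address: Int,
    val length: Int,
) {
    operator fun get(index: Int): Byte {
        require(index in 0 until length) { "index $index out of bounds for length $length" }
        return memory.loadByte(address + index)
    }

    // Explicit copy into GC memory, needed before calling ByteArray-based APIs.
    fun toByteArray(): ByteArray = ByteArray(length) { get(it) }
}

fun main() {
    val memory = FakeLinearMemory(byteArrayOf(0, 0, 104, 105, 33, 0))
    val span = ByteSpan(memory, address = 2, length = 3)
    println(span[0])                             // reads in place: prints 104
    println(span.toByteArray().decodeToString()) // copies, then decodes: prints "hi!"
}
```

The sketch also shows the limitation raised here: nothing ties the lifetime of the view to the underlying storage, and anything that expects a real `ByteArray` still forces the copy in `toByteArray()`.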

ephemient

02/18/2025, 10:20 AM

IIRC the Wasm GC proposal did include sharing references to byte arrays in GC memory between module instances. I'm not sure if it made it into the final spec though

🤔 1

Alexey Zolotarev

02/18/2025, 10:23 AM

> you could create your own (as a pointer + length structure) but "bytearray whose storage may be deallocated" cannot be a `ByteArray`

Yes, and that means I can't use any functions that operate on byte arrays so this wouldn't achieve much unfortunately.

Charlie Tapping

02/19/2025, 11:49 AM

Strings in WebAssembly exist in linear memory, not in Wasm GC, and there are different ways to encode strings. Here's some documentation: https://github.com/CharlieTap/chasm/commit/8ef18e56f2e5cfa4742a6ff0f718170b1511f2d0

Alexey Zolotarev

02/19/2025, 1:34 PM

Hi @Charlie Tapping, thank you. Let me clarify, this documentation you referenced is for the host, correct? I was talking more about the guest side.

Say you have a function like the following which you compile to WebAssembly:

```kotlin

fun concat(input: String): String

```

The compiler will either output:

In the case of Kotlin as a guest, such a function would not compile, since `@WasmImport` and `@WasmExport` only support primitive types that can be passed through the stack. I.e.:


```kotlin
@WasmExport("process")
fun process(input: String)
```

> Unsupported @WasmImport and @WasmExport return type String

But that's OK; say we define a function that takes a pointer and a length instead of the string (actually it would be more involved than this with Kotlin, since the host can't directly allocate memory to copy the input string into). Now when the function is called, the guest can read the string from linear memory at the specified location, but it would first have to read it into a `ByteArray` and then decode it to a `String`, hence 3 copies of the data are made. I guess I can implement a function that decodes the string from linear memory directly, but in general the problem is that I can't use most stdlib functions (or functions from many other libs) directly on data in linear memory without first copying it out into some builtin type. Except maybe `kotlinx.io`, for which a `RawSource` over linear memory can most likely be implemented 🙂
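A rough sketch of the "decode the string from linear memory directly" option, skipping the intermediate `ByteArray`. The `loadByte` accessor is a hypothetical stand-in for a linear-memory read, and `decodeUtf8` is an illustrative name; the sketch assumes well-formed UTF-8 and does no validation. It is not zero-copy either, since the characters still end up in a GC-managed `String`.

```kotlin
// Decodes well-formed UTF-8 from a byte accessor straight into a String.
// `loadByte` is a hypothetical stand-in for a linear-memory read;
// malformed input is NOT detected in this sketch.
fun decodeUtf8(address: Int, length: Int, loadByte: (Int) -> Byte): String {
    val out = StringBuilder(length)
    var pos = address
    val end = address + length

    // Reads one continuation byte and returns its low 6 bits.
    fun next(): Int = loadByte(pos++).toInt() and 0x3F

    while (pos < end) {
        val b = loadByte(pos++).toInt() and 0xFF
        val codePoint = when {
            b < 0x80 -> b                                     // 1-byte sequence (ASCII)
            b < 0xE0 -> ((b and 0x1F) shl 6) or next()        // 2-byte sequence
            b < 0xF0 -> ((b and 0x0F) shl 12) or (next() shl 6) or next()
            else -> ((b and 0x07) shl 18) or (next() shl 12) or (next() shl 6) or next()
        }
        if (codePoint < 0x10000) {
            out.append(codePoint.toChar())
        } else {
            // Code points above the BMP become a UTF-16 surrogate pair.
            val v = codePoint - 0x10000
            out.append((0xD800 + (v shr 10)).toChar())
            out.append((0xDC00 + (v and 0x3FF)).toChar())
        }
    }
    return out.toString()
}

fun main() {
    // A ByteArray stands in for linear memory so the sketch runs on any target.
    val memory = "h\u00E9llo \uD83C\uDF0D".encodeToByteArray()
    println(decodeUtf8(0, memory.size) { memory[it] })
}
```

This removes one copy for strings specifically, but it does not address the general problem: other stdlib and library functions still need their input in a builtin type.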

Charlie Tapping

02/19/2025, 1:55 PM

Yeah, the above is about the embedding API for the host, but that might be useful, as you can define host functions and import them. That would give you more control than, say, writing code in Kotlin, compiling it to Wasm instructions, and hoping both that the instructions implement zero-copy semantics and that the runtime does zero copy while executing them. Without deep-diving into it, I would say you're very unlikely to get zero-copy anything by writing code in Kotlin or another GC language and compiling it to Wasm; you'd have more chance doing it in a language where you have pointers and pinned memory and can just unsafe-cast things.

Alexey Zolotarev

02/19/2025, 2:07 PM

Yes, this is what it looks like. To be fair, Go is a language with GC too, but it does make zero-copy possible, at least for byte arrays. Unfortunately it has its own set of issues where Wasm is concerned: huge binaries, slow memory allocations, "big" Go lagging years behind all the recent Wasm proposals; for TinyGo, half of the stdlib doesn't pass tests, GC doesn't work properly, etc. Frankly, it looks like at the moment there are only a few stable and flexible options for implementing Wasm guests with a large set of third-party libs available: C/C++ and Rust. Of course, the Wasm spec itself is not very stable for the most part 🙂
