# webassembly
a
Hey folks, is it possible to make use of any zero-copy serialization protocols (like https://capnproto.org/) with the Kotlin Wasm/WASI target in the context of exchanging data with the host? I guess it is not possible directly, but maybe it would be possible with the use of cinterop somehow (not sure if it would work with Wasm)?
m
How is that supposed to cross the Wasm module sandbox boundary?
a
Through the linear memory
m
With zero copy?
a
Obviously data will be copied to the linear memory but deserialization can in theory be zero-copy
m
Ok, if you just mean the deserialization part.
a
I mean zero-copy serialization protocols in general. If it would be possible to allocate original data in the linear memory too instead of using WasmGC memory, that works too.
e
Kotlin types (`String` etc.) all live in GC memory
you could create a structure in linear memory and only interact with it through pointers but you'd have to copy to GC in order to use standard Kotlin functions
m
That’s what I anticipated. It does not seem to make sense to me.
a
> you could create a structure in linear memory and only interact with it through pointers but you'd have to copy to GC in order to use standard Kotlin functions
Yes, this I understand. I'm wondering if there might be some sort of way (e.g. with cinterop maybe) to implement wrappers around objects allocated in the linear memory that would act as native Kotlin types. E.g. Go has `unsafe.Pointer` that can be used for that purpose. Of course, with Go/Wasm, everything is allocated in linear memory.
For context, I'm experimenting with a small app that consists of several Wasm modules implemented using different stacks; each module needs access to a large UTF-8 string produced by a different Wasm module. With Kotlin it currently requires copying the data twice, essentially: once from linear memory into a `ByteArray`, and again to convert the byte array into a `String`. This becomes a problem when the data is large. There are some options, like implementing streaming with a smaller buffer or using WASI files, but they still require making copies.
There isn't even a way to create a `ByteArray` mapped to linear memory. Even JS has this option with `Uint8Array`.
e
you could create your own (as a pointer + length structure) but "bytearray whose storage may be deallocated" cannot be a `ByteArray`
IIRC wasm gc proposal did include sharing references to bytearrays in GC memory between module instances. I'm not sure if it made it to the final spec though
a
> you could create your own (as a pointer + length structure) but "bytearray whose storage may be deallocated" cannot be a `ByteArray`
Yes, and that means I can't use any functions that operate on byte arrays so this wouldn't achieve much unfortunately.
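For illustration, the "pointer + length structure" discussed above could look roughly like the sketch below. `ByteLoader` and `MemorySlice` are hypothetical names, not part of any library. To keep the sketch runnable on any Kotlin target, linear memory is simulated with a `ByteArray`; on Kotlin/Wasm the loader would presumably delegate to `kotlin.wasm.unsafe.Pointer.loadByte()` instead (an assumption, not verified here).

```kotlin
// Hypothetical stand-in for raw linear-memory access (name is an assumption).
fun interface ByteLoader {
    fun loadByte(address: Int): Byte
}

// A "pointer + length" view: it never owns or copies the bytes it describes,
// so it is only valid while the underlying memory stays allocated.
class MemorySlice(
    private val memory: ByteLoader,
    private val address: Int,
    val size: Int,
) {
    operator fun get(index: Int): Byte {
        if (index !in 0 until size) throw IndexOutOfBoundsException("index $index, size $size")
        return memory.loadByte(address + index)
    }

    // Narrowing the view is O(1): only the offset and the length change.
    fun slice(from: Int, until: Int): MemorySlice {
        require(from in 0..until && until <= size) { "bad range $from..$until for size $size" }
        return MemorySlice(memory, address + from, until - from)
    }

    // Every ByteArray-style helper has to be re-implemented against the view.
    fun indexOf(byte: Byte): Int {
        for (i in 0 until size) if (memory.loadByte(address + i) == byte) return i
        return -1
    }
}

fun main() {
    // Simulated linear memory; the payload "hello,world" starts at address 2.
    val fake = "xxhello,worldxx".encodeToByteArray()
    val slice = MemorySlice(ByteLoader { fake[it] }, address = 2, size = 11)
    println(slice.indexOf(','.code.toByte()))
    println(slice.slice(6, 11).size)
}
```

This also shows the limitation raised in the thread: the view is not a `ByteArray`, so stdlib functions that expect one cannot be applied to it without copying.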
c
Strings in WebAssembly exist in linear memory, not in Wasm GC memory, and there are different ways to encode them. Here's some documentation: https://github.com/CharlieTap/chasm/commit/8ef18e56f2e5cfa4742a6ff0f718170b1511f2d0
a
Hi @Charlie Tapping, thank you. Let me clarify, this documentation you referenced is for the host, correct? I was talking more about the guest side.
Say you have a function like the following which you compile to WebAssembly:
```kotlin
fun concat(input: String): String
```
The compiler will either output:
In the case of Kotlin as a guest, it would not compile such a function, since `@WasmImport` and `@WasmExport` only support primitive types that can be passed through the stack. I.e.
```kotlin
@WasmExport("process")
fun process(input: String)
```

> Unsupported @WasmImport and @WasmExport return type String
But it's OK, say we define a function that takes a pointer and a length instead of the string (actually it would be more involved than this with Kotlin, since memory can't be directly allocated from the host to copy the input string to). Now when the function is called, the guest can read the string from linear memory at the specified location, but it would have to first read it into a `ByteArray` and then decode it to a `String`, hence 3 copies of the data are made. I can implement a function that would decode the string from linear memory directly, I guess, but in general the problem is that I can't use most of the stdlib (or many other libs) functions directly on data in linear memory without copying it out into some builtin type first. Except maybe `kotlinx.io`, for which a `RawSource` over linear memory can most likely be implemented 🙂
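A minimal sketch of the "decode the string from linear memory directly" idea, assuming a hypothetical `ByteLoader` abstraction over memory reads (simulated here with a `ByteArray` so the code runs on any Kotlin target; on Kotlin/Wasm it would presumably wrap `kotlin.wasm.unsafe.Pointer.loadByte()`). The decoder is deliberately simplified: it assumes well-formed UTF-8 and does not reject overlong encodings or invalid continuation bytes.

```kotlin
// Hypothetical stand-in for raw linear-memory access (name is an assumption).
fun interface ByteLoader {
    fun loadByte(address: Int): Byte
}

// Decodes `length` bytes of UTF-8 starting at `address` straight into a String,
// without first materializing them as a ByteArray.
fun decodeUtf8(memory: ByteLoader, address: Int, length: Int): String {
    val out = StringBuilder(length)
    var pos = address
    val end = address + length
    while (pos < end) {
        val b0 = memory.loadByte(pos++).toInt() and 0xFF
        // Number of continuation bytes and the payload bits of the lead byte.
        val (extra, lead) = when {
            b0 < 0x80 -> 0 to b0
            (b0 and 0xE0) == 0xC0 -> 1 to (b0 and 0x1F)
            (b0 and 0xF0) == 0xE0 -> 2 to (b0 and 0x0F)
            else -> 3 to (b0 and 0x07)
        }
        require(pos + extra <= end) { "truncated UTF-8 sequence" }
        var codePoint = lead
        repeat(extra) {
            codePoint = (codePoint shl 6) or (memory.loadByte(pos++).toInt() and 0x3F)
        }
        if (codePoint < 0x10000) {
            out.append(codePoint.toChar())
        } else {
            // Code points above the BMP become a UTF-16 surrogate pair.
            val v = codePoint - 0x10000
            out.append((0xD800 + (v shr 10)).toChar())
            out.append((0xDC00 + (v and 0x3FF)).toChar())
        }
    }
    return out.toString()
}

fun main() {
    // Simulated linear memory holding a UTF-8 payload at address 0.
    val fake = "héllo €".encodeToByteArray()
    println(decodeUtf8(ByteLoader { fake[it] }, address = 0, length = fake.size))
}
```

This only removes the intermediate `ByteArray`. The resulting `String` is still a copy living in GC memory, and the `StringBuilder` may add an internal copy of its own, so it does not make the exchange zero-copy.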
c
Yeah, the above is about the embedding API for the host, but that might be useful as you can define host functions and import them. That would give you more control than, say, writing code in Kotlin, compiling it to Wasm instructions, and hoping both that the instructions implement zero-copy semantics and that the runtime is doing zero copy whilst executing them. Without diving deep into it, I would say you're very unlikely to get zero-copy anything by writing code in Kotlin or another GC language and compiling it to Wasm. You'd have more chance doing it in a language where you have pointers and pinned memory and can just unsafe-cast things.
a
Yes, this is what it looks like. To be fair, Go is a language with GC too, but it does make it possible to implement zero-copy, in the case of byte arrays at least. Unfortunately it has its own set of issues where Wasm is concerned: huge binaries, slow memory allocations, "big" Go lagging years behind all the recent Wasm proposals, and for TinyGo half of the stdlib doesn't pass tests, GC doesn't work properly, etc. Frankly, it looks like at the moment there are few stable and flexible options for implementing Wasm guests with a large set of 3rd party libs available: C/C++ and Rust. Of course, the Wasm spec itself is not very stable for the most part 🙂