Lazar Prijović
03/15/2024, 2:43 PM
@OptIn(ExperimentalUnsignedTypes::class, UnsafeWasmMemoryApi::class)
fun wasmSimdShowcase(): UByteArray {
    val vector1: Vec128
    // (1) We can initialize v128 with the raw memory
    withScopedMemoryAllocator { allocator ->
        val pointer = allocator.allocate(16)
        for (i in 0..15) {
            pointer.plus(i).storeByte(0x03)
        }
        vector1 = pointer.loadV128()
    }
    // (2) We can initialize v128 with const UBytes
    val vector2 = v128OfUBytes(
        0x00_u, 0x00_u, 0x00_u, 0x00_u,
        0x01_u, 0x02_u, 0x03_u, 0x04_u,
        0xA0_u, 0xB0_u, 0xC0_u, 0xD0_u,
        0xFF_u, 0xFF_u, 0xFF_u, 0xFF_u
    )
    return withScopedMemoryAllocator { allocator ->
        val pointer = allocator.allocate(16)
        // (3) We can add one v128 to another one and (4) store v128 within the raw memory
        pointer.storeV128(vector1 + vector2)
        val result = UByteArray(16)
        for (i in 0..15) {
            result[i] = (pointer + i).loadByte().toUByte()
        }
        result
    }
}

fun main() {
    println(wasmSimdShowcase().joinToString { it.toHexString() })
    // Output:
    // 03, 03, 03, 03, 04, 05, 06, 07, a3, b3, c3, d3, 02, 02, 02, 02
}
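The last four output lanes show that the addition wraps per byte (0xFF + 0x03 gives 0x02). As a rough illustration only, the same lane-wise behaviour can be emulated in plain Kotlin without any SIMD support; `addLanes` below is a hypothetical helper, not part of the prototype or of any Kotlin/Wasm API.

```kotlin
// Plain-Kotlin emulation of a lane-wise 8-bit add over 16 lanes.
// `addLanes` is a hypothetical helper used only for illustration.
@OptIn(ExperimentalUnsignedTypes::class)
fun addLanes(a: UByteArray, b: UByteArray): UByteArray {
    require(a.size == 16 && b.size == 16) { "a v128 holds exactly 16 byte lanes" }
    // UByte + UByte yields UInt in Kotlin; toUByte() truncates back to 8 bits (the wrap).
    return UByteArray(16) { i -> (a[i] + b[i]).toUByte() }
}

@OptIn(ExperimentalUnsignedTypes::class)
fun main() {
    val vector1 = UByteArray(16) { 0x03u.toUByte() }
    val vector2 = ubyteArrayOf(
        0x00u, 0x00u, 0x00u, 0x00u,
        0x01u, 0x02u, 0x03u, 0x04u,
        0xA0u, 0xB0u, 0xC0u, 0xD0u,
        0xFFu, 0xFFu, 0xFFu, 0xFFu
    )
    val sum = addLanes(vector1, vector2)
    // Prints each lane as two hex digits.
    println(sum.joinToString { it.toString(16).padStart(2, '0') })
}
```

Each lane is handled independently, so a carry out of one byte never reaches its neighbour.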
The WAT output of the wasmSimdShowcase function:
Interesting parts:
(local $0_vector1 v128)
(local $9_vector2 v128)
...
local.get $5_pointer ;; type: kotlin.wasm.unsafe.Pointer
v128.load align=1
local.set $0_vector1 ;; type: kotlin.wasm.internal.vectypes.Vec128
...
v128.const 0 0 0 0 1 2 3 4 160 176 192 208 255 255 255 255
local.set $9_vector2
...
;; Inlined call of `kotlin.wasm.internal.vectypes.Vec128.plus`
block (result v128)
nop
local.get $0_vector1 ;; type: kotlin.wasm.internal.vectypes.Vec128
local.set $15_this
local.get $9_vector2 ;; type: kotlin.wasm.internal.vectypes.Vec128
local.set $16_other
local.get $15_this ;; type: kotlin.wasm.internal.vectypes.Vec128
local.get $16_other ;; type: kotlin.wasm.internal.vectypes.Vec128
i8x16.add
br 0
end
v128.store align=1
The full WAT output:
https://gist.github.com/madrazzl3/704e96cba51a86221f9a862fae25a74e
Artem Kobzar
03/15/2024, 4:00 PM
Svyatoslav Kuzmich [JB]
03/15/2024, 7:05 PM
• v128 can be interpreted as different values like f32x4 and i8x16. We would need multiple plus methods, losing the operator modifier on fun, or, alternatively, multiple vector types.
• Vec128, as a primitive value type, needs equals and hashCode overrides. And, perhaps, a toString as well.
• Current number types have corresponding array types. We would likely need a Vec128Array to extract values from SIMD in regular app code. Linear memory API support is great, but it is mostly intended for interop needs and is quite limited.
• I would expect methods for packing vectors (like v128OfUBytes) to work with non-constants too. The constant case could be an added optimisation (unless Binaryen can do this already).