# webassembly
c
Chasm has a new release 🎉 The focus of this release is largely performance, and it's quite significant. I can give a bunch of benchmarks, but the easiest way to explain the difference is the time chasm takes to pass the official wasm testsuite. This testsuite has thousands of wasm binaries and hundreds of thousands of tests; previously it took around 8 minutes on the JVM, and now chasm completes the testsuite in around 6 seconds. Blog post coming soon with details for those interested. Performance will continue to be the focus ahead of 1.0, as there's still some low-hanging fruit I'd like to tackle before then.

Also, this release includes 2 of the 3 phases needed for the threads proposal, meaning you can decode and validate modules with the threads proposal bytecode. Native will use a small rust library to do actual atomics, whilst the JVM will probably emulate atomics with fine-grained mutexes. Unfortunately Android's support for VarHandle is very recent, and the JVM has no other means of performing atomics at arbitrary memory addresses. Anyway, the execution phase will come in the next release.

Under the hood chasm now uses the previously mentioned rust library to manage memory on native platforms. This library leverages mmap and is thus able to page memory in on demand rather than allocating it up front, which also unlocks a bunch of future optimisations like the use of guard pages. To integrate it I built a small gradle plugin that consumes static libraries from a URL (I use GitHub releases); it's really trivial, but I found it useful in keeping the rust project, its toolchain and sources out of my KMP project.

This is the last release before the year ends; if you get some time over the holidays, give chasm a spin and let me know your thoughts. Next year we 1.0 and then it's on to the component model 😬
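The "emulate atomics with fine-grained mutexes" approach could look roughly like this sketch — a hypothetical illustration, not chasm's actual code: a plain `byte[]` backing linear memory, with a small pool of striped locks picked by address so unrelated addresses rarely contend.

```java
// Hypothetical sketch of emulating wasm atomics over a plain byte[] using
// lock striping. Names and layout are illustrative, not chasm's real code.
final class EmulatedAtomics {
    private final byte[] memory;
    private final Object[] locks = new Object[64]; // 64 stripes; power of two

    EmulatedAtomics(int size) {
        memory = new byte[size];
        for (int i = 0; i < locks.length; i++) locks[i] = new Object();
    }

    // Pick a lock by word address so accesses to different regions rarely contend.
    private Object lockFor(int addr) {
        return locks[(addr >>> 2) & (locks.length - 1)];
    }

    private int readInt(int addr) { // little-endian, as wasm specifies
        return (memory[addr] & 0xFF)
             | (memory[addr + 1] & 0xFF) << 8
             | (memory[addr + 2] & 0xFF) << 16
             | (memory[addr + 3] & 0xFF) << 24;
    }

    private void writeInt(int addr, int value) {
        memory[addr]     = (byte) value;
        memory[addr + 1] = (byte) (value >>> 8);
        memory[addr + 2] = (byte) (value >>> 16);
        memory[addr + 3] = (byte) (value >>> 24);
    }

    // e.g. i32.atomic.rmw.add: returns the old value
    int getAndAddInt(int addr, int delta) {
        synchronized (lockFor(addr)) {
            int old = readInt(addr);
            writeInt(addr, old + delta);
            return old;
        }
    }
}
```

Striping trades strict per-address locking for a fixed-size lock pool: two threads hammering different addresses usually take different monitors, while correctness only needs that the *same* address always maps to the same lock.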
👍 10
👍🏾 1
e
accessing `sun.misc.Unsafe::compareAndSwap*` like older desktop JVMs should work on Android
c
So unsafe has a few problems 🥲 It doesn't support all the atomic operations I need for wasm, but aside from that, it's being removed in future releases of the JVM now that safe alternatives are available. I guess I could do some sort of multi-release jar, but I'm not sure ART would be able to work with that, and it's just a ballache. I might try to just use JNI bindings or Panama to bind my rust lib in the future, but for now it's not a major priority
m
I am wondering how iOS support is going to work. If Apple doesn’t allow Java on iOS why is WASM via chasm supposed to work then?
e
says who? you can ship your own Java VM on iOS, and WASM is much simpler
m
The problem seems to be the JIT. See, e.g., https://github.com/LuaJIT/LuaJIT/issues/1072. Probably this is the reason why there is no official JDK for iOS. If you fully interpret the bytecode or use an AoT compiler then it's doable. I am just not sure which category chasm falls into.
e
there are a number of non-JIT interpreters for Java, including Zero within HotSpot
m
Yes, that’s the sentence that I had in mind: “This is due to the fact that Apple does not allow dynamic code generation on iOS ARM platforms. This restriction prohibits using the Hotspot dynamically generated template interpreter since it is generated at runtime.”
e
I'm refuting your statements above. interpreter-only mode is configurable in Oracle OpenJDK and there is an official OpenJDK iOS port. iOS doesn't ban it or other interpreters as long as they follow other restrictions as well (no direct API access, no arbitrary downloaded code). there are iOS apps shipping now with Java or WASM runtimes. Chasm is not a JIT.
c
Virtual machines like the JVM, or V8 (which is what Chrome uses to run JS/wasm), typically have “tiers” of execution. The first tier is always an interpreter, call it Tier 1; then Tier 2 would be a baseline JIT, Tier 3 a slower optimising JIT, then Tier 4 AOT, etc. All tiers aside from Tier 1 require you to “generate code”. The problem with this in mobile applications is that your app cannot mark a page of memory it creates as executable, so those tiers are not possible given the mobile sandbox. Tier 1 is always possible, and as @ephemient pointed out, you can typically specify how you want the VM to execute. Chasm only includes Tier 1 for now
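A Tier 1 interpreter in the sense above is just a decode-dispatch loop over the bytecode, with no runtime code generation at all. A toy sketch with made-up opcodes (nothing like chasm's actual instruction handling) shows the shape:

```java
// Toy Tier-1 interpreter: a fetch/decode/dispatch loop over an invented
// stack bytecode. No code is generated at runtime, so it runs fine inside
// a sandbox that forbids executable pages.
import java.util.ArrayDeque;

final class ToyInterpreter {
    static final int PUSH = 0, ADD = 1, MUL = 2, HALT = 3;

    static int run(int[] code) {
        ArrayDeque<Integer> stack = new ArrayDeque<>();
        int pc = 0;
        while (true) {
            switch (code[pc++]) {                          // fetch + decode
                case PUSH -> stack.push(code[pc++]);       // immediate operand
                case ADD  -> stack.push(stack.pop() + stack.pop());
                case MUL  -> stack.push(stack.pop() * stack.pop());
                case HALT -> { return stack.pop(); }       // result on top of stack
            }
        }
    }
}
```

For example, `run(new int[]{PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL, HALT})` evaluates (2 + 3) * 4. A baseline JIT (Tier 2) would instead emit native code for each opcode sequence — exactly the step the mobile sandbox forbids.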
m
@Charlie Tapping Thanks for the clarification. That’s what I wanted to know. To complete the picture, AOT would also be possible if done at build time and not on the device (or would that already be called Tier 5?)
e
The first tier is always an interpreter
not always; V8 didn't have one before https://v8.dev/docs/ignition, its lowest tier was the baseline JIT. and as WASM was designed for compilation, most implementations (https://v8.dev/docs/wasm-compilation-pipeline https://firefox-source-docs.mozilla.org/js/index.html#wasm-baseline-rabaldrmonkey) don't have an interpreter either, they only have tiers of compilers. as far as I know, of the major implementations, only wasmtime is working on an interpreter https://github.com/bytecodealliance/rfcs/blob/main/accepted/pulley.md
(but aside from WASM, most VMs have a baseline interpreter)
@Michael Paus
AOT would also be possible if done at build time
AOT to what? Chasm doesn't produce code, it's an interpreter
@Charlie Tapping I see that you are using `ByteArray` for linear memory on JVM. I'm not seeing what parts of the WASM threading spec you can't implement with `Unsafe`? they should all be implementable by `Unsafe.compareAndSwap` even if you have to use a wider type, e.g.
import sun.misc.Unsafe

private val unsafe = Unsafe::class.java.getDeclaredField("theUnsafe")
    .apply { isAccessible = true }.get(null) as Unsafe
private val baseOffset = unsafe.arrayBaseOffset(ByteArray::class.java)
    .also { check(unsafe.arrayIndexScale(ByteArray::class.java) == Byte.SIZE_BYTES) }

// CAS-loop getAndAdd on a single byte, implemented with a 4-byte-wide
// compareAndSwapInt (assumes little-endian int layout within the array).
fun ByteArray.getAndAdd(index: Int, arg: Byte): Byte {
    val offset = (baseOffset + index.and(3.inv())).toLong() // align down to a 4-byte word
    val shift = index.and(3) * Byte.SIZE_BITS               // bit position of the target byte
    var value: Byte
    do {
        val expected = unsafe.getIntVolatile(this, offset)
        value = expected.shr(shift).toByte()
        val desired = expected.and((0xFF shl shift).inv()) or // clear only the target byte
            (value + arg).and(0xFF).shl(shift)
    } while (!unsafe.compareAndSwapInt(this, offset, expected, desired))
    return value
}
but perhaps it would be better to switch to `ByteBuffer` - you can get `IntBuffer` etc. views, and they can be either array-backed or direct (off-heap memory)
m
@ephemient I mentioned AOT more as a conceptual possibility, to point out that code generation is not generally impossible - only if you do it on the device, inside the app.
c
@Michael Paus You can AOT-compile wasm programs at build time if you know the target architecture, yeah - in other runtimes, that is; chasm doesn't have this planned anytime soon.
not always; V8 didn’t have one before https://v8.dev/docs/ignition, its lowest tier was the baseline JIT. and as WASM was designed for compilation, most implementations (https://v8.dev/docs/wasm-compilation-pipeline https://firefox-source-docs.mozilla.org/js/index.html#wasm-baseline-rabaldrmonkey) don’t have an interpreter either, they only have tiers of compilers. as far as I know, of the major implementations, only wasmtime is working on an interpreter https://github.com/bytecodealliance/rfcs/blob/main/accepted/pulley.md
So wasm's a bit of a weird exception, as it wasn't designed for its bytecode to be interpreted (because originally it was for the web, and thus V8 with a JIT), but broadly, when building a VM, an interpreter is the first thing you build. This is because it's the fastest way to verify your decoding is correct; in fact, if you check any of the wasm proposals, you'll notice they all include a reference interpreter for this reason.
I see that you are using `ByteArray` for linear memory on JVM. I'm not seeing what parts of the WASM threading spec you can't implement with `Unsafe`? they should all be implementable by `Unsafe.compareAndSwap` even if you have to use a wider type, e.g.
The instructions I need to support are here if you're interested. When I looked at the Unsafe docs I could see that it didn't have 1:1 equivalents, and some instructions would require a CAS or more than one instruction, which deterred me. But I guess more importantly, Unsafe is being removed from Java in the coming versions as there are now safe equivalents.
but perhaps it would be better to switch to `ByteBuffer` - you can get `IntBuffer` etc. views, and they can be either array-backed or direct (off-heap memory)
Originally I used ByteArray as ByteBuffer isn't KMP friendly, but now that I have implemented the native memory impl with rust I should be able to specialise this. ByteBuffer also has a 2GB limit, which is annoying as I would have to stitch multiple together to make it work - which then means more indirection on every memory access. If ART gets a move on I might be able to use this in the future
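The stitching in question could look something like this hypothetical sketch (tiny chunks here for illustration; a real implementation would use ~1GB power-of-two chunks so the same shift/mask trick applies):

```java
// Hypothetical sketch: addressing a >2GB space as an array of ByteBuffer
// chunks. Chunk size is a power of two, so an address splits into a chunk
// index (shift) and an offset within the chunk (mask).
import java.nio.ByteBuffer;

class StitchedMemory {
    private static final int CHUNK_SHIFT = 4;               // 16-byte chunks for demo; 30 would give 1GB
    private static final int CHUNK_SIZE = 1 << CHUNK_SHIFT;
    private final ByteBuffer[] chunks;

    StitchedMemory(long size) {
        chunks = new ByteBuffer[(int) ((size + CHUNK_SIZE - 1) >>> CHUNK_SHIFT)];
        for (int i = 0; i < chunks.length; i++) chunks[i] = ByteBuffer.allocate(CHUNK_SIZE);
    }

    // Every access pays this extra indirection: pick a chunk, then an offset.
    byte get(long addr) {
        return chunks[(int) (addr >>> CHUNK_SHIFT)].get((int) (addr & (CHUNK_SIZE - 1)));
    }

    void put(long addr, byte value) {
        chunks[(int) (addr >>> CHUNK_SHIFT)].put((int) (addr & (CHUNK_SIZE - 1)), value);
    }
}
```

The extra array load and bounds check on every access is the indirection cost mentioned above, and multi-byte reads that straddle a chunk boundary need special-casing on top of this.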
e
I think it'll be faster to use Unsafe where it is available and use JNI where it is not (JNI can do the same kind of operations on a ByteArray that Unsafe does, but it can't be inlined by the JVM, so there's always FFI call overhead)
and yeah, FFM is out of incubator in Java 22, so you can use it already