The main benefit of the type system and algebras as far as I can tell are for user friendliness and to provide a unifying API, but when it comes to actually executing these things on a machine, you don't want to be calling functions or even using the JVM. At a low level, it all gets optimized into binary arithmetic, but the goal should be to have a robust type system to check the user level program, then throw everything away. No lambdas, generics, no types, no invokedynamic, no virtual functions, everything runs as fused sparse BLAS primitives on a native interface to the C/G/T/IPU. Then the question becomes which intermediate representation is the most efficient way to access the performance you need for various workloads. This is an interesting problem, but I'm more concerned about the semantics and expressiveness of the API