Thread
#mathematics
    Ролан

    1 year ago
    Thanks to Andrey Kislitsin, we have an example of a neural network on Kotlin Multiplatform using kmath's tensors, including both forward and backward pass (so you can train it everywhere): https://github.com/mipt-npm/kmath/blob/feature/tensor-algebra/examples/src/main/kotlin/space/kscience/kmath/tensors/NeuralNetwork.kt
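    For readers who want the gist without opening the repo, the forward/backward idea can be sketched in a few lines of plain Kotlin. This is a hand-rolled, scalar illustration (the `Dense` class and its names are made up here), not the API of the linked NeuralNetwork.kt example, which uses kmath's tensor algebra:

    ```kotlin
    import kotlin.math.exp

    // A minimal "dense layer": y = sigmoid(w * x + b), trained by gradient descent.
    class Dense(var w: Double, var b: Double) {
        var lastX = 0.0
        var lastY = 0.0

        private fun sigmoid(z: Double) = 1.0 / (1.0 + exp(-z))

        // Forward pass: remember inputs/outputs needed for the backward pass.
        fun forward(x: Double): Double {
            lastX = x
            lastY = sigmoid(w * x + b)
            return lastY
        }

        // Backward pass: given dLoss/dy, update parameters and return dLoss/dx.
        fun backward(dy: Double, lr: Double = 0.1): Double {
            val dz = dy * lastY * (1.0 - lastY)   // sigmoid'(z) = y * (1 - y)
            val dx = dz * w
            w -= lr * dz * lastX
            b -= lr * dz
            return dx
        }
    }

    fun main() {
        val layer = Dense(w = 0.5, b = 0.0)
        // Train the layer to map input 1.0 to target 1.0 under squared loss.
        repeat(1000) {
            val y = layer.forward(1.0)
            layer.backward(2.0 * (y - 1.0))   // dLoss/dy for loss = (y - target)^2
        }
        println(layer.forward(1.0))  // approaches 1.0 as training progresses
    }
    ```

    Since this is pure Kotlin with no platform dependencies, the same pattern runs on any KMP target, which is the point being made in the thread.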
    Hampus Londögård

    1 year ago
    Awesome! How big is the performance penalty in comparison to using Python (or rather the Python DSL for Torch/TF, one could call it, as you don’t really write Python 😅)? Are the tensors ever shuffled to the JVM, or do they stay native until you try to print?
    Ролан

    1 year ago
    No, it's all KMP including tensors, no dependencies on anything, and you can run it anywhere KMP works. In terms of performance, that wasn't our concern yet.
    Hampus Londögård

    1 year ago
    KMP = Kotlin Multiplatform? I meant more along these lines: do you shuffle the tensors into DoubleArrays, or do you keep them native (like pytorch and others usually do) so that the operations happen through the original C/C++ code?
    Ролан

    1 year ago
    Tensors are backed by DoubleArray indeed, there is nothing native in that sense.
    Yes for Kotlin Multiplatform, and that was the point of the exercise. Now performance-wise we need to see, but I think we are looking more for functionality right now. You won't train mega networks in a browser after all.
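    A DoubleArray-backed tensor, as described above, can be sketched as a flat buffer plus indexing arithmetic. This `SimpleTensor` is a made-up illustration of the idea, not kmath's actual tensor API:

    ```kotlin
    // Minimal sketch of a tensor backed by a flat DoubleArray (row-major):
    // element (i, j) of a (rows x cols) tensor lives at buffer[i * cols + j].
    class SimpleTensor(
        val rows: Int,
        val cols: Int,
        val buffer: DoubleArray = DoubleArray(rows * cols)
    ) {
        operator fun get(i: Int, j: Int) = buffer[i * cols + j]
        operator fun set(i: Int, j: Int, value: Double) {
            buffer[i * cols + j] = value
        }

        // Element-wise sum: pure Kotlin, so it runs on any KMP target.
        operator fun plus(other: SimpleTensor): SimpleTensor {
            require(rows == other.rows && cols == other.cols)
            val out = DoubleArray(buffer.size) { buffer[it] + other.buffer[it] }
            return SimpleTensor(rows, cols, out)
        }
    }
    ```

    Because everything bottoms out in a primitive DoubleArray, there is no native memory to manage and nothing to "shuffle" back to the JVM; the trade-off is that all operations go through Kotlin rather than original C/C++ kernels.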
    Hampus Londögård

    1 year ago
    Oh, I thought this was related to your previous pytorch contribution 😄 But this is really cool; it seems simple enough to code that you could easily fit a framework on top which abstracts it into lambda functions. Typed lambda functions + DL is something I’ve wanted for a while; Python simply doesn’t cut it. Really cool contribution! (Unrelated) Do you happen to know if there’s any progress on supporting the new Vector API on the JVM for kmath?
    Ролан

    1 year ago
    Thanks )). No, the pytorch story is orthogonal to that. In fact, we wanted the user to be able to prototype simple things in a lightweight framework before reaching for monsters like pytorch, tf or dl4j. Sorry, about the Vector API I haven't heard anything yet.
    altavir

    1 year ago
    Current work is the prelude to pytorch integration. We need to understand how to make a better API for that. As for performance, it is not yet optimized, but after optimization I expect the difference with a native solution to be less than a factor of 3. It is also possible to add easy parallelization and lazy computation optimization, so we can even win in some places.
    As for vectors, do you mean https://openjdk.java.net/jeps/338? It is on the roadmap, but our current research shows that there is already good automatic vectorization in the latest JVMs, including GraalVM. There is also the Viktor project, and we have bindings for it (not for tensors though).
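    For context, the kind of code that benefits from JIT auto-vectorization (without the Vector API) is, roughly, a tight straight-line loop over primitive arrays with no boxing or branching. A minimal sketch (the `axpy` name is just the conventional BLAS term, not a kmath function):

    ```kotlin
    // A vectorization-friendly kernel: a branch-free loop over primitive
    // DoubleArrays -- the pattern HotSpot/GraalVM can auto-vectorize.
    fun axpy(alpha: Double, x: DoubleArray, y: DoubleArray, out: DoubleArray) {
        for (i in x.indices) {
            out[i] = alpha * x[i] + y[i]
        }
    }
    ```

    The Vector API (JEP 338 and its successors) would let you express the same computation with explicit lanes instead of hoping the JIT spots the pattern.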
    Hampus Londögård

    1 year ago
    That’s the one, yeah. Agreed, auto-vectorization is good. But when you know you want it from the get-go, it could make sense to code for it rather than hoping the JVM is smart or that the loop simply runs enough times and is tight 🙂 F64 is more precision than I would’ve preferred hehe.
    I’ll most likely be migrating my project (londogard-nlp-toolkit) to kmath in the future; I really like the idea of swapping backends. For now I’ve simply used EJML (which you wrap) because there’s no expensive native interop when running single math operations. But once I introduce ML models & DL I’ll have to use something else, I think, at least for the DL models.
    altavir

    1 year ago
    Indeed. The issue is here: https://github.com/mipt-npm/kmath/issues/249. It is marked as waiting for external contributions, so I hope some students will work on it soon. Meanwhile, as said, we get very good results with GraalVM automatic vectorization.
    Ролан

    1 year ago
    @altavir talking about performance is misleading here. Deep learning is just made for the GPU. (I am also advocating that beyond DL, but that's another story.) We are trying to offer some functionality in places where you cannot afford huge GPUs, and there are really a lot of such applications. But you have to forget about performance.
    altavir

    1 year ago
    Indeed, I was talking about CPU only. Doing GPU directly from JVM would be hard.
    Iaroslav Postovalov

    1 year ago
    It's simply impossible, because CUDA-like APIs can't be created natively for the JVM, so FFI overhead in one form or another is unavoidable.
    altavir

    1 year ago
    It is possible, for example, with http://www.jcuda.org/. You can't create memory shared with the GPU anyway. But the work is tedious. They are experimenting with it right now in MultiK.
    Iaroslav Postovalov

    1 year ago
    It is FFI, too.
    Ролан

    1 year ago
    @altavir those are just Java bindings to C wrappers of CUDA libraries; you still cannot integrate your own CUDA kernels like you would in Python or C++. You would have to go through JNI, with all the pain that entails.
    altavir

    1 year ago
    I've actually used OpenCL bindings and it does not require JNI. As for object copies, you need those anyway to work with a GPU.
    Ролан

    1 year ago
    Of course with OpenCL you can send your shader programs from the JVM - they are just strings. In C++ you can do the same with boost::compute:

    namespace bc = boost::compute;
    auto src_code = std::string_view{
        "float circle_area_gpu(Circle c) { "
        "    float pi = 3.14f;             "
        "    return c.r * c.r * pi;        "
        "}                                 "
    };
    auto circle_area_gpu = bc::make_function_from_source<float(Circle)>(
        "circle_area_gpu", src_code.data()
    );
    altavir

    1 year ago
    Yes, and things like Aparapi do the same for Java. I think there is some idea to use Kotlin IR to produce kernels for CUDA/OpenCL, but it is not implemented yet.
    Ролан

    1 year ago
    Yes, I was looking at TornadoVM as well; it looks great.
    altavir

    1 year ago
    I've never used it. At the time when I played with OpenCL, it had only just appeared. But basically it is the same idea as in your sample: kernels are generated from the Java bytecode dynamically. Kotlin IR would be even more advanced in this regard since it is higher level.
    Ролан

    1 year ago
    I don't know whether TornadoVM or Aparapi somehow integrate nvcc compilation on the fly like numba does in Python, or maybe it's all OpenCL.
    altavir

    1 year ago
    🤷‍♂️