# mathematics
a
Interesting fact: it seems that using the newer Oracle GraalVM JIT significantly improves performance, even for Multik.
I see about a 3x performance improvement for EJML and about a 2x improvement for Multik in the same benchmarks.
Another fun fact: switching two lines in KMath (turning on parallel buffer processing) lets us ramp performance up almost to the level of Multik. I am not sure whether parallel processing should be turned on by default...
i
Which two lines? I don't get the context.
a
Check the new pull request. I've added a new version of LinearAlgebra for the JVM that introduces parallel processing for matrix building. It works quite well for the dot operation.
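Since the pull request itself isn't quoted here, this is only a minimal sketch of what row-parallel matrix building for dot might look like with a JVM parallel stream. It is plain Java for illustration; the class and method names are hypothetical and this is not the actual KMath code.

```java
import java.util.stream.IntStream;

// Hypothetical sketch, not the actual KMath implementation: parallelise the
// dot operation by building each row of the result on a separate thread.
class ParallelDot {
    static double[][] dot(double[][] a, double[][] b) {
        int n = a.length, k = b.length, m = b[0].length;
        double[][] result = new double[n][m];
        // Rows of the product are independent, so a parallel stream over the
        // row index is safe: each task writes only to its own row.
        IntStream.range(0, n).parallel().forEach(i -> {
            double[] row = result[i];
            for (int l = 0; l < k; l++) {
                double aVal = a[i][l];
                double[] bRow = b[l];
                for (int j = 0; j < m; j++) {
                    row[j] += aVal * bRow[j];
                }
            }
        });
        return result;
    }
}
```

Parallelising over rows keeps the writes disjoint, so no synchronisation is needed, and each task carries enough arithmetic to amortise the thread overhead.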
i
So, it's just the JVM's parallel stream? Then it's not surprising, because it's just MISD. Clearly it should have been tried before...
We have spent a lot of effort to achieve SIMD with Viktor or ND4J.
a
Yes. I mean that usually it is not effective for mathematical operations, because the overhead from parallel processing is comparable to the benefit. But the dot operation specifically benefits a lot from it: the operations for each i and j can be done in parallel. And we do not need to make all operations parallel, only this one. We do that by creating a context for optimisation of a specific operation.
I checked the same context with other operations, and it does not work well.
i
GPU computations benefit from parallelism exactly because of the lower overhead you mentioned.
a
It is not lower. The computation itself is cheaper, but data transfer is much more expensive. A GPU works well when you can load all your data at once and not update it.
i
Do you have a benchmark comparing kmath-multik, kmath-multik with a parallel stream, and pure Multik?
Oh, I am wrong. Clearly, Multik has its own `dot`. And I'm sure that kmath, when it wraps Multik, adds no or minimal overhead.
a
Pure Multik is the same as kmath-multik, since KMath is a thin wrapper on top. But yes, I've done exactly that. On Oracle GraalVM, Multik is about 3-4 times faster than KMath with parallel processing, and about 20 times faster than KMath without parallel processing.
At the same time, KMath with parallel processing is faster than both EJML and TensorFlow-CPU.
i
I think there's no use case for kmath-core with parallelism involved, when it's worse than the wrappers anyway.
a
It is a good example. And it IS better than most of the wrappers. Multik is good, but it is rather limited: the dot operation is practically the only non-trivial operation it can do.
So if one wants reasonable flexibility but has a bottleneck at the dot operation, it could help. You just need to do a context switch.
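The "context switch" idea could be sketched roughly like this: the same dot signature behind a context interface, with a sequential default and a parallel variant you opt into only around the bottleneck. This is plain Java for illustration; `LinAlgContext` and both implementations are hypothetical names, not KMath's actual API.

```java
import java.util.stream.IntStream;

// Hypothetical operation-scoped context: callers pick the implementation
// only where the dot operation is the bottleneck.
interface LinAlgContext {
    double[][] dot(double[][] a, double[][] b);
}

// Sequential default: for small matrices, thread overhead would dominate.
class SequentialContext implements LinAlgContext {
    public double[][] dot(double[][] a, double[][] b) {
        int n = a.length, k = b.length, m = b[0].length;
        double[][] out = new double[n][m];
        for (int i = 0; i < n; i++)
            for (int l = 0; l < k; l++)
                for (int j = 0; j < m; j++)
                    out[i][j] += a[i][l] * b[l][j];
        return out;
    }
}

// Parallel variant: each row is an independent task with enough arithmetic
// to pay for being scheduled on the common fork-join pool.
class ParallelContext implements LinAlgContext {
    public double[][] dot(double[][] a, double[][] b) {
        int n = a.length, k = b.length, m = b[0].length;
        double[][] out = new double[n][m];
        IntStream.range(0, n).parallel().forEach(i -> {
            for (int l = 0; l < k; l++)
                for (int j = 0; j < m; j++)
                    out[i][j] += a[i][l] * b[l][j];
        });
        return out;
    }
}
```

Both contexts compute the same product; switching between them changes only where the work runs, which is why restricting the parallel context to the one hot operation avoids paying its overhead everywhere else.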
i
If it's just an example, then more experiments should be done, with tensors for instance.
a
Indeed. But tensors require a lot of cleanup, which I can't do right now.