< Pavel Gorgulov> Hello could you provide an update on this kotlinlang #datascience

Alexandre Brown

04/03/2022, 5:45 AM

@Pavel Gorgulov Hello, could you provide an update on this issue? https://github.com/Kotlin/multik/issues/92 We are seeing very high memory usage: RAM spikes from 750 MB to 9 GB after preprocessing, where we create a tensor of shape 500, 3, 512, 612 using

mk.ndarray(it, batchSize, 3, targetHeight, targetWidth)

I tried chunking it into batches of 4, but I still hit 5 GB of RAM usage from this line.
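For scale, the raw float payload implied by that shape can be worked out directly. The sketch below is only arithmetic on the shape quoted above; the helper names `elementCount` and `payloadBytes` are hypothetical and are not part of Multik or KotlinDL.

```kotlin
// Number of elements implied by a shape, accumulated in Long to avoid Int overflow.
fun elementCount(shape: IntArray): Long = shape.fold(1L) { acc, dim -> acc * dim }

// Raw payload in bytes for a dense tensor of primitive values.
// The default of 4 bytes per element corresponds to a 32-bit Float.
fun payloadBytes(shape: IntArray, bytesPerElement: Int = 4): Long =
    elementCount(shape) * bytesPerElement
```

For the shape 500, 3, 512, 612 this gives 470,016,000 floats, or about 1.88 GB of primitive data. A batch of 4 images holds about 15 MB of primitive data, so the reported spikes are far larger than the payload itself.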

Pavel Gorgulov

04/03/2022, 12:50 PM

What is “it” here?

Pavel Gorgulov

04/03/2022, 1:01 PM

If you are working with arrays of a specific type, you can convert them to a primitive array:

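val arr: Array<Float>  // boxed: every element is a separate Float object
...
arr.toFloatArray()     // copies the values into a primitive FloatArray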

This will still increase memory consumption, but not by that much. The problem is general and related to boxing/unboxing. Adding methods for Array<Number> is not difficult and they will appear in the new release as soon as I am done with kotlin-native and build for Arm.
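To make the boxing cost concrete, here is a minimal sketch of the conversion with a rough footprint estimate. The helper names are hypothetical. The default sizes (16 bytes per boxed Float object, 4 bytes per reference) are assumptions that are typical for a 64-bit HotSpot JVM with compressed references, and they may differ on other JVMs or settings.

```kotlin
// Copies a boxed Array<Float> into a primitive FloatArray using the stdlib conversion.
fun unbox(boxed: Array<Float>): FloatArray = boxed.toFloatArray()

// Rough estimate for n boxed floats: one object plus one reference per element.
// objectBytes and referenceBytes are assumed values, not measurements.
fun boxedBytes(n: Long, objectBytes: Int = 16, referenceBytes: Int = 4): Long =
    n * (objectBytes + referenceBytes)

// Primitive storage: 4 bytes per float, ignoring the small array header.
fun primitiveBytes(n: Long): Long = n * 4
```

Under these assumptions a boxed array costs roughly five times as much as a primitive one. While `unbox` runs, both copies are alive, which is why the conversion still raises memory use temporarily.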

Alexandre Brown

04/03/2022, 1:18 PM

it is a list of FloatArray. Basically, I have images that get preprocessed using KotlinDL. Each image gives a FloatArray, then I convert the images (list of FloatArray) to a MultiArray<Float, D4> using Multik. Let me try your suggestion of using Arrays.

Pavel Gorgulov

04/03/2022, 2:03 PM

In ndarray, the data is one flat primitive array, so in your case you need to collect all the FloatArrays into a single array. You have to be careful here, because if you do it iteratively or use a List for this, much more memory can be used than necessary. You also need to be careful with the dimensions. Unfortunately, there are no convenient methods for this out of the box. We have plans for the future to add Multik to KotlinDL, maybe @Julia Beliaeva or @zaleslaw will comment on this point.
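One way to do the collection step described above is to allocate the flat array once and copy each image into its slot. This is a sketch, not the library's method. `flattenImages` and `elementsPerImage` are hypothetical names, and the sketch assumes every image has the same length (channels × height × width).

```kotlin
// Copies equally sized images into one preallocated flat FloatArray,
// avoiding intermediate lists and repeated array growth.
fun flattenImages(images: List<FloatArray>, elementsPerImage: Int): FloatArray {
    // A JVM array is indexed by Int, so the total element count must fit in an Int.
    val total = images.size.toLong() * elementsPerImage
    require(total <= Int.MAX_VALUE) { "Batch too large for a single array: $total elements" }

    val flat = FloatArray(total.toInt())
    images.forEachIndexed { index, image ->
        // Check dimensions before copying so a wrong size fails loudly.
        require(image.size == elementsPerImage) {
            "Image $index has ${image.size} elements, expected $elementsPerImage"
        }
        image.copyInto(flat, destinationOffset = index * elementsPerImage)
    }
    return flat
}
// The result could then be passed to Multik, for example
// mk.ndarray(flat, batchSize, 3, targetHeight, targetWidth),
// assuming your Multik version has an overload taking a FloatArray and four dimensions.
```

The source list and the flat array coexist until the list is released, so peak usage is about twice the payload. Smaller batches keep that peak proportionally smaller.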
