# datascience
a
@Pavel Gorgulov Hello, could you provide an update on this issue? https://github.com/Kotlin/multik/issues/92 We are seeing very high memory usage spikes, from 750 MB to 9 GB of RAM, after preprocessing, where we create a tensor of shape (500, 3, 512, 612) using:
mk.ndarray(it, batchSize, 3, targetHeight, targetWidth)
I tried chunking it into batches of 4, but I still hit 5 GB of RAM usage from this line.
👀 1
p
What is "it" here?
If you are working with arrays of a specific type, you can convert them to a primitive array:
val arr: Array<Float>
...
arr.toFloatArray()
This will still increase memory consumption, but not by as much. The problem is a general one and comes from boxing/unboxing. Adding methods for Array<Number> is not difficult, and they will appear in the new release as soon as I am done with Kotlin/Native and the build for Arm.
👀 1
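A rough back-of-the-envelope check of the boxing point above, as a sketch only. The shape (500, 3, 512, 612) comes from the thread. The per-element cost of a boxed Float (about 16 bytes per object plus a 4-byte reference) is an assumption that is typical for a 64-bit HotSpot JVM with compressed pointers, not a measured value, and elementCount is a hypothetical helper.

```kotlin
// Hypothetical helper: number of elements in a tensor with the given dimensions.
// Accumulates in Long so large shapes do not overflow Int.
fun elementCount(vararg dims: Int): Long = dims.fold(1L) { acc, d -> acc * d }

fun main() {
    val n = elementCount(500, 3, 512, 612)

    // Payload of one flat FloatArray: 4 bytes per element.
    val primitiveBytes = n * Float.SIZE_BYTES

    // ASSUMPTION: ~16 bytes per boxed Float object plus a 4-byte reference
    // in the holding array or list; real numbers depend on the JVM and flags.
    val boxedBytes = n * (16 + 4)

    println("elements:     $n")
    println("FloatArray:   about ${primitiveBytes / 1_000_000} MB")
    println("Array<Float>: about ${boxedBytes / 1_000_000} MB")
}
```

Under these assumptions the primitive array alone is a little under 2 GB, and a fully boxed copy lands in the same range as the 9 GB spike reported above.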
a
"it" is a list of FloatArray. Basically, I have images that get preprocessed using KotlinDL; each image gives a FloatArray, and then I convert the images (a list of FloatArray) to a MultiArray<Float, D4> using Multik. Let me try your suggestion of using arrays.
p
In ndarray, the data is stored as one flat primitive array, so in your case you need to collect all the FloatArray instances into a single array. Be careful here: if you do it iteratively or go through a List, much more memory can be used than necessary. You also need to be careful with the dimensions. Unfortunately, there are no convenient methods for this out of the box. We plan to add Multik to KotlinDL in the future; maybe @Julia Beliaeva or @zaleslaw will comment on this point.
👀 1
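One possible way to do the flattening described above, as a minimal sketch that uses only the Kotlin standard library. flattenImages is a hypothetical helper name, not part of Multik or KotlinDL. It preallocates a single FloatArray and copies each image into its slot, so no boxed Float values and no intermediate List<Float> are created. It assumes every image has the same length and is already laid out in the channel order the target shape expects.

```kotlin
// Hypothetical helper: copy equally sized FloatArrays into one flat FloatArray.
fun flattenImages(images: List<FloatArray>): FloatArray {
    if (images.isEmpty()) return FloatArray(0)
    val perImage = images[0].size
    // Single allocation for the whole batch.
    val out = FloatArray(images.size * perImage)
    images.forEachIndexed { i, img ->
        require(img.size == perImage) {
            "image $i has ${img.size} values, expected $perImage"
        }
        // Primitive bulk copy into the i-th slot of the flat buffer.
        img.copyInto(out, destinationOffset = i * perImage)
    }
    return out
}

fun main() {
    // Tiny stand-in batch: 2 "images" of 3 * 2 * 2 values each.
    val images = List(2) { b -> FloatArray(3 * 2 * 2) { j -> (b * 100 + j).toFloat() } }
    val flat = flattenImages(images)
    println(flat.size)
    // With Multik on the classpath, the flat array would then be handed over
    // in the same form as the call from the thread, e.g.
    //   mk.ndarray(flat, batchSize, 3, targetHeight, targetWidth)
    // ASSUMPTION: an overload taking a FloatArray plus four dimensions is available.
}
```

Peak memory with this approach is roughly the source images plus one flat copy; releasing the references to the source arrays after the copy lets the garbage collector reclaim them.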