# kotlindl
m
Hi, I am relatively new to KotlinDL and have a question about using a custom dataset. I want to do some object recognition and have a labeled dataset in the YOLO format, meaning:
dataset/images/train/*.png
dataset/images/test/*.png
dataset/labels/train/*.txt
dataset/labels/test/*.txt
Each image corresponds to the txt file with the same name. For instance, `1.png` in the `/images/train/` directory corresponds to `1.txt` in the `labels/train` directory. A txt file is either empty (if there is no object in the corresponding image) or contains rows like `0 0.403260 0.570903 0.090604 0.160194`. Each row represents one object labeled in the corresponding image: the first element is the class, and the other four elements describe the bounding box (normalized center x, center y, width, and height). I want to use the `OnHeapDataset`. I was a bit confused by the use of FloatArray everywhere, but I think I got the `featuresExtractor` working:
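```kotlin
// Needs java.io.File, javax.imageio.ImageIO, java.awt.image.BufferedImage
// and java.awt.image.BufferedImage.TYPE_INT_RGB.
featuresExtractor = { basePath ->
    println("extracting features in $basePath")
    val folder = File(basePath)

    // listFiles() returns null when the path is not a readable directory.
    val files = folder.listFiles() ?: error("Not a directory: $basePath")

    val res = files.map { image ->
        println("converting ${image.absolutePath}")
        // ImageIO.read() returns null when no reader can decode the file.
        val imageWithAlphaChannel = ImageIO.read(image)
            ?: error("Cannot read image: ${image.absolutePath}")

        // Copy the pixels into an RGB-only image, which drops the alpha channel.
        val imageWithoutAlphaChannel = BufferedImage(
            imageWithAlphaChannel.width,
            imageWithAlphaChannel.height,
            TYPE_INT_RGB
        ).apply {
            for (x in 0 until imageWithAlphaChannel.width) {
                for (y in 0 until imageWithAlphaChannel.height) {
                    val rgb = imageWithAlphaChannel.getRGB(x, y)
                    setRGB(x, y, rgb)
                }
            }
        }

        // `pipline` is defined elsewhere; `.first` is the image as a FloatArray.
        pipline.apply(imageWithoutAlphaChannel).first
    }.toTypedArray()
    println("finished extracting features in $basePath")
    res
}
```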
However, I am unsure how to build the `labelExtractor`, since it expects `(String, Int) -> FloatArray`. So I am unable to encode the described row structure (a class plus four bounding-box values per object) into a `FloatArray`, as I would need an `Array<FloatArray>` to encode the information for the bounding boxes. I hope my question makes sense and that I am not completely on the wrong track 😄
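For reference, the label files described above can be read in plain Kotlin, independent of any KotlinDL dataset class. This is only a minimal sketch: it assumes the common YOLO row layout (class, then normalized center x, center y, width, height), and `YoloBox`, `parseYoloLabels`, and `labelFileFor` are hypothetical names that are not part of KotlinDL.

```kotlin
import java.io.File

// Hypothetical holder for one labeled object; not a KotlinDL type.
data class YoloBox(
    val classId: Int,
    val xCenter: Float,
    val yCenter: Float,
    val width: Float,
    val height: Float
)

// Parses the text of one label file. An empty file yields an empty list,
// which matches the "no object in the image" case.
fun parseYoloLabels(text: String): List<YoloBox> =
    text.lineSequence()
        .map { it.trim() }
        .filter { it.isNotEmpty() }
        .map { line ->
            val parts = line.split(Regex("\\s+"))
            require(parts.size == 5) { "Expected 5 values per row, got: $line" }
            YoloBox(
                classId = parts[0].toInt(),
                xCenter = parts[1].toFloat(),
                yCenter = parts[2].toFloat(),
                width = parts[3].toFloat(),
                height = parts[4].toFloat()
            )
        }
        .toList()

// Assumption: an image and its label file share the same base name,
// e.g. images/train/1.png pairs with labels/train/1.txt.
fun labelFileFor(image: File, labelsDir: File): File =
    File(labelsDir, image.nameWithoutExtension + ".txt")
```

With these helpers, something like `parseYoloLabels(labelFileFor(image, labelsDir).readText())` would give a variable-length list of boxes per image, which is exactly the shape that does not fit a single fixed-size `FloatArray` label.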
j
Hi, thank you for the question. Unfortunately, the current dataset classes only support classification, so creating an `OnHeapDataset` for your task is not currently possible.
m
Alright, that's good to know. I'll keep an eye on the issue. Thank you 🙂