Paulo Cereda
04/11/2025, 5:27 PM
I have a .csv file in which one of the columns has a comma-separated string. I would like to split it and have the line replicated for each element. I have working code, but it's far from optimal. Code in thread. 🧵
Paulo Cereda
04/11/2025, 5:28 PM
Paulo Cereda
04/11/2025, 5:28 PM
// ➜ cat test.csv
// id,list
// 1,"a,b,c"
// 2,"d"
// 3,"e,f"
data class Thing(val id: Int, val list: String)

DataFrame
    .readCSV("test.csv")
    .toListOf<Thing>()
    .flatMap { thing ->
        thing.list.split(",").map { thing.copy(list = it) }
    }
    .toDataFrame()
    .print()
//    id list
//  0  1    a
//  1  1    b
//  2  1    c
//  3  2    d
//  4  3    e
//  5  3    f
Paulo Cereda
04/11/2025, 5:28 PM
roman.belov
04/11/2025, 6:33 PM
roman.belov
04/11/2025, 6:36 PM
Paulo Cereda
04/11/2025, 6:49 PM
roman.belov
04/12/2025, 7:51 AM
DataFrame.readDelimStr(csv)
    .split("list").by(',').inplace()
    .explode("list")
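For readers unfamiliar with the two steps in the snippet above, here is a minimal plain-Kotlin sketch of what they appear to do: first turn the comma-separated string into a list in the same column, then emit one row per list element. This is only a model under stated assumptions, not the DataFrame library's implementation. The types `RawRow`, `SplitRow`, `ExplodedRow` and the functions `splitInPlace` and `explode` are hypothetical names introduced for illustration, and trimming each element is an assumption.

```kotlin
// Hypothetical row types for illustration; not part of the DataFrame library.
data class RawRow(val id: Int, val list: String)
data class SplitRow(val id: Int, val list: List<String>)
data class ExplodedRow(val id: Int, val list: String)

// Step 1: models split("list").by(',').inplace().
// The string column becomes a list column; trimming is an assumption.
fun splitInPlace(rows: List<RawRow>): List<SplitRow> =
    rows.map { row -> SplitRow(row.id, row.list.split(',').map { it.trim() }) }

// Step 2: models explode("list").
// Each list element gets its own row; the other columns are repeated.
fun explode(rows: List<SplitRow>): List<ExplodedRow> =
    rows.flatMap { row -> row.list.map { element -> ExplodedRow(row.id, element) } }

fun main() {
    // Same data as test.csv from the question.
    val rows = listOf(RawRow(1, "a,b,c"), RawRow(2, "d"), RawRow(3, "e,f"))
    explode(splitInPlace(rows)).forEach { println("${it.id} ${it.list}") }
}
```

Keeping the two steps separate mirrors the chained calls: the intermediate result still has one row per `id`, and only the second step changes the row count.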