altavir
05/03/2021, 5:54 PM
altavir
05/03/2021, 6:01 PM
df = DataFrame(A = 1:4, B = ["M", "F", "F", "M"])
While such things are not possible in a type-safe language, we can do something similar that is much safer:
val a by symbol
val b by symbol
val df = DataFrame {
    a(1..4)
    b("M", "F", "F", "M")
}
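To make the builder idea concrete, here is a minimal sketch of how it could be wired up. Everything in it (the Symbol class, the symbol delegate, DataFrameBuilder and the map-backed DataFrame) is a hypothetical stand-in written for illustration, not the actual KMath or Kotlin DataFrame API.

```kotlin
import kotlin.properties.ReadOnlyProperty

// Hypothetical stand-in for a symbol type: a named column identifier.
data class Symbol(val identity: String)

// Property delegate so that `val a by symbol` yields Symbol("a").
val symbol: ReadOnlyProperty<Any?, Symbol> =
    ReadOnlyProperty { _, property -> Symbol(property.name) }

// Hypothetical minimal data frame: column name -> column values.
class DataFrame(val columns: Map<String, List<Any?>>)

class DataFrameBuilder {
    private val columns = LinkedHashMap<String, List<Any?>>()

    // Member extensions: Symbol becomes callable only inside the builder scope.
    operator fun Symbol.invoke(vararg values: Any?) {
        columns[identity] = values.toList()
    }

    operator fun Symbol.invoke(range: IntRange) {
        columns[identity] = range.toList()
    }

    fun build(): DataFrame = DataFrame(columns.toMap())
}

// Factory function named like the class, so `DataFrame { ... }` reads as in the snippet.
fun DataFrame(block: DataFrameBuilder.() -> Unit): DataFrame =
    DataFrameBuilder().apply(block).build()

val a by symbol
val b by symbol

fun main() {
    val df = DataFrame {
        a(1..4)
        b("M", "F", "F", "M")
    }
    println(df.columns)
}
```

Because the invoke operators are members of the builder, calling a(1..4) outside a DataFrame { } block does not compile in this sketch.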
Column constructors here could be implemented as member extensions such as Symbol.invoke(). Currently they must all be pre-defined in the builder, but with multiple receivers we could add them as extensions.
altavir
05/03/2021, 6:03 PM
julia> df.A
Currently there is an ongoing effort by the Kotlin DataFrame team to do that via staged compilation, but in KMath we found another way: we can use pre-defined symbol objects as identifiers, so we could do things like this:
val A by symbol
df[A]
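The two lines above can be fleshed out roughly as follows. This is only a sketch under assumptions: Symbol, the symbol delegate and the map-backed DataFrame are hypothetical names made up for the example, and the lookup is untyped (it returns List<Any?>).

```kotlin
import kotlin.properties.ReadOnlyProperty

// Hypothetical symbol type: identifies a column by name.
data class Symbol(val identity: String)

// Delegate so that `val A by symbol` yields Symbol("A").
val symbol: ReadOnlyProperty<Any?, Symbol> =
    ReadOnlyProperty { _, property -> Symbol(property.name) }

// Hypothetical minimal data frame backed by a map of columns.
class DataFrame(private val columns: Map<String, List<Any?>>) {
    // Enables df[A]; fails loudly if the column does not exist.
    operator fun get(symbol: Symbol): List<Any?> =
        columns[symbol.identity] ?: error("No column named '${symbol.identity}'")
}

val A by symbol

fun main() {
    val df = DataFrame(
        mapOf(
            "A" to listOf(1, 2, 3, 4),
            "B" to listOf("M", "F", "F", "M")
        )
    )
    println(df[A])
}
```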
One could use column definitions instead and get a type-safe accessor.
altavir
05/03/2021, 6:05 PM
altavir
05/03/2021, 6:08 PM
julia> df[(df.A .> 500) .& (300 .< df.C .< 400), :]
Well, it is hard to do that, and I am pretty sure we should not do it this way. It is better to introduce a DataFrameQuery object and a builder for it, like this:
df.query {
    a { it > 500 }
    c { it in 300..400 }
}
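A rough sketch of what such a query object and its builder might look like is given below. It assumes integer-valued columns and a row-oriented frame purely to keep the example short. Only the DataFrameQuery name comes from the message above; DataFrameQueryBuilder, Row and the rest are hypothetical.

```kotlin
import kotlin.properties.ReadOnlyProperty

// Hypothetical symbol type and delegate (`val a by symbol` yields Symbol("a")).
data class Symbol(val identity: String)

val symbol: ReadOnlyProperty<Any?, Symbol> =
    ReadOnlyProperty { _, property -> Symbol(property.name) }

// Assumption for brevity: every cell is an Int and a row maps column name -> value.
typealias Row = Map<String, Int>

// A query is a list of per-column predicates; a row matches if all of them hold.
class DataFrameQuery(private val conditions: List<Pair<String, (Int) -> Boolean>>) {
    fun matches(row: Row): Boolean = conditions.all { (name, predicate) ->
        val value = row[name]
        value != null && predicate(value)
    }
}

class DataFrameQueryBuilder {
    private val conditions = mutableListOf<Pair<String, (Int) -> Boolean>>()

    // Enables `a { it > 500 }` inside the query block.
    operator fun Symbol.invoke(predicate: (Int) -> Boolean) {
        conditions += identity to predicate
    }

    fun build(): DataFrameQuery = DataFrameQuery(conditions.toList())
}

class DataFrame(val rows: List<Row>) {
    fun query(block: DataFrameQueryBuilder.() -> Unit): DataFrame {
        val query = DataFrameQueryBuilder().apply(block).build()
        return DataFrame(rows.filter { query.matches(it) })
    }
}

val a by symbol
val c by symbol

fun main() {
    val df = DataFrame(
        listOf(
            mapOf("a" to 600, "c" to 350),
            mapOf("a" to 100, "c" to 350),
            mapOf("a" to 700, "c" to 500)
        )
    )
    val result = df.query {
        a { it > 500 }
        c { it in 300..400 }
    }
    println(result.rows)
}
```

Since the query is an ordinary object, it can be built once and reused or inspected before it is applied. Note that 300..400 is an inclusive range, whereas the Julia comparison 300 .< df.C .< 400 is strict.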
altavir
05/03/2021, 6:10 PM
altavir
05/03/2021, 6:12 PM
Ilya Muradyan
05/03/2021, 9:00 PM
Symbolic accessors
AFAIK it is already implemented this way: https://github.com/nikitinas/dataframe/blob/bc9feae6d8200499bf7d7a0fbb45bb0491b90e[…]rc/test/kotlin/org/jetbrains/dataframe/person/DataFrameTests.kt
Ilya Muradyan
05/03/2021, 9:03 PM
Accessor by expression
Nice point, but a simple (DataFrame) -> Boolean filter is generally enough.
Ilya Muradyan
05/03/2021, 9:07 PM
altavir
05/04/2021, 7:24 AM
altavir
05/07/2021, 10:15 AM
altavir
05/07/2021, 10:16 AM
holgerbrandl
07/16/2021, 8:26 PM