# datascience
i
Hello, I have trouble making the compiler recognise a dataframe's columns.
Context
• I have a dataframe with some columns
• Based on row values within these columns, I want to create another column with some specific value (this is essentially the CASE WHEN syntax in SQL)
Example syntax
val df2: DataFrame<*> = df1
    .add("derivedCol") {
        when {
            colA == 0 -> "NewValueA"
            colB == 0 -> "NewValueB"
            else -> "Error"
        }
    }
    .remove { colA and colB }
In Kotlin Notebook
• Both colA and colB are marked with an 'Unresolved reference' error
• However, the dataframe df2 is still created without issue, and I can manipulate and work with it in subsequent cells (along with the new column derivedCol)
• The code in the notebook is for prototyping; once ready, it will be transferred to a Kotlin file
In source code
• I am running the same code as I've done in the notebook above
• The same error is being shown for colA and colB
• Now, however, I am unable to build my code
My questions
• How does one make a dataframe's column recognisable to the Kotlin compiler?
• Why did it work in the Kotlin notebook but not in my Kotlin file?
j
Hi! First of all, thank you for your extensive explanation about DataFrame in this channel :)
Your question is about accessing columns via auto-generated extension properties, right? This is described in the docs: https://kotlin.github.io/dataframe/extensionpropertiesapi.html
The docs link to how to achieve this in Gradle projects and in notebooks, which behave differently here.
TL;DR: in notebooks, these accessors are generated in between cell executions, based on the schema/data in the dataframe: https://kotlin.github.io/dataframe/schemasjupyter.html (If they appear unresolved but execute fine, this is likely a bug in the notebook plugin, which is also still under active development :) )
In Gradle projects, we have a Gradle plugin that can generate these accessors for @DataSchema interfaces/classes you define yourself, or you can let them be generated for you based on a data sample or type definitions: https://kotlin.github.io/dataframe/schemasgradle.html#execute-the-assemble-task-to-generate-type-safe-accessors-for-schemas
We're also working on a compiler plugin that can generate these accessors on the fly, without having to define a @DataSchema yourself, but this is still experimental: https://github.com/Kotlin/dataframe/issues/704
I hope this answers your question :)
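If the generated accessors are not available yet in a plain Kotlin file, one way to get the same CASE WHEN logic compiling is the String API, which looks columns up by name at runtime instead of relying on extension properties. A minimal sketch, assuming colA and colB hold Int values; the sample data is made up for illustration and is not from the original question:

```kotlin
import org.jetbrains.kotlinx.dataframe.api.*

fun main() {
    // Hypothetical sample data: two Int columns named as in the question.
    val df1 = dataFrameOf("colA", "colB")(
        0, 1, // colA == 0, so this row should get "NewValueA"
        1, 0, // colB == 0, so this row should get "NewValueB"
        1, 1, // neither is 0, so this row should get "Error"
    )

    val df2 = df1
        .add("derivedCol") {
            when {
                // "name"<Type>() reads the current row's value by column name.
                // The type argument is an assumption: a mismatch only fails at runtime.
                "colA"<Int>() == 0 -> "NewValueA"
                "colB"<Int>() == 0 -> "NewValueB"
                else -> "Error"
            }
        }
        // Columns are removed by name, so no generated accessors are needed here either.
        .remove("colA", "colB")

    df2.print()
}
```

The trade-off is that column names and types are no longer checked by the compiler, which is what the @DataSchema route with the Gradle plugin gives you.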
i
Hello @Jolan Rensen [JB], I got it to work! Thanks for pointing me to the docs and pardon the delayed reply
🎉 1