Do you want to make a `@DataSchema` annotated inte...
# datascience
n
Do you want to make a `@DataSchema` annotated interface sealed?
y
Would that be possible? My thinking is that it would give a generic way to pass the data schema object down to a parser function, which could then dispatch to the respective sealed subtypes to transform the data schema object.
n
I think we can figure something out, maybe with some extra code generation on the KSP / compiler plugin side. But I still need to understand better what the final result should be. Logic like this? (in pseudocode)
@DataSchema
sealed interface Test {
    val a: Int
}

@DataSchema
interface Test2 : Test {
    val b: String
}

@DataSchema
interface Test3 : Test {
    val c: String
}

fun transform(df: DataFrame<*>) {
  val typed = Test.of(df)
  when (typed) {
    is Test2 -> ...
    is Test3 -> ...
    is Test -> ... // base schema only; checked last so it does not shadow the subtypes
    null -> ...
  }
}
y
Thanks, yes, that was my use case: extract from the CSV dataframe, transform based on custom rules, and load the result into the database (Iceberg tables through the Spark API or Trino).
n
I see. So at the time of reading this CSV, you know it must be one of N schemas. You asked about `isinstance` before, which seems connected. I understand better now. I have this idea in mind:
@SealedDataSchema
sealed interface Test {
    val a: Int
    
    companion object
}

@DataSchema
interface Test2 : Test {
    val b: String
}

@DataSchema
interface Test3 : Test {
    val c: String
}
generated code:
sealed class Wrapper {
    abstract val df: DataFrame<Test>
}

class Wrapper2(override val df: DataFrame<Test2>) : Wrapper()

class Wrapper3(override val df: DataFrame<Test3>) : Wrapper()

fun Test.Companion.of(df: DataFrame<*>): Wrapper? {
    TODO("need to figure out this part, need to pick most suitable schema for df")
}
use:
fun transform(df: DataFrame<*>) {
  val typed = Test.of(df)
  when (typed) {
    is Wrapper2 -> typed.df.b
    is Wrapper3 -> typed.df.c
    is Wrapper -> typed.df.a // checked last: every wrapper is a Wrapper, so this would shadow the others if it came first
    null -> ...
  }
}
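For the `TODO` in `of`, one possible rule is "pick the candidate schema with the most required columns that are all present in the dataframe with matching types". Below is a minimal sketch of that rule in plain Kotlin, not the DataFrame library's API: `SchemaSpec` and `pickSchema` are hypothetical names, and the dataframe is stood in for by a map of column name to column type.

```kotlin
import kotlin.reflect.KClass

// Hypothetical stand-in for a @DataSchema interface: a name plus its required columns.
data class SchemaSpec(val name: String, val columns: Map<String, KClass<*>>)

// Returns the candidate with the most required columns, all of which exist in
// `actual` with the same type, or null when no candidate fits.
fun pickSchema(actual: Map<String, KClass<*>>, candidates: List<SchemaSpec>): SchemaSpec? =
    candidates
        .filter { spec -> spec.columns.all { (col, colType) -> actual[col] == colType } }
        .maxByOrNull { it.columns.size }

fun main() {
    val test = SchemaSpec("Test", mapOf("a" to Int::class))
    val test2 = SchemaSpec("Test2", test.columns + ("b" to String::class))
    val test3 = SchemaSpec("Test3", test.columns + ("c" to String::class))
    val candidates = listOf(test, test2, test3)

    // Columns as they might be inferred from a CSV file (assumed for illustration).
    val csvColumns: Map<String, KClass<*>> = mapOf("a" to Int::class, "c" to String::class)
    println(pickSchema(csvColumns, candidates)?.name)                // Test3
    println(pickSchema(mapOf("x" to Int::class), candidates)?.name)  // null
}
```

A dataframe with columns `a`, `b` and `c` would match both `Test2` and `Test3` equally well; this sketch simply returns the first one in the list, so a real implementation would probably need an explicit tie-breaking rule.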
y
Thanks, this will be very useful!
n