Hey there, I've been messing with kotlin trying to...
# announcements
a
Hey there, I've been messing with Kotlin, trying to create a class that will be used by an Apache Spark enabled application. Previously I had a class with this method:
override fun getFiles(path: String, sc: SparkContext): RDD<*> {
    sc.hadoopConfiguration().set("fs.file.impl", org.apache.hadoop.fs.LocalFileSystem::class.java.name)
    return sc.binaryFiles(path, 1)
}
That was fine and it all worked properly. But now I have to add a .filter() to the returned RDD, and I'm struggling to get the code to compile, even though I don't get any errors in the IDE. The smallest example I can give is:
override fun getFiles(path: String, sc: SparkContext): RDD<*> {
    sc.hadoopConfiguration().set("fs.file.impl", org.apache.hadoop.fs.LocalFileSystem::class.java.name)
    return sc.binaryFiles(path, 1).filter { true }
}
The filter should just return true for every element, so it keeps everything and changes nothing. But I get this error when trying to build:
Error:(56, 47) Kotlin: Type mismatch: inferred type is () -> Boolean but Function1<Tuple2<String!, PortableDataStream!>!, Any!>! was expected
The closest I have got to making it work is:
override fun getFiles(path: String, sc: SparkContext): RDD<*> {
    sc.hadoopConfiguration().set("fs.file.impl", org.apache.hadoop.fs.LocalFileSystem::class.java.name)
    return sc.binaryFiles(path, 1).filter(getFilterFunction())
}

private fun getFilterFunction(): scala.Function1<Tuple2<String, PortableDataStream>, Any> {
    return Function1<Tuple2<String, PortableDataStream>, Any> { tuple: Tuple2<String, PortableDataStream> ->
        return@Function1 true
    }
}
But then I get:
Error:(60, 16) Kotlin: Interface Function1 does not have constructors
Does anyone have experience interfacing with Apache Spark from Kotlin?
o
#C4W52CFEZ
a
Thanks, I'll try posting this there.
k
I've been doing Apache Spark in Kotlin for the last 6+ months.
You might want to consider using the JavaRDD API.
You're currently using the Scala API, which does not interop well with Kotlin lambdas.
The Java API accepts Java interfaces, so the interop is much better.
If you need to recover the Scala RDD object, there is a getter for that.
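Roughly, the suggestion could look like the sketch below. It assumes the existing SparkContext is wrapped in a JavaSparkContext, and that the getter meant here is `rdd()` on the Java-side RDD. The function name `getFilteredFiles` and the always-true predicate are placeholders, not part of the original class.

```kotlin
import org.apache.spark.SparkContext
import org.apache.spark.api.java.JavaSparkContext
import org.apache.spark.input.PortableDataStream
import org.apache.spark.rdd.RDD
import scala.Tuple2

// Hypothetical standalone variant of getFiles that goes through the Java API.
fun getFilteredFiles(path: String, sc: SparkContext): RDD<Tuple2<String, PortableDataStream>> {
    sc.hadoopConfiguration().set("fs.file.impl", org.apache.hadoop.fs.LocalFileSystem::class.java.name)

    // Wrap the existing Scala context rather than creating a second one.
    val jsc = JavaSparkContext(sc)

    return jsc.binaryFiles(path, 1)
        // JavaPairRDD.filter takes org.apache.spark.api.java.function.Function,
        // a single-method Java interface, so a plain Kotlin lambda is accepted.
        .filter { true }
        // rdd() returns the underlying Scala RDD for callers that still expect one.
        .rdd()
}
```

The return type is still a Scala RDD, so it should fit a method declared as returning `RDD<*>` without touching the callers.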
a
Thanks and sorry I missed this. I'll definitely have a look into that