# datascience
Copied from #general : Hey there, I've been messing with Kotlin, trying to create a class that will be used by an Apache Spark enabled application. Previously I had a class with this method:
```kotlin
override fun getFiles(path: String, sc: SparkContext): RDD<*> {
    sc.hadoopConfiguration().set("fs.file.impl", org.apache.hadoop.fs.LocalFileSystem::class.java.name)
    return sc.binaryFiles(path, 1)
}
```
And that was fine; it all worked properly. But now I need to add a .filter() to the returned RDD, and I'm struggling to get the code to compile, even though the IDE doesn't show any errors. The smallest example I can give is:
```kotlin
override fun getFiles(path: String, sc: SparkContext): RDD<*> {
    sc.hadoopConfiguration().set("fs.file.impl", org.apache.hadoop.fs.LocalFileSystem::class.java.name)
    return sc.binaryFiles(path, 1).filter { true }
}
```
That should just return true from the filter and not drop anything, but I get this error when trying to build:
```text
Error:(56, 47) Kotlin: Type mismatch: inferred type is () -> Boolean but Function1<Tuple2<String!, PortableDataStream!>!, Any!>! was expected
```
The closest I have gotten to making it work was:
```kotlin
override fun getFiles(path: String, sc: SparkContext): RDD<*> {
    sc.hadoopConfiguration().set("fs.file.impl", org.apache.hadoop.fs.LocalFileSystem::class.java.name)
    return sc.binaryFiles(path, 1).filter(getFilterFunction())
}

private fun getFilterFunction(): scala.Function1<Tuple2<String, PortableDataStream>, Any> {
    return Function1<Tuple2<String, PortableDataStream>, Any> { tuple: Tuple2<String, PortableDataStream> ->
        return@Function1 true
    }
}
```
But I get:

```text
Error:(60, 16) Kotlin: Interface Function1 does not have constructors
```
Does anyone have experience with interfacing with Apache Spark from Kotlin?
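A note on what's likely going on here: scala.Function1 is a Scala trait, not a class, and it isn't a Kotlin `fun interface`, so neither a bare lambda nor the `Function1 { ... }` call syntax converts to it; that's where both errors come from. The usual fix is to implement it with an anonymous object expression (or extend scala.runtime.AbstractFunction1, which supplies the trait's concrete methods). Below is a minimal, self-contained sketch of that pattern; since Spark and Scala aren't on the classpath here, `ScalaStyleFunction1` is a hypothetical stand-in for scala.Function1:

```kotlin
// Hypothetical stand-in for scala.Function1: a plain interface that is NOT
// declared as a Kotlin `fun interface`, so SAM conversion does not apply.
interface ScalaStyleFunction1<T, R> {
    fun apply(arg: T): R
}

// `ScalaStyleFunction1<String, Any> { true }` would fail with the same
// "Interface ... does not have constructors" error. The fix is an anonymous
// object expression that implements the interface explicitly:
fun acceptAll(): ScalaStyleFunction1<Pair<String, ByteArray>, Any> =
    object : ScalaStyleFunction1<Pair<String, ByteArray>, Any> {
        override fun apply(arg: Pair<String, ByteArray>): Any = true
    }

fun main() {
    val keep = acceptAll()
    println(keep.apply("part-00000" to ByteArray(0)))
}
```

Against the real Spark API, the same shape would be something like `object : scala.runtime.AbstractFunction1<Tuple2<String, PortableDataStream>, Any>() { override fun apply(tuple: Tuple2<String, PortableDataStream>): Any = true }` passed straight to `.filter(...)` (assuming the Scala runtime on the classpath). Another option worth checking is Spark's Java-facing API: `JavaSparkContext.binaryFiles` returns a `JavaPairRDD` whose `filter` takes `org.apache.spark.api.java.function.Function`, which Kotlin lambdas convert to much more cleanly.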