Antero Duarte
10/01/2018, 12:27 PM
override fun getFiles(path: String, sc: SparkContext): RDD<*> {
    sc.hadoopConfiguration().set("fs.file.impl", org.apache.hadoop.fs.LocalFileSystem::class.java.name)
    return sc.binaryFiles(path, 1)
}
And that was fine, it all worked properly.
But now I had to add a .filter() to the returned RDD, and I'm struggling to get the code to compile, even though I don't get any errors in the IDE.
The smallest example I can give is:
override fun getFiles(path: String, sc: SparkContext): RDD<*> {
    sc.hadoopConfiguration().set("fs.file.impl", org.apache.hadoop.fs.LocalFileSystem::class.java.name)
    return sc.binaryFiles(path, 1).filter { true }
}
That should just return true from the filter and not filter anything out.
But I get this error when trying to build:
Error:(56, 47) Kotlin: Type mismatch: inferred type is () -> Boolean but Function1<Tuple2<String!, PortableDataStream!>!, Any!>! was expected
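A note on the inferred type in that error: a bare { true } in Kotlin is a zero-argument lambda, hence the () -> Boolean. Seen from Kotlin, the Scala-side RDD.filter wants a scala.Function1<Tuple2<String, PortableDataStream>, Any>, and a Kotlin lambda does not convert to Scala's function trait no matter how its parameters are declared. A tiny illustration (the value names are made up):

    import org.apache.spark.input.PortableDataStream
    import scala.Tuple2

    // what the compiler infers for a parameterless lambda
    val inferred: () -> Boolean = { true }
    // declaring the parameter fixes the arity, but this is a
    // kotlin.Function1, still not a scala.Function1
    val oneArg: (Tuple2<String, PortableDataStream>) -> Boolean = { true }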
The closest I have got to making it work was:
override fun getFiles(path: String, sc: SparkContext): RDD<*> {
    sc.hadoopConfiguration().set("fs.file.impl", org.apache.hadoop.fs.LocalFileSystem::class.java.name)
    return sc.binaryFiles(path, 1).filter(getFilterFunction())
}

private fun getFilterFunction(): scala.Function1<Tuple2<String, PortableDataStream>, Any> {
    return Function1<Tuple2<String, PortableDataStream>, Any> { tuple: Tuple2<String, PortableDataStream> ->
        return@Function1 true
    }
}
But I get
Error:(60, 16) Kotlin: Interface Function1 does not have constructors
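The second error is the real hint: scala.Function1 is compiled from a Scala trait, and in the Scala 2.11 builds that Spark 2.x typically targets the resulting interface has several abstract methods (apply plus its specialized variants), so Kotlin cannot SAM-convert a lambda to it or use the Function1 { ... } constructor syntax. One workaround is to subclass scala.runtime.AbstractFunction1, a concrete base class that ships with scala-library (already on Spark's classpath). A minimal sketch, with TruePredicate as a made-up name:

    import java.io.Serializable
    import org.apache.spark.input.PortableDataStream
    import scala.Tuple2
    import scala.runtime.AbstractFunction1

    // AbstractFunction1 provides everything except apply(), so only one
    // method needs overriding; Serializable because Spark serializes the
    // predicate and ships it to executors.
    class TruePredicate : AbstractFunction1<Tuple2<String, PortableDataStream>, Any>(), Serializable {
        override fun apply(tuple: Tuple2<String, PortableDataStream>): Any = true
    }

With that in place, the filter call becomes sc.binaryFiles(path, 1).filter(TruePredicate()).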
Does anyone have experience with interfacing with Apache Spark from Kotlin?
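An alternative sketch, assuming it is acceptable to go through Spark's Java API: JavaSparkContext exposes the same binaryFiles, and its filter takes org.apache.spark.api.java.function.Function, a plain Java SAM interface that Kotlin lambdas convert to automatically (getFilesViaJavaApi is an illustrative name, not the original method):

    import org.apache.spark.SparkContext
    import org.apache.spark.api.java.JavaSparkContext
    import org.apache.spark.rdd.RDD

    fun getFilesViaJavaApi(path: String, sc: SparkContext): RDD<*> {
        sc.hadoopConfiguration().set("fs.file.impl", org.apache.hadoop.fs.LocalFileSystem::class.java.name)
        return JavaSparkContext(sc)
            .binaryFiles(path, 1)
            .filter { true }   // SAM conversion to the Java Function interface, so a bare lambda works
            .rdd()             // unwrap back to the Scala RDD the original signature returns
    }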