# datascience
i
Hey folks! Using Datalore or a Jupyter notebook and Spark, is there any way to access local files? I’m getting:
No FileSystem for scheme "file"
org.apache.hadoop.fs.UnsupportedFileSystemException: No FileSystem for scheme "file"
a
You can't access files on your computer, since Datalore does not know anything about it. You can add files to your Datalore workspace, though.
i
I’ve added the file to the workspace in datalore and then I try to read it as follows:
val equipment = spark.read().json("allEquipment.json")
But I still get the same error. Have you been able to read a file from the workspace using spark?
I would be glad to read the file using standard Kotlin and, if helpful, convert it to JSON and feed it into Spark. But I have no idea how to do anything like this.
a
Not to the workspace. To attached files. It works differently from local jupyter since each notebook has its own isolated environment.
i
This is what I have:
I should probably mention it works with kotlin serialization and I’m able to get it working using the kotlin dataframe. But I am really hoping I can connect spark.
a
You should ask @Pasha Finkelshteyn, but if you know how to use Spark with a regular file, the notebook is no different.
p
@altavir thank you for mentioning me! All the experimenting is done by @Jolan Rensen [JB], though. It looks to me like you shouldn't need to use a scheme to access files at all. But I know nothing about the filesystem in Datalore.
a
It is a regular local Linux filesystem.
i
@Jolan Rensen [JetBrains] @Pasha Finkelshteyn I would love to see any samples of accessing files using spark on datalore or jupyter notebooks. My experience with Spark was years ago (when spark was new-ish) in Scala and I set up my own environments.
a
I am not familiar with spark, but probably something like this should do: https://spark.apache.org/docs/latest/sql-data-sources-text.html
i
Yeah that’s what I’ve been following. If you look at my example line it follows their example. 🤔
j
Oh that's weird, it appears I get
No FileSystem for scheme "file"
org.apache.hadoop.fs.UnsupportedFileSystemException: No FileSystem for scheme "file"
as well. Thanks for letting us know! Looks like a bug
Okay, I found a fix! It's related to https://stackoverflow.com/questions/17265002/hadoop-no-filesystem-for-scheme-file/27532248#27532248
It will be in the next release for sure! Until then, a temporary solution is to restart Spark with the right parameters: https://datalore.jetbrains.com/view/notebook/tMW3yTujL1lv08P4ZzBgtq
I created an issue for it: https://github.com/Kotlin/kotlin-spark-api/issues/164
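For anyone who can't open the notebook: the linked Stack Overflow answer boils down to telling Hadoop explicitly which class handles the `file` scheme. Below is a minimal sketch of that idea, not necessarily what the linked notebook does. It assumes Spark and Hadoop are already on the classpath and that you build the session yourself with plain `SparkSession.builder()`; the master URL and app name are placeholders.

```kotlin
import org.apache.spark.sql.SparkSession

// Assumption: no Spark session is running yet. If one is, it probably has to be
// stopped first (spark.stop()), otherwise getOrCreate() just returns the old one.
val spark = SparkSession.builder()
    .master("local[*]")            // placeholder: run Spark inside the notebook process
    .appName("local-files")        // placeholder name
    // Spark copies every "spark.hadoop.*" entry into the Hadoop Configuration,
    // so this sets fs.file.impl, as suggested in the Stack Overflow answer.
    .config("spark.hadoop.fs.file.impl", "org.apache.hadoop.fs.LocalFileSystem")
    .getOrCreate()

// Same read as in the question; a relative path resolves against the working directory.
val equipment = spark.read().json("allEquipment.json")
equipment.show()
```

If the file still isn't found, passing an absolute path to `json(...)` may help rule out a working-directory mismatch.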
i
Thank you so much! I really appreciate all your help!