We are experimenting with notebooks. Is there a wa...
# datascience
p
We are experimenting with notebooks. Is there a way to share code across multiple ipynb notebooks?
a
You can load any jar file or a directory with class files into a notebook via the @file:DependsOn directive.
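For illustration, a cell using this directive might look like the sketch below. The jar path and the package name are hypothetical placeholders, not something from an actual project.

```kotlin
// First cell of the notebook.
// Hypothetical absolute path to a jar compiled from your own project;
// a directory with compiled class files can be given the same way.
@file:DependsOn("/home/me/analysis-common/build/libs/analysis-common.jar")

// After the cell above has run, classes from the jar can be imported as usual, e.g.:
// import com.example.analysis.*   // hypothetical package inside the jar
```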
p
That doesn’t seem to work
Okay that was a different issue, it’s still not resolving:
a
You can't load kts files. You can load a compiled jar or classes.
By the way @Ilya Muradyan, great work!
p
Thanks! That’s a long document. It would be great if there were samples for this. Our current use case is: we are migrating away from Apache Zeppelin. My plan is to create a repository for all the analyses that we use for internal dev stuff. In the end we will use a JDBC connection and then transform the results of a SQL query into a dataframe. In the solution I’m picturing we’d have a src/main/kotlin structure where common queries and parsing are located. And then we’d have a bunch of ipynb’s, each targeted at a single analysis, which make use of these files. Does that concept make sense with the way these ipynb’s and Kotlin notebooks in general are set up?
a
I had to spend a few minutes searching, but I found proper documentation for annotation-based imports: https://github.com/Kotlin/KEEP/blob/master/proposals/scripting-support.md#kotlin-main-kts
You actually can use external kts via the @file:Import directive. You can also write a direct path to a jar inside DependsOn, but it must be a compiled jar.
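A minimal sketch of the import variant, assuming a hypothetical script file named common.main.kts sitting next to the notebook. How the path is resolved (relative to the notebook or to the kernel's working directory) is something to check in your setup.

```kotlin
// Hypothetical shared script; its top-level declarations become available
// to the following cells once this directive has been processed.
@file:Import("common.main.kts")
```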
p
Thanks! I’m still missing the greater picture here. How is this designed to work with some form of a locally built, reusable code infrastructure?
a
Let me find a demo...
Here it is: https://datalore.jetbrains.com/notebook/ptQDfQAcrjNxzIO0AEqovZ/rDxhvLe6OXRYcwgp3vz4OO. I use a jar file compiled from a project. I also can use an artifact, deployed to mavenLocal (it is actually a more reliable way, because then I don't have to think about dependencies).
Should that link from above be accessible without any kind of login?
a
The best way is to:
• Publish artefacts from your project to mavenLocal
• Use @file:Repository("*mavenlocal")
• Use your reusable code as a regular dependency.
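Put together, the steps above could look roughly like this in the first cell of a notebook. The coordinates are hypothetical, and the sketch assumes the project applies Gradle's maven-publish plugin so that ./gradlew publishToMavenLocal has already put the artifact into the local Maven repository.

```kotlin
// Repository token copied as written in the message above.
@file:Repository("*mavenlocal")

// Hypothetical coordinates: replace with your project's group, artifact name and version.
// Transitive dependencies are resolved from the published metadata.
@file:DependsOn("com.example:analysis-common:0.1.0")
```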
It requires a free Datalore account. But I recommend it anyway if you want to play with Kotlin notebooks.
This is an example for this talk:

https://youtu.be/4Sg2Qju67kE

(The talk itself is in Russian, but I guess you can read the subtitles.)
p
Do you think there is any way to automate that part? I’d like to use it as an edit-then-press-run replacement. I’d like to find a way to avoid writing documentation for my devs like: if you edit the files there, you need to run this script and then later update your notebooks.
Maybe some process.execute(./gradlew build) and then a file:import as the header of each notebook? In an ideal world I’d just have a structure like
./
├── build.gradle.kts
└── src
    └── main
        └── kotlin
            ├── helper.kt
            └── notebook1.ipynb
a
It depends on what the process for your common code is. If it is some kind of library with a regular development cycle, it is already automated. You deploy a new version to the repository and then ask your scripters to update the version number and reload their notebooks. They do not have to do anything else.
And yes, you can run a gradle build (probably better to use gradle publishToMavenLocal) from a notebook.
But I have a feeling that what you need is actually IDEA-based notebooks: https://plugins.jetbrains.com/plugin/16340-kotlin-notebook. They work in the project classpath already. You do not need to reload anything. It is still in development, but you can try it.
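A rough sketch of running the publish step from a notebook cell, using only the JDK's ProcessBuilder. The names CommandResult, runCommand and publishToMavenLocal are made up for this example, and it assumes the project has a Gradle wrapper (./gradlew) in its root.

```kotlin
import java.io.File

// Exit code plus combined stdout/stderr of a finished command.
data class CommandResult(val exitCode: Int, val output: String)

// Runs a command in workingDir and waits for it to finish.
fun runCommand(workingDir: File, vararg command: String): CommandResult {
    val process = ProcessBuilder(*command)
        .directory(workingDir)
        .redirectErrorStream(true) // merge stderr into stdout
        .start()
    // Read everything before waiting so the process cannot block on a full pipe.
    val output = process.inputStream.bufferedReader().use { it.readText() }
    return CommandResult(process.waitFor(), output)
}

// Publishes the project in projectDir to the local Maven repository,
// failing loudly (with the Gradle output) if the build does not succeed.
fun publishToMavenLocal(projectDir: File): String {
    val result = runCommand(projectDir, "./gradlew", "publishToMavenLocal")
    check(result.exitCode == 0) { "Gradle failed (${result.exitCode}):\n${result.output}" }
    return result.output
}
```

Calling publishToMavenLocal(File("/path/to/project")) with your own project path in an early cell would rebuild and republish the shared code. Whether the kernel then picks up the republished artifact without a restart is something to verify in your environment.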
@roman.belov can probably tell more about it.
p
Yes! That sounds way more like what I want
Now we’re talking!
a
On the plus side, no Python at all. But I still can't make it work properly on MPP projects. It is in active development though.
p
I did some magic for our upcoming department days and managed to convert a git commit history to a dataframe. This is really great stuff!
This is so nice!