andyg
10/24/2025, 3:35 AM
altavir
10/24/2025, 5:12 AM
altavir
10/24/2025, 5:15 AM
Didier Villevalois
10/24/2025, 8:05 AM
andyg
10/24/2025, 8:25 AM
andyg
10/24/2025, 8:29 AM
altavir
10/24/2025, 9:24 AM
Nikita Klimenko [JB]
10/24/2025, 11:41 AM
Could you use Capture Memory Snapshot in the Profiler tool window for the org.jetbrains.kotlinx.jupyter.IKotlinKt process (the running Kotlin Notebook) and share a screenshot? Here I have a dataframe with 10 million rows, mostly String columns.
Based on your profile we can figure something out. As an idea: for this specific dataframe, custom String interning could reduce the footprint a lot, because most columns have only about 10 unique String values.
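To make the interning idea concrete, here is a minimal sketch in plain Kotlin. It assumes nothing about the DataFrame API: `StringInterner` and `internAll` are hypothetical helpers, not part of the library, and the sample values are made up for illustration.

```kotlin
// Hypothetical helper, not part of Kotlin DataFrame: a pool that maps
// equal strings to one shared instance, so duplicates can be garbage collected.
class StringInterner {
    private val pool = HashMap<String, String>()

    // Returns the canonical instance for `value`, storing it on first sight.
    fun intern(value: String): String = pool.getOrPut(value) { value }

    // Number of distinct strings seen so far.
    val uniqueCount: Int get() = pool.size
}

// Interns every non-null value of one column (nulls are passed through).
fun internAll(values: List<String?>): List<String?> {
    val interner = StringInterner()
    return values.map { it?.let(interner::intern) }
}

fun main() {
    // buildString creates a fresh String instance per row, similar to
    // what a file reader would produce for repeated values.
    val raw = List(6) { i -> buildString { append("category-"); append(i % 2) } }
    val interned = internAll(raw)

    println(raw[0] === raw[2])           // reference comparison before interning
    println(interned[0] === interned[2]) // reference comparison after interning
}
```

With only about 10 unique values per column, each pool stays tiny while millions of duplicate String instances become collectable. In a dataframe this would presumably be applied to each String column through something like a `convert`, but that wiring is left out here because it depends on the actual schema.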
> I would like to try using the Kotlin Jupyter kernel in VSCode without the overhead of IDEA or Kotlin Notebook
I'd recommend trying it in a regular Gradle project with the compiler plugin enabled.
Read the parquet data in a notebook, call df.generateDataClasses(), and copy the schema into the project. Given the initial schema, dataframe will be able to provide type-safe results, much like in notebooks, for lots of operations like add, convert, remove, groupBy+aggregate, ..., with the exceptions being some split overloads, pivot, and similar.
As a result:
1. your pipeline will run in its own process
2. GC might collect more intermediate objects
3. maybe it'll work well for you in general? 🙂
andyg
10/24/2025, 5:47 PM