Hi, In kotlin-dataframe, are there some built-in e...
# datascience
h
Hi, In kotlin-dataframe, are there some built-in example datasets? E.g. the docs in https://kotlin.github.io/dataframe/kpropertiesapi.html refer to
DataFrame.read("titanic.csv")
which does not run unless I dig through the internet to find that CSV. Most dataframe libraries bundle some examples, as in krangl
DataFrame.irisData
or similar. The same goes for dplyr in R. This would IMHO also be much better practice (and is the common practice in R help) in the docs, which are never self-contained at the moment but always refer to some
df
which a user can easily create on his end.
h
Thx for the pointer. I found it myself in the meantime. But I guess most users would give up earlier and go back to dplyr/pandas. So I'd argue my point is still valid.
Also, it seems the api-docs in https://kotlin.github.io/dataframe/kpropertiesapi.html refer to a different version of the dataset:
survived
is an Int in the csv but referred to as Boolean in the mentioned link (see data class Passenger).
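To make the mismatch concrete, here is a minimal plain-Kotlin sketch (no kotlin-dataframe calls) of how a 0/1 column would have to be mapped onto a Boolean property. The trimmed-down Passenger class, the sample rows, and the parsePassengers helper are all made up for illustration; only the Int-vs-Boolean difference for survived comes from the CSV and the docs.

```kotlin
// Hypothetical, trimmed-down stand-in for the Passenger class in the docs;
// only `survived: Boolean` is taken from the discussion above.
data class Passenger(val name: String, val survived: Boolean)

// Made-up sample rows; `survived` is encoded as 0/1, as in the CSV.
val sampleCsv = """
    name,survived
    Alice,1
    Bob,0
""".trimIndent()

// Parse the rows and map the Int flag onto the Boolean property.
fun parsePassengers(csv: String): List<Passenger> =
    csv.lines()
        .drop(1)                        // skip the header line
        .filter { it.isNotBlank() }     // ignore empty lines
        .map { line ->
            val (name, survived) = line.split(",")
            Passenger(name.trim(), survived.trim().toInt() == 1)
        }

fun main() {
    println(parsePassengers(sampleCsv))
}
```

Without such an explicit conversion step, a reader who loads the 0/1 column as-is gets an Int, which would not match a Boolean property.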
n
Hi! I think example datasets currently exist only in the example notebooks / IDEA projects. But most of the documentation doesn't explicitly reference datasets. I do agree that samples from the docs should be more self-contained. As for titanic.csv, my guess is that this code never actually ran, because I cannot find a dataset with Booleans in the history. So that should be fixed as well. Thank you for sharing your thoughts and drawing attention to the onboarding experience 🙂 The dplyr docs look like a good source of inspiration for future improvements
h
Btw, as part of my efforts to help users migrate from krangl to kdf, I recently started https://github.com/holgerbrandl/kdfutils, where I have implemented bidirectional conversions and expose a facade API providing the functions that IMHO still seem to be missing in kdf. This includes example datasets, see https://github.com/holgerbrandl/kdfutils/blob/main/src/main/kotlin/DataSets.kt. Importantly, the krangl API is completely internalized in kdfutils (via Gradle API dependency), to provide an actual migration path away from it. Clearly kdfutils is still WIP.