# kotlindl
z
Today I want to share my presentation about the upcoming KotlinDL 0.2 release (the 0.2-alpha-1 artifact is already available on Maven Central). KotlinDL can currently boast of being the only way to construct and train complex neural networks, such as VGG, ResNet, or MobileNet, on the JVM. It also supports transfer learning for popular models trained in Keras (or available in keras.applications). For image preprocessing, several functions are available that spare you complex and routine work on the JVM. Code from the demo: https://github.com/zaleslaw/KotlinDL-Demo/tree/demo-1 Library: https://github.com/JetBrains/KotlinDL

https://www.youtube.com/watch?v=jCFZc97_XQU

P.S. Support our project with a star on GitHub!
🎉 5
👍 4
b
What is your roadmap like for language models? This is an area where I feel other JVM implementations (e.g. DJL, DL4J, TF4J et al.) currently lag behind, but where the JVM, with its large ecosystem of text libraries (e.g. Lucene, CoreNLP, StringUtils, LingPipe et al.), could be competitive with other language ecosystems. A lot of development effort seems to focus on images, but I don’t know of as many end users doing image processing on the JVM. For example, it would be great to have an API that let me deploy and fine-tune the HuggingFace models.
z
@breandan We plan to start early NLP integration in the second half of this year. I agree about the amazing HuggingFace models and the great JVM text processing ecosystem, but right now, and for the next 1-2 releases, we will concentrate mostly on the image domain. So maybe we can arrange a call with you later, in the second half of May (when we will be working on the roadmap), to discuss your vision.
р
@breandan we need a hero to port the tokeniser from HuggingFace to the JVM - then the DL part will be easy
z
Agreed, the tokeniser part takes a lot of time for many NLP-related models; I saw the experiments with BERT and so on
р
yeah, their tokeniser is written in Rust
and they have bindings for Python and Node.js
b
Wish I could be more helpful, but happy to chat at some point after your release. I know there is already some existing work on tokenization, although not as complete as the Rust implementation, but at least you might not need to start from scratch: https://github.com/huggingface/tokenizers/issues/242#issuecomment-617801291 https://github.com/huggingface/tflite-android-transformers/tree/master/bert/src/main/java/co/huggingface/android_transformers/bertqa/tokenization https://github.com/robrua/easy-bert/tree/master/src/main/java/com/robrua/nlp/bert
z
Very interesting links, thanks
р
One approach could also be to leverage Spark NLP
I don't know if there are any good Kotlin wrappers for that, but one could process the text first there and then use BERT and transformers on the output
@zaleslaw I actually bumped into your Spark talk the other day (very informative, and I was surprised to hear that there are still people who try to deal with streams without a message broker). I should say your style in Russian talks is quite different from your English ones 🤣
z
Good old times with offline conferences, I really miss them. There are a few Spark-related talks on my channel. If you can follow Russian, maybe you will find something useful there. I'm curious what the main difference in style is between my Russian and English talks. Of course, in my native language I can afford more language tools, local memes, and references, and Russian-language conferences expect something closer to daring stand-up, because the competition between speakers is very high. The Russian audience is also very demanding and does not forgive poor preparation or factual mistakes. Sometimes it's exhilarating, sometimes tiring.
р
Yeah, I agree it feels quite awkward nowadays to make jokes from behind your screen when you give a talk and hear nothing in response. You are just alone, with no feedback from the listeners on how you are doing (you need to see people's faces to get an idea of where your audience stands with you). As for the style, take the title already:

Kafka льёт, а Spark разгребает (roughly: "Kafka pours it in, and Spark shovels it out")

I don't know how to convey the cheeky side of that in English 😄
... в кавке есть брокеры, они торгуют данными по дешевке ... (roughly: "... Kafka has brokers, they sell data on the cheap ...") 🤣
🤘 1