Any way to load roBERTa model using kotlin? I am a...
# datascience
a
Is there any way to load a RoBERTa model using Kotlin? I am aware that certain languages have implementations to vectorize the text, like `RobertaTokenizerFast` in Python. In Kotlin I managed to get the size and some information using `OnnxInferenceModel` (KotlinDL), but I am quite confused about vectorizing the sentence, because I don't have access to the vocabulary, the approach used to vectorize the sentence, or the dimensions needed to implement the vectorizer manually. So I was wondering if there's any possible way or library that could work around this.
i
@Julia Beliaeva @zaleslaw
z
At the moment I have not found a way to port the tokenizers.
a
Thanks @zaleslaw for the response. Is there any other model that I can use instead of RoBERTa? I tried importing a pretrained Keras model, which doesn't work out due to the lack of an Embedding layer. Or maybe I should wait until the layers are ready.
r
@Ananiya I guess https://github.com/londogard/londogard-nlp-toolkit/ should work for preprocessing
a
@roman.belov thanks for the response! I looked at this library earlier, but it doesn't seem to provide the same vectorizer as RoBERTa.
h
@Ananiya it’s something I’m working on adding via DJL. You should be able to use DJL directly for now 🙂 It supports HuggingFace Tokenizers.
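For reference, a minimal sketch of what using DJL's HuggingFace tokenizer directly from Kotlin might look like. It assumes the `ai.djl.huggingface:tokenizers` dependency is on the classpath and uses `roberta-base` as an assumed model id (the thread does not say which RoBERTa checkpoint is involved); the tokenizer definition is fetched from HuggingFace Hub on first use, so network access is needed.

```kotlin
// Assumption: Gradle/Maven dependency ai.djl.huggingface:tokenizers is available.
import ai.djl.huggingface.tokenizers.HuggingFaceTokenizer

fun main() {
    // "roberta-base" is an assumed model id, not one named in the thread.
    HuggingFaceTokenizer.newInstance("roberta-base").use { tokenizer ->
        val encoding = tokenizer.encode("Kotlin can tokenize text for RoBERTa.")

        // Token ids and attention mask are the usual inputs for a RoBERTa ONNX model.
        val inputIds: LongArray = encoding.ids
        val attentionMask: LongArray = encoding.attentionMask

        println(encoding.tokens.joinToString(" "))
        println(inputIds.joinToString(","))
        println(attentionMask.joinToString(","))
    }
}
```

The resulting `inputIds` and `attentionMask` arrays would still need to be reshaped and passed to whichever runtime loads the model; the input names and shapes depend on how the model was exported.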
a
Interesting! Thanks @Hampus Londögård for letting me know.
h
@Ananiya I’ve added support for `ClassifierPipeline` & `TokenClassificationPipeline`, which means that Text Classification and Token Classification (e.g. NER) are supported now 👍 Files can be loaded from the file system (ONNX & PyTorch) and from HuggingFace Hub (ONNX).
a
Fantastic, @Hampus Londögård! Excited to test it.