Let me give you more context:
We will have three pieces: a common multiplatform module; a multiplatform library that uses ONNX Runtime (or possibly KInference) for offline users; and a JVM online backend (a Ktor web service) for online users.
The common multiplatform module will hold the shared business rules, search algorithms, and pre/post-processing.
The library will use the common module to access the preprocessing code and process the data (e.g. images) before sending it to ONNX Runtime/KInference (it assumes the model has already been downloaded locally). By preprocessing I mean resize, crop, and normalization (for now that's all we need).
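To make the shared-preprocessing idea concrete, here is a minimal sketch of what a normalization step in the common module could look like. The function name, the flattened-CHW layout, and the default mean/std values (the common ImageNet defaults) are all assumptions for illustration, not our actual API; it is pure Kotlin with no platform dependencies, so it can live in commonMain and be reused by both the offline library and the JVM backend.

```kotlin
// Hypothetical shared preprocessing step for the common module:
// scale raw 0..255 pixel values to 0..1, then standardize per channel.
// mean/std defaults are the widely used ImageNet values (example only).
fun normalize(
    pixels: FloatArray, // flattened CHW tensor: one channel after another
    mean: FloatArray = floatArrayOf(0.485f, 0.456f, 0.406f),
    std: FloatArray = floatArrayOf(0.229f, 0.224f, 0.225f),
): FloatArray {
    require(pixels.size % mean.size == 0) { "tensor size must be divisible by channel count" }
    val channelSize = pixels.size / mean.size
    return FloatArray(pixels.size) { i ->
        val c = i / channelSize // which channel this element belongs to
        (pixels[i] / 255f - mean[c]) / std[c]
    }
}
```

Resize and crop would need platform-specific image decoding (expect/actual), but once the pixels are a FloatArray, a function like this is fully shareable.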
The JVM backend will use the common multiplatform module for the same reason: to access the shared pre/post-processing code before sending the image tensors to an online model server (Kubeflow). Here the model is never downloaded; we simply send a JSON payload containing the tensors of the preprocessed images. Once we get the response, we might apply post-processing to it, e.g. argmax.
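The post-processing mentioned above can also live in the common module, since it is just array math on the scores the model server returns. A minimal sketch, assuming the response has been decoded from JSON into a FloatArray of logits (the function names are illustrative, not an existing API):

```kotlin
// Hypothetical shared post-processing for the common module:
// pick the index of the highest-scoring class from the model's output.
fun argmax(logits: FloatArray): Int {
    require(logits.isNotEmpty()) { "logits must not be empty" }
    return logits.indices.maxByOrNull { logits[it] }!!
}

// Optional companion step: turn logits into probabilities.
// Subtracting the max first keeps exp() numerically stable.
fun softmax(logits: FloatArray): FloatArray {
    val max = logits.max()
    val exps = FloatArray(logits.size) { kotlin.math.exp(logits[it] - max) }
    val sum = exps.sum()
    return FloatArray(exps.size) { exps[it] / sum }
}
```

Because both the offline library (reading ONNX Runtime output) and the online backend (reading the model server's JSON response) end up with the same kind of score array, this code is written once and shared.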