# ai
v
Hi! Koog `0.2.0` has just been released with all the latest updates and fixes. Thank you everyone for your contributions!
Full changelog: https://github.com/JetBrains/koog/releases/tag/0.2.0
Some notable changes:
• media types support (image, audio, document)
• token counting support
• Lots of Ollama-related bug fixes + dynamic model discovery support for Ollama
• Fixed structured data bugs with property serialization
• Fixed missing support for Double and List in the annotation-based tools API
• … and many other bug-fixes and improvements (see the full CHANGELOG)
👏 3
kodee happy 4
g
@Vadim Briliantov are you looking for contributors to Koog?
v
Sure, we are very open to external contributions, it’s open-source 🙂 What would you like to contribute?
g
I've worked a lot with ai and audio (my company is a voice ai for restaurants thing). So I was eyeing the lack of audio capabilities as something interesting. But I'm really open to what's most critical and needs help
v
If you have the most experience with audio, please feel free to contribute to that part. We recently unblocked the media support in `0.2.0` (it now supports audio in prompts for the models that do support audio), but of course there’s much more in that area! So please feel free to open issues or PRs 🙂
I think, specifically as you are working in the field, your contribution will be very valuable!
g
Ok. I'll see what I can do. I'm assuming there's already support for streaming?
🙏 1
v
Yes
🔥 1
p
> my company is a voice ai for restaurants thing
@Gabriel Duncan that sounds cool. Would you mind elaborating on the tech stack a little? We are also doing AI and have recently started with voice. ... Also, if you are looking for ideas, you could look at integrating Koog with our A2A implementation https://github.com/a2a-4k/a2a-4k :)
👀 1
g
I started it 6 years ago, so the stack has evolved interestingly (there were no LLMs then). But the back end is all Kotlin, and it connects Twilio media streams to speech-to-text, LLMs and some NLU, and finally back to audio via text-to-speech. I recently added LiteLLM so that all the speech model and LLM calls can originate from any OpenAI-compatible client and route to multiple models, with fun stuff like load balancing. Happy to chat more if you want. Feel free to DM.
@Vadim Briliantov is a draft PR the best place to start a discussion? Or is there somewhere else I should propose conceptual changes?
v
@Gabriel Duncan Depending on the changes, you can either start a PR or describe your ideas in Issues 🙂
🙏 1