# koog-agentic-framework
e
Hi, what do you recommend for streaming an audio response? Koog doesn't yet have Google's Gemini 2.5 Flash Preview TTS model, but I figured out how to instantiate an `LLModel` manually by providing the model id and capabilities. For the capabilities, though, there's no audio capability. The only multimodal one is `Vision`. Aside from this, how would you handle the strategy definition part? In the docs, I could only find an example for streaming structured data, which is very useful, but didn't help me much with this.
p
Hi! Right now, only text input and output are supported. Support for media content, especially images, is planned for later. You can create a related issue to make it easier to track. I didn’t quite understand your question about the strategy. What’s your use case?
e
Thanks, I created an issue 👍 For the strategy, I was referencing your code example here for writing a node that streams data. I was just wondering how we could stream audio output following this example.
p
> For the strategy, I was referencing your code example here for writing a node that streams data. I was just wondering how we could stream audio output following this example.
If I understand your request correctly, you will need a JVM (Java or Kotlin) library that can work with audio streams. From the LLM you will simply receive a byte array, which you can then handle in the usual way. But as I mentioned, media data is currently not supported, only text.
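To illustrate the "handle the byte array in the usual way" part, here is a rough sketch using only the JDK's `javax.sound.sampled` package. It is not Koog API: the `AudioChunks` class and `toWav` method are hypothetical names, and the audio format (24 kHz, 16-bit, mono, little-endian PCM) is an assumption that would have to be checked against what the model actually returns. It collects raw PCM chunks as they arrive and wraps them in a WAV container.

```java
import javax.sound.sampled.AudioFileFormat;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.List;

// Hypothetical helper, not part of Koog.
final class AudioChunks {
    // ASSUMPTION: 24 kHz, 16-bit, mono, signed, little-endian PCM.
    // Verify this against the model's actual output format.
    static final AudioFormat FORMAT = new AudioFormat(24_000f, 16, 1, true, false);

    /** Concatenates raw PCM chunks and wraps them in a WAV container. */
    static byte[] toWav(List<byte[]> chunks) throws IOException {
        // Join the chunks in arrival order.
        ByteArrayOutputStream pcm = new ByteArrayOutputStream();
        for (byte[] chunk : chunks) {
            pcm.write(chunk, 0, chunk.length);
        }
        byte[] data = pcm.toByteArray();

        // The frame count must be known up front when writing to an OutputStream.
        long frames = data.length / FORMAT.getFrameSize();
        try (AudioInputStream in =
                 new AudioInputStream(new ByteArrayInputStream(data), FORMAT, frames)) {
            ByteArrayOutputStream wav = new ByteArrayOutputStream();
            AudioSystem.write(in, AudioFileFormat.Type.WAVE, wav);
            return wav.toByteArray();
        }
    }
}
```

For live playback rather than saving, the same chunks could be written to a `SourceDataLine` opened with the same `AudioFormat` as each one arrives.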