# koog-agentic-framework
m
I'm trying to use LLM streaming, is there a more concrete example I can see? I'd like to send partial strings to the user as the LLM is 'responding'. I've looked at the existing doc, but it seems focused on streaming that content into structured data.
p
You don’t have to pass anything to `requestLLMStreaming` (according to the docs) in that case; you’ll just get the string back as it comes in.
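For example (an illustrative sketch, not from the docs), inside a strategy node you could do something like this, using only the `llm.writeSession` / `requestLLMStreaming` calls discussed in this thread:
```kotlin
// Illustrative sketch: with no arguments, requestLLMStreaming() gives back
// the raw response text as a flow of string chunks.
llm.writeSession {
    requestLLMStreaming().collect { chunk ->
        print(chunk)   // forward each partial string to the user
    }
}
```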
m
Won't the entire agent just hold until the streaming is finished before the response is sent? `runAndGetResult` would halt until the entire strategy is finished, no?
Also, is the documentation detailing usage for an upcoming version? I see code examples that simply don't compile in .21
I see that I can do simple prompt streaming, but now I'm left wondering how to make use of streaming while using the agent strategy DSL. It seems like the output is locked to a full string response? Apologies for the whirlwind of messages, just trying to get a handle on this as I attempt to migrate from langchain4j.
p
You can process results in the nodes, like in the documentation example: as the information about books comes in, `Book` instances will be created. So no, it’s not necessary to wait until the agent finishes its work.
> Also, is the documentation detailing usage for an upcoming version? I see code examples that simply don't compile in .21
No, the documentation should match the current published version. Could you let us know which specific examples aren’t working?
> how to make use of streaming while using the agent strategy DSL. It seems like the output is locked to a full string response?
What do you mean by that?
f
@mdepies I had this problem also while trying to stream partial events. Essentially what I did is wrap the function that contains the call to `runAndGetResult` in a `Flow`, then pass the emit function from the flow as an input to the strategy, so that the nodes can emit anything they need to the flow. Then you can collect that flow and emit server-sent events or whatever you are doing to send to the user. Here is a minimal example of a node that will emit the partial responses and also collect the full response as the output from the node.
```kotlin
// Node that streams partial LLM output through the provided `emit` callback
// and returns the accumulated full response as the node's output.
fun AIAgentSubgraphBuilderBase<*, *>.streamPartial(
    emit: suspend (delta: String) -> Unit
): AIAgentNodeDelegateBase<String, String> {
    return node<String, String>("streamPartial") {
        // Open an LLM write session and start a streaming request;
        // the result is a flow of string chunks.
        val responseStream = llm.writeSession {
            requestLLMStreaming()
        }
        // Forward each chunk to the caller while accumulating the full message.
        var finalMessage = ""
        responseStream.collect { partial ->
            emit(partial)
            finalMessage += partial
        }
        finalMessage
    }
}
```
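The wrapper side described above could look roughly like this; `buildStreamingAgent` is a hypothetical helper standing in for however you construct your agent and strategy with the `streamPartial` node, and `runAndGetResult` is used as in the messages above:
```kotlin
import kotlinx.coroutines.flow.Flow
import kotlinx.coroutines.flow.channelFlow

// Wrap the agent run in a Flow so callers can collect partial deltas.
fun answerAsFlow(userInput: String): Flow<String> = channelFlow {
    // `buildStreamingAgent` is hypothetical: construct your agent with a
    // strategy that wires the given emit function into streamPartial.
    val agent = buildStreamingAgent(emit = { delta -> send(delta) })

    // Runs the whole strategy; deltas are sent while it runs, and the
    // flow completes once the agent has finished.
    agent.runAndGetResult(userInput)
}
```
Then collect `answerAsFlow(...)` in your request handler and forward each chunk as a server-sent event or whatever transport you use.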
m
I was seeing some examples (though I'm struggling to find them now...) where `agent.run("some string")` was being assigned to a result, leading me to wonder if that was an API change not yet published. I did manage to get something similar to @Finn Jensen's approach working; however, it did not seem scalable if I had a more complex graph where the process could loop through the stream node more than once, or with the use of tools. I guess I'll continue to hack at it and see if it makes more sense. I get the feeling I'm fighting the design as I implement streaming feedback for UX.
a
> Also, is the documentation detailing usage for an upcoming version? I see code examples that simply don’t compile in .21
Hi, make sure you’re looking at the `main` branch, which corresponds to the latest release. The default branch is `develop`, so some things there might be different from the latest released version.
Regarding streaming and asynchronous communication, you can use coroutine channels, which solve exactly this problem. Here’s a simple example to demonstrate how they can be used. `processResponses` is just a dummy function that only streams the whole response and sends an acknowledgement to the agent, but in the real world it can be some other component of your application.
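The snippet that went with this message isn’t visible in this export; as a rough one-way sketch of the channel pattern (with `buildStreamingAgent` and `forwardToUser` as illustrative stand-ins, not Koog APIs):
```kotlin
import kotlinx.coroutines.channels.Channel
import kotlinx.coroutines.coroutineScope
import kotlinx.coroutines.launch

// One-way sketch: the streaming node sends chunks into the channel,
// and a separate coroutine consumes them and forwards them to the user.
suspend fun runWithChannel(userInput: String) = coroutineScope {
    val responses = Channel<String>()

    val consumer = launch {
        for (chunk in responses) {
            forwardToUser(chunk)   // illustrative: SSE handler, UI update, etc.
        }
    }

    // `buildStreamingAgent` is hypothetical, as in the earlier sketch.
    val agent = buildStreamingAgent(emit = { delta -> responses.send(delta) })
    agent.runAndGetResult(userInput)

    responses.close()   // no more chunks; lets the consumer loop finish
    consumer.join()
}
```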
Or, if your use case is simpler and you just want to communicate these events e.g. to a UI, you can use callbacks instead. Check the demo Android app in our examples; it uses callbacks to show messages and events to a user.