# koog-agentic-framework
a
I am not getting full responses back. The Ollama model says it found a tool, but it does not call it:
```kotlin
// Koog imports shown as wildcards; exact package paths may differ across versions
import ai.koog.agents.core.agent.*
import ai.koog.agents.core.tools.*
import ai.koog.agents.core.tools.reflect.*
import ai.koog.agents.ext.agent.*
import ai.koog.prompt.executor.llms.all.*
import ai.koog.prompt.llm.*
import com.typesafe.config.ConfigFactory
import kotlinx.coroutines.runBlocking

fun main() {
    val config = ConfigFactory.load()
    val kafkaService = KafkaService(config)

    val toolRegistry =
        ToolRegistry {
            tools(KafkaToolSet(kafkaService).asTools())
        }

    val agent =
        AIAgent(
            executor = simpleOllamaAIExecutor(),
            systemPrompt = "You are a diagnostics agent. You can answer questions about the workings of systems provided to you via tools.",
            llmModel = OllamaModels.Alibaba.QWEN_3_06B,
            toolRegistry = toolRegistry,
            maxIterations = 20,
            strategy = singleRunStrategy(),
        )

    runBlocking {
        val result = agent.runAndGetResult("Get me the consumer offsets and lag for all consumers")
        println(result)
    }
}
```
This is the result:
```
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<think>
Okay, the user is asking to get the consumer offsets and lag for all consumers. Let me check the tools available. There's a function called getConsumerOffsetsAndLags. The parameters are empty, so I don't need any arguments. I should call that function without any parameters. Let me make sure I'm using the right function name and structure the response correctly. Alright, I'll return the tool call as specified.
</think>
```
The model should support tools, so either I do not understand something or Koog does not handle the tool requests.
d
Hey Arjan. What version of Ollama are you using? (Looking at that "thinking" answer, I suspect you are on Ollama 0.9, which is not yet supported by Koog.) It looks like Qwen wants to output multiple messages (some thinking response and then some tool calls), but `singleRunStrategy()` uses `requestLLM()`, which keeps only the first answer. You have two options to try to make it work:
1. Prompt the model harder, telling it to (see the sketch below):
   - NOT share its thinking and reasoning
   - NOT answer anything but the tool call, if it wants to make a tool call
2. Write your own strategy using `nodeLLMRequestMultiple()` instead of `nodeLLMRequest()`.
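For option 1, a rough sketch of what the stricter system prompt could look like (the added instructions are untested suggestions, not verified wording):
```kotlin
// Sketch for option 1: same agent setup as before, stricter system prompt.
// The extra prompt instructions are a suggestion, not tested wording.
val agent =
    AIAgent(
        executor = simpleOllamaAIExecutor(),
        systemPrompt =
            "You are a diagnostics agent. You can answer questions about the workings " +
                "of systems provided to you via tools. Do NOT share your thinking or " +
                "reasoning. If you want to make a tool call, respond with the tool call " +
                "and NOTHING else.",
        llmModel = OllamaModels.Alibaba.QWEN_3_06B,
        toolRegistry = toolRegistry,
        maxIterations = 20,
        strategy = singleRunStrategy(),
    )
```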
a
Hi Didier, I’ll try these out, thanks! What Ollama version is supported?
d
You're very welcome! 🤗 For now, Ollama 0.8 is supported. Ollama 0.9 support will arrive soon.
a
On Ollama 0.8 I have the same issue. I'll try your other tips.
d
Yeah, I am quite sure the other messages are gobbled by Koog (here: https://github.com/JetBrains/koog/blob/develop/agents%2Fagents-core%2Fsrc%2FcommonMain%2Fkotlin%2Fai%2Fkoog%2Fagents%2Fcore%2Fagent%2Fsession%2FAIAgentLLMSession.kt#L115), which is what is used in `singleRunStrategy`.
a
But that isn't exclusive to Ollama then? Does that mean none of the examples in the docs work?
d
The thing is that proprietary models behind APIs (OpenAI, Anthropic, etc.) have parameters to control whether you want tool calls and/or only tool calls. But I believe they do that by over-prompting the model...
@Vadim Briliantov `singleRunStrategy()` has to support Ollama from the get-go (by using normal LLM responses, i.e. lists of response messages, and not silently discarding all the messages but the first). You can see above that I am not the only one confused by that. Relying on non-standard things like tool-choice might not be a good idea in the simple examples in the documentation.
a
Hmm okay. Building my own strategy is a bit beyond me right now; I'd need to understand all the extension functions and how the graphs work first.
👍 1
But I am very excited about the prospect of building agents with Koog. I know it is all still quite early.
K 1
d
@Arjan van Wieringen you can try this:
```kotlin
import ai.koog.agents.core.dsl.builder.*
import ai.koog.agents.core.dsl.extension.*
import ai.koog.prompt.message.*

private fun alternativeSingleRunStrategy() = strategy("alternativeSingleRunStrategy") {
    // Request a *list* of responses instead of keeping only the first one
    val initialRequest by nodeLLMRequestMultiple()
    val processResponses by nodeDoNothing<List<Message.Response>>()
    val executeTools by nodeExecuteMultipleTools(parallelTools = true)
    val toolResultsRequest by nodeLLMSendMultipleToolResults()

    edge(nodeStart forwardTo initialRequest)
    edge(initialRequest forwardTo processResponses)

    // If the responses contain tool calls, execute them;
    // otherwise finish with the first assistant message
    edge(processResponses forwardTo executeTools onToolCallsPresent { true })
    edge(processResponses forwardTo nodeFinish transformed { it.first() } onAssistantMessage { true })

    // Feed the tool results back to the LLM and loop
    edge(executeTools forwardTo toolResultsRequest)
    edge(toolResultsRequest forwardTo processResponses)
}

// Edge condition that fires only when the response list contains tool calls
infix fun <IncomingOutput, IntermediateOutput, OutgoingInput>
        AIAgentEdgeBuilderIntermediate<IncomingOutput, IntermediateOutput, OutgoingInput>.onToolCallsPresent(
    block: suspend (List<Message.Tool.Call>) -> Boolean
): AIAgentEdgeBuilderIntermediate<IncomingOutput, List<Message.Tool.Call>, OutgoingInput> {
    return onIsInstance(List::class)
        .transformed { it.filterIsInstance<Message.Tool.Call>() }
        .onCondition { toolCalls -> toolCalls.isNotEmpty() && block(toolCalls) }
}
```
This is untested. I cooked it up just for you.
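If it compiles, wiring it in should just be a matter of swapping the strategy in your existing agent setup:
```kotlin
// Same agent setup as before; only the strategy changes
val agent =
    AIAgent(
        executor = simpleOllamaAIExecutor(),
        systemPrompt = "You are a diagnostics agent. You can answer questions about the workings of systems provided to you via tools.",
        llmModel = OllamaModels.Alibaba.QWEN_3_06B,
        toolRegistry = toolRegistry,
        maxIterations = 20,
        strategy = alternativeSingleRunStrategy(), // instead of singleRunStrategy()
    )
```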
a
Holy smokes, Batman. It works!
🎉 1
```
<think>
Okay, so the user is asking if any of their consumers are lagging. The tool response shows consumer offsets and lag information for various groups. I need to parse this data to determine if any lag exists. Let me check each group's offset and lag. For example, in "accepted-testevent", the offset is 53 and lag is 0, which is good. Similarly, "rcr-perf-test" also has 0 lag. There are multiple entries with 0 lag, which suggests that all consumers are not lagging. Therefore, the answer should confirm that there is no lagging consumer based on the provided data.
</think>

From the consumer offset and lag data, **no consumers are lagging**. All entries show a lag of 0, indicating no lag in their processes.
```
This is great
d
Enjoy! 😊
❤️ 1
v
Hi, thanks for raising this issue. I think we indeed have to update the default strategy to work with such responses out of the box.
🔥 4