# koog-agentic-framework
d
🚀 Koog 0.4.0 is here! This release brings powerful new capabilities for building scalable and production-ready AI agents. Highlights include:
🕵️‍♀️ Langfuse and W&B Weave support
🧩 Ktor integration
🏛️ Native structured output
📱 iOS target
🧠 GPT-5
Learn more here.
K 12
🎉 9
👌 1
❤️ 17
u
Awesome! 🤩
m
Given this release, would you say long-term goals include some of the things Akka is highlighting in their post? https://www.linkedin.com/posts/paoloperrone_finally-someone-built-what-agent-developers-activity-7359968596040568832-WTqu/
v
Hi @Michael Wills! Great question. Let's break down what was written in the post you shared:

> Orchestration that doesn't break
> → Handles crashes without losing state
https://docs.koog.ai/agent-persistency/

> → Sequential, parallel, hierarchical workflows
https://docs.koog.ai/custom-strategy-graphs/
https://docs.koog.ai/parallel-node-execution/

> → Human-in-the-loop when needed
It's essentially just human-facing tool calling (either LLM-triggered, or manually from the node): https://docs.koog.ai/annotation-based-tools/. Just wrap your favourite way of doing a println() + readline() (or, more realistically, showing some fancy UI message and collecting a command or feedback from the user) into a `@Tool`, as in the sketch below.
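(A minimal sketch of such a human-facing tool, assuming Koog's annotation-based tools API from the link above. The `HumanInTheLoopTools` and `askUser` names are made up, the console I/O is a placeholder for a real UI, and exact package names may differ between Koog versions.)

```kotlin
// Sketch only: askUser is a hypothetical human-facing tool; swap the console
// I/O for whatever UI or transport your application actually uses.
import ai.koog.agents.core.tools.annotations.LLMDescription
import ai.koog.agents.core.tools.annotations.Tool
import ai.koog.agents.core.tools.reflect.ToolSet

class HumanInTheLoopTools : ToolSet {
    @Tool
    @LLMDescription("Ask the human operator a question and wait for their answer.")
    fun askUser(
        @LLMDescription("The question to show to the user")
        question: String
    ): String {
        println(question)          // present the question to the human
        return readLine() ?: ""    // block until they respond
    }
}
```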
> Real agent memory
> → Session memory that persists
> → Shared across agents automatically
https://docs.koog.ai/agent-memory/ is exactly that: memory that can be shared and persisted in a database / file / S3 across sessions, agents, and even products. Just connect to your DB and define what you would like to persist (see the sketch after this message).

> → Nanosecond writes (not kidding)
That depends on the database and infrastructure you use, not on the agentic framework. Sure, there are plenty of enterprise-ready solutions for that: Oracle, Amazon RDS, PostgreSQL, etc.

> Production-grade streaming
> → 1 billion tokens/second capacity
> → Real-time audio, video, metrics
> → Built-in backpressure (finally)
That's an interesting one, indeed. Even the fastest, most optimized LLMs produce hundreds to thousands of tokens per second. Having 1 billion tokens/second of capacity is great, but actually using it would mean at least a million (or tens of millions of) parallel LLM conversations with optimized smaller models PER SECOND. To meet such demand (if it exists in your system) you would need to scale your application and indeed build some distributed layer (or, realistically, a horizontally scaled, low-latency set of instances distributed across regions with corporate LLM connections, i.e. an "AI Cloud") that would balance everything out. And this is not really a responsibility of the agentic framework, but rather of the cloud infrastructure layer. So you would either have to purchase an AI Cloud subscription, or build and scale your own (that's what Akka is offering, apparently, but in my backend and distributed-systems experience it doesn't have much connection with agentic flow development). For example, at JetBrains we have our own internal server-side solution to serve millions of IDE users with LLM inference, so that everyone can use Junie and AI Assistant simultaneously at scale across different regions of the planet. The team working on that is not involved in end-user AI product development or in building agentic flows; other teams just use Koog with a single LLMClient pointed at that JetBrains AI backend, which routes requests to the right LLMs. The rest of the development process remains the same. TL;DR: if you are a large company with the resources and desire to develop your own internal high-performance AI/LLM Cloud, you can do that, but it's orthogonal to the AI development itself. Alternatively, you can just buy a subscription to an existing commercial AI Cloud and use it at scale.

> The part that got me:
> 15 years of actor-based runtime underneath.
> Not another rushed framework.
> Battle-tested infrastructure.
That's absolutely true. I personally love Akka Actors: my own career started with a research project using Akka Actors to speed up Java code analysis on a multi-machine cluster, which I enjoyed as an experience. Maybe at some point, when AI agent requirements grow to the scale of running strategies and agentic pipelines on multiple machines, that would be a natural thing to add to Koog as well. Currently, though, I'm not aware of any significant demand or practical benefit for such AI systems.
To sum things up: we'll continue focusing on enterprise needs, specifically improving persistency, memory, and integrations. We'll also explore some distributed-compute scenarios once (and if) there is such demand, externally or internally, from real applications. At least for coding agents (so far among the most advanced and complex agentic AI systems that exist right now), we haven't faced any demand to build a distributed system in practice (although I would love to 🙂)
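(An aside to make the memory point above concrete: the `Fact` and `FactStore` types below are hypothetical and are NOT Koog's actual memory API; see https://docs.koog.ai/agent-memory/ for the real one. The sketch only illustrates the "connect your DB, define what to persist" shape.)

```kotlin
// Hypothetical illustration only: Fact and FactStore are NOT Koog APIs.
// They model the idea that agent memory is facts behind a storage contract.
import java.sql.DriverManager

data class Fact(val subject: String, val key: String, val value: String)

interface FactStore {
    fun save(fact: Fact)
    fun load(subject: String): List<Fact>
}

// A toy JDBC-backed store: "just connect to your DB". Use e.g. a
// "jdbc:sqlite:memory.db" URL with the SQLite driver on the classpath.
class JdbcFactStore(url: String) : FactStore {
    private val conn = DriverManager.getConnection(url)

    init {
        conn.createStatement().use {
            it.execute("CREATE TABLE IF NOT EXISTS facts(subject TEXT, fact_key TEXT, fact_value TEXT)")
        }
    }

    override fun save(fact: Fact) {
        conn.prepareStatement("INSERT INTO facts VALUES (?, ?, ?)").use { st ->
            st.setString(1, fact.subject)
            st.setString(2, fact.key)
            st.setString(3, fact.value)
            st.executeUpdate()
        }
    }

    override fun load(subject: String): List<Fact> =
        conn.prepareStatement("SELECT subject, fact_key, fact_value FROM facts WHERE subject = ?").use { st ->
            st.setString(1, subject)
            st.executeQuery().use { rs ->
                buildList {
                    while (rs.next()) add(Fact(rs.getString(1), rs.getString(2), rs.getString(3)))
                }
            }
        }
}
```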
m
Thank you for the feedback @Vadim Briliantov! Your points are well noted. I'll be digging into the docs you linked, but I do have a question about best practices for HITL, similar to what @renato.java was asking. I have some ideas using Ktor/Connect to tie it to a user-facing UI. Some flows I plan to keep fully deterministic, as just a specific edge in the graph, while others might need LLM tool calling. Is there an existing example you would recommend?
I am also hoping to integrate BAML https://www.boundaryml.com
For HITL, this example for LangGraph describes an `interrupt` node:
```python
# Human editing node; `interrupt` pauses the graph run until a human resumes it.
# `State` is the graph's state schema (a TypedDict) defined earlier in the article.
from langgraph.types import interrupt

def human_review_edit(state: State) -> State:
    """Allow human to review and edit the AI-generated summary"""

    result = interrupt({
        "task": "Please review and edit the generated summary if necessary.",
        "generated_summary": state["summary"]
    })
    
    return {
        "summary": result["edited_summary"]
    }
```
What’s happening here:
• This is the critical HITL node where the magic happens
• `interrupt()` pauses the workflow execution
• It presents the current state to the human reviewer
• The workflow waits for human input before continuing
• It returns the state with the human-edited summary; a `Command(resume=...)` then continues the run
https://medium.com/the-advanced-school-of-ai/human-in-the-loop-in-langgraph-approve-or-reject-pattern-fcf6ba0c5990
d
Native Structured Output is huuge, let's go! 🚀
t
@Michael Wills I personally found the LangGraph HITL implementation flawed. The interrupt function essentially throws an error, which in my opinion is bad practice: exceptions should not be used for flow control. I found that my error-handling logic was catching the error and subverting the pause I desired. We ended up implementing our own HITL layer on top of LangGraph. We've since moved on to Koog. I think HITL via a tool is a nice way of doing it. We also maintain our own state-management layer on top of Koog, as we need a strong data contract between us and our clients which is agnostic of the agentic framework we are using (roughly sketched below).
👍 1
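(A minimal sketch of that framework-agnostic contract idea: keep the client-facing state in plain serializable types with no framework classes in them. The type and field names here are made up; it assumes the kotlinx.serialization plugin and kotlinx-serialization-json.)

```kotlin
// Sketch of a framework-agnostic state contract (field names are made up).
// The wire format clients see stays stable even if the agentic framework changes.
import kotlinx.serialization.Serializable
import kotlinx.serialization.json.Json

@Serializable
data class AgentSessionState(
    val sessionId: String,
    val awaitingHumanReview: Boolean = false,
    val draftSummary: String? = null,
)

fun main() {
    val state = AgentSessionState("session-42", awaitingHumanReview = true, draftSummary = "Draft summary text")
    // This JSON is the contract; Koog (or anything else) stays an internal detail.
    println(Json.encodeToString(AgentSessionState.serializer(), state))
}
```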
v
@Michael Wills I believe @Sebastian Aigner might have some simple example with deterministic calling of user input from a node, but in general you can call any tool either directly (by just calling the function) or, preferably, using:
```kotlin
llm.writeSession {
    callTool(::function, arg1, arg2) // or findTool(::function).execute(arg1, arg2)
}
```
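(A rough sketch of the deterministic variant Michael mentioned: HITL as a fixed node in a custom strategy graph, based on https://docs.koog.ai/custom-strategy-graphs/. It reuses the hypothetical HumanInTheLoopTools from the earlier sketch, and the exact DSL may differ between Koog versions.)

```kotlin
// Rough sketch: HITL as a dedicated node, so asking the human is a fixed,
// deterministic edge in the graph rather than an LLM-chosen tool call.
val tools = HumanInTheLoopTools() // hypothetical tool set from the earlier sketch

val reviewStrategy = strategy<String, String>("summarize-with-review") {
    val humanReview by node<String, String>("human_review") { draft ->
        // This node always consults the human; no LLM decision is involved.
        tools.askUser("Please review and edit this summary:\n$draft")
    }
    edge(nodeStart forwardTo humanReview)
    edge(humanReview forwardTo nodeFinish)
}
```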
m
Thank you for sharing your experience @Tom Molloy. Framework-agnostic state management is definitely a good idea. I certainly didn't get the sense it would throw an exception as the way to handle it; I'd have to dig in to find the reason for that decision. That requires more careful handling than I would have expected. Thanks also @Vadim Briliantov. Both of your points about tools are noted; using `llm.writeSession` to call a tool looks spot on. @renato.java if you don't have a solution yet, this might be helpful for you as well.