Daniela
08/28/2025, 2:49 PM
אליהו הדס
08/28/2025, 3:40 PM
Michael Wills
08/28/2025, 7:14 PM
Vadim Briliantov
08/28/2025, 8:40 PM
@Tool
> Real agent memory
> → Session memory that persists
> → Shared across agents automatically
https://docs.koog.ai/agent-memory/ is exactly that: memory that can be shared and persisted in a database, a file, or S3 across sessions, agents, and even products. Just connect your DB and define what you would like to persist.
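As a minimal sketch of what that looks like, based on the linked docs (treat the provider and storage names below as approximate rather than authoritative; swap the file-backed provider for a DB- or S3-backed one to share memory across products):

install(AgentMemory) {
    // Illustrative configuration; see docs.koog.ai/agent-memory for the real parameters.
    memoryProvider = LocalFileMemoryProvider(
        config = LocalMemoryConfig("my-agent-memory"),
        storage = SimpleStorage(JVMFileSystemProvider.ReadWrite),
        fs = JVMFileSystemProvider.ReadWrite,
        root = Path("memory/data")
    )
}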
> → Nanosecond writes (not kidding)
It depends on the database/infrastructure you use, not on the agentic framework. Sure, there are plenty of enterprise-ready solutions for that: Oracle, Amazon RDS, PostgreSQL, etc.
> Production-grade streaming
> → 1 billion tokens/second capacity
> → Real-time audio, video, metrics
> → Built-in backpressure (finally)
That’s an interesting one, indeed.
Even the fastest, most optimized LLMs produce hundreds to thousands of tokens per second. Having 1 billion tokens/second of capacity is great, but to actually use it would mean at least a million (or tens of millions of) parallel LLM conversations with optimized smaller models running IN EVERY SINGLE SECOND.
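To make that arithmetic explicit, a quick back-of-envelope check (the thousand-tokens-per-second figure is the rough per-conversation estimate from above):

fun main() {
    val capacityTokensPerSec = 1_000_000_000L     // the advertised 1B tokens/second
    val tokensPerSecPerConversation = 1_000L      // a fast, optimized smaller model
    // ~1,000,000 conversations must run in parallel to saturate that capacity
    println(capacityTokensPerSec / tokensPerSecPerConversation)
}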
To meet such demand (if it even exists in your system), you would need to scale your application and indeed build some distributed layer, or realistically a horizontally scaled, well-distributed, low-latency set of instances across regions with corporate LLM connections, i.e. an "AI Cloud", that balances everything out. And this is not really the responsibility of an agentic framework, but of the cloud infrastructure layer. So you would either have to purchase a subscription to some AI Cloud, or build and scale your own (that is what Akka is offering, apparently, but in my backend and distributed systems experience it does not have much connection with the development of agentic flows).
For example, at JetBrains we have our own internal server-side solution that serves millions of IDE users with LLM inference, so that everyone can use Junie and AI Assistant simultaneously, at scale, across different regions of the planet. The team working on that is not involved in end-user AI product development or in the development of agentic flows. Other teams just use Koog with a single LLMClient pointing at that JetBrains AI backend, which routes requests to the right LLMs. The rest of the development process remains the same.
TLDR: if you are a large company and have the resources and desire to develop your own internal high-performance AI/LLM cloud, you can do that, but it is orthogonal to the AI development itself. Alternatively, you can just buy a subscription to an existing commercial AI Cloud and use it at scale.
> The part that got me:
> 15 years of actor-based runtime underneath.
> Not another rushed framework.
> Battle-tested infrastructure.
That’s absolutely true.
I personally love Akka Actors; my own career started with a research project that used Akka Actors to speed up Java code analysis on a multi-machine cluster, and I really enjoyed that experience.
Maybe at some point, when AI agent requirements grow to the scale of running strategies and agentic pipelines across multiple machines, that would be a natural thing to add to Koog as well. Currently, though, I am not aware of any significant demand for, or practical benefit of, such AI systems.
Vadim Briliantov
08/28/2025, 8:44 PM
Michael Wills
08/28/2025, 9:03 PM
Michael Wills
08/28/2025, 9:04 PM
Michael Wills
08/28/2025, 9:08 PM
interrupt node:
from typing import TypedDict

from langgraph.types import interrupt

# Minimal assumed state shape for this example
class State(TypedDict):
    summary: str

# Human editing node
def human_review_edit(state: State) -> State:
    """Allow a human to review and edit the AI-generated summary."""
    result = interrupt({
        "task": "Please review and edit the generated summary if necessary.",
        "generated_summary": state["summary"]
    })
    return {
        "summary": result["edited_summary"]
    }
What’s happening here:
• This is the critical HITL node where the magic happens
• interrupt() pauses the workflow execution
• It presents the current state to the human reviewer
• The workflow waits for human input before continuing
• Returns the state with the human-edited summary
and then a Command(resume=...) to continue (see the sketch below).
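For completeness, a minimal sketch of the resume step, assuming a compiled graph with a checkpointer; graph, the draft text, and the thread config are illustrative names, while interrupt and Command are real LangGraph types:

from langgraph.types import Command

thread_config = {"configurable": {"thread_id": "1"}}

# First invocation runs until interrupt() pauses the graph.
graph.invoke({"summary": "AI-generated draft..."}, config=thread_config)

# After the human edits the text, resume the paused run.
# The value passed to resume= becomes the return value of interrupt().
graph.invoke(
    Command(resume={"edited_summary": "The reviewed and edited summary."}),
    config=thread_config,
)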
https://medium.com/the-advanced-school-of-ai/human-in-the-loop-in-langgraph-approve-or-reject-pattern-fcf6ba0c5990
darkmoon_uk
08/28/2025, 11:44 PM
Tom Molloy
08/29/2025, 6:58 PM
Vadim Briliantov
08/29/2025, 9:46 PM
llm.writeSession {
    callTool(::function, arg1, arg2) // or findTool(::function).execute(arg1, arg2)
}
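And, for readers wiring this up, a rough sketch of the tool side such a call assumes (the class and function names are invented for illustration; @Tool, @LLMDescription, and ToolSet come from Koog, but treat the exact setup as approximate):

class MyTools : ToolSet {
    // Hypothetical tool; only here so the callTool(::function, ...) above has a target.
    @Tool
    @LLMDescription("Combines two arguments into one result")
    fun function(arg1: String, arg2: String): String = "$arg1 $arg2"
}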
Michael Wills
08/30/2025, 1:03 AM
Using llm.writeSession to call a tool looks spot on. @renato.java, if you don't have a solution yet, this might be helpful for you as well.