The Problem
Currently, when an agent is running a long task (e.g., involving multiple tool calls or a complex chain of thought), there is no way to programmatically stop or cancel the execution via the ADK web server API.
In a web application scenario, a user might trigger a long-running agent task. If the user decides to stop the process, the front-end can send a request, but there is no endpoint on the ADK server to handle this cancellation. Disconnecting a streaming connection (like SSE or WebSocket) only stops the client from receiving further updates; the agent continues to run on the server, consuming resources until it completes its task.
This makes it difficult to build responsive and user-friendly interfaces on top of ADK agents, as there is no mechanism for user-initiated interruption.
Proposed Solution
I propose implementing a session-state-based cancellation mechanism. This would involve a few coordinated changes to the framework that should integrate cleanly with the existing architecture.
1. Introduce a cancelled Flag in the Session State
The Session object's state dictionary (defined via TypeAlias in src/google/adk/sessions/state.py) is a natural place to hold this. We can introduce a cancelled: bool flag.
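As a minimal sketch of the flag itself, treating the state as a plain dictionary: the key name cancelled comes from this proposal, while the helper names request_cancel and is_cancelled are hypothetical and not part of ADK.

```python
from typing import Any, Optional

CANCELLED_KEY = "cancelled"  # key name taken from the proposal


def request_cancel(state: dict[str, Any]) -> None:
    """Mark the session state as cancelled (hypothetical helper)."""
    state[CANCELLED_KEY] = True


def is_cancelled(state: Optional[dict[str, Any]]) -> bool:
    """Return True only if the flag is present and truthy.

    A missing or empty state is treated as "not cancelled".
    """
    return bool(state) and bool(state.get(CANCELLED_KEY))
```

Centralizing the check in one helper would keep the None handling out of every call site.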
2. Create a New Cancellation Endpoint
A new FastAPI endpoint should be added to src/google/adk/cli/adk_web_server.py:
    @app.post("/apps/{app_name}/users/{user_id}/sessions/{session_id}:cancel")
    async def cancel_session(app_name: str, user_id: str, session_id: str):
        # Load the session; a missing session maps to HTTP 404.
        session = await self.session_service.get_session(
            app_name=app_name, user_id=user_id, session_id=session_id
        )
        if not session:
            raise HTTPException(status_code=404, detail="Session not found")
        if session.state is None:
            session.state = {}
        session.state["cancelled"] = True
        # Conceptual: assumes the session service offers a way to persist
        # the updated state, shown here as update_session().
        await self.session_service.update_session(session)
        return {"status": "cancelled", "session_id": session_id}
This endpoint would be responsible for retrieving the session, setting the cancelled flag to True, and persisting the change.
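The retrieve, set, persist sequence can be exercised without a web server. The sketch below is only an illustration: InMemoryStore and cancel_session_logic are hypothetical stand-ins, not ADK or FastAPI APIs, and a LookupError takes the place of the HTTP 404.

```python
import asyncio
import copy
from typing import Any, Optional


class InMemoryStore:
    """Hypothetical stand-in for a session service.

    It hands out copies, so changes are lost unless they are saved back.
    """

    def __init__(self) -> None:
        self._sessions: dict[str, dict[str, Any]] = {}

    async def get(self, session_id: str) -> Optional[dict[str, Any]]:
        state = self._sessions.get(session_id)
        return copy.deepcopy(state) if state is not None else None

    async def save(self, session_id: str, state: dict[str, Any]) -> None:
        self._sessions[session_id] = copy.deepcopy(state)


async def cancel_session_logic(store: InMemoryStore, session_id: str) -> dict[str, str]:
    """Retrieve the session state, set the flag, and persist it."""
    state = await store.get(session_id)
    if state is None:
        # The real endpoint would translate this into an HTTP 404.
        raise LookupError("Session not found")
    state["cancelled"] = True
    await store.save(session_id, state)  # without this, the flag is lost
    return {"status": "cancelled", "session_id": session_id}
```

Because the store returns copies, the explicit save step is what makes the flag visible to other readers of the session.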
3. Modify Core Agent Logic to Respect the Flag
The agent's execution loops need to be updated to check for this flag periodically. The key integration points would be:
- In src/google/adk/flows/llm_flows/base_llm_flow.py: Before making a call to the LLM in _call_llm_async(), the agent should check session.state.get("cancelled"). If true, it should stop execution and yield a final "cancelled" message.
- In Streaming Loops: For both SSE and WebSocket (/run_live) connections, the message processing loops should check the flag. If it's true, the connection should be gracefully closed.
- In Tool Execution: Long-running tools should also be designed to accept the ToolContext and check the session's cancelled flag, allowing them to terminate early.
Here is a conceptual example of how the check in base_llm_flow.py might look:
    # In base_llm_flow.py's _call_llm_async
    session = invocation_context.session
    if session and session.state and session.state.get("cancelled"):
        yield LlmResponse(
            model_response=types.GenerateContentResponse(
                # ... create a response indicating cancellation ...
            ),
            turn_complete=True
        )
        return
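To show the check-before-each-call pattern end to end, here is a small self-contained simulation. It does not use ADK: run_steps is a hypothetical async generator in which asyncio.sleep(0) stands in for the LLM call and plain strings stand in for responses.

```python
import asyncio
from typing import Any, AsyncGenerator


async def run_steps(
    state: dict[str, Any], steps: list[str]
) -> AsyncGenerator[str, None]:
    """Yield one result per step, checking the flag before each simulated LLM call."""
    for step in steps:
        if state.get("cancelled"):
            yield "cancelled"  # final message, then stop
            return
        await asyncio.sleep(0)  # stands in for the LLM call
        yield f"done: {step}"


async def demo_flow() -> list[str]:
    state: dict[str, Any] = {}
    out: list[str] = []
    async for message in run_steps(state, ["a", "b", "c"]):
        out.append(message)
        if message == "done: a":
            state["cancelled"] = True  # simulates the cancel endpoint firing
    return out
```

In this simulation the first step completes, the flag is set, and the next check ends the run with a single "cancelled" message. Note that the check only runs between calls, so a call already in flight still finishes.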
Additional Implementation Details
4. Function/Tool Execution Cancellation
In src/google/adk/flows/llm_flows/functions.py, the parallel tool execution should check for cancellation:
    # In handle_function_calls_live() before creating tasks
    session = invocation_context.session
    if session and session.state and session.state.get("cancelled"):
        return None  # Skip tool execution if cancelled
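The check above prevents tools from starting, but a tool that is already running would also need to cooperate. A rough sketch of such a tool follows, assuming the tool can read the session state through its context. FakeToolContext and process_items are hypothetical names used only for illustration.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class FakeToolContext:
    """Hypothetical stand-in that exposes only a state mapping."""

    state: dict[str, Any] = field(default_factory=dict)


def process_items(items: list[int], tool_context: FakeToolContext) -> dict[str, Any]:
    """Square each item, stopping early with partial results if cancelled."""
    results: list[int] = []
    for item in items:
        # Check between units of work so the tool can terminate early.
        if tool_context.state.get("cancelled"):
            return {"status": "cancelled", "results": results}
        results.append(item * item)
    return {"status": "ok", "results": results}
```

Returning the partial results alongside a status lets the caller decide whether the work done so far is still useful.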
5. WebSocket Connection Handling
The WebSocket endpoint should also monitor the cancellation flag:
    # In agent_live_run() message processing loop
    async def process_messages():
        while True:
            if session.state and session.state.get("cancelled"):
                await websocket.close(code=1000, reason="Session cancelled")
                break
            # ... continue processing
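One caveat with the loop above: if the cancel endpoint sets the flag on a separately loaded copy of the session, the loop's in-memory session object may never see the change. A possible way around this, sketched under the assumption that the loop can re-read state on each iteration, is shown below. The load_state callable and this version of process_messages are hypothetical, and strings stand in for the WebSocket close.

```python
import asyncio
from typing import Any, Awaitable, Callable


async def process_messages(
    load_state: Callable[[], Awaitable[dict[str, Any]]],
    messages: list[str],
) -> list[str]:
    """Handle messages until the freshly loaded state reports cancellation."""
    handled: list[str] = []
    for message in messages:
        state = await load_state()  # re-read rather than trust a cached copy
        if state.get("cancelled"):
            handled.append("closed: Session cancelled")
            break
        handled.append(f"handled: {message}")
    return handled
```

Re-reading on every message adds a storage round trip, so a real implementation might poll less often or share an in-process signal instead.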
Alternative Approaches Considered
- AsyncIO Task Cancellation: Python's native task cancellation would work, but it would require significant refactoring of the execution model.
- Separate Cancellation Service: A dedicated service to track cancellations would add complexity that the session-state approach avoids.
- Timeout-Based Approach: Timeouts alone do not give users the immediate responsiveness they expect.
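For comparison, this is roughly what the native asyncio alternative looks like in isolation. It uses only the standard library; long_task is a hypothetical placeholder for an agent run, and the sketch says nothing about how ADK would track the task handle per session.

```python
import asyncio


async def long_task(log: list[str]) -> None:
    try:
        await asyncio.sleep(10)  # stands in for a long agent run
        log.append("finished")
    except asyncio.CancelledError:
        log.append("cleaned up")  # release resources here
        raise  # re-raise so the task is marked as cancelled


async def demo_task_cancel() -> list[str]:
    log: list[str] = []
    task = asyncio.create_task(long_task(log))
    await asyncio.sleep(0)  # give the task a chance to start
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        log.append("task cancelled")
    return log
```

Unlike the flag, task.cancel() interrupts the task at its current await point, but the server would need to keep a reference to each running task, which is where the refactoring cost comes from.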
Testing Considerations
The implementation should include tests for:
- Setting and retrieving the cancellation flag
- Agent respecting cancellation during LLM calls
- Tool execution stopping when cancelled
- Streaming connections closing properly
- Partial results being returned correctly
Justification
This feature is critical for building production-ready applications with ADK. It provides:
- Better User Experience: Users can interrupt tasks without having to wait for them to complete or time out.
- Resource Management: Prevents orphaned, long-running agent processes from consuming unnecessary server resources.
- Robustness: Creates a more complete and professional API for managing the agent lifecycle.
- Alignment with Industry Standards: Most AI/LLM frameworks (OpenAI, Anthropic, LangChain) provide cancellation mechanisms.
I believe this approach is minimally invasive and leverages the existing session management infrastructure effectively.
Related Issues/Context
This feature request arose from real-world usage where developers are building web applications on top of ADK agents deployed on Agent Engine and need responsive user interfaces with stop/cancel functionality.
(Co-authored by Gemini CLI and Claude Code)