#11210: Harden Tool-Call Streaming Parser for OpenAI-Completions Backends (gpt-oss-120b + LiteLLM + LMDeploy)
# PR: Harden Tool-Call Streaming Parser for OpenAI-Completions Backends (LiteLLM + LMDeploy)
## Summary
This PR adds a defensive tool-call parsing guard for streamed `openai-completions` responses used by OpenClaw via `@mariozechner/pi-ai`.
It is required because some backends (notably `litellm -> lmdeploy -> gpt-oss-120b`) can emit tool-call deltas in shapes the original parser does not handle, so arguments are assembled partially or not at all. The result is repeated empty tool arguments (`{}`), validation loops, and degraded agent behavior.
## Why This Is Needed
OpenClaw’s tool execution depends on complete, valid structured tool-call payloads.
When tool-call argument reconstruction fails during streaming:
- `read` is called with `{}` instead of `{ "path": ... }`
- `exec` is called with `{}` instead of `{ "command": ... }`
- the agent loops on retries and may produce gibberish or fallback replies
This PR prevents malformed tool blocks from propagating to execution and broadens compatibility with real-world streaming variants.
## Related Issues
- `openclaw/openclaw#9916` - https://github.com/openclaw/openclaw/issues/9916
- `openclaw/openclaw#10507` - https://github.com/openclaw/openclaw/issues/10507
- `openclaw/openclaw#9956` - https://github.com/openclaw/openclaw/issues/9956
- `openclaw/openclaw#7867` - https://github.com/openclaw/openclaw/issues/7867
- `badlogic/pi-mono#952` - https://github.com/badlogic/pi-mono/issues/952
## Affected Runtime Path
- File patched at runtime package level:
  - `dist/providers/openai-completions.js` (inside `@mariozechner/pi-ai`)
- Patch file in this repo:
  - `patches/pi-ai-0.52.6-toolcall-json-guard.patch`
## Root Cause (Observed)
Some streamed responses include tool/function deltas in forms not fully handled by the original parser path:
- alternate argument fields (`arguments`, `parsed_arguments`, `args`, `input`)
- `response.function_call_arguments.delta` and `response.function_call_arguments.done` event types
- tool-call `id` updates/churn mid-stream
- final blocks where `name` is missing while partial args exist
Without guards, these edge cases leave `partialArgs` unparseable or unbound to the right tool block.
## What This Patch Changes
### 1) Adds robust argument extraction and parsing helpers
- `extractRawToolArgs(toolCall)`
- `recoverFirstJsonObject(raw)`
- `parseToolArgsSafely(raw, fallbackArgs)`
- `applyRawToolArgs(block, rawToolArgs)`
- `findToolCallBlock(blocks, callId)`
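
To make the first three helpers concrete, here is a minimal sketch of how they could work. Only the helper names and the field names (`arguments`, `parsed_arguments`, `args`, `input`) come from this PR. The function bodies, the lookup order, and the `{ ok, args }` return shape are assumptions for illustration and may differ from the actual patch.

```javascript
// Hypothetical sketch: helper names come from the patch, the bodies are assumed.

// Return the first usable argument payload from a tool-call delta as a string.
// Assumption: alternate fields may sit on `toolCall.function` or on `toolCall`.
function extractRawToolArgs(toolCall) {
  const fn = toolCall?.function ?? {};
  const candidates = [
    fn.arguments, fn.parsed_arguments,
    toolCall?.arguments, toolCall?.parsed_arguments, toolCall?.args, toolCall?.input,
  ];
  for (const value of candidates) {
    if (typeof value === "string" && value.length > 0) return value;
    if (value !== null && typeof value === "object") return JSON.stringify(value);
  }
  return "";
}

// Find the first balanced {...} span, ignoring braces inside JSON strings.
function recoverFirstJsonObject(raw) {
  const start = raw.indexOf("{");
  if (start === -1) return null;
  let depth = 0;
  let inString = false;
  let escaped = false;
  for (let i = start; i < raw.length; i++) {
    const ch = raw[i];
    if (inString) {
      if (escaped) escaped = false;
      else if (ch === "\\") escaped = true;
      else if (ch === '"') inString = false;
    } else if (ch === '"') {
      inString = true;
    } else if (ch === "{") {
      depth += 1;
    } else if (ch === "}") {
      depth -= 1;
      if (depth === 0) return raw.slice(start, i + 1);
    }
  }
  return null; // object never closed: the payload is still partial
}

// Try a strict parse first, then the recovered object; never throw.
// Assumption: the result is reported as { ok, args } so callers can drop bad blocks.
function parseToolArgsSafely(raw, fallbackArgs = {}) {
  for (const candidate of [raw, recoverFirstJsonObject(raw ?? "")]) {
    if (!candidate) continue;
    try {
      const parsed = JSON.parse(candidate);
      if (parsed !== null && typeof parsed === "object" && !Array.isArray(parsed)) {
        return { ok: true, args: parsed };
      }
    } catch {
      // fall through to the next candidate
    }
  }
  return { ok: false, args: fallbackArgs };
}
```

With this sketch, a duplicated payload such as `{"path":"/x"}{"path":"/x"}` fails the strict parse but is recovered as `{ path: "/x" }`, while a truncated `{"path":` reports `ok: false` instead of silently turning into `{}`.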
### 2) Handles additional streaming event variants
Adds handling for:
- `response.function_call_arguments.delta`
- `response.function_call_arguments.done`
This allows tool-call argument state to be updated from backend variants outside the basic `choice.delta.tool_calls` path.
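
As a rough illustration of the append/replace behavior, the sketch below folds these two event types into a block's `partialArgs`. It assumes the events carry a string `delta` and a final string `arguments`, as they do in the OpenAI Responses streaming format. Whether the backends in this chain use exactly those fields is not confirmed here, and `applyFunctionCallArgsEvent` is a hypothetical name.

```javascript
// Hypothetical sketch: the event field names (`delta`, `arguments`) are assumed.
// Returns true when the event was recognized and applied to the block.
function applyFunctionCallArgsEvent(block, event) {
  switch (event?.type) {
    case "response.function_call_arguments.delta":
      // Incremental fragment: append to whatever has been assembled so far.
      block.partialArgs += typeof event.delta === "string" ? event.delta : "";
      return true;
    case "response.function_call_arguments.done":
      // Final payload: replace the assembled fragments with the full string.
      if (typeof event.arguments === "string") block.partialArgs = event.arguments;
      return true;
    default:
      return false; // not a function-call-arguments event
  }
}

// Usage: two fragments followed by the authoritative final payload.
const block = { type: "toolCall", name: "read", partialArgs: "" };
applyFunctionCallArgsEvent(block, { type: "response.function_call_arguments.delta", delta: '{"pa' });
applyFunctionCallArgsEvent(block, { type: "response.function_call_arguments.delta", delta: 'th":"/x"}' });
applyFunctionCallArgsEvent(block, { type: "response.function_call_arguments.done", arguments: '{"path":"/x"}' });
```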
### 3) Improves tool-call block continuity
- Avoids unnecessary new tool block creation when only `id` changes and no new tool name is present.
- Accepts late-arriving `choice.message.tool_calls` hydration into existing blocks.
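
The continuity rule can be sketched as follows. The `toolNameDelta` expression and the `findToolCallBlock` name appear in the patch. The block shape, `applyToolCallDelta`, and `hydrateFromMessage` are hypothetical, and the sketch ignores the `index` field that standard streams use to separate parallel tool calls.

```javascript
// Hypothetical sketch of id-churn handling; block shape and function bodies are assumed.
function findToolCallBlock(blocks, callId) {
  return blocks.find((b) => b.type === "toolCall" && b.id === callId) ?? null;
}

// Apply one `choice.delta.tool_calls` entry and return the active block.
function applyToolCallDelta(blocks, current, toolCall) {
  const toolNameDelta = toolCall.function?.name || "";
  const idChanged = Boolean(toolCall.id) && current !== null && toolCall.id !== current.id;
  if (current === null || (idChanged && toolNameDelta !== "")) {
    // A new id together with a tool name marks a genuinely new call.
    current = { type: "toolCall", id: toolCall.id || "", name: toolNameDelta, partialArgs: "" };
    blocks.push(current);
  } else {
    // Only the id changed: adopt it and keep assembling into the same block.
    if (idChanged) current.id = toolCall.id;
    if (toolNameDelta && !current.name) current.name = toolNameDelta;
  }
  current.partialArgs += toolCall.function?.arguments || "";
  return current;
}

// Fill gaps in existing blocks from a late `choice.message.tool_calls` array.
function hydrateFromMessage(blocks, messageToolCalls) {
  for (const tc of messageToolCalls ?? []) {
    const name = tc.function?.name || "";
    const block =
      findToolCallBlock(blocks, tc.id) ??
      (name ? blocks.find((b) => b.type === "toolCall" && b.name === name) : undefined);
    if (!block) continue;
    if (!block.name) block.name = name;
    if (!block.partialArgs && typeof tc.function?.arguments === "string") {
      block.partialArgs = tc.function.arguments;
    }
  }
}
```

In this sketch, a stream that sends `call_1` with name `read` and then `call_2` with only an argument fragment ends up with a single `read` block whose id is `call_2`, rather than a second nameless block.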
### 4) Prevents malformed tool blocks from execution
- On tool block finalization: drop block if parse fails or tool `name` is blank.
- Final safety-net pass before `done`: strip any nameless `toolCall` blocks.
This ensures invalid tool calls do not reach OpenClaw’s tool validator/executor.
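
A minimal sketch of the two guards, assuming a block shape of `{ type, name, partialArgs }`. `finalizeToolCallBlock` and `stripNamelessToolCalls` are hypothetical names, and treating an empty argument string as `{}` (so zero-argument tools still pass) is an assumption that the patch may not share.

```javascript
// Hypothetical sketch: returns the finalized block, or null if it must be dropped.
function finalizeToolCallBlock(block) {
  const name = (block.name ?? "").trim();
  if (name === "") return null; // blank tool name: drop
  try {
    // Assumption: an empty argument string means "no arguments".
    const parsed = JSON.parse(block.partialArgs || "{}");
    if (parsed === null || typeof parsed !== "object" || Array.isArray(parsed)) return null;
    return { type: "toolCall", id: block.id, name, arguments: parsed };
  } catch {
    return null; // unparseable arguments: drop instead of falling back to {}
  }
}

// Final safety net before `done`: keep all non-tool blocks and every named tool call.
function stripNamelessToolCalls(content) {
  return content.filter((b) => b.type !== "toolCall" || (b.name ?? "").trim() !== "");
}
```

The point of returning `null` rather than a block with `{}` arguments is that the executor never sees a call it would only reject during validation.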
## Key Logic (High Level)
```mermaid
flowchart TD
A[Stream chunk received] --> B{Chunk type}
B -->|tool delta variants| C[Extract raw args from known fields]
B -->|choice.delta.tool_calls| D[Update/create tool block]
B -->|choice.message.tool_calls| E[Hydrate existing block by id/name]
C --> F[Append/replace partialArgs]
D --> F
E --> F
F --> G[Parse args safely]
G -->|ok| H[Emit toolcall_delta/end]
G -->|bad or nameless| I[Drop malformed block]
H --> J[Continue stream]
I --> J
J --> K[Final safety-net: remove nameless tool blocks]
K --> L[Emit done]
```
## Before vs After Behavior
```mermaid
sequenceDiagram
participant U as User
participant A as OpenClaw Agent
participant P as Parser (@mariozechner/pi-ai)
participant T as Tool Runtime
U->>A: "Use weather skill"
A->>P: Stream chat completion
P-->>A: toolCall(read, arguments={}) (before)
A->>T: read {}
T-->>A: validation error (missing path)
A->>P: retry
P-->>A: toolCall(read, arguments={})
Note over A,P: loop / degraded response
rect rgb(220,245,220)
U->>A: "Use weather skill"
A->>P: Stream chat completion
P-->>A: toolCall(read, arguments={"path":"/app/skills/weather/SKILL.md"}) (after)
A->>T: read {path: ...}
T-->>A: success
A->>T: exec {command: ...}
T-->>A: success
end
```
## Validation Results (Container Runtime)
Environment validated:
- Container: `Openclaw-1`
- Provider/API/Model:
  - provider: `litellm`
  - api: `openai-completions`
  - model: `ga3/gpt-oss-120b`
Patch markers present in runtime file:
- `toolNameDelta = toolCall.function?.name || ""`
- ``Final safety net: strip malformed tool calls before sending `done`.``
Latest session analyzed:
- `/home/node/.openclaw/agents/main/sessions/59214c76-2d15-447d-87b7-9ac6dd2fd90b.jsonl`
Metrics:
- Total tool calls: `59`
- Empty-args tool calls: `32`
- Validation errors:
  - `read.path` missing: `26`
  - `exec.command` missing: `6`
Stabilization split (same session):
- Pre-reset segment: `44` tool calls, `32` empty (`72.7%` empty)
- Post-reset segment: `15` tool calls, `0` empty (`0%` empty)
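
The percentages above follow directly from the counts (`32 / 44 ≈ 72.7%`, `0 / 15 = 0%`). A small sketch of that calculation is shown below. The record shape `{ name, arguments }` is hypothetical and does not reflect the actual session `.jsonl` schema, and the sample arrays are synthetic stand-ins that only reproduce the reported counts.

```javascript
// Hypothetical sketch: the record shape is assumed, not the real session schema.
function isEmptyArgs(args) {
  return args == null || (typeof args === "object" && Object.keys(args).length === 0);
}

// Count tool calls with empty arguments and report the share as a percentage.
function summarizeToolCalls(calls) {
  const total = calls.length;
  const empty = calls.filter((c) => isEmptyArgs(c.arguments)).length;
  const emptyPct = total === 0 ? 0 : (empty / total) * 100;
  return { total, empty, emptyPct: emptyPct.toFixed(1) };
}

// Synthetic data matching the reported counts for each segment.
const preReset = [
  ...Array.from({ length: 32 }, () => ({ name: "read", arguments: {} })),
  ...Array.from({ length: 12 }, () => ({ name: "read", arguments: { path: "/x" } })),
];
const postReset = Array.from({ length: 15 }, () => ({ name: "exec", arguments: { command: "ls" } }));

console.log(summarizeToolCalls(preReset));  // { total: 44, empty: 32, emptyPct: '72.7' }
console.log(summarizeToolCalls(postReset)); // { total: 15, empty: 0, emptyPct: '0.0' }
```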
Observed successful post-stabilization flow:
- `read /app/skills/weather/SKILL.md` (valid args)
- `exec curl ... wttr.in ...` (valid args)
- `read /home/node/common-skills/gitea/SKILL.md` (valid args)
- `exec tea repos list` + login/env-driven gitea flow
## Scope and Safety
### Scope
- Parsing/assembly logic for tool-call deltas in streaming path only.
- No change to business logic of OpenClaw tools themselves.
### Safety controls in patch
- Defensive parsing with fallback recovery.
- Drop invalid or nameless tool blocks before execution.
- Preserve existing valid behavior for standard OpenAI-compatible streams.
## Risk Assessment
Potential risks:
- Over-filtering: a valid but unusual tool block could be dropped if its name has still not arrived by finalization.
- Behavioral differences for providers that send the tool name late in a nonstandard way.
Mitigations:
- Name is updated from multiple event paths (`delta`, `message.tool_calls`).
- Filtering is intentionally final-stage and targeted at malformed blocks.
## Test / Repro Guidance
### Repro prompt set
1. `What is weather in Johannesburg, South Africa today? use your weather skill`
2. `use your gitea skill to list repos you can see`
### Expected with fix
- No repeated `{}` arguments in tool calls.
- `read` includes `path`.
- `exec` includes `command`.
- Tool failures, if any, should be setup/auth errors, not parser-shape errors.
## Rollout Recommendation
1. Merge parser hardening patch.
2. Keep session reset (`/reset`) in troubleshooting SOP when validating new deployments.
3. Add regression tests at parser layer for:
   - function_call delta variants
   - id churn
   - partial JSON recovery
   - nameless-tool block stripping
## Notes
- This patch is a pragmatic compatibility hardening for mixed provider chains.
- Long term, the ideal resolution is a fix in the upstream `pi-ai` source and a release, consumed here via a dependency bump.
<!-- greptile_comment -->
<h2>Greptile Overview</h2>
<h3>Greptile Summary</h3>
This PR adds a patch applied to `@mariozechner/pi-ai`’s `dist/providers/openai-completions.js` to harden streamed tool-call parsing for OpenAI-completions backends.
Key changes include: extracting tool-call arguments from multiple possible fields, safer JSON parsing/recovery for assembled argument strings, handling additional streaming event types (`response.function_call_arguments.delta/done`), improving continuity when tool-call IDs churn, and dropping malformed/nameless tool-call blocks before emitting `toolcall_end` and before the final `done` event. These changes are confined to the streaming parser path and are intended to prevent empty `{}` tool arguments from reaching OpenClaw’s tool validator/executor.
<h3>Confidence Score: 4/5</h3>
- This PR looks safe to merge and is narrowly scoped to defensive parsing in the streaming tool-call path.
- The change is contained to a runtime patch file and adds guardrails around tool-call argument assembly and finalization, including explicit dropping of malformed/nameless tool calls. I didn’t find a definitive logic error in the introduced conditions from the patch alone, but the patch touches subtle streaming state, so it merits careful runtime validation across providers.
- patches/pi-ai-0.52.6-toolcall-json-guard.patch
<!-- greptile_other_comments_section -->
<!-- /greptile_comment -->