#11754: fix(read): persist image data and inject MEDIA directive for channel delivery

by QDenka open 2026-02-08 08:11

agents stale

Cluster: Voice Call and TTS Improvements

## Summary

When the `read` tool reads an image file, the base64 image data is returned as a content block visible to the LLM but never converted to a deliverable media URL. This means images read by agents are not sent to Telegram or other channels; the user only sees the agent's text reply without the image.

## Changes

- **`src/agents/pi-tools.read.ts`**: After reading an image, persist the base64 data to a cache file under `.openclaw/media-cache/` in the workspace and inject a `MEDIA:./relative-path` directive into the text content block. The existing delivery pipeline then picks up the relative path via `splitMediaFromOutput` and sends the image to the channel.
- **`src/agents/pi-tools.ts`**: Pass `workspaceRoot` to `createOpenClawReadTool` for the non-sandboxed path.
- **`src/agents/pi-tools.read.image-delivery.test.ts`**: Tests verifying MEDIA injection for image reads, no injection for text reads, and no injection without `workspaceRoot`.

## How it works

1. Agent calls `read` on an image file
2. Read tool returns `{ type: 'image', data: '<base64>', mimeType: 'image/png' }` content block
3. **New**: The image data is persisted to `.openclaw/media-cache/<hash>.png`
4. **New**: A `MEDIA:./…` directive is appended to the text content block
5. The delivery pipeline (`splitMediaFromOutput`) extracts the media URL
6. Image is sent to Telegram/other channels via `sendMedia`

Fixes #11735

<h2>Greptile Overview</h2>

<h3>Greptile Summary</h3>

This PR updates the `read` tool wrapper so that when an image is read (base64 `image` content block), the image payload is persisted into a workspace-local cache directory (`.openclaw/media-cache/`) and a `MEDIA:` directive is appended to the tool’s text output. The existing media parsing/delivery pipeline (`splitMediaFromOutput` → channel `sendMedia`) can then detect the directive and deliver the image to downstream channels.
It also wires `workspaceRoot` through the non-sandboxed tool creation path and adds a Vitest suite covering the injection behavior.

<h3>Confidence Score: 2/5</h3>

- This PR has a few correctness issues that can break media delivery and should be fixed before merging.
- The core idea (persist + MEDIA directive) fits the existing `splitMediaFromOutput` pipeline, but the current directive/path formatting and injection behavior are inconsistent with how MEDIA tokens are parsed/consumed, and the persistence step can produce invalid/corrupted files without detection. These are likely to cause real delivery failures or duplicated MEDIA extraction in normal operation.
- `src/agents/pi-tools.read.ts` (MEDIA path format, injection target, image persistence validation); `src/agents/pi-tools.read.image-delivery.test.ts` (would currently fail once path semantics are corrected/validated)
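
For illustration, steps 3 and 4 of "How it works" (persist the image, then append a `MEDIA:` directive) might look roughly like the sketch below. This is not the code from `src/agents/pi-tools.read.ts`: the names `persistImage`, `injectMediaDirectives`, and `EXT_BY_MIME`, the content-block types, the 16-character SHA-256 file name, and the base64 round-trip check are all assumptions made for the example. The round-trip check and the de-duplication of directives are one possible response to the review's concerns about corrupted files and duplicated MEDIA extraction. The `./`-relative path follows the PR description, and the review suggests that format may not match what `splitMediaFromOutput` expects.

```typescript
import { createHash } from "node:crypto";
import { existsSync, mkdirSync, writeFileSync } from "node:fs";
import * as path from "node:path";

// Hypothetical content-block shapes, modeled on the block shown in step 2.
type ImageBlock = { type: "image"; data: string; mimeType: string };
type TextBlock = { type: "text"; text: string };
type ContentBlock = ImageBlock | TextBlock;

// Assumed MIME-to-extension mapping; the real tool may support a different set.
const EXT_BY_MIME: Record<string, string> = {
  "image/png": "png",
  "image/jpeg": "jpg",
  "image/gif": "gif",
  "image/webp": "webp",
};

// Writes the decoded image to <workspaceRoot>/.openclaw/media-cache/<hash>.<ext>
// and returns a "./"-prefixed path relative to the workspace root.
// Returns null for an unknown MIME type or a payload that is not clean base64.
function persistImage(block: ImageBlock, workspaceRoot: string): string | null {
  const ext = EXT_BY_MIME[block.mimeType];
  if (!ext) return null;

  // Buffer.from tolerates malformed base64, so re-encode and compare
  // (ignoring padding) to detect a corrupted payload before writing it.
  const bytes = Buffer.from(block.data, "base64");
  const stripPadding = (s: string) => s.replace(/=+$/, "");
  if (bytes.length === 0 || stripPadding(bytes.toString("base64")) !== stripPadding(block.data)) {
    return null;
  }

  const hash = createHash("sha256").update(bytes).digest("hex").slice(0, 16);
  const cacheDir = path.join(workspaceRoot, ".openclaw", "media-cache");
  mkdirSync(cacheDir, { recursive: true });
  const filePath = path.join(cacheDir, `${hash}.${ext}`);
  // File names are content-addressed, so an existing file already holds these bytes.
  if (!existsSync(filePath)) writeFileSync(filePath, bytes);

  return "./" + path.relative(workspaceRoot, filePath).split(path.sep).join("/");
}

// Appends one MEDIA directive per distinct persisted image to the first text
// block, or adds a new text block if the result has none.
// Without a workspaceRoot the content is returned unchanged.
function injectMediaDirectives(content: ContentBlock[], workspaceRoot?: string): ContentBlock[] {
  if (!workspaceRoot) return content;

  const directives: string[] = [];
  for (const block of content) {
    if (block.type !== "image") continue;
    const rel = persistImage(block, workspaceRoot);
    if (rel && !directives.includes(`MEDIA:${rel}`)) directives.push(`MEDIA:${rel}`);
  }
  if (directives.length === 0) return content;

  const suffix = directives.join("\n");
  const textIndex = content.findIndex((b) => b.type === "text");
  if (textIndex === -1) return [...content, { type: "text", text: suffix }];
  return content.map((b, i) =>
    i === textIndex && b.type === "text" ? { ...b, text: `${b.text}\n${suffix}` } : b,
  );
}
```

Hashing the decoded bytes means that reading the same image repeatedly reuses one cache file, and returning the input untouched when `workspaceRoot` is missing mirrors the "no injection without workspaceRoot" case covered by the test file.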