#12039: TTS voice bubbles, node role auth, media fixes, CI lint cleanup
Labels: channel: whatsapp-web, app: macos, gateway, scripts, agents, stale
Cluster: Voice Call and TTS Improvements
## Summary
- **Gateway: allow node role to call UI methods** — Mac/iOS apps connect as `role: "node"` but need `config.get`, `config.schema`, `chat.send`, `talk.mode`, etc. for Settings and Talk Mode to function. Adds a `NODE_ALSO_ALLOWED` whitelist in `server-methods.ts`.
- **macOS: set speech recognition `taskHint` for Talk Mode** — adds `.dictation` hint to `SFSpeechAudioBufferRecognitionRequest`, matching Voice Wake.
- **Media: auto-convert non-WAV audio to WAV for whisper-cli** — Telegram voice notes arrive as OGG Opus; adds an ffmpeg conversion step so local whisper.cpp transcription works.
- **TTS: deliver tool audio as Telegram voice bubble** — fixes `isValidMedia()` rejecting absolute temp paths and plumbs the `audioAsVoice` flag through the full `onToolResult` callback chain. Adds `audio/opus` MIME mapping.
- **Security: block symlink traversal in media loading** — resolves symlinks via `fs.realpath()` before reading local media files to prevent exfiltration outside temp directories.
- **Extract binary-resolution utilities from runner.ts** — moves `findBinary`/`hasBinary`/`fileExists` to `runner-binary.ts` to fix CI code-size check.
- **Fix symlink traversal check for Windows temp paths** — `fs.realpath()` resolves Windows 8.3 short names, causing false rejections; uses `os.tmpdir()` resolved through realpath as baseline.
- **Fix CI: SwiftLint + docker-setup test** — exclude generated `GatewayModels.swift` from SwiftLint, resolve bash path cross-platform in docker-setup test, extract helpers to reduce cyclomatic complexity and function body length warnings.
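
The node-role change in the first bullet can be pictured as an extra allowlist consulted during method authorization. The sketch below is illustrative only: `NODE_ALSO_ALLOWED` and the four method names come from this description, while `Role`, `NODE_BASE_METHODS`, its `node.example` entry, `isMethodAllowed`, and the rule that operators are unrestricted are assumptions, not the actual contents of `server-methods.ts`.

```typescript
// Hypothetical role model; the real gateway may define more roles.
type Role = "node" | "operator";

// Placeholder for methods a node could already call before this PR.
// The entry is invented for illustration.
const NODE_BASE_METHODS: ReadonlySet<string> = new Set(["node.example"]);

// UI methods that node-role clients (Mac/iOS apps) also need,
// as listed in the summary. The real list is longer ("etc.").
const NODE_ALSO_ALLOWED: ReadonlySet<string> = new Set([
  "config.get",
  "config.schema",
  "chat.send",
  "talk.mode",
]);

function isMethodAllowed(role: Role, method: string): boolean {
  // Assumption: operators are not restricted by this list.
  if (role === "operator") return true;
  // Nodes may call their base methods plus the extra UI methods.
  return NODE_BASE_METHODS.has(method) || NODE_ALSO_ALLOWED.has(method);
}
```

Keeping the extra methods in a separate set makes the widened surface easy to audit: anything a node can call beyond its base methods is listed in one place.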
## Security
`isValidMedia()` allows absolute paths only under:
- `os.tmpdir()` (cross-platform, handles Windows 8.3 names)
- `/tmp/`, `/private/tmp/` (Unix standard)
- `/var/folders/<xx>/<hash>/T/` (macOS temp only — not `/C/` caches or `/0/` cleanup)
Symlinks are resolved before the prefix check, so a link inside a temp directory that points outside it is rejected. All original local file inclusion (LFI) rejection tests pass unchanged.
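
A minimal sketch of the rule above, assuming a synchronous check. The helper names `isAllowedTempPath` and `isUnder` are hypothetical, and the real `isValidMedia()` may be structured differently (the PR mentions `fs.realpath()`; this sketch uses `fs.realpathSync()` for brevity). Both the candidate and the allowed roots are passed through realpath, so Windows 8.3 short names and macOS `/private` aliases compare consistently.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// macOS per-user temp dirs only (".../T/"), not "/C/" caches or "/0/".
const MACOS_TEMP_RE = /^(?:\/private)?\/var\/folders\/[^/]+\/[^/]+\/T\//;

// True when target is strictly inside root (both already resolved).
function isUnder(root: string, target: string): boolean {
  const rel = path.relative(root, target);
  if (rel === "" || path.isAbsolute(rel)) return false;
  return rel !== ".." && !rel.startsWith(".." + path.sep);
}

function isAllowedTempPath(candidate: string): boolean {
  if (!path.isAbsolute(candidate)) return false;

  let real: string;
  try {
    // Resolve symlinks first so a link cannot escape the temp roots.
    real = fs.realpathSync(candidate);
  } catch {
    return false; // missing file or dangling link
  }

  // Resolve the roots too, so both sides use the same canonical form.
  const roots: string[] = [];
  for (const dir of [os.tmpdir(), "/tmp", "/private/tmp"]) {
    try {
      roots.push(fs.realpathSync(dir));
    } catch {
      // Root does not exist on this platform; skip it.
    }
  }

  return roots.some((root) => isUnder(root, real)) || MACOS_TEMP_RE.test(real);
}
```

With this shape, a file written under `os.tmpdir()` passes, while `/etc/passwd`, a relative `../../` path, or a symlink in the temp directory that points at `/etc/passwd` all fail.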
## Test plan
- [x] `pnpm build` — no type errors
- [x] `vitest run src/media/parse.test.ts` — 11/11 passing
- [x] `vitest run src/docker-setup.test.ts` — 5/5 passing
- [x] `swiftlint` — 0 warnings, 0 errors
- [x] Telegram voice bubble confirmed on device
- [x] LFI paths (`/etc/passwd`, `/Users/...`, `~/...`, `../../`) still rejected
🤖 Generated with [Claude Code](https://claude.com/claude-code)