#9703: feat(macos): Voice settings restructure + Whisper transcription support
app: macos
stale
Cluster: Voice Transcription Enhancements
## Summary
Restructures the Voice settings UI and adds Whisper transcription support for both Push-to-Talk and Voice Wake.
## Changes
### UI Restructure
- Renamed tab from "Voice Wake" → "Voice"
- Reorganized settings into 5 clear sections:
  - **Voice Wake** - Wake word detection (always uses Apple Speech)
  - **Push-to-Talk** - Hotkey-triggered transcription
  - **Transcription Model** - Combined picker for Apple Speech + Whisper models
  - **Audio** - Input device selection
  - **Sounds** - Audio feedback toggles
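
One way to back a combined picker like the **Transcription Model** section is a single enum that covers both backends. The sketch below is illustrative only: `TranscriptionModel`, `WhisperModelSize`, and the `storageKey` strings are hypothetical names, not the types this PR adds, and the size list assumes the standard Whisper model families.

```swift
import Foundation

// Hypothetical types for illustration; the PR's actual models may differ.
enum WhisperModelSize: String, CaseIterable {
    case tiny, base, small, medium, large
}

enum TranscriptionModel: Hashable {
    case appleSpeech
    case whisper(WhisperModelSize)

    /// Every option a combined picker would list: Apple Speech first, then each Whisper size.
    static var allOptions: [TranscriptionModel] {
        [TranscriptionModel.appleSpeech]
            + WhisperModelSize.allCases.map { TranscriptionModel.whisper($0) }
    }

    /// Stable string for persisting the selection (assumed format).
    var storageKey: String {
        switch self {
        case .appleSpeech: return "apple-speech"
        case .whisper(let size): return "whisper-\(size.rawValue)"
        }
    }

    /// Restores a selection from its persisted string; returns nil for unknown keys.
    init?(storageKey: String) {
        if storageKey == "apple-speech" {
            self = .appleSpeech
            return
        }
        let prefix = "whisper-"
        guard storageKey.hasPrefix(prefix),
              let size = WhisperModelSize(rawValue: String(storageKey.dropFirst(prefix.count)))
        else { return nil }
        self = .whisper(size)
    }
}
```

Keeping backend and model in one value means the picker binds to a single selection, and an unknown persisted key fails cleanly instead of silently selecting a model that is not installed.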
### Whisper Integration
- Fixed binary detection: `whisper-cpp` → `whisper-cli` (the binary installed by Homebrew was renamed)
- Added combined model picker showing Apple Speech and all Whisper model sizes
- Implemented rolling audio buffer for Voice Wake → Whisper handoff
- Push-to-Talk now supports Whisper transcription via sox/rec
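
As a rough sketch of binary detection that tolerates both the old and new names, the lookup below tries `whisper-cli` first and falls back to `whisper-cpp`, searching the usual Homebrew prefixes plus `PATH`. The function names, the search order, and the injectable `isExecutable` check are assumptions for illustration, not the PR's implementation.

```swift
import Foundation

/// Candidate directories: Homebrew prefixes (Apple Silicon, Intel) followed by PATH entries.
func defaultSearchDirs(
    environment: [String: String] = ProcessInfo.processInfo.environment
) -> [String] {
    let pathDirs = (environment["PATH"] ?? "").split(separator: ":").map(String.init)
    return ["/opt/homebrew/bin", "/usr/local/bin"] + pathDirs
}

/// Returns the first executable found, preferring earlier names over earlier directories.
/// `isExecutable` is injectable so the lookup can be exercised without touching the disk.
func findWhisperBinary(
    names: [String] = ["whisper-cli", "whisper-cpp"],
    searchDirs: [String] = defaultSearchDirs(),
    isExecutable: (String) -> Bool = { FileManager.default.isExecutableFile(atPath: $0) }
) -> String? {
    for name in names {
        for dir in searchDirs {
            let candidate = URL(fileURLWithPath: dir).appendingPathComponent(name).path
            if isExecutable(candidate) { return candidate }
        }
    }
    return nil
}
```

Routing the availability check and the actual launch through one resolver like this is one way to avoid reporting Whisper as available while a different hardcoded path fails at runtime.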
### Voice Wake Whisper Handoff
- Apple Speech handles wake word detection (efficient for always-on)
- After the wake phrase is detected, the audio buffer is sent to Whisper for command transcription
- Maintains ~10s rolling buffer so pre-wake audio isn't lost
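
The rolling buffer idea can be sketched as a fixed-capacity ring buffer that overwrites the oldest samples. This is a minimal illustration, not the PR's `RollingAudioBuffer`: the type name `RollingSampleBuffer`, the 16 kHz mono `Float` sample format, and the whole-second capacity are assumptions.

```swift
import Foundation

/// Fixed-capacity ring buffer of mono samples; the oldest audio is overwritten first.
struct RollingSampleBuffer {
    private var storage: [Float]
    private var writeIndex = 0
    private var count = 0

    /// Assumed defaults: 16 kHz mono, roughly 10 seconds of audio.
    init(sampleRate: Int = 16_000, seconds: Int = 10) {
        storage = [Float](repeating: 0, count: max(1, sampleRate * seconds))
    }

    /// Appends samples, wrapping around once the buffer is full.
    mutating func append(_ samples: [Float]) {
        for sample in samples {
            storage[writeIndex] = sample
            writeIndex = (writeIndex + 1) % storage.count
            count = min(count + 1, storage.count)
        }
    }

    /// Returns the buffered samples in chronological order (oldest first).
    func snapshot() -> [Float] {
        guard count == storage.count else { return Array(storage[0..<count]) }
        return Array(storage[writeIndex...]) + Array(storage[..<writeIndex])
    }
}
```

At handoff time, `snapshot()` would supply the audio to write out for Whisper. A real implementation fed from an audio tap would also need synchronization, since the tap callback and the handoff run on different threads.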
## Testing
- [x] App builds and signs
- [x] Voice settings UI renders correctly
- [x] Whisper model detection works
- [x] Wake word matching logic verified with a unit test
<!-- greptile_comment -->
<h2>Greptile Overview</h2>
<h3>Greptile Summary</h3>
This PR restructures the macOS Voice settings UI (renaming the tab to “Voice” and splitting settings into Voice Wake / Push-to-Talk / Transcription Model / Audio / Sounds). It also adds local Whisper support by introducing a `WhisperTranscriber` actor, new persisted state for transcription backend + model, and a `RollingAudioBuffer` used to hand off buffered post-wake audio to Whisper for command transcription.
<h3>Confidence Score: 3/5</h3>
- This PR is close, but has a few concrete runtime/UX issues around Whisper execution and user guidance.
- Main changes are straightforward UI + new Whisper plumbing, but multiple code paths hardcode Homebrew binary locations and the availability checks/messages are inconsistent, which can cause Whisper to be reported as available yet fail at runtime or mislead users on setup.
- apps/macos/Sources/OpenClaw/WhisperTranscriber.swift; apps/macos/Sources/OpenClaw/VoicePushToTalk.swift; apps/macos/Sources/OpenClaw/VoiceWakeSettings.swift; apps/macos/Sources/OpenClaw/MenuContentView.swift
<!-- greptile_other_comments_section -->
<!-- /greptile_comment -->
## Most Similar PRs
- #12157: feat(macos): add Granola-style meeting notes with live transcription (by npow · 2026-02-08 · 76.8%)
- #8848: feat(stt): Add Whisper as first-class audio transcription provider (by emadomedher · 2026-02-04 · 76.4%)
- #14458: fix(voicewake): avoid crash on foreign transcript ranges (by guchang · 2026-02-12 · 75.5%)
- #10012: Webui voice (by nanxiacc · 2026-02-06 · 75.1%)
- #18235: macOS: prevent Voice Wake crash when no input device is available (by agisilaos · 2026-02-16 · 72.9%)
- #20053: feat(voicewake): trigger-based routing to agent/session (by longbiaochen · 2026-02-18 · 72.8%)
- #19073: feat(voice-call): streaming TTS, barge-in, silence filler, hangup, ... (by odrobnik · 2026-02-17 · 72.6%)
- #9456: feat(mac): add enhanced Siri neural voice support for Talk mode (by teknomage8 · 2026-02-05 · 72.4%)
- #23572: feat(voice): enable voice note conversation loop for Telegram and W... (by davidrudduck · 2026-02-22 · 72.3%)
- #20475: fix(macos): resolve 120%+ CPU regression and gateway stability (by teknomage8 · 2026-02-19 · 70.9%)