
#21931: feat(config): auto-rollback to last known-good backup on invalid config startup

by Protocol-zero-0 open 2026-02-20 15:02
docs gateway size: S
## TL;DR for maintainers

When `openclaw.json` fails validation at gateway startup, the gateway now automatically restores the most recent valid `.bak` backup instead of crashing into an unrecoverable restart loop. The broken config is preserved as `openclaw.json.broken-<timestamp>` for debugging.

Closes #18200

---

## Problem

If a user (or an agent via `config.patch`) writes a broken `openclaw.json`, the gateway throws at startup and exits. Process supervisors (`launchd`, `systemd`) immediately restart it, straight into the same crash. Because the gateway **is** the bot connection, users cannot fix the config through the UI and are stuck in a silent failure loop (see [tweet with 828 likes](https://x.com/xBenJamminx/status/1888741825891164190)).

## Solution

OpenClaw already creates up to 5 rotated backups (`.bak`, `.bak.1`, …, `.bak.4`) every time `writeConfigFile` persists a change. This PR adds the **restore** side:

1. **`tryLoadValidConfigBackup(configPath)`** (`src/config/io.ts`)
   Iterates `.bak` → `.bak.4`, reads and validates each file, and returns the first snapshot that passes `validateConfigObjectRawWithPlugins`. Returns `null` when no usable backup exists.
2. **Gateway startup fallback** (`src/gateway/server.impl.ts`)
   When `configSnapshot.exists && !configSnapshot.valid`:
   - Copies the broken config to `openclaw.json.broken-<timestamp>` (best-effort)
   - Calls `tryLoadValidConfigBackup`; if a valid backup is found, writes it back to `openclaw.json` and continues startup with a loud `log.warn`
   - If no backup is valid, throws the same error as before (with an updated message noting that no backup was found)
3. **Re-export** (`src/config/config.ts`): exposes the new helper.
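For reviewers who want the restore flow at a glance, here is a minimal, self-contained sketch of steps 1 and 2. It is not the code in this PR. `Validator`, `ConfigFs`, `BackupSnapshot`, `backupCandidates`, `tryLoadValidConfigBackupSketch`, and `recoverInvalidConfig` are hypothetical names. The sketch also assumes plain-JSON parsing, an injected `warn` callback in place of `log.warn`, an in-memory file map in place of disk access, and an arbitrary timestamp format, none of which are confirmed by the description above.

```typescript
// Hypothetical stand-ins: the PR description does not show the real signatures.
type Validator = (raw: unknown) => boolean; // plays the role of validateConfigObjectRawWithPlugins

interface ConfigFs {
  read(path: string): string | undefined; // undefined when the file is missing or unreadable
  write(path: string, text: string): void;
}

interface BackupSnapshot {
  path: string;
  text: string;
  config: unknown;
}

// Candidate order from the PR: `.bak` first, then `.bak.1` through `.bak.4`.
function backupCandidates(configPath: string, maxBackups = 5): string[] {
  const candidates = [`${configPath}.bak`];
  for (let i = 1; i < maxBackups; i++) {
    candidates.push(`${configPath}.bak.${i}`);
  }
  return candidates;
}

// Returns the first backup that both parses and validates, or null if none does.
function tryLoadValidConfigBackupSketch(
  io: ConfigFs,
  configPath: string,
  isValid: Validator,
): BackupSnapshot | null {
  for (const path of backupCandidates(configPath)) {
    const text = io.read(path);
    if (text === undefined) continue; // no such backup
    let parsed: unknown;
    try {
      parsed = JSON.parse(text); // assumption: plain JSON
    } catch {
      continue; // unparsable backup, try the next one
    }
    if (isValid(parsed)) return { path, text, config: parsed };
  }
  return null;
}

// Startup fallback: keep a copy of the broken file, restore the first valid backup, or throw.
function recoverInvalidConfig(
  io: ConfigFs,
  configPath: string,
  isValid: Validator,
  warn: (message: string) => void,
  timestamp: string, // format is an assumption
): unknown {
  const broken = io.read(configPath);
  if (broken !== undefined) {
    try {
      io.write(`${configPath}.broken-${timestamp}`, broken);
    } catch {
      // best-effort: failing to preserve the broken file must not block recovery
    }
  }
  const backup = tryLoadValidConfigBackupSketch(io, configPath, isValid);
  if (backup === null) {
    throw new Error(`Invalid config at ${configPath} and no valid backup was found`);
  }
  io.write(configPath, backup.text);
  warn(`Config at ${configPath} was invalid; restored from ${backup.path}`);
  return backup.config;
}

// Usage with an in-memory file map: the newest backup is invalid, the next one is good.
const files: Record<string, string> = {
  "openclaw.json": "{ not json",
  "openclaw.json.bak": '{"port": "oops"}',
  "openclaw.json.bak.1": '{"port": 8080}',
};
const memFs: ConfigFs = {
  read: (p) => files[p],
  write: (p, t) => {
    files[p] = t;
  },
};
const hasNumericPort: Validator = (raw) =>
  typeof raw === "object" && raw !== null && typeof (raw as { port?: unknown }).port === "number";
const warnings: string[] = [];
const restored = recoverInvalidConfig(
  memFs,
  "openclaw.json",
  hasNumericPort,
  (m) => warnings.push(m),
  "20260220T150200",
);
```

Injecting the file access and the validator keeps the sketch runnable without touching disk; the PR itself works against real files and the existing validation function.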
## What's NOT in this PR

- Notification via messaging channel on rollback (Issue #18200 item 3, separate PR)
- Crash-loop rate limiting (tracked in #16810)

## Testing

- 4 new unit tests for `tryLoadValidConfigBackup`:
  - Returns `null` when no backups exist
  - Finds first valid `.bak`
  - Skips invalid backups and finds the next valid one
  - Returns `null` when all backups are invalid
- All existing config IO tests pass (8/8 write-config, 25/25 config-misc)

## AI disclosure

This change was AI-assisted (research + implementation). All code was manually reviewed and tested by the author.

Made with [Cursor](https://cursor.com)

<!-- greptile_comment -->

<h3>Greptile Summary</h3>

Adds automatic config rollback to prevent crash loops when `openclaw.json` fails validation at gateway startup. The gateway now attempts to restore from up to 5 rotated backups (`.bak` through `.bak.4`) when the config is invalid, preserving the broken config as `openclaw.json.broken-<timestamp>` for debugging. If no valid backup exists, the gateway throws an error as before.

The implementation is well-structured with clear separation of concerns:

- `tryLoadValidConfigBackup` in `src/config/io.ts` handles the backup search and validation logic
- Gateway startup in `src/gateway/server.impl.ts` orchestrates the rollback when needed
- Comprehensive test coverage validates all rollback scenarios

The rollback logic correctly leverages the existing backup rotation system (5 backups created on each config write) and uses the same validation function (`validateConfigObjectRawWithPlugins`) to ensure consistency. The broken config is preserved for debugging before restoration, and a warning is logged with full details about the rollback.

<h3>Confidence Score: 5/5</h3>

- This PR is safe to merge: the implementation is well-designed, thoroughly tested, and solves a critical crash-loop issue without introducing risk.
- The code follows existing patterns, includes comprehensive unit tests covering all scenarios (no backups, first valid backup, skipping invalid backups, all invalid), and handles edge cases properly with best-effort error handling where appropriate. The rollback logic integrates cleanly with the existing config validation and backup rotation systems. No breaking changes or risky modifications to core logic.
- No files require special attention

<sub>Last reviewed commit: 73fcbf9</sub>

<!-- greptile_other_comments_section -->

<!-- /greptile_comment -->
