← Back to PRs

#19801: fix: pre-check write permissions before global install to prevent EACCES destruction

by menhguin open 2026-02-18 05:41 View on GitHub →
size: S
## Summary - **Problem:** `runGatewayUpdate()` runs `npm i -g` (or pnpm/bun equivalent) without checking write permissions first. Package managers remove the existing install before writing the new one, so an EACCES failure mid-install leaves the CLI partially deleted with no rollback. - **Why it matters:** The gateway becomes a zombie — process stays alive but the underlying code is gone. Messages are silently dropped. We discovered this after a real-world **34-hour silent outage** caused by a permission misconfiguration. - **What changed:** Added a single `fs.access(pkgRoot, W_OK)` pre-check before the destructive global install step. If the directory is not writable, the update returns early with `status: "error"` and `reason: "no-write-permission"` — no install attempt, no damage. - **What did NOT change (scope boundary):** No behavior change for working installs. No refactoring. Git-based update paths are untouched. The existing global install logic is identical when permissions are fine. ## Change Type (select all) - [x] Bug fix - [ ] Feature - [ ] Refactor - [ ] Docs - [x] Security hardening - [ ] Chore/infra ## Scope (select all touched areas) - [x] Gateway / orchestration - [ ] Skills / tool execution - [ ] Auth / tokens - [ ] Memory / storage - [ ] Integrations - [ ] API / contracts - [ ] UI / DX - [ ] CI/CD / infra ## Linked Issue/PR - Related #9100 (npm global update drops bundled deps) - Related #6318 (npm global upgrade path is fragile) - Related #11237 (Gateway Uninstalled on Restart) ## User-visible / Behavior Changes - When `update.run` detects the global install directory is not writable, it now returns `{ status: "error", reason: "no-write-permission" }` instead of attempting the install and potentially destroying the existing installation. - No change for users with correct permissions — behavior is identical. ## Security Impact (required) - New permissions/capabilities? `No` - Secrets/tokens handling changed? `No` - New/changed network calls? `No` - Command/tool execution surface changed? `No` — the npm/pnpm/bun install command is *prevented* from running, not changed - Data access scope changed? `No` ## Repro + Verification ### Environment - OS: Linux (GCP, x64) - Runtime/container: Node v22.22.0, npm global install - Model/provider: N/A (infra bug) - Integration/channel: All channels affected (gateway-level) - Relevant config: Global npm install at `/usr/lib/node_modules/openclaw` owned by root, gateway running as non-root user ### Steps 1. Install OpenClaw globally: `sudo npm i -g openclaw` 2. Run gateway as non-root user 3. Trigger update: `update.run` (via cron, weekly update job, or manual) 4. npm tries `npm i -g openclaw@latest`, hits EACCES, partially removes existing install 5. Gateway continues running but all tool execution / message handling breaks silently ### Expected - Update detects missing write permission and returns a clear error *before* touching the filesystem ### Actual (before fix) - npm begins uninstalling the old version, hits EACCES writing the new version, leaves a corrupted half-deleted install ## Evidence - [x] Failing test/log before + passing after - Added test: "returns error with no-write-permission when global install path is not writable" - All 16 tests in `update-runner.test.ts` pass (including the new one) - Linter clean (oxlint: 0 warnings, 0 errors) ## Human Verification (required) - Verified scenarios: Ran `vitest run src/infra/update-runner.test.ts` — all 16 tests pass. Test uses `fs.chmod(pkgRoot, 0o555)` to simulate a non-writable global install directory. - Edge cases checked: Ensured the check uses `W_OK` on the package root directory itself (not parent), which matches where npm/pnpm/bun need to write. Verified `fs.constants.W_OK` is available on `node:fs/promises`. - What I did **not** verify: Full `pnpm build && pnpm check && pnpm test` suite (ran only the affected test file). Did not test on Windows (EACCES semantics differ). ## Compatibility / Migration - Backward compatible? `Yes` - Config/env changes? `No` - Migration needed? `No` ## Failure Recovery (if this breaks) - How to disable/revert this change quickly: Revert this single commit - Files/config to restore: None - Known bad symptoms reviewers should watch for: If the W_OK check gives false negatives (reports not writable when it is), updates would be blocked with `no-write-permission`. User can verify by checking `ls -la` on the global install path. ## Risks and Mitigations - Risk: On some platforms, `fs.access(dir, W_OK)` may not perfectly predict whether `npm i -g` will succeed (e.g., ACLs, SELinux, Windows) - Mitigation: This is a best-effort pre-check. False negatives (blocking a valid update) are much less harmful than the current behavior (destroying the install). Users hitting a false negative get a clear error with the specific path to investigate. --- 🤖 *AI-assisted PR (Claude via OpenClaw agent). Fully tested. The humans understand the code.* <!-- greptile_comment --> <h3>Greptile Summary</h3> Adds a write-permission pre-check (`fs.access(pkgRoot, W_OK)`) before the destructive `npm i -g` / `pnpm` / `bun` global install step in `runGatewayUpdate()`. This prevents the package manager from partially deleting the existing installation when it hits EACCES mid-install, which previously caused a 34-hour silent outage where the gateway stayed alive but all tool execution and message handling broke. - `update-runner.ts`: Inserts a `try { await fs.access(pkgRoot, fs.constants.W_OK) }` check immediately after detecting the global install manager but before any install commands run. On failure, returns `{ status: "error", reason: "no-write-permission" }` with zero steps executed. - `update-runner.test.ts`: Adds a test using `fs.chmod(pkgRoot, 0o555)` to simulate a non-writable directory, verifying the error return and confirming the install command is never invoked. Permissions are properly restored in a `finally` block. - The change is minimal and well-scoped — no behavior change for working installs, no refactoring, git-based update paths untouched. <h3>Confidence Score: 5/5</h3> - This PR is safe to merge — it adds a defensive pre-check with no behavior change for working installs. - The change is minimal (17 lines of production code), well-tested, correctly typed against UpdateRunResult, and properly integrated with existing consumers. The pre-check is a strict improvement: it prevents a destructive failure mode with no downside for correct-permission scenarios. The fs.access/W_OK API is well-established in Node.js and already used elsewhere in this codebase. - No files require special attention. <sub>Last reviewed commit: 3c8f94c</sub> <!-- greptile_other_comments_section --> <sub>(2/5) Greptile learns from your feedback when you react with thumbs up/down!</sub> **Context used:** - Context from `dashboard` - CLAUDE.md ([source](https://app.greptile.com/review/custom-context?memory=fd949e91-5c3a-4ab5-90a1-cbe184fd6ce8)) - Context from `dashboard` - AGENTS.md ([source](https://app.greptile.com/review/custom-context?memory=0d0c8278-ef8e-4d6c-ab21-f5527e322f13)) <!-- /greptile_comment -->

Most Similar PRs