
#19255: feat(gateway): add WebSocket connection metrics monitoring

by Wike-CHI open 2026-02-17 15:43
gateway size: M
## Summary

Describe the problem and fix in 2–5 bullets:

- Problem:
- Why it matters:
- What changed:
- What did NOT change (scope boundary):

## Change Type (select all)

- [ ] Bug fix
- [ ] Feature
- [ ] Refactor
- [ ] Docs
- [ ] Security hardening
- [ ] Chore/infra

## Scope (select all touched areas)

- [ ] Gateway / orchestration
- [ ] Skills / tool execution
- [ ] Auth / tokens
- [ ] Memory / storage
- [ ] Integrations
- [ ] API / contracts
- [ ] UI / DX
- [ ] CI/CD / infra

## Linked Issue/PR

- Closes #
- Related #

## User-visible / Behavior Changes

List user-visible changes (including defaults/config). If none, write `None`.

## Security Impact (required)

- New permissions/capabilities? (`Yes/No`)
- Secrets/tokens handling changed? (`Yes/No`)
- New/changed network calls? (`Yes/No`)
- Command/tool execution surface changed? (`Yes/No`)
- Data access scope changed? (`Yes/No`)
- If any `Yes`, explain risk + mitigation:

## Repro + Verification

### Environment

- OS:
- Runtime/container:
- Model/provider:
- Integration/channel (if any):
- Relevant config (redacted):

### Steps

1.
2.
3.

### Expected

-

### Actual

-

## Evidence

Attach at least one:

- [ ] Failing test/log before + passing after
- [ ] Trace/log snippets
- [ ] Screenshot/recording
- [ ] Perf numbers (if relevant)

## Human Verification (required)

What you personally verified (not just CI), and how:

- Verified scenarios:
- Edge cases checked:
- What you did **not** verify:

## Compatibility / Migration

- Backward compatible? (`Yes/No`)
- Config/env changes? (`Yes/No`)
- Migration needed? (`Yes/No`)
- If yes, exact upgrade steps:

## Failure Recovery (if this breaks)

- How to disable/revert this change quickly:
- Files/config to restore:
- Known bad symptoms reviewers should watch for:

## Risks and Mitigations

List only real risks for this PR. Add/remove entries as needed. If none, write `None`.
- Risk:
- Mitigation:

<!-- greptile_comment -->

<h3>Greptile Summary</h3>

This PR adds WebSocket connection metrics monitoring to the gateway. It introduces a `WsMetricsCollector` singleton class that tracks connection lifecycle events (connect, handshake, disconnect), message counts, byte throughput, and per-client stats. Two new gateway methods are exposed: `ws.metrics` (read-scoped, available to operators) and `ws.clients` (admin-only, returns detailed per-client stats).

- New `WsMetricsCollector` class in `src/gateway/server/ws-metrics.ts` with a capped rolling window for average connection duration (last 1000) and EMA-based latency tracking
- `ws.metrics` is added to `READ_METHODS`, so it is accessible to any operator with the `operator.read` scope
- `ws.clients` relies on the catch-all admin fallthrough in `authorizeGatewayMethod` rather than being explicitly listed in an authorization set; this is functionally correct but fragile
- Pre-handshake messages (e.g., `connect.challenge`) are counted in the global `messagesSent` / `bytesSent` totals but not in per-client stats, creating a minor data inconsistency
- The `updateLatency` method is defined on the collector but not called anywhere in this PR
- No tests are included for the new metrics collector or handlers

<h3>Confidence Score: 3/5</h3>

- This PR is likely safe to merge, since it adds read-only observability instrumentation with no changes to existing business logic, but it has minor data consistency concerns and no test coverage.
- The implementation is structurally sound and the metrics collector is well-designed with appropriate caps. However: (1) no tests are included for a feature that touches the hot path of every WebSocket message, (2) pre-handshake message counting creates a global-vs-per-client data inconsistency, and (3) the `ws.clients` authorization relies on an implicit fallthrough rather than explicit registration. The PR description is also entirely empty, making it harder to assess intent and scope.
- `src/gateway/server/ws-connection.ts` (pre-handshake metrics accounting), `src/gateway/server-methods/ws-metrics.ts` (authorization approach for `ws.clients`)

<sub>Last reviewed commit: 2f06e27</sub>

<!-- greptile_other_comments_section -->

<!-- /greptile_comment -->
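
For reviewers who want something concrete to reason about, the following is a minimal sketch of the behaviors the summary describes: a duration window capped at the last 1000 samples, an EMA for latency, and send counters that diverge when a message has no client yet. It is not the PR's code. The class name `WsMetricsSketch`, the `recordSent` and `recordDisconnect` signatures, and the smoothing factor `EMA_ALPHA = 0.2` are all assumptions made for illustration; only the "last 1000" cap and the `updateLatency` name come from the review text.

```typescript
// Hypothetical sketch: names and constants are assumptions, not the PR's code.
const MAX_DURATION_SAMPLES = 1000; // "last 1000" window from the review summary
const EMA_ALPHA = 0.2; // assumed smoothing factor; the PR's value is not shown

interface SendStats {
  messagesSent: number;
  bytesSent: number;
}

class WsMetricsSketch {
  private durationsMs: number[] = [];
  private latencyEmaMs: number | null = null;
  readonly totals: SendStats = { messagesSent: 0, bytesSent: 0 };
  readonly perClient = new Map<string, SendStats>();

  // Keep only the most recent MAX_DURATION_SAMPLES connection durations.
  recordDisconnect(durationMs: number): void {
    this.durationsMs.push(durationMs);
    if (this.durationsMs.length > MAX_DURATION_SAMPLES) {
      this.durationsMs.shift(); // drop the oldest sample
    }
  }

  averageDurationMs(): number {
    if (this.durationsMs.length === 0) return 0;
    const total = this.durationsMs.reduce((sum, d) => sum + d, 0);
    return total / this.durationsMs.length;
  }

  // Exponential moving average: the first sample seeds the value.
  updateLatency(sampleMs: number): void {
    this.latencyEmaMs =
      this.latencyEmaMs === null
        ? sampleMs
        : EMA_ALPHA * sampleMs + (1 - EMA_ALPHA) * this.latencyEmaMs;
  }

  latencyMs(): number | null {
    return this.latencyEmaMs;
  }

  // A message sent before the handshake has no clientId, so it is counted
  // in the global totals only. This reproduces the inconsistency the
  // review flags between global and per-client numbers.
  recordSent(bytes: number, clientId?: string): void {
    this.totals.messagesSent += 1;
    this.totals.bytesSent += bytes;
    if (clientId === undefined) return;
    const stats = this.perClient.get(clientId) ?? { messagesSent: 0, bytesSent: 0 };
    stats.messagesSent += 1;
    stats.bytesSent += bytes;
    this.perClient.set(clientId, stats);
  }

  perClientBytesSent(): number {
    let total = 0;
    this.perClient.forEach((s) => {
      total += s.bytesSent;
    });
    return total;
  }
}
```

In this sketch, calling `recordSent(10)` followed by `recordSent(5, "a")` leaves the global byte total at 15 while the per-client sum is 5, which is the kind of gap a test for the collector could pin down. `Array.prototype.shift` is linear in the window size; at 1000 entries that is probably acceptable, though a ring buffer would avoid the copy.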
