โ† Back to PRs

#17618: Automated issue dedupe with Codex Action

by odysseus0 open 2026-02-16 00:47 View on GitHub โ†’
scripts size: M
## What this does Every time a new issue is opened, an AI agent searches for duplicates and posts a comment like: > Found 2 possible duplicate issues: > 1. https://github.com/openclaw/openclaw/issues/42 > 2. https://github.com/openclaw/openclaw/issues/87 > > This issue will be automatically closed as a duplicate in 3 days. > - To prevent auto-closure, ๐Ÿ‘Ž this comment After 3 days, if nobody thumbs-down reacts, a daily cron job closes it. No human intervention needed for the happy path. ## Why The repo has a growing issue backlog with many duplicates. Manual triage doesn't scale. This automates the boring part โ€” finding and closing duplicates โ€” while keeping humans in the loop via the ๐Ÿ‘Ž escape hatch. ## Where this pattern comes from Direct adaptation of [`anthropics/claude-code`](https://github.com/anthropics/claude-code)'s production dedupe system, running on a repo with 5k+ issues. Their implementation: - [`scripts/comment-on-duplicates.sh`](https://github.com/anthropics/claude-code/blob/main/scripts/comment-on-duplicates.sh) โ€” posts the marker comment - [`scripts/auto-close-duplicates.ts`](https://github.com/anthropics/claude-code/blob/main/scripts/auto-close-duplicates.ts) โ€” daily cron closer - [`.claude/commands/dedupe.md`](https://github.com/anthropics/claude-code/blob/main/.claude/commands/dedupe.md) โ€” the AI prompt We swap Claude for [Codex Action](https://github.com/openai/codex-action) (`openai/codex-action@v1`) since the project has OpenAI credits. ## Files added (5) | File | What it does | |------|-------------| | `scripts/comment-on-duplicates.sh` | Posts the "possible duplicate" comment with 3-day grace period. Called by the AI agent via `gh`. | | `scripts/auto-close-duplicates.ts` | Daily cron script. Scans open issues for uncontested duplicate comments older than 3 days โ†’ closes them. Uses Bun (already in CI via `oven-sh/setup-bun@v2`). | | `.codex/config.toml` | Sandbox config so Codex can run `gh` (needs network access + env var passthrough). | | `.github/workflows/codex-dedupe-issues.yml` | Triggers on `issues: [opened]` + `workflow_dispatch` for manual backfill. Runs Codex with an inline prompt to search for duplicates using `gh`. | | `.github/workflows/auto-close-duplicates.yml` | Daily cron at 09:00 UTC. Runs the TypeScript closer. | ## How auto-close decides The closer will only close an issue if ALL of these are true: 1. A bot comment containing "Found" + "possible duplicate" exists 2. That comment is older than 3 days 3. Nobody has ๐Ÿ‘Ž reacted to the duplicate comment Comments don't block closure โ€” only a thumbs-down does. Reviewers can comment on flagged issues without accidentally preventing auto-close. ## Setup required (one secret) After merging, add this repo secret: - **`OPENAI_API_KEY`** โ€” for Codex Action to call the model `GITHUB_TOKEN` is provided automatically by Actions. ## Testing it 1. Merge this PR 2. Add the `OPENAI_API_KEY` secret 3. Go to Actions โ†’ "Codex Dedupe Issues" โ†’ "Run workflow" 4. Enter an issue number you know has duplicates 5. Watch it post a duplicate comment The daily closer will pick up uncontested duplicates automatically after 3 days. --- ### CONTRIBUTING.md checklist - **Local validation:** `pnpm build/check/test` not applicable โ€” this PR adds CI workflows and standalone scripts, no application code changes. Shell script validated with `bash -n`. Real validation is `workflow_dispatch` after merge. - **Focused scope:** Single feature โ€” automated issue deduplication. - **Clear what + why:** Above. - **AI-assisted:** Yes. Research, planning, and implementation done with Claude Code (Opus). Scripts adapted from anthropics/claude-code's production system with repo-specific changes (branding, defaults, auto-close logic). All code reviewed and understood by the contributor. <!-- greptile_comment --> <h3>Greptile Summary</h3> This PR adds automated issue deduplication using OpenAI's Codex Action: a workflow triggers on new issues to search for duplicates and post a warning comment, and a daily cron job auto-closes uncontested duplicates after 3 days. The shell script (`comment-on-duplicates.sh`) is well-structured with proper validation. However, the TypeScript auto-close script has bugs that need fixing before merge: - **Label replacement bug**: The `closeIssueAsDuplicate` function uses `labels: ["duplicate"]` in a PATCH request, which will **replace all existing labels** on the issue instead of appending. This could silently strip important labels like `bug`, `security`, or `high-priority`. - **Invalid `state_reason`**: `state_reason: "duplicate"` is not a valid GitHub API value โ€” only `"completed"`, `"not_planned"`, and `"reopened"` are accepted. - **Non-existent env var**: `GITHUB_REPOSITORY_NAME` is not a GitHub Actions environment variable. The fallback to `"openclaw"` masks this bug for now. - **Codex sandbox config**: `inherit = "all"` in `.codex/config.toml` passes all environment variables (including secrets) into the sandbox โ€” should be scoped to specific variables. <h3>Confidence Score: 2/5</h3> - The auto-close script has bugs that will silently strip labels from closed issues and use an invalid API field โ€” recommend fixing before merge. - Score of 2 reflects two confirmed API misuse bugs in the auto-close script: (1) label replacement that will destroy existing labels on every issue it closes, and (2) an invalid state_reason value. The GITHUB_REPOSITORY_NAME bug is latent (masked by the fallback). The shell script and workflows are well-structured. - `scripts/auto-close-duplicates.ts` has the critical bugs โ€” the label replacement and invalid state_reason. `.codex/config.toml` has an overly permissive environment variable policy. <sub>Last reviewed commit: 96bd492</sub> <!-- greptile_other_comments_section --> <sub>(1/5) You can manually trigger the agent by mentioning @greptileai in a comment!</sub> <!-- /greptile_comment -->

Most Similar PRs