babysit-pr: Automated GitHub PR Monitoring and Repair Tool | Continuous Integration and Code Review Assistant

babysit-pr by openai/codex

298 weekly installs

67,100 GitHub Stars

Install command

npx skills add https://github.com/openai/codex --skill babysit-pr

Development Automation, DevOps

PR Babysitter

Objective

Babysit a PR persistently until one of these terminal outcomes occurs:

The PR is merged or closed.
CI is successful, there are no unaddressed review comments surfaced by the watcher, required review approval is not blocking merge, and there are no potential merge conflicts (PR is mergeable / not reporting conflict risk).
A situation requires user help (for example CI infrastructure issues, repeated flaky failures after retry budget is exhausted, permission problems, or ambiguity that cannot be resolved safely).

Do not stop merely because a single snapshot returns idle while checks are still pending.

Inputs

Accept any of the following:

No PR argument: infer the PR from the current branch (--pr auto)
PR number
PR URL

Core Workflow

1. When the user asks to "monitor"/"watch"/"babysit" a PR, start with the watcher's continuous mode (--watch) unless you are intentionally doing a one-shot diagnostic snapshot.
2. Run the watcher script to snapshot PR/CI/review state (or consume each streamed snapshot from --watch).
3. Inspect the actions list in the JSON response.
4. If diagnose_ci_failure is present, inspect failed run logs and classify the failure.
5. If the failure is likely caused by the current branch, patch code locally, commit, and push.
6. If process_review_comment is present, inspect surfaced review items and decide whether to address them.
7. If a review item is actionable and correct, patch code locally, commit, and push.
8. If the failure is likely flaky/unrelated and retry_failed_checks is present, rerun failed jobs with --retry-failed-now.
9. If both actionable review feedback and retry_failed_checks are present, prioritize review feedback first; a new commit will retrigger CI, so avoid rerunning flaky checks on the old SHA unless you intentionally defer the review change.
10. On every loop, verify mergeability / merge-conflict status (for example via gh pr view) in addition to CI and review state.
11. After any push or rerun action, immediately return to step 1 and continue polling on the updated SHA/state.
12. If you had been using --watch before pausing to patch/commit/push, relaunch --watch yourself in the same turn immediately after the push (do not wait for the user to re-invoke the skill).
13. Repeat polling until the PR is green + review-clean + mergeable, stop_pr_closed appears, or a user-help-required blocker is reached.
14. Maintain terminal/session ownership: while babysitting is active, keep consuming watcher output in the same turn; do not leave a detached --watch process running and then end the turn as if monitoring were complete.
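
As a rough illustration of the workflow above, the following sketch maps one snapshot's actions list to an ordered plan. The action names (diagnose_ci_failure, process_review_comment, retry_failed_checks, stop_pr_closed) come from this document; the function name, the step labels it returns, and the failure_kind values are hypothetical and not part of the watcher script.

```python
def plan_steps(actions, failure_kind=None, review_actionable=False):
    """Return an ordered list of step labels for one watcher snapshot.

    actions: the `actions` list from the snapshot JSON.
    failure_kind: "branch", "flaky", or None (hypothetical labels chosen here).
    review_actionable: True if a surfaced review item is correct and actionable.
    """
    actions = set(actions)
    if "stop_pr_closed" in actions:
        return ["report_terminal_state"]

    steps = []
    will_push = False
    if "diagnose_ci_failure" in actions:
        steps.append("inspect_failed_logs")
        if failure_kind == "branch":
            steps.append("patch_commit_push")
            will_push = True
    if "process_review_comment" in actions:
        steps.append("inspect_review_items")
        if review_actionable:
            if not will_push:
                steps.append("patch_commit_push")
            will_push = True
    # A new commit retriggers CI, so flaky checks are not rerun on the old SHA.
    if "retry_failed_checks" in actions and failure_kind == "flaky" and not will_push:
        steps.append("retry_failed_now")
    steps.append("check_mergeability")
    steps.append("relaunch_watch" if will_push else "continue_polling")
    return steps
```

Every non-terminal plan ends in either relaunch_watch or continue_polling, which mirrors the rule that a push or rerun never ends the babysitting task.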

Commands

One-shot snapshot

python3 .codex/skills/babysit-pr/scripts/gh_pr_watch.py --pr auto --once

Continuous watch (JSONL)

python3 .codex/skills/babysit-pr/scripts/gh_pr_watch.py --pr auto --watch

Trigger flaky retry cycle (only when watcher indicates)

python3 .codex/skills/babysit-pr/scripts/gh_pr_watch.py --pr auto --retry-failed-now

Explicit PR target

python3 .codex/skills/babysit-pr/scripts/gh_pr_watch.py --pr <number-or-url> --once
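
If you drive the watcher from a script instead of reading its output by hand, the JSONL stream can be consumed line by line. The sketch below assumes only what this document states: --watch emits JSONL, and each snapshot carries an actions list. The helper names iter_snapshots and watch_command are hypothetical.

```python
import json


def watch_command(pr="auto", mode="--watch"):
    """Build the argv for the watcher script using the flags shown above."""
    return ["python3", ".codex/skills/babysit-pr/scripts/gh_pr_watch.py",
            "--pr", str(pr), mode]


def iter_snapshots(lines):
    """Yield one dict per JSONL line, skipping blank or non-JSON-object lines."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            obj = json.loads(line)
        except json.JSONDecodeError:
            continue  # tolerate stray non-JSON output on the stream
        if isinstance(obj, dict):
            yield obj


# Possible usage (not executed here):
#   import subprocess
#   proc = subprocess.Popen(watch_command(), stdout=subprocess.PIPE, text=True)
#   for snapshot in iter_snapshots(proc.stdout):
#       print(snapshot.get("actions", []))
```

Keeping a single Popen handle and iterating over its stdout is one way to respect the rule about not running concurrent --watch processes for the same PR.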

CI Failure Classification

Use gh commands to inspect failed runs before deciding to rerun.

gh run view <run-id> --json jobs,name,workflowName,conclusion,status,url,headSha
gh run view <run-id> --log-failed

Prefer treating failures as branch-related when logs point to changed code (compile/test/lint/typecheck/snapshots/static analysis in touched areas).

Prefer treating failures as flaky/unrelated when logs show transient infra/external issues (timeouts, runner provisioning failures, registry/network outages, GitHub Actions infra errors).

If classification is ambiguous, perform one manual diagnosis attempt before choosing rerun.

Read .codex/skills/babysit-pr/references/heuristics.md for a concise checklist.
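
A minimal sketch of the classification rule, applied to text such as the output of gh run view --log-failed. The hint strings below are illustrative assumptions, not the contents of heuristics.md, and a real log needs human-level judgment beyond substring matching.

```python
# Hypothetical hint lists; tune them for your own CI logs.
INFRA_HINTS = ("timed out", "failed to provision runner", "connection reset",
               "registry unavailable", "503 service unavailable")
CODE_HINTS = ("compile error", "test failed", "lint error", "type error",
              "snapshot mismatch")


def classify_failure(log_text, touched_paths=()):
    """Return "branch", "flaky", or "ambiguous" for a failed-run log."""
    text = log_text.lower()
    infra = any(hint in text for hint in INFRA_HINTS)
    code = any(hint in text for hint in CODE_HINTS)
    touches = any(path.lower() in text for path in touched_paths)
    if code and (touches or not infra):
        return "branch"      # log points at changed code
    if infra and not code:
        return "flaky"       # transient infra/external issue
    return "ambiguous"       # do one manual diagnosis before rerunning
```

The "ambiguous" result corresponds to the rule above: perform one manual diagnosis attempt before choosing a rerun.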

Review Comment Handling

The watcher surfaces review items from:

PR issue comments
Inline review comments
Review submissions (COMMENT / APPROVED / CHANGES_REQUESTED)

It intentionally surfaces Codex reviewer bot feedback (for example comments/reviews from chatgpt-codex-connector[bot]) in addition to human reviewer feedback. Most unrelated bot noise should still be ignored. For safety, the watcher only auto-surfaces trusted human review authors (for example repo OWNER/MEMBER/COLLABORATOR, plus the authenticated operator) and approved review bots such as Codex. On a fresh watcher state file, existing pending review feedback may be surfaced immediately (not only comments that arrive after monitoring starts). This is intentional so already-open review comments are not missed.

When you agree with a comment and it is actionable:

Patch code locally.
Commit with codex: address PR review feedback (#<n>).
Push to the PR head branch.
Resume watching on the new SHA immediately (do not stop after reporting the push).
If monitoring was running in --watch mode, restart --watch immediately after the push in the same turn; do not wait for the user to ask again.

If you disagree or the comment is non-actionable/already addressed, record it as handled by continuing the watcher loop (the script de-duplicates surfaced items via state after surfacing them). If a code review comment/thread is already marked as resolved in GitHub, treat it as non-actionable and safely ignore it unless new unresolved follow-up feedback appears.
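
The trust filter and de-duplication described above could look roughly like this. The item shape (id, author, association, resolved) is a hypothetical simplification, and the watcher's real implementation may differ; only the trusted associations and the Codex bot login are taken from this document.

```python
TRUSTED_ASSOCIATIONS = {"OWNER", "MEMBER", "COLLABORATOR"}
APPROVED_BOTS = {"chatgpt-codex-connector[bot]"}


def surface_items(items, operator_login, seen_ids):
    """Return new review items from trusted authors and record them as seen.

    items: dicts with hypothetical keys "id", "author", "association", "resolved".
    seen_ids: a set that persists between polls (stands in for the state file).
    """
    surfaced = []
    for item in items:
        login = item["author"]
        trusted = (item.get("association") in TRUSTED_ASSOCIATIONS
                   or login == operator_login
                   or login in APPROVED_BOTS)
        if not trusted or item["id"] in seen_ids or item.get("resolved"):
            continue
        seen_ids.add(item["id"])  # de-duplicate on later polls
        surfaced.append(item)
    return surfaced
```

Starting with an empty seen_ids set reproduces the fresh-state behavior: every pending item from a trusted author is surfaced on the first poll.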

Git Safety Rules

Work only on the PR head branch.
Avoid destructive git commands.
Do not switch branches unless necessary to recover context.
Before editing, check for unrelated uncommitted changes. If present, stop and ask the user.
After each successful fix, commit and git push, then re-run the watcher.
If you interrupted a live --watch session to make the fix, restart --watch immediately after the push in the same turn.
Do not run multiple concurrent --watch processes for the same PR/state file; keep one watcher session active and reuse it until it stops or you intentionally restart it.
A push is not a terminal outcome; continue the monitoring loop unless a strict stop condition is met.

Commit message defaults:

codex: fix CI failure on PR #<n>
codex: address PR review feedback (#<n>)
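
Two of the rules above lend themselves to small helpers: checking for uncommitted changes before editing, and building the default commit messages. This is a sketch; dirty_paths and commit_message are hypothetical names, and the parsing assumes the version 1 format of git status --porcelain (two status characters, a space, then the path).

```python
def dirty_paths(porcelain_output):
    """List paths reported by `git status --porcelain` (v1 format)."""
    return [line[3:] for line in porcelain_output.splitlines() if line.strip()]


def commit_message(kind, pr_number):
    """Build a default commit message; kind is "ci" or "review"."""
    templates = {
        "ci": "codex: fix CI failure on PR #{n}",
        "review": "codex: address PR review feedback (#{n})",
    }
    return templates[kind].format(n=pr_number)


# Possible usage (not executed here):
#   import subprocess
#   out = subprocess.run(["git", "status", "--porcelain"],
#                        capture_output=True, text=True, check=True).stdout
#   if dirty_paths(out):
#       print("Uncommitted changes present; decide whether they are unrelated.")
```

Deciding whether a listed path is unrelated to the fix is left to the operator; when it is, the rule above says to stop and ask the user.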

Monitoring Loop Pattern

Use this loop in a live Codex session:

1. Run --once.
2. Read actions.
3. First check whether the PR is now merged or otherwise closed; if so, report that terminal state and stop polling immediately.
4. Check CI summary, new review items, and mergeability/conflict status.
5. Diagnose CI failures and classify branch-related vs flaky/unrelated.
6. Process actionable review comments before flaky reruns when both are present; if a review fix requires a commit, push it and skip rerunning failed checks on the old SHA.
7. Retry failed checks only when retry_failed_checks is present and you are not about to replace the current SHA with a review/CI fix commit.
8. If you pushed a commit or triggered a rerun, report the action briefly and continue polling (do not stop).
9. After a review-fix push, proactively restart continuous monitoring (--watch) in the same turn unless a strict stop condition has already been reached.
10. If everything is passing, mergeable, not blocked on required review approval, and there are no unaddressed review items, report success and stop.
11. If blocked on a user-help-required issue (infra outage, exhausted flaky retries, unclear reviewer request, permissions), report the blocker and stop.
12. Otherwise sleep according to the polling cadence below and repeat.

When the user explicitly asks to monitor/watch/babysit a PR, prefer --watch so polling continues autonomously in one command. Use repeated --once snapshots only for debugging, local testing, or when the user explicitly asks for a one-shot check. Do not stop to ask the user whether to continue polling; continue autonomously until a strict stop condition is met or the user explicitly interrupts. Do not hand control back to the user after a review-fix push just because a new SHA was created; restarting the watcher and re-entering the poll loop is part of the same babysitting task. If a --watch process is still running and no strict stop condition has been reached, the babysitting task is still in progress; keep streaming/consuming watcher output instead of ending the turn.

Polling Cadence

Use adaptive polling and continue monitoring even after CI turns green:

While CI is not green (pending/running/queued or failing): poll every 1 minute.
After CI turns green: start at every 1 minute, then back off exponentially when there is no change (for example 1m, 2m, 4m, 8m, 16m, 32m), capping at every 1 hour.
Reset the green-state polling interval back to 1 minute whenever anything changes (new commit/SHA, check status changes, new review comments, mergeability changes, review decision changes).
If CI stops being green again (new commit, rerun, or regression): return to 1-minute polling.
If any poll shows the PR is merged or otherwise closed: stop polling immediately and report the terminal state.
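
The cadence above reduces to a small function. This sketch assumes intervals are tracked in seconds and that the caller detects "anything changed" itself; the function name and parameters are hypothetical.

```python
BASE_SECONDS = 60      # 1-minute polling
CAP_SECONDS = 3600     # never slower than once per hour


def next_interval(ci_green, changed, previous=None):
    """Return the number of seconds to sleep before the next poll.

    ci_green: True when all checks for the current SHA have passed.
    changed: True if anything changed since the last poll (SHA, checks,
             review comments, mergeability, review decision).
    previous: the interval used before the last poll, or None on the first poll.
    """
    if not ci_green or changed or previous is None:
        return BASE_SECONDS
    return min(previous * 2, CAP_SECONDS)
```

With CI green and nothing changing, repeated calls give 60, 120, 240, 480, 960, 1920, and then 3600 seconds from there on.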

Stop Conditions (Strict)

Stop only when one of the following is true:

PR merged or closed (stop as soon as a poll/snapshot confirms this).
PR is ready to merge: CI succeeded, no surfaced unaddressed review comments, not blocked on required review approval, and no merge conflict risk.
User intervention is required and Codex cannot safely proceed alone.

Keep polling when:

actions contains only idle but checks are still pending.
CI is still running/queued.
Review state is quiet but CI is not terminal.
CI is green but mergeability is unknown/pending.
CI is green and mergeable, but the PR is still open and you are waiting for possible new review comments or merge-conflict changes per the green-state cadence.
The PR is green but blocked on review approval (REVIEW_REQUIRED / similar); continue polling on the green-state cadence and surface any new review comments without asking for confirmation to keep watching.
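
The strict stop conditions can be expressed as a single predicate over a snapshot. The Snapshot fields and the returned reason strings below are hypothetical; they are not the watcher's actual JSON schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Snapshot:
    pr_state: str             # "OPEN", "MERGED", or "CLOSED"
    ci: str                   # "success", "pending", or "failure"
    mergeable: str            # "MERGEABLE", "CONFLICTING", or "UNKNOWN"
    review_blocked: bool      # True when required review approval blocks merge
    unaddressed_reviews: int  # surfaced review items not yet handled
    needs_user: bool = False  # True when Codex cannot safely proceed alone


def stop_reason(s: Snapshot) -> Optional[str]:
    """Return a stop reason, or None when polling should continue."""
    if s.pr_state in ("MERGED", "CLOSED"):
        return "pr_closed"
    if s.needs_user:
        return "user_help_required"
    if (s.ci == "success" and s.mergeable == "MERGEABLE"
            and not s.review_blocked and s.unaddressed_reviews == 0):
        return "ready_to_merge"
    return None
```

Anything that returns None falls under "keep polling", including green CI with unknown mergeability and green CI blocked on review approval.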

Output Expectations

Provide concise progress updates while monitoring:

During long unchanged monitoring periods, avoid emitting a full update on every poll; summarize only status changes plus occasional heartbeat updates.
Treat push confirmations, intermediate CI snapshots, and review-action updates as progress updates only; do not emit the final summary or end the babysitting session unless a strict stop condition is met.
A user request to "monitor" is not satisfied by a couple of sample polls; remain in the loop until a strict stop condition or an explicit user interruption.
A review-fix commit + push is not a completion event; immediately resume live monitoring (--watch) in the same turn and continue reporting progress updates.
When CI first transitions to all green for the current SHA, emit a one-time celebratory progress update (do not repeat it on every green poll). Preferred style: 🚀 CI is all green! 33/33 passed. Still on watch for review approval.
Do not send the final summary while a watcher terminal is still running unless the watcher has emitted/confirmed a strict stop condition; otherwise continue with progress updates.

Once a strict stop condition is met, provide a final summary that includes:

Final PR SHA
CI status summary
Mergeability / conflict status
Fixes pushed
Flaky retry cycles used
Remaining unresolved failures or review comments
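
One way to keep the celebratory update to a single emission per SHA is to remember the last SHA that was celebrated. The class below is a hypothetical sketch, not part of the watcher; the message text follows the preferred style quoted in this section.

```python
class ProgressReporter:
    """Emit the all-green message at most once per head SHA."""

    def __init__(self):
        self._celebrated_sha = None

    def on_poll(self, sha, passed, total):
        """Return the celebration message on the first all-green poll, else None."""
        all_green = total > 0 and passed == total
        if all_green and sha != self._celebrated_sha:
            self._celebrated_sha = sha
            return (f"🚀 CI is all green! {passed}/{total} passed. "
                    "Still on watch for review approval.")
        return None
```

A new commit changes the SHA, so the message can fire again once the new head goes green.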

References

Heuristics and decision tree: .codex/skills/babysit-pr/references/heuristics.md
GitHub CLI/API details used by the watcher: .codex/skills/babysit-pr/references/github-api-notes.md

Weekly Installs: 298

Repository: openai/codex

GitHub Stars: 67.1K

First Seen: Feb 23, 2026

Security Audits: Gen Agent Trust Hub (Pass), Socket (Pass), Snyk (Warn)

Installed on: opencode (295), codex (163), kimi-cli (162), gemini-cli (162), amp (162), github-copilot (162)
