npx skills add https://github.com/boshu2/agentops --skill council
Spawn parallel judges with different perspectives, consolidate into consensus. Works for any task — validation, research, brainstorming.
/council --quick validate recent # fast inline check
/council validate this plan # validation (2 agents)
/council brainstorm caching approaches # brainstorm
/council validate the implementation # validation (critique triggers map here)
/council research kubernetes upgrade strategies # research
/council research the CI/CD pipeline bottlenecks # research (analyze triggers map here)
/council --preset=security-audit validate the auth system # preset personas
/council --deep --explorers=3 research upgrade automation # deep + explorers
/council --debate validate the auth system # adversarial 2-round review
/council --deep --debate validate the migration plan # thorough + debate
/council # infers from context
Council works independently — no RPI workflow, no ratchet chain, no ao CLI required. Zero setup beyond initial install.
| Mode | Agents | Execution Backend | Use Case |
|---|---|---|---|
| `--quick` | 0 (inline) | Self | Fast single-agent check, no spawning |
| default | 2 | Runtime-native (Codex sub-agents preferred; Claude teams fallback) | Independent judges (no perspective labels) |
| `--deep` | 3 | Runtime-native | Thorough review |
| `--mixed` | 3+3 | Runtime-native + Codex CLI | Cross-vendor consensus |
| `--debate` | 2+ | Runtime-native | Adversarial refinement (2 rounds) |
/council --quick validate recent # inline single-agent check, no spawning
/council recent # 2 runtime-native judges
/council --deep recent # 3 runtime-native judges
/council --mixed recent # runtime-native + Codex CLI
Council requires a runtime that can spawn parallel subagents and (for --debate) send messages between agents. Use whatever multi-agent primitives your runtime provides. If no multi-agent capability is detected, fall back to --quick (inline single-agent).
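Required capabilities:
- Spawn parallel subagents (required for all modes except `--quick`)
- Send messages between agents (required for `--debate`)

Skills describe WHAT to do, not WHICH tool to call. See `skills/shared/SKILL.md` for the capability contract.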
After detecting your backend, read the matching reference for concrete spawn/wait/message/cleanup examples:
- `../shared/references/claude-code-latest-features.md`
- `../shared/references/backend-claude-teams.md`
- `../shared/references/backend-codex-subagents.md`
- `../shared/references/backend-background-tasks.md`
- Inline (`--quick`) → `../shared/references/backend-inline.md`

See also `references/cli-spawning.md` for council-specific spawning flow (phases, timeouts, output collection).
Use `--debate` for high-stakes or ambiguous reviews where judges are likely to disagree.
Skip --debate for routine validation where consensus is expected. Debate adds R2 latency (judges stay alive and process a second round via backend messaging).
Incompatibilities:
- `--quick` and `--debate` cannot be combined. `--quick` runs inline with no spawning; `--debate` requires multi-agent rounds. If both are passed, exit with error: "Error: --quick and --debate are incompatible."
- `--debate` is only supported with validate mode. Brainstorm and research do not produce PASS/WARN/FAIL verdicts. If combined, exit with error: "Error: --debate is only supported with validate mode."

| Type | Trigger Words | Perspective Focus |
|---|---|---|
| validate | validate, check, review, assess, critique, feedback, improve | Is this correct? What's wrong? What could be better? |
| brainstorm | brainstorm, explore, options, approaches | What are the alternatives? Pros/cons? |
| research | research, investigate, deep dive, explore deeply, analyze, examine, evaluate, compare | What can we discover? What are the properties, trade-offs, and structure? |
Natural language works — the skill infers task type from your prompt.
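As a rough illustration of the trigger-word table and the two `--debate` incompatibility rules, the following Python sketch infers a task type and rejects invalid flag combinations. The function names, the "earliest match wins, longer phrase breaks ties" rule, and the `validate` default are assumptions made for this example; the skill itself infers the type from natural language and may behave differently.

```python
import re

# Trigger words copied from the table above.
TRIGGERS = {
    "validate": ["validate", "check", "review", "assess", "critique", "feedback", "improve"],
    "brainstorm": ["brainstorm", "explore", "options", "approaches"],
    "research": ["research", "investigate", "deep dive", "explore deeply",
                 "analyze", "examine", "evaluate", "compare"],
}


def infer_task_type(prompt: str, default: str = "validate") -> str:
    """Pick the task type whose trigger appears earliest (assumed rule).

    On a tie in position the longer phrase wins, so "explore deeply"
    (research) beats "explore" (brainstorm).
    """
    text = prompt.lower()
    best = None  # (position, -trigger_length, task_type)
    for task_type, words in TRIGGERS.items():
        for word in words:
            match = re.search(r"\b" + re.escape(word) + r"\b", text)
            if match is None:
                continue
            candidate = (match.start(), -len(word), task_type)
            if best is None or candidate < best:
                best = candidate
    return best[2] if best else default


def check_flags(mode: str, quick: bool = False, debate: bool = False) -> None:
    """Raise with the documented error messages for invalid combinations."""
    if quick and debate:
        raise ValueError("Error: --quick and --debate are incompatible.")
    if debate and mode != "validate":
        raise ValueError("Error: --debate is only supported with validate mode.")
```

Calling `check_flags(infer_task_type(prompt), quick, debate)` before spawning anything keeps the failure cheap: no judge is started for a request that can never produce a verdict.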
When mode is validate and the target is a plan/spec/contract (or contains boundary rules, state transitions, or conformance tables), judges must apply this gate before returning PASS:
Verdict policy for this gate:
- … `WARN`.
- … `WARN`.
- … `FAIL`.

Judges write ALL analysis to output files. Messages to the lead contain ONLY a minimal completion signal: `{"type":"verdict","verdict":"...","confidence":"...","file":"..."}`. The lead reads output files during consolidation. This prevents N judges from exploding the lead's context window with N full reports via SendMessage.
Consolidation runs inline as the lead — no separate chairman agent. The lead reads each judge's output file sequentially with the Read tool and synthesizes.
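The file-based handoff can be sketched in Python as follows. This is a minimal illustration, not the skill's implementation: `parse_signal` and `collect_reports` are hypothetical helper names, and only the four signal keys shown above are taken from the source.

```python
import json
from pathlib import Path

# Keys of the minimal completion signal described above.
REQUIRED_KEYS = {"type", "verdict", "confidence", "file"}


def parse_signal(raw: str) -> dict:
    """Validate one completion signal sent by a judge."""
    signal = json.loads(raw)
    missing = REQUIRED_KEYS - signal.keys()
    if missing:
        raise ValueError(f"completion signal missing keys: {sorted(missing)}")
    if signal["type"] != "verdict":
        raise ValueError(f"unexpected signal type: {signal['type']!r}")
    return signal


def collect_reports(signals: list[str]) -> list[dict]:
    """Read each judge's output file sequentially, as the lead does."""
    reports = []
    for raw in signals:
        signal = parse_signal(raw)
        analysis = Path(signal["file"]).read_text(encoding="utf-8")
        reports.append({
            "verdict": signal["verdict"],
            "confidence": signal["confidence"],
            "analysis": analysis,
        })
    return reports
```

The signal stays a few dozen bytes regardless of how long the analysis is, so the lead decides when each full report enters its context.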
┌─────────────────────────────────────────────────────────────────┐
│ Phase 1: Build Packet (JSON)                                    │
│   - Task type (validate/brainstorm/research)                    │
│   - Target description                                          │
│   - Context (files, diffs, prior decisions)                     │
│   - Perspectives to assign                                      │
└─────────────────────────────────────────────────────────────────┘
                                 │
                                 ▼
┌─────────────────────────────────────────────────────────────────┐
│ Phase 1a: Select spawn backend                                  │
│   codex_subagents | claude_teams | background_fallback          │
│   Team lead = spawner (this agent)                              │
└─────────────────────────────────────────────────────────────────┘
                                 │
               ┌─────────────────┴─────────────────┐
               ▼                                   ▼
   ┌───────────────────────┐           ┌───────────────────────┐
   │  RUNTIME-NATIVE JUDGES│           │  CODEX AGENTS         │
   │ (spawn_agent or teams)│           │  (Bash tool, parallel)│
   │                       │           │  Agent 1 (independent │
   │  Agent 1 (independent │           │   or with preset)     │
   │   or with preset)     │           │  Agent 2              │
   │  Agent 2              │           │  Agent 3              │
   │  Agent 3 (--deep only)│           │  (--mixed only)       │
   │  (--deep/--mixed only)│           │                       │
   │                       │           │  Output: JSON + MD    │
   │  Write files, then    │           │  Files: .agents/      │
   │  wait()/SendMessage to│           │    council/codex-*    │
   │  lead                 │           │                       │
   │  Files: .agents/      │           └───────────────────────┘
   │    council/claude-*   │                       │
   └───────────────────────┘                       │
               │                                   │
               └─────────────────┬─────────────────┘
                                 ▼
┌─────────────────────────────────────────────────────────────────┐
│ Phase 2: Consolidation (Team Lead, inline, no extra agent)      │
│   - Receive MINIMAL completion signals (verdict + file path)    │
│   - Read each judge's output file with Read tool                │
│   - If schema_version is missing from a judge's output, treat   │
│     as version 0 (backward compatibility)                       │
│   - Compute consensus verdict                                   │
│   - Identify shared findings                                    │
│   - Surface disagreements with attribution                      │
│   - Generate Markdown report for human                          │
└─────────────────────────────────────────────────────────────────┘
                                 │
                                 ▼
┌─────────────────────────────────────────────────────────────────┐
│ Phase 3: Cleanup                                                │
│   - Cleanup backend resources (close_agent / TeamDelete / none) │
└─────────────────────────────────────────────────────────────────┘
                                 │
                                 ▼
┌─────────────────────────────────────────────────────────────────┐
│ Output: Markdown Council Report                                 │
│   - Consensus: PASS/WARN/FAIL                                   │
│   - Shared findings                                             │
│   - Disagreements (if any)                                      │
│   - Recommendations                                             │
└─────────────────────────────────────────────────────────────────┘
| Failure | Behavior |
|---|---|
| 1 of N agents times out | Proceed with N-1, note in report |
| All Codex CLI agents fail | Proceed with runtime-native judges only, note degradation |
| All agents fail | Return error, suggest retry |
| Codex CLI not installed | Skip Codex CLI judges, continue with runtime judges only (warn user) |
| No multi-agent capability | Fall back to --quick (inline single-agent review) |
| No agent messaging | --debate unavailable, single-round review only |
| Output dir missing | Create .agents/council/ automatically |
Timeout: 120s per agent (configurable via --timeout=N in seconds).
Minimum quorum: At least 1 agent must respond for a valid council. If 0 agents respond, return error.
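The timeout and quorum rules can be illustrated with a small `asyncio` sketch. It assumes each judge is modeled as a coroutine function that returns its verdict; `run_council` and that modeling are hypothetical, since the real spawning goes through whatever backend the runtime provides. Only timeouts are handled here; other failure modes from the table are out of scope.

```python
import asyncio
from typing import Awaitable, Callable

Judge = Callable[[], Awaitable[str]]


async def run_council(judges: dict[str, Judge], timeout: float = 120.0):
    """Run all judges in parallel; drop those that exceed the timeout."""

    async def guarded(name: str, judge: Judge):
        try:
            return name, await asyncio.wait_for(judge(), timeout)
        except asyncio.TimeoutError:
            return name, None  # proceed with N-1 and note it in the report

    results = await asyncio.gather(*(guarded(n, j) for n, j in judges.items()))
    responded = {name: verdict for name, verdict in results if verdict is not None}
    timed_out = [name for name, verdict in results if verdict is None]
    if not responded:
        # Minimum quorum is 1 agent: with 0 responses there is no valid council.
        raise RuntimeError("All agents failed: return error, suggest retry")
    return responded, timed_out
```

The caller gets both the usable verdicts and the list of judges to mention as missing in the report.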
- No multi-agent capability: fall back to `--quick`.
- No agent messaging: disable `--debate`.
- Codex CLI (`--mixed`): check `which codex`, test model availability, test `--output-schema` support. Downgrade mixed mode when unavailable.
- Agent count: `judges * (1 + explorers) <= MAX_AGENTS (12)`
- Output directory: `mkdir -p .agents/council`

Quick mode (`--quick`) is single-agent inline validation. No subprocess spawning, no Task tool, no Codex. The current agent performs a structured self-review using the same output schema as a full council.
When to use: Routine checks, mid-implementation sanity checks, pre-commit quick scan.
Execution: Gather context (files, diffs) -> perform structured self-review inline using the council output_schema (verdict, confidence, findings, recommendation) -> write report to .agents/council/YYYY-MM-DD-quick-<target>.md labeled as Mode: quick (single-agent).
Limitations: No cross-perspective disagreement, no cross-vendor insights, lower confidence ceiling. Not suitable for security audits or architecture decisions.
Each agent receives the packet below. File contents are included inline: agents receive the actual code/plan text in the packet, not just paths, so both Claude and Codex agents can analyze the target without file access.
If .agents/ao/environment.json exists, include it in the context packet so judges can reason about available tools and environment state.
Judge prompt boundary:
- Do NOT include `.agents/` references in judge prompts.
- Do NOT instruct judges to search `.agents/` directories. Judges operate on the council packet only.
{
  "council_packet": {
    "version": "1.0",
    "mode": "validate | brainstorm | research",
    "target": "Implementation of user authentication system",
    "context": {
      "files": [
        {
          "path": "src/auth/jwt.py",
          "content": "<file contents inlined here>"
        },
        {
          "path": "src/auth/middleware.py",
          "content": "<file contents inlined here>"
        }
      ],
      "diff": "git diff output if applicable",
      "spec": {
        "source": "bead na-0042 | plan doc | none",
        "content": "The spec/bead description text (optional; included when wrapper provides it)"
      },
      "prior_decisions": [
        "Using JWT, not sessions",
        "Refresh tokens required"
      ],
      "empirical_results": "(optional) test output, CLI flag verification, or Wave 0 findings; include when evaluating feasibility"
    },
    "perspective": "skeptic (only when --preset or --perspectives used)",
    "perspective_description": "What could go wrong? (only when --preset or --perspectives used)",
    "output_schema": {
      "verdict": "PASS | WARN | FAIL",
      "confidence": "HIGH | MEDIUM | LOW",
      "key_insight": "Single sentence summary",
      "findings": [
        {
          "severity": "critical | significant | minor",
          "category": "security | architecture | performance | style",
          "description": "What was found",
          "location": "file:line if applicable",
          "recommendation": "How to address",
          "fix": "Specific action to resolve this finding",
          "why": "Root cause or rationale",
          "ref": "File path, spec anchor, or doc reference"
        }
      ],
      "recommendation": "Concrete next step",
      "schema_version": 2
    }
  }
}
When evaluating implementation feasibility (e.g., "will this CLI flag work?", "can these tools coexist?"), always include empirical test results in context.empirical_results. Judges reasoning from assumptions produce false verdicts — a Codex judge once gave a false FAIL on -s read-only because Wave 0 test output was not in the packet. The rule: run the experiment first, then let judges evaluate the evidence.
Wrapper skills (/vibe, /pre-mortem) should include relevant test output when the council target involves tooling behavior, flag combinations, or runtime compatibility.
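To make the "contents inline, not paths" rule concrete, here is a hedged Python sketch that assembles a packet from files on disk. `build_packet` is a hypothetical helper. It fills only a subset of the fields shown in the packet above and omits `output_schema` for brevity.

```python
import json
from pathlib import Path


def build_packet(mode, target, paths, prior_decisions=None, empirical_results=None):
    """Assemble a council packet with file contents inlined."""
    context = {
        "files": [
            # Judges get the text itself, so they need no file access.
            {"path": str(p), "content": Path(p).read_text(encoding="utf-8")}
            for p in paths
        ]
    }
    if prior_decisions:
        context["prior_decisions"] = list(prior_decisions)
    if empirical_results:
        # Run the experiment first, then hand judges the evidence.
        context["empirical_results"] = empirical_results
    return {
        "council_packet": {
            "version": "1.0",
            "mode": mode,
            "target": target,
            "context": context,
        }
    }
```

Serializing the result with `json.dumps(packet)` yields the same text for every judge, whichever backend receives it.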
Perspectives & Presets: Use the `Read` tool on `skills/council/references/personas.md` for persona definitions, preset configurations, and custom perspective details.
Auto-Escalation: When --preset or --perspectives specifies more perspectives than the current judge count, automatically escalate judge count to match. The --count flag overrides auto-escalation.
Named perspectives assign each judge a specific viewpoint. Pass --perspectives="a,b,c" for free-form names, or --perspectives-file=<path> for YAML with focus descriptions:
/council --perspectives="security-auditor,performance-critic,simplicity-advocate" validate src/auth/
/council --perspectives-file=.agents/perspectives/api-review.yaml validate src/api/
YAML format for --perspectives-file:
perspectives:
  - name: security-auditor
    focus: Find security vulnerabilities and trust boundary violations
  - name: performance-critic
    focus: Identify performance bottlenecks and scaling risks
Flag priority: --perspectives/--perspectives-file override --preset perspectives. --count always overrides judge count. Without --count, judge count auto-escalates to match perspective count.
See references/personas.md for all built-in presets and their perspective definitions.
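The flag priority and auto-escalation rules reduce to a few lines of logic. The sketch below is illustrative only; both function names are hypothetical, and `base_count` stands for the judge count implied by the mode (2 by default, 3 with `--deep`).

```python
def resolve_perspectives(preset_perspectives, custom_perspectives):
    """--perspectives / --perspectives-file override the preset's perspectives."""
    if custom_perspectives:
        return list(custom_perspectives)
    return list(preset_perspectives)


def resolve_judge_count(base_count, perspectives, count_flag=None):
    """--count always wins; otherwise escalate to match the perspective count."""
    if count_flag is not None:
        return count_flag
    return max(base_count, len(perspectives))
```

For instance, three custom perspectives in default mode escalate the council from 2 to 3 judges, while adding `--count=2` keeps it at 2.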
Explorer Details: Use the `Read` tool on `skills/council/references/explorers.md` for explorer architecture, prompts, sub-question generation, and timeout configuration.
Summary: Judges can spawn explorer sub-agents (--explorers=N, max 5) for parallel deep-dive research. Total agents = judges * (1 + explorers), capped at MAX_AGENTS=12.
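The agent budget is simple arithmetic, sketched here in Python. The formula and both limits come from the summary above. The scale-down rule in `fit_judges` (keep the largest judge count that still fits) is an assumption, since the source does not say how the judge count is reduced.

```python
MAX_AGENTS = 12
MAX_EXPLORERS = 5


def total_agents(judges: int, explorers: int) -> int:
    """Each judge counts once, plus one agent per explorer it spawns."""
    return judges * (1 + explorers)


def fit_judges(judges: int, explorers: int) -> int:
    """Largest judge count <= requested that respects MAX_AGENTS (assumed rule)."""
    if not 0 <= explorers <= MAX_EXPLORERS:
        raise ValueError(f"--explorers must be between 0 and {MAX_EXPLORERS}")
    return min(judges, MAX_AGENTS // (1 + explorers))
```

With `--deep --explorers=3` the budget is exactly met: 3 judges x 4 = 12. With `--mixed --explorers=2` the six requested judges would need 18 agents, so under this sketch the council shrinks to 4 judges.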
Debate Protocol (`--debate`): Use the `Read` tool on `skills/council/references/debate-protocol.md` for full debate execution flow, R1-to-R2 verdict injection, timeout handling, and cost analysis.
Summary: Two-round adversarial review. R1 produces independent verdicts. R2 sends other judges' verdicts via backend messaging (send_input or SendMessage) for steel-manning and revision. Only supported with validate mode.
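The R1-to-R2 exchange amounts to giving every judge the verdicts of all the others. A minimal Python sketch, assuming R1 results are held as a dict keyed by judge name; `r2_payloads` and the payload shape are hypothetical, and the actual message template lives in the debate protocol reference.

```python
def r2_payloads(r1_verdicts: dict[str, dict]) -> dict[str, list[dict]]:
    """For each judge, collect the R1 verdicts of every other judge."""
    return {
        name: [
            {"judge": other, **verdict}
            for other, verdict in r1_verdicts.items()
            if other != name
        ]
        for name in r1_verdicts
    }
```

Each payload would then be delivered through the backend's messaging primitive, so the judge can steel-man the opposing positions before revising its own verdict.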
Agent Prompts: Use the `Read` tool on `skills/council/references/agent-prompts.md` for judge prompts (default and perspective-based), consolidation prompt, and debate R2 message template.
| Condition | Verdict |
|---|---|
| All PASS | PASS |
| Any FAIL | FAIL |
| Mixed PASS/WARN | WARN |
| All WARN | WARN |
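The consensus table maps directly onto a worst-verdict-wins rule. A minimal Python sketch, with `consensus` as a hypothetical function name:

```python
VALID_VERDICTS = {"PASS", "WARN", "FAIL"}


def consensus(verdicts: list[str]) -> str:
    """Combine judge verdicts: any FAIL fails, any WARN warns, else PASS."""
    if not verdicts:
        raise ValueError("at least 1 agent must respond for a valid council")
    unknown = set(verdicts) - VALID_VERDICTS
    if unknown:
        raise ValueError(f"unknown verdicts: {sorted(unknown)}")
    if "FAIL" in verdicts:
        return "FAIL"
    if "WARN" in verdicts:
        return "WARN"
    return "PASS"
```

Because the rule is order-independent, a judge that timed out can simply be left out of the list.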
Disagreement handling:
DISAGREE resolution: When vendors disagree, the spawner presents both positions with reasoning and defers to the user. No automatic tie-breaking — cross-vendor disagreement is a signal worth human attention.
Report Templates: Use the `Read` tool on `skills/council/references/output-format.md` for full report templates (validate, brainstorm, research) and debate report additions (verdict shifts, convergence detection).
All reports write to .agents/council/YYYY-MM-DD-<type>-<target>.md.
Minimum quorum: 1 agent. Recommended: 80% of judges. On timeout, proceed with the remaining judges and note it in the report. On user cancellation, shut down all judges and generate a partial report with an INCOMPLETE marker.
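| Variable | Default | Description |
|---|---|---|
| `COUNCIL_TIMEOUT` | 120 | Agent timeout in seconds |
| `COUNCIL_CODEX_MODEL` | gpt-5.3-codex | Override Codex model for `--mixed`. Set explicitly to pin Codex judge behavior; omit to use user's configured default. |
| `COUNCIL_CLAUDE_MODEL` | sonnet | Claude model for judges (sonnet by default; use opus for high-stakes reviews via `--profile=thorough`) |
| `COUNCIL_EXPLORER_MODEL` | sonnet | Model for explorer sub-agents |
| `COUNCIL_EXPLORER_TIMEOUT` | 60 | Explorer timeout in seconds |
| `COUNCIL_R2_TIMEOUT` | 90 | Maximum wait for R2 debate completion after sending debate messages. Shorter than R1 because judges already have context. |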
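Reading these variables with their documented defaults might look like the sketch below. `load_config` is a hypothetical helper, and letting `--timeout=N` take precedence over `COUNCIL_TIMEOUT` is an assumption based on the flag being described as an override.

```python
import os


def load_config(timeout_flag=None, env=None):
    """Resolve council settings from the environment, with table defaults."""
    env = os.environ if env is None else env
    if timeout_flag is not None:
        timeout = int(timeout_flag)  # assumed: --timeout=N beats the env var
    else:
        timeout = int(env.get("COUNCIL_TIMEOUT", 120))
    return {
        "timeout": timeout,
        "claude_model": env.get("COUNCIL_CLAUDE_MODEL", "sonnet"),
        "explorer_model": env.get("COUNCIL_EXPLORER_MODEL", "sonnet"),
    }
```

Accepting `env` as a parameter makes the precedence easy to check without touching the real process environment.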
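| Flag | Description |
|---|---|
| `--deep` | 3 Claude agents instead of 2 |
| `--mixed` | Add 3 Codex agents |
| `--debate` | Enable adversarial debate round (2 rounds via backend messaging, same agents). Incompatible with `--quick`. |
| `--timeout=N` | Override timeout in seconds (default: 120) |
| `--perspectives="a,b,c"` | Custom perspective names (each name sets the judge's system prompt to adopt that viewpoint) |
| `--perspectives-file=<path>` | Load named perspectives from a YAML file (see the YAML format for `--perspectives-file`) |
| `--preset=<name>` | Built-in persona preset (security-audit, architecture, research, ops, code-review, plan-review, doc-review, retrospective, product, developer-experience) |
| `--count=N` | Override agent count per vendor (e.g., `--count=4` = 4 Claude, or 4+4 with `--mixed`). Subject to the MAX_AGENTS=12 cap. |
| `--explorers=N` | Explorer sub-agents per judge (default: 0, max: 5). The maximum effective value depends on judge count. Total agents capped at 12. |
| `--explorer-model=M` | Override explorer model (default: sonnet) |
| `--technique=<name>` | Brainstorm technique (scamper, six-hats, reverse). Case-insensitive. Only applicable to brainstorm mode; combining it with validate/research is an error. If omitted, brainstorm is unstructured (current behavior). See `references/brainstorm-techniques.md`. |
| `--profile=<name>` | Model quality profile (thorough, balanced, fast). Error if the name is unrecognized. Overridden by the `COUNCIL_CLAUDE_MODEL` env var (highest priority), then by explicit `--count`/`--deep`/`--mixed`. See `references/model-profiles.md`. |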
CLI Spawning: Use the `Read` tool on `skills/council/references/cli-spawning.md` for team setup, Claude/Codex agent spawning, parallel execution, debate R2 commands, cleanup, and model selection.
/council validate recent # 2 judges, recent commits
/council --deep --preset=architecture research the auth system # 3 judges with architecture personas
/council --mixed validate this plan # 3 Claude + 3 Codex
/council --deep --explorers=3 research upgrade patterns # 12 agents (3 judges x 4)
/council --preset=security-audit --deep validate the API # attacker, defender, compliance, web-security
/council --preset=doc-review validate README.md # 4 doc judges with named perspectives
/council brainstorm caching strategies for the API # 2 judges explore options
/council --technique=scamper brainstorm API improvements # structured SCAMPER brainstorm
/council --technique=six-hats brainstorm migration strategy # parallel perspectives brainstorm
/council --profile=thorough validate the security architecture # opus, 3 judges, 120s timeout
/council --profile=fast validate recent # haiku, 2 judges, 60s timeout
/council research Redis vs Memcached for session storage # 2 judges assess trade-offs
/council validate the implementation plan in PLAN.md # structured plan feedback
/council --preset=doc-review validate docs/ARCHITECTURE.md # 4 doc review judges
/council --perspectives="security-auditor,perf-critic" validate src/ # named perspectives
/council --perspectives-file=.agents/perspectives/custom.yaml validate # perspectives from file
User says: /council --quick validate recent
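What happens:
1. The current agent gathers context (files, diffs).
2. It performs a structured self-review inline using the council output schema.
3. It writes the report to `.agents/council/YYYY-MM-DD-quick-<target>.md`, labeled `Mode: quick (single-agent)`.

Result: Fast sanity check for routine validation (no cross-perspective insights or debate).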
User says: /council --debate validate the auth system
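What happens:
1. In R1, the judges produce independent verdicts.
2. In R2, each judge receives the other judges' verdicts via backend messaging for steel-manning and revision.
3. The lead consolidates the results and writes the report to `.agents/council/`.

Result: Two-round review with steel-manning and revision, useful for high-stakes decisions.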
User says: /council --mixed --explorers=2 research Kubernetes upgrade strategies
What happens:
Result: Cross-vendor research with deep exploration, capped at 12 total agents.
| Problem | Cause | Solution |
|---|---|---|
| "Error: --quick and --debate are incompatible" | Both flags passed together | Use --quick for fast inline check OR --debate for multi-round review, not both |
| "Error: --debate is only supported with validate mode" | Debate flag used with brainstorm/research | Remove --debate or switch to validate mode — brainstorming/research have no PASS/FAIL verdicts |
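| Council spawns fewer agents than expected | `--explorers=N` exceeds MAX_AGENTS (12) | Agent auto-scales judge count. Check the report header for the actual judge count. Reduce `--explorers` or use `--count` to set judges manually |
| Codex judges skipped in `--mixed` mode | Codex CLI not on PATH | Install Codex CLI (`brew install codex`). The model uses the user's configured default; no specific model is required. |
| No output files in `.agents/council/` | Permission error or disk full | Check directory permissions with `ls -ld .agents/council/`. Council creates missing directories automatically. |
| Agent times out after 120s | Slow file reads or network issues | Increase the timeout with `--timeout=300`, or check the `COUNCIL_TIMEOUT` env var. Default: 120s. |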
/council replaces the old judge skill. Migration:
| Old | New |
|---|---|
| judge recent | /council validate recent |
| judge 2 opus | /council recent (default) |
| judge 3 opus | /council --deep recent |
The judge skill is deprecated. Use /council.
Council uses whatever multi-agent primitives your runtime provides. Each judge is a parallel subagent that writes output to a file and sends a minimal completion signal to the lead.
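
As a rough sketch of that shape, the example below uses threads as a stand-in for runtime-native subagents. Each worker writes its full output to a file and returns only a small completion signal. The names `run_judge` and `run_council`, the signal fields, and the placeholder report text are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def run_judge(name: str, out_dir: Path) -> dict:
    """Write the judge's full output to a file; return a minimal signal."""
    path = out_dir / f"{name}.md"
    # Placeholder body: a real judge would write its review here.
    path.write_text(f"# {name}\n\n(review goes here)\n", encoding="utf-8")
    return {"judge": name, "status": "done", "output": str(path)}


def run_council(names: list[str], out_dir: Path) -> list[dict]:
    """Run all judges in parallel and collect their completion signals."""
    out_dir.mkdir(parents=True, exist_ok=True)
    with ThreadPoolExecutor(max_workers=max(1, len(names))) as pool:
        # map() preserves input order, so signals line up with names.
        return list(pool.map(lambda n: run_judge(n, out_dir), names))
```

The lead only handles the small signal dictionaries and reads the files afterwards, which keeps its own context small.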
The --debate flag implements the deliberation protocol pattern:
Independent assessment → evidence exchange → position revision → convergence analysis
Council maintains fresh-context isolation (Ralph Wiggum pattern) with one documented exception:
--debate reuses judge context across R1 and R2. This is intentional. Judges persist within a single atomic council invocation — they do NOT persist across separate council calls. The rationale:
Without --debate, council is fully Ralph-compliant: each judge is a fresh spawn, executes once, writes output, and terminates.
If no multi-agent capability is detected, council falls back to --quick (inline single-agent review). If agent messaging is unavailable, --debate degrades to single-round review with a note in the report.
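
The fallback rules can be sketched as a small decision function. This is a minimal illustration: the `Capabilities` type, the `resolve_mode` name, and the exact note wording are hypothetical, and how a runtime actually detects its capabilities is outside this sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Capabilities:
    can_spawn: bool    # runtime can start parallel subagents
    can_message: bool  # runtime can send messages between agents


def resolve_mode(debate: bool, caps: Capabilities) -> tuple[str, int, Optional[str]]:
    """Return (execution, rounds, report_note) after applying fallbacks."""
    if not caps.can_spawn:
        return ("quick", 1, "no multi-agent capability detected; fell back to --quick")
    if debate and not caps.can_message:
        return ("multi-agent", 1,
                "agent messaging unavailable; --debate degraded to single-round review")
    return ("multi-agent", 2 if debate else 1, None)


print(resolve_mode(True, Capabilities(can_spawn=True, can_message=False))[1])  # 1
```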
Convention: council-YYYYMMDD-<target> (e.g., council-20260206-auth-system).
Judge names: judge-{N} for independent judges (e.g., judge-1, judge-2), or judge-{perspective} when using presets/perspectives (e.g., judge-error-paths, judge-feasibility). Use the same logical names across both Codex and Claude backends.
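
A minimal sketch of these naming conventions follows. The helper names and the slug rule (lowercase, non-alphanumeric runs collapsed to hyphens) are assumptions; the document only gives the patterns and examples.

```python
import re
from datetime import date
from typing import Optional


def slugify(text: str) -> str:
    """Assumed slug rule: lowercase, non-alphanumerics become single hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def council_name(target: str, day: date) -> str:
    """Build a name following council-YYYYMMDD-<target>."""
    return f"council-{day:%Y%m%d}-{slugify(target)}"


def judge_names(count: int = 0, perspectives: Optional[list[str]] = None) -> list[str]:
    """judge-{perspective} when perspectives are given, else judge-{N}."""
    if perspectives:
        return [f"judge-{slugify(p)}" for p in perspectives]
    return [f"judge-{n}" for n in range(1, count + 1)]


print(council_name("auth system", date(2026, 2, 6)))  # council-20260206-auth-system
print(judge_names(perspectives=["error-paths", "feasibility"]))
```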
- skills/vibe/SKILL.md: Complexity + council for code validation (uses --preset=code-review when spec found)
- skills/pre-mortem/SKILL.md: Plan validation (uses --preset=plan-review, always 3 judges)
- skills/post-mortem/SKILL.md: Work wrap-up (uses --preset=retrospective, always 3 judges + retro)
- skills/swarm/SKILL.md: Multi-agent orchestration
- skills/standards/SKILL.md: Language-specific coding standards
- skills/research/SKILL.md: Codebase exploration (complementary to council research mode)

| Variable | Default | Description |
|---|---|---|
| COUNCIL_EXPLORER_TIMEOUT | 60 | Explorer timeout in seconds |
| COUNCIL_R2_TIMEOUT | 90 | Maximum wait time in seconds for R2 debate completion after sending debate messages. Shorter than R1 since judges already have context. |
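
One way to picture how these variables might be read is sketched below. The defaults come from this document (60 and 90 above, plus the 120s `COUNCIL_TIMEOUT` default mentioned in troubleshooting). Falling back to the default on a missing, non-numeric, or non-positive value is an assumption, and `read_timeouts` is a hypothetical helper.

```python
import os
from typing import Mapping, Optional

DEFAULT_TIMEOUTS = {
    "COUNCIL_TIMEOUT": 120,          # judge timeout (seconds)
    "COUNCIL_EXPLORER_TIMEOUT": 60,  # explorer timeout (seconds)
    "COUNCIL_R2_TIMEOUT": 90,        # max wait for debate round 2 (seconds)
}


def _parse_seconds(raw: Optional[str], default: int) -> int:
    """Assumed policy: use the default for missing or invalid values."""
    try:
        value = int(raw)
    except (TypeError, ValueError):
        return default
    return value if value > 0 else default


def read_timeouts(env: Optional[Mapping[str, str]] = None) -> dict:
    """Resolve each timeout from the environment, with documented defaults."""
    env = os.environ if env is None else env
    return {name: _parse_seconds(env.get(name), default)
            for name, default in DEFAULT_TIMEOUTS.items()}


print(read_timeouts({"COUNCIL_TIMEOUT": "300"})["COUNCIL_TIMEOUT"])  # 300
```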

| Flag | Description |
|---|---|
| --perspectives-file=<path> | Load named perspectives from a YAML file (see Named Perspectives below) |
| --preset=<name> | Built-in persona preset (security-audit, architecture, research, ops, code-review, plan-review, doc-review, retrospective, product, developer-experience) |
| --count=N | Override agent count per vendor (e.g., --count=4 = 4 Claude, or 4+4 with --mixed). Subject to MAX_AGENTS=12 cap. |
| --explorers=N | Explorer sub-agents per judge (default: 0, max: 5). Max effective value depends on judge count. Total agents capped at 12. |
| --explorer-model=M | Override explorer model (default: sonnet) |
| --technique=<name> | Brainstorm technique (scamper, six-hats, reverse). Case-insensitive. Only applicable to brainstorm mode; error if combined with validate/research. If omitted, unstructured brainstorm (current behavior). See references/brainstorm-techniques.md. |
| --profile=<name> | Model quality profile (thorough, balanced, fast). Error if unrecognized name. Overridden by COUNCIL_CLAUDE_MODEL env var (highest priority), then by explicit --count/--deep/--mixed. See references/model-profiles.md. |
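
The precedence described for --profile can be sketched as follows. The `thorough` and `fast` values come from the command examples in this document (opus, 3 judges, 120s; haiku, 2 judges, 60s). The `balanced` values are placeholders, not taken from the source, and the authoritative settings live in references/model-profiles.md. `resolve_profile` and the rule that --count beats --deep are likewise assumptions.

```python
from typing import Mapping, Optional

PROFILES = {
    "thorough": {"model": "opus", "judges": 3, "timeout": 120},
    "fast": {"model": "haiku", "judges": 2, "timeout": 60},
    # Placeholder values, NOT from the source document.
    "balanced": {"model": "sonnet", "judges": 2, "timeout": 120},
}


def resolve_profile(name: str, env: Mapping[str, str],
                    count: Optional[int] = None, deep: bool = False) -> dict:
    """Apply profile defaults, then the documented overrides."""
    if name not in PROFILES:
        raise ValueError(f"unrecognized profile: {name!r}")
    settings = dict(PROFILES[name])
    # COUNCIL_CLAUDE_MODEL has the highest priority for the model choice.
    override = env.get("COUNCIL_CLAUDE_MODEL")
    if override:
        settings["model"] = override
    # Explicit flags override the profile's judge count.
    if count is not None:
        settings["judges"] = count
    elif deep:
        settings["judges"] = 3  # --deep runs 3 judges
    return settings


print(resolve_profile("fast", {"COUNCIL_CLAUDE_MODEL": "opus"}, deep=True))
# {'model': 'opus', 'judges': 3, 'timeout': 60}
```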

| Problem | Cause | Solution |
|---|---|---|
| Codex judges skipped in --mixed mode | Codex CLI not on PATH | Install Codex CLI (brew install codex). Model uses user's configured default; no specific model is required. |
| No output files in .agents/council/ | Permission error or disk full | Check directory permissions with ls -ld .agents/council/. Council auto-creates missing dirs. |
| Agent timeout after 120s | Slow file reads or network issues | Increase timeout with --timeout=300 or check COUNCIL_TIMEOUT env var. Default: 120s. |
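
The checks in these rows can be gathered into a small diagnostic script. This is a hypothetical helper for local debugging, not something the skill ships. It assumes the Codex CLI binary is named `codex` (as the brew command suggests) and only reports what it finds.

```python
import os
import shutil
from pathlib import Path


def preflight(council_dir: str = ".agents/council") -> dict:
    """Report the conditions behind the troubleshooting rows above."""
    path = Path(council_dir)
    return {
        # --mixed skips Codex judges when the CLI is not on PATH.
        "codex_on_path": shutil.which("codex") is not None,
        # A missing directory is not fatal: council auto-creates it.
        "output_dir_exists": path.is_dir(),
        "output_dir_writable": path.is_dir() and os.access(path, os.W_OK),
        # None means the 120s default applies.
        "timeout_override": os.environ.get("COUNCIL_TIMEOUT"),
    }


if __name__ == "__main__":
    for check, value in preflight().items():
        print(f"{check}: {value}")
```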