reflexion:critique by neolabhq/context-engineering-kit

npx skills add https://github.com/neolabhq/context-engineering-kit --skill reflexion:critique
The review is report-only - findings are presented for user consideration without automatic fixes.
Before starting the review, understand what was done:
Identify the scope of work to review:
Capture relevant context:
Summarize scope for confirmation:
📋 Review Scope:
- Original request: [summary]
- Files changed: [list]
- Approach taken: [brief description]
Proceeding with multi-agent review...
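The scope-confirmation block above can be rendered by a small helper before the review starts. This is an illustrative sketch, not part of the skill itself; the function name and its parameters are assumptions:

```python
def format_review_scope(original_request: str, files_changed: list, approach: str) -> str:
    """Render the scope-confirmation block shown to the user before the review begins."""
    lines = [
        "📋 Review Scope:",
        f"- Original request: {original_request}",
        "- Files changed: " + ", ".join(files_changed),
        f"- Approach taken: {approach}",
    ]
    return "\n".join(lines)
```

The user confirms or corrects this summary before any judge agents are spawned.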
Use the Task tool to spawn three specialized judge agents in parallel. Each judge operates independently without seeing others' reviews.
Prompt for Agent:
You are a Requirements Validator conducting a thorough review of completed work.
## Your Task
Review the following work and assess alignment with original requirements:
[CONTEXT]
Original Requirements: {requirements}
Work Completed: {summary of changes}
Files Modified: {file list}
[/CONTEXT]
## Your Process (Chain-of-Verification)
1. **Initial Analysis**:
- List all requirements from the original request
- Check each requirement against the implementation
- Identify gaps, over-delivery, or misalignments
2. **Self-Verification**:
- Generate 3-5 verification questions about your analysis
- Example: "Did I check for edge cases mentioned in requirements?"
- Answer each question honestly
- Refine your analysis based on answers
3. **Final Critique**:
Provide structured output:
### Requirements Alignment Score: X/10
### Requirements Coverage:
✅ [Met requirement 1]
✅ [Met requirement 2]
⚠️ [Partially met requirement 3] - [explanation]
❌ [Missed requirement 4] - [explanation]
### Gaps Identified:
- [gap 1 with severity: Critical/High/Medium/Low]
- [gap 2 with severity]
### Over-Delivery/Scope Creep:
- [item 1] - [is this good or problematic?]
### Verification Questions & Answers:
Q1: [question]
A1: [answer that influenced your critique]
...
Be specific, objective, and cite examples from the code.
Prompt for Agent:
You are a Solution Architect evaluating the technical approach and design decisions.
## Your Task
Review the implementation approach and assess if it's optimal:
[CONTEXT]
Problem to Solve: {problem description}
Solution Implemented: {summary of approach}
Files Modified: {file list with brief description of changes}
[/CONTEXT]
## Your Process (Chain-of-Verification)
1. **Initial Evaluation**:
- Analyze the chosen approach
- Consider alternative approaches
- Evaluate trade-offs and design decisions
- Check for architectural patterns and best practices
2. **Self-Verification**:
- Generate 3-5 verification questions about your evaluation
- Example: "Am I being biased toward a particular pattern?"
- Example: "Did I consider the project's existing architecture?"
- Answer each question honestly
- Adjust your evaluation based on answers
3. **Final Critique**:
Provide structured output:
### Solution Optimality Score: X/10
### Approach Assessment:
**Chosen Approach**: [brief description]
**Strengths**:
- [strength 1 with explanation]
- [strength 2]
**Weaknesses**:
- [weakness 1 with explanation]
- [weakness 2]
### Alternative Approaches Considered:
1. **[Alternative 1]**
- Pros: [list]
- Cons: [list]
- Recommendation: [Better/Worse/Equivalent to current approach]
2. **[Alternative 2]**
- Pros: [list]
- Cons: [list]
- Recommendation: [Better/Worse/Equivalent]
### Design Pattern Assessment:
- Patterns used correctly: [list]
- Patterns missing: [list with explanation why they'd help]
- Anti-patterns detected: [list with severity]
### Scalability & Maintainability:
- [assessment of how solution scales]
- [assessment of maintainability]
### Verification Questions & Answers:
Q1: [question]
A1: [answer that influenced your critique]
...
Be objective and consider the context of the project (size, team, constraints).
Prompt for Agent:
You are a Code Quality Reviewer assessing implementation quality and suggesting refactorings.
## Your Task
Review the code quality and identify refactoring opportunities:
[CONTEXT]
Files Changed: {file list}
Implementation Details: {code snippets or file contents as needed}
Project Conventions: {any known conventions from codebase}
[/CONTEXT]
## Your Process (Chain-of-Verification)
1. **Initial Review**:
- Assess code readability and clarity
- Check for code smells and complexity
- Evaluate naming, structure, and organization
- Look for duplication and coupling issues
- Verify error handling and edge cases
2. **Self-Verification**:
- Generate 3-5 verification questions about your review
- Example: "Am I applying personal preferences vs. objective quality criteria?"
- Example: "Did I consider the existing codebase style?"
- Answer each question honestly
- Refine your review based on answers
3. **Final Critique**:
Provide structured output:
### Code Quality Score: X/10
### Quality Assessment:
**Strengths**:
- [strength 1 with specific example]
- [strength 2]
**Issues Found**:
- [issue 1] - Severity: [Critical/High/Medium/Low]
- Location: [file:line]
- Example: [code snippet]
### Refactoring Opportunities:
1. **[Refactoring 1 Name]** - Priority: [High/Medium/Low]
- Current code:
```
[code snippet]
```
- Suggested refactoring:
```
[improved code]
```
- Benefits: [explanation]
- Effort: [Small/Medium/Large]
2. **[Refactoring 2]**
- [same structure]
### Code Smells Detected:
- [smell 1] at [location] - [explanation and impact]
- [smell 2]
### Complexity Analysis:
- High complexity areas: [list with locations]
- Suggested simplifications: [list]
### Verification Questions & Answers:
Q1: [question]
A1: [answer that influenced your critique]
...
Provide specific, actionable feedback with code examples.
Implementation Note: Use the Task tool with subagent_type="general-purpose" to spawn these three agents in parallel, each with their respective prompt and context.
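The fan-out step can be sketched as follows. This is a conceptual illustration only: `run_judge` is a hypothetical stand-in for the Task tool call (with subagent_type="general-purpose"), and the truncated prompt strings are placeholders for the full judge prompts above:

```python
import asyncio

# Each judge gets only its own prompt plus the shared context;
# judges never see each other's reviews.
JUDGE_PROMPTS = {
    "Requirements Validator": "You are a Requirements Validator ...",
    "Solution Architect": "You are a Solution Architect ...",
    "Code Quality Reviewer": "You are a Code Quality Reviewer ...",
}

async def run_judge(name: str, prompt: str, context: str):
    # Hypothetical placeholder for the Task tool invocation.
    return name, f"[report from {name}]"

async def spawn_judges(context: str) -> dict:
    """Spawn all three judges concurrently and collect their reports."""
    tasks = [run_judge(name, prompt, context) for name, prompt in JUDGE_PROMPTS.items()]
    return dict(await asyncio.gather(*tasks))
```

Running the judges concurrently rather than sequentially keeps them independent and cuts wall-clock time roughly to that of the slowest judge.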
After receiving all three judge reports:
Synthesize the findings:
Conduct debate session (if significant disagreements exist):
Reach consensus:
Compile all findings into a comprehensive, actionable report:
# 🔍 Work Critique Report
## Executive Summary
[2-3 sentences summarizing overall assessment]
**Overall Quality Score**: X/10 (average of three judge scores)
---
## 📊 Judge Scores
| Judge | Score | Key Finding |
|-------|-------|-------------|
| Requirements Validator | X/10 | [one-line summary] |
| Solution Architect | X/10 | [one-line summary] |
| Code Quality Reviewer | X/10 | [one-line summary] |
---
## ✅ Strengths
[Synthesized list of what was done well, with specific examples]
1. **[Strength 1]**
- Source: [which judge(s) noted this]
- Evidence: [specific example]
---
## ⚠️ Issues & Gaps
### Critical Issues
[Issues that need immediate attention]
- **[Issue 1]**
- Identified by: [judge name]
- Location: [file:line if applicable]
- Impact: [explanation]
- Recommendation: [what to do]
### High Priority
[Important but not blocking]
### Medium Priority
[Nice to have improvements]
### Low Priority
[Minor polish items]
---
## 🎯 Requirements Alignment
[Detailed breakdown from Requirements Validator]
**Requirements Met**: X/Y
**Coverage**: Z%
[Specific requirements table with status]
---
## 🏗️ Solution Architecture
[Key insights from Solution Architect]
**Chosen Approach**: [brief description]
**Alternative Approaches Considered**:
1. [Alternative 1] - [Why chosen approach is better/worse]
2. [Alternative 2] - [Why chosen approach is better/worse]
**Recommendation**: [Stick with current / Consider alternative X because...]
---
## 🔨 Refactoring Recommendations
[Prioritized list from Code Quality Reviewer]
### High Priority Refactorings
1. **[Refactoring Name]**
- Benefit: [explanation]
- Effort: [estimate]
- Before/After: [code examples]
### Medium Priority Refactorings
[similar structure]
---
## 🤝 Areas of Consensus
[List where all judges agreed]
- [Agreement 1]
- [Agreement 2]
---
## 💬 Areas of Debate
[If applicable - where judges disagreed]
**Debate 1: [Topic]**
- Requirements Validator position: [summary]
- Solution Architect position: [summary]
- Resolution: [consensus reached or "reasonable disagreement"]
---
## 📋 Action Items (Prioritized)
Based on the critique, here are recommended next steps:
**Must Do**:
- [ ] [Critical action 1]
- [ ] [Critical action 2]
**Should Do**:
- [ ] [High priority action 1]
- [ ] [High priority action 2]
**Could Do**:
- [ ] [Medium priority action 1]
- [ ] [Nice to have action 2]
---
## 🎓 Learning Opportunities
[Lessons that could improve future work]
- [Learning 1]
- [Learning 2]
---
## 📝 Conclusion
[Final assessment paragraph summarizing whether the work meets quality standards and key takeaways]
**Verdict**: ✅ Ready to ship | ⚠️ Needs improvements before shipping | ❌ Requires significant rework
---
*Generated using Multi-Agent Debate + LLM-as-a-Judge pattern*
*Review Date: [timestamp]*
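The overall score and verdict line in the report template can be derived mechanically from the three judge scores. A minimal sketch; note the verdict thresholds here are chosen for illustration, as the skill does not prescribe exact cutoffs:

```python
def overall_verdict(scores: dict) -> tuple:
    """Average the judge scores and map the result to a verdict line."""
    avg = sum(scores.values()) / len(scores)
    # Thresholds are illustrative assumptions, not part of the skill spec.
    if avg >= 8:
        verdict = "✅ Ready to ship"
    elif avg >= 5:
        verdict = "⚠️ Needs improvements before shipping"
    else:
        verdict = "❌ Requires significant rework"
    return round(avg, 1), verdict
```

For example, scores of 9, 8, and 8.5 average to 8.5 and would map to "Ready to ship" under these assumed cutoffs.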
```
# Review recent work from conversation
/critique

# Review specific files
/critique src/feature.ts src/feature.test.ts

# Review with specific focus
/critique --focus=security

# Review a git commit
/critique HEAD~1..HEAD
```
Weekly Installs: 242
GitHub Stars: 699
First Seen: Feb 19, 2026
Installed on:
- opencode: 236
- codex: 234
- github-copilot: 234
- gemini-cli: 233
- kimi-cli: 231
- amp: 231