verify by yonatangross/orchestkit
npx skills add https://github.com/yonatangross/orchestkit --skill verify
Contains Hooks
This skill uses Claude hooks which can execute code automatically in response to events. Review carefully before installing.
Comprehensive verification using parallel specialized agents with nuanced grading (0-10 scale) and improvement suggestions.
/ork:verify authentication flow
/ork:verify --model=opus user profile feature
/ork:verify --scope=backend database migrations
SCOPE = "$ARGUMENTS" # Full argument string, e.g., "authentication flow"
SCOPE_TOKEN = "$ARGUMENTS[0]" # First token for flag detection (e.g., "--scope=backend")
# $ARGUMENTS[0], $ARGUMENTS[1] etc. for indexed access (CC 2.1.59)
# Model override detection (CC 2.1.72)
MODEL_OVERRIDE = None
for token in "$ARGUMENTS".split():
if token.startswith("--model="):
MODEL_OVERRIDE = token.split("=", 1)[1] # "opus", "sonnet", "haiku"
SCOPE = SCOPE.replace(token, "").strip()
Pass MODEL_OVERRIDE to all Agent() calls via model=MODEL_OVERRIDE when set. Accepts symbolic names (opus, sonnet, haiku) or full IDs (claude-opus-4-6) per CC 2.1.74.
Opus 4.6: Agents use native adaptive thinking (no MCP sequential-thinking needed). Extended 128K output supports comprehensive verification reports.
Scale verification depth based on /effort level:
| Effort Level | Phases Run | Agents | Output |
|---|---|---|---|
| low | Run tests only → pass/fail | 0 agents | Quick check |
| medium | Tests + code quality + security | 3 agents | Score + top issues |
| high (default) | All 8 phases + visual capture | 6-7 agents | Full report + grades |
Override: Explicit user selection (e.g., "Full verification") overrides /effort downscaling.
BEFORE creating tasks, clarify verification scope:
AskUserQuestion(
questions=[{
"question": "What scope for this verification?",
"header": "Scope",
"options": [
{"label": "Full verification (Recommended)", "description": "All tests + security + code quality + visual + grades", "markdown": "```\nFull Verification (10 phases)\n─────────────────────────────\n 7 parallel agents:\n ┌────────────┐ ┌────────────┐\n │ Code │ │ Security │\n │ Quality │ │ Auditor │\n ├────────────┤ ├────────────┤\n │ Test │ │ Backend │\n │ Generator │ │ Architect │\n ├────────────┤ ├────────────┤\n │ Frontend │ │ Performance│\n │ Developer │ │ Engineer │\n ├────────────┤ └────────────┘\n │ Visual │\n │ Capture │ → gallery.html\n └────────────┘\n ▼\n Composite Score (0-10)\n 8 dimensions + Grade\n + Visual Gallery\n```"},
{"label": "Tests only", "description": "Run unit + integration + e2e tests", "markdown": "```\nTests Only\n──────────\n npm test ──▶ Results\n ┌─────────────────────┐\n │ Unit tests ✓/✗ │\n │ Integration ✓/✗ │\n │ E2E ✓/✗ │\n │ Coverage NN% │\n └─────────────────────┘\n Skip: security, quality, UI\n Output: Pass/fail + coverage\n```"},
{"label": "Security audit", "description": "Focus on security vulnerabilities", "markdown": "```\nSecurity Audit\n──────────────\n security-auditor agent:\n ┌─────────────────────────┐\n │ OWASP Top 10 ✓/✗ │\n │ Dependency CVEs ✓/✗ │\n │ Secrets scan ✓/✗ │\n │ Auth flow review ✓/✗ │\n │ Input validation ✓/✗ │\n └─────────────────────────┘\n Output: Security score 0-10\n + vulnerability list\n```"},
{"label": "Code quality", "description": "Lint, types, complexity analysis", "markdown": "```\nCode Quality\n────────────\n code-quality-reviewer agent:\n ┌─────────────────────────┐\n │ Lint errors N │\n │ Type coverage NN% │\n │ Cyclomatic complex N.N │\n │ Dead code N │\n │ Pattern violations N │\n └─────────────────────────┘\n Output: Quality score 0-10\n + refactor suggestions\n```"},
{"label": "Quick check", "description": "Just run tests, skip detailed analysis", "markdown": "```\nQuick Check (~1 min)\n────────────────────\n Run tests ──▶ Pass/Fail\n\n Output:\n ├── Test results\n ├── Build status\n └── Lint status\n No agents, no grading,\n no report generation\n```"}
],
"multiSelect": true
}]
)
Based on answer, adjust workflow:
Load details: Read("${CLAUDE_SKILL_DIR}/references/orchestration-mode.md") for env var check logic, Agent Teams vs Task Tool comparison, and mode selection rules.
Choose Agent Teams (mesh -- verifiers share findings) or Task tool (star -- all report to lead) based on the orchestration mode reference.
ToolSearch(query="select:mcp__memory__search_nodes")
Write(".claude/chain/capabilities.json", { memory, timestamp })
Read(".claude/chain/state.json") # resume if exists
After verification completes, write results:
Write(".claude/chain/verify-results.json", JSON.stringify({
"phase": "verify", "skill": "verify",
"timestamp": now(), "status": "completed",
"outputs": {
"tests_passed": N, "tests_failed": N,
"coverage": "87%", "security_scan": "clean"
}
}))
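A runnable sketch of the results write, assuming only the `.claude/chain/` layout shown above; `write_verify_results` is an illustrative helper, not a skill API:

```python
import json
import time
from pathlib import Path

def write_verify_results(tests_passed: int, tests_failed: int,
                         coverage: str, security_scan: str,
                         path: str = ".claude/chain/verify-results.json") -> dict:
    """Concrete version of the Write() call above: serialize the
    verification outcome so downstream chain steps can resume from it."""
    results = {
        "phase": "verify", "skill": "verify",
        "timestamp": time.time(), "status": "completed",
        "outputs": {
            "tests_passed": tests_passed, "tests_failed": tests_failed,
            "coverage": coverage, "security_scan": security_scan,
        },
    }
    out = Path(path)
    out.parent.mkdir(parents=True, exist_ok=True)  # .claude/chain/ may not exist yet
    out.write_text(json.dumps(results, indent=2))
    return results
```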
Optionally schedule post-verification monitoring:
# Guard: Skip cron in headless/CI (CLAUDE_CODE_DISABLE_CRON)
# if env CLAUDE_CODE_DISABLE_CRON is set, run a single check instead
CronCreate(
schedule="0 8 * * *",
prompt="Daily regression check: npm test.
If 7 consecutive passes → CronDelete.
If failures → alert with details."
)
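The headless guard in the comments above can be sketched as follows, with stand-in callables for the CronCreate tool and a one-shot `npm test` run (both names are illustrative):

```python
import os

def schedule_regression_check(cron_create, run_once):
    """Skip recurring cron in headless/CI environments.

    cron_create stands in for the CronCreate tool; run_once stands in
    for a single immediate regression check.
    """
    if os.environ.get("CLAUDE_CODE_DISABLE_CRON"):
        # Headless/CI: no scheduler available, do a single check instead
        return run_once()
    return cron_create(schedule="0 8 * * *",
                       prompt="Daily regression check: npm test.")
```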
# Create main verification task
TaskCreate(
subject="Verify [feature-name] implementation",
description="Comprehensive verification with nuanced grading",
activeForm="Verifying [feature-name] implementation"
)
# Create subtasks for the 8-phase process
phases = ["Run code quality checks", "Execute security audit",
"Verify test coverage", "Validate API", "Check UI/UX",
"Calculate grades", "Generate suggestions", "Compile report"]
# Spell out each active form; a naive f"{phase}ing" would produce
# strings like "Run code quality checksing"
active_forms = ["Running code quality checks", "Executing security audit",
"Verifying test coverage", "Validating API", "Checking UI/UX",
"Calculating grades", "Generating suggestions", "Compiling report"]
for subject, active_form in zip(phases, active_forms):
TaskCreate(subject=subject, activeForm=active_form)
Load details: Read("${CLAUDE_SKILL_DIR}/references/verification-phases.md") for complete phase details, agent spawn definitions, Agent Teams alternative, and team teardown.
| Phase | Activities | Output |
|---|---|---|
| 1. Context Gathering | Git diff, commit history | Changes summary |
| 2. Parallel Agent Dispatch | 6 agents evaluate | 0-10 scores |
| 2.5 Visual Capture | Screenshot routes, AI vision eval | Gallery + visual score |
| 3. Test Execution | Backend + frontend tests | Coverage data |
| 4. Nuanced Grading | Composite score calculation | Grade (A-F) |
| 5. Improvement Suggestions | Effort vs impact analysis | Prioritized list |
| 6. Alternative Comparison | Compare approaches (optional) | Recommendation |
| 7. Metrics Tracking | Trend analysis | Historical data |
| 8. Report Compilation | Evidence artifacts + gallery.html | Final report |
| 8.5 Agentation Loop | User annotates, ui-feedback fixes | Before/after diffs |
| Agent | Focus | Output |
|---|---|---|
| code-quality-reviewer | Lint, types, patterns | Quality 0-10 |
| security-auditor | OWASP, secrets, CVEs | Security 0-10 |
| test-generator | Coverage, test quality | Coverage 0-10 |
| backend-system-architect | API design, async | API 0-10 |
| frontend-ui-developer | React 19, Zod, a11y | UI 0-10 |
| python-performance-engineer | Latency, resources, scaling | Performance 0-10 |
Launch ALL agents in ONE message with run_in_background=True and max_turns=25.
Output each agent's score as soon as it completes — don't wait for all 6-7 agents:
Security: 8.2/10 — No critical vulnerabilities found
Code Quality: 7.5/10 — 3 complexity hotspots identified
[...remaining agents still running...]
This gives users real-time visibility into multi-agent verification. If any dimension scores below its blocking threshold (the default blocking rule blocks security below 5.0), flag it as a blocker immediately — the user can terminate early without waiting for remaining agents.
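The progressive pattern can be sketched with a thread pool standing in for background agents. `run_verifiers` and its callables are illustrative; in the skill itself, agents are launched via the Agent tool with `run_in_background=True`:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def run_verifiers(verifiers: dict, blocking_threshold: float = 5.0):
    """Report each dimension as soon as its verifier finishes,
    flagging blockers immediately rather than batching at the end.

    verifiers maps dimension name -> zero-arg callable returning a
    0-10 score.
    """
    results, blockers = {}, []
    with ThreadPoolExecutor(max_workers=len(verifiers)) as pool:
        futures = {pool.submit(fn): name for name, fn in verifiers.items()}
        for future in as_completed(futures):
            name = futures[future]
            score = future.result()
            results[name] = score
            if score < blocking_threshold:
                blockers.append(name)          # surfaced before the rest finish
            print(f"{name}: {score:.1f}/10")   # progressive, not batched
    return results, blockers
```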
Load details: Read("${CLAUDE_SKILL_DIR}/references/visual-capture.md") for auto-detection, route discovery, screenshot capture, and AI vision evaluation.
Summary: Auto-detects the project framework, starts the dev server, discovers routes, uses agent-browser to screenshot each route, evaluates with Claude vision, and generates a self-contained gallery.html with base64-embedded images.
Output: verification-output/{timestamp}/gallery.html — open in a browser to see all screenshots with AI evaluations, scores, and annotation diffs.
Graceful degradation: If no frontend is detected or the server won't start, visual capture is skipped with a warning — it never blocks verification.
Load details: Read("${CLAUDE_SKILL_DIR}/references/visual-capture.md") (Phase 8.5 section) for agentation loop workflow.
Trigger: Only when the agentation MCP is configured. Offers the user the choice to annotate the live UI. The ui-feedback agent processes annotations, and re-screenshots show before/after.
Load Read("${CLAUDE_PLUGIN_ROOT}/skills/quality-gates/references/unified-scoring-framework.md") for dimensions, weights, grade thresholds, and improvement prioritization. Load Read("${CLAUDE_SKILL_DIR}/references/quality-model.md") for verify-specific extensions (Visual dimension). Load Read("${CLAUDE_SKILL_DIR}/references/grading-rubric.md") for per-agent scoring criteria.
Load details: Read("${CLAUDE_SKILL_DIR}/rules/evidence-collection.md") for git commands, test execution patterns, metrics tracking, and post-verification feedback.
Load details: Read("${CLAUDE_SKILL_DIR}/references/policy-as-code.md") for configuration.
Define verification rules in .claude/policies/verification-policy.json:
{
"thresholds": {
"composite_minimum": 6.0,
"security_minimum": 7.0,
"coverage_minimum": 70
},
"blocking_rules": [
{"dimension": "security", "below": 5.0, "action": "block"}
]
}
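A minimal sketch of enforcing this policy against per-dimension scores. `evaluate_policy` is an illustrative name, and the composite here is an unweighted mean, whereas the real framework uses the weights from unified-scoring-framework.md:

```python
def evaluate_policy(scores: dict, policy: dict) -> list[str]:
    """Return human-readable violations; an empty list means pass.

    Checks composite and security minimums from "thresholds", then
    applies each entry in "blocking_rules".
    """
    violations = []
    thresholds = policy.get("thresholds", {})
    composite = sum(scores.values()) / len(scores)  # unweighted mean for the sketch
    minimum = thresholds.get("composite_minimum")
    if minimum is not None and composite < minimum:
        violations.append(f"composite {composite:.1f} < {minimum}")
    sec_min = thresholds.get("security_minimum")
    if sec_min is not None and scores.get("security", 10) < sec_min:
        violations.append(f"security below minimum {sec_min}")
    for rule in policy.get("blocking_rules", []):
        if scores.get(rule["dimension"], 10) < rule["below"]:
            violations.append(f"{rule['dimension']}: {rule['action']}")
    return violations
```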
Load details: Read("${CLAUDE_SKILL_DIR}/references/report-template.md") for full format. Summary:
# Feature Verification Report
**Composite Score: [N.N]/10** (Grade: [LETTER])
## Verdict
**[READY FOR MERGE | IMPROVEMENTS RECOMMENDED | BLOCKED]**
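The verdict line can be derived from the composite score and blocker state. A minimal sketch, assuming an illustrative 8.0 cut-off for READY FOR MERGE — the actual thresholds live in scoring-rubric.md:

```python
def verdict(composite: float, blocked: bool) -> str:
    """Map a 0-10 composite score to the report verdict.
    The 8.0 cut-off is an assumption for illustration only."""
    if blocked:
        return "BLOCKED"
    if composite >= 8.0:
        return "READY FOR MERGE"
    return "IMPROVEMENTS RECOMMENDED"
```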
Load on demand with Read("${CLAUDE_SKILL_DIR}/references/<file>"):
| File | Content |
|---|---|
verification-phases.md | 8-phase workflow, agent spawn definitions, Agent Teams mode |
visual-capture.md | Phase 2.5 + 8.5: screenshot capture, AI vision, gallery generation, agentation loop |
quality-model.md | Scoring dimensions and weights (8 unified dimensions) |
grading-rubric.md | Per-agent scoring criteria |
report-template.md | Full report format with visual evidence section |
alternative-comparison.md | Approach comparison template |
orchestration-mode.md | Agent Teams vs Task Tool |
policy-as-code.md | Verification policy configuration |
verification-checklist.md | Pre-flight checklist |
Load on demand with Read("${CLAUDE_SKILL_DIR}/rules/<file>"):
| File | Content |
|---|---|
scoring-rubric.md | Composite scoring, grades, verdicts |
evidence-collection.md | Evidence gathering and test patterns |
ork:implement - Full implementation with verification
ork:review-pr - PR-specific verification
testing-unit / testing-integration / testing-e2e - Test execution patterns
ork:quality-gates - Quality gate patterns
browser-tools - Browser automation for visual capture
Version: 4.2.0 (March 2026) — Added progressive output for incremental agent scores
Weekly Installs: 77
GitHub Stars: 132
First Seen: Jan 22, 2026
Security Audits: Gen Agent Trust Hub: Fail · Socket: Pass · Snyk: Pass
Installed on: gemini-cli (70), opencode (70), codex (69), github-copilot (68), claude-code (65), cursor (64)