Story Refiner by bobchao/pm-skills-rfp-to-stories
npx skills add https://github.com/bobchao/pm-skills-rfp-to-stories --skill 'Story Refiner'
Default: Respond in the same language as the user's input, or as explicitly requested by the user.
If the user specifies a preferred language (e.g., "請用中文回答", "Reply in Japanese"), use that language for all outputs. Otherwise, match the language of the provided Stories.
You simultaneously play three roles to review User Stories:
This Skill accepts the following inputs:
All scoring and evaluation must follow the standards defined in references/evaluation-criteria.md.
This document defines:
Important: Both Quick Scan (Phase 1) and Detailed Evaluation (Phase 2) use these same criteria, with different levels of depth.
Score each Story initially (1-5 points) using the three dimensions from references/evaluation-criteria.md:
Scoring Method:
- round((Development Clarity + Testability + Value Clarity) / 3)
- Use the scoring rubric table in references/evaluation-criteria.md as reference

Quick Assessment Focus:
| Score | Level | Action |
|---|---|---|
| 5 | Excellent | Keep, no modification |
| 4 | Good | Keep, may have minor suggestions |
| 3 | Passing | Mark for observation, may need minor adjustments |
| 2 | Insufficient | Must correct |
| 1 | Severely insufficient | Must rewrite |
Only Stories scoring ≤ 3 enter Phase 2 detailed evaluation.
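The quick-scan arithmetic above can be sketched in Python. This is a minimal illustration only: the dimension scores would come from applying the rubric in references/evaluation-criteria.md, and the function names are hypothetical, not part of the Skill.

```python
def quick_scan_score(dev_clarity: int, testability: int, value_clarity: int) -> int:
    """Overall score: the rounded mean of the three dimension scores (each 1-5).

    Note: Python's round() uses banker's rounding on .5 ties.
    """
    return round((dev_clarity + testability + value_clarity) / 3)


def needs_detailed_evaluation(score: int) -> bool:
    """Only Stories scoring <= 3 enter Phase 2 detailed evaluation."""
    return score <= 3


# A Story that is clear on value but vague for development and testing
# still lands at 3 overall and gets flagged for Phase 2.
print(quick_scan_score(2, 3, 5))                              # 3
print(needs_detailed_evaluation(quick_scan_score(2, 3, 5)))   # True
print(needs_detailed_evaluation(4))                           # False
```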
For Stories needing review, perform detailed evaluation from three perspectives using the Specific Checkpoints and Common Deduction Patterns defined in references/evaluation-criteria.md.
Reference: references/evaluation-criteria.md - Dimension 1: Development Clarity
Detailed Checkpoints (from evaluation-criteria.md):
Common Problems (see evaluation-criteria.md for deduction patterns):
Reference: references/evaluation-criteria.md - Dimension 2: Testability
Detailed Checkpoints (from evaluation-criteria.md):
Common Problems (see evaluation-criteria.md for deduction patterns):
Reference: references/evaluation-criteria.md - Dimension 3: Value Clarity
Detailed Checkpoints (from evaluation-criteria.md):
Common Problems (see evaluation-criteria.md for deduction patterns):
For Stories scoring ≤ 3, execute corrections based on problem type:
| Problem Type | Correction Method |
|---|---|
| Scope too large | Split into multiple Stories |
| Scope vague | Add specific operation description |
| Value unclear | Rewrite "so that..." part |
| Not testable | Add specific acceptance criteria |
| Format issue | Adjust to standard format |
| Wrong role | Correct to proper role |
| Improper granularity | Split or merge |
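The table above is effectively a lookup from diagnosed problem to prescribed fix. A hedged sketch (the dictionary keys paraphrase the table; the structure itself is illustrative, not something the Skill defines):

```python
# Problem type -> correction method, paraphrasing the table above.
CORRECTION_METHODS = {
    "scope too large": "split into multiple Stories",
    "scope vague": "add specific operation description",
    "value unclear": 'rewrite the "so that..." part',
    "not testable": "add specific acceptance criteria",
    "format issue": "adjust to standard format",
    "wrong role": "correct to proper role",
    "improper granularity": "split or merge",
}


def plan_corrections(diagnosed_problems: list[str]) -> list[str]:
    """Map each diagnosed problem type to its prescribed correction."""
    return [CORRECTION_METHODS[p] for p in diagnosed_problems]


print(plan_corrections(["not testable", "scope too large"]))
# ['add specific acceptance criteria', 'split into multiple Stories']
```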
Corrected Stories need re-evaluation to ensure quality meets standards. This is the core of iterative refinement.
| Situation | Single-Pass Refinement Problem | Iterative Solution |
|---|---|---|
| Story is split | New Stories aren't evaluated | ✅ Next round evaluates new Stories |
| Over-correction | Might break something | ✅ Next round catches and fine-tunes |
| Acceptance criteria still not specific | Passes through | ✅ Next round strengthens |
Round 1: Evaluate all Stories → Correct low-scoring items → Produce corrected version
↓
Round 2: Evaluate "corrected" + "newly generated" Stories → Correct again if needed
↓
Round 3: (If still issues) Final fine-tuning
↓
Terminate: Output final version
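The round flow above can be sketched as a loop. Assumption: `evaluate` and `correct` are hypothetical stand-ins for the Phase 1/2 logic; `correct` may return several Stories when it splits one, which is exactly why the next round re-evaluates its output.

```python
MAX_ROUNDS = 3
PASS_THRESHOLD = 4  # Stories scoring >= 4 are kept as-is


def refine(stories, evaluate, correct):
    """Iterate evaluate -> correct for up to MAX_ROUNDS, re-checking
    corrected and newly split Stories each round; stop early once all pass."""
    for round_no in range(1, MAX_ROUNDS + 1):
        scored = [(story, evaluate(story)) for story in stories]
        low = [story for story, score in scored if score < PASS_THRESHOLD]
        if not low:
            break  # quality achieved: all Stories score >= 4
        passing = [story for story, score in scored if score >= PASS_THRESHOLD]
        corrected = []
        for story in low:
            corrected.extend(correct(story, round_no))  # a split yields several
        stories = passing + corrected
    return stories
```

Carrying the list forward as `passing + corrected` is what makes split-generated Stories visible to the next round's evaluation, mirroring the "new Story priority" rule below.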
| Rule | Description |
|---|---|
| Progressive convergence | Each round should reduce problems, not increase them |
| History memory | Track each Story's correction history, avoid back-and-forth changes |
| Correction limit | Same Story can only be majorly changed once, then only fine-tuned |
| New Story priority | From round 2, prioritize evaluating Stories generated in previous round |
| Round | Allowed Correction Types |
|---|---|
| Round 1 | All corrections (split, rewrite, add acceptance criteria, etc.) |
| Round 2 | Moderate corrections (add acceptance criteria, adjust wording, minor splits) |
| Round 3 | Fine-tuning only (word corrections, add details, no splitting or rewriting) |
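The per-round limits can be expressed as a small allow-list. A sketch under the assumption that correction types are tagged with the (paraphrased) names from the table above:

```python
# Round number -> correction types permitted in that round,
# paraphrasing the table above.
ALLOWED_CORRECTIONS = {
    1: {"split", "rewrite", "add acceptance criteria", "adjust wording", "fine-tune"},
    2: {"minor split", "add acceptance criteria", "adjust wording", "fine-tune"},
    3: {"fine-tune"},
}


def is_allowed(round_no: int, correction_type: str) -> bool:
    """Check whether a correction type is permitted in the given round."""
    return correction_type in ALLOWED_CORRECTIONS.get(round_no, set())


print(is_allowed(1, "rewrite"))    # True
print(is_allowed(3, "rewrite"))    # False
print(is_allowed(3, "fine-tune"))  # True
```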
This design ensures:
Record at the end of each round:
### Round N Refinement Summary
| Metric | Value |
|--------|-------|
| Stories Evaluated | XX |
| Corrections Made | XX |
| New (from splits) | XX |
| Average Score Improvement | +X.X |
**This Round's Corrections**:
- US-XXX: [Correction summary]
- US-XXX: [Correction summary]
**Continue?**: [Yes/No, reason]
# Story Refinement Report
## 📊 Refinement Summary
### Overall Results
- Original Story Count: XX
- Final Story Count: XX (including split additions)
- Refinement Rounds: X / 3
- Termination Reason: [Quality achieved / No corrections needed / Limit reached]
### Per-Round Statistics
| Round | Evaluated | Corrected | Added | Average Score |
|-------|-----------|-----------|-------|---------------|
| Round 1 | XX | XX | XX | X.X |
| Round 2 | XX | XX | XX | X.X |
| ... | ... | ... | ... | ... |
## 🔄 Refinement History
[Per-round correction summaries, collapsible]
## ✅ Final Passing Stories
[Stories scoring ≥ 4]
## 🔧 Corrected Stories
[Original → Final version comparison, noting correction round]
## ➕ Split-Generated Stories
[New Stories from splits]
## 🗑️ Recommended for Removal
[Stories not matching requirements or duplicates]
## 📋 Final Story List
[Complete integrated list, ready for use]
### 🔧 US-XXX: [Title]
**Original Version**:
> As a [role], I want [action], so that [value].
**Problem Diagnosis**:
- 🧪 QA Perspective: Acceptance criteria unclear, can't write tests
- 👨💻 Developer Perspective: Scope includes multiple independent features
**Correction Method**: Split into two Stories + add acceptance criteria
**Improved Version**:
**US-XXX-A**: As a [role], I want [action A], so that [value].
- Acceptance Criteria:
- [ ] Condition 1
- [ ] Condition 2
**US-XXX-B**: As a [role], I want [action B], so that [value].
- Acceptance Criteria:
- [ ] Condition 1
---
This may indicate systematic issues in the Story Writer phase:
If comparison with the RFP reveals features not covered by the Stories:
If all Stories score ≥ 4:
Refer to assets/refine-example.md for a complete output example.
- references/evaluation-criteria.md - Defines detailed scoring standards for all three dimensions
- assets/refine-example.md - Complete refinement report example

[rfp-analyzer] → [story-writer] → [story-refiner] → Final output
Usage: After Story Writer produces a User Stories draft, use Story Refiner to evaluate quality and automatically correct low-scoring Stories. This is a separate step that should be invoked explicitly when refinement is needed.
When user requests "strict check" or project risk is higher:
When user requests "quick pass" or project is MVP/POC:
After completing refinement, confirm the following items:
When user explicitly says "quick refine" or "one pass only":
If large numbers of low-scoring Stories remain after 3 rounds:
- Weekly Installs: 0
- GitHub Stars: 24
- First Seen: Jan 1, 1970