npx skills add https://github.com/shipshitdev/library --skill spec-first
A structured workflow for LLM-assisted coding that delays implementation until decisions are explicit.
Delay implementation until tradeoffs are explicit — Use conversation to clarify constraints, compare options, surface risks. Only then write code.
Treat the model like a junior engineer with infinite typing speed — Provide structure: clear interfaces, small tasks, explicit acceptance criteria. Code is cheap; understanding and correctness are scarce.
Specs beat prompts — For anything non-trivial, create a durable artifact (spec file) that can be re-fed, diffed, and reused across sessions.
Generated code is disposable; tests are not — Assume rewrites. Design for easy replacement: small modules, minimal coupling, clean seams, strong tests.
The model is over-confident; reality is the judge — Everything important gets verified by execution: tests, linters, typecheckers, reproducible builds.
Goal: Decide before you implement.
Prompts that work:
Output: Decision notes for .agents/DECISIONS/[feature-name].md
Goal: Turn decisions into unambiguous requirements.
File: .agents/SPECS/[feature-name].md
# [Feature Name] Spec
## Purpose
One paragraph: what this is for.
## Non-Goals
Explicitly state what you are NOT building.
## Interfaces
Inputs/outputs, data types, file formats, API endpoints, CLI commands.
## Key Decisions
Libraries, architecture, persistence choices, constraints.
## Edge Cases and Failure Modes
Timeouts, retries, partial failures, invalid input, concurrency, idempotency.
## Acceptance Criteria
Bullet list of testable statements. Avoid "should be fast."
Prefer: "processes 1k items under 2s on M1 Mac."
## Test Plan
Unit/integration boundaries, fixtures, golden files, what must be mocked.
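An acceptance criterion like "processes 1k items under 2s" can be written directly as an executable test. A minimal sketch, assuming a hypothetical `process_items` function standing in for the module the spec describes:

```python
import time

def process_items(items):
    # Hypothetical stand-in for the module under test;
    # replace with the real implementation named in the spec.
    return [str(i) for i in items]

def test_processes_1k_items_under_2s():
    # Executable form of the acceptance criterion
    # "processes 1k items under 2s on M1 Mac."
    items = list(range(1000))
    start = time.monotonic()
    result = process_items(items)
    elapsed = time.monotonic() - start
    assert len(result) == 1000
    assert elapsed < 2.0, f"took {elapsed:.2f}s, budget is 2s"

test_processes_1k_items_under_2s()
```

Encoding the number in a test keeps the criterion checkable by execution rather than by the model's own claim of success.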
Goal: Stepwise checklist where each step has a verification command.
File: .agents/TODOS/[feature-name].md
# [Feature Name] TODO
- [ ] Add project scaffolding (build/run/test commands)
Verify: `npm run build && npm test`
- [ ] Implement module X with interface Y
Verify: `npm test -- --grep "module X"`
- [ ] Add tests for edge cases A/B/C
Verify: `npm test -- --grep "edge cases"`
- [ ] Wire integration
Verify: `npm run integration`
- [ ] Add docs
Verify: `npm run docs && open docs/index.html`
Each item must be independently checkable. This prevents "looks right" progress.
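Because every item carries its own command, the whole checklist can be verified mechanically. A sketch, assuming the `- [ ]` / `Verify:` layout shown above (the function names are illustrative, not part of the skill):

```python
import re
import subprocess

def parse_todo(markdown: str):
    """Extract (description, verify_command) pairs from a TODO file.

    Assumes the layout above: a `- [ ]` item followed by a line
    starting with `Verify:` containing a backtick-quoted command.
    """
    items = []
    pending = None
    for line in markdown.splitlines():
        stripped = line.strip()
        if stripped.startswith("- [ ]"):
            pending = stripped[len("- [ ]"):].strip()
        elif stripped.startswith("Verify:") and pending:
            match = re.search(r"`([^`]+)`", stripped)
            if match:
                items.append((pending, match.group(1)))
            pending = None
    return items

def run_checklist(markdown: str) -> bool:
    # Run each verification command; stop at the first failure so
    # progress is proven step by step, not assumed.
    for description, command in parse_todo(markdown):
        result = subprocess.run(command, shell=True)
        if result.returncode != 0:
            print(f"FAILED: {description} ({command})")
            return False
        print(f"ok: {description}")
    return True
```

Running this after each agent turn turns "looks right" into "passed its own verification commands."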
Goal: Small diffs, frequent verification, controlled context.
Rules:
For large codebases:
Goal: Force the model to try to break its own work.
Prompts:
Goal: Keep the system easy to delete and rewrite.
Heuristics:
Keep in the .agents/ folder (not project root):
.agents/
├── SPECS/
│ └── [feature-name].md # what/why/constraints
├── TODOS/
│ └── [feature-name].md # steps + verification commands
└── DECISIONS/
└── [feature-name].md # tradeoffs, rejected options, assumptions
Naming: Use the feature/task name as the filename (e.g., user-auth.md, api-refactor.md).
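The layout above is easy to scaffold per feature. A sketch that creates the three folders and stub files for one feature name (the template bodies here are minimal placeholders, not the full spec format):

```python
from pathlib import Path

# Minimal stubs; the real files follow the spec/TODO formats above.
TEMPLATES = {
    "SPECS": "# {name} Spec\n\n## Purpose\n\n## Non-Goals\n",
    "TODOS": "# {name} TODO\n",
    "DECISIONS": "# {name} Decisions\n",
}

def scaffold(feature: str, root: str = ".agents") -> list[Path]:
    """Create the .agents/ layout with stub files for one feature."""
    created = []
    for folder, template in TEMPLATES.items():
        path = Path(root) / folder / f"{feature}.md"
        path.parent.mkdir(parents=True, exist_ok=True)
        if not path.exists():  # never clobber an existing artifact
            path.write_text(template.format(name=feature))
        created.append(path)
    return created
```

For example, `scaffold("user-auth")` yields `.agents/SPECS/user-auth.md`, `.agents/TODOS/user-auth.md`, and `.agents/DECISIONS/user-auth.md`.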
Why the .agents/ folder:
Before running autonomous/agentic execution, verify:
| Dimension | Question | If No... |
|---|---|---|
| Intent | Do you have acceptance criteria and a test harness? | Don't run the agent |
| Memory | Do you have durable artifacts (spec/todo) so it can resume? | It will thrash |
| Planning | Can it produce/update a plan with checkpoints? | It will improvise badly |
| Authority | Is what it can do restricted (edit, test, commit)? | Too risky |
| Control Flow | Does it decide next step based on tool output? | It's just generating blobs |
| Tools | Does it have minimum necessary tooling and nothing extra? | Attack surface too large |
Approve at meaningful checkpoints (end of todo item, after test suite passes), not every micro-step.
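The Intent and Memory rows of the table can be checked mechanically before launching an agent. A sketch, assuming the `.agents/` layout described earlier:

```python
from pathlib import Path

def preflight(feature: str, root: str = ".agents") -> list[str]:
    """Gate agent launch on the Intent and Memory rows of the table:
    durable spec/todo artifacts must exist, and the spec must state
    acceptance criteria. Returns a list of problems; empty means go."""
    problems = []
    spec = Path(root) / "SPECS" / f"{feature}.md"
    todo = Path(root) / "TODOS" / f"{feature}.md"
    if not spec.exists():
        problems.append("no spec: no acceptance criteria to execute against")
    elif "## Acceptance Criteria" not in spec.read_text():
        problems.append("spec lacks an Acceptance Criteria section")
    if not todo.exists():
        problems.append("no todo: agent cannot resume; it will thrash")
    return problems
```

The Planning, Authority, Control Flow, and Tools rows depend on the agent runtime and are judged by inspection rather than a script.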
Authoritative (for correctness):
Edit these files: [paths]
Interface: [exact signatures]
Acceptance criteria: [list]
Required tests: [list]
Don't change anything else.
Options and tradeoffs (for design):
Give me 3 options and a recommendation.
Make the recommendation conditional on constraints A/B/C.
Context discipline (for large codebases):
Only use the files I provided.
If you need more context, ask for a specific file and explain why.
Make it provable:
Add a test that fails on the buggy version and passes on the correct one.
When this skill activates, produce:
SPEC-FIRST WORKFLOW
STAGE A - FRAMING:
[3 approaches with tradeoffs]
[Recommendation]
STAGE B - SPEC:
[Draft spec.md content]
STAGE C - TODO:
[Draft todo.md with verification commands]
Ready to proceed to Stage D (execution)?
Weekly Installs
74
Repository
GitHub Stars
19
First Seen
Jan 20, 2026
Security Audits
Gen Agent Trust Hub: Pass
Socket: Pass
Snyk: Pass
Installed on
codex: 56
opencode: 53
gemini-cli: 51
claude-code: 51
cursor: 49
github-copilot: 45