test-driven-development by obra/superpowers
npx skills add https://github.com/obra/superpowers --skill test-driven-development先写测试。看着它失败。编写最少的代码使其通过。
核心原则: 如果你没有看到测试失败,你就不知道它测试的是否正确。
违反规则的字面意思就是违反规则的精神。
总是使用:
例外情况(请咨询你的真人搭档):
想着“就这一次跳过 TDD”?停下。那是合理化借口。
NO PRODUCTION CODE WITHOUT A FAILING TEST FIRST
先于测试写代码?删除它。重新开始。
没有例外:
根据测试重新实现。完毕。
digraph tdd_cycle {
rankdir=LR;
red [label="RED\nWrite failing test", shape=box, style=filled, fillcolor="#ffcccc"];
verify_red [label="Verify fails\ncorrectly", shape=diamond];
green [label="GREEN\nMinimal code", shape=box, style=filled, fillcolor="#ccffcc"];
verify_green [label="Verify passes\nAll green", shape=diamond];
refactor [label="REFACTOR\nClean up", shape=box, style=filled, fillcolor="#ccccff"];
next [label="Next", shape=ellipse];
red -> verify_red;
verify_red -> green [label="yes"];
verify_red -> red [label="wrong\nfailure"];
green -> verify_green;
verify_green -> refactor [label="yes"];
verify_green -> green [label="no"];
refactor -> verify_green [label="stay\ngreen"];
verify_green -> next;
next -> red;
}
<Good>
清晰的名字,测试真实行为,一件事
</Good>
<Bad>
```typescript
test('retry works', async () => {
const mock = jest.fn()
.mockRejectedValueOnce(new Error())
.mockRejectedValueOnce(new Error())
.mockResolvedValueOnce('success');
await retryOperation(mock);
expect(mock).toHaveBeenCalledTimes(3);
});
```
模糊的名字,测试的是模拟对象而不是代码
</Bad>
广告位招租
在这里展示您的产品或服务
触达数万 AI 开发者,精准高效
编写一个最小的测试,展示应该发生什么。
test('retries operation up to 3 times', async () => {
let attempts = 0;
const operation = async () => {
attempts++;
if (attempts < 3) throw new Error();
return 'success';
};
const result = await retryOperation(operation);
expect(result).toBe('success');
expect(attempts).toBe(3);
});
要求:
强制性的。永远不要跳过。
npm test path/to/test.test.ts
确认:
测试通过? 你正在测试现有行为。修复测试。
测试出错? 修复错误,重新运行直到它正确失败。
编写最简单的代码使测试通过。
不要添加功能、重构其他代码,或在测试范围之外进行“改进”。
强制性的。
npm test path/to/test.test.ts
确认:
测试失败? 修复代码,而不是测试。
其他测试失败? 立即修复。
仅在绿色之后进行:
保持测试为绿色。不要添加行为。
为下一个功能编写下一个失败的测试。
| 质量 | 好 | 坏 |
|---|---|---|
| 最小化 | 一件事。名字里有“和”?拆分它。 | test('validates email and domain and whitespace') |
| 清晰 | 名称描述行为 | test('test1') |
| 展示意图 | 展示期望的 API | 模糊了代码应该做什么 |
“我会在之后写测试来验证它是否工作”
在代码之后编写的测试会立即通过。立即通过证明不了任何东西:
测试先行迫使你看到测试失败,证明它确实测试了某些东西。
“我已经手动测试了所有边缘情况”
手动测试是临时的。你以为你测试了一切,但是:
自动化测试是系统性的。它们每次都按相同方式运行。
“删除 X 小时的工作是浪费的”
沉没成本谬误。时间已经过去了。你现在选择:
“浪费”在于保留你无法信任的代码。没有真实测试的工作代码是技术债务。
“TDD 是教条的,务实意味着适应”
TDD 就是务实的:
“务实”的捷径 = 在生产环境调试 = 更慢。
“事后测试能达到相同目标——重要的是精神不是仪式”
不。事后测试回答“这做了什么?” 测试先行回答“这应该做什么?”
事后测试受你的实现影响。你测试你构建的东西,而不是需要的东西。你验证你记得的边缘情况,而不是发现的边缘情况。
测试先行迫使你在实现前发现边缘情况。事后测试验证你是否记得了一切(你没有)。
30 分钟的事后测试 ≠ TDD。你得到了覆盖率,失去了测试有效的证明。
| 借口 | 现实 |
|---|---|
| “太简单了,不需要测试” | 简单的代码也会出错。测试只需 30 秒。 |
| “我之后会测试” | 测试立即通过证明不了任何东西。 |
| “事后测试能达到相同目标” | 事后测试 = “这做了什么?” 测试先行 = “这应该做什么?” |
| “已经手动测试过了” | 临时 ≠ 系统。没有记录,无法重新运行。 |
| “删除 X 小时是浪费的” | 沉没成本谬误。保留未经验证的代码是技术债务。 |
| “保留作为参考,先写测试” | 你会调整它。那就是事后测试。删除就是删除。 |
| “需要先探索一下” | 可以。扔掉探索,用 TDD 开始。 |
| “测试困难 = 设计不清晰” | 倾听测试。难以测试 = 难以使用。 |
| “TDD 会拖慢我” | TDD 比调试更快。务实 = 测试先行。 |
| “手动测试更快” | 手动测试无法证明边缘情况。你每次更改都要重新测试。 |
| “现有代码没有测试” | 你正在改进它。为现有代码添加测试。 |
所有这些都意味着:删除代码。用 TDD 重新开始。
错误: 接受空电子邮件
红
test('rejects empty email', async () => {
const result = await submitForm({ email: '' });
expect(result.error).toBe('Email required');
});
验证红
$ npm test
FAIL: expected 'Email required', got undefined
绿
function submitForm(data: FormData) {
if (!data.email?.trim()) {
return { error: 'Email required' };
}
// ...
}
验证绿
$ npm test
PASS
重构 如果需要,为多个字段提取验证逻辑。
在标记工作完成前:
无法勾选所有选项?你跳过了 TDD。重新开始。
| 问题 | 解决方案 |
|---|---|
| 不知道如何测试 | 写下期望的 API。先写断言。咨询你的真人搭档。 |
| 测试太复杂 | 设计太复杂。简化接口。 |
| 必须模拟一切 | 代码耦合度太高。使用依赖注入。 |
| 测试设置庞大 | 提取辅助函数。仍然复杂?简化设计。 |
发现错误?编写一个重现它的失败测试。遵循 TDD 循环。测试证明修复并防止回归。
永远不要在没有测试的情况下修复错误。
当添加模拟对象或测试工具时,请阅读 @testing-anti-patterns.md 以避免常见陷阱:
生产代码 → 测试存在且首先失败
否则 → 不是 TDD
未经你的真人搭档许可,没有例外。
每周安装量
32.1K
代码库
GitHub 星标数
107.7K
首次出现
2026 年 1 月 19 日
安全审计
安装于
opencode27.2K
codex26.0K
gemini-cli25.9K
github-copilot24.6K
cursor23.6K
kimi-cli22.5K
Write the test first. Watch it fail. Write minimal code to pass.
Core principle: If you didn't watch the test fail, you don't know if it tests the right thing.
Violating the letter of the rules is violating the spirit of the rules.
Always:
Exceptions (ask your human partner):
Thinking "skip TDD just this once"? Stop. That's rationalization.
NO PRODUCTION CODE WITHOUT A FAILING TEST FIRST
Write code before the test? Delete it. Start over.
No exceptions:
Implement fresh from tests. Period.
digraph tdd_cycle {
rankdir=LR;
red [label="RED\nWrite failing test", shape=box, style=filled, fillcolor="#ffcccc"];
verify_red [label="Verify fails\ncorrectly", shape=diamond];
green [label="GREEN\nMinimal code", shape=box, style=filled, fillcolor="#ccffcc"];
verify_green [label="Verify passes\nAll green", shape=diamond];
refactor [label="REFACTOR\nClean up", shape=box, style=filled, fillcolor="#ccccff"];
next [label="Next", shape=ellipse];
red -> verify_red;
verify_red -> green [label="yes"];
verify_red -> red [label="wrong\nfailure"];
green -> verify_green;
verify_green -> refactor [label="yes"];
verify_green -> green [label="no"];
refactor -> verify_green [label="stay\ngreen"];
verify_green -> next;
next -> red;
}
Write one minimal test showing what should happen.
const result = await retryOperation(operation);
expect(result).toBe('success'); expect(attempts).toBe(3); });
Clear name, tests real behavior, one thing
</Good>
<Bad>
```typescript
test('retry works', async () => {
const mock = jest.fn()
.mockRejectedValueOnce(new Error())
.mockRejectedValueOnce(new Error())
.mockResolvedValueOnce('success');
await retryOperation(mock);
expect(mock).toHaveBeenCalledTimes(3);
});
Vague name, tests mock not code
Requirements:
MANDATORY. Never skip.
npm test path/to/test.test.ts
Confirm:
Test passes? You're testing existing behavior. Fix test.
Test errors? Fix error, re-run until it fails correctly.
Write simplest code to pass the test.
Don't add features, refactor other code, or "improve" beyond the test.
MANDATORY.
npm test path/to/test.test.ts
Confirm:
Test fails? Fix code, not test.
Other tests fail? Fix now.
After green only:
Keep tests green. Don't add behavior.
Next failing test for next feature.
| Quality | Good | Bad |
|---|---|---|
| Minimal | One thing. "and" in name? Split it. | test('validates email and domain and whitespace') |
| Clear | Name describes behavior | test('test1') |
| Shows intent | Demonstrates desired API | Obscures what code should do |
"I'll write tests after to verify it works"
Tests written after code pass immediately. Passing immediately proves nothing:
Test-first forces you to see the test fail, proving it actually tests something.
"I already manually tested all the edge cases"
Manual testing is ad-hoc. You think you tested everything but:
Automated tests are systematic. They run the same way every time.
"Deleting X hours of work is wasteful"
Sunk cost fallacy. The time is already gone. Your choice now:
The "waste" is keeping code you can't trust. Working code without real tests is technical debt.
"TDD is dogmatic, being pragmatic means adapting"
TDD IS pragmatic:
"Pragmatic" shortcuts = debugging in production = slower.
"Tests after achieve the same goals - it's spirit not ritual"
No. Tests-after answer "What does this do?" Tests-first answer "What should this do?"
Tests-after are biased by your implementation. You test what you built, not what's required. You verify remembered edge cases, not discovered ones.
Tests-first force edge case discovery before implementing. Tests-after verify you remembered everything (you didn't).
30 minutes of tests after ≠ TDD. You get coverage, lose proof tests work.
| Excuse | Reality |
|---|---|
| "Too simple to test" | Simple code breaks. Test takes 30 seconds. |
| "I'll test after" | Tests passing immediately prove nothing. |
| "Tests after achieve same goals" | Tests-after = "what does this do?" Tests-first = "what should this do?" |
| "Already manually tested" | Ad-hoc ≠ systematic. No record, can't re-run. |
| "Deleting X hours is wasteful" | Sunk cost fallacy. Keeping unverified code is technical debt. |
| "Keep as reference, write tests first" | You'll adapt it. That's testing after. Delete means delete. |
| "Need to explore first" | Fine. Throw away exploration, start with TDD. |
| "Test hard = design unclear" | Listen to test. Hard to test = hard to use. |
| "TDD will slow me down" | TDD faster than debugging. Pragmatic = test-first. |
| "Manual test faster" | Manual doesn't prove edge cases. You'll re-test every change. |
| "Existing code has no tests" | You're improving it. Add tests for existing code. |
All of these mean: Delete code. Start over with TDD.
Bug: Empty email accepted
RED
test('rejects empty email', async () => {
const result = await submitForm({ email: '' });
expect(result.error).toBe('Email required');
});
Verify RED
$ npm test
FAIL: expected 'Email required', got undefined
GREEN
function submitForm(data: FormData) {
if (!data.email?.trim()) {
return { error: 'Email required' };
}
// ...
}
Verify GREEN
$ npm test
PASS
REFACTOR Extract validation for multiple fields if needed.
Before marking work complete:
Can't check all boxes? You skipped TDD. Start over.
| Problem | Solution |
|---|---|
| Don't know how to test | Write wished-for API. Write assertion first. Ask your human partner. |
| Test too complicated | Design too complicated. Simplify interface. |
| Must mock everything | Code too coupled. Use dependency injection. |
| Test setup huge | Extract helpers. Still complex? Simplify design. |
Bug found? Write failing test reproducing it. Follow TDD cycle. Test proves fix and prevents regression.
Never fix bugs without a test.
When adding mocks or test utilities, read @testing-anti-patterns.md to avoid common pitfalls:
Production code → test exists and failed first
Otherwise → not TDD
No exceptions without your human partner's permission.
Weekly Installs
32.1K
Repository
GitHub Stars
107.7K
First Seen
Jan 19, 2026
Security Audits
Gen Agent Trust HubPassSocketPassSnykPass
Installed on
opencode27.2K
codex26.0K
gemini-cli25.9K
github-copilot24.6K
cursor23.6K
kimi-cli22.5K
97,600 周安装