测试驱动开发TDD完整指南：红绿重构循环、核心原则与实践方法

tdd%3Atest-driven-development by neolabhq/context-engineering-kit

327 周安装量

741 GitHub Stars

GitHub

安装命令

npx skills add https://github.com/neolabhq/context-engineering-kit --skill tdd:test-driven-development

方法论代码质量测试

🇨🇳中文介绍

测试驱动开发 (TDD)

概述

先写测试。看着它失败。编写最少的代码使其通过。

核心原则： 如果你没有看到测试失败，你就不知道它是否测试了正确的东西。

违反规则的字面意思就是违反规则的精神。

何时使用

总是使用：

新功能
错误修复
重构
行为变更

例外情况（询问你的真人伙伴）：

一次性原型
生成的代码
配置文件

想着"就这一次跳过 TDD"？停下。那是合理化借口。

铁律

NO PRODUCTION CODE WITHOUT A FAILING TEST FIRST

先于测试写代码？删除它。重新开始。

没有例外：

不要把它当作"参考"保留
不要在编写测试时"调整"它
不要看它
删除就是删除

完全根据测试重新实现。完毕。

红-绿-重构

digraph tdd_cycle {
    rankdir=LR;
    red [label="RED\nWrite failing test", shape=box, style=filled, fillcolor="#ffcccc"];
    verify_red [label="Verify fails\ncorrectly", shape=diamond];
    green [label="GREEN\nMinimal code", shape=box, style=filled, fillcolor="#ccffcc"];
    verify_green [label="Verify passes\nAll green", shape=diamond];
    refactor [label="REFACTOR\nClean up", shape=box, style=filled, fillcolor="#ccccff"];
    next [label="Next", shape=ellipse];

    red -> verify_red;
    verify_red -> green [label="yes"];
    verify_red -> red [label="wrong\nfailure"];
    green -> verify_green;
    verify_green -> refactor [label="yes"];
    verify_green -> green [label="no"];
    refactor -> verify_green [label="stay\ngreen"];
    verify_green -> next;
    next -> red;
}

广告位招租

在这里展示您的产品或服务

触达数万 AI 开发者，精准高效

联系我们

红 - 编写失败的测试

编写一个最小的测试来展示应该发生什么。

const result = await retryOperation(operation);

expect(result).toBe('success'); expect(attempts).toBe(3); });

Clear name, tests real behavior, one thing
</Good>

<Bad>
```typescript
test('retry works', async () => {
  const mock = jest.fn()
    .mockRejectedValueOnce(new Error())
    .mockRejectedValueOnce(new Error())
    .mockResolvedValueOnce('success');
  await retryOperation(mock);
  expect(mock).toHaveBeenCalledTimes(3);
});

名称模糊，测试的是模拟而不是代码

一个行为
清晰的名称
真实代码（除非不可避免，否则不使用模拟）

验证红 - 看着它失败

强制性的。永远不要跳过。

npm test path/to/test.test.ts

测试失败（不是错误）
失败信息符合预期
失败是因为功能缺失（不是拼写错误）

测试通过？ 你正在测试现有行为。修复测试。

测试出错？ 修复错误，重新运行直到它正确失败。

绿 - 最少的代码

编写最简单的代码使测试通过。

不要添加功能、重构其他代码或在测试范围之外进行"改进"。

验证绿 - 看着它通过

npm test path/to/test.test.ts

测试通过
其他测试仍然通过
输出干净（没有错误、警告）

测试失败？ 修复代码，而不是测试。

其他测试失败？ 立即修复。

移除重复
改进命名
提取辅助函数

保持测试通过。不要添加行为。

为下一个功能编写下一个失败的测试。

质量	好的	差的
最小化	一件事。名称里有"和"？拆分它。	`test('validates email and domain and whitespace')`
清晰	名称描述行为	`test('test1')`
展示意图	演示期望的 API	模糊了代码应该做什么

为什么顺序很重要

"我会在之后写测试来验证它是否工作"

在代码之后编写的测试会立即通过。立即通过证明不了什么：

可能测试了错误的东西
可能测试的是实现，而不是行为
可能遗漏了你忘记的边缘情况
你从未看到它捕获到错误

测试先行迫使你看到测试失败，证明它确实测试了某些东西。

"我已经手动测试了所有边缘情况"

手动测试是临时的。你以为你测试了所有东西，但是：

没有记录你测试了什么
代码更改时无法重新运行
在压力下容易忘记情况
"我试的时候它工作了" ≠ 全面

自动化测试是系统性的。它们每次都以相同的方式运行。

"删除 X 小时的工作是浪费的"

沉没成本谬误。时间已经过去了。你现在选择：

删除并用 TDD 重写（再花 X 小时，高置信度）
保留它并在之后添加测试（30 分钟，低置信度，很可能有错误）

"浪费"在于保留你无法信任的代码。没有真实测试的工作代码是技术债务。

"TDD 是教条主义的，务实意味着适应"

TDD 就是务实的：

在提交前发现错误（比之后调试更快）
防止回归（测试立即捕获破坏）
记录行为（测试展示如何使用代码）
支持重构（自由更改，测试捕获破坏）

"务实"的捷径 = 在生产环境中调试 = 更慢。

"事后测试达到相同目标 - 这是精神而不是仪式"

不。事后测试回答"这个做了什么？" 测试先行回答"这个应该做什么？"

事后测试受你的实现影响。你测试你构建的东西，而不是需要的东西。你验证你记得的边缘情况，而不是发现的边缘情况。

测试先行迫使你在实现前发现边缘情况。事后测试验证你记得所有东西（你没有）。

30 分钟的事后测试 ≠ TDD。你得到了覆盖率，但失去了测试有效的证明。

常见的合理化借口

借口	现实
"太简单了，不需要测试"	简单的代码也会出错。测试只需 30 秒。
"我之后会测试"	测试立即通过证明不了什么。
"事后测试达到相同目标"	事后测试 = "这个做了什么？" 测试先行 = "这个应该做什么？"
"已经手动测试过了"	临时的 ≠ 系统性的。没有记录，无法重新运行。
"删除 X 小时是浪费的"	沉没成本谬误。保留未验证的代码是技术债务。
"保留作为参考，先写测试"	你会调整它。那就是事后测试。删除就是删除。
"需要先探索一下"	可以。丢弃探索，用 TDD 开始。
"测试困难 = 设计不清晰"	倾听测试。难以测试 = 难以使用。
"TDD 会拖慢我的速度"	TDD 比调试更快。务实 = 测试先行。
"手动测试更快"	手动测试不能证明边缘情况。你每次更改都需要重新测试。
"现有代码没有测试"	你正在改进它。为现有代码添加测试。

危险信号 - 停止并重新开始

先于测试写代码
在实现后写测试
测试立即通过
无法解释测试为什么失败
测试"稍后"添加
合理化"就这一次"
"我已经手动测试过了"
"事后测试达到相同目的"
"这是精神而不是仪式"
"保留作为参考"或"调整现有代码"
"已经花了 X 小时，删除是浪费的"
"TDD 是教条主义的，我是务实的"
"这个不同，因为..."

所有这些都意味着：删除代码。用 TDD 重新开始。

示例：错误修复

错误： 接受空电子邮件

test('rejects empty email', async () => {
  const result = await submitForm({ email: '' });
  expect(result.error).toBe('Email required');
});

$ npm test
FAIL: expected 'Email required', got undefined

function submitForm(data: FormData) {
  if (!data.email?.trim()) {
    return { error: 'Email required' };
  }
  // ...
}

重构如果需要，为多个字段提取验证。

在标记工作完成前：

每个新函数/方法都有一个测试
在实现前看到每个测试失败
每个测试因预期原因失败（功能缺失，不是拼写错误）
编写了最少的代码使每个测试通过
所有测试通过
输出干净（没有错误、警告）
测试使用真实代码（仅在不可避免时使用模拟）
覆盖了边缘情况和错误

不能勾选所有选项？你跳过了 TDD。重新开始。

问题	解决方案
不知道如何测试	写下期望的 API。先写断言。询问你的真人伙伴。
测试太复杂	设计太复杂。简化接口。
必须模拟所有东西	代码耦合太紧。使用依赖注入。
测试设置庞大	提取辅助函数。仍然复杂？简化设计。

发现错误？编写一个重现它的失败测试。遵循 TDD 循环。测试证明修复并防止回归。

永远不要在没有测试的情况下修复错误。

Production code → test exists and failed first
Otherwise → not TDD

没有你的真人伙伴的许可，没有例外。

测试必须验证真实行为，而不是模拟行为。模拟是一种隔离手段，而不是被测试的东西。

核心原则： 测试代码做了什么，而不是模拟做了什么。

遵循严格的 TDD 可以防止这些反模式。

1. NEVER test mock behavior
2. NEVER add test-only methods to production classes
3. NEVER mock without understanding dependencies

反模式 1：测试模拟行为

// ❌ BAD: Testing that the mock exists
test('renders sidebar', () => {
  render(<Page />);
  expect(screen.getByTestId('sidebar-mock')).toBeInTheDocument();
});

为什么这是错误的：

你在验证模拟是否工作，而不是组件是否工作
模拟存在时测试通过，不存在时测试失败
没有告诉你任何关于真实行为的信息

你的真人伙伴的纠正： "我们是在测试模拟的行为吗？"

// ✅ GOOD: Test real component or don't mock it
test('renders sidebar', () => {
  render(<Page />);  // Don't mock sidebar
  expect(screen.getByRole('navigation')).toBeInTheDocument();
});

// OR if sidebar must be mocked for isolation:
// Don't assert on the mock - test Page's behavior with sidebar present

BEFORE asserting on any mock element:
  Ask: "Am I testing real component behavior or just mock existence?"

  IF testing mock existence:
    STOP - Delete the assertion or unmock the component

  Test real behavior instead

反模式 2：生产代码中的仅测试方法

// ❌ BAD: destroy() only used in tests
class Session {
  async destroy() {  // Looks like production API!
    await this._workspaceManager?.destroyWorkspace(this.id);
    // ... cleanup
  }
}

// In tests
afterEach(() => session.destroy());

为什么这是错误的：

生产类被仅用于测试的代码污染
如果在生产中意外调用会很危险
违反了 YAGNI 和关注点分离原则
混淆了对象生命周期与实体生命周期

// ✅ GOOD: Test utilities handle test cleanup
// Session has no destroy() - it's stateless in production

// In test-utils/
export async function cleanupSession(session: Session) {
  const workspace = session.getWorkspaceInfo();
  if (workspace) {
    await workspaceManager.destroyWorkspace(workspace.id);
  }
}

// In tests
afterEach(() => cleanupSession(session));

BEFORE adding any method to production class:
  Ask: "Is this only used by tests?"

  IF yes:
    STOP - Don't add it
    Put it in test utilities instead

  Ask: "Does this class own this resource's lifecycle?"

  IF no:
    STOP - Wrong class for this method

反模式 3：在不理解的情况下模拟

// ❌ BAD: Mock breaks test logic
test('detects duplicate server', () => {
  // Mock prevents config write that test depends on!
  vi.mock('ToolCatalog', () => ({
    discoverAndCacheTools: vi.fn().mockResolvedValue(undefined)
  }));

  await addServer(config);
  await addServer(config);  // Should throw - but won't!
});

为什么这是错误的：

模拟的方法具有测试依赖的副作用（写入配置）
过度模拟以"安全起见"破坏了实际行为
测试因错误原因通过或神秘地失败

// ✅ GOOD: Mock at correct level
test('detects duplicate server', () => {
  // Mock the slow part, preserve behavior test needs
  vi.mock('MCPServerManager'); // Just mock slow server startup

  await addServer(config);  // Config written
  await addServer(config);  // Duplicate detected ✓
});

BEFORE mocking any method:
  STOP - Don't mock yet

  1. Ask: "What side effects does the real method have?"
  2. Ask: "Does this test depend on any of those side effects?"
  3. Ask: "Do I fully understand what this test needs?"

  IF depends on side effects:
    Mock at lower level (the actual slow/external operation)
    OR use test doubles that preserve necessary behavior
    NOT the high-level method the test depends on

  IF unsure what test depends on:
    Run test with real implementation FIRST
    Observe what actually needs to happen
    THEN add minimal mocking at the right level

  Red flags:
    - "I'll mock this to be safe"
    - "This might be slow, better mock it"
    - Mocking without understanding the dependency chain

反模式 4：不完整的模拟

// ❌ BAD: Partial mock - only fields you think you need
const mockResponse = {
  status: 'success',
  data: { userId: '123', name: 'Alice' }
  // Missing: metadata that downstream code uses
};

// Later: breaks when code accesses response.metadata.requestId

为什么这是错误的：

部分模拟隐藏了结构假设 - 你只模拟了你了解的字段
下游代码可能依赖于你没有包含的字段 - 静默失败
测试通过但集成失败 - 模拟不完整，真实 API 完整
虚假信心 - 测试没有证明任何关于真实行为的信息

铁律： 模拟现实中存在的完整数据结构，而不仅仅是你当前测试使用的字段。

// ✅ GOOD: Mirror real API completeness
const mockResponse = {
  status: 'success',
  data: { userId: '123', name: 'Alice' },
  metadata: { requestId: 'req-789', timestamp: 1234567890 }
  // All fields real API returns
};

BEFORE creating mock responses:
  Check: "What fields does the real API response contain?"

  Actions:
    1. Examine actual API response from docs/examples
    2. Include ALL fields system might consume downstream
    3. Verify mock matches real response schema completely

  Critical:
    If you're creating a mock, you must understand the ENTIRE structure
    Partial mocks fail silently when code depends on omitted fields

  If uncertain: Include all documented fields

反模式 5：集成测试作为事后想法

✅ Implementation complete
❌ No tests written
"Ready for testing"

为什么这是错误的：

测试是实施的一部分，不是可选的后续工作
TDD 本应发现这一点
没有测试就不能声称完成

TDD cycle:
1. Write failing test
2. Implement to pass
3. Refactor
4. THEN claim complete

当模拟变得过于复杂时

模拟设置比测试逻辑更长
模拟所有东西以使测试通过
模拟缺少真实组件具有的方法
模拟更改时测试中断

你的真人伙伴的问题： "我们这里需要使用模拟吗？"

考虑： 使用真实组件的集成测试通常比复杂的模拟更简单

TDD 防止这些反模式

为什么 TDD 有帮助：

先写测试 → 迫使你思考你实际在测试什么
看着它失败 → 确认测试测试的是真实行为，而不是模拟
最小化实现 → 防止仅测试方法混入
真实依赖 → 在模拟之前，你看到测试实际需要什么

如果你在测试模拟行为，你就违反了 TDD - 你在没有先看到测试针对真实代码失败的情况下添加了模拟。

反模式	修复方法
断言模拟元素	测试真实组件或取消模拟
生产代码中的仅测试方法	移动到测试工具中
在不理解的情况下模拟	先理解依赖关系，最小化模拟
不完整的模拟	完全镜像真实 API
测试作为事后想法	TDD - 测试先行
过度复杂的模拟	考虑集成测试

断言检查 *-mock 测试 ID
方法仅在测试文件中调用
模拟设置占测试的 50% 以上
移除模拟时测试失败
无法解释为什么需要模拟
模拟"只是为了安全起见"

模拟是隔离的工具，而不是要测试的东西。

如果 TDD 揭示你正在测试模拟行为，你就走错了路。

修复方法：测试真实行为，或者质疑你为什么需要模拟。

🇺🇸English

Test-Driven Development (TDD)

Overview

Write the test first. Watch it fail. Write minimal code to pass.

Core principle: If you didn't watch the test fail, you don't know if it tests the right thing.

Violating the letter of the rules is violating the spirit of the rules.

When to Use

Always:

New features
Bug fixes
Refactoring
Behavior changes

Exceptions (ask your human partner):

Throwaway prototypes
Generated code
Configuration files

Thinking "skip TDD just this once"? Stop. That's rationalization.

The Iron Law

NO PRODUCTION CODE WITHOUT A FAILING TEST FIRST

Write code before the test? Delete it. Start over.

No exceptions:

Don't keep it as "reference"
Don't "adapt" it while writing tests
Don't look at it
Delete means delete

Implement fresh from tests. Period.

Red-Green-Refactor

digraph tdd_cycle {
    rankdir=LR;
    red [label="RED\nWrite failing test", shape=box, style=filled, fillcolor="#ffcccc"];
    verify_red [label="Verify fails\ncorrectly", shape=diamond];
    green [label="GREEN\nMinimal code", shape=box, style=filled, fillcolor="#ccffcc"];
    verify_green [label="Verify passes\nAll green", shape=diamond];
    refactor [label="REFACTOR\nClean up", shape=box, style=filled, fillcolor="#ccccff"];
    next [label="Next", shape=ellipse];

    red -> verify_red;
    verify_red -> green [label="yes"];
    verify_red -> red [label="wrong\nfailure"];
    green -> verify_green;
    verify_green -> refactor [label="yes"];
    verify_green -> green [label="no"];
    refactor -> verify_green [label="stay\ngreen"];
    verify_green -> next;
    next -> red;
}

RED - Write Failing Test

Write one minimal test showing what should happen.

const result = await retryOperation(operation);

expect(result).toBe('success'); expect(attempts).toBe(3); });

Clear name, tests real behavior, one thing
</Good>

<Bad>
```typescript
test('retry works', async () => {
  const mock = jest.fn()
    .mockRejectedValueOnce(new Error())
    .mockRejectedValueOnce(new Error())
    .mockResolvedValueOnce('success');
  await retryOperation(mock);
  expect(mock).toHaveBeenCalledTimes(3);
});

Vague name, tests mock not code

Requirements:

One behavior
Clear name
Real code (no mocks unless unavoidable)

Verify RED - Watch It Fail

MANDATORY. Never skip.

npm test path/to/test.test.ts

Confirm:

Test fails (not errors)
Failure message is expected
Fails because feature missing (not typos)

Test passes? You're testing existing behavior. Fix test.

Test errors? Fix error, re-run until it fails correctly.

GREEN - Minimal Code

Write simplest code to pass the test.

Don't add features, refactor other code, or "improve" beyond the test.

Verify GREEN - Watch It Pass

MANDATORY.

npm test path/to/test.test.ts

Confirm:

Test passes
Other tests still pass
Output pristine (no errors, warnings)

Test fails? Fix code, not test.

Other tests fail? Fix now.

REFACTOR - Clean Up

After green only:

Remove duplication
Improve names
Extract helpers

Keep tests green. Don't add behavior.

Repeat

Next failing test for next feature.

Good Tests

Quality	Good	Bad
Minimal	One thing. "and" in name? Split it.	`test('validates email and domain and whitespace')`
Clear	Name describes behavior	`test('test1')`
Shows intent	Demonstrates desired API	Obscures what code should do

Why Order Matters

"I'll write tests after to verify it works"

Tests written after code pass immediately. Passing immediately proves nothing:

Might test wrong thing
Might test implementation, not behavior
Might miss edge cases you forgot
You never saw it catch the bug

Test-first forces you to see the test fail, proving it actually tests something.

"I already manually tested all the edge cases"

Manual testing is ad-hoc. You think you tested everything but:

No record of what you tested
Can't re-run when code changes
Easy to forget cases under pressure
"It worked when I tried it" ≠ comprehensive

Automated tests are systematic. They run the same way every time.

"Deleting X hours of work is wasteful"

Sunk cost fallacy. The time is already gone. Your choice now:

Delete and rewrite with TDD (X more hours, high confidence)
Keep it and add tests after (30 min, low confidence, likely bugs)

The "waste" is keeping code you can't trust. Working code without real tests is technical debt.

"TDD is dogmatic, being pragmatic means adapting"

TDD IS pragmatic:

Finds bugs before commit (faster than debugging after)
Prevents regressions (tests catch breaks immediately)
Documents behavior (tests show how to use code)
Enables refactoring (change freely, tests catch breaks)

"Pragmatic" shortcuts = debugging in production = slower.

"Tests after achieve the same goals - it's spirit not ritual"

No. Tests-after answer "What does this do?" Tests-first answer "What should this do?"

Tests-after are biased by your implementation. You test what you built, not what's required. You verify remembered edge cases, not discovered ones.

Tests-first force edge case discovery before implementing. Tests-after verify you remembered everything (you didn't).

30 minutes of tests after ≠ TDD. You get coverage, lose proof tests work.

Common Rationalizations

Excuse	Reality
"Too simple to test"	Simple code breaks. Test takes 30 seconds.
"I'll test after"	Tests passing immediately prove nothing.
"Tests after achieve same goals"	Tests-after = "what does this do?" Tests-first = "what should this do?"
"Already manually tested"	Ad-hoc ≠ systematic. No record, can't re-run.
"Deleting X hours is wasteful"	Sunk cost fallacy. Keeping unverified code is technical debt.
"Keep as reference, write tests first"	You'll adapt it. That's testing after. Delete means delete.
"Need to explore first"	Fine. Throw away exploration, start with TDD.
"Test hard = design unclear"	Listen to test. Hard to test = hard to use.
"TDD will slow me down"	TDD faster than debugging. Pragmatic = test-first.
"Manual test faster"	Manual doesn't prove edge cases. You'll re-test every change.
"Existing code has no tests"	You're improving it. Add tests for existing code.

Red Flags - STOP and Start Over

Code before test
Test after implementation
Test passes immediately
Can't explain why test failed
Tests added "later"
Rationalizing "just this once"
"I already manually tested it"
"Tests after achieve the same purpose"
"It's about spirit not ritual"
"Keep as reference" or "adapt existing code"
"Already spent X hours, deleting is wasteful"
"TDD is dogmatic, I'm being pragmatic"
"This is different because..."

All of these mean: Delete code. Start over with TDD.

Example: Bug Fix

Bug: Empty email accepted

RED

test('rejects empty email', async () => {
  const result = await submitForm({ email: '' });
  expect(result.error).toBe('Email required');
});

Verify RED

$ npm test
FAIL: expected 'Email required', got undefined

GREEN

function submitForm(data: FormData) {
  if (!data.email?.trim()) {
    return { error: 'Email required' };
  }
  // ...
}

Verify GREEN

$ npm test
PASS

REFACTOR Extract validation for multiple fields if needed.

Verification Checklist

Before marking work complete:

Every new function/method has a test
Watched each test fail before implementing
Each test failed for expected reason (feature missing, not typo)
Wrote minimal code to pass each test
All tests pass
Output pristine (no errors, warnings)
Tests use real code (mocks only if unavoidable)
Edge cases and errors covered

Can't check all boxes? You skipped TDD. Start over.

When Stuck

Problem	Solution
Don't know how to test	Write wished-for API. Write assertion first. Ask your human partner.
Test too complicated	Design too complicated. Simplify interface.
Must mock everything	Code too coupled. Use dependency injection.
Test setup huge	Extract helpers. Still complex? Simplify design.

Debugging Integration

Bug found? Write failing test reproducing it. Follow TDD cycle. Test proves fix and prevents regression.

Never fix bugs without a test.

Final Rule

Production code → test exists and failed first
Otherwise → not TDD

No exceptions without your human partner's permission.

Testing Anti-Patterns

Overview

Tests must verify real behavior, not mock behavior. Mocks are a means to isolate, not the thing being tested.

Core principle: Test what the code does, not what the mocks do.

Following strict TDD prevents these anti-patterns.

The Iron Laws

1. NEVER test mock behavior
2. NEVER add test-only methods to production classes
3. NEVER mock without understanding dependencies

Anti-Pattern 1: Testing Mock Behavior

The violation:

// ❌ BAD: Testing that the mock exists
test('renders sidebar', () => {
  render(<Page />);
  expect(screen.getByTestId('sidebar-mock')).toBeInTheDocument();
});

Why this is wrong:

You're verifying the mock works, not that the component works
Test passes when mock is present, fails when it's not
Tells you nothing about real behavior

your human partner's correction: "Are we testing the behavior of a mock?"

The fix:

// ✅ GOOD: Test real component or don't mock it
test('renders sidebar', () => {
  render(<Page />);  // Don't mock sidebar
  expect(screen.getByRole('navigation')).toBeInTheDocument();
});

// OR if sidebar must be mocked for isolation:
// Don't assert on the mock - test Page's behavior with sidebar present

Gate Function

BEFORE asserting on any mock element:
  Ask: "Am I testing real component behavior or just mock existence?"

  IF testing mock existence:
    STOP - Delete the assertion or unmock the component

  Test real behavior instead

Anti-Pattern 2: Test-Only Methods in Production

The violation:

// ❌ BAD: destroy() only used in tests
class Session {
  async destroy() {  // Looks like production API!
    await this._workspaceManager?.destroyWorkspace(this.id);
    // ... cleanup
  }
}

// In tests
afterEach(() => session.destroy());

Why this is wrong:

Production class polluted with test-only code
Dangerous if accidentally called in production
Violates YAGNI and separation of concerns
Confuses object lifecycle with entity lifecycle

The fix:

// ✅ GOOD: Test utilities handle test cleanup
// Session has no destroy() - it's stateless in production

// In test-utils/
export async function cleanupSession(session: Session) {
  const workspace = session.getWorkspaceInfo();
  if (workspace) {
    await workspaceManager.destroyWorkspace(workspace.id);
  }
}

// In tests
afterEach(() => cleanupSession(session));

Gate Function

BEFORE adding any method to production class:
  Ask: "Is this only used by tests?"

  IF yes:
    STOP - Don't add it
    Put it in test utilities instead

  Ask: "Does this class own this resource's lifecycle?"

  IF no:
    STOP - Wrong class for this method

Anti-Pattern 3: Mocking Without Understanding

The violation:

// ❌ BAD: Mock breaks test logic
test('detects duplicate server', () => {
  // Mock prevents config write that test depends on!
  vi.mock('ToolCatalog', () => ({
    discoverAndCacheTools: vi.fn().mockResolvedValue(undefined)
  }));

  await addServer(config);
  await addServer(config);  // Should throw - but won't!
});

Why this is wrong:

Mocked method had side effect test depended on (writing config)
Over-mocking to "be safe" breaks actual behavior
Test passes for wrong reason or fails mysteriously

The fix:

// ✅ GOOD: Mock at correct level
test('detects duplicate server', () => {
  // Mock the slow part, preserve behavior test needs
  vi.mock('MCPServerManager'); // Just mock slow server startup

  await addServer(config);  // Config written
  await addServer(config);  // Duplicate detected ✓
});

Gate Function

BEFORE mocking any method:
  STOP - Don't mock yet

  1. Ask: "What side effects does the real method have?"
  2. Ask: "Does this test depend on any of those side effects?"
  3. Ask: "Do I fully understand what this test needs?"

  IF depends on side effects:
    Mock at lower level (the actual slow/external operation)
    OR use test doubles that preserve necessary behavior
    NOT the high-level method the test depends on

  IF unsure what test depends on:
    Run test with real implementation FIRST
    Observe what actually needs to happen
    THEN add minimal mocking at the right level

  Red flags:
    - "I'll mock this to be safe"
    - "This might be slow, better mock it"
    - Mocking without understanding the dependency chain

Anti-Pattern 4: Incomplete Mocks

The violation:

// ❌ BAD: Partial mock - only fields you think you need
const mockResponse = {
  status: 'success',
  data: { userId: '123', name: 'Alice' }
  // Missing: metadata that downstream code uses
};

// Later: breaks when code accesses response.metadata.requestId

Why this is wrong:

Partial mocks hide structural assumptions - You only mocked fields you know about
Downstream code may depend on fields you didn't include - Silent failures
Tests pass but integration fails - Mock incomplete, real API complete
False confidence - Test proves nothing about real behavior

The Iron Rule: Mock the COMPLETE data structure as it exists in reality, not just fields your immediate test uses.

The fix:

// ✅ GOOD: Mirror real API completeness
const mockResponse = {
  status: 'success',
  data: { userId: '123', name: 'Alice' },
  metadata: { requestId: 'req-789', timestamp: 1234567890 }
  // All fields real API returns
};

Gate Function

BEFORE creating mock responses:
  Check: "What fields does the real API response contain?"

  Actions:
    1. Examine actual API response from docs/examples
    2. Include ALL fields system might consume downstream
    3. Verify mock matches real response schema completely

  Critical:
    If you're creating a mock, you must understand the ENTIRE structure
    Partial mocks fail silently when code depends on omitted fields

  If uncertain: Include all documented fields

Anti-Pattern 5: Integration Tests as Afterthought

The violation:

✅ Implementation complete
❌ No tests written
"Ready for testing"

Why this is wrong:

Testing is part of implementation, not optional follow-up
TDD would have caught this
Can't claim complete without tests

The fix:

TDD cycle:
1. Write failing test
2. Implement to pass
3. Refactor
4. THEN claim complete

When Mocks Become Too Complex

Warning signs:

Mock setup longer than test logic
Mocking everything to make test pass
Mocks missing methods real components have
Test breaks when mock changes

your human partner's question: "Do we need to be using a mock here?"

Consider: Integration tests with real components often simpler than complex mocks

TDD Prevents These Anti-Patterns

Why TDD helps:

Write test first → Forces you to think about what you're actually testing
Watch it fail → Confirms test tests real behavior, not mocks
Minimal implementation → No test-only methods creep in
Real dependencies → You see what the test actually needs before mocking

If you're testing mock behavior, you violated TDD - you added mocks without watching test fail against real code first.

Quick Reference

Anti-Pattern	Fix
Assert on mock elements	Test real component or unmock it
Test-only methods in production	Move to test utilities
Mock without understanding	Understand dependencies first, mock minimally
Incomplete mocks	Mirror real API completely
Tests as afterthought	TDD - tests first
Over-complex mocks	Consider integration tests

Red Flags

Assertion checks for *-mock test IDs
Methods only called in test files
Mock setup is >50% of test
Test fails when you remove mock
Can't explain why mock is needed
Mocking "just to be safe"

The Bottom Line

Mocks are tools to isolate, not things to test.

If TDD reveals you're testing mock behavior, you've gone wrong.

Fix: Test real behavior or question why you're mocking at all.

Weekly Installs

230

Repository

neolabhq/contex…ring-kit

GitHub Stars

699

First Seen

Feb 19, 2026

Installed on

opencode219

codex216

github-copilot216

gemini-cli215

amp213

kimi-cli213

Vue 3 调试指南：解决响应式、计算属性与监听器常见错误

11,400 周安装

测试驱动开发TDD完整指南：红绿重构循环、核心原则与实践方法

🇨🇳中文介绍

测试驱动开发 (TDD)

概述

何时使用

铁律

红-绿-重构

相关 Skills

红 - 编写失败的测试

验证红 - 看着它失败

绿 - 最少的代码

验证绿 - 看着它通过

重构 - 清理

重复

好的测试

为什么顺序很重要

常见的合理化借口

危险信号 - 停止并重新开始

示例：错误修复

验证清单

当遇到困难时

调试集成

最终规则

测试反模式

概述

铁律

反模式 1：测试模拟行为

门控函数

反模式 2：生产代码中的仅测试方法

门控函数

反模式 3：在不理解的情况下模拟

门控函数

反模式 4：不完整的模拟

门控函数

反模式 5：集成测试作为事后想法

当模拟变得过于复杂时

TDD 防止这些反模式

快速参考

危险信号

底线

🇺🇸English

Test-Driven Development (TDD)

Overview

When to Use

The Iron Law

Red-Green-Refactor

RED - Write Failing Test

Verify RED - Watch It Fail

GREEN - Minimal Code

Verify GREEN - Watch It Pass

REFACTOR - Clean Up

Repeat

Good Tests

Why Order Matters

Common Rationalizations

Red Flags - STOP and Start Over

Example: Bug Fix

Verification Checklist

When Stuck

Debugging Integration

Final Rule

Testing Anti-Patterns

Overview

The Iron Laws

Anti-Pattern 1: Testing Mock Behavior

Gate Function

Anti-Pattern 2: Test-Only Methods in Production

Gate Function

Anti-Pattern 3: Mocking Without Understanding

Gate Function

Anti-Pattern 4: Incomplete Mocks

Gate Function

Anti-Pattern 5: Integration Tests as Afterthought

When Mocks Become Too Complex

TDD Prevents These Anti-Patterns

Quick Reference

Red Flags

The Bottom Line

最新 Skills