Grove Testing Skill Guide: Write Valuable Integration Tests Efficiently to Improve Code Quality and Development Confidence

grove-testing by autumnsgrove/groveengine

64 weekly installs

2 GitHub Stars

Install command

npx skills add https://github.com/autumnsgrove/groveengine --skill grove-testing

Software Engineering JavaScript Framework Testing

Grove Testing Skill

When to Activate

Activate this skill when:

Deciding what to test (not just how)
Writing tests for new Grove features
Reviewing existing tests for effectiveness
Asked to "add tests" without specific guidance
Evaluating whether tests are providing real value
Refactoring causes many tests to break (symptom of bad tests)

For technical implementation (Vitest syntax, mocking patterns, assertions), use the javascript-testing skill alongside this one.

The Testing Philosophy

"Write tests. Not too many. Mostly integration." — Guillermo Rauch

This captures everything Grove believes about testing:

Write tests — Automated tests are worthwhile. They enable confident refactoring, serve as documentation, and catch regressions before users do.

Not too many — Tests have diminishing returns. The goal isn't coverage numbers. It's confidence. When you feel confident shipping, you have enough tests.

Mostly integration — Integration tests catch real problems without being brittle. They test behavior users actually experience, not internal implementation.

The Guiding Principle

"The more your tests resemble the way your software is used, the more confidence they can give you." — Kent C. Dodds (Testing Library)

Ask yourself: Does this test fail when the feature breaks? If yes, it's valuable. If it only fails during refactors, it's testing implementation details.

What Makes a Test Valuable

A good test has these properties (adapted from Kent Beck's Test Desiderata):

Property	What It Means
Behavior-sensitive	Fails when actual functionality breaks
Structure-immune	Doesn't break when you refactor safely
Deterministic	Same result every time, no flakiness
Fast	Gives feedback in seconds, not minutes
Clear diagnosis	When it fails, you know exactly what broke
Cheap to write	Effort proportional to code complexity
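
As one illustration of the Deterministic property, a time-dependent function can take its clock as a parameter so that a test can pin the current time. This is only a sketch: `Session` and `isSessionExpired` are hypothetical names, not part of Grove.

```typescript
// Hypothetical session shape, for illustration only.
interface Session {
	expiresAt: number; // Unix epoch milliseconds
}

// The clock is a parameter: production code relies on the Date.now default,
// while a test passes a function that returns a fixed value.
function isSessionExpired(session: Session, now: () => number = Date.now): boolean {
	return now() >= session.expiresAt;
}

// With a pinned clock, the answers never depend on when the check runs.
const sampleSession: Session = { expiresAt: 1000 };
const expiredBeforeDeadline = isSessionExpired(sampleSession, () => 999);
const expiredAtDeadline = isSessionExpired(sampleSession, () => 1000);
```

Injecting the clock keeps the test fast as well, since nothing has to wait for real time to pass.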

The Confidence Test

Before writing a test, ask:

Would I notice if this broke in production? If yes, test it.
Would this test fail if the feature broke? If no, don't write it.
Does this test resemble how users interact with the feature? If no, reconsider.

What NOT to Test

Not everything needs tests. Some things actively harm your codebase when tested.

Skip Testing

What	Why
Trivial code	Getters, setters, data models with no logic
Framework behavior	Trust that SvelteKit routing works
Implementation details	Internal state, private methods, CSS classes
One-off scripts	Maintenance cost exceeds value
Volatile prototypes	Requirements unclear, code will change

Test Lightly

What	Approach
Configuration	Smoke test that it loads, not every option
Third-party integrations	Mock at boundaries, test your code's response
Visual design	Snapshot tests or visual regression, not unit tests

Test Thoroughly

What	Why
Business logic	Core value of the application
User-facing flows	What users actually experience
Edge cases	Error states, empty states, boundaries
Bug fixes	Every bug becomes a test to prevent regression

The Testing Trophy

Grove follows Kent C. Dodds's Testing Trophy rather than the older Testing Pyramid:

                    ╭─────────╮
                    │   E2E   │  ← Few: critical user journeys
                    ╰────┬────╯
               ╭─────────┴─────────╮
               │   Integration     │  ← Many: this is where confidence lives
               ╰─────────┬─────────╯
                  ╭──────┴──────╮
                  │    Unit     │  ← Some: pure functions, algorithms
                  ╰──────┬──────╯
              ╭──────────┴──────────╮
              │   Static Analysis   │  ← TypeScript, ESLint (always on)
              ╰─────────────────────╯

What Each Layer Does

Static Analysis (TypeScript, ESLint)

Catches typos, type errors, obvious mistakes
Zero runtime cost, always running
This is your first line of defense

Unit Tests

Pure functions, algorithms, utilities
Fast, isolated, easy to debug
Don't mock everything—test real behavior where practical
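
A pure function is the easiest thing to unit test: inputs in, outputs out, nothing to mock. The `slugify` helper below is a hypothetical example, not Grove code. The checks use plain comparisons so the sketch runs without a test runner; in a project, each case would likely be a Vitest `it` block.

```typescript
// Hypothetical pure function: turns a post title into a URL slug.
function slugify(title: string): string {
	return title
		.toLowerCase()
		.trim()
		.replace(/[^a-z0-9]+/g, "-") // collapse runs of non-alphanumerics into one dash
		.replace(/^-+|-+$/g, ""); // drop leading and trailing dashes
}

// Table of cases: typical input, punctuation, and the empty edge case.
const slugCases: Array<[string, string]> = [
	["Hello World", "hello-world"],
	["  Spaces & Symbols!  ", "spaces-symbols"],
	["", ""],
];

// Collect every case whose actual output differs from the expected one.
const slugFailures = slugCases.filter(([input, expected]) => slugify(input) !== expected);
```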

Integration Tests (THE SWEET SPOT)

Multiple units working together
Tests behavior users actually experience
Less brittle than unit tests, faster than E2E
This is where most of your tests should live

E2E Tests (Playwright)

Critical user journeys only: login, checkout, core flows
Expensive to write and maintain
Reserve for flows where failure = business impact

Writing Effective Tests

Structure: Arrange-Act-Assert

Every test should follow this pattern:

it("should reject invalid email during registration", async () => {
	// Arrange: Set up the scenario
	const invalidEmail = "not-an-email";

	// Act: Do the thing
	const result = await registerUser({ email: invalidEmail, password: "valid123" });

	// Assert: Check the outcome
	expect(result.success).toBe(false);
	expect(result.error).toContain("email");
});

The Act section should be one line. If it's not, the test is probably doing too much.
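
The test above assumes a `registerUser` function that resolves to a result object instead of throwing. A minimal stand-in with that shape might look like the following. The validation rules, the `validateRegistration` helper, and both interfaces are assumptions made for illustration, not Grove's actual implementation.

```typescript
interface RegistrationInput {
	email: string;
	password: string;
}

// Assumed result shape, inferred from the assertions in the test above.
interface RegistrationResult {
	success: boolean;
	error?: string;
}

// Hypothetical validation step: reports problems as data rather than exceptions.
function validateRegistration(input: RegistrationInput): RegistrationResult {
	// Deliberately simple check: something@something.tld
	if (!/^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(input.email)) {
		return { success: false, error: "Invalid email address" };
	}
	if (input.password.length < 8) {
		return { success: false, error: "Password must be at least 8 characters" };
	}
	return { success: true };
}

// Hypothetical async entry point, matching the call made in the test above.
async function registerUser(input: RegistrationInput): Promise<RegistrationResult> {
	const validation = validateRegistration(input);
	if (!validation.success) return validation;
	// A real implementation would persist the user here.
	return { success: true };
}

const invalidEmailResult = validateRegistration({ email: "not-an-email", password: "valid123" });
const validRegistration = registerUser({ email: "reader@example.com", password: "valid123" });
```

Returning errors as data is what lets the Act step stay a single call with no try/catch around it.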

Naming: Say What Breaks

Test names should describe the behavior, not the implementation:

Good names:

should reject registration with invalid email
should show error message when API fails
should preserve draft when navigating away

Bad names:

test email validation (what about it?)
handleSubmit works (what does "works" mean?)
test case 1 (no)

Test One Thing

Each test should have one reason to fail. If a test fails, you should immediately know what broke.

// Bad: Testing multiple things
it('should handle registration', async () => {
    // Tests validation, API call, redirect, AND email sending
});

// Good: Focused tests
it('should reject invalid email format', ...);
it('should call API with valid data', ...);
it('should redirect after successful registration', ...);
it('should send welcome email after registration', ...);

Testing Trust Boundaries (Rootwork)

Every trust boundary should have tests for both valid and invalid data:

Form actions: Submit with missing fields, wrong types, edge-case values → verify parseFormData() returns structured errors (not crashes)

KV reads: Mock KV returning corrupted/stale JSON → verify safeJsonParse() falls back to default

Cache reads: Mock cache service returning unexpected shapes → verify createTypedCacheReader() uses fallback

Catch blocks: Trigger redirects and HTTP errors → verify isRedirect()/isHttpError() route them correctly

Reference: Rootwork (@autumnsgrove/lattice/server) provides parseFormData, safeJsonParse, createTypedCacheReader, isRedirect, and isHttpError.
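
To make the KV case concrete, here is a self-contained sketch of the behavior such a test would pin down: parse untrusted JSON, validate its shape, and fall back to a default. `parseJsonOrDefault`, `isThemeSettings`, and `ThemeSettings` are local stand-ins written for this example. Rootwork's real `safeJsonParse` may have a different signature, so treat this as an assumption and check the library before relying on it.

```typescript
// Hypothetical settings shape stored in KV.
interface ThemeSettings {
	theme: "light" | "dark";
	fontSize: number;
}

const DEFAULT_SETTINGS: ThemeSettings = { theme: "light", fontSize: 16 };

// Type guard: accepts only objects with the expected fields and types.
function isThemeSettings(value: unknown): value is ThemeSettings {
	if (typeof value !== "object" || value === null) return false;
	const candidate = value as Record<string, unknown>;
	return (
		(candidate["theme"] === "light" || candidate["theme"] === "dark") &&
		typeof candidate["fontSize"] === "number"
	);
}

// Local stand-in (NOT Rootwork's API): never throws, always returns a usable value.
function parseJsonOrDefault<T>(
	raw: string | null,
	isValid: (value: unknown) => value is T,
	fallback: T,
): T {
	if (raw === null) return fallback;
	try {
		const parsed: unknown = JSON.parse(raw);
		return isValid(parsed) ? parsed : fallback;
	} catch {
		return fallback; // corrupted JSON
	}
}

// The cases a trust-boundary test would cover: valid, corrupted, wrong shape, missing.
const validRead = parseJsonOrDefault('{"theme":"dark","fontSize":18}', isThemeSettings, DEFAULT_SETTINGS);
const corruptedRead = parseJsonOrDefault('{"theme":"da', isThemeSettings, DEFAULT_SETTINGS);
const wrongShapeRead = parseJsonOrDefault('{"theme":42}', isThemeSettings, DEFAULT_SETTINGS);
const missingRead = parseJsonOrDefault(null, isThemeSettings, DEFAULT_SETTINGS);
```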

Integration Tests in Practice

Integration tests are the heart of Grove's testing strategy. Here's how to write them well.

Test User Behavior, Not Implementation

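import { render, screen, fireEvent } from "@testing-library/svelte";
import LoginForm from "./LoginForm.svelte"; // component path assumed for illustration
// toBeInTheDocument() assumes @testing-library/jest-dom matchers are registered

// Bad: Testing implementation
it("should set isLoading state to true", async () => {
	const { component } = render(LoginForm);
	await fireEvent.click(screen.getByRole("button"));
	expect(component.isLoading).toBe(true); // Testing internal state!
});

// Good: Testing user experience
it("should show loading indicator while logging in", async () => {
	render(LoginForm);
	await fireEvent.click(screen.getByRole("button", { name: /sign in/i }));
	expect(screen.getByRole("progressbar")).toBeInTheDocument();
});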

Use Accessible Queries

Query elements the way users find them:

// Priority order (best to worst):
getByRole("button", { name: /submit/i }); // How screen readers see it
getByLabelText("Email"); // Form fields
getByText("Welcome back"); // Visible text
getByTestId("login-form"); // Last resort

Don't Over-Mock

Mocks remove confidence in the integration. Use them sparingly:

// Over-mocked: False confidence
vi.mock("./api");
vi.mock("./validation");
vi.mock("./utils");
// You're testing... nothing real

// Better: Mock at boundaries
vi.mock("./external-api"); // Mock the network, not your code
// Let validation, utils, etc. run for real

Rule of thumb: If you're mocking something you wrote, reconsider.
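
One way to keep fakes at the boundary, sketched here without `vi.mock`, is to pass the external dependency in as a parameter. Everything named below (`EmailGateway`, `subscribe`, `isValidEmail`, `createFakeGateway`) is hypothetical. Only the gateway, which stands for the network, is faked; the validation you wrote runs for real.

```typescript
// The boundary: something that talks to an external email service.
interface EmailGateway {
	send(to: string, subject: string): Promise<void>;
}

// Your own code: real validation, not mocked.
function isValidEmail(email: string): boolean {
	return /^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(email);
}

// Unit under test: real logic plus an injected boundary.
async function subscribe(email: string, gateway: EmailGateway): Promise<{ ok: boolean }> {
	if (!isValidEmail(email)) return { ok: false };
	await gateway.send(email, "Welcome!");
	return { ok: true };
}

// Hand-written fake for the boundary: records recipients instead of hitting the network.
function createFakeGateway(): EmailGateway & { sent: string[] } {
	const sent: string[] = [];
	return {
		sent,
		async send(to: string): Promise<void> {
			sent.push(to);
		},
	};
}

// Exercise the valid and invalid paths against the same fake.
const fakeGateway = createFakeGateway();
const subscribeOutcomes = Promise.all([
	subscribe("reader@example.com", fakeGateway),
	subscribe("not-an-email", fakeGateway),
]);
```

Because `isValidEmail` is never replaced, a regression in the validation logic would still make the invalid-address case fail.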

When Tests Break

Tests that break are telling you something. Listen.

Good Breaks (Expected)

Feature changed — Test caught that behavior shifted. Update the test.
Bug fixed — Old test was wrong. Fix it.
Requirement changed — Test reflects old requirement. Update it.

Bad Breaks (Symptoms of Poor Tests)

Refactored internal code — Test was coupled to implementation. Rewrite it.
Changed CSS class — Test was querying implementation details. Use accessible queries.
Reordered code — Test depended on execution order. Make it order-independent.

If refactoring frequently breaks tests, your tests are testing the wrong things.
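
The "reordered code" symptom usually comes from state shared between tests. A common fix is to build fresh state inside each test through a factory. The sketch below uses a hypothetical `createCart` helper and plain functions in place of test cases, to show that the results no longer depend on the order in which the checks run.

```typescript
// Hypothetical cart used by two independent checks.
interface Cart {
	items: string[];
	add(item: string): void;
}

// Factory: every caller gets its own cart, so no check can leak state into another.
function createCart(): Cart {
	const items: string[] = [];
	return {
		items,
		add(item: string): void {
			items.push(item);
		},
	};
}

// Each check builds what it needs and returns whether it passed.
function addsOneItem(): boolean {
	const cart = createCart();
	cart.add("notebook");
	return cart.items.length === 1;
}

function startsEmpty(): boolean {
	const cart = createCart();
	return cart.items.length === 0;
}

// Running the checks in either order gives the same results.
const forwardOrder = [addsOneItem(), startsEmpty()];
const reverseOrder = [startsEmpty(), addsOneItem()];
```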

The Bug → Test Pipeline

Every production bug should become a test:

1. Bug reported: User can't check out with certain items
2. Reproduce locally: Find the exact conditions
3. Write failing test: Captures the bug's conditions
4. Fix the bug: Test now passes
5. Test prevents regression: The bug can't come back unnoticed

This is one of the highest-value testing practices. It turns pain into protection.
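
As a sketch of the last three steps, suppose (purely for illustration) that the checkout bug turned out to be a price check that rejected free items. The `LineItem` type, the `canCheckout` function, and the cause of the bug are all invented for this example.

```typescript
interface LineItem {
	sku: string;
	priceCents: number;
	quantity: number;
}

// Fixed version. The hypothetical bug: the original condition was
// `item.priceCents > 0`, which rejected carts containing free items.
function canCheckout(items: LineItem[]): boolean {
	if (items.length === 0) return false;
	return items.every((item) => item.priceCents >= 0 && item.quantity > 0);
}

// Regression case: captures the exact condition from the bug report.
const cartWithFreeItem: LineItem[] = [
	{ sku: "sticker-free", priceCents: 0, quantity: 1 },
	{ sku: "notebook", priceCents: 1200, quantity: 2 },
];
const freeItemCheckout = canCheckout(cartWithFreeItem);

// Neighbouring behavior that must keep working after the fix.
const emptyCartCheckout = canCheckout([]);
```

Written before the fix, the free-item case fails; once the condition is corrected it passes and stays in the suite as a guard.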

Anti-Patterns to Avoid

The Ice Cream Cone

        ╭───────────────────────────╮
        │      Many E2E tests       │  ← Slow, brittle, expensive
        ╰───────────┬───────────────╯
              ╭─────┴─────╮
              │ Few int.  │
              ╰─────┬─────╯
                ╭───┴───╮
                │ Few   │
                │ unit  │
                ╰───────╯

This is backwards. E2E tests are expensive. Integration tests give the best ROI.

Testing Implementation Details

// Testing implementation (bad)
expect(component.state.items).toHaveLength(3);
expect(handleClick).toHaveBeenCalledWith({ id: 1 });

// Testing behavior (good)
expect(getByRole("list").children).toHaveLength(3);
expect(getByText("Item added!")).toBeInTheDocument();

Coverage Theater

Chasing 100% coverage leads to bad tests:

// Written only to hit coverage, provides zero value
it("should have properties", () => {
	const user = new User();
	expect(user.email).toBeDefined();
	expect(user.name).toBeDefined();
});

Coverage is a signal, not a goal. High coverage with bad tests is worse than moderate coverage with good tests.

Snapshot Abuse

Snapshots are useful for:

Complex serialized output
Error message formatting
API response shapes

Snapshots are harmful for:

UI components (break on every style change)
Anything with timestamps or random IDs
Large objects (nobody reviews 500-line snapshot diffs)
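
When a response shape is worth snapshotting but carries timestamps or random IDs, one option is to replace the volatile fields with placeholders before comparing. This is a sketch under assumptions: `ApiResponse`, its field names, and `stabilize` are hypothetical.

```typescript
// Hypothetical API response with two volatile fields.
interface ApiResponse {
	id: string;
	createdAt: string;
	title: string;
	tags: string[];
}

// Replace volatile values with fixed placeholders so the output is stable across runs.
function stabilize(response: ApiResponse): ApiResponse {
	return { ...response, id: "<id>", createdAt: "<timestamp>" };
}

// Two responses that differ only in their volatile fields...
const firstResponse: ApiResponse = {
	id: "a1b2c3",
	createdAt: "2026-02-05T10:00:00.000Z",
	title: "Hello",
	tags: ["intro"],
};
const secondResponse: ApiResponse = {
	id: "z9y8x7",
	createdAt: "2026-02-06T12:30:00.000Z",
	title: "Hello",
	tags: ["intro"],
};

// ...serialize identically once stabilized, so a snapshot of either would not churn.
const firstStable = JSON.stringify(stabilize(firstResponse));
const secondStable = JSON.stringify(stabilize(secondResponse));
```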

The Grove Testing Workflow

When asked to add tests, follow this workflow:

1. Understand the Feature

What does this feature do for users? Not how it's implemented—what value does it provide?

2. Identify Critical Paths

What would break if this feature failed? Those are your test cases.

3. Write Integration Tests First

Start with tests that exercise real user behavior. Add unit tests only for complex logic.

4. Keep Tests Close to Code

src/
└── lib/
    └── features/
        └── auth/
            ├── login.ts
            ├── login.test.ts      ← Right next to the code
            └── register.ts

5. Run Tests Continuously

npx vitest              # Watch mode during development
npx vitest run          # CI verification

Quick Decision Guide

Situation	Action
New feature	Write integration tests for user-facing behavior
Bug fix	Write test that reproduces bug first, then fix
Refactoring	Run existing tests; if they break on safe changes, they're bad tests
"Need more coverage"	Add tests for uncovered behavior, not uncovered lines
Pure function/algorithm	Unit test it
API endpoint	Integration test with mocked external services
UI component	Component test with Testing Library
Critical user flow	E2E test with Playwright

Integration with Other Skills

javascript-testing

Use javascript-testing for:

Vitest configuration syntax
Mocking patterns and APIs
Assertion reference
SvelteKit-specific test patterns

grove-documentation

When writing test descriptions, follow Grove voice:

Clear, direct names
No jargon
Say what the user experiences

code-quality

Run linting and type checking before/after writing tests. Static analysis catches different bugs than tests do.

Self-Review Checklist

Before considering tests "done":

Tests describe user behavior, not implementation
Each test has one clear reason to fail
Tests use accessible queries (getByRole, getByLabelText)
Mocks are limited to external boundaries
Test names explain what breaks when they fail
No snapshot tests for volatile content
Bug fixes include regression tests
Tests run fast (seconds, not minutes)

Good tests let you ship with confidence. That's the whole point.

Repository: autumnsgrove/groveengine

First Seen: Feb 5, 2026

Security Audits: Gen Agent Trust Hub (Pass), Socket (Pass), Snyk (Pass)

Installed on: opencode (48), gemini-cli (48), codex (48), github-copilot (47), amp (47), cline (47)
