e2e-tests-studio by mastra-ai/mastra
npx skills add https://github.com/mastra-ai/mastra --skill e2e-tests-studio
CRITICAL: Tests must verify that product features WORK correctly, not just that UI elements render.
Requires Playwright MCP server. If the browser_navigate tool is unavailable, instruct the user to add it:
claude mcp add playwright -- npx @playwright/mcp@latest
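If you prefer to register the server by hand instead of via the CLI, the equivalent project-level entry follows the standard `.mcp.json` shape used by Claude Code (sketch only — verify against your client's documentation):

```json
{
  "mcpServers": {
    "playwright": {
      "command": "npx",
      "args": ["@playwright/mcp@latest"]
    }
  }
}
```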
Before writing ANY test, answer these questions:
Document these answers as comments in your test file.
pnpm build:cli
cd packages/playground/e2e/kitchen-sink && pnpm dev
Verify the server is running at http://localhost:4111
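For tests to reach that URL with relative `page.goto()` calls, the Playwright config can point `baseURL` at the dev server. A minimal sketch using standard Playwright Test options — the `webServer` command is assumed from the steps above, and the repo's actual config in `packages/playground` may differ:

```typescript
// playwright.config.ts — illustrative only; check the repo's real config
import { defineConfig } from '@playwright/test';

export default defineConfig({
  use: {
    // Relative page.goto('/agents') calls resolve against this base URL
    baseURL: 'http://localhost:4111',
  },
  webServer: {
    // Start the kitchen-sink app if it is not already running
    command: 'cd e2e/kitchen-sink && pnpm dev',
    url: 'http://localhost:4111',
    reuseExistingServer: true,
  },
});
```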
| Feature Category | What to Test | Example Assertion |
|---|---|---|
| Agent Configuration | Config changes affect agent behavior | Send message → verify response uses selected model |
| LLM Provider Selection | Selected provider is used in requests | Intercept API call → verify provider in request payload |
| Tool Execution | Tool runs with correct params & returns result | Execute tool → verify output matches expected transformation |
| Workflow Execution | Steps execute in order, data flows between steps | Run workflow → verify each step's output feeds next step |
| Chat/Streaming | Messages persist, context maintained across turns | Multi-turn conversation → verify context awareness |
| MCP Server Tools | Server tools are callable and return data | Call MCP tool → verify response structure and content |
| Memory/Persistence | Data survives page reload | Create item → reload → verify item exists |
| Error Handling | Errors surface correctly to user | Trigger error condition → verify error message + recovery |
import { test, expect } from '@playwright/test';
import { resetStorage } from '../__utils__/reset-storage';
import { selectFixture } from '../__utils__/select-fixture';
import { nanoid } from 'nanoid';

/**
 * FEATURE: [Name of feature]
 * USER STORY: As a user, I want to [action] so that [outcome]
 * BEHAVIOR UNDER TEST: [Specific behavior being validated]
 */
test.describe('[Feature Name] - Behavior Tests', () => {
  // Playwright's built-in `page` fixture already gives each test a fresh
  // browser context, so no manual newContext() bookkeeping is needed.
  test.afterEach(async ({ page }) => {
    await resetStorage(page);
  });

  test('should [verb describing behavior] when [trigger condition]', async ({ page }) => {
    // ARRANGE: Set up preconditions
    // - Navigate to the feature
    // - Configure any required state
    // ACT: Perform the user action that triggers the behavior
    // ASSERT: Verify the OUTCOME, not the UI state
    // - Check data persistence
    // - Verify downstream effects
    // - Confirm API calls made correctly
  });
});
test('selecting LLM provider should use that provider for agent responses', async ({ page }) => {
  // ARRANGE
  await page.goto('/agents/my-agent/chat');

  // Intercept the chat API call to capture which provider is sent
  let capturedProvider: string | null = null;
  await page.route('**/api/chat', async route => {
    const body = JSON.parse(route.request().postData() || '{}');
    capturedProvider = body.provider;
    await route.continue();
  });

  // ACT: Select a different provider
  await page.getByTestId('provider-selector').click();
  await page.getByRole('option', { name: 'OpenAI' }).click();

  // Send a message to trigger the agent
  await page.getByTestId('chat-input').fill('Hello');
  await page.getByTestId('send-button').click();

  // ASSERT: Verify the selected provider was used
  await expect.poll(() => capturedProvider).toBe('openai');
});
test('created agent should persist after page reload', async ({ page }) => {
  // ARRANGE
  await page.goto('/agents');
  const agentName = `Test Agent ${nanoid()}`;

  // ACT: Create new agent
  await page.getByTestId('create-agent-button').click();
  await page.getByTestId('agent-name-input').fill(agentName);
  await page.getByTestId('save-agent-button').click();

  // Wait for creation to complete
  await expect(page.getByText(agentName)).toBeVisible();

  // ASSERT: Verify persistence
  await page.reload();
  await expect(page.getByText(agentName)).toBeVisible({ timeout: 10000 });
});
test('weather tool should return formatted weather data', async ({ page }) => {
  // ARRANGE
  await selectFixture(page, 'weather-success');
  await page.goto('/tools/weather-tool');

  // ACT: Execute tool with parameters
  await page.getByTestId('param-city').fill('San Francisco');
  await page.getByTestId('execute-tool-button').click();

  // ASSERT: Verify OUTPUT content, not just that output appears
  const output = page.getByTestId('tool-output');
  await expect(output).toContainText('temperature');
  await expect(output).toContainText('San Francisco');

  // Verify structured data if applicable
  const outputText = await output.textContent();
  const outputData = JSON.parse(outputText || '{}');
  expect(outputData).toHaveProperty('temperature');
  expect(outputData).toHaveProperty('conditions');
});
test('workflow should pass data between steps correctly', async ({ page }) => {
  // ARRANGE
  await selectFixture(page, 'workflow-multi-step');
  const sessionId = nanoid();
  await page.goto(`/workflows/data-pipeline?session=${sessionId}`);

  // ACT: Trigger workflow execution
  await page.getByTestId('workflow-input').fill('test input data');
  await page.getByTestId('run-workflow-button').click();

  // ASSERT: Verify each step received correct input from the previous step
  // Wait for completion
  await expect(page.getByTestId('workflow-status')).toHaveText('completed', { timeout: 30000 });

  // Check step outputs show the data transformation chain
  const step1Output = await page.getByTestId('step-1-output').textContent();
  const step2Output = await page.getByTestId('step-2-output').textContent();

  // Verify step 2 received step 1's output as input
  expect(step1Output).toBeTruthy();
  expect(step2Output).toContain(step1Output!);
});
test('chat should maintain conversation context across messages', async ({ page }) => {
  // ARRANGE
  await selectFixture(page, 'contextual-chat');
  const chatId = nanoid();
  await page.goto(`/agents/assistant/chat/${chatId}`);

  // ACT: Multi-turn conversation
  await page.getByTestId('chat-input').fill('My name is Alice');
  await page.getByTestId('send-button').click();
  await expect(page.getByTestId('assistant-message').last()).toBeVisible({ timeout: 20000 });

  await page.getByTestId('chat-input').fill('What is my name?');
  await page.getByTestId('send-button').click();

  // ASSERT: Verify context was maintained
  const response = page.getByTestId('assistant-message').last();
  await expect(response).toContainText('Alice', { timeout: 20000 });
});
test('should show actionable error and allow retry when API fails', async ({ page }) => {
  // ARRANGE: Set up failure fixture
  await selectFixture(page, 'api-failure');
  await page.goto('/tools/flaky-tool');

  // ACT: Trigger the error
  await page.getByTestId('execute-tool-button').click();

  // ASSERT: Error is shown with a recovery option
  await expect(page.getByTestId('error-message')).toContainText('failed');
  await expect(page.getByTestId('retry-button')).toBeVisible();

  // Switch to success fixture and retry
  await selectFixture(page, 'api-success');
  await page.getByTestId('retry-button').click();

  // Verify recovery worked
  await expect(page.getByTestId('tool-output')).toBeVisible({ timeout: 10000 });
  await expect(page.getByTestId('error-message')).not.toBeVisible();
});
When a test file already exists:
BEFORE (UI-focused):
test('dropdown opens when clicked', async ({ page }) => {
  await page.getByTestId('model-dropdown').click();
  await expect(page.getByRole('listbox')).toBeVisible();
});
AFTER (Behavior-focused):
test('selecting model from dropdown updates agent configuration', async ({ page }) => {
  // Open dropdown and select model
  await page.getByTestId('model-dropdown').click();
  await page.getByRole('option', { name: 'GPT-4' }).click();

  // Verify the selection persists and affects behavior
  await page.reload();
  await expect(page.getByTestId('model-dropdown')).toHaveText('GPT-4');

  // Optionally: verify the model is used in actual requests
  // (via request interception or checking response metadata)
});
Fixtures should represent realistic scenarios, not just mock data. Name fixture files using the pattern:
<feature>-<scenario>.fixture.ts
Examples:
- agent-with-tools.fixture.ts
- chat-multi-turn-context.fixture.ts
- workflow-parallel-execution.fixture.ts
- tool-validation-error.fixture.ts
- mcp-server-timeout.fixture.ts
Each fixture must define:
// fixtures/agent-provider-switch.fixture.ts
export const agentProviderSwitch = {
  name: 'agent-provider-switch',
  description: 'Tests that switching LLM providers changes agent behavior',

  // Mock responses for different providers
  responses: {
    openai: { content: 'Response from OpenAI', model: 'gpt-4' },
    anthropic: { content: 'Response from Anthropic', model: 'claude-3' },
  },

  expectedBehavior: {
    // When provider is switched, subsequent messages use new provider
    providerSwitchAffectsNextMessage: true,
    // Provider selection persists across page reload
    providerPersistsOnReload: true,
  },
};
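A fixture's `responses` map is typically wired into request interception. The sketch below shows the pure lookup half of that wiring — `resolveMock` is a hypothetical helper, not part of the Mastra codebase:

```typescript
type ProviderResponses = Record<string, { content: string; model: string }>;

// Same shape as the fixture's `responses` field above
const responses: ProviderResponses = {
  openai: { content: 'Response from OpenAI', model: 'gpt-4' },
  anthropic: { content: 'Response from Anthropic', model: 'claude-3' },
};

// Given an intercepted request body, pick the mocked response to fulfill with.
// Returns null when the provider has no mock, so the test can fail loudly.
function resolveMock(
  mocks: ProviderResponses,
  requestBody: string | null,
): { content: string; model: string } | null {
  const { provider } = JSON.parse(requestBody || '{}');
  return (provider && mocks[provider]) || null;
}

console.log(resolveMock(responses, '{"provider":"anthropic"}'));
// → { content: 'Response from Anthropic', model: 'claude-3' }
```

In a test this pairs with Playwright's `route.fulfill({ json: ... })`, e.g. `page.route('**/api/chat', route => route.fulfill({ json: resolveMock(responses, route.request().postData()) }))` — the `**/api/chat` endpoint here is illustrative.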
cd packages/playground && pnpm test:e2e
Before considering tests complete, verify:
- page.reload() is used to verify persistence where applicable

| Step | Command/Action |
|---|---|
| Build | pnpm build:cli |
| Start | cd packages/playground/e2e/kitchen-sink && pnpm dev |
| App URL | http://localhost:4111 |
| Routes | @packages/playground/src/App.tsx |
| Run tests | cd packages/playground && pnpm test:e2e |
| Test dir | packages/playground/e2e/tests/ |
| Fixtures | packages/playground/e2e/kitchen-sink/fixtures/ |
| ❌ Don't | ✅ Do Instead |
|---|---|
| Test that modal opens | Test that modal action completes and persists |
| Test that button is clickable | Test that clicking button produces expected result |
| Test loading spinner appears | Test that loaded data is correct |
| Test form validation message shows | Test that invalid form cannot submit AND valid form succeeds |
| Test dropdown has options | Test that selecting option changes system behavior |
| Test sidebar navigation works | Test that navigated page has correct data/functionality |
| Assert element is visible | Assert element contains expected data/state |