ai-regression-testing by affaan-m/everything-claude-code
```shell
npx skills add https://github.com/affaan-m/everything-claude-code --skill ai-regression-testing
```
Testing patterns specifically designed for AI-assisted development, where the same model writes code and reviews it — creating systematic blind spots that only automated tests can catch.
Run /bug-check (or a similar review command) after code changes.

When an AI writes code and then reviews its own work, it carries the same assumptions into both steps. This creates a predictable failure pattern:
AI writes fix → AI reviews fix → AI says "looks correct" → Bug still exists
Real-world example (observed in production):
```text
Fix 1: Added notification_settings to API response
  → Forgot to add it to the SELECT query
  → AI reviewed and missed it (same blind spot)

Fix 2: Added it to SELECT query
  → TypeScript build error (column not in generated types)
  → AI reviewed Fix 1 but didn't catch the SELECT issue

Fix 3: Changed to SELECT *
  → Fixed production path, forgot sandbox path
  → AI reviewed and missed it AGAIN (4th occurrence)

Fix 4: Test caught it instantly on first run ✅
```
The pattern: sandbox/production path inconsistency is the #1 AI-introduced regression.
Most projects with AI-friendly architecture have a sandbox/mock mode. This is the key to fast, DB-free API testing.
```typescript
// vitest.config.ts
import { defineConfig } from "vitest/config";
import path from "path";

export default defineConfig({
  test: {
    environment: "node",
    globals: true,
    include: ["__tests__/**/*.test.ts"],
    setupFiles: ["__tests__/setup.ts"],
  },
  resolve: {
    alias: {
      "@": path.resolve(__dirname, "."),
    },
  },
});
```

```typescript
// __tests__/setup.ts
// Force sandbox mode — no database needed
process.env.SANDBOX_MODE = "true";
process.env.NEXT_PUBLIC_SUPABASE_URL = "";
process.env.NEXT_PUBLIC_SUPABASE_ANON_KEY = "";
```
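A minimal sketch of the `isSandboxMode()` helper that the route handlers in the later examples are assumed to call (the function name matches those examples; your project's actual implementation may differ):

```typescript
// lib/sandbox.ts (hypothetical location)
// Route handlers branch on this flag. The test setup file forces it to
// "true", so the whole suite exercises the sandbox path without a database.
export function isSandboxMode(): boolean {
  return process.env.SANDBOX_MODE === "true";
}
```

Because the flag is read at call time rather than cached at import time, flipping the environment variable in `setup.ts` takes effect for every request the tests make.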
```typescript
// __tests__/helpers.ts
import { NextRequest } from "next/server";

export function createTestRequest(
  url: string,
  options?: {
    method?: string;
    body?: Record<string, unknown>;
    headers?: Record<string, string>;
    sandboxUserId?: string;
  },
): NextRequest {
  const { method = "GET", body, headers = {}, sandboxUserId } = options || {};
  const fullUrl = url.startsWith("http") ? url : `http://localhost:3000${url}`;
  const reqHeaders: Record<string, string> = { ...headers };
  if (sandboxUserId) {
    reqHeaders["x-sandbox-user-id"] = sandboxUserId;
  }
  const init: { method: string; headers: Record<string, string>; body?: string } = {
    method,
    headers: reqHeaders,
  };
  if (body) {
    init.body = JSON.stringify(body);
    reqHeaders["content-type"] = "application/json";
  }
  return new NextRequest(fullUrl, init);
}

export async function parseResponse(response: Response) {
  const json = await response.json();
  return { status: response.status, json };
}
```
The key principle: write tests for bugs that were found, not for code that works.
```typescript
// __tests__/api/user/profile.test.ts
import { describe, it, expect } from "vitest";
import { createTestRequest, parseResponse } from "../../helpers";
import { GET, PATCH } from "@/app/api/user/profile/route";

// Define the contract — what fields MUST be in the response
const REQUIRED_FIELDS = [
  "id",
  "email",
  "full_name",
  "phone",
  "role",
  "created_at",
  "avatar_url",
  "notification_settings", // ← Added after a bug found it missing
];

describe("GET /api/user/profile", () => {
  it("returns all required fields", async () => {
    const req = createTestRequest("/api/user/profile");
    const res = await GET(req);
    const { status, json } = await parseResponse(res);
    expect(status).toBe(200);
    for (const field of REQUIRED_FIELDS) {
      expect(json.data).toHaveProperty(field);
    }
  });

  // Regression test — this exact bug was introduced by AI 4 times
  it("notification_settings is not undefined (BUG-R1 regression)", async () => {
    const req = createTestRequest("/api/user/profile");
    const res = await GET(req);
    const { json } = await parseResponse(res);
    expect("notification_settings" in json.data).toBe(true);
    const ns = json.data.notification_settings;
    expect(ns === null || typeof ns === "object").toBe(true);
  });
});
```
The most common AI regression: fixing production path but forgetting sandbox path (or vice versa).
```typescript
// __tests__/api/user/messages.test.ts
// Imports assumed; the route path follows the app-router convention
// used in the profile test above.
import { describe, it, expect } from "vitest";
import { createTestRequest, parseResponse } from "../../helpers";
import { GET } from "@/app/api/user/messages/route";

// Test that sandbox responses match the expected contract
describe("GET /api/user/messages (conversation list)", () => {
  it("includes partner_name in sandbox mode", async () => {
    const req = createTestRequest("/api/user/messages", {
      sandboxUserId: "user-001",
    });
    const res = await GET(req);
    const { json } = await parseResponse(res);
    // This caught a bug where partner_name was added
    // to production path but not sandbox path
    if (json.data.length > 0) {
      for (const conv of json.data) {
        expect("partner_name" in conv).toBe(true);
      }
    }
  });
});
```
```markdown
<!-- .claude/commands/bug-check.md -->
# Bug Check

## Step 1: Automated Tests (mandatory, cannot skip)

Run these commands FIRST before any code review:

    npm run test    # Vitest test suite
    npm run build   # TypeScript type check + build

- If tests fail → report as highest priority bug
- If build fails → report type errors as highest priority
- Only proceed to Step 2 if both pass

## Step 2: Code Review (AI review)

1. Sandbox / production path consistency
2. API response shape matches frontend expectations
3. SELECT clause completeness
4. Error handling with rollback
5. Optimistic update race conditions

## Step 3: For each bug fixed, propose a regression test
```
```text
User: "バグチェックして" ("run a bug check", or "/bug-check")
│
├─ Step 1: npm run test
│    ├─ FAIL → Bug found mechanically (no AI judgment needed)
│    └─ PASS → Continue
│
├─ Step 2: npm run build
│    ├─ FAIL → Type error found mechanically
│    └─ PASS → Continue
│
├─ Step 3: AI code review (with known blind spots in mind)
│    └─ Findings reported
│
└─ Step 4: For each fix, write a regression test
     └─ The next bug-check catches it if the fix breaks
```
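The two commands the checklist relies on map onto ordinary npm scripts; a minimal `package.json` wiring (script bodies assumed, adjust to your stack) might look like:

```json
{
  "scripts": {
    "test": "vitest run",
    "build": "next build"
  }
}
```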
Frequency: Most common (observed in 3 out of 4 regressions)
```typescript
// ❌ AI adds field to production path only
if (isSandboxMode()) {
  return { data: { id, email, name } }; // Missing new field
}
// Production path
return { data: { id, email, name, notification_settings } };
```

```typescript
// ✅ Both paths must return the same shape
if (isSandboxMode()) {
  return { data: { id, email, name, notification_settings: null } };
}
return { data: { id, email, name, notification_settings } };
```
Test to catch it:
```typescript
it("sandbox and production return same fields", async () => {
  // In the test env, sandbox mode is forced ON
  const res = await GET(createTestRequest("/api/user/profile"));
  const { json } = await parseResponse(res);
  for (const field of REQUIRED_FIELDS) {
    expect(json.data).toHaveProperty(field);
  }
});
```
Frequency: Common with Supabase/Prisma when adding new columns
```typescript
// ❌ New column added to the response but not to the SELECT
const { data } = await supabase
  .from("users")
  .select("id, email, name") // notification_settings not here
  .single();
return { data: { ...data, notification_settings: data.notification_settings } };
// → notification_settings is always undefined
```

```typescript
// ✅ Use SELECT * or explicitly include new columns
const { data } = await supabase
  .from("users")
  .select("*")
  .single();
```
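Note that `toHaveProperty` alone will not catch the spread-with-undefined case above: the spread creates the key even when the column was never selected, so the key exists with the value `undefined`. A small helper (hypothetical, not part of the original suite) can flag keys that are present but undefined:

```typescript
// Returns the names of fields that are missing OR present-but-undefined —
// the exact symptom of a column left out of the SELECT clause.
function undefinedFields(
  obj: Record<string, unknown>,
  required: string[],
): string[] {
  return required.filter((f) => obj[f] === undefined);
}

// Simulates the buggy SELECT: notification_settings was never fetched,
// so the spread produces a key whose value is undefined.
const row = { id: "u1", email: "a@b.c", name: "A" } as Record<string, unknown>;
const response = { ...row, notification_settings: row["notification_settings"] };

console.log(undefinedFields(response, ["id", "email", "notification_settings"]));
// → ["notification_settings"]
```

Asserting `undefinedFields(json.data, REQUIRED_FIELDS)` is empty is a stricter check than `toHaveProperty` for this class of bug.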
Frequency: Moderate (when adding error handling to existing components)
```typescript
// ❌ Error state set but old data not cleared
catch (err) {
  setError("Failed to load");
  // reservations still shows data from the previous tab!
}
```

```typescript
// ✅ Clear related state on error
catch (err) {
  setReservations([]); // Clear stale data
  setError("Failed to load");
}
```
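This discipline can be tested without mounting a component. A sketch (names hypothetical) that models the fetch-and-clear logic as a plain state transition:

```typescript
// Models the state transition of a tab switch: on failure the stale list
// must be cleared, not just accompanied by an error message.
interface ListState {
  reservations: string[];
  error: string | null;
}

function applyFetchResult(
  prev: ListState,
  result: { ok: true; data: string[] } | { ok: false },
): ListState {
  if (result.ok) {
    return { reservations: result.data, error: null };
  }
  // Clear stale data AND set the error, mirroring the ✅ pattern above.
  return { reservations: [], error: "Failed to load" };
}

const stale: ListState = { reservations: ["tab-A-item"], error: null };
const after = applyFetchResult(stale, { ok: false });
console.log(after); // → { reservations: [], error: "Failed to load" }
```

A regression test then asserts that the failure branch returns an empty list, which is exactly the check the AI review kept missing.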
```typescript
// ❌ No rollback on failure
const handleRemove = async (id: string) => {
  setItems(prev => prev.filter(i => i.id !== id));
  await fetch(`/api/items/${id}`, { method: "DELETE" });
  // If the API call fails, the item is gone from the UI but still in the DB
};
```

```typescript
// ✅ Capture previous state and roll back on failure
const handleRemove = async (id: string) => {
  const prevItems = [...items];
  setItems(prev => prev.filter(i => i.id !== id));
  try {
    const res = await fetch(`/api/items/${id}`, { method: "DELETE" });
    if (!res.ok) throw new Error("API error");
  } catch {
    setItems(prevItems); // Rollback
    alert("削除に失敗しました"); // "Deletion failed"
  }
};
```
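The rollback behaviour is also testable in isolation. A minimal sketch (function names hypothetical) that factors the optimistic update out of the component so it can run without React:

```typescript
interface Item { id: string }

// Optimistically removes an item, restoring the previous list if the
// delete call rejects or returns a non-ok response.
async function optimisticRemove(
  items: Item[],
  id: string,
  deleteFn: (id: string) => Promise<{ ok: boolean }>,
): Promise<Item[]> {
  const prev = [...items];
  const next = items.filter((i) => i.id !== id);
  try {
    const res = await deleteFn(id);
    if (!res.ok) throw new Error("API error");
    return next;
  } catch {
    return prev; // rollback
  }
}

// Failure path: the list is restored after rollback.
const items: Item[] = [{ id: "a" }, { id: "b" }];
optimisticRemove(items, "a", async () => ({ ok: false })).then((result) =>
  console.log(result.length), // → 2
);
```

Injecting `deleteFn` lets a test simulate the failing API call directly, instead of intercepting `fetch`.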
Don't aim for 100% coverage. Instead:
- Bug found in /api/user/profile → write a test for the profile API
- Bug found in /api/user/messages → write a test for the messages API
- Bug found in /api/user/favorites → write a test for the favorites API
- No bug found in /api/user/notifications → don't write a test (yet)
Why this works with AI development:
| AI Regression Pattern | Test Strategy | Priority |
|---|---|---|
| Sandbox/production mismatch | Assert same response shape in sandbox mode | 🔴 High |
| SELECT clause omission | Assert all required fields in response | 🔴 High |
| Error state leakage | Assert state cleanup on error | 🟡 Medium |
| Missing rollback | Assert state restored on API failure | 🟡 Medium |
| Type cast masking null | Assert field is not undefined | 🟡 Medium |
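The last row of the table has no worked example above. The pattern, sketched with hypothetical names, is a type cast that silences the compiler where the value can legitimately be `undefined`:

```typescript
interface Profile {
  notification_settings: { email: boolean } | null;
}

// ❌ A cast like this compiles even when the column was never selected,
// so `undefined` flows through typed as `{ email: boolean } | null`.
const raw = {} as Record<string, unknown>;
const profile = raw as unknown as Profile;

// The assertion a test should make: the field may be null, never undefined.
console.log(profile.notification_settings === undefined); // → true (the bug)
```

Because the cast erases the compiler's ability to catch the mismatch, only a runtime assertion (`expect(field).not.toBe(undefined)`) closes this gap.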
- Weekly Installs: 332
- Repository: affaan-m/everything-claude-code
- GitHub Stars: 102.1K
- First Seen: 8 days ago
- Security Audits: Gen Agent Trust Hub: Pass, Socket: Pass, Snyk: Pass
- Installed on: codex (317), cursor (284), github-copilot (281), gemini-cli (281), opencode (281), cline (281)