performance-testing by proffesor-for-testing/agentic-qe
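npx skills add https://github.com/proffesor-for-testing/agentic-qe --skill performance-testing
<default_to_action> When testing performance or planning load tests:
Quick Test Type Selection:
Critical Success Factors:
| Type | Purpose | When |
|---|---|---|
| Load | Expected traffic | Every release |
| Stress | Beyond capacity | Quarterly |
| Spike | Sudden surge | Before events |
| Endurance | Memory leaks | After code changes |
| Scalability | Scaling validation | Infrastructure changes |

| Metric | Target | Why |
|---|---|---|
| p95 response time | < 200ms | User experience |
| Throughput | 10k req/min | Capacity |
| Error rate | < 0.1% | Reliability |
| CPU | < 70% | Headroom |
| Memory | < 80% | Stability |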
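- qe-performance-tester: Load test orchestration
- qe-quality-analyzer: Results analysis
- qe-production-intelligence: Production comparison

Bad: "The system should be fast"
Good: "p95 response time < 200ms under 1,000 concurrent users"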
export const options = {
thresholds: {
http_req_duration: ['p(95)<200'], // 95% < 200ms
http_req_failed: ['rate<0.01'], // < 1% failures
},
};
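To make a target such as "p95 < 200ms" concrete, it can help to see how a percentile is derived from raw samples. The following is a minimal sketch in plain JavaScript using the nearest-rank method; `percentile` and `meetsSlo` are illustrative names, not part of k6, and k6 itself may compute percentiles differently (for example with interpolation).

```javascript
// Nearest-rank percentile: the smallest sample such that at least p% of
// all samples are less than or equal to it.
function percentile(samples, p) {
  if (samples.length === 0) throw new Error('no samples');
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100); // 1-based rank
  return sorted[Math.max(rank, 1) - 1];
}

// Hypothetical helper: true when the p95 latency is under the SLO limit.
function meetsSlo(durationsMs, limitMs = 200) {
  return percentile(durationsMs, 95) < limitMs;
}

// 20 samples: 19 fast requests and one slow outlier.
const durations = [...Array(19).fill(120), 900];
console.log(percentile(durations, 95)); // 120
console.log(percentile(durations, 99)); // 900
console.log(meetsSlo(durations));       // true
```

A single outlier in 20 samples does not move p95, which is why percentile targets are usually paired with an error-rate threshold.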
Bad: Every user hits the homepage repeatedly
Good: Model actual user behavior
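import { sleep } from 'k6';

// Inclusive random integer, used for think time
const randomInt = (min, max) => Math.floor(Math.random() * (max - min + 1)) + min;

// Realistic distribution: 40% browse, 30% search, 20% details, 10% checkout
// browse(), search(), viewProduct(), and checkout() are scenario functions defined elsewhere
export default function () {
  const action = Math.random();
  if (action < 0.4) browse();
  else if (action < 0.7) search();
  else if (action < 0.9) viewProduct();
  else checkout();
  sleep(randomInt(1, 5)); // Think time in seconds
}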
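The 40/30/20/10 split can also be expressed as data rather than as a chain of `if` statements, which makes the mix easier to adjust. This is a sketch in plain JavaScript, not k6 API; `trafficMix` and `pickAction` are hypothetical names, and the random source is injectable so the selection can be checked deterministically.

```javascript
// Traffic mix from the distribution above; weights are expected to sum to 1.
const trafficMix = [
  { name: 'browse', weight: 0.4 },
  { name: 'search', weight: 0.3 },
  { name: 'details', weight: 0.2 },
  { name: 'checkout', weight: 0.1 },
];

// Pick an action by walking the cumulative weights.
function pickAction(mix, random = Math.random) {
  const roll = random();
  let cumulative = 0;
  for (const { name, weight } of mix) {
    cumulative += weight;
    if (roll < cumulative) return name;
  }
  return mix[mix.length - 1].name; // guard against floating-point drift
}

console.log(pickAction(trafficMix, () => 0.05)); // browse
console.log(pickAction(trafficMix, () => 0.95)); // checkout
```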
Symptoms: Slow queries under load, connection pool exhaustion
Fixes: Add indexes, optimize N+1 queries, increase pool size, use read replicas
// BAD: 100 orders = 101 queries
const orders = await Order.findAll();
for (const order of orders) {
const customer = await Customer.findById(order.customerId);
}
// GOOD: 1 query
const orders = await Order.findAll({ include: [Customer] });
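When the ORM offers no eager loading, the same effect can be approximated by batching the lookups manually. The sketch below uses a synchronous in-memory stand-in for the database (`createFakeDb` and its methods are hypothetical, and real calls would be asynchronous) purely to count queries in each approach.

```javascript
// In-memory stand-in for a database that counts the queries it receives.
function createFakeDb() {
  const customers = new Map([[1, { id: 1, name: 'Ada' }], [2, { id: 2, name: 'Lin' }]]);
  const orders = [
    { id: 10, customerId: 1 },
    { id: 11, customerId: 2 },
    { id: 12, customerId: 1 },
  ];
  const db = {
    queryCount: 0,
    findAllOrders() { db.queryCount++; return orders; },
    findCustomerById(id) { db.queryCount++; return customers.get(id); },
    findCustomersByIds(ids) { db.queryCount++; return ids.map((id) => customers.get(id)); },
  };
  return db;
}

// N+1: one query for the orders plus one query per order.
function loadOrdersNaive(db) {
  return db.findAllOrders().map((o) => ({ ...o, customer: db.findCustomerById(o.customerId) }));
}

// Batched: one query for the orders, one for all referenced customers.
function loadOrdersBatched(db) {
  const orders = db.findAllOrders();
  const ids = [...new Set(orders.map((o) => o.customerId))];
  const byId = new Map(db.findCustomersByIds(ids).map((c) => [c.id, c]));
  return orders.map((o) => ({ ...o, customer: byId.get(o.customerId) }));
}

const naiveDb = createFakeDb();
loadOrdersNaive(naiveDb);
const batchedDb = createFakeDb();
loadOrdersBatched(batchedDb);
console.log(naiveDb.queryCount, batchedDb.queryCount); // 4 2
```

The batched query count stays at two regardless of how many orders are loaded, while the naive count grows with every order.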
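Problem: Blocking operations in the request path (for example, sending email during checkout)
Fix: Use message queues, process asynchronously, return immediately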
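The idea of moving slow work out of the request path can be sketched with an in-process queue. This is an illustration only: `emailQueue`, `checkout`, and `processEmailQueue` are hypothetical names, and a production system would typically use a durable message broker rather than an array.

```javascript
// Minimal in-process queue standing in for a message broker.
const emailQueue = [];
const sentEmails = [];

// Request path: enqueue the email job and return right away.
function checkout(order) {
  emailQueue.push({ type: 'order-confirmation', orderId: order.id, to: order.email });
  return { status: 'accepted', orderId: order.id };
}

// Background worker: runs outside the request path and drains pending jobs.
function processEmailQueue(sendEmail) {
  let processed = 0;
  while (emailQueue.length > 0) {
    sendEmail(emailQueue.shift());
    processed++;
  }
  return processed;
}

const response = checkout({ id: 42, email: 'user@example.com' });
console.log(response.status, emailQueue.length); // accepted 1

processEmailQueue((job) => sentEmails.push(job));
console.log(emailQueue.length, sentEmails.length); // 0 1
```

The checkout response no longer depends on how long the email provider takes to answer, so its latency under load is bounded by the enqueue operation.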
Detection: Endurance testing, memory profiling
Common causes: Event listeners not cleaned up, caches without eviction
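A cache without eviction grows for as long as the process runs, which is the kind of leak an endurance test tends to surface. One common remedy is to bound the cache; the following is a minimal least-recently-used sketch built on `Map` (the `LruCache` class is illustrative, not taken from the skill).

```javascript
// Bounded cache with least-recently-used eviction.
// A Map iterates in insertion order, so its first key is the oldest entry.
class LruCache {
  constructor(maxEntries) {
    this.maxEntries = maxEntries;
    this.entries = new Map();
  }

  get(key) {
    if (!this.entries.has(key)) return undefined;
    const value = this.entries.get(key);
    this.entries.delete(key); // re-insert to mark as most recently used
    this.entries.set(key, value);
    return value;
  }

  set(key, value) {
    this.entries.delete(key);
    this.entries.set(key, value);
    if (this.entries.size > this.maxEntries) {
      const oldestKey = this.entries.keys().next().value;
      this.entries.delete(oldestKey); // evict the least recently used entry
    }
  }

  get size() {
    return this.entries.size;
  }
}

const cache = new LruCache(2);
cache.set('a', 1);
cache.set('b', 2);
cache.get('a');    // 'a' becomes the most recently used entry
cache.set('c', 3); // evicts 'b'
console.log(cache.get('b'), cache.size); // undefined 2
```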
Solutions: Aggressive timeouts, circuit breakers, caching, graceful degradation
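Of these, the circuit breaker is the least self-explanatory, so here is a minimal sketch combined with a fallback for graceful degradation. It assumes synchronous operations and an injectable clock to keep the example short; `CircuitBreaker` and its options are hypothetical, and a real implementation would wrap asynchronous calls and add timeouts.

```javascript
// After `failureThreshold` consecutive failures the circuit opens and calls
// fail fast to the fallback until `resetAfterMs` has elapsed.
class CircuitBreaker {
  constructor({ failureThreshold = 3, resetAfterMs = 10000, now = Date.now } = {}) {
    this.failureThreshold = failureThreshold;
    this.resetAfterMs = resetAfterMs;
    this.now = now;
    this.failures = 0;
    this.openedAt = null;
  }

  call(operation, fallback) {
    const isOpen = this.openedAt !== null && this.now() - this.openedAt < this.resetAfterMs;
    if (isOpen) return fallback(); // skip the dependency entirely
    try {
      const result = operation(); // closed, or a trial call after the reset window
      this.failures = 0;
      this.openedAt = null;
      return result;
    } catch (err) {
      this.failures++;
      if (this.failures >= this.failureThreshold) this.openedAt = this.now();
      return fallback(); // graceful degradation
    }
  }
}

let clock = 0;
let attempts = 0;
const breaker = new CircuitBreaker({ failureThreshold: 2, resetAfterMs: 1000, now: () => clock });
const failing = () => { attempts++; throw new Error('dependency down'); };

breaker.call(failing, () => 'cached'); // failure 1
breaker.call(failing, () => 'cached'); // failure 2, circuit opens
breaker.call(failing, () => 'cached'); // open: dependency is not called
console.log(attempts); // 2

clock = 1500; // reset window has passed
console.log(breaker.call(() => 'live', () => 'cached')); // live
```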
// performance-test.js
import http from 'k6/http';
import { check, sleep } from 'k6';
export const options = {
stages: [
{ duration: '1m', target: 50 }, // Ramp up
{ duration: '3m', target: 50 }, // Steady
{ duration: '1m', target: 0 }, // Ramp down
],
thresholds: {
http_req_duration: ['p(95)<200'],
http_req_failed: ['rate<0.01'],
},
};
export default function () {
const res = http.get('https://api.example.com/products');
check(res, {
'status is 200': (r) => r.status === 200,
'response time < 200ms': (r) => r.timings.duration < 200,
});
sleep(1);
}
# GitHub Actions
- name: Run k6 test
  uses: grafana/k6-action@v0.3.0
  with:
    filename: performance-test.js
Load: 1,000 users | p95: 180ms | Throughput: 5,000 req/s
Error rate: 0.05% | CPU: 65% | Memory: 70%
Load: 1,000 users | p95: 3,500ms ❌ | Throughput: 500 req/s ❌
Error rate: 5% ❌ | CPU: 95% ❌ | Memory: 90% ❌
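The two result summaries above can be checked mechanically against the target table. The sketch below assumes results are available as a plain object; the field names and `findViolations` are hypothetical. Throughput is left out because it is a "higher is better" metric and the table and the summaries use different units.

```javascript
// Upper limits taken from the metrics table; all are "lower is better".
const limits = { p95Ms: 200, errorRatePct: 0.1, cpuPct: 70, memoryPct: 80 };

// Returns the names of the metrics that reach or exceed their limit.
function findViolations(result, thresholds = limits) {
  return Object.keys(thresholds).filter((metric) => result[metric] >= thresholds[metric]);
}

const healthyRun = { p95Ms: 180, errorRatePct: 0.05, cpuPct: 65, memoryPct: 70 };
const degradedRun = { p95Ms: 3500, errorRatePct: 5, cpuPct: 95, memoryPct: 90 };

console.log(findViolations(healthyRun));  // []
console.log(findViolations(degradedRun)); // [ 'p95Ms', 'errorRatePct', 'cpuPct', 'memoryPct' ]
```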
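| ❌ Anti-Pattern | ✅ Better |
|---|---|
| Testing too late | Test early and often |
| Unrealistic scenarios | Model real user behavior |
| 0 to 1000 users instantly | Ramp up gradually |
| No monitoring during tests | Monitor everything |
| No baseline | Establish and track trends |
| One-time testing | Continuous performance testing |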
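As an illustration of ramping up gradually, the helper below generates a stepped list in the same `{ duration, target }` shape used by the `stages` option in the k6 script earlier. `buildRampStages` and its default durations are assumptions made for this sketch, not part of k6.

```javascript
// Build stages that climb to `peakUsers` in equal steps, hold, then ramp down.
function buildRampStages(peakUsers, steps, stepDuration = '1m', holdDuration = '3m') {
  const stages = [];
  for (let i = 1; i <= steps; i++) {
    stages.push({ duration: stepDuration, target: Math.round((peakUsers * i) / steps) });
  }
  stages.push({ duration: holdDuration, target: peakUsers }); // steady state
  stages.push({ duration: stepDuration, target: 0 });         // ramp down
  return stages;
}

const stages = buildRampStages(1000, 4);
console.log(stages.map((s) => s.target)); // [ 250, 500, 750, 1000, 1000, 0 ]
```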
// Comprehensive load test
await Task("Load Test", {
target: 'https://api.example.com',
scenarios: {
checkout: { vus: 100, duration: '5m' },
search: { vus: 200, duration: '5m' },
browse: { vus: 500, duration: '5m' }
},
thresholds: {
'http_req_duration': ['p(95)<200'],
'http_req_failed': ['rate<0.01']
}
}, "qe-performance-tester");
// Bottleneck analysis
await Task("Analyze Bottlenecks", {
testResults: perfTest,
metrics: ['cpu', 'memory', 'db_queries', 'network']
}, "qe-performance-tester");
// CI integration
await Task("CI Performance Gate", {
mode: 'smoke',
duration: '1m',
vus: 10,
failOn: { 'p95_response_time': 300, 'error_rate': 0.01 }
}, "qe-performance-tester");
aqe/performance/
├── results/* - Test execution results
├── baselines/* - Performance baselines
├── bottlenecks/* - Identified bottlenecks
└── trends/* - Historical trends
const perfFleet = await FleetManager.coordinate({
strategy: 'performance-testing',
agents: [
'qe-performance-tester',
'qe-quality-analyzer',
'qe-production-intelligence',
'qe-deployment-readiness'
],
topology: 'sequential'
});
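- Performance is a feature: Test it like functionality
- Test continuously: Not just before launch
- Monitor production: Synthetic + real user monitoring
- Fix what matters: Focus on user-impacting bottlenecks
- Trend over time: Catch degradation early

With Agents: Agents automate load testing, analyze bottlenecks, and compare results with production. Use agents to maintain performance at scale.
After each performance test run, append the results to run-history.json in this skill's directory (in the command below, replace P95, RPS, and ERR with the measured values):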
node -e "
const fs = require('fs');
const h = JSON.parse(fs.readFileSync('.claude/skills/performance-testing/run-history.json'));
h.runs.push({date: new Date().toISOString().split('T')[0], scenario: 'load', p95_ms: P95, throughput_rps: RPS, error_rate_pct: ERR});
fs.writeFileSync('.claude/skills/performance-testing/run-history.json', JSON.stringify(h, null, 2));
"
Read run-history.json before each run and compare against the baseline. Alert if p95 increases by more than 20% over the baseline.
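The 20% rule can be sketched as a small check over the parsed history. This assumes the `runs` array and `p95_ms` field written by the command above; using the median of past runs as the baseline is one possible choice, since the skill does not say how the baseline is derived, and `medianP95` and `checkRegression` are hypothetical names.

```javascript
// Median p95 across past runs, used here as the baseline.
function medianP95(runs) {
  const values = runs.map((r) => r.p95_ms).sort((a, b) => a - b);
  const mid = Math.floor(values.length / 2);
  return values.length % 2 ? values[mid] : (values[mid - 1] + values[mid]) / 2;
}

// Flags a regression when the current p95 exceeds the baseline by more
// than `tolerance` (0.2 = 20%).
function checkRegression(history, currentP95, tolerance = 0.2) {
  if (history.runs.length === 0) return { baseline: null, regressed: false };
  const baseline = medianP95(history.runs);
  return { baseline, regressed: currentP95 > baseline * (1 + tolerance) };
}

const history = { runs: [{ p95_ms: 170 }, { p95_ms: 180 }, { p95_ms: 190 }] };
console.log(checkRegression(history, 210)); // { baseline: 180, regressed: false }
console.log(checkRegression(history, 230)); // { baseline: 180, regressed: true }
```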
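Weekly installs: 59
Repository: proffesor-for-testing/agentic-qe
GitHub stars: 281
First seen: Jan 24, 2026
Security audits: Gen Agent Trust Hub (Pass), Socket (Pass), Snyk (Pass)
Installed on: opencode (54), gemini-cli (54), codex (54), github-copilot (53), cursor (53), amp (51)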