performance-profiler by borghei/claude-skills
npx skills add https://github.com/borghei/claude-skills --skill performance-profiler
Tier: POWERFUL Category: Engineering / Performance Maintainer: Claude Skills Team
Systematic performance profiling for Node.js, Python, and Go applications. Identifies CPU bottlenecks with flamegraphs, detects memory leaks with heap snapshots, analyzes bundle sizes, optimizes database queries, detects N+1 patterns, and runs load tests with k6 and Artillery. Enforces a measure-first methodology: establish a baseline, identify the bottleneck, fix it, and verify the improvement.
performance profiling, flamegraph, memory leak, bundle analysis, N+1 queries, load testing, k6, latency, P99, CPU profiling, heap snapshot, database optimization
WRONG: "I think the N+1 query is slow, let me fix it"
RIGHT: Profile → Confirm bottleneck → Fix → Measure again → Verify improvement
Every optimization must include:
1. Baseline metrics (before)
2. Profiler evidence (where it is actually slow)
3. The fix
4. Post-fix metrics (after)
5. Delta calculation (improvement %)
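The delta in step 5 is simple arithmetic, but it is worth computing it the same way every time so "before" and "after" numbers stay comparable. A minimal sketch (the function name is illustrative):

```javascript
// Compute the percentage change between a baseline metric and its
// post-fix value. For latency-style metrics (lower is better) a
// negative delta is an improvement; for throughput, a positive one is.
function improvementDelta(before, after) {
  if (before === 0) throw new Error('baseline must be non-zero');
  return ((after - before) / before) * 100;
}

// Example: P95 latency dropped from 180ms to 90ms
console.log(improvementDelta(180, 90).toFixed(0) + '%'); // prints "-50%"
```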
# Install
npm install -g clinic
# Generate flamegraph (starts server, applies load, generates HTML report)
clinic flame -- node server.js
# With specific load profile
clinic flame --autocannon [ /api/endpoint -c 10 -d 30 ] -- node server.js
# Analyze specific scenario
clinic flame --on-port 'autocannon -c 50 -d 60 http://localhost:$PORT/api/heavy-endpoint' -- node server.js
# Start Node with inspector
node --inspect server.js
# Or profile on demand
node --cpu-prof --cpu-prof-dir=./profiles server.js
# Load the .cpuprofile file in Chrome DevTools > Performance
// Programmatic profiling of a specific function
const { Session } = require('inspector');
const session = new Session();
session.connect();
session.post('Profiler.enable', () => {
  session.post('Profiler.start', () => {
    // Run the code you want to profile
    runHeavyOperation();
    session.post('Profiler.stop', (err, { profile } = {}) => {
      if (err) throw err;
      require('fs').writeFileSync('profile.cpuprofile', JSON.stringify(profile));
    });
  });
});
// Take heap snapshots programmatically
const v8 = require('v8');

function takeHeapSnapshot(label) {
  const snapshotPath = `heap-${label}-${Date.now()}.heapsnapshot`;
  // v8.writeHeapSnapshot returns the filename it wrote to
  v8.writeHeapSnapshot(snapshotPath);
  console.log(`Heap snapshot written to: ${snapshotPath}`);
  return snapshotPath;
}
// Leak detection pattern: compare two snapshots
// 1. Take snapshot at startup
takeHeapSnapshot('baseline');
// 2. Run operations that you suspect leak
// ... process 1000 requests ...
// 3. Force GC and take another snapshot
if (global.gc) global.gc(); // requires --expose-gc flag
takeHeapSnapshot('after-load');
// Load both .heapsnapshot files in Chrome DevTools > Memory
// Use "Comparison" view to find objects that grew
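Before reaching for snapshot diffing, a cheap in-process signal can flag heap growth across repeated runs of a suspect operation. A rough sketch, assuming the operation is synchronous; `heapGrowthAfter` and the example operations are illustrative, and absolute numbers are noisy without `--expose-gc`:

```javascript
// Rough in-process leak signal: run the suspect operation many times
// and compare heap usage before and after. Not a substitute for
// snapshot diffing, but cheap enough to run in CI.
function heapGrowthAfter(operation, iterations = 1000) {
  if (global.gc) global.gc(); // requires node --expose-gc
  const before = process.memoryUsage().heapUsed;
  for (let i = 0; i < iterations; i++) operation();
  if (global.gc) global.gc();
  return process.memoryUsage().heapUsed - before;
}

// Example: appending to a module-level array retains every allocation,
// so heap growth persists even after GC.
const retained = [];
const leaky = () => retained.push(new Array(1000).fill(0));

console.log('leaky growth (bytes):', heapGrowthAfter(leaky));
```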
# Install memray (a native-level memory profiler; the stdlib tracemalloc example follows below)
pip install memray
# Profile a script
memray run -o memray-output.bin my_script.py
memray flamegraph memray-output.bin -o flamegraph.html
# Profile a specific function
python -c "
import tracemalloc
tracemalloc.start()
# Run your code
from my_module import heavy_function
heavy_function()
snapshot = tracemalloc.take_snapshot()
top_stats = snapshot.statistics('lineno')
print('Top 10 memory allocations:')
for stat in top_stats[:10]:
    print(stat)
"
-- Step 1: Get the actual execution plan (not just estimated)
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT t.*, p.name as project_name
FROM tasks t
JOIN projects p ON p.id = t.project_id
WHERE p.workspace_id = 'ws_abc123'
AND t.status = 'in_progress'
AND t.deleted_at IS NULL
ORDER BY t.updated_at DESC
LIMIT 20;
-- What to look for in the output:
-- Seq Scan on tasks → MISSING INDEX (should be Index Scan)
-- Rows Removed by Filter: 99000 → INDEX NOT SELECTIVE ENOUGH
-- Sort Method: external merge → NOT ENOUGH work_mem
-- Nested Loop with inner Seq Scan → MISSING INDEX ON JOIN COLUMN
-- Actual rows=1000 vs estimated rows=1 → STALE STATISTICS (run ANALYZE)
// PROBLEM: N+1 query pattern
async function getProjectsWithTasks(workspaceId: string) {
const projects = await db.query.projects.findMany({
where: eq(projects.workspaceId, workspaceId),
});
// This executes N additional queries (one per project)
for (const project of projects) {
project.tasks = await db.query.tasks.findMany({
where: eq(tasks.projectId, project.id),
});
}
return projects;
}
// Total queries: 1 + N (where N = number of projects)
// FIX: Single query with JOIN or relation loading
async function getProjectsWithTasks(workspaceId: string) {
return db.query.projects.findMany({
where: eq(projects.workspaceId, workspaceId),
with: {
tasks: true, // Drizzle generates a single JOIN or subquery
},
});
}
// Total queries: 1-2 (depending on ORM strategy)
// Log query count per request (add to middleware)
// Node.js with Drizzle (illustrative monkey-patch):
let queryCount = 0;
const originalQuery = db.execute;
db.execute = (...args) => { queryCount++; return originalQuery.apply(db, args); };
// After request completes:
if (queryCount > 10) {
console.warn(`N+1 ALERT: ${req.method} ${req.path} executed ${queryCount} queries`);
}
# Install
pnpm add -D @next/bundle-analyzer
# next.config.js
const withBundleAnalyzer = require('@next/bundle-analyzer')({
enabled: process.env.ANALYZE === 'true',
});
module.exports = withBundleAnalyzer(nextConfig);
# Run analysis
ANALYZE=true pnpm build
# Opens browser with interactive treemap
# Check what you're shipping
npx source-map-explorer .next/static/chunks/*.js
# Size of individual imports
npx import-cost # VS Code extension for inline size
# Find heavy dependencies
npx depcheck --json | jq '.dependencies'
npx bundlephobia-cli <package-name>
| Before | After | Savings |
|---|---|---|
| `import _ from 'lodash'` | `import groupBy from 'lodash/groupBy'` | ~70KB |
| `import moment from 'moment'` | `import { format } from 'date-fns'` | ~60KB |
| `import { icons } from 'lucide-react'` | `import { Search } from 'lucide-react'` | ~50KB |
| Static import of heavy component | `dynamic(() => import('./HeavyChart'))` | Deferred loading |
| All routes in one chunk | Code splitting per route (automatic in Next.js) | Per-route |
// load-test.k6.js
import http from 'k6/http'
import { check, sleep } from 'k6'
import { Trend, Rate } from 'k6/metrics'
const apiLatency = new Trend('api_latency')
const errorRate = new Rate('errors')
export const options = {
stages: [
{ duration: '1m', target: 20 }, // ramp up
{ duration: '3m', target: 100 }, // sustain
{ duration: '1m', target: 0 }, // ramp down
],
thresholds: {
http_req_duration: ['p(95)<200', 'p(99)<500'],
errors: ['rate<0.01'],
api_latency: ['p(95)<150'],
},
}
export default function () {
const res = http.get(`${__ENV.BASE_URL}/api/v1/projects?limit=20`, {
headers: { Authorization: `Bearer ${__ENV.TOKEN}` },
})
apiLatency.add(res.timings.duration)
check(res, {
'status 200': (r) => r.status === 200,
'body has data': (r) => JSON.parse(r.body).data !== undefined,
}) || errorRate.add(1)
sleep(1)
}
# Run locally
k6 run load-test.k6.js -e BASE_URL=http://localhost:3000 -e TOKEN=$TOKEN
# Run with cloud reporting
k6 cloud load-test.k6.js
## Performance Optimization: [What You Fixed]
**Date:** YYYY-MM-DD
**Ticket:** PROJ-123
### Problem
[1-2 sentences: what was slow, how it was observed]
### Root Cause
[What the profiler revealed — include flamegraph link or screenshot]
### Baseline (Before)
| Metric | Value |
|--------|-------|
| P50 latency | XXms |
| P95 latency | XXms |
| P99 latency | XXms |
| Throughput (RPS) | XX |
| DB queries/request | XX |
| Bundle size | XXkB |
### Fix Applied
[Brief description + link to PR]
### After
| Metric | Before | After | Delta |
|--------|--------|-------|-------|
| P50 | XXms | XXms | -XX% |
| P95 | XXms | XXms | -XX% |
| P99 | XXms | XXms | -XX% |
| RPS | XX | XX | +XX% |
| DB queries/req | XX | XX | -XX% |
### Verification
[Link to k6 output, CI run, or monitoring dashboard]
DATABASE
[ ] Missing indexes on WHERE/ORDER BY columns
[ ] N+1 queries (check query count per request)
[ ] SELECT * when only 2-3 columns needed
[ ] No LIMIT on unbounded queries
[ ] Missing connection pool (new connection per request)
[ ] Stale statistics (run ANALYZE on busy tables)
NODE.JS
[ ] Sync I/O (fs.readFileSync) in request handlers
[ ] JSON.parse/stringify of large objects in hot loops
[ ] Missing response compression (gzip/brotli)
[ ] Dependencies loaded inside request handlers (move to module level)
[ ] Sequential awaits that could be Promise.all
BUNDLE
[ ] Full lodash/moment import instead of specific functions
[ ] Static imports of heavy components (use dynamic import)
[ ] Images not optimized / not using next/image
[ ] No code splitting on routes
API
[ ] No pagination on list endpoints
[ ] No Cache-Control headers on stable responses
[ ] Serial fetches that could run in parallel
[ ] Fetching related data in loops instead of JOINs
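The "sequential awaits" and "serial fetches" items in the checklist describe the same fix; a minimal sketch, assuming the two calls are independent (the fetcher names are illustrative):

```javascript
// Sequential: total time is the SUM of the two calls.
async function loadDashboardSlow(fetchUser, fetchProjects) {
  const user = await fetchUser();
  const projects = await fetchProjects();
  return { user, projects };
}

// Parallel: total time is the SLOWER of the two calls.
async function loadDashboardFast(fetchUser, fetchProjects) {
  const [user, projects] = await Promise.all([fetchUser(), fetchProjects()]);
  return { user, projects };
}
```

Only apply this when neither call depends on the other's result; `Promise.all` also rejects as soon as either call fails, so error handling stays in one place.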
Use p(95) < 200ms as a CI gate with k6.