npx skills add https://github.com/dasien/retrowarden --skill 'Performance Profiling'
Systematically measure and analyze application performance using profiling tools to identify bottlenecks, hot paths, memory leaks, and inefficient operations.
Establish Baseline
Select Profiling Tools
Collect Profiling Data
Analyze Results
Prioritize Optimizations
Context: Profiling a slow Python web API endpoint
Step 1: Baseline Measurement
# Measure endpoint response time
curl -w "@curl-format.txt" -o /dev/null -s http://localhost:8000/api/users
# Result: Total time: 2.8 seconds (Target: <500ms)
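The `@curl-format.txt` write-out template used above is not shown in the walkthrough; a typical one looks like this (the field names are standard curl `--write-out` variables):

```text
     time_namelookup:  %{time_namelookup}s\n
        time_connect:  %{time_connect}s\n
     time_appconnect:  %{time_appconnect}s\n
    time_pretransfer:  %{time_pretransfer}s\n
  time_starttransfer:  %{time_starttransfer}s\n
                       ----------\n
          time_total:  %{time_total}s\n
```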
Step 2: CPU Profiling
# profile_endpoint.py
import cProfile
import pstats
from io import StringIO
def profile_request():
profiler = cProfile.Profile()
profiler.enable()
# Execute the slow endpoint
response = app.test_client().get('/api/users')
profiler.disable()
# Generate report
s = StringIO()
ps = pstats.Stats(profiler, stream=s).sort_stats('cumulative')
ps.print_stats(20) # Top 20 functions
print(s.getvalue())
profile_request()
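The script above requires the Flask app to be importable. As a self-contained sketch of the same cProfile/pstats workflow, with `slow_lookup` as a stand-in for the slow endpoint:

```python
import cProfile
import pstats
from io import StringIO


def slow_lookup(n):
    # Stand-in for the slow endpoint: quadratic membership tests.
    seen = []
    for i in range(n):
        if i not in seen:  # O(n) list scan per iteration
            seen.append(i)
    return len(seen)


profiler = cProfile.Profile()
profiler.enable()
slow_lookup(2000)
profiler.disable()

s = StringIO()
ps = pstats.Stats(profiler, stream=s).sort_stats("cumulative")
ps.print_stats(10)  # top 10 functions by cumulative time
report = s.getvalue()
print(report)
```

The same report columns (ncalls, tottime, cumtime) appear here as in the real run, so the reading skills transfer directly.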
CPU Profile Results:
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.002 0.002 2.756 2.756 views.py:45(get_users)
500 1.200 0.002 2.450 0.005 database.py:89(get_user_details)
5000 0.850 0.000 0.850 0.000 {method 'execute' of 'sqlite3.Cursor'}
500 0.300 0.001 0.300 0.001 serializers.py:22(serialize_user)
1 0.150 0.150 0.150 0.150 {method 'fetchall' of 'sqlite3.Cursor'}
Analysis:
get_user_details() called 500 times → N+1 query problem

Step 3: Database Query Analysis
# Original code (N+1 problem)
def get_users():
    users = User.query.all()  # 1 query
    results = []
    for user in users:
        # N queries (one per user)
        user_details = UserDetail.query.filter_by(user_id=user.id).first()
        results.append({
            'user': user,
            'details': user_details
        })
    return results
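The same N+1 shape can be reproduced independently of the ORM with stdlib sqlite3, counting the queries each approach issues (the two-table schema here is a minimal stand-in for users/user_details):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE user_details (user_id INTEGER, bio TEXT);
""")
conn.executemany("INSERT INTO users VALUES (?, ?)",
                 [(i, f"user{i}") for i in range(500)])
conn.executemany("INSERT INTO user_details VALUES (?, ?)",
                 [(i, f"bio{i}") for i in range(500)])

# N+1 pattern: one query for users, then one more per user for details.
queries = 0
users = conn.execute("SELECT id, name FROM users").fetchall()
queries += 1
for user_id, _name in users:
    conn.execute("SELECT bio FROM user_details WHERE user_id = ?",
                 (user_id,)).fetchone()
    queries += 1
print("N+1 queries:", queries)  # 501 round trips

# JOIN pattern: the same data in a single round trip.
rows = conn.execute("""
    SELECT u.id, u.name, d.bio
    FROM users u JOIN user_details d ON d.user_id = u.id
""").fetchall()
print("JOIN queries: 1, rows:", len(rows))
```

The 501-vs-1 query count is exactly what the cProfile output surfaced as 500 calls to get_user_details().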
Step 4: Memory Profiling
from memory_profiler import profile


@profile
def get_users():
    users = User.query.all()
    results = []
    for user in users:
        user_details = UserDetail.query.filter_by(user_id=user.id).first()
        results.append({
            'user': user,
            'details': user_details
        })
    return results
Memory Profile Results:
Line # Mem usage Increment Line Contents
================================================
45 50.2 MiB 50.2 MiB def get_users():
46 75.5 MiB 25.3 MiB users = User.query.all()
47 75.5 MiB 0.0 MiB results = []
48 125.8 MiB 50.3 MiB for user in users:
49 125.8 MiB 0.0 MiB user_details = UserDetail.query...
50 125.8 MiB 0.0 MiB results.append(...)
51 125.8 MiB 0.0 MiB return results
Analysis: Loading 500 users with their details allocates about 75 MiB (25.3 MiB for the initial query, 50.3 MiB in the per-user loop)
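When installing memory_profiler is not an option, the stdlib tracemalloc module gives similar allocation visibility; a minimal sketch, with `build_results` as a stand-in for the result-accumulation loop above:

```python
import tracemalloc


def build_results(n):
    # Stand-in for accumulating user/detail dicts in memory.
    return [{"user": i, "details": "x" * 100} for i in range(n)]


tracemalloc.start()
results = build_results(10_000)
current, peak = tracemalloc.get_traced_memory()
top = tracemalloc.take_snapshot().statistics("lineno")[:3]
tracemalloc.stop()

print(f"current: {current / 1024:.1f} KiB, peak: {peak / 1024:.1f} KiB")
for stat in top:
    print(stat)  # file:line, total size, and allocation count
```

Unlike memory_profiler's line table, tracemalloc attributes allocations to source lines across the whole process, which also makes it useful for leak hunting between two snapshots.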
Step 5: Flame Graph Analysis
# Generate flame graph (visual)
py-spy record -o profile.svg --duration 30 -- python app.py
Flame Graph Shows: wide frames concentrated under database.py:get_user_details, consistent with the cProfile output above
Optimization Applied:
# Optimized code (single query with join)
from sqlalchemy.orm import joinedload


def get_users():
    # Use eager loading to fetch users and details in one query
    users = User.query.options(
        joinedload(User.details)
    ).all()
    results = []
    for user in users:
        results.append({
            'user': user,
            'details': user.details  # Already loaded, no query
        })
    return results
Step 6: Verify Improvement
# Re-measure endpoint response time
curl -w "@curl-format.txt" -o /dev/null -s http://localhost:8000/api/users
# Result: Total time: 0.18 seconds (94% improvement!)
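A single curl sample can be noisy; to confirm the <500ms target holds across repeated requests, a small stdlib helper can collect latency percentiles. The `measure` helper and the `fake_request` callable below are illustrative stand-ins, not part of the original walkthrough:

```python
import statistics
import time


def measure(fn, runs=50):
    """Call fn repeatedly and return (p50, p95) latency in milliseconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000)
    cuts = statistics.quantiles(samples, n=20)  # cut points at 5% steps
    return statistics.median(samples), cuts[-1]  # p50, p95


def fake_request():
    # Stand-in for a real HTTP call to the endpoint.
    time.sleep(0.002)


p50, p95 = measure(fake_request)
print(f"p50={p50:.1f}ms p95={p95:.1f}ms")
```

Checking p95 rather than a single run guards against declaring victory on a lucky warm-cache sample.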