multi-agent-performance-profiling by terrylica/cc-skills

npx skills add https://github.com/terrylica/cc-skills --skill multi-agent-performance-profiling
Prescriptive workflow for spawning parallel profiling agents to comprehensively identify performance bottlenecks across multiple system layers. Successfully discovered that QuestDB ingests at 1.1M rows/sec (11x faster than target), proving database was NOT the bottleneck - CloudFront download was 90% of pipeline time.
Use this skill when:
Key outcomes:
Agent 1: Profiling (Instrumentation)
Agent 2: Database Configuration Analysis
Agent 3: Client Library Analysis
Agent 4: Batch Size Analysis
Agent 5: Integration & Synthesis
Parallel Execution (all 5 agents run simultaneously):
Agent 1 (Profiling) → [PARALLEL]
Agent 2 (DB Config) → [PARALLEL]
Agent 3 (Client Library) → [PARALLEL]
Agent 4 (Batch Size) → [PARALLEL]
Agent 5 (Integration) → [PARALLEL - reads tmp/ outputs from others]
Key Principle: No dependencies between investigation agents (1-4). The integration agent synthesizes their findings.
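The fan-out/fan-in shape of this principle can be sketched with ordinary Python concurrency. This is an illustrative analogy only; the `investigate_*` functions are hypothetical stand-ins for the four independent investigation agents, and `integrate` for the synthesis step:

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for Agents 1-4; each returns its "report".
def investigate_profiling():
    return {"agent": 1, "finding": "download dominates"}

def investigate_db_config():
    return {"agent": 2, "finding": "defaults adequate"}

def investigate_client_library():
    return {"agent": 3, "finding": "use batch API"}

def investigate_batch_size():
    return {"agent": 4, "finding": "increase batch"}

def integrate(reports):
    # Synthesis runs only after all independent reports exist.
    return sorted(reports, key=lambda r: r["agent"])

def run_parallel_investigation():
    tasks = [investigate_profiling, investigate_db_config,
             investigate_client_library, investigate_batch_size]
    # Agents 1-4 have no dependencies, so all are submitted at once.
    with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
        futures = [pool.submit(t) for t in tasks]
        reports = [f.result() for f in futures]
    return integrate(reports)
```

The key design property mirrored here: no investigation waits on another, and synthesis is the only ordered step.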
Dynamic Todo Management:
Each agent produces a markdown analysis report in its own tmp/ subdirectory (Agent 1 additionally writes the profiling script, profile_pipeline.py).
Example Profiling Code:
```python
import time

def profile_pipeline():
    """Time each pipeline phase. download_from_cdn, extract_zip,
    parse_csv, ingest_to_db, and url are pipeline-specific stubs."""
    results = {}

    # Phase 1: Download
    start = time.perf_counter()
    data = download_from_cdn(url)
    results["download"] = time.perf_counter() - start

    # Phase 2: Extract
    start = time.perf_counter()
    csv_data = extract_zip(data)
    results["extract"] = time.perf_counter() - start

    # Phase 3: Parse
    start = time.perf_counter()
    df = parse_csv(csv_data)
    results["parse"] = time.perf_counter() - start

    # Phase 4: Ingest
    start = time.perf_counter()
    ingest_to_db(df)
    results["ingest"] = time.perf_counter() - start

    # Analysis: report each phase's share of total wall time
    total = sum(results.values())
    for phase, duration in results.items():
        pct = (duration / total) * 100
        print(f"{phase}: {duration:.3f}s ({pct:.1f}%)")
    return results
```
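The timing instrumentation above can be paired with memory instrumentation via the standard-library tracemalloc module (the troubleshooting table later flags its absence as a common gap). A minimal sketch, where `parse_csv_stub` is a hypothetical phase function used only for illustration:

```python
import time
import tracemalloc

def parse_csv_stub():
    # Hypothetical phase function standing in for a real pipeline stage.
    return [line.split(",") for line in ("a,b,c\n" * 1000).splitlines()]

def profile_phase(name, fn):
    """Time a phase and record its peak memory via tracemalloc."""
    tracemalloc.start()
    start = time.perf_counter()
    result = fn()
    duration = time.perf_counter() - start
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    print(f"{name}: {duration:.3f}s, peak memory {peak / 1024:.1f} KiB")
    return result, duration, peak
```

tracemalloc reports Python-level allocations only; native allocations inside C extensions (e.g. a database client) will not appear in its numbers.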
Priority Levels: P0 (highest impact), P1 (high impact), P2 (quick win).
Impact Reporting Format:
```markdown
### Recommendation: [Optimization Name] (P0/P1/P2) - [IMPACT LEVEL]
**Impact**: 🔴/🟠/🟡 **Nx improvement**
**Effort**: High/Medium/Low (N days)
**Expected Improvement**: CurrentK → TargetK rows/sec
**Rationale**:
- [Why this matters]
- [Supporting evidence from profiling]
- [Comparison to alternatives]
**Implementation**:
[Code snippet or architecture description]
```
Integration Agent Responsibilities:
Consensus Criteria:
Input: Performance metric below SLO
Output: Problem statement with baseline metrics
Example Problem Statement:
```
Performance Issue: BTCUSDT 1m ingestion at 47K rows/sec
Target SLO: >100K rows/sec
Gap: 53% below target
Pipeline: CloudFront download → ZIP extract → CSV parse → QuestDB ILP ingest
```
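The 53% figure in a statement like this is just the shortfall relative to the target; keeping the arithmetic explicit avoids mixing up "below target" with "of target":

```python
# Gap relative to the SLO target: (target - current) / target
current, target = 47_000, 100_000
gap_pct = (target - current) / target * 100
print(f"Gap: {gap_pct:.0f}% below target")  # → Gap: 53% below target
```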
Directory Structure:
```
tmp/perf-optimization/
  profiling/                     # Agent 1
    profile_pipeline.py
    PROFILING_REPORT.md
  questdb-config/                # Agent 2
    CONFIG_ANALYSIS.md
  python-client/                 # Agent 3
    CLIENT_ANALYSIS.md
  batch-size/                    # Agent 4
    BATCH_ANALYSIS.md
  MASTER_INTEGRATION_REPORT.md   # Agent 5
```
Agent Assignment:
IMPORTANT: Use a single message with multiple Task tool calls to get true parallelism.
Example:
I'm going to spawn 5 parallel investigation agents:
[Uses Task tool 5 times in a single message]
- Agent 1: Profiling
- Agent 2: QuestDB Config
- Agent 3: Python Client
- Agent 4: Batch Size
- Agent 5: Integration (depends on others completing)
Execution:
```
# All agents run simultaneously (user observes 5 parallel tool calls)
# Each agent writes to its own tmp/ subdirectory
# Integration agent polls for completed reports
```
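The integration agent's polling step could look roughly like the sketch below, using the report paths from the directory structure; the timeout and interval values are illustrative assumptions, not part of the skill:

```python
import time
from pathlib import Path

# Report paths taken from the skill's directory structure.
EXPECTED_REPORTS = [
    "profiling/PROFILING_REPORT.md",
    "questdb-config/CONFIG_ANALYSIS.md",
    "python-client/CLIENT_ANALYSIS.md",
    "batch-size/BATCH_ANALYSIS.md",
]

def wait_for_reports(base="tmp/perf-optimization", timeout=600, interval=5):
    """Poll until all four investigation reports exist, or raise on timeout."""
    base = Path(base)
    deadline = time.monotonic() + timeout
    missing = list(EXPECTED_REPORTS)
    while time.monotonic() < deadline:
        missing = [r for r in EXPECTED_REPORTS if not (base / r).exists()]
        if not missing:
            return True
        time.sleep(interval)
    raise TimeoutError(f"Reports not written: {missing}")
```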
Progress Tracking: the integration agent polls the tmp/ directory for report files.
Completion Criteria:
Report Structure:
```markdown
# Master Performance Optimization Integration Report

## Executive Summary
- Critical discovery (what is/isn't the bottleneck)
- Key findings from each agent (1-sentence summary)

## Top 3 Recommendations (Consensus)
1. [P0 Optimization] - HIGHEST IMPACT
2. [P1 Optimization] - HIGH IMPACT
3. [P2 Optimization] - QUICK WIN

## Agent Investigation Summary
### Agent 1: Profiling
### Agent 2: Database Config
### Agent 3: Client Library
### Agent 4: Batch Size

## Implementation Roadmap
### Phase 1: P0 Optimizations (Week 1)
### Phase 2: P1 Optimizations (Week 2)
### Phase 3: P2 Quick Wins (As time permits)
```
For each recommendation:
Example Implementation:
```shell
# Before optimization
uv run python tmp/perf-optimization/profiling/profile_pipeline.py
# Output: 47K rows/sec, download=857ms (90%)

# Implement P0 recommendation (concurrent downloads)
# [Make code changes]

# After optimization
uv run python tmp/perf-optimization/profiling/profile_pipeline.py
# Output: 450K rows/sec, download=90ms per symbol * 10 concurrent
```
Context: Pipeline achieving 47K rows/sec against a 100K rows/sec target (53% below SLO)
Assumptions Before Investigation:
Findings After 5-Agent Investigation:
Top 3 Recommendations:
Impact: Discovered the database ingests at 1.1M rows/sec (11x faster than target), proving the database was never the bottleneck
Outcome: Avoided wasting 2-3 weeks optimizing the database when download was the real bottleneck
❌ Bad: Profile database only, assume it's the bottleneck
✅ Good: Profile entire pipeline (download → extract → parse → ingest)

❌ Bad: Run Agent 1, wait, then run Agent 2, wait, etc.
✅ Good: Spawn all 5 agents in parallel using a single message with multiple Task calls

❌ Bad: "Let's optimize the database config first" (assumption-driven)
✅ Good: Profile first, discover database is only 4% of time, optimize download instead

❌ Bad: Only implement P0 (highest impact, highest effort)
✅ Good: Implement P2 quick wins (1.3x for 4-8 hours effort) while planning P0

❌ Bad: Implement optimization, assume it worked
✅ Good: Re-run profiling script, verify expected improvement achieved
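Verifying "expected improvement achieved" can be made mechanical by comparing before/after throughput against the Nx claim from the recommendation. A sketch with a hypothetical helper (the tolerance value is an assumption):

```python
def verify_improvement(before_rows_sec, after_rows_sec, expected_factor,
                       tolerance=0.2):
    """Check a measured speedup against the claimed Nx improvement,
    allowing a tolerance band for measurement noise."""
    actual = after_rows_sec / before_rows_sec
    ok = actual >= expected_factor * (1 - tolerance)
    print(f"Measured {actual:.1f}x vs expected {expected_factor}x: "
          f"{'PASS' if ok else 'FAIL'}")
    return ok
```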
Not applicable - profiling scripts are project-specific (stored in tmp/perf-optimization/)
profiling_template.py - Template for phase-boundary instrumentation
integration_report_template.md - Template for master integration report
impact_quantification_guide.md - How to assess P0/P1/P2 priorities
Not applicable - profiling artifacts are project-specific
| Issue | Cause | Solution |
|---|---|---|
| Agents running sequentially | Using separate messages | Spawn all agents in single message with multi-Task |
| Integration report empty | Agent reports not written | Wait for all 4 investigation agents to complete |
| Wrong bottleneck identified | Single-layer profiling | Profile entire pipeline, not just assumed layer |
| Profiling results vary | No warmup runs | Run 3-5 warmup iterations before measuring |
| Memory not profiled | Missing tracemalloc | Add tracemalloc instrumentation to profiling script |
| P0/P1 priority unclear | No impact quantification | Include expected Nx improvement for each finding |
| Consensus missing | Agents not compared | Integration agent must synthesize all 4 reports |
| Re-profile shows no change | Caching effects | Clear caches, restart services before re-profiling |
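For the "no warmup runs" row above, a measurement harness that discards warmup iterations and reports the median is a small amount of code; a minimal sketch (warmup and run counts follow the table's 3-5 suggestion but are otherwise arbitrary):

```python
import statistics
import time

def measure(fn, warmup=3, runs=5):
    """Discard warmup iterations (cache/JIT effects), then return
    the median wall time of the measured runs."""
    for _ in range(warmup):
        fn()  # warmup timings are intentionally thrown away
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)
```

Median rather than mean keeps a single outlier run (GC pause, background load) from skewing the result.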
Weekly Installs: 62
Repository: terrylica/cc-skills
GitHub Stars: 26
First Seen: Jan 24, 2026
Security Audits: Gen Agent Trust Hub: Pass · Socket: Pass · Snyk: Warn
Installed on:
opencode: 59
gemini-cli: 58
claude-code: 57
codex: 57
github-copilot: 56
cursor: 56