Self-Improving AI Agent: A Lifelong Learning System with Multi-Memory Architecture and Automatic Skill Optimization

self-improving-agent by charon-fan/agent-playbook

Install command

npx skills add https://github.com/charon-fan/agent-playbook --skill self-improving-agent

🇨🇳 Chinese Introduction

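Self-Improving Agent

"An AI agent that learns from every interaction, accumulating patterns and insights to continuously improve its own capabilities." (Based on 2025 lifelong learning research)

Overview

This is a universal self-improvement system that learns from all skill experiences, not just PRDs. It implements a complete feedback loop with:

- **Multi-Memory Architecture**: Semantic + Episodic + Working memory
- **Self-Correction**: Detects and fixes errors in skill guidance
- **Self-Validation**: Periodically verifies skill accuracy
- **Hooks Integration**: Auto-triggers on skill events (before_start, after_complete, on_error)
- **Evolution Markers**: Traceable changes with source attribution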

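Research-Based Design

Based on 2025 research:

| Research | Key Insight | Application |
|---|---|---|
| SimpleMem | Efficient lifelong memory | Pattern accumulation system |
| Multi-Memory Survey | Semantic + Episodic memory | World knowledge + experiences |
| Lifelong Learning | Continuous task stream learning | Learn from every skill use |
| Evo-Memory | Test-time lifelong learning | Real-time adaptation |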

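Automatic Triggers (via hooks)

| Event | Trigger | Action |
|---|---|---|
| before_start | Any skill starts | Log session start |
| after_complete | Any skill completes | Extract patterns, update skills |
| on_error | Bash returns non-zero exit code | Capture error context, trigger self-correction |

Manual Triggers

- User says "自我进化", "self-improve", "从经验中学习"
- User says "分析今天的经验", "总结教训"
- User asks to improve a specific skill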

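Evolution Priority Matrix

Trigger evolution when new reusable knowledge appears:

| Trigger | Target Skill | Priority | Action |
|---|---|---|---|
| New PRD pattern discovered | prd-planner | High | Add to quality checklist |
| Architecture tradeoff clarified | architecting-solutions | High | Add to decision patterns |
| API design rule learned | api-designer | High | Update template |
| Debugging fix discovered | debugger | High | Add to anti-patterns |
| Review checklist gap | code-reviewer | High | Add checklist item |
| Perf/security insight | performance-engineer, security-auditor | High | Add to patterns |
| UI/UX spec issue | prd-planner, architecting-solutions | High | Add visual spec requirements |
| React/state pattern | debugger, refactoring-specialist | Medium | Add to patterns |
| Test strategy improvement | test-automator, qa-expert | Medium | Update approach |
| CI/deploy fix | deployment-engineer | Medium | Add to troubleshooting |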

Multi-Memory Architecture

1. Semantic Memory (`memory/semantic-patterns.json`)

Stores abstract patterns and rules that can be reused across contexts:

{
  "patterns": {
    "pattern_id": {
      "id": "pat-2025-01-11-001",
      "name": "Pattern Name",
      "source": "user_feedback|implementation_review|retrospective",
      "confidence": 0.95,
      "applications": 5,
      "created": "2025-01-11",
      "category": "prd_structure|react_patterns|async_patterns|...",
      "pattern": "One-line summary",
      "problem": "What problem does this solve?",
      "solution": { ... },
      "quality_rules": [ ... ],
      "target_skills": [ ... ]
    }
  }
}

2. Episodic Memory (`memory/episodic/`)

Stores specific experiences and what happened:

memory/episodic/
├── 2025/
│   ├── 2025-01-11-prd-creation.json
│   ├── 2025-01-11-debug-session.json
│   └── 2025-01-12-refactoring.json

{
  "id": "ep-2025-01-11-001",
  "timestamp": "2025-01-11T10:30:00Z",
  "skill": "debugger",
  "situation": "User reported data not refreshing after form submission",
  "root_cause": "Empty callback in onRefresh prop",
  "solution": "Implement actual refresh logic in callback",
  "lesson": "Always verify callbacks are not empty functions",
  "related_pattern": "callback_verification",
  "user_feedback": {
    "rating": 8,
    "comments": "This was exactly the issue"
  }
}

3. Working Memory (`memory/working/`)

Stores current session context:

memory/working/
├── current_session.json   # Active session data
├── last_error.json        # Error context for self-correction
└── session_end.json       # Session end marker

Self-Improvement Process

Phase 1: Experience Extraction

After any skill completes, extract:

What happened:
  skill_used: {which skill}
  task: {what was being done}
  outcome: {success|partial|failure}

Key Insights:
  what_went_well: [what worked]
  what_went_wrong: [what didn't work]
  root_cause: {underlying issue if applicable}

User Feedback:
  rating: {1-10 if provided}
  comments: {specific feedback}

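Phase 2: Pattern Abstraction

Convert experiences to reusable patterns:

| Concrete Experience | Abstract Pattern | Target Skill |
|---|---|---|
| "User forgot to save PRD notes" | "Always persist thinking to files" | prd-planner |
| "Code review missed SQL injection" | "Add security checklist item" | code-reviewer |
| "Callback was empty, didn't work" | "Verify callback implementations" | debugger |
| "Net APY position ambiguous" | "UI specs need exact relative positions" | prd-planner |

Abstraction Rules: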

If experience_repeats 3+ times:
  pattern_level: critical
  action: Add to skill's "Critical Mistakes" section

If solution_was_effective:
  pattern_level: best_practice
  action: Add to skill's "Best Practices" section

If user_rating >= 7:
  pattern_level: strength
  action: Reinforce this approach

If user_rating <= 4:
  pattern_level: weakness
  action: Add to "What to Avoid" section

Phase 3: Skill Updates

Update the appropriate skill files with evolution markers:

<!-- Evolution: 2025-01-12 | source: ep-2025-01-12-001 | skill: debugger -->

## Pattern Added (2025-01-12)

**Pattern**: Always verify callbacks are not empty functions

**Source**: Episode ep-2025-01-12-001

**Confidence**: 0.95

### Updated Checklist
- [ ] Verify all callbacks have implementations
- [ ] Test callback execution paths

Correction Markers (when fixing wrong guidance):

<!-- Correction: 2025-01-12 | was: "Use callback chain" | reason: caused stale refresh -->

## Corrected Guidance

Use direct state monitoring instead of callback chains:
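```typescript
// ✅ Do: Direct state monitoring
const prevPendingCount = usePrevious(pendingCount);
```

Phase 4: Memory Consolidation

1. Update semantic memory (memory/semantic-patterns.json)
2. Store episodic memory (memory/episodic/YYYY-MM-DD-{skill}.json)
3. Update pattern confidence based on applications/feedback
4. Prune outdated patterns (low confidence, no recent applications)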

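Self-Correction (on_error hook)

Triggered when:

- Bash command returns non-zero exit code
- Tests fail after following skill guidance
- User reports the guidance produced incorrect results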

## Self-Correction Workflow

1. Detect Error
   - Capture error context from working/last_error.json
   - Identify which skill guidance was followed

2. Verify Root Cause
   - Was the skill guidance incorrect?
   - Was the guidance misinterpreted?
   - Was the guidance incomplete?

3. Apply Correction
   - Update skill file with corrected guidance
   - Add correction marker with reason
   - Update related patterns in semantic memory

4. Validate Fix
   - Test the corrected guidance
   - Ask user to verify

<!-- Correction: 2025-01-12 | was: "useMemo for claimable ids" | reason: stale data at click time -->

## Self-Correction: Click-Time Computation

**Issue**: Using useMemo for claimable IDs caused stale data
**Fix**: Compute at click time for always-fresh data
**Pattern**: click_time_vs_open_time_computation

Self-Validation

Use the validation template in references/appendix.md when reviewing updates.

Hooks Integration

Wiring Hooks in Claude Code Settings

Add to Claude Code settings (~/.claude/settings.json):

{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "Bash|Write|Edit",
        "hooks": [
          {
            "type": "command",
            "command": "bash ${SKILLS_DIR}/self-improving-agent/hooks/pre-tool.sh \"$TOOL_NAME\" \"$TOOL_INPUT\""
          }
        ]
      }
    ],
    "PostToolUse": [
      {
        "matcher": "Bash",
        "hooks": [
          {
            "type": "command",
            "command": "bash ${SKILLS_DIR}/self-improving-agent/hooks/post-bash.sh \"$TOOL_OUTPUT\" \"$EXIT_CODE\""
          }
        ]
      }
    ],
    "Stop": [
      {
        "matcher": "",
        "hooks": [
          {
            "type": "command",
            "command": "bash ${SKILLS_DIR}/self-improving-agent/hooks/session-end.sh"
          }
        ]
      }
    ]
  }
}

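Replace ${SKILLS_DIR} with your actual skills path.

Additional References

See references/appendix.md for memory structure, workflow diagrams, metrics, feedback templates, and research links.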

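Best Practices

DO

✅ Learn from every skill interaction
✅ Extract patterns at the right abstraction level
✅ Update multiple related skills
✅ Track confidence and application counts
✅ Ask for user feedback on improvements
✅ Use evolution/correction markers for traceability
✅ Validate guidance before applying broadly

DON'T

❌ Over-generalize from single experiences
❌ Update skills without confidence tracking
❌ Ignore negative feedback
❌ Make changes that break existing functionality
❌ Create contradictory patterns
❌ Update skills without understanding context

Quick Start

After any skill completes, this agent automatically:

Analyzes what happened
Extracts patterns and insights
Updates relevant skill files
Logs to memory for future reference
Reports summary to user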

🇺🇸English

Self-Improving Agent

"An AI agent that learns from every interaction, accumulating patterns and insights to continuously improve its own capabilities." — Based on 2025 lifelong learning research

Overview

This is a universal self-improvement system that learns from ALL skill experiences, not just PRDs. It implements a complete feedback loop with:

- **Multi-Memory Architecture**: Semantic + Episodic + Working memory
- **Self-Correction**: Detects and fixes skill guidance errors
- **Self-Validation**: Periodically verifies skill accuracy
- **Hooks Integration**: Auto-triggers on skill events (before_start, after_complete, on_error)
- **Evolution Markers**: Traceable changes with source attribution

Research-Based Design

Based on 2025 research:

| Research | Key Insight | Application |
|---|---|---|
| SimpleMem | Efficient lifelong memory | Pattern accumulation system |
| Multi-Memory Survey | Semantic + Episodic memory | World knowledge + experiences |
| Lifelong Learning | Continuous task stream learning | Learn from every skill use |
| Evo-Memory | Test-time lifelong learning | Real-time adaptation |

The Self-Improvement Loop

┌─────────────────────────────────────────────────────────────────┐
│                    UNIVERSAL SELF-IMPROVEMENT                   │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│   Skill Event → Extract Experience → Abstract Pattern → Update  │
│        │                  │                │         │          │
│        ▼                  ▼                ▼         ▼          │
│   ┌─────────────────────────────────────────────────────────┐   │
│   │                   MULTI-MEMORY SYSTEM                   │   │
│   ├─────────────────────────────────────────────────────────┤   │
│   │  Semantic Memory   │  Episodic Memory  │ Working Memory │   │
│   │  (Patterns/Rules)  │  (Experiences)    │ (Current)      │   │
│   │  memory/semantic/  │  memory/episodic/ │ memory/working/│   │
│   └─────────────────────────────────────────────────────────┘   │
│                                                                 │
│   ┌─────────────────────────────────────────────────────────┐   │
│   │                      FEEDBACK LOOP                      │   │
│   │    User Feedback → Confidence Update → Pattern Adapt    │   │
│   └─────────────────────────────────────────────────────────┘   │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

When This Activates

Automatic Triggers (via hooks)

| Event | Trigger | Action |
|---|---|---|
| before_start | Any skill starts | Log session start |
| after_complete | Any skill completes | Extract patterns, update skills |
| on_error | Bash returns non-zero exit code | Capture error context, trigger self-correction |

Manual Triggers

User says "自我进化", "self-improve", "从经验中学习"
User says "分析今天的经验", "总结教训"
User asks to improve a specific skill
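
The two trigger lists above can be sketched as a small lookup plus a phrase check. This is only an illustration: the action names in `HOOK_ACTIONS` and both function names are hypothetical, and plain substring matching is an assumed strategy, not something the skill specifies.

```python
# Hypothetical action names; events and phrases are taken from the lists above.
HOOK_ACTIONS = {
    "before_start": ["log_session_start"],
    "after_complete": ["extract_patterns", "update_skills"],
    "on_error": ["capture_error_context", "trigger_self_correction"],
}

MANUAL_TRIGGER_PHRASES = (
    "自我进化",
    "self-improve",
    "从经验中学习",
    "分析今天的经验",
    "总结教训",
)


def actions_for_event(event: str) -> list[str]:
    """Return the action names for a hook event, or an empty list if unknown."""
    return list(HOOK_ACTIONS.get(event, []))


def is_manual_trigger(message: str) -> bool:
    """True if the message contains one of the listed trigger phrases."""
    lowered = message.lower()  # makes the English phrase case-insensitive
    return any(phrase in lowered for phrase in MANUAL_TRIGGER_PHRASES)
```

The third manual trigger (a request to improve a specific skill) needs intent detection rather than a fixed phrase, so it is left out of this sketch.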

Evolution Priority Matrix

Trigger evolution when new reusable knowledge appears:

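| Trigger | Target Skill | Priority | Action |
|---|---|---|---|
| New PRD pattern discovered | prd-planner | High | Add to quality checklist |
| Architecture tradeoff clarified | architecting-solutions | High | Add to decision patterns |
| API design rule learned | api-designer | High | Update template |
| Debugging fix discovered | debugger | High | Add to anti-patterns |
| Review checklist gap | code-reviewer | High | Add checklist item |
| Perf/security insight | performance-engineer, security-auditor | High | Add to patterns |
| UI/UX spec issue | prd-planner, architecting-solutions | High | Add visual spec requirements |
| React/state pattern | debugger, refactoring-specialist | Medium | Add to patterns |
| Test strategy improvement | test-automator, qa-expert | Medium | Update approach |
| CI/deploy fix | deployment-engineer | Medium | Add to troubleshooting |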
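
One way to read the matrix is as a routing table from a kind of insight to the skills that should be updated. The sketch below encodes four of the rows listed in this document; the dictionary keys, the `Route` type and `plan_updates` are hypothetical names chosen for illustration.

```python
from typing import NamedTuple


class Route(NamedTuple):
    target_skills: tuple[str, ...]
    priority: str  # "high" or "medium", as in the matrix
    action: str


# Keys are hypothetical identifiers; values are copied from four matrix rows.
EVOLUTION_ROUTES = {
    "prd_pattern": Route(("prd-planner",), "high", "Add to quality checklist"),
    "debugging_fix": Route(("debugger",), "high", "Add to anti-patterns"),
    "perf_security_insight": Route(
        ("performance-engineer", "security-auditor"), "high", "Add to patterns"
    ),
    "ci_deploy_fix": Route(("deployment-engineer",), "medium", "Add to troubleshooting"),
}

PRIORITY_ORDER = {"high": 0, "medium": 1}


def plan_updates(triggers: list[str]) -> list[tuple[str, str]]:
    """Expand known triggers into (skill, action) pairs, high priority first."""
    routes = [EVOLUTION_ROUTES[t] for t in triggers if t in EVOLUTION_ROUTES]
    routes.sort(key=lambda route: PRIORITY_ORDER[route.priority])  # stable sort
    return [(skill, route.action) for route in routes for skill in route.target_skills]
```

Unknown triggers are silently ignored here; a real implementation might prefer to log them.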

Multi-Memory Architecture

1. Semantic Memory (`memory/semantic-patterns.json`)

Stores abstract patterns and rules reusable across contexts:

{
  "patterns": {
    "pattern_id": {
      "id": "pat-2025-01-11-001",
      "name": "Pattern Name",
      "source": "user_feedback|implementation_review|retrospective",
      "confidence": 0.95,
      "applications": 5,
      "created": "2025-01-11",
      "category": "prd_structure|react_patterns|async_patterns|...",
      "pattern": "One-line summary",
      "problem": "What problem does this solve?",
      "solution": { ... },
      "quality_rules": [ ... ],
      "target_skills": [ ... ]
    }
  }
}
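
A minimal sketch of how a record in this shape could be checked before it is written to `memory/semantic-patterns.json`. The field list is taken from the example above; the rule that `confidence` lies in [0, 1] is an assumption inferred from the sample value 0.95, and `validate_pattern` is a hypothetical helper.

```python
REQUIRED_FIELDS = {
    "id", "name", "source", "confidence", "applications", "created",
    "category", "pattern", "problem", "solution", "quality_rules", "target_skills",
}


def validate_pattern(pattern: dict) -> list[str]:
    """Return a list of problems; an empty list means the record looks well-formed."""
    problems = [
        f"missing field: {name}" for name in sorted(REQUIRED_FIELDS - pattern.keys())
    ]
    confidence = pattern.get("confidence")
    # Assumed range: the source only shows 0.95 as an example value.
    if not isinstance(confidence, (int, float)) or not 0.0 <= confidence <= 1.0:
        problems.append("confidence must be a number in [0, 1]")
    applications = pattern.get("applications")
    if not isinstance(applications, int) or applications < 0:
        problems.append("applications must be a non-negative integer")
    return problems
```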

2. Episodic Memory (`memory/episodic/`)

Stores specific experiences and what happened:

memory/episodic/
├── 2025/
│   ├── 2025-01-11-prd-creation.json
│   ├── 2025-01-11-debug-session.json
│   └── 2025-01-12-refactoring.json



{
  "id": "ep-2025-01-11-001",
  "timestamp": "2025-01-11T10:30:00Z",
  "skill": "debugger",
  "situation": "User reported data not refreshing after form submission",
  "root_cause": "Empty callback in onRefresh prop",
  "solution": "Implement actual refresh logic in callback",
  "lesson": "Always verify callbacks are not empty functions",
  "related_pattern": "callback_verification",
  "user_feedback": {
    "rating": 8,
    "comments": "This was exactly the issue"
  }
}
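
The file and ID naming shown above can be derived from an episode's timestamp. The sketch below follows the directory tree (a year folder, then `YYYY-MM-DD-{slug}.json`); note that this layout is read off the example tree, and both helper names are hypothetical.

```python
from datetime import datetime, timezone
from pathlib import PurePosixPath


def episode_id(day: str, sequence: int) -> str:
    """Build an ID such as ep-2025-01-11-001 from a date string and a counter."""
    return f"ep-{day}-{sequence:03d}"


def episode_path(timestamp: str, slug: str, root: str = "memory/episodic") -> PurePosixPath:
    """Map an ISO 8601 timestamp and a short slug to the episode's JSON path."""
    # datetime.fromisoformat rejects a trailing "Z" before Python 3.11, so normalize it.
    moment = datetime.fromisoformat(timestamp.replace("Z", "+00:00")).astimezone(timezone.utc)
    day = moment.strftime("%Y-%m-%d")
    return PurePosixPath(root) / moment.strftime("%Y") / f"{day}-{slug}.json"
```

Dates are taken in UTC so that an episode recorded near midnight lands in the same file regardless of the machine's local time zone.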

3. Working Memory (`memory/working/`)

Stores current session context:

memory/working/
├── current_session.json   # Active session data
├── last_error.json        # Error context for self-correction
└── session_end.json       # Session end marker

Self-Improvement Process

Phase 1: Experience Extraction

After any skill completes, extract:

What happened:
  skill_used: {which skill}
  task: {what was being done}
  outcome: {success|partial|failure}

Key Insights:
  what_went_well: [what worked]
  what_went_wrong: [what didn't work]
  root_cause: {underlying issue if applicable}

User Feedback:
  rating: {1-10 if provided}
  comments: {specific feedback}
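
The extraction template above maps naturally onto a small record type. This is a sketch under the assumption that the three groups are flattened into one object; the class name `Experience` is hypothetical, while the field names and allowed values follow the template.

```python
from dataclasses import asdict, dataclass, field
from typing import Optional

OUTCOMES = ("success", "partial", "failure")


@dataclass
class Experience:
    skill_used: str
    task: str
    outcome: str
    what_went_well: list[str] = field(default_factory=list)
    what_went_wrong: list[str] = field(default_factory=list)
    root_cause: Optional[str] = None
    rating: Optional[int] = None  # 1-10, only if the user provided one
    comments: str = ""

    def __post_init__(self) -> None:
        # Reject values the template does not allow.
        if self.outcome not in OUTCOMES:
            raise ValueError(f"outcome must be one of {OUTCOMES}")
        if self.rating is not None and not 1 <= self.rating <= 10:
            raise ValueError("rating must be between 1 and 10")
```

`asdict(experience)` then yields a plain dictionary that can be serialized to JSON for episodic memory.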

Phase 2: Pattern Abstraction

Convert experiences to reusable patterns:

| Concrete Experience | Abstract Pattern | Target Skill |
|---|---|---|
| "User forgot to save PRD notes" | "Always persist thinking to files" | prd-planner |
| "Code review missed SQL injection" | "Add security checklist item" | code-reviewer |
| "Callback was empty, didn't work" | "Verify callback implementations" | debugger |
| "Net APY position ambiguous" | "UI specs need exact relative positions" | prd-planner |

Abstraction Rules:

If experience_repeats 3+ times:
  pattern_level: critical
  action: Add to skill's "Critical Mistakes" section

If solution_was_effective:
  pattern_level: best_practice
  action: Add to skill's "Best Practices" section

If user_rating >= 7:
  pattern_level: strength
  action: Reinforce this approach

If user_rating <= 4:
  pattern_level: weakness
  action: Add to "What to Avoid" section
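
The rules above can be written as a single function. The sketch assumes the four rules are independent, so one experience may receive several levels; that reading, and the name `classify_pattern`, are assumptions rather than something the source states.

```python
from typing import Optional


def classify_pattern(
    repeat_count: int,
    solution_was_effective: bool,
    user_rating: Optional[int] = None,
) -> list[str]:
    """Return every pattern level whose rule matches, in the order listed above."""
    levels = []
    if repeat_count >= 3:
        levels.append("critical")
    if solution_was_effective:
        levels.append("best_practice")
    if user_rating is not None:
        if user_rating >= 7:
            levels.append("strength")
        elif user_rating <= 4:
            levels.append("weakness")
    return levels
```

Ratings of 5 and 6 fall between the two thresholds and produce no rating-based level.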

Phase 3: Skill Updates

Update the appropriate skill files with evolution markers:

<!-- Evolution: 2025-01-12 | source: ep-2025-01-12-001 | skill: debugger -->

## Pattern Added (2025-01-12)

**Pattern**: Always verify callbacks are not empty functions

**Source**: Episode ep-2025-01-12-001

**Confidence**: 0.95

### Updated Checklist
- [ ] Verify all callbacks have implementations
- [ ] Test callback execution paths
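
Because the marker is a plain HTML comment with `|`-separated `key: value` pairs, it can be written and read back with a few lines of code. The sketch below is one possible approach; the function names are hypothetical, and the parser assumes values never contain a `|` character.

```python
import re
from typing import Optional

# Matches both marker kinds used in this document: Evolution and Correction.
MARKER_RE = re.compile(
    r"<!--\s*(Evolution|Correction):\s*(\d{4}-\d{2}-\d{2})\s*\|(.*?)-->", re.DOTALL
)


def format_evolution_marker(date: str, source: str, skill: str) -> str:
    """Render an evolution marker in the format shown above."""
    return f"<!-- Evolution: {date} | source: {source} | skill: {skill} -->"


def parse_marker(text: str) -> Optional[dict]:
    """Extract kind, date and key/value fields from the first marker in text."""
    match = MARKER_RE.search(text)
    if match is None:
        return None
    kind, date, rest = match.groups()
    fields = {"kind": kind.lower(), "date": date}
    for part in rest.split("|"):
        key, sep, value = part.partition(":")
        if sep:  # skip fragments without a "key: value" shape
            fields[key.strip()] = value.strip().strip('"')
    return fields
```

Parsing markers back out of a skill file makes it possible to list every change that came from a given episode.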

Correction Markers (when fixing wrong guidance):

<!-- Correction: 2025-01-12 | was: "Use callback chain" | reason: caused stale refresh -->

## Corrected Guidance

Use direct state monitoring instead of callback chains:
```typescript
// ✅ Do: Direct state monitoring
const prevPendingCount = usePrevious(pendingCount);
```


### Phase 4: Memory Consolidation

1. **Update semantic memory** (`memory/semantic-patterns.json`)
2. **Store episodic memory** (`memory/episodic/YYYY-MM-DD-{skill}.json`)
3. **Update pattern confidence** based on applications/feedback
4. **Prune outdated patterns** (low confidence, no recent applications)
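
The source does not say how confidence is updated or what counts as outdated, so the sketch below fills both in with explicit assumptions: confidence moves toward `rating / 10` by a fixed weight, and a pattern is pruned only when it is both low-confidence and idle. The thresholds, the `last_applied` field and both function names are hypothetical.

```python
from datetime import date


def update_confidence(confidence: float, rating: int, weight: float = 0.2) -> float:
    """Move confidence toward rating/10 by `weight` (an assumed smoothing rule)."""
    target = rating / 10
    return round((1 - weight) * confidence + weight * target, 4)


def prune_patterns(
    patterns: dict,
    today: date,
    min_confidence: float = 0.3,  # assumed threshold
    max_idle_days: int = 90,      # assumed threshold
) -> dict:
    """Drop patterns that are both low-confidence and not applied recently."""
    kept = {}
    for key, pattern in patterns.items():
        # "last_applied" is a hypothetical field, not part of the schema in this document.
        idle_days = (today - date.fromisoformat(pattern["last_applied"])).days
        if pattern["confidence"] < min_confidence and idle_days > max_idle_days:
            continue
        kept[key] = pattern
    return kept
```

Requiring both conditions keeps a new, still-unproven pattern from being pruned before it has had a chance to be applied.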

## Self-Correction (on_error hook)

Triggered when:
- Bash command returns non-zero exit code
- Tests fail after following skill guidance
- User reports the guidance produced incorrect results

**Process:**

```markdown
## Self-Correction Workflow

1. Detect Error
   - Capture error context from working/last_error.json
   - Identify which skill guidance was followed

2. Verify Root Cause
   - Was the skill guidance incorrect?
   - Was the guidance misinterpreted?
   - Was the guidance incomplete?

3. Apply Correction
   - Update skill file with corrected guidance
   - Add correction marker with reason
   - Update related patterns in semantic memory

4. Validate Fix
   - Test the corrected guidance
   - Ask user to verify
```

**Example:**

<!-- Correction: 2025-01-12 | was: "useMemo for claimable ids" | reason: stale data at click time -->

## Self-Correction: Click-Time Computation

**Issue**: Using useMemo for claimable IDs caused stale data
**Fix**: Compute at click time for always-fresh data
**Pattern**: click_time_vs_open_time_computation
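
A correction entry like the one above is regular enough to generate from its parts. The following is a sketch of such a renderer; `render_correction` and its parameter names are hypothetical, and the layout simply mirrors the example.

```python
def render_correction(
    date: str, was: str, reason: str, title: str, issue: str, fix: str, pattern: str
) -> str:
    """Render a correction marker followed by its Self-Correction section."""
    lines = [
        f'<!-- Correction: {date} | was: "{was}" | reason: {reason} -->',
        "",
        f"## Self-Correction: {title}",
        "",
        f"**Issue**: {issue}",
        f"**Fix**: {fix}",
        f"**Pattern**: {pattern}",
    ]
    return "\n".join(lines)
```

Generating the entry from structured fields keeps the marker and the visible section in sync, which matters if markers are later parsed for traceability.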

Self-Validation

Use the validation template in references/appendix.md when reviewing updates.

Hooks Integration

Wiring Hooks in Claude Code Settings

Add to Claude Code settings (~/.claude/settings.json):

{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "Bash|Write|Edit",
        "hooks": [
          {
            "type": "command",
            "command": "bash ${SKILLS_DIR}/self-improving-agent/hooks/pre-tool.sh \"$TOOL_NAME\" \"$TOOL_INPUT\""
          }
        ]
      }
    ],
    "PostToolUse": [
      {
        "matcher": "Bash",
        "hooks": [
          {
            "type": "command",
            "command": "bash ${SKILLS_DIR}/self-improving-agent/hooks/post-bash.sh \"$TOOL_OUTPUT\" \"$EXIT_CODE\""
          }
        ]
      }
    ],
    "Stop": [
      {
        "matcher": "",
        "hooks": [
          {
            "type": "command",
            "command": "bash ${SKILLS_DIR}/self-improving-agent/hooks/session-end.sh"
          }
        ]
      }
    ]
  }
}

Replace ${SKILLS_DIR} with your actual skills path.
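
The hook scripts themselves are not shown in this document. As a rough illustration of what `post-bash.sh` might do with the output and exit code it is given, here is a Python sketch that writes `working/last_error.json` on failure. The function name, the JSON field names and the 2000-character cap are all assumptions, and how Claude Code actually delivers tool output to a hook should be checked against its hooks documentation.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def record_bash_result(output: str, exit_code: int, memory_dir: Path) -> bool:
    """Write working/last_error.json when exit_code is non-zero; return True if written."""
    if exit_code == 0:
        return False
    working = memory_dir / "working"
    working.mkdir(parents=True, exist_ok=True)
    payload = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "exit_code": exit_code,
        "output_tail": output[-2000:],  # arbitrary cap to keep the file small
    }
    (working / "last_error.json").write_text(json.dumps(payload, indent=2), encoding="utf-8")
    return True
```

Keeping only the tail of the output is a design choice: error messages usually appear at the end, and the self-correction step only needs enough context to identify which guidance was followed.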

Additional References

See references/appendix.md for memory structure, workflow diagrams, metrics, feedback templates, and research links.

Best Practices

DO

✅ Learn from EVERY skill interaction
✅ Extract patterns at the right abstraction level
✅ Update multiple related skills
✅ Track confidence and application counts
✅ Ask for user feedback on improvements
✅ Use evolution/correction markers for traceability
✅ Validate guidance before applying broadly

DON'T

❌ Over-generalize from single experiences
❌ Update skills without confidence tracking
❌ Ignore negative feedback
❌ Make changes that break existing functionality
❌ Create contradictory patterns
❌ Update skills without understanding context

Quick Start

After any skill completes, this agent automatically:

Analyzes what happened
Extracts patterns and insights
Updates relevant skill files
Logs to memory for future reference
Reports summary to user
