npx skills add https://github.com/crinkj/common-claude-setting --skill memory-systems
Memory provides the persistence layer that allows agents to maintain continuity across sessions and reason over accumulated knowledge. Simple agents rely entirely on context for memory, losing all state when sessions end. Sophisticated agents implement layered memory architectures that balance immediate context needs with long-term knowledge retention. The evolution from vector stores to knowledge graphs to temporal knowledge graphs represents increasing investment in structured memory for improved retrieval and reasoning.
Activate this skill when:
Memory spans a spectrum from volatile context window to persistent storage. Key insight from benchmarks: tool complexity matters less than reliable retrieval — Letta's filesystem agents scored 74% on LoCoMo using basic file operations, beating Mem0's specialized tools at 68.5%. Start simple, add structure (graphs, temporal validity) only when retrieval quality demands it.
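The "start simple" end of this spectrum can be sketched as plain files with a naming convention. This is a hypothetical helper, not any framework's API; `FileMemory` and its file-naming scheme are illustrative assumptions:

```python
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path

class FileMemory:
    """Minimal file-system memory: one JSON file per entry, named by topic and time."""

    def __init__(self, root):
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)
        self._n = 0  # tie-breaker so same-microsecond writes never collide

    def add(self, topic, text):
        # Naming convention: <topic>--<UTC timestamp>-<counter>.json
        self._n += 1
        stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%S%f")
        path = self.root / f"{topic}--{stamp}-{self._n:06d}.json"
        path.write_text(json.dumps({"topic": topic, "text": text}))

    def search(self, topic):
        # Retrieval is a glob over the naming convention -- no embeddings needed.
        return [json.loads(p.read_text())["text"]
                for p in sorted(self.root.glob(f"{topic}--*.json"))]

mem = FileMemory(tempfile.mkdtemp())
mem.add("preferences", "User prefers dark mode")
mem.add("preferences", "User uses Python 3.12")
assert mem.search("preferences") == ["User prefers dark mode", "User uses Python 3.12"]
```

No semantic search and no relationships, exactly as the table below notes, but for many prototypes this is already reliable retrieval.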
| Framework | Architecture | Best For | Trade-off |
|---|---|---|---|
| Mem0 | Vector store + graph memory, pluggable backends | Multi-tenant systems, broad integrations | Less specialized for multi-agent |
| Zep/Graphiti | Temporal knowledge graph, bi-temporal model | Enterprise requiring relationship modeling + temporal reasoning | Advanced features cloud-locked |
| Letta | Self-editing memory with tiered storage (in-context/core/archival) | Full agent introspection, stateful services | Complexity for simple use cases |
| Cognee | Multi-layer semantic graph via customizable ECL pipeline with customizable Tasks | Evolving agent memory that adapts and learns; multi-hop reasoning | Heavier ingest-time processing |
| LangMem | Memory tools for LangGraph workflows | Teams already on LangGraph | Tightly coupled to LangGraph |
| File-system | Plain files with naming conventions | Simple agents, prototyping | No semantic search, no relationships |
Zep's Graphiti engine builds a three-tier knowledge graph (episode, semantic entity, community subgraphs) with a bi-temporal model tracking both when events occurred and when they were ingested. Mem0 offers the fastest path to production with managed infrastructure. Letta provides the deepest agent control through its Agent Development Environment. Cognee produces multi-layer semantic graphs — it layers text chunks and entity types as nodes with detailed relationship edges, building an interconnected knowledge engine. Every core piece (ingestion, entity extraction, post-processing, retrieval) is customizable.
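The bi-temporal idea itself is simple to model. A minimal sketch (the `BiTemporalFact` type is a hypothetical illustration, not Graphiti's schema): each fact carries an event-time axis and a transaction-time axis.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class BiTemporalFact:
    """A fact on two time axes, in the spirit of a bi-temporal model:
    occurred_at = when it happened (event time),
    ingested_at = when the agent learned it (transaction time)."""
    subject: str
    predicate: str
    obj: str
    occurred_at: datetime
    ingested_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

# The agent learns on June 1 about a move that happened in January:
fact = BiTemporalFact(
    subject="alice", predicate="LIVES_IN", obj="Berlin",
    occurred_at=datetime(2024, 1, 15, tzinfo=timezone.utc),
    ingested_at=datetime(2024, 6, 1, tzinfo=timezone.utc),
)
# Event-time queries ("where did she live in March?") filter on occurred_at;
# audit queries ("what did the agent know in May?") filter on ingested_at.
assert fact.occurred_at < fact.ingested_at
```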
Benchmark Performance Comparison
| System | DMR Accuracy | LoCoMo | HotPotQA (multi-hop) | Latency |
|---|---|---|---|---|
| Cognee | — | — | Highest on EM, F1, Correctness | Variable |
| Zep (Temporal KG) | 94.8% | — | Mid-range across metrics | 2.58s |
| Letta (filesystem) | — | 74.0% | — | — |
| Mem0 | — | 68.5% | Lowest across metrics | — |
| MemGPT | 93.4% | — | — | Variable |
| GraphRAG | ~75-85% | — | — | Variable |
| Vector RAG baseline | ~60-70% | — | — | Fast |
Zep achieves up to 18.5% accuracy improvement on LongMemEval while reducing latency by 90%. Cognee outperformed Mem0, Graphiti, and LightRAG on HotPotQA multi-hop reasoning benchmarks across Exact Match, F1, and human-like correctness metrics. Letta's filesystem-based agents achieved 74% on LoCoMo using basic file operations, outperforming specialized memory tools — tool complexity matters less than reliable retrieval. No single benchmark is definitive; treat these as signals for specific retrieval dimensions rather than rankings.
| Layer | Persistence | Implementation | When to Use |
|---|---|---|---|
| Working | Context window only | Scratchpad in system prompt | Always — optimize with attention-favored positions |
| Short-term | Session-scoped | File-system, in-memory cache | Intermediate tool results, conversation state |
| Long-term | Cross-session | Key-value store → graph DB | User preferences, domain knowledge, entity registries |
| Entity | Cross-session | Entity registry + properties | Maintaining identity ("John Doe" = same person across conversations) |
| Temporal KG | Cross-session + history | Graph with validity intervals | Facts that change over time, time-travel queries, preventing context clash |
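The entity layer in the table above is essentially alias resolution plus a property store. A minimal sketch, with `EntityRegistry` as a hypothetical helper rather than any framework's API:

```python
class EntityRegistry:
    """Entity-memory sketch: resolve aliases to one canonical id so
    'John Doe' stays the same person across conversations."""

    def __init__(self):
        self._canonical = {}   # lowercased alias -> canonical id
        self._properties = {}  # canonical id -> accumulated properties

    def register(self, canonical_id, aliases=()):
        self._canonical[canonical_id.lower()] = canonical_id
        for alias in aliases:
            self._canonical[alias.lower()] = canonical_id
        self._properties.setdefault(canonical_id, {})

    def resolve(self, mention):
        return self._canonical.get(mention.lower())

    def remember(self, mention, key, value):
        # Properties accumulate on the canonical entity, whatever alias was used.
        entity = self.resolve(mention)
        if entity is not None:
            self._properties[entity][key] = value
        return entity

reg = EntityRegistry()
reg.register("john_doe", aliases=["John Doe", "John", "Mr. Doe"])
reg.remember("John", "role", "customer")       # session 1
reg.remember("Mr. Doe", "plan", "enterprise")  # session 2
# Both mentions landed on the same entity record.
```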
| Strategy | Use When | Limitation |
|---|---|---|
| Semantic (embedding similarity) | Direct factual queries | Degrades on multi-hop reasoning |
| Entity-based (graph traversal) | "Tell me everything about X" | Requires graph structure |
| Temporal (validity filter) | Facts change over time | Requires validity metadata |
| Hybrid (semantic + keyword + graph) | Best overall accuracy | Most infrastructure |
Zep's hybrid approach achieves 90% latency reduction (2.58s vs 28.9s) by retrieving only relevant subgraphs. Cognee implements hybrid retrieval through its 14 search modes — each mode combines different strategies from its three-store architecture (graph, vector, relational), letting agents select the retrieval strategy that fits the query type rather than using a one-size-fits-all approach.
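The core of any hybrid strategy is blending scores from multiple retrievers. A minimal two-signal sketch (not Zep's or Cognee's implementation; `semantic_scorer` stands in for a real embedding backend, and the blend weight `alpha` is an illustrative assumption):

```python
def hybrid_search(query, memories, semantic_scorer, alpha=0.6, top_k=3):
    """Blend a semantic score (0-1, e.g. cosine similarity from embeddings)
    with a keyword-overlap score, then return the top_k memories."""
    q_tokens = set(query.lower().split())

    def keyword_score(text):
        t = set(text.lower().split())
        return len(q_tokens & t) / max(len(q_tokens), 1)

    scored = [
        (alpha * semantic_scorer(query, m) + (1 - alpha) * keyword_score(m), m)
        for m in memories
    ]
    return [m for _, m in sorted(scored, key=lambda x: -x[0])[:top_k]]

# Toy semantic scorer for demonstration; a real system would use embeddings.
def toy_scorer(query, text):
    return 1.0 if "theme" in query.lower() and "theme" in text.lower() else 0.0

mems = ["User theme is dark mode", "User lives in Berlin", "User likes Python"]
top = hybrid_search("what theme does the user prefer", mems, toy_scorer, top_k=1)
```

A production version would add a graph-traversal signal as a third term, which is where most of the infrastructure cost in the table above comes from.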
Consolidate periodically to prevent unbounded growth. Invalidate but don't discard — preserving history matters for temporal queries. Trigger on memory count thresholds, degraded retrieval quality, or scheduled intervals. See implementation reference for working consolidation code.
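A count-threshold trigger that invalidates rather than discards can be sketched as follows. All names here are hypothetical, and `consolidate_fn` stands in for an LLM summarization call:

```python
from datetime import datetime, timezone

def maybe_consolidate(memories, max_count=500, consolidate_fn=None):
    """If the store exceeds max_count, summarize already-expired entries
    into one summary memory. Expired entries are flagged, not deleted,
    so temporal queries can still see history."""
    if len(memories) <= max_count:
        return memories
    live = [m for m in memories if m.get("valid_until") is None]
    stale = [m for m in memories if m.get("valid_until") is not None]
    summarize = consolidate_fn or (lambda ms: f"summary of {len(ms)} entries")
    now = datetime.now(timezone.utc)
    for m in stale:
        m["consolidated_at"] = now  # invalidate, don't discard
    summary = {"text": summarize(stale), "valid_until": None, "sources": len(stale)}
    return live + stale + [summary]

store = [
    {"text": "likes dark mode", "valid_until": "2024-06-01"},
    {"text": "likes light mode", "valid_until": None},
    {"text": "on Python 3.11", "valid_until": "2024-10-01"},
]
store = maybe_consolidate(store, max_count=2)
```

Degraded-retrieval and scheduled triggers would call the same function from different entry points.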
Start simple, add complexity only when retrieval fails. Most agents don't need a temporal knowledge graph on day one.
Memories must integrate with context systems to be useful. Use just-in-time memory loading to retrieve relevant memories when needed. Use strategic injection to place memories in attention-favored positions (beginning/end of context).
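Strategic injection reduces to where you splice memories into the assembled prompt. A minimal sketch under the assumptions that `memories` arrives already relevance-ranked and that the helper names are hypothetical:

```python
def build_context(system_prompt, history, memories, max_memories=5):
    """Place memories at attention-favored positions: key facts right after
    the system prompt, a short reminder at the very end, conversation in
    the middle."""
    selected = memories[:max_memories]  # stand-in for just-in-time retrieval
    head = system_prompt + "\n\nRelevant memories:\n" + "\n".join(
        f"- {m}" for m in selected
    )
    tail = "Reminder: honor the user preferences listed above."
    return "\n\n".join([head, *history, tail])

ctx = build_context(
    "You are a helpful assistant.",
    ["User: set up my editor", "Assistant: which editor do you use?"],
    ["User prefers light mode", "User runs Python 3.12"],
)
```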
Filter retrieved memories by their valid_until timestamps; if most results are expired, trigger consolidation before retrying. When facts conflict, prefer the one with the most recent valid_from, and surface the conflict to the user if confidence is low.
Example 1: Mem0 Integration
```python
from mem0 import Memory

m = Memory()
m.add("User prefers dark mode and Python 3.12", user_id="alice")
m.add("User switched to light mode", user_id="alice")

# Retrieves the current preference (light mode), not the outdated one
results = m.search("What theme does the user prefer?", user_id="alice")
```
Example 2: Temporal Query
```python
from datetime import datetime

# Illustrative temporal-graph client: `graph`, `user_node`, and
# `address_node` are assumed to exist in the surrounding application.

# Track an entity relationship with a validity period
graph.create_temporal_relationship(
    source_id=user_node,
    rel_type="LIVES_AT",
    target_id=address_node,
    valid_from=datetime(2024, 1, 15),
    valid_until=datetime(2024, 9, 1),  # moved out
)

# Query: where did the user live on March 1, 2024?
results = graph.query_at_time(
    {"type": "LIVES_AT", "source_label": "User"},
    query_time=datetime(2024, 3, 1),
)
```
Example 3: Cognee Memory Ingestion and Search
```python
import cognee
from cognee.modules.search.types import SearchType

# Ingest and build the knowledge graph
await cognee.add("./docs/")
await cognee.add("any data")
await cognee.cognify()

# Enrich memory
await cognee.memify()

# The agent retrieves relationship-aware context
results = await cognee.search(
    query_text="Any query for your memory",
    query_type=SearchType.GRAPH_COMPLETION,
)
```
This skill builds on context-fundamentals.
Created: 2025-12-20 | Last Updated: 2026-02-26 | Author: Agent Skills for Context Engineering Contributors | Version: 3.0.0