context-compression by sickn33/antigravity-awesome-skills
npx skills add https://github.com/sickn33/antigravity-awesome-skills --skill context-compression
When agent sessions generate millions of tokens of conversation history, compression becomes mandatory. The naive approach is aggressive compression to minimize tokens per request. The correct optimization target is tokens per task: total tokens consumed to complete a task, including re-fetching costs when compression loses critical information.
Activate this skill when an agent session's conversation history approaches the context window limit and must be compressed without losing task-critical information.
Context compression trades token savings against information loss. Three production-ready approaches exist:
Anchored Iterative Summarization : Maintain structured, persistent summaries with explicit sections for session intent, file modifications, decisions, and next steps. When compression triggers, summarize only the newly-truncated span and merge with the existing summary. Structure forces preservation by dedicating sections to specific information types.
Opaque Compression : Produce compressed representations optimized for reconstruction fidelity. Achieves highest compression ratios (99%+) but sacrifices interpretability. Cannot verify what was preserved.
Regenerative Full Summary : Generate detailed structured summaries on each compression. Produces readable output but may lose details across repeated compression cycles due to full regeneration rather than incremental merging.
The critical insight: structure forces preservation. Dedicated sections act as checklists that the summarizer must populate, preventing silent information drift.
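A minimal sketch of that anchored merge step, assuming summaries are held as section-to-lines mappings. The section names and the merge policy (which sections are cumulative versus volatile) are illustrative, not prescribed by any particular framework:

```python
# Hypothetical sketch of anchored iterative summarization: merge a
# summary of the newly-truncated span ("delta") into the persistent
# anchored summary, section by section, so no section silently vanishes.

ANCHOR_SECTIONS = [
    "Session Intent", "Files Modified", "Decisions Made",
    "Current State", "Next Steps",
]

def merge_summaries(anchored: dict, delta: dict) -> dict:
    """Merge the delta summary into the anchored summary.
    Every anchor section must appear in the result."""
    merged = {}
    for section in ANCHOR_SECTIONS:
        old = anchored.get(section, [])
        new = delta.get(section, [])
        if section in ("Current State", "Next Steps"):
            # Volatile sections: the newest information wins.
            merged[section] = new or old
        else:
            # Cumulative sections: append new items, dropping duplicates.
            merged[section] = old + [x for x in new if x not in old]
    return merged

anchored = {
    "Session Intent": ["Debug 401 on /api/auth/login"],
    "Files Modified": ["config/redis.ts: fixed pooling"],
    "Current State": ["12 tests passing, 4 failing"],
}
delta = {
    "Files Modified": ["tests/auth.test.ts: updated mock setup"],
    "Current State": ["14 tests passing, 2 failing"],
}
result = merge_summaries(anchored, delta)
```

Because the loop iterates over the fixed section list rather than over whatever the summarizer produced, an omitted section surfaces as an empty entry instead of disappearing.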
Traditional compression metrics target tokens-per-request. This is the wrong optimization. When compression loses critical details like file paths or error messages, the agent must re-fetch information, re-explore approaches, and waste tokens recovering context.
The right metric is tokens-per-task: total tokens consumed from task start to completion. A compression strategy saving 0.5% more tokens but causing 20% more re-fetching costs more overall.
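To make the arithmetic concrete, here is a toy comparison; the request sizes, request counts, and re-fetch costs are invented for illustration:

```python
# Toy tokens-per-task comparison. All figures are hypothetical.

def tokens_per_task(tokens_per_request: int, requests: int,
                    refetch_tokens: int) -> int:
    """Total tokens from task start to completion, including the cost
    of re-fetching information that compression dropped."""
    return tokens_per_request * requests + refetch_tokens

# Structured summary: slightly larger requests, modest re-fetching.
structured = tokens_per_task(10_000, 40, refetch_tokens=50_000)

# Aggressive compression: 0.5% smaller requests, 20% more re-fetching
# because file paths and error messages must be rediscovered.
aggressive = tokens_per_task(9_950, 40, refetch_tokens=60_000)
```

Per request the aggressive strategy looks cheaper (9,950 vs 10,000 tokens), yet it costs more per task (458,000 vs 450,000) once re-fetching is counted.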
Artifact trail integrity is the weakest dimension across all compression methods, scoring 2.2-2.5 out of 5.0 in evaluations. Even structured summarization with explicit file sections struggles to maintain complete file tracking across long sessions.
Coding agents need to know which files were read, modified, or created, and the current state of each.
This problem likely requires specialized handling beyond general summarization: a separate artifact index or explicit file-state tracking in agent scaffolding.
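One possible shape for such an index, kept in the agent scaffolding rather than inside the summary so it survives compression cycles. The class and its status vocabulary are hypothetical:

```python
# Hypothetical artifact index maintained outside the compressed summary.

class ArtifactIndex:
    def __init__(self):
        self._files: dict[str, str] = {}  # path -> last known status

    def record(self, path: str, status: str) -> None:
        """Update a file's status ('read', 'modified', 'created', ...)."""
        self._files[path] = status

    def modified(self) -> list[str]:
        """Paths whose content the session has changed."""
        return sorted(p for p, s in self._files.items()
                      if s in ("modified", "created"))

    def render(self) -> str:
        """Emit a section that is appended verbatim to every summary,
        so file tracking never depends on the summarizer's recall."""
        lines = [f"- {p}: {s}" for p, s in sorted(self._files.items())]
        return "## Artifact Index\n" + "\n".join(lines)

idx = ArtifactIndex()
idx.record("auth.controller.ts", "read")
idx.record("config/redis.ts", "modified")
idx.record("tests/auth.test.ts", "modified")
```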
Effective structured summaries include explicit sections:
## Session Intent
[What the user is trying to accomplish]
## Files Modified
- auth.controller.ts: Fixed JWT token generation
- config/redis.ts: Updated connection pooling
- tests/auth.test.ts: Added mock setup for new config
## Decisions Made
- Using Redis connection pool instead of per-request connections
- Retry logic with exponential backoff for transient failures
## Current State
- 14 tests passing, 2 failing
- Remaining: mock setup for session service tests
## Next Steps
1. Fix remaining test failures
2. Run full test suite
3. Update documentation
This structure prevents silent loss of file paths or decisions because each section must be explicitly addressed.
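That requirement can be enforced mechanically: reject a compression step whose summary leaves a required section absent or empty. A minimal validator sketch, assuming the section names from the template above:

```python
import re

REQUIRED_SECTIONS = ["Session Intent", "Files Modified", "Decisions Made",
                     "Current State", "Next Steps"]

def missing_sections(summary_md: str) -> list[str]:
    """Return required sections that are absent or empty, so a
    compression step can be rejected before information is lost."""
    present = {}
    # Split on '## ' headings and capture each section's body.
    for m in re.finditer(r"^## (.+?)\n(.*?)(?=^## |\Z)", summary_md,
                         re.M | re.S):
        present[m.group(1).strip()] = m.group(2).strip()
    return [s for s in REQUIRED_SECTIONS if not present.get(s)]

summary = "## Session Intent\nFix login bug\n## Files Modified\n\n## Next Steps\n1. Run tests\n"
```

Here `missing_sections(summary)` flags both the empty Files Modified section and the two sections the summarizer dropped entirely.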
When to trigger compression matters as much as how to compress:
| Strategy | Trigger Point | Trade-off |
|---|---|---|
| Fixed threshold | 70-80% context utilization | Simple but may compress too early |
| Sliding window | Keep last N turns + summary | Predictable context size |
| Importance-based | Compress low-relevance sections first | Complex but preserves signal |
| Task-boundary | Compress at logical task completions | Clean summaries but unpredictable timing |
The sliding window approach with structured summaries provides the best balance of predictability and quality for most coding agent use cases.
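A sliding-window trigger is simple to sketch. The turn budget, window size, and the `summarize()` stub are placeholders for real tuned values and a real LLM call:

```python
# Sketch of a sliding-window compression trigger: keep the last N turns
# verbatim plus one summary; compress when the turn budget is exceeded.

def summarize(turns: list[str]) -> str:
    # Stand-in for an LLM call producing a structured summary.
    return f"[summary of {len(turns)} turns]"

def compress_history(history: list[str], keep_last: int = 4,
                     budget_turns: int = 8) -> list[str]:
    """If history exceeds the budget, replace everything except the
    last `keep_last` turns with a single summary message."""
    if len(history) <= budget_turns:
        return history
    head, tail = history[:-keep_last], history[-keep_last:]
    return [summarize(head)] + tail

history = [f"turn {i}" for i in range(10)]
compressed = compress_history(history)
```

The resulting context size is predictable: at most `keep_last` verbatim turns plus one summary, which is the property the table credits to this strategy.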
Traditional metrics like ROUGE or embedding similarity fail to capture functional compression quality. A summary may score high on lexical overlap while missing the one file path the agent needs.
Probe-based evaluation directly measures functional quality by asking questions after compression:
| Probe Type | What It Tests | Example Question |
|---|---|---|
| Recall | Factual retention | "What was the original error message?" |
| Artifact | File tracking | "Which files have we modified?" |
| Continuation | Task planning | "What should we do next?" |
| Decision | Reasoning chain | "What did we decide about the Redis issue?" |
If compression preserved the right information, the agent answers correctly. If not, it guesses or hallucinates.
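A minimal probe harness might look like the following. The `ask()` stub stands in for running the agent over the compressed context, and the expected key facts are illustrative:

```python
# Sketch of probe-based evaluation: after compressing, ask probe
# questions and score whether key facts survive in the answers.

PROBES = [
    ("recall",       "What was the original error message?", "401"),
    ("artifact",     "Which files have we modified?",        "config/redis.ts"),
    ("continuation", "What should we do next?",              "test"),
    ("decision",     "What did we decide about Redis?",      "connection pool"),
]

def ask(compressed_context: str, question: str) -> str:
    # Stand-in for an agent answering from the compressed context only.
    return compressed_context

def probe_score(compressed_context: str) -> float:
    """Fraction of probes whose expected key fact appears in the answer."""
    hits = sum(
        1 for _, question, expected in PROBES
        if expected.lower() in ask(compressed_context, question).lower()
    )
    return hits / len(PROBES)

good = "Debugging 401 on /api/auth/login; modified config/redis.ts to use a connection pool; next run tests."
bad = "We were debugging an authentication issue and fixed some config."
```

Substring matching is a crude grader; in practice an LLM judge or exact-fact checklist scores the answers, but the harness shape is the same.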
Six dimensions capture compression quality for coding agents.
Accuracy shows the largest variation between compression methods (0.6 point gap). Artifact trail is universally weak (2.2-2.5 range).
For large codebases or agent systems exceeding context windows, apply compression through three phases:
Research Phase : Produce a research document from architecture diagrams, documentation, and key interfaces. Compress exploration into a structured analysis of components and dependencies. Output: single research document.
Planning Phase : Convert research into implementation specification with function signatures, type definitions, and data flow. A 5M token codebase compresses to approximately 2,000 words of specification.
Implementation Phase : Execute against the specification. Context remains focused on the spec rather than raw codebase exploration.
When provided with a manual migration example or reference PR, use it as a template to understand the target pattern. The example reveals constraints that static analysis cannot surface: which invariants must hold, which services break on changes, and what a clean migration looks like.
This is particularly important when the agent cannot distinguish essential complexity (business requirements) from accidental complexity (legacy workarounds). The example artifact encodes that distinction.
Use anchored iterative summarization when re-fetching lost details is expensive and sessions span many compression cycles; its fixed sections prevent silent information drift.
Use opaque compression when maximum compression ratio matters more than being able to verify or read what was preserved.
Use regenerative summaries when readable one-shot output is the priority and compression happens infrequently, since full regeneration can lose details across repeated cycles.
| Method | Compression Ratio | Quality Score | Trade-off |
|---|---|---|---|
| Anchored Iterative | 98.6% | 3.70 | Best quality, slightly less compression |
| Regenerative | 98.7% | 3.44 | Good quality, moderate compression |
| Opaque | 99.3% | 3.35 | Best compression, quality loss |
The 0.7% additional tokens retained by structured summarization buys 0.35 quality points. For any task where re-fetching costs matter, this trade-off favors structured approaches.
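The table's figures reduce to a simple calculation:

```python
# Quality-per-retained-token arithmetic from the comparison table above.
methods = {
    "anchored_iterative": {"ratio": 0.986, "quality": 3.70},
    "opaque":             {"ratio": 0.993, "quality": 3.35},
}

# Fraction of the original tokens each method retains.
retained = {name: 1 - m["ratio"] for name, m in methods.items()}

extra_retained = retained["anchored_iterative"] - retained["opaque"]
quality_gain = (methods["anchored_iterative"]["quality"]
                - methods["opaque"]["quality"])
```

Roughly 0.7% of extra retained context buys 0.35 quality points, which is the trade-off the comparison reports.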
Example 1: Debugging Session Compression
Original context (89,000 tokens, 178 messages): debugging a 401 Unauthorized error on the /api/auth/login endpoint.
Structured summary after compression:
## Session Intent
Debug 401 Unauthorized error on /api/auth/login despite valid credentials.
## Root Cause
Stale Redis connection in session store. JWT generated correctly but session could not be persisted.
## Files Modified
- auth.controller.ts: No changes (read only)
- middleware/cors.ts: No changes (examined)
- config/redis.ts: Fixed connection pooling configuration
- services/session.service.ts: Added retry logic for transient failures
- tests/auth.test.ts: Updated mock setup
## Test Status
14 passing, 2 failing (mock setup issues)
## Next Steps
1. Fix remaining test failures (mock session service)
2. Run full test suite
3. Deploy to staging
Example 2: Probe Response Quality
After compression, asking "What was the original error?":
Good response (structured summarization):
"The original error was a 401 Unauthorized response from the /api/auth/login endpoint. Users received this error with valid credentials. Root cause was stale Redis connection in session store."
Poor response (aggressive compression):
"We were debugging an authentication issue. The login was failing. We fixed some configuration problems."
The structured response preserves endpoint, error code, and root cause. The aggressive response loses all technical detail.
Created: 2025-12-22 · Last Updated: 2025-12-26 · Author: Agent Skills for Context Engineering Contributors · Version: 1.1.0
Weekly Installs: 323
GitHub Stars: 27.1K
First Seen: Feb 1, 2026
Security Audits: Gen Agent Trust Hub (Pass), Socket (Pass), Snyk (Pass)
Installed on: gemini-cli (305), opencode (305), codex (305), github-copilot (302), kimi-cli (301), amp (299)