RAG Engineer Skill in Detail: Retrieval-Augmented Generation System Architecture and Best Practices

rag-engineer by sickn33/antigravity-awesome-skills

406 weekly installs

27,400 GitHub Stars

Install command

npx skills add https://github.com/sickn33/antigravity-awesome-skills --skill rag-engineer

AI/Machine Learning, System Architecture, Natural Language Processing

🇺🇸English

RAG Engineer

Role: RAG Systems Architect

I bridge the gap between raw documents and LLM understanding. I know that retrieval quality determines generation quality - garbage in, garbage out. I obsess over chunking boundaries, embedding dimensions, and similarity metrics because they make the difference between helpful and hallucinating.

Capabilities

Vector embeddings and similarity search
Document chunking and preprocessing
Retrieval pipeline design
Semantic search implementation
Context window optimization
Hybrid search (keyword + semantic)

Requirements

LLM fundamentals
Understanding of embeddings
Basic NLP concepts

Patterns

Semantic Chunking

Chunk by meaning, not arbitrary token counts

- Use sentence boundaries, not token limits
- Detect topic shifts with embedding similarity
- Preserve document structure (headers, paragraphs)
- Include overlap for context continuity
- Add metadata for filtering
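
The first two bullets above can be sketched as follows. This is a minimal illustration, not the skill's own implementation: the names `split_sentences`, `toy_embed`, `semantic_chunks` and the `threshold` value are hypothetical, and `toy_embed` is a bag-of-words stand-in for a real embedding model. Overlap and metadata are left out to keep the sketch short.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def toy_embed(sentence):
    """Stand-in embedding (assumption): word counts instead of a real model."""
    return Counter(re.findall(r"[a-z0-9]+", sentence.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def semantic_chunks(text, threshold=0.1, embed=toy_embed):
    """Group consecutive sentences; start a new chunk when the similarity
    between neighbouring sentences drops below `threshold` (a topic shift)."""
    sentences = split_sentences(text)
    if not sentences:
        return []
    chunks = [[sentences[0]]]
    for prev, cur in zip(sentences, sentences[1:]):
        if cosine(embed(prev), embed(cur)) < threshold:
            chunks.append([cur])  # topic shift: open a new chunk
        else:
            chunks[-1].append(cur)  # same topic: extend the current chunk
    return [" ".join(chunk) for chunk in chunks]
```

Chunks always end on sentence boundaries, so no sentence is cut in half. With a real embedding model passed in as `embed`, the threshold would need to be tuned on representative documents.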

Hierarchical Retrieval

Multi-level retrieval for better precision

- Index at multiple chunk sizes (paragraph, section, document)
- First pass: coarse retrieval for candidates
- Second pass: fine-grained retrieval for precision
- Use parent-child relationships for context
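
A two-pass retrieval over a parent-child index might look like the sketch below. It assumes documents are already split into sections with paragraphs, and it uses a simple token-overlap score as a placeholder for a real retriever; `overlap_score`, `hierarchical_retrieve` and `EXAMPLE_SECTIONS` are hypothetical names introduced for illustration.

```python
import re


def tokens(text):
    """Lowercased alphanumeric tokens as a set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def overlap_score(query, text):
    """Placeholder relevance score: fraction of query tokens present in text."""
    q = tokens(query)
    return len(q & tokens(text)) / len(q) if q else 0.0


def hierarchical_retrieve(query, sections, top_sections=1, top_paragraphs=2):
    # First pass (coarse): rank whole sections and keep the best candidates.
    ranked = sorted(
        sections,
        key=lambda s: overlap_score(
            query, s["title"] + " " + " ".join(s["paragraphs"])
        ),
        reverse=True,
    )[:top_sections]
    # Second pass (fine): rank paragraphs inside the surviving sections,
    # keeping the parent section title so it can be shown as context.
    candidates = [
        {"section": s["title"], "paragraph": p, "score": overlap_score(query, p)}
        for s in ranked
        for p in s["paragraphs"]
    ]
    candidates.sort(key=lambda c: c["score"], reverse=True)
    return candidates[:top_paragraphs]


# Hypothetical two-level index used only for illustration.
EXAMPLE_SECTIONS = [
    {
        "title": "Installation",
        "paragraphs": [
            "Install the package with pip.",
            "Set the API key in your environment.",
        ],
    },
    {
        "title": "Chunking",
        "paragraphs": [
            "Split documents on sentence boundaries.",
            "Add overlap between neighbouring chunks.",
        ],
    },
]
```

Calling `hierarchical_retrieve("how much overlap between chunks", EXAMPLE_SECTIONS)` first narrows the search to the best-matching section and then ranks only that section's paragraphs, returning each hit together with its parent title.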

Hybrid Search

Combine semantic and keyword search

- BM25/TF-IDF for keyword matching
- Vector similarity for semantic matching
- Reciprocal Rank Fusion for combining the keyword and vector rankings
- Weight tuning based on query type
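
Reciprocal Rank Fusion can be sketched in a few lines. Each document receives `weight / (k + rank)` from every ranking it appears in, and the contributions are summed. The function name and the optional `weights` parameter are illustrative assumptions; `k=60` is a commonly used default rather than a value taken from this skill.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60, weights=None):
    """Fuse several ranked lists of document ids into one list.

    rankings: list of lists, each ordered best-first (e.g. BM25, vector search).
    weights:  optional per-ranking weights, e.g. to favour keyword matches
              for queries that look like exact identifiers.
    """
    weights = weights or [1.0] * len(rankings)
    scores = defaultdict(float)
    for ranking, weight in zip(rankings, weights):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += weight / (k + rank)
    # Highest fused score first; ties are broken by document id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


keyword_hits = ["d1", "d2", "d3"]  # hypothetical BM25 ranking
vector_hits = ["d3", "d1", "d4"]   # hypothetical vector-similarity ranking
fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
# fused == ["d1", "d3", "d2", "d4"]
```

Because only ranks are used, the BM25 and cosine-similarity scores do not need to be normalised onto a common scale before fusion.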

Anti-Patterns

❌ Fixed Chunk Size

❌ Embedding Everything

❌ Ignoring Evaluation

⚠️ Sharp Edges

Issue	Severity	Solution
Fixed-size chunking breaks sentences and context	High	Use semantic chunking that respects document structure
Pure semantic search without metadata pre-filtering	Medium	Implement hybrid filtering
Using same embedding model for different content types	Medium	Evaluate embeddings per content type
Using first-stage retrieval results directly	Medium	Add reranking step
Cramming maximum context into LLM prompt	Medium	Use relevance thresholds
Not measuring retrieval quality separately from generation	High	Separate retrieval evaluation
Not updating embeddings when source documents change	Medium	Implement embedding refresh
Same retrieval strategy for all query types	Medium	Implement hybrid search
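
For the "separate retrieval evaluation" item, one possible starting point is to score the retriever alone against a small labelled set, before any generation happens. The sketch below computes recall@k and mean reciprocal rank (MRR); the function names and the shape of `labelled_queries` are assumptions made for illustration.

```python
def recall_at_k(retrieved, relevant, k):
    """Fraction of the relevant ids that appear in the top-k retrieved ids."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    return len(set(retrieved[:k]) & relevant) / len(relevant)


def reciprocal_rank(retrieved, relevant):
    """1 / rank of the first relevant id, or 0.0 if none was retrieved."""
    relevant = set(relevant)
    for rank, doc_id in enumerate(retrieved, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def evaluate_retriever(retrieve, labelled_queries, k=3):
    """Average recall@k and MRR over (query, relevant_ids) pairs.

    `retrieve` is any callable mapping a query string to a ranked id list.
    """
    if not labelled_queries:
        return {"recall@k": 0.0, "mrr": 0.0}
    recalls, rrs = [], []
    for query, relevant in labelled_queries:
        retrieved = retrieve(query)
        recalls.append(recall_at_k(retrieved, relevant, k))
        rrs.append(reciprocal_rank(retrieved, relevant))
    n = len(labelled_queries)
    return {"recall@k": sum(recalls) / n, "mrr": sum(rrs) / n}
```

Tracking these numbers separately from answer-quality metrics makes it possible to tell whether a bad answer came from missing context or from the generation step.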

Related Skills

Works well with: ai-agents-architect, prompt-engineer, database-architect, backend

When to Use

Use this skill to carry out the workflow or actions described in the overview.

Weekly Installs

406

Repository

sickn33/antigra…e-skills

GitHub Stars

27.4K

First Seen

Jan 19, 2026

Security Audits

Gen Agent Trust Hub: Pass, Socket: Pass, Snyk: Pass

Installed on

opencode: 335
gemini-cli: 324
claude-code: 299
codex: 283
cursor: 271
antigravity: 261