Vector Search Designer by jmsktm/claude-settings
npx skills add https://github.com/jmsktm/claude-settings --skill 'Vector Search Designer'
The Vector Search Designer skill helps you architect and implement vector similarity search systems that power semantic search, recommendation engines, and AI applications. It guides you through selecting the right vector database, designing index structures, optimizing query performance, and scaling to millions or billions of vectors.
Vector search has become foundational to modern AI systems, from RAG pipelines to product recommendations. This skill covers the full stack: understanding approximate nearest neighbor (ANN) algorithms, choosing between database options, tuning recall vs latency tradeoffs, and implementing production-ready search infrastructure.
Whether you are building on Pinecone, Weaviate, Qdrant, pgvector, or implementing your own solution, this skill ensures your vector search system meets your performance and accuracy requirements.
1. Choose ANN algorithm
2. Configure index parameters:
# HNSW example configuration
hnsw_config = {
    "M": 16,                # Connections per node (higher = better recall, more memory)
    "efConstruction": 200,  # Build-time search depth
    "efSearch": 100,        # Query-time search depth
}

# IVF example configuration
ivf_config = {
    "nlist": 1024,  # Number of clusters
    "nprobe": 32,   # Clusters to search at query time
}
3. Plan sharding strategy for scale
4. Design metadata schema for filtering
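A quick way to validate the parameter choices in steps 1–2 is to measure recall against exact brute-force search on a held-out query set. A minimal sketch, where `ann_search` is a stand-in for whatever index is being tuned (all names here are illustrative):

```python
import numpy as np

def recall_at_k(ann_search, queries, corpus, k=10):
    """Fraction of exact top-k neighbors that the ANN index also returns."""
    hits, total = 0, 0
    for q in queries:
        # Brute-force ground truth: indices of the k closest corpus vectors.
        exact = np.argsort(np.linalg.norm(corpus - q, axis=1))[:k]
        approx = ann_search(q, k)
        hits += len(set(exact) & set(approx))
        total += k
    return hits / total
```

Sweeping `efSearch` (or `nprobe`) while plotting recall@k against query latency makes the recall/speed tradeoff concrete for your data.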
| Action | Command/Trigger |
|---|---|
| Choose database | "Which vector database for [use case]" |
| Design index | "Design vector index for [scale]" |
| Optimize search | "Speed up vector search" |
| Add filtering | "Add metadata filters to vector search" |
| Scale vectors | "Scale to [N] million vectors" |
| Benchmark search | "Benchmark vector search performance" |
- Right-Size Your Database: Don't over-engineer for scale you don't need
- Understand the Recall vs Speed Tradeoff: ANN is approximate by design
- Use Hybrid Search When Needed: Combine vector and keyword search
- Design Metadata for Filtering: Plan your filter strategy upfront
- Batch Operations When Possible: Reduce network overhead
- Monitor and Alert: Production search needs observability
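One common way to combine vector and keyword results, as the hybrid-search practice above suggests, is reciprocal rank fusion (RRF). A minimal sketch, assuming the two inputs are ranked lists of document IDs (the parameter `k=60` is a conventional default, not something prescribed by this skill):

```python
def rrf_fuse(vector_results, keyword_results, k=60, top_n=10):
    """Reciprocal Rank Fusion: score(doc) = sum over lists of 1 / (k + rank)."""
    scores = {}
    for ranked in (vector_results, keyword_results):
        for rank, doc_id in enumerate(ranked, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Documents appearing high in both lists accumulate the largest scores.
    return sorted(scores, key=scores.get, reverse=True)[:top_n]
```

RRF needs no score normalization, which is why it is popular for fusing cosine similarities with BM25 scores that live on different scales.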
Handle documents with multiple representations:
class MultiVectorIndex:
    def __init__(self):
        self.title_index = VectorIndex(dim=768)
        self.content_index = VectorIndex(dim=768)
        self.summary_index = VectorIndex(dim=768)

    def search(self, query_embedding, weights=None):
        weights = weights or {"title": 0.3, "content": 0.5, "summary": 0.2}
        results = {}
        for field, weight in weights.items():
            index = getattr(self, f"{field}_index")
            field_results = index.search(query_embedding, k=20)
            for doc_id, score in field_results:
                results[doc_id] = results.get(doc_id, 0) + score * weight
        return sorted(results.items(), key=lambda x: x[1], reverse=True)[:10]
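The `VectorIndex` class above is assumed rather than defined. For local experimentation, a brute-force stand-in that returns `(doc_id, cosine_score)` pairs could look like this; it is a sketch, not a real library API:

```python
import numpy as np

class VectorIndex:
    def __init__(self, dim):
        self.dim = dim
        self.ids, self.vectors = [], []

    def add(self, doc_id, vector):
        v = np.asarray(vector, dtype=np.float32)
        # Store unit vectors so dot product equals cosine similarity.
        self.ids.append(doc_id)
        self.vectors.append(v / np.linalg.norm(v))

    def search(self, query, k=10):
        q = np.asarray(query, dtype=np.float32)
        q = q / np.linalg.norm(q)
        scores = np.stack(self.vectors) @ q
        order = np.argsort(-scores)[:k]
        return [(self.ids[i], float(scores[i])) for i in order]
```

Note that the weighted sum in `MultiVectorIndex.search` implicitly assumes all three indexes return scores on the same scale; cosine similarity (as here) satisfies that, while raw inner products may not.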
Optimize search with filters:
def filtered_search(query_embedding, filters, k=10):
    # Strategy 1: Pre-filter (for selective filters)
    if estimate_selectivity(filters) < 0.1:
        candidate_ids = apply_filters(filters)
        return vector_search_subset(query_embedding, candidate_ids, k)
    # Strategy 2: Post-filter (for non-selective filters)
    elif estimate_selectivity(filters) > 0.5:
        results = vector_search(query_embedding, k * 3)
        filtered = [r for r in results if matches_filters(r, filters)]
        return filtered[:k]
    # Strategy 3: Hybrid (general case)
    else:
        return vector_search_with_filters(query_embedding, filters, k)
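The `estimate_selectivity` helper above is left undefined. One rough sketch estimates the matching fraction from per-field value counts collected at index time; the extra arguments and the independence assumption are illustrative, not part of the original API:

```python
def estimate_selectivity(filters, value_counts, total_docs):
    """Estimate the fraction of documents matching all equality filters.

    value_counts: {field: {value: doc_count}} gathered at index time.
    Assumes filters on different fields are statistically independent,
    so per-field fractions are multiplied together.
    """
    selectivity = 1.0
    for field, value in filters.items():
        count = value_counts.get(field, {}).get(value, 0)
        selectivity *= count / total_docs
    return selectivity
```

Real systems often refine this with sampled joint statistics, since correlated fields (e.g. language and region) make the independence assumption overestimate selectivity.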
Reduce memory with acceptable accuracy loss:
# Product Quantization example configuration
pq_config = {
    "m": 16,     # Number of sub-quantizers
    "nbits": 8,  # Bits per sub-quantizer
    # 768-dim * 4 bytes = 3 KB/vector -> 16 codes * 1 byte = 16 bytes/vector
}

# Binary quantization (extreme compression)
binary_config = {
    "threshold": 0,  # Values > 0 -> 1, else -> 0
    # 768-dim * 4 bytes = 3 KB/vector -> 768 bits = 96 bytes/vector
}
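The binary-quantization arithmetic can be made concrete with NumPy's bit packing. The function names below are hypothetical; the thresholding rule matches the config comment above:

```python
import numpy as np

def binarize(vectors, threshold=0.0):
    """Pack float vectors into bits: value > threshold -> 1, else 0.

    A 768-dim float32 vector (3 KB) becomes 768 bits = 96 bytes.
    """
    bits = np.asarray(vectors) > threshold
    return np.packbits(bits, axis=-1)

def hamming_distance(a, b):
    """Distance between two packed codes: count of differing bits."""
    return int(np.unpackbits(np.bitwise_xor(a, b)).sum())
```

Hamming distance on packed codes is typically used as a cheap first-pass filter, with the surviving candidates re-ranked using full-precision vectors.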
Handle dynamic data efficiently:
class DynamicVectorIndex:
    def __init__(self, rebuild_threshold=10000):
        self.main_index = build_optimized_index()
        self.delta_index = []  # Recent additions
        self.rebuild_threshold = rebuild_threshold

    def add(self, vector, metadata):
        self.delta_index.append((vector, metadata))
        if len(self.delta_index) >= self.rebuild_threshold:
            self.rebuild()

    def search(self, query, k):
        main_results = self.main_index.search(query, k)
        delta_results = brute_force_search(self.delta_index, query, k)
        return merge_results(main_results, delta_results, k)

    def rebuild(self):
        all_data = self.main_index.get_all() + self.delta_index
        self.main_index = build_optimized_index(all_data)
        self.delta_index = []
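The `merge_results` helper used above is not defined in the text. A simple sketch, assuming both inputs are `(doc_id, score)` pairs with higher scores being better:

```python
def merge_results(main_results, delta_results, k):
    """Merge two result lists, deduplicating by id and keeping the best score."""
    best = {}
    for doc_id, score in list(main_results) + list(delta_results):
        if doc_id not in best or score > best[doc_id]:
            best[doc_id] = score
    return sorted(best.items(), key=lambda x: x[1], reverse=True)[:k]
```

Deduplication matters here because a document updated after the last rebuild can appear in both the main index and the delta buffer.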