vector-database-ops by bagelhole/devops-security-agent-skills
Install:

```bash
npx skills add https://github.com/bagelhole/devops-security-agent-skills --skill vector-database-ops
```
Run production vector databases for AI-powered search, RAG, and recommendation systems.
Use this skill when choosing, deploying, or operating a production vector database:
| Database | Best For | Hosting | Filtering | Scale |
|---|---|---|---|---|
| Qdrant | High-performance, rich filtering, self-hosted | Self / Cloud | Excellent | Very High |
| Weaviate | Schema-first, hybrid search, multi-modal | Self / Cloud | Good | High |
| pgvector | Already on Postgres, simple use cases | Self | Good | Medium |
| Pinecone | Zero-ops managed, serverless | Managed only | Good | Very High |
| Chroma | Local dev, prototyping | Self only | Basic | Low-Medium |
```bash
# Docker (single node)
docker run -d \
  --name qdrant \
  -p 6333:6333 \
  -p 6334:6334 \
  -v $(pwd)/qdrant-data:/qdrant/storage \
  qdrant/qdrant:latest

# With custom config
docker run -d \
  --name qdrant \
  -p 6333:6333 \
  -v $(pwd)/qdrant-data:/qdrant/storage \
  -v $(pwd)/qdrant-config.yaml:/qdrant/config/production.yaml \
  qdrant/qdrant:latest
```
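After starting the container, it is worth polling the REST port before sending traffic. A minimal standard-library sketch; the `/healthz` endpoint and port 6333 match the Qdrant defaults above:

```python
import time
import urllib.error
import urllib.request


def wait_until_ready(url: str, timeout: float = 30.0, poll: float = 1.0) -> bool:
    """Poll a health endpoint until it returns HTTP 200 or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=poll) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # not up yet; retry after a short pause
        time.sleep(poll)
    return False


# Usage:
# wait_until_ready("http://localhost:6333/healthz", timeout=30)
```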
```yaml
# qdrant-config.yaml
storage:
  storage_path: /qdrant/storage
  on_disk_payload: true       # store payload on disk (saves RAM)

service:
  max_request_size_mb: 32

hnsw_index:
  m: 16                       # graph connections per node
  ef_construct: 100           # accuracy vs build time trade-off
  full_scan_threshold: 10000  # switch to brute force below this

quantization:
  scalar:
    type: int8
    quantile: 0.99
    always_ram: true          # keep quantized index in RAM

telemetry_disabled: true
```
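The trade-off behind `on_disk` vectors and int8 quantization is RAM. A rough back-of-envelope sketch (per-vector storage only, ignoring index and payload overhead, so not exact Qdrant accounting):

```python
def vector_ram_bytes(n_vectors: int, dim: int, bytes_per_component: int = 4) -> int:
    """Raw storage for n vectors (float32 = 4 bytes per component by default)."""
    return n_vectors * dim * bytes_per_component


# 1M OpenAI-sized embeddings (1536 dims)
full = vector_ram_bytes(1_000_000, 1536)          # float32
quantized = vector_ram_bytes(1_000_000, 1536, 1)  # int8 scalar quantization

print(f"float32: {full / 2**30:.1f} GiB")    # 5.7 GiB
print(f"int8:    {quantized / 2**30:.1f} GiB")  # 1.4 GiB
```

This is why the `on_disk=True` + `always_ram` quantization combination works: the full-precision vectors stay on disk while the 4x-smaller quantized copy serves queries from RAM.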
```python
from qdrant_client import QdrantClient
from qdrant_client.models import (
    Distance, VectorParams, HnswConfigDiff,
    ScalarQuantization, ScalarQuantizationConfig, ScalarType,
)

client = QdrantClient("http://localhost:6333")

# Create optimized collection
client.create_collection(
    collection_name="documents",
    vectors_config=VectorParams(
        size=1536,         # OpenAI ada-002 / text-embedding-3-small
        distance=Distance.COSINE,
        on_disk=True,      # save RAM: vectors stored on disk
    ),
    hnsw_config=HnswConfigDiff(
        m=32,              # higher = better recall, more RAM
        ef_construct=200,
        on_disk=False,     # keep HNSW graph in RAM for speed
    ),
    quantization_config=ScalarQuantization(
        scalar=ScalarQuantizationConfig(
            type=ScalarType.INT8,
            quantile=0.99,
            always_ram=True,
        )
    ),
)

# Create payload indexes for fast filtering
client.create_payload_index(
    collection_name="documents",
    field_name="tenant_id",
    field_schema="keyword",
)
client.create_payload_index(
    collection_name="documents",
    field_name="created_at",
    field_schema="datetime",
)

# Collection info
info = client.get_collection("documents")
print(f"Vectors: {info.vectors_count}, Status: {info.status}")
```
```python
from qdrant_client.models import Filter, FieldCondition, MatchValue, DatetimeRange

# Tenant-isolated search (multi-tenant RAG)
results = client.query_points(
    collection_name="documents",
    query=query_embedding,
    query_filter=Filter(
        must=[
            FieldCondition(key="tenant_id", match=MatchValue(value="acme-corp")),
            FieldCondition(key="doc_type", match=MatchValue(value="contract")),
        ],
        should=[
            FieldCondition(key="created_at", range=DatetimeRange(gte="2024-01-01T00:00:00Z")),
        ],
    ),
    limit=10,
    with_payload=True,
)
```
```sql
-- Enable extension
CREATE EXTENSION IF NOT EXISTS vector;

-- Create table with vector column
CREATE TABLE documents (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    content TEXT NOT NULL,
    embedding VECTOR(1536),
    metadata JSONB DEFAULT '{}',
    tenant_id TEXT NOT NULL,
    created_at TIMESTAMPTZ DEFAULT NOW()
);

-- Create HNSW index (faster queries, more memory)
CREATE INDEX ON documents
    USING hnsw (embedding vector_cosine_ops)
    WITH (m = 16, ef_construction = 64);

-- Create IVFFlat index (less memory, slower build)
-- CREATE INDEX ON documents
--     USING ivfflat (embedding vector_cosine_ops)
--     WITH (lists = 100);

-- Semantic search with metadata filtering
SELECT id, content, metadata,
       1 - (embedding <=> $1::vector) AS similarity
FROM documents
WHERE tenant_id = 'acme-corp'
  AND metadata->>'doc_type' = 'contract'
ORDER BY embedding <=> $1::vector
LIMIT 10;
```
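To run the query above from application code, the embedding must be bound as the `$1` parameter. pgvector accepts a bracketed text literal (`'[0.1,0.2,...]'`) cast with `::vector`; a minimal driver-agnostic formatting sketch (for production, the adapter in the `pgvector` Python package is the more robust option):

```python
from typing import Sequence


def to_pgvector_literal(embedding: Sequence[float]) -> str:
    """Format an embedding as a pgvector text literal, e.g. '[0.1,0.2,0.3]'."""
    return "[" + ",".join(repr(float(x)) for x in embedding) + "]"


# Usage with any DB-API driver; the ::vector cast in the query converts it:
# cur.execute(search_sql, (to_pgvector_literal(query_embedding),))
print(to_pgvector_literal([0.1, 0.2, 0.3]))  # [0.1,0.2,0.3]
```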
```bash
# Deploy pgvector via Docker
docker run -d \
  --name pgvector \
  -e POSTGRES_PASSWORD=secret \
  -e POSTGRES_DB=vectordb \
  -p 5432:5432 \
  -v pgvector-data:/var/lib/postgresql/data \
  pgvector/pgvector:pg16
```
```yaml
# docker-compose for Weaviate
services:
  weaviate:
    image: semitechnologies/weaviate:latest
    ports:
      - "8080:8080"
      - "50051:50051"
    environment:
      QUERY_DEFAULTS_LIMIT: 25
      AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: "false"
      AUTHENTICATION_APIKEY_ENABLED: "true"
      AUTHENTICATION_APIKEY_ALLOWED_KEYS: "${WEAVIATE_API_KEY}"
      AUTHENTICATION_APIKEY_USERS: "admin"
      PERSISTENCE_DATA_PATH: /var/lib/weaviate
      ENABLE_MODULES: text2vec-openai,generative-openai
      OPENAI_APIKEY: "${OPENAI_API_KEY}"
      CLUSTER_HOSTNAME: node1
    volumes:
      - weaviate-data:/var/lib/weaviate
    restart: unless-stopped

volumes:
  weaviate-data:
```
```bash
# Qdrant: snapshot backup
curl -X POST "http://localhost:6333/collections/documents/snapshots"

# Download snapshot
curl -O "http://localhost:6333/collections/documents/snapshots/documents-snapshot.snapshot"

# Restore
curl -X POST "http://localhost:6333/collections/documents/snapshots/recover" \
  -H "Content-Type: application/json" \
  -d '{"location": "/qdrant/snapshots/documents-snapshot.snapshot"}'

# pgvector: standard pg_dump
pg_dump -h localhost -U postgres -d vectordb \
  --table=documents --format=custom > documents-backup.dump

# Restore
pg_restore -h localhost -U postgres -d vectordb documents-backup.dump
```
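Snapshots and dumps accumulate, so pair the backup commands with a retention policy that keeps only the newest N files. A stdlib-only sketch; the directory layout and `*.snapshot` naming are assumptions, so adapt the glob pattern to your setup:

```python
from pathlib import Path


def prune_snapshots(snapshot_dir: str, keep: int = 3,
                    pattern: str = "*.snapshot") -> list[str]:
    """Delete all but the `keep` newest matching files; return deleted names."""
    files = sorted(Path(snapshot_dir).glob(pattern),
                   key=lambda p: p.stat().st_mtime, reverse=True)
    deleted = []
    for stale in files[keep:]:  # everything past the newest `keep`
        stale.unlink()
        deleted.append(stale.name)
    return deleted
```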
```python
import time

from qdrant_client.models import OptimizersConfigDiff

# Qdrant: re-enable indexing after bulk load
# (indexing_threshold=0 disables the indexer during upload;
#  restoring the default threshold lets the optimizer build the HNSW index)
client.update_collection(
    collection_name="documents",
    optimizers_config=OptimizersConfigDiff(indexing_threshold=20000),
)

# Wait for optimization to complete
while True:
    info = client.get_collection("documents")
    if info.status.value == "green":
        break
    print(f"Optimizing... segments: {info.segments_count}")
    time.sleep(5)
```
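An unbounded `while True` can hang forever if the collection never turns green. A bounded variant, written against an injected status callable so it is easy to test; the names here are illustrative, not part of the qdrant-client API:

```python
import time
from typing import Callable


def wait_for_green(get_status: Callable[[], str],
                   timeout: float = 600.0, poll: float = 5.0) -> bool:
    """Poll `get_status` until it returns 'green' or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if get_status() == "green":
            return True
        time.sleep(poll)
    return False


# Usage with the client from above:
# ok = wait_for_green(lambda: client.get_collection("documents").status.value)
```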
| Issue | Cause | Fix |
|---|---|---|
| Slow queries | No HNSW index built yet | Wait for indexing; check `status == green` |
| High RAM usage | Vectors in memory | Enable `on_disk=True` for vectors |
| Poor recall | Low `ef` search param | Increase `ef` in the search request (at query time) |
| pgvector slow | IVFFlat index without vacuum | Run `VACUUM ANALYZE documents` |
| Weaviate OOM | Too many objects | Enable async indexing; increase heap |

Best practices: create payload indexes on frequently filtered fields (`tenant_id`, `doc_type`), and for memory-constrained deployments combine `on_disk` vectors with `always_ram` quantization.