vector-database-ops by bagelhole/devops-security-agent-skills
Install:

```bash
npx skills add https://github.com/bagelhole/devops-security-agent-skills --skill vector-database-ops
```
Run production vector databases for AI-powered search, RAG, and recommendation systems.
Use this skill when choosing, deploying, or operating a production vector database:
| Database | Best For | Hosting | Filtering | Scale |
|---|---|---|---|---|
| Qdrant | High-performance, rich filtering, self-hosted | Self / Cloud | Excellent | Very High |
| Weaviate | Schema-first, hybrid search, multi-modal | Self / Cloud | Good | High |
| pgvector | Already on Postgres, simple use cases | Self | Good | Medium |
| Pinecone | Zero-ops managed, serverless | Managed only | Good | Very High |
| Chroma | Local dev, prototyping | Self only | Basic | Low-Medium |
```bash
# Docker (single node)
docker run -d \
  --name qdrant \
  -p 6333:6333 \
  -p 6334:6334 \
  -v $(pwd)/qdrant-data:/qdrant/storage \
  qdrant/qdrant:latest

# With custom config
docker run -d \
  --name qdrant \
  -p 6333:6333 \
  -v $(pwd)/qdrant-data:/qdrant/storage \
  -v $(pwd)/qdrant-config.yaml:/qdrant/config/production.yaml \
  qdrant/qdrant:latest
```
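After starting the container, it is worth polling the REST port before sending traffic. A minimal standard-library sketch; the `/healthz` endpoint and port 6333 match the Qdrant defaults above:

```python
import time
import urllib.error
import urllib.request


def wait_until_ready(url: str, timeout: float = 30.0, poll: float = 1.0) -> bool:
    """Poll a health endpoint until it returns HTTP 200 or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=poll) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # not up yet; retry after a short pause
        time.sleep(poll)
    return False


# Usage:
# wait_until_ready("http://localhost:6333/healthz", timeout=30)
```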
```yaml
# qdrant-config.yaml
storage:
  storage_path: /qdrant/storage
  on_disk_payload: true       # store payload on disk (saves RAM)

service:
  max_request_size_mb: 32

hnsw_index:
  m: 16                       # graph connections per node
  ef_construct: 100           # accuracy vs build time trade-off
  full_scan_threshold: 10000  # switch to brute force below this

quantization:
  scalar:
    type: int8
    quantile: 0.99
    always_ram: true          # keep quantized index in RAM

telemetry_disabled: true
```
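The trade-off behind `on_disk` vectors and int8 quantization is RAM. A rough back-of-envelope sketch (per-vector storage only, ignoring index and payload overhead, so not exact Qdrant accounting):

```python
def vector_ram_bytes(n_vectors: int, dim: int, bytes_per_component: int = 4) -> int:
    """Raw storage for n vectors (float32 = 4 bytes per component by default)."""
    return n_vectors * dim * bytes_per_component


# 1M OpenAI-sized embeddings (1536 dims)
full = vector_ram_bytes(1_000_000, 1536)          # float32
quantized = vector_ram_bytes(1_000_000, 1536, 1)  # int8 scalar quantization

print(f"float32: {full / 2**30:.1f} GiB")    # 5.7 GiB
print(f"int8:    {quantized / 2**30:.1f} GiB")  # 1.4 GiB
```

This is why the `on_disk=True` + `always_ram` quantization combination works: the full-precision vectors stay on disk while the 4x-smaller quantized copy serves queries from RAM.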
```python
from qdrant_client import QdrantClient
from qdrant_client.models import (
    Distance, VectorParams, HnswConfigDiff,
    ScalarQuantization, ScalarQuantizationConfig, ScalarType,
)

client = QdrantClient("http://localhost:6333")

# Create optimized collection
client.create_collection(
    collection_name="documents",
    vectors_config=VectorParams(
        size=1536,         # OpenAI ada-002 / text-embedding-3-small
        distance=Distance.COSINE,
        on_disk=True,      # save RAM: vectors stored on disk
    ),
    hnsw_config=HnswConfigDiff(
        m=32,              # higher = better recall, more RAM
        ef_construct=200,
        on_disk=False,     # keep HNSW graph in RAM for speed
    ),
    quantization_config=ScalarQuantization(
        scalar=ScalarQuantizationConfig(
            type=ScalarType.INT8,
            quantile=0.99,
            always_ram=True,
        )
    ),
)

# Create payload indexes for fast filtering
client.create_payload_index(
    collection_name="documents",
    field_name="tenant_id",
    field_schema="keyword",
)
client.create_payload_index(
    collection_name="documents",
    field_name="created_at",
    field_schema="datetime",
)

# Collection info
info = client.get_collection("documents")
print(f"Vectors: {info.vectors_count}, Status: {info.status}")
```
```python
from qdrant_client.models import Filter, FieldCondition, MatchValue, DatetimeRange

# Tenant-isolated search (multi-tenant RAG)
results = client.query_points(
    collection_name="documents",
    query=query_embedding,
    query_filter=Filter(
        must=[
            FieldCondition(key="tenant_id", match=MatchValue(value="acme-corp")),
            FieldCondition(key="doc_type", match=MatchValue(value="contract")),
        ],
        should=[
            FieldCondition(key="created_at", range=DatetimeRange(gte="2024-01-01T00:00:00Z")),
        ],
    ),
    limit=10,
    with_payload=True,
)
```
```sql
-- Enable extension
CREATE EXTENSION IF NOT EXISTS vector;

-- Create table with vector column
CREATE TABLE documents (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    content TEXT NOT NULL,
    embedding VECTOR(1536),
    metadata JSONB DEFAULT '{}',
    tenant_id TEXT NOT NULL,
    created_at TIMESTAMPTZ DEFAULT NOW()
);

-- Create HNSW index (faster queries, more memory)
CREATE INDEX ON documents
    USING hnsw (embedding vector_cosine_ops)
    WITH (m = 16, ef_construction = 64);

-- Create IVFFlat index (less memory, slower build)
-- CREATE INDEX ON documents
--     USING ivfflat (embedding vector_cosine_ops)
--     WITH (lists = 100);

-- Semantic search with metadata filtering
SELECT id, content, metadata,
       1 - (embedding <=> $1::vector) AS similarity
FROM documents
WHERE tenant_id = 'acme-corp'
  AND metadata->>'doc_type' = 'contract'
ORDER BY embedding <=> $1::vector
LIMIT 10;
```
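To run the query above from application code, the embedding must be bound as the `$1` parameter. pgvector accepts a bracketed text literal (`'[0.1,0.2,...]'`) cast with `::vector`; a minimal driver-agnostic formatting sketch (for production, the adapter in the `pgvector` Python package is the more robust option):

```python
from typing import Sequence


def to_pgvector_literal(embedding: Sequence[float]) -> str:
    """Format an embedding as a pgvector text literal, e.g. '[0.1,0.2,0.3]'."""
    return "[" + ",".join(repr(float(x)) for x in embedding) + "]"


# Usage with any DB-API driver; the ::vector cast in the query converts it:
# cur.execute(search_sql, (to_pgvector_literal(query_embedding),))
print(to_pgvector_literal([0.1, 0.2, 0.3]))  # [0.1,0.2,0.3]
```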
```bash
# Deploy pgvector via Docker
docker run -d \
  --name pgvector \
  -e POSTGRES_PASSWORD=secret \
  -e POSTGRES_DB=vectordb \
  -p 5432:5432 \
  -v pgvector-data:/var/lib/postgresql/data \
  pgvector/pgvector:pg16
```
```yaml
# docker-compose for Weaviate
services:
  weaviate:
    image: semitechnologies/weaviate:latest
    ports:
      - "8080:8080"
      - "50051:50051"
    environment:
      QUERY_DEFAULTS_LIMIT: 25
      AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: "false"
      AUTHENTICATION_APIKEY_ENABLED: "true"
      AUTHENTICATION_APIKEY_ALLOWED_KEYS: "${WEAVIATE_API_KEY}"
      AUTHENTICATION_APIKEY_USERS: "admin"
      PERSISTENCE_DATA_PATH: /var/lib/weaviate
      ENABLE_MODULES: text2vec-openai,generative-openai
      OPENAI_APIKEY: "${OPENAI_API_KEY}"
      CLUSTER_HOSTNAME: node1
    volumes:
      - weaviate-data:/var/lib/weaviate
    restart: unless-stopped

volumes:
  weaviate-data:
```
```bash
# Qdrant: snapshot backup
curl -X POST "http://localhost:6333/collections/documents/snapshots"

# Download snapshot
curl -O "http://localhost:6333/collections/documents/snapshots/documents-snapshot.snapshot"

# Restore
curl -X POST "http://localhost:6333/collections/documents/snapshots/recover" \
  -H "Content-Type: application/json" \
  -d '{"location": "/qdrant/snapshots/documents-snapshot.snapshot"}'

# pgvector: standard pg_dump
pg_dump -h localhost -U postgres -d vectordb \
  --table=documents --format=custom > documents-backup.dump

# Restore
pg_restore -h localhost -U postgres -d vectordb documents-backup.dump
```
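Snapshots and dumps accumulate, so pair the backup commands with a retention policy that keeps only the newest N files. A stdlib-only sketch; the directory layout and `*.snapshot` naming are assumptions, so adapt the glob pattern to your setup:

```python
from pathlib import Path


def prune_snapshots(snapshot_dir: str, keep: int = 3,
                    pattern: str = "*.snapshot") -> list[str]:
    """Delete all but the `keep` newest matching files; return deleted names."""
    files = sorted(Path(snapshot_dir).glob(pattern),
                   key=lambda p: p.stat().st_mtime, reverse=True)
    deleted = []
    for stale in files[keep:]:  # everything past the newest `keep`
        stale.unlink()
        deleted.append(stale.name)
    return deleted
```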
```python
import time

from qdrant_client.models import OptimizersConfigDiff

# Qdrant: re-enable indexing after bulk load
# (indexing_threshold=0 disables the indexer during upload;
#  restoring the default threshold lets the optimizer build the HNSW index)
client.update_collection(
    collection_name="documents",
    optimizers_config=OptimizersConfigDiff(indexing_threshold=20000),
)

# Wait for optimization to complete
while True:
    info = client.get_collection("documents")
    if info.status.value == "green":
        break
    print(f"Optimizing... segments: {info.segments_count}")
    time.sleep(5)
```
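An unbounded `while True` can hang forever if the collection never turns green. A bounded variant, written against an injected status callable so it is easy to test; the names here are illustrative, not part of the qdrant-client API:

```python
import time
from typing import Callable


def wait_for_green(get_status: Callable[[], str],
                   timeout: float = 600.0, poll: float = 5.0) -> bool:
    """Poll `get_status` until it returns 'green' or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if get_status() == "green":
            return True
        time.sleep(poll)
    return False


# Usage with the client from above:
# ok = wait_for_green(lambda: client.get_collection("documents").status.value)
```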
| Issue | Cause | Fix |
|---|---|---|
| Slow queries | No HNSW index built yet | Wait for indexing; check `status == green` |
| High RAM usage | Vectors in memory | Enable `on_disk=True` for vectors |
| Poor recall | Low `ef` search param | Increase `ef` in the search request (at query time) |
| pgvector slow | IVFFlat index without vacuum | Run `VACUUM ANALYZE documents` |
| Weaviate OOM | Too many objects | Enable async indexing; increase heap |

Best practices: create payload indexes on frequently filtered fields (`tenant_id`, `doc_type`), and for memory-constrained deployments combine `on_disk` vectors with `always_ram` quantization.