postgresql-database-engineering by manutej/luxor-claude-marketplace
npx skills add https://github.com/manutej/luxor-claude-marketplace --skill postgresql-database-engineering
A comprehensive skill for professional PostgreSQL database engineering, covering everything from query optimization and indexing strategies to high availability, replication, and production database management. This skill enables you to design, optimize, and maintain high-performance PostgreSQL databases at scale.
Use this skill when:
PostgreSQL uses a process-based architecture with several key components:
PostgreSQL's foundational concurrency mechanism:
Key Implications:
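A minimal illustration of MVCC snapshot visibility across two sessions (the accounts table is hypothetical):

```sql
-- Session A
BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;

-- Session B (concurrent, READ COMMITTED): the read does not block on A's
-- uncommitted write; it sees the last committed version of the row
SELECT balance FROM accounts WHERE id = 1;

-- Session A
COMMIT;  -- the new row version becomes visible to statements starting after this
```

Readers never block writers and writers never block readers; the cost is dead row versions that VACUUM must later reclaim.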
PostgreSQL supports four isolation levels:
Choosing Isolation:
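Isolation can be set per transaction or as a session/server default; a short sketch:

```sql
-- Per-transaction: stable snapshot for the whole transaction
BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
-- ... queries that must all see the same data ...
COMMIT;

-- Session default (READ COMMITTED is PostgreSQL's shipped default)
SET default_transaction_isolation = 'read committed';
```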
PostgreSQL offers multiple index types for different use cases:
PostgreSQL's query planner determines execution strategies:
Planner Components:
Key Statistics:
n_distinct: Number of distinct values (for selectivity)
correlation: Physical row ordering correlation
most_common_vals: MCV list for skewed distributions
histogram_bounds: Value distribution histogram

Understanding EXPLAIN:
Table partitioning for managing large datasets:
Partition Pruning:
Partition-Wise Operations:
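Pruning can be observed with EXPLAIN against a range-partitioned table (this assumes the events table partitioned by created_at, as defined later in this document):

```sql
-- Only partitions overlapping the WHERE range are scanned
EXPLAIN SELECT count(*)
FROM events
WHERE created_at >= '2024-01-15' AND created_at < '2024-01-20';
-- the plan should reference only the January partition (events_2024_01)
```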
PostgreSQL replication options:
Synchronous vs Asynchronous:
Managing database connections efficiently:
Pooling Modes:
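The common modes are session, transaction, and statement pooling. A minimal PgBouncer sketch using transaction pooling (host names and pool sizes are illustrative):

```ini
[databases]
mydb = host=127.0.0.1 port=5432 dbname=mydb

[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = transaction   ; server connection returned to pool at COMMIT/ROLLBACK
default_pool_size = 20
max_client_conn = 500
```

Transaction pooling gives the best connection reuse but is incompatible with session-level state such as prepared statements and advisory locks held across transactions.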
Critical maintenance operations:
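The core operations are VACUUM, ANALYZE, and REINDEX; a sketch against the users table used in later examples (the index name is illustrative):

```sql
-- Reclaim dead tuples and refresh planner statistics in one pass
VACUUM (VERBOSE, ANALYZE) users;

-- Statistics only (cheap; useful after bulk loads)
ANALYZE users;

-- Rebuild a bloated index without blocking writes (PostgreSQL 12+)
REINDEX INDEX CONCURRENTLY idx_users_email;
```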
Key configuration parameters:
shared_buffers: 25% of RAM (start point)
effective_cache_size: 50-75% of RAM
work_mem: Per-operation memory (sort, hash)
maintenance_work_mem: VACUUM, CREATE INDEX memory
checkpoint_timeout: How often to checkpoint
max_wal_size: WAL size before checkpoint
checkpoint_completion_target: Spread checkpoint I/O
wal_buffers: WAL write buffer size
random_page_cost: Relative cost of random I/O
effective_io_concurrency: Concurrent I/O operations
default_statistics_target: Histogram detail level
max_connections: Maximum client connections
connection_limit: Per-database/user limits
Decision Matrix:
| Query Pattern | Index Type | Reason |
|---|---|---|
| WHERE id = 5 | B-tree | Equality lookup |
| WHERE created_at > '2024-01-01' | B-tree | Range query |
| ORDER BY name | B-tree | Sorting support |
| WHERE tags @> ARRAY['sql'] | GIN | Array containment |
| WHERE data->>'status' = 'active' | GIN (jsonb_path_ops) | JSONB query |
| WHERE to_tsvector(content) @@ query | GIN | Full-text search |
| WHERE location <-> point(0,0) | GiST | Nearest neighbor |
| WHERE timestamp BETWEEN ... (large table) | BRIN | Sequential time-series |
| WHERE ip_address << '192.168.0.0/16' | GiST or SP-GiST | IP range query |
Multi-column indexes for complex queries:
Column Ordering Rules:
Example:
-- Query: WHERE status = 'active' AND created_at > '2024-01-01' ORDER BY created_at
-- Optimal index: (status, created_at)
CREATE INDEX idx_users_status_created ON users(status, created_at);
Index subset of rows:
Benefits:
Use Cases:
WHERE deleted_at IS NULL
WHERE created_at > NOW() - INTERVAL '90 days'
WHERE status IN ('pending', 'processing')

Index computed values:
Examples:
-- Case-insensitive search
CREATE INDEX idx_users_email_lower ON users(LOWER(email));
-- Date truncation
CREATE INDEX idx_events_date ON events(DATE(created_at));
-- JSONB field
CREATE INDEX idx_data_status ON documents((data->>'status'));
Include non-key columns for index-only scans:
CREATE INDEX idx_users_email_include
ON users(email)
INCLUDE (first_name, last_name, created_at);
Benefit: Query satisfied entirely from index, no table lookup
Monitoring Index Usage:
-- Unused indexes
SELECT schemaname, tablename, indexname, idx_scan
FROM pg_stat_user_indexes
WHERE idx_scan = 0
ORDER BY pg_relation_size(indexrelid) DESC;
Detecting Bloat:
-- Index bloat estimation
SELECT schemaname, tablename, indexname,
pg_size_pretty(pg_relation_size(indexrelid)) as index_size,
idx_scan, idx_tup_read, idx_tup_fetch
FROM pg_stat_user_indexes
ORDER BY pg_relation_size(indexrelid) DESC;
Understanding query execution:
-- Basic EXPLAIN
EXPLAIN SELECT * FROM users WHERE email = 'user@example.com';
-- EXPLAIN ANALYZE (actually runs query)
EXPLAIN ANALYZE SELECT * FROM users WHERE created_at > '2024-01-01';
-- Detailed output
EXPLAIN (ANALYZE, BUFFERS, VERBOSE)
SELECT u.*, o.total
FROM users u
JOIN orders o ON u.id = o.user_id
WHERE u.created_at > '2024-01-01';
Key Metrics:
Problem: One query per row in a loop Solution: JOIN or batch queries
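The N+1 fix sketched as SQL (users/orders schema as used elsewhere in this document):

```sql
-- Instead of one query per user, executed N times from application code:
--   SELECT * FROM orders WHERE user_id = $1;
-- fetch everything in a single round trip:
SELECT u.id, u.email, o.id AS order_id, o.total
FROM users u
LEFT JOIN orders o ON o.user_id = u.id
WHERE u.created_at > '2024-01-01';
```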
Problem: Fetches unnecessary columns Solution: Select only needed columns
Problem: Index not used due to type mismatch Solution: Ensure query types match column types
Problem: WHERE UPPER(email) = 'USER@EXAMPLE.COM' Solution: Use expression index or compare correctly
Problem: WHERE status = 'A' OR status = 'B' Solution: Use IN: WHERE status IN ('A', 'B')
Join Types:
Nested Loop: best when the outer side is small and the inner side has an index
Hash Join: builds a hash table on the smaller input; good for large unsorted sets
Merge Join: merges two sorted inputs; good when both sides are already sorted (e.g. via indexes)
Join Order Matters:
Use SET join_collapse_limit to force join order

Techniques:
Materialized Views:
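A materialized view precomputes an expensive aggregate; with a unique index it can be refreshed without blocking readers. A sketch over the orders table (view and index names are illustrative):

```sql
CREATE MATERIALIZED VIEW daily_order_totals AS
SELECT DATE(created_at) AS day, SUM(total) AS revenue
FROM orders
GROUP BY DATE(created_at);

-- Unique index is required for REFRESH ... CONCURRENTLY
CREATE UNIQUE INDEX idx_daily_order_totals_day ON daily_order_totals(day);

-- Refresh without blocking concurrent SELECTs on the view
REFRESH MATERIALIZED VIEW CONCURRENTLY daily_order_totals;
```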
Levels:
Time-series example:
-- Create partitioned table
CREATE TABLE events (
id BIGSERIAL,
event_type TEXT NOT NULL,
user_id INTEGER NOT NULL,
data JSONB,
created_at TIMESTAMP NOT NULL
) PARTITION BY RANGE (created_at);
-- Create partitions
CREATE TABLE events_2024_01 PARTITION OF events
FOR VALUES FROM ('2024-01-01') TO ('2024-02-01');
CREATE TABLE events_2024_02 PARTITION OF events
FOR VALUES FROM ('2024-02-01') TO ('2024-03-01');
-- Default partition for data outside ranges
CREATE TABLE events_default PARTITION OF events DEFAULT;
-- Indexes on partitions
CREATE INDEX idx_events_2024_01_user ON events_2024_01(user_id);
CREATE INDEX idx_events_2024_02_user ON events_2024_02(user_id);
Automated partition management:
-- Function to create monthly partitions
CREATE OR REPLACE FUNCTION create_monthly_partition(
base_table TEXT,
partition_date DATE
) RETURNS VOID AS $$
DECLARE
partition_name TEXT;
start_date DATE;
end_date DATE;
BEGIN
partition_name := base_table || '_' || TO_CHAR(partition_date, 'YYYY_MM');
start_date := DATE_TRUNC('month', partition_date);
end_date := start_date + INTERVAL '1 month';
EXECUTE format(
'CREATE TABLE IF NOT EXISTS %I PARTITION OF %I
FOR VALUES FROM (%L) TO (%L)',
partition_name, base_table, start_date, end_date
);
-- Create indexes
EXECUTE format(
'CREATE INDEX IF NOT EXISTS %I ON %I(user_id)',
'idx_' || partition_name || '_user', partition_name
);
END;
$$ LANGUAGE plpgsql;
Dropping old partitions:
-- Detach partition (fast, non-blocking)
ALTER TABLE events DETACH PARTITION events_2023_01;
-- Drop detached partition
DROP TABLE events_2023_01;
-- Or archive before dropping
CREATE TABLE archive.events_2023_01 AS SELECT * FROM events_2023_01;
DROP TABLE events_2023_01;
Primary server configuration (postgresql.conf):
# Replication settings
wal_level = replica
max_wal_senders = 10
max_replication_slots = 10
hot_standby = on
synchronous_commit = on # or off for async
synchronous_standby_names = 'standby1,standby2' # for sync replication
Create replication user:
CREATE USER replicator WITH REPLICATION ENCRYPTED PASSWORD 'secure_password';
pg_hba.conf on primary:
# Allow replication connections
host replication replicator standby_ip/32 md5
Standby server setup:
# Stop standby PostgreSQL
systemctl stop postgresql
# Remove old data directory
rm -rf /var/lib/postgresql/14/main
# Base backup from primary
pg_basebackup -h primary_host -D /var/lib/postgresql/14/main \
-U replicator -P -v -R -X stream -C -S standby1
# Start standby
systemctl start postgresql
Standby configuration (created by -R flag):
# standby.signal file created automatically
# postgresql.auto.conf contains:
primary_conninfo = 'host=primary_host port=5432 user=replicator password=secure_password'
primary_slot_name = 'standby1'
On primary:
-- Check replication status
SELECT client_addr, state, sync_state, replay_lag
FROM pg_stat_replication;
-- Check replication slots
SELECT slot_name, active, restart_lsn, confirmed_flush_lsn
FROM pg_replication_slots;
On standby:
-- Check replication lag
SELECT now() - pg_last_xact_replay_timestamp() AS replication_lag;
-- Check recovery status
SELECT pg_is_in_recovery();
Promoting standby to primary:
# Trigger failover
pg_ctl promote -D /var/lib/postgresql/14/main
# Or using SQL
SELECT pg_promote();
Controlled switchover:
# 1. Stop writes on primary
# 2. Wait for standby to catch up
# 3. Promote standby
# 4. Reconfigure old primary as new standby
On publisher (source):
-- Create publication
CREATE PUBLICATION my_publication FOR TABLE users, orders;
-- Or all tables
CREATE PUBLICATION all_tables FOR ALL TABLES;
On subscriber (destination):
-- Create subscription
CREATE SUBSCRIPTION my_subscription
CONNECTION 'host=publisher_host dbname=mydb user=replicator password=pass'
PUBLICATION my_publication;
-- Monitor subscription
SELECT * FROM pg_stat_subscription;
pg_basebackup:
# Full physical backup
pg_basebackup -h localhost -U postgres -D /backup/base \
-F tar -z -P -v
# With WAL files for point-in-time recovery
pg_basebackup -h localhost -U postgres -D /backup/base \
-X stream -F tar -z -P
Continuous archiving (WAL archiving):
# postgresql.conf
wal_level = replica
archive_mode = on
archive_command = 'cp %p /archive/wal/%f'
pg_dump:
# Single database
pg_dump -h localhost -U postgres -F c -b -v -f mydb.dump mydb
# All databases
pg_dumpall -h localhost -U postgres -f all_databases.sql
# Specific tables
pg_dump -h localhost -U postgres -t users -t orders -F c -f tables.dump mydb
# Schema only
pg_dump -h localhost -U postgres --schema-only -F c -f schema.dump mydb
pg_restore:
# Restore database
pg_restore -h localhost -U postgres -d mydb -v mydb.dump
# Parallel restore
pg_restore -h localhost -U postgres -d mydb -j 4 -v mydb.dump
# Restore specific tables
pg_restore -h localhost -U postgres -d mydb -t users -v mydb.dump
Setup:
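PITR setup combines the WAL archiving configuration above with a base backup to recover from; a sketch (paths are illustrative):

```conf
# postgresql.conf: archive every completed WAL segment
wal_level = replica
archive_mode = on
archive_command = 'test ! -f /archive/wal/%f && cp %p /archive/wal/%f'
```

```shell
# Take the base backup that recovery will replay WAL on top of
pg_basebackup -h localhost -U postgres -D /backup/base -F tar -z -P
```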
Recovery:
# 1. Restore base backup
tar -xzf base.tar.gz -C /var/lib/postgresql/14/main
# 2. Create recovery.signal file
touch /var/lib/postgresql/14/main/recovery.signal
# 3. Configure recovery target (postgresql.conf or postgresql.auto.conf)
restore_command = 'cp /archive/wal/%f %p'
recovery_target_time = '2024-01-15 14:30:00'
# Or: recovery_target_name = 'before_disaster'
# Or: recovery_target_lsn = '0/3000000'
# 4. Start PostgreSQL
systemctl start postgresql
3-2-1 Rule: keep at least 3 copies of your data, on 2 different storage media, with 1 copy offsite
Backup Schedule:
Testing Backups:
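A backup is only proven by a restore. A sketch that restores the latest dump into a scratch database and sanity-checks it (database and file names are illustrative):

```shell
# Restore into a throwaway database and verify it contains data
createdb mydb_restore_test
pg_restore -d mydb_restore_test -j 4 /backup/mydb.dump
psql -d mydb_restore_test -c 'SELECT count(*) FROM users;'
dropdb mydb_restore_test
```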
Database Health:
Query Performance:
System Resources:
Connection stats:
SELECT count(*) as total_connections,
count(*) FILTER (WHERE state = 'active') as active,
count(*) FILTER (WHERE state = 'idle') as idle,
count(*) FILTER (WHERE state = 'idle in transaction') as idle_in_transaction
FROM pg_stat_activity;
Cache hit ratio:
SELECT sum(heap_blks_read) as heap_read,
sum(heap_blks_hit) as heap_hit,
sum(heap_blks_hit)::float / NULLIF(sum(heap_blks_hit) + sum(heap_blks_read), 0) AS ratio  -- cast avoids integer division; NULLIF guards empty stats
FROM pg_statio_user_tables;
Table bloat:
SELECT schemaname, tablename,
pg_size_pretty(pg_total_relation_size(schemaname||'.'||tablename)) AS size,
n_dead_tup,
n_live_tup,
round(n_dead_tup * 100.0 / NULLIF(n_live_tup + n_dead_tup, 0), 2) AS dead_ratio
FROM pg_stat_user_tables
WHERE n_dead_tup > 1000
ORDER BY n_dead_tup DESC;
Long-running queries:
SELECT pid, now() - query_start AS duration, state, query
FROM pg_stat_activity
WHERE state != 'idle'
AND query NOT LIKE '%pg_stat_activity%'
ORDER BY duration DESC;
Lock monitoring:
SELECT blocked_locks.pid AS blocked_pid,
blocked_activity.usename AS blocked_user,
blocking_locks.pid AS blocking_pid,
blocking_activity.usename AS blocking_user,
blocked_activity.query AS blocked_statement,
blocking_activity.query AS blocking_statement
FROM pg_catalog.pg_locks blocked_locks
JOIN pg_catalog.pg_stat_activity blocked_activity ON blocked_activity.pid = blocked_locks.pid
JOIN pg_catalog.pg_locks blocking_locks
    ON blocking_locks.locktype = blocked_locks.locktype
    AND blocking_locks.database IS NOT DISTINCT FROM blocked_locks.database
    AND blocking_locks.relation IS NOT DISTINCT FROM blocked_locks.relation
    AND blocking_locks.page IS NOT DISTINCT FROM blocked_locks.page
    AND blocking_locks.tuple IS NOT DISTINCT FROM blocked_locks.tuple
    AND blocking_locks.virtualxid IS NOT DISTINCT FROM blocked_locks.virtualxid
    AND blocking_locks.transactionid IS NOT DISTINCT FROM blocked_locks.transactionid
    AND blocking_locks.classid IS NOT DISTINCT FROM blocked_locks.classid
    AND blocking_locks.objid IS NOT DISTINCT FROM blocked_locks.objid
    AND blocking_locks.objsubid IS NOT DISTINCT FROM blocked_locks.objsubid
    AND blocking_locks.pid != blocked_locks.pid
JOIN pg_catalog.pg_stat_activity blocking_activity ON blocking_activity.pid = blocking_locks.pid
WHERE NOT blocked_locks.granted
AND blocking_locks.granted;
Installation:
CREATE EXTENSION pg_stat_statements;
Configuration (postgresql.conf):
shared_preload_libraries = 'pg_stat_statements'
pg_stat_statements.track = all
pg_stat_statements.max = 10000
Top queries by total time:
SELECT query,
calls,
total_exec_time,
mean_exec_time,
max_exec_time,
rows
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 20;
Top queries by average time:
SELECT query,
calls,
mean_exec_time,
total_exec_time
FROM pg_stat_statements
WHERE calls > 100
ORDER BY mean_exec_time DESC
LIMIT 20;
Normalization:
Data Types:
Constraints:
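Declaring constraints in the schema keeps invariants in the database rather than scattered through application code; an illustrative orders definition (column names are assumptions):

```sql
CREATE TABLE orders (
    id         BIGSERIAL PRIMARY KEY,
    user_id    INTEGER NOT NULL REFERENCES users(id),
    status     TEXT NOT NULL CHECK (status IN ('pending', 'paid', 'shipped')),
    total      NUMERIC(12,2) NOT NULL CHECK (total >= 0),
    created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
```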
Zero-Downtime Migrations:
Add new column
ALTER TABLE users ADD COLUMN email_verified BOOLEAN;
Backfill data (in batches)
-- UPDATE has no LIMIT clause in PostgreSQL; batch via a keyed subquery
UPDATE users SET email_verified = false
WHERE id IN (
    SELECT id FROM users
    WHERE email_verified IS NULL
    LIMIT 10000
);
Add NOT NULL constraint
ALTER TABLE users ALTER COLUMN email_verified SET NOT NULL;
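On PostgreSQL 12+, a direct SET NOT NULL scans the whole table under an exclusive lock; a validated CHECK constraint lets it skip that scan. A sketch (constraint name is illustrative):

```sql
-- Instant: no scan, constraint merely recorded as NOT VALID
ALTER TABLE users
    ADD CONSTRAINT users_email_verified_not_null
    CHECK (email_verified IS NOT NULL) NOT VALID;

-- Scans the table but does not block concurrent writes
ALTER TABLE users VALIDATE CONSTRAINT users_email_verified_not_null;

-- SET NOT NULL can now use the validated constraint and skip its own scan
ALTER TABLE users ALTER COLUMN email_verified SET NOT NULL;
ALTER TABLE users DROP CONSTRAINT users_email_verified_not_null;
```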
Index Creation:
Use CREATE INDEX CONCURRENTLY in production
Monitor progress via pg_stat_progress_create_index

Large Table Modifications:
Use pg_repack for table rewrites

Authentication:
Authorization:
Network Security:
Audit Logging:
Daily:
Weekly:
Monthly:
Quarterly:
Configuration:
max_parallel_workers_per_gather = 4
max_parallel_workers = 8
parallel_setup_cost = 1000
parallel_tuple_cost = 0.1
min_parallel_table_scan_size = 8MB
Forcing parallel execution:
SET max_parallel_workers_per_gather = 4;
EXPLAIN ANALYZE SELECT COUNT(*) FROM large_table;
When parallelism helps:
Stored procedures:
CREATE OR REPLACE PROCEDURE update_user_statistics()
LANGUAGE plpgsql
AS $$
BEGIN
UPDATE users SET
order_count = (SELECT COUNT(*) FROM orders WHERE user_id = users.id),
last_order_date = (SELECT MAX(created_at) FROM orders WHERE user_id = users.id);
COMMIT;
END;
$$;
Functions with proper error handling:
CREATE OR REPLACE FUNCTION create_user(
p_email TEXT,
p_name TEXT
) RETURNS INTEGER
LANGUAGE plpgsql
AS $$
DECLARE
v_user_id INTEGER;
BEGIN
INSERT INTO users (email, name)
VALUES (p_email, p_name)
RETURNING id INTO v_user_id;
RETURN v_user_id;
EXCEPTION
WHEN unique_violation THEN
RAISE EXCEPTION 'Email already exists: %', p_email;
WHEN OTHERS THEN
RAISE EXCEPTION 'Error creating user: %', SQLERRM;
END;
$$;
Access external data sources:
-- Install postgres_fdw
CREATE EXTENSION postgres_fdw;
-- Create server
CREATE SERVER remote_db
FOREIGN DATA WRAPPER postgres_fdw
OPTIONS (host 'remote_host', dbname 'remote_database', port '5432');
-- Create user mapping
CREATE USER MAPPING FOR current_user
SERVER remote_db
OPTIONS (user 'remote_user', password 'remote_password');
-- Import foreign schema
IMPORT FOREIGN SCHEMA public
FROM SERVER remote_db
INTO local_schema;
-- Query foreign table
SELECT * FROM local_schema.remote_table;
Indexing JSONB:
-- GIN index for containment queries
CREATE INDEX idx_data_gin ON documents USING GIN (data);
-- Expression index for specific field
CREATE INDEX idx_data_status ON documents ((data->>'status'));
-- GIN index with jsonb_path_ops (smaller, faster for @> queries)
CREATE INDEX idx_data_path_ops ON documents USING GIN (data jsonb_path_ops);
Efficient JSONB queries:
-- Containment query (uses GIN index)
SELECT * FROM documents WHERE data @> '{"status": "active"}';
-- Existence query
SELECT * FROM documents WHERE data ? 'email';
-- Path query
SELECT * FROM documents WHERE data->'user'->>'email' = 'user@example.com';
-- Array operations
SELECT * FROM documents WHERE data->'tags' @> '["sql", "postgres"]';
Basic setup:
-- Add tsvector column
ALTER TABLE articles ADD COLUMN search_vector tsvector;
-- Generate search vector
UPDATE articles SET search_vector =
to_tsvector('english', coalesce(title, '') || ' ' || coalesce(content, ''));
-- Create GIN index
CREATE INDEX idx_articles_search ON articles USING GIN (search_vector);
-- Trigger for automatic updates
CREATE TRIGGER articles_search_update
BEFORE INSERT OR UPDATE ON articles
FOR EACH ROW EXECUTE FUNCTION
tsvector_update_trigger(search_vector, 'pg_catalog.english', title, content);
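On PostgreSQL 12+, a generated column can replace the trigger approach above, keeping search_vector maintained automatically by the server:

```sql
-- Alternative to the trigger: a stored generated column (PostgreSQL 12+)
ALTER TABLE articles
    ADD COLUMN search_vector tsvector
    GENERATED ALWAYS AS (
        to_tsvector('english', coalesce(title, '') || ' ' || coalesce(content, ''))
    ) STORED;
```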
Search queries:
-- Basic search
SELECT title, ts_rank(search_vector, query) AS rank
FROM articles, to_tsquery('english', 'postgresql & database') query
WHERE search_vector @@ query
ORDER BY rank DESC;
-- Phrase search
SELECT title FROM articles
WHERE search_vector @@ phraseto_tsquery('english', 'database engineering');
-- Search with highlighting
SELECT title,
ts_headline('english', content, query) AS snippet
FROM articles, to_tsquery('english', 'postgresql') query
WHERE search_vector @@ query;
Problem: Slow Queries
Refresh planner statistics with ANALYZE table_name

Problem: High CPU Usage
Problem: Connection Exhaustion
Monitor connections with pg_stat_activity

Problem: Autovacuum Not Keeping Up
Problem: Replication Lag
Problem: Transaction ID Wraparound
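Wraparound risk is measured by the age of each database's oldest unfrozen transaction ID; a monitoring and remediation sketch (big_table is illustrative):

```sql
-- Compare xid_age against autovacuum_freeze_max_age (default 200 million);
-- near ~2 billion the server refuses new writes to protect data
SELECT datname, age(datfrozenxid) AS xid_age
FROM pg_database
ORDER BY xid_age DESC;

-- Force freezing on the worst offender if autovacuum cannot keep up
VACUUM (FREEZE, VERBOSE) big_table;
```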
Find missing indexes on foreign keys:
SELECT c.conrelid::regclass AS table,
c.confrelid::regclass AS referenced_table,
string_agg(a.attname, ', ') AS foreign_key_columns
FROM pg_constraint c
JOIN pg_attribute a ON a.attnum = ANY(c.conkey) AND a.attrelid = c.conrelid
WHERE c.contype = 'f'
AND NOT EXISTS (
SELECT 1 FROM pg_index i
WHERE i.indrelid = c.conrelid
AND c.conkey[1:array_length(c.conkey, 1)]
OPERATOR(pg_catalog.@>) i.indkey[0:array_length(c.conkey, 1) - 1]
)
GROUP BY c.conrelid, c.confrelid, c.conname;
Identify blocking queries:
SELECT activity.pid,
activity.usename,
activity.query,
blocking.pid AS blocking_id,
blocking.query AS blocking_query
FROM pg_stat_activity AS activity
JOIN pg_stat_activity AS blocking ON blocking.pid = ANY(pg_blocking_pids(activity.pid));
Skill Version: 1.0.0
Last Updated: October 2025
Skill Category: Database Engineering, Performance Optimization, Data Architecture
Compatible With: PostgreSQL 12+, 13, 14, 15, 16
Prerequisites: SQL knowledge, basic database concepts, Linux command line