database-sharding by secondsky/claude-skills
npx skills add https://github.com/secondsky/claude-skills --skill database-sharding
Comprehensive database sharding patterns for horizontal scaling with hash, range, and directory-based strategies.
Step 1: Choose a sharding strategy from the templates:
# Hash-based (even distribution)
cat templates/hash-router.ts
# Range-based (time-series data)
cat templates/range-router.ts
# Directory-based (multi-tenancy)
cat templates/directory-router.ts
Step 2: Select the shard key using the criteria below.
Step 3: Implement the router:
import { HashRouter } from './hash-router';
const router = new HashRouter([
{ id: 'shard_0', connection: { host: 'db0.example.com' } },
{ id: 'shard_1', connection: { host: 'db1.example.com' } },
{ id: 'shard_2', connection: { host: 'db2.example.com' } },
{ id: 'shard_3', connection: { host: 'db3.example.com' } },
]);
// Query single shard
const user = await router.query('user_123', 'SELECT * FROM users WHERE id = $1', ['user_123']);
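The template source is not reproduced on this page. Purely as an illustration, the routing logic a hash-based router wraps can be sketched as follows; the `fnv1a` hash and the `HashRouterSketch` name are assumptions, not the actual internals of `templates/hash-router.ts`:

```typescript
// Illustrative only: maps a shard key to one of N shards with a
// deterministic string hash. templates/hash-router.ts may differ.
interface ShardConfig {
  id: string;
  connection: { host: string };
}

// FNV-1a: a simple, fast, deterministic 32-bit string hash.
function fnv1a(key: string): number {
  let h = 0x811c9dc5;
  for (let i = 0; i < key.length; i++) {
    h ^= key.charCodeAt(i);
    h = Math.imul(h, 0x01000193);
  }
  return h >>> 0; // force unsigned 32-bit
}

class HashRouterSketch {
  constructor(private shards: ShardConfig[]) {}

  // The same key always lands on the same shard.
  shardFor(key: string): ShardConfig {
    return this.shards[fnv1a(key) % this.shards.length];
  }
}
```

A real router layers connection pooling and query execution on top of this mapping.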
| Rule | Reason |
|---|---|
| Include shard key in queries | Avoid scanning all shards (100x slower) |
| Monitor shard distribution | Detect hotspots before they cause outages |
| Plan for rebalancing upfront | Cannot easily add shards later |
| Choose immutable shard key | Changing key = data migration nightmare |
| Test distribution with production data | Synthetic data hides real hotspots |
| Denormalize for data locality | Keep related data on same shard |
| Anti-Pattern | Why It's Bad |
|---|---|
| Sequential ID with range sharding | Latest shard gets all writes (hotspot) |
| Timestamp as shard key | Recent shard overwhelmed |
| Cross-shard transactions without 2PC | Data corruption, inconsistency |
| Simple modulo without consistent hashing | Cannot add shards without full re-shard |
| Nullable shard key | Special NULL handling creates hotspots |
| No shard routing layer | Hardcoded shards = cannot rebalance |
Symptom: One shard receives 80%+ of traffic. Fix:
// ❌ Bad: Low cardinality (status field)
shard_key = order.status; // 90% are 'pending' → shard_0 overloaded
// ✅ Good: High cardinality (user_id)
shard_key = order.user_id; // Millions of users, even distribution
Symptom: Queries scan ALL shards (extremely slow). Fix:
// ❌ Bad: No shard key
SELECT * FROM orders WHERE status = 'shipped'; // Scans all 100 shards!
// ✅ Good: Include shard key
SELECT * FROM orders WHERE user_id = ? AND status = 'shipped'; // Targets 1 shard
Symptom: Latest shard gets all writes. Fix:
// ❌ Bad: Range sharding with auto-increment
// Shard 0: 1-1M, Shard 1: 1M-2M, Shard 2: 2M+ → All new writes to Shard 2!
// ✅ Good: Hash-based sharding
const shardId = hash(id) % shardCount; // Even distribution
Symptom: Stuck with the initial shard count, cannot scale. Fix:
// ❌ Bad: Simple modulo
const shardId = hash(key) % shardCount; // Adding 5th shard breaks ALL keys
// ✅ Good: Consistent hashing
const ring = new ConsistentHashRing(shards);
const shardId = ring.getNode(key); // Only ~25% of keys move when adding shard
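`ConsistentHashRing` is used above without a definition. One plausible shape uses virtual nodes, so each physical shard owns many small arcs of the ring; the class and method names mirror the snippet above, but the internals are assumptions:

```typescript
// Illustrative consistent-hash ring. Each node is hashed onto the ring
// at `vnodes` points; a key belongs to the first node point at or after
// the key's own hash (wrapping around to the start of the ring).
class ConsistentHashRing {
  private ring: { point: number; node: string }[] = [];

  constructor(nodes: string[], private vnodes: number = 100) {
    for (const n of nodes) this.addNode(n);
  }

  private hash(s: string): number {
    let h = 0x811c9dc5;
    for (let i = 0; i < s.length; i++) {
      h ^= s.charCodeAt(i);
      h = Math.imul(h, 0x01000193);
    }
    return h >>> 0;
  }

  addNode(node: string): void {
    for (let i = 0; i < this.vnodes; i++) {
      this.ring.push({ point: this.hash(`${node}#${i}`), node });
    }
    this.ring.sort((a, b) => a.point - b.point);
  }

  getNode(key: string): string {
    const h = this.hash(key);
    const entry = this.ring.find(e => e.point >= h) ?? this.ring[0];
    return entry.node;
  }
}
```

Adding a node inserts only that node's points, so only the keys whose nearest point changes are remapped, roughly 1/(n+1) of them.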
Symptom: Data inconsistency, partial writes. Fix:
// ❌ Bad: Cross-shard transaction (will corrupt)
BEGIN;
UPDATE shard_1.accounts SET balance = balance - 100 WHERE id = 'A';
UPDATE shard_2.accounts SET balance = balance + 100 WHERE id = 'B';
COMMIT; // If shard_2 fails, shard_1 already committed!
// ✅ Good: Two-Phase Commit or Saga pattern
const txn = new TwoPhaseCommitTransaction();
txn.addOperation(shard_1, 'UPDATE accounts SET balance = balance - 100 WHERE id = ?', ['A']);
txn.addOperation(shard_2, 'UPDATE accounts SET balance = balance + 100 WHERE id = ?', ['B']);
await txn.execute(); // Atomic across shards
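`TwoPhaseCommitTransaction` is used above without its definition. A bare-bones coordinator could look like this; the interface and names are assumptions, and production 2PC additionally needs a durable decision log and timeout handling:

```typescript
// Each participant votes on whether it is able to commit.
interface ShardConnection {
  prepare(sql: string, params: unknown[]): Promise<boolean>; // vote yes/no
  commit(): Promise<void>;
  rollback(): Promise<void>;
}

class TwoPhaseCommitSketch {
  private ops: { shard: ShardConnection; sql: string; params: unknown[] }[] = [];

  addOperation(shard: ShardConnection, sql: string, params: unknown[]): void {
    this.ops.push({ shard, sql, params });
  }

  async execute(): Promise<void> {
    // Phase 1 (prepare): nothing commits until every shard votes yes.
    const votes = await Promise.all(
      this.ops.map(o => o.shard.prepare(o.sql, o.params)),
    );
    if (votes.every(v => v)) {
      // Phase 2 (commit): all participants commit.
      await Promise.all(this.ops.map(o => o.shard.commit()));
    } else {
      // Any "no" vote aborts everywhere, avoiding the partial write above.
      await Promise.all(this.ops.map(o => o.shard.rollback()));
      throw new Error('cross-shard transaction aborted');
    }
  }
}
```

A real coordinator must also survive its own crash between the two phases, which is why 2PC implementations persist the commit/abort decision before phase 2.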
Symptom: Records move shards, causing duplicates. Fix:
// ❌ Bad: Shard by country (user relocates)
shard_key = user.country; // User moves US → CA, now in different shard!
// ✅ Good: Shard by immutable user_id
shard_key = user.id; // Never changes
Symptom: Silent hotspots, sudden performance degradation. Fix:
// ✅ Required metrics
- Per-shard record counts (should be within 20%)
- Query distribution (no shard > 40% of queries)
- Storage per shard (alert at 80%)
- Latency p99 per shard
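The thresholds above translate directly into an alert check. A minimal sketch, where the function name and input shape are assumptions:

```typescript
// Flags shards whose share of total queries exceeds the threshold
// (40% by default, matching the query-distribution metric above).
function findHotShards(
  queryCounts: Record<string, number>,
  maxShare: number = 0.4,
): string[] {
  const total = Object.values(queryCounts).reduce((a, b) => a + b, 0);
  if (total === 0) return [];
  return Object.entries(queryCounts)
    .filter(([, count]) => count / total > maxShare)
    .map(([shardId]) => shardId);
}
```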
Load references/error-catalog.md for all 10 errors with detailed fixes.
| Strategy | Best For | Pros | Cons |
|---|---|---|---|
| Hash | User data, even load critical | No hotspots, predictable | Range queries scatter |
| Range | Time-series, logs, append-only | Range queries efficient, archival | Recent shard hotspot |
| Directory | Multi-tenancy, complex routing | Flexible, easy rebalancing | Lookup overhead, SPOF |
Load references/sharding-strategies.md for detailed comparisons with production examples (Instagram, Discord, Salesforce).
| Criterion | Importance | Check Method |
|---|---|---|
| High cardinality | Critical | COUNT(DISTINCT shard_key) > shard_count × 100 |
| Even distribution | Critical | No value > 5% of total |
| Immutable | Critical | Value never changes |
| Query alignment | High | 80%+ queries include it |
| Data locality | Medium | Related records together |
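The cardinality and distribution checks in the table can be run offline against a sample of candidate key values. An illustrative sketch, where the function name and return shape are assumptions:

```typescript
// Checks a sample of candidate shard-key values against the first two
// criteria above: cardinality > shardCount × 100, and no single value
// holding more than 5% of rows.
function validateShardKey(sampleValues: string[], shardCount: number) {
  const counts = new Map<string, number>();
  for (const v of sampleValues) counts.set(v, (counts.get(v) ?? 0) + 1);
  const maxCount = Math.max(0, ...counts.values());
  return {
    cardinalityOk: counts.size > shardCount * 100,
    distributionOk: maxCount / sampleValues.length <= 0.05,
  };
}
```

Run this against production data, per the best practices above; synthetic samples hide real hotspots.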
Decision Tree: candidate shard keys include `user_id`, `tenant_id`, `timestamp` (range sharding), and `product_id`.
Load references/shard-key-selection.md for comprehensive decision trees and testing strategies.
import { HashRouter } from './templates/hash-router';
const router = new HashRouter([
{ id: 'shard_0', connection: { /* PostgreSQL config */ } },
{ id: 'shard_1', connection: { /* PostgreSQL config */ } },
]);
// Automatically routes to correct shard
const user = await router.query('user_123', 'SELECT * FROM users WHERE id = $1', ['user_123']);
import { RangeRouter } from './templates/range-router';
const router = new RangeRouter(shardConfigs, [
{ start: Date.parse('2024-01-01'), end: Date.parse('2024-04-01'), shardId: 'shard_q1' },
{ start: Date.parse('2024-04-01'), end: Date.parse('2024-07-01'), shardId: 'shard_q2' },
{ start: Date.parse('2024-07-01'), end: Infinity, shardId: 'shard_q3' },
]);
// Range queries target specific shards
const janEvents = await router.queryRange(
Date.parse('2024-01-01'),
Date.parse('2024-02-01'),
'SELECT * FROM events WHERE created_at BETWEEN $1 AND $2'
);
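The core of range routing is an interval-overlap test: a query over `[queryStart, queryEnd)` touches every shard whose range intersects it. A minimal sketch, where `RangeMapping` mirrors the config above and the function name is an assumption:

```typescript
interface RangeMapping {
  start: number;
  end: number; // exclusive upper bound
  shardId: string;
}

// Which shards does a [queryStart, queryEnd) range query need to hit?
function shardsForRange(
  ranges: RangeMapping[],
  queryStart: number,
  queryEnd: number,
): string[] {
  return ranges
    .filter(r => r.start < queryEnd && queryStart < r.end) // intervals overlap
    .map(r => r.shardId);
}
```

A query contained in one range hits one shard; a query spanning a boundary fans out to exactly the shards it crosses.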
import { DirectoryRouter } from './templates/directory-router';
const router = new DirectoryRouter(directoryDBConfig, shardConfigs);
// Assign tenant to specific shard
await router.assignShard('tenant_acme', 'shard_enterprise');
// Route automatically
const users = await router.query('tenant_acme', 'SELECT * FROM users');
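What distinguishes the directory strategy is that the tenant-to-shard mapping is an explicit lookup table rather than a formula, which is why individual tenants can be moved at will. A toy sketch of that table; the class name is an assumption, and the real router keeps this mapping in a directory database:

```typescript
// In-memory stand-in for the directory database: tenant → shard.
class DirectorySketch {
  private assignments = new Map<string, string>();

  assignShard(tenantId: string, shardId: string): void {
    this.assignments.set(tenantId, shardId);
  }

  shardFor(tenantId: string): string {
    const shardId = this.assignments.get(tenantId);
    if (shardId === undefined) {
      throw new Error(`no shard assigned for tenant ${tenantId}`);
    }
    return shardId;
  }
}
```

Rebalancing a tenant becomes a single reassignment plus a data copy, at the cost of a lookup on every query and a directory component that must be highly available.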
Load references/sharding-strategies.md when:
Load references/shard-key-selection.md when:
Load references/implementation-patterns.md when:
Load references/cross-shard-queries.md when:
Load references/rebalancing-guide.md when:
Load references/error-catalog.md when:
Before Sharding:
Router Implementation:
Shard Configuration:
Application Changes:
Before (Single database overwhelmed):
// Single PostgreSQL instance
const db = new Pool({ host: 'db.example.com' });
// All 10M users on one server
const users = await db.query('SELECT * FROM users WHERE status = $1', ['active']);
// Query time: 5000ms (slow!)
// DB CPU: 95%
// Disk: 500GB, growing
After (Sharded across 8 servers):
// Hash-based sharding with 8 shards
const router = new HashRouter([
{ id: 'shard_0', connection: { host: 'db0.example.com' } },
{ id: 'shard_1', connection: { host: 'db1.example.com' } },
// ... 6 more shards
]);
// Query single user (targets 1 shard)
const user = await router.query('user_123', 'SELECT * FROM users WHERE id = $1', ['user_123']);
// Query time: 10ms (500x faster!)
// Query all shards (scatter-gather)
const allActive = await router.queryAll('SELECT * FROM users WHERE status = $1', ['active']);
// Query time: 800ms (parallelized across 8 shards, 6x faster than single)
// Result: Each shard handles ~1.25M users
// DB CPU per shard: 20%
// Disk per shard: 65GB
// Can scale to 16 shards easily (consistent hashing)
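`queryAll` above is a scatter-gather: the same query is fanned out to every shard in parallel and the partial results are merged. A minimal sketch, where the shard interface is an assumption:

```typescript
interface QueryableShard<T> {
  query(sql: string, params: unknown[]): Promise<T[]>;
}

// Fan the query out to all shards concurrently, then flatten results.
// Total latency is roughly the slowest shard, not the sum of all shards,
// which is why the 8-way scatter-gather above still beats one server.
async function scatterGather<T>(
  shards: QueryableShard<T>[],
  sql: string,
  params: unknown[],
): Promise<T[]> {
  const perShard = await Promise.all(shards.map(s => s.query(sql, params)));
  return perShard.flat();
}
```

Cross-shard sorting, limits, and aggregates need an extra merge step on the gathered rows; see references/cross-shard-queries.md.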
All 10 documented errors prevented:
See references/error-catalog.md for detailed fixes.
Templates:
- templates/hash-router.ts - Hash-based sharding
- templates/range-router.ts - Range-based sharding
- templates/directory-router.ts - Directory-based sharding
- templates/cross-shard-aggregation.ts - Aggregation patterns

References:
- references/sharding-strategies.md - Strategy comparison
- references/shard-key-selection.md - Key selection guide
- references/implementation-patterns.md - Router implementations
- references/cross-shard-queries.md - Query patterns
- references/rebalancing-guide.md - Migration strategies
- references/error-catalog.md - All 10 errors documented

Production Examples:
Production-tested | 10 errors prevented | MIT License
Weekly Installs
63
Repository
GitHub Stars
90
First Seen
Jan 25, 2026
Security Audits
Gen Agent Trust Hub: Warn · Socket: Pass · Snyk: Pass
Installed on
claude-code: 55
gemini-cli: 50
cursor: 50
codex: 49
opencode: 48
github-copilot: 46