Database Design Expert by martinholovsky/claude-skills-generator
npx skills add https://github.com/martinholovsky/claude-skills-generator --skill 'Database Design Expert'关键提示:在实施任何数据库模式之前,您必须阅读相关的参考文件:
在以下情况下阅读 references/advanced-patterns.md:
在以下情况下阅读 references/security-examples.md:
风险等级:中等
理由:数据库模式设计影响数据完整性、查询性能和应用程序安全性。糟糕的设计可能导致数据损坏、性能瓶颈以及维护数据一致性困难。生产环境中的模式变更需要仔细的迁移规划。
您是数据库模式设计专家,专长于:
广告位招租
在这里展示您的产品或服务
触达数万 AI 开发者,精准高效
| SQLite 类型 | 用途 | 备注 |
|---|---|---|
| INTEGER | ID、计数、布尔值 | 主键用于自动递增 |
| TEXT | 字符串、JSON、UUID | 无长度限制 |
| REAL | 浮点数 | 8字节 IEEE 浮点数 |
| BLOB | 二进制数据 | 文件、加密数据 |
| NUMERIC | 日期、小数 | 存储为最高效的类型 |
| 范式 | 描述 | 何时使用 |
|---|---|---|
| 1NF | 原子值,无重复组 | 始终 |
| 2NF | 1NF + 无部分依赖 | 大多数表 |
| 3NF | 2NF + 无传递依赖 | 默认选择 |
| BCNF | 3NF + 每个决定因素都是键 | 复杂关系 |
CREATE TABLE entities (
id INTEGER PRIMARY KEY AUTOINCREMENT,
name TEXT NOT NULL CHECK(length(name) BETWEEN 1 AND 255),
email TEXT UNIQUE NOT NULL CHECK(email LIKE '%_@__%.__%'),
status TEXT NOT NULL DEFAULT 'active' CHECK(status IN ('active', 'inactive', 'deleted')),
created_at TEXT NOT NULL DEFAULT (datetime('now')),
deleted_at TEXT
);
CREATE INDEX idx_entities_status ON entities(status) WHERE deleted_at IS NULL;
CREATE TABLE documents (
id INTEGER PRIMARY KEY, user_id INTEGER NOT NULL, title TEXT NOT NULL,
FOREIGN KEY (user_id) REFERENCES users(id) ON DELETE CASCADE
);
CREATE INDEX idx_documents_user ON documents(user_id);
CREATE TABLE document_tags (
document_id INTEGER NOT NULL, tag_id INTEGER NOT NULL,
PRIMARY KEY (document_id, tag_id),
FOREIGN KEY (document_id) REFERENCES documents(id) ON DELETE CASCADE,
FOREIGN KEY (tag_id) REFERENCES tags(id) ON DELETE CASCADE
);
CREATE INDEX idx_doctags_tag ON document_tags(tag_id);
-- 树状结构(邻接表)
CREATE TABLE categories (
id INTEGER PRIMARY KEY,
parent_id INTEGER REFERENCES categories(id) ON DELETE CASCADE,
name TEXT NOT NULL
);
CREATE INDEX idx_categories_parent ON categories(parent_id);
-- 内容表
CREATE TABLE articles (
id INTEGER PRIMARY KEY, title TEXT NOT NULL, body TEXT NOT NULL,
created_at TEXT DEFAULT (datetime('now'))
);
-- FTS5 虚拟表
CREATE VIRTUAL TABLE articles_fts USING fts5(
title, body, content=articles, content_rowid=id,
tokenize='porter unicode61', prefix='2,3'
);
-- 同步触发器(INSERT、UPDATE、DELETE)
CREATE TRIGGER articles_ai AFTER INSERT ON articles BEGIN
INSERT INTO articles_fts(rowid, title, body) VALUES (new.id, new.title, new.body);
END;
-- 需要为 UPDATE 和 DELETE 创建类似的触发器
CREATE TABLE accounts (id INTEGER PRIMARY KEY, name TEXT NOT NULL, balance REAL DEFAULT 0);
CREATE TABLE accounts_audit (
id INTEGER PRIMARY KEY, account_id INTEGER NOT NULL,
field_name TEXT NOT NULL, old_value TEXT, new_value TEXT,
changed_at TEXT DEFAULT (datetime('now')),
FOREIGN KEY (account_id) REFERENCES accounts(id) ON DELETE CASCADE
);
CREATE TRIGGER accounts_audit_update AFTER UPDATE ON accounts BEGIN
INSERT INTO accounts_audit (account_id, field_name, old_value, new_value)
SELECT new.id, 'balance', old.balance, new.balance WHERE old.balance != new.balance;
END;
CREATE INDEX idx_audit_account ON accounts_audit(account_id, changed_at DESC);
-- 数字、字符串格式和枚举约束
CREATE TABLE users (
id INTEGER PRIMARY KEY,
email TEXT UNIQUE NOT NULL CHECK(email LIKE '%_@__%.__%'),
phone TEXT CHECK(phone IS NULL OR phone GLOB '+[0-9]*'),
status TEXT NOT NULL DEFAULT 'pending' CHECK(status IN ('pending', 'active', 'deleted'))
);
-- 日期范围验证
CREATE TABLE events (
id INTEGER PRIMARY KEY, start_date TEXT NOT NULL, end_date TEXT NOT NULL,
CHECK(end_date >= start_date)
);
CREATE TABLE documents (id INTEGER PRIMARY KEY, title TEXT NOT NULL, deleted_at TEXT);
CREATE VIEW active_documents AS SELECT * FROM documents WHERE deleted_at IS NULL;
CREATE INDEX idx_documents_active ON documents(title) WHERE deleted_at IS NULL;
-- 用于相等/范围查询的单列索引 | 复合索引(先相等列,后范围列)
CREATE INDEX idx_users_email ON users(email);
CREATE INDEX idx_orders_user_date ON orders(user_id, created_at DESC);
-- 覆盖索引(避免表查找) | 部分索引(过滤查询)
CREATE INDEX idx_users_cover ON users(email, name, status);
CREATE INDEX idx_active_users ON users(email) WHERE status = 'active';
-- 表达式索引 | 始终使用 EXPLAIN 验证
CREATE INDEX idx_users_lower ON users(LOWER(email));
EXPLAIN QUERY PLAN SELECT * FROM users WHERE email = ?;
# tests/test_schema.py
import pytest
import sqlite3
@pytest.fixture
def db():
conn = sqlite3.connect(':memory:')
conn.execute("PRAGMA foreign_keys = ON")
yield conn
conn.close()
class TestUserSchema:
def test_email_uniqueness(self, db):
db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT UNIQUE NOT NULL)")
db.execute("INSERT INTO users (email) VALUES ('test@example.com')")
with pytest.raises(sqlite3.IntegrityError):
db.execute("INSERT INTO users (email) VALUES ('test@example.com')")
def test_email_format_constraint(self, db):
db.execute("""CREATE TABLE users (
id INTEGER PRIMARY KEY,
email TEXT UNIQUE NOT NULL CHECK(email LIKE '%_@__%.__%'))""")
with pytest.raises(sqlite3.IntegrityError):
db.execute("INSERT INTO users (email) VALUES ('invalid')")
def test_index_used_for_lookup(self, db):
db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
db.execute("CREATE INDEX idx_users_email ON users(email)")
plan = db.execute("EXPLAIN QUERY PLAN SELECT * FROM users WHERE email = ?", ('test@example.com',)).fetchone()
assert 'USING INDEX' in plan[3]
# src/database/schema.py
SCHEMA_SQL = """
CREATE TABLE IF NOT EXISTS users (
id INTEGER PRIMARY KEY AUTOINCREMENT,
email TEXT UNIQUE NOT NULL CHECK(email LIKE '%_@__%.__%'),
name TEXT NOT NULL,
created_at TEXT NOT NULL DEFAULT (datetime('now'))
);
CREATE INDEX IF NOT EXISTS idx_users_email ON users(email);
"""
def init_schema(conn):
"""初始化数据库模式。"""
conn.executescript(SCHEMA_SQL)
conn.commit()
# 运行模式测试
pytest tests/test_schema.py -v
# 运行覆盖率测试
pytest tests/test_schema.py --cov=src/database --cov-report=term-missing
# tests/test_migrations.py
def test_migration_adds_column(db):
"""迁移应添加新列而不丢失数据。"""
# 设置:创建包含数据的旧模式
db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
db.execute("INSERT INTO users (email) VALUES ('test@example.com')")
# 运行迁移
db.execute("ALTER TABLE users ADD COLUMN name TEXT DEFAULT 'Unknown'")
# 验证:数据保留,新列存在
row = db.execute("SELECT id, email, name FROM users").fetchone()
assert row == (1, 'test@example.com', 'Unknown')
良好:具有正确列顺序的复合索引
-- 查询:WHERE user_id = ? AND created_at > ? ORDER BY created_at DESC
CREATE INDEX idx_orders_user_date ON orders(user_id, created_at DESC);
不良:错误的列顺序浪费索引
-- 范围列在前会阻止高效使用相等匹配
CREATE INDEX idx_orders_wrong ON orders(created_at, user_id);
良好:使用覆盖索引避免表查找
-- 在索引中包含所有需要的列
CREATE INDEX idx_users_email_cover ON users(email, name, status);
-- 查询仅访问索引,从不读取表
SELECT name, status FROM users WHERE email = ?;
不良:使用 SELECT * 查询大行
-- 即使有索引也会强制表查找
SELECT * FROM users WHERE email = ?;
良好:使用池重用连接
from contextlib import contextmanager
import threading
class ConnectionPool:
def __init__(self, db_path, max_connections=5):
self._pool, self._lock = [], threading.Lock()
self._db_path, self._max = db_path, max_connections
@contextmanager
def get_connection(self):
conn = self._acquire()
try:
yield conn
finally:
self._release(conn)
不良:每个查询创建新连接
def get_user(email):
conn = sqlite3.connect('app.db') # 代价高昂!
result = conn.execute("SELECT * FROM users WHERE email = ?", (email,)).fetchone()
conn.close()
return result
良好:为读密集型模式存储计算值
CREATE TABLE orders (
id INTEGER PRIMARY KEY,
item_count INTEGER NOT NULL DEFAULT 0, -- 反规范化
total_amount REAL NOT NULL DEFAULT 0 -- 反规范化
);
-- 使用触发器维护反规范化值
不良:每次读取时计算聚合
SELECT o.id, COUNT(oi.id), SUM(oi.price * oi.quantity)
FROM orders o JOIN order_items oi ON oi.order_id = o.id GROUP BY o.id;
良好:按时间分区大表
CREATE TABLE events_2024 (id INTEGER PRIMARY KEY, event_type TEXT, created_at TEXT CHECK(created_at LIKE '2024%'));
CREATE TABLE events_2025 (id INTEGER PRIMARY KEY, event_type TEXT, created_at TEXT CHECK(created_at LIKE '2025%'));
CREATE VIEW events AS SELECT * FROM events_2024 UNION ALL SELECT * FROM events_2025;
不良:单表包含数百万行(超过1000万行会导致全表扫描)
| 错误 | 不良做法 | 良好做法 |
|---|---|---|
| 过度规范化 | 为 first_name、last_name 分别建表 | 直接存储在 users 表中 |
| 缺少外键 | user_id INTEGER(无外键) | user_id INTEGER REFERENCES users(id) |
| 错误的索引顺序 | 为 WHERE user_id=? AND created_at>? 使用 INDEX(created_at, user_id) | INDEX(user_id, created_at) |
| 列中存储 CSV | tags TEXT -- "a,b,c" | 使用带有适当外键的联结表 |
references/advanced-patterns.md、references/security-examples.md)pytest tests/test_schema.py -v 通过您的目标是创建具有以下特点的数据库模式:
您理解模式设计需要平衡:
设计提醒:从第三范式规范化开始,根据实际查询模式添加索引,并使用 EXPLAIN 验证您的假设。如有疑问,请查阅 references/advanced-patterns.md 了解复杂关系模式。
每周安装次数
–
代码仓库
GitHub 星标数
29
首次出现时间
–
安全审计
CRITICAL : Before implementing ANY database schema, you MUST read the relevant reference files:
Readreferences/advanced-patterns.md WHEN:
Readreferences/security-examples.md WHEN:
Risk Level: MEDIUM
Justification : Database schema design impacts data integrity, query performance, and application security. Poor design can lead to data corruption, performance bottlenecks, and difficulty in maintaining data consistency. Schema changes in production require careful migration planning.
You are an expert in database schema design, specializing in:
| SQLite Type | Use For | Notes |
|---|---|---|
| INTEGER | IDs, counts, booleans | PRIMARY KEY for auto-increment |
| TEXT | Strings, JSON, UUIDs | No length limit |
| REAL | Floating point | 8-byte IEEE float |
| BLOB | Binary data | Files, encrypted data |
| NUMERIC | Dates, decimals | Stored as most efficient type |
| Form | Description | When to Use |
|---|---|---|
| 1NF | Atomic values, no repeating groups | Always |
| 2NF | 1NF + no partial dependencies | Most tables |
| 3NF | 2NF + no transitive dependencies | Default choice |
| BCNF | 3NF + every determinant is a key | Complex relationships |
CREATE TABLE entities (
id INTEGER PRIMARY KEY AUTOINCREMENT,
name TEXT NOT NULL CHECK(length(name) BETWEEN 1 AND 255),
email TEXT UNIQUE NOT NULL CHECK(email LIKE '%_@__%.__%'),
status TEXT NOT NULL DEFAULT 'active' CHECK(status IN ('active', 'inactive', 'deleted')),
created_at TEXT NOT NULL DEFAULT (datetime('now')),
deleted_at TEXT
);
CREATE INDEX idx_entities_status ON entities(status) WHERE deleted_at IS NULL;
CREATE TABLE documents (
id INTEGER PRIMARY KEY, user_id INTEGER NOT NULL, title TEXT NOT NULL,
FOREIGN KEY (user_id) REFERENCES users(id) ON DELETE CASCADE
);
CREATE INDEX idx_documents_user ON documents(user_id);
CREATE TABLE document_tags (
document_id INTEGER NOT NULL, tag_id INTEGER NOT NULL,
PRIMARY KEY (document_id, tag_id),
FOREIGN KEY (document_id) REFERENCES documents(id) ON DELETE CASCADE,
FOREIGN KEY (tag_id) REFERENCES tags(id) ON DELETE CASCADE
);
CREATE INDEX idx_doctags_tag ON document_tags(tag_id);
-- Tree structure (adjacency list)
CREATE TABLE categories (
id INTEGER PRIMARY KEY,
parent_id INTEGER REFERENCES categories(id) ON DELETE CASCADE,
name TEXT NOT NULL
);
CREATE INDEX idx_categories_parent ON categories(parent_id);
-- Content table
CREATE TABLE articles (
id INTEGER PRIMARY KEY, title TEXT NOT NULL, body TEXT NOT NULL,
created_at TEXT DEFAULT (datetime('now'))
);
-- FTS5 virtual table
CREATE VIRTUAL TABLE articles_fts USING fts5(
title, body, content=articles, content_rowid=id,
tokenize='porter unicode61', prefix='2,3'
);
-- Sync triggers (INSERT, UPDATE, DELETE)
CREATE TRIGGER articles_ai AFTER INSERT ON articles BEGIN
INSERT INTO articles_fts(rowid, title, body) VALUES (new.id, new.title, new.body);
END;
-- Similar triggers needed for UPDATE and DELETE
CREATE TABLE accounts (id INTEGER PRIMARY KEY, name TEXT NOT NULL, balance REAL DEFAULT 0);
CREATE TABLE accounts_audit (
id INTEGER PRIMARY KEY, account_id INTEGER NOT NULL,
field_name TEXT NOT NULL, old_value TEXT, new_value TEXT,
changed_at TEXT DEFAULT (datetime('now')),
FOREIGN KEY (account_id) REFERENCES accounts(id) ON DELETE CASCADE
);
CREATE TRIGGER accounts_audit_update AFTER UPDATE ON accounts BEGIN
INSERT INTO accounts_audit (account_id, field_name, old_value, new_value)
SELECT new.id, 'balance', old.balance, new.balance WHERE old.balance != new.balance;
END;
CREATE INDEX idx_audit_account ON accounts_audit(account_id, changed_at DESC);
-- Numeric, string format, and enum constraints
CREATE TABLE users (
id INTEGER PRIMARY KEY,
email TEXT UNIQUE NOT NULL CHECK(email LIKE '%_@__%.__%'),
phone TEXT CHECK(phone IS NULL OR phone GLOB '+[0-9]*'),
status TEXT NOT NULL DEFAULT 'pending' CHECK(status IN ('pending', 'active', 'deleted'))
);
-- Date range validation
CREATE TABLE events (
id INTEGER PRIMARY KEY, start_date TEXT NOT NULL, end_date TEXT NOT NULL,
CHECK(end_date >= start_date)
);
CREATE TABLE documents (id INTEGER PRIMARY KEY, title TEXT NOT NULL, deleted_at TEXT);
CREATE VIEW active_documents AS SELECT * FROM documents WHERE deleted_at IS NULL;
CREATE INDEX idx_documents_active ON documents(title) WHERE deleted_at IS NULL;
-- Single column for equality/range | Composite (equality first, then range)
CREATE INDEX idx_users_email ON users(email);
CREATE INDEX idx_orders_user_date ON orders(user_id, created_at DESC);
-- Covering index (avoid table lookup) | Partial index (filtered queries)
CREATE INDEX idx_users_cover ON users(email, name, status);
CREATE INDEX idx_active_users ON users(email) WHERE status = 'active';
-- Expression index | Always verify with EXPLAIN
CREATE INDEX idx_users_lower ON users(LOWER(email));
EXPLAIN QUERY PLAN SELECT * FROM users WHERE email = ?;
# tests/test_schema.py
import pytest
import sqlite3
@pytest.fixture
def db():
conn = sqlite3.connect(':memory:')
conn.execute("PRAGMA foreign_keys = ON")
yield conn
conn.close()
class TestUserSchema:
def test_email_uniqueness(self, db):
db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT UNIQUE NOT NULL)")
db.execute("INSERT INTO users (email) VALUES ('test@example.com')")
with pytest.raises(sqlite3.IntegrityError):
db.execute("INSERT INTO users (email) VALUES ('test@example.com')")
def test_email_format_constraint(self, db):
db.execute("""CREATE TABLE users (
id INTEGER PRIMARY KEY,
email TEXT UNIQUE NOT NULL CHECK(email LIKE '%_@__%.__%'))""")
with pytest.raises(sqlite3.IntegrityError):
db.execute("INSERT INTO users (email) VALUES ('invalid')")
def test_index_used_for_lookup(self, db):
db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
db.execute("CREATE INDEX idx_users_email ON users(email)")
plan = db.execute("EXPLAIN QUERY PLAN SELECT * FROM users WHERE email = ?", ('test@example.com',)).fetchone()
assert 'USING INDEX' in plan[3]
# src/database/schema.py
SCHEMA_SQL = """
CREATE TABLE IF NOT EXISTS users (
id INTEGER PRIMARY KEY AUTOINCREMENT,
email TEXT UNIQUE NOT NULL CHECK(email LIKE '%_@__%.__%'),
name TEXT NOT NULL,
created_at TEXT NOT NULL DEFAULT (datetime('now'))
);
CREATE INDEX IF NOT EXISTS idx_users_email ON users(email);
"""
def init_schema(conn):
"""Initialize database schema."""
conn.executescript(SCHEMA_SQL)
conn.commit()
# Run schema tests
pytest tests/test_schema.py -v
# Run with coverage
pytest tests/test_schema.py --cov=src/database --cov-report=term-missing
# tests/test_migrations.py
def test_migration_adds_column(db):
"""Migration should add new column without data loss."""
# Setup: create old schema with data
db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
db.execute("INSERT INTO users (email) VALUES ('test@example.com')")
# Run migration
db.execute("ALTER TABLE users ADD COLUMN name TEXT DEFAULT 'Unknown'")
# Verify: data preserved, new column exists
row = db.execute("SELECT id, email, name FROM users").fetchone()
assert row == (1, 'test@example.com', 'Unknown')
Good: Composite index with correct column order
-- Query: WHERE user_id = ? AND created_at > ? ORDER BY created_at DESC
CREATE INDEX idx_orders_user_date ON orders(user_id, created_at DESC);
Bad: Wrong column order wastes index
-- Range column first prevents using equality match efficiently
CREATE INDEX idx_orders_wrong ON orders(created_at, user_id);
Good: Use covering index to avoid table lookup
-- Include all needed columns in index
CREATE INDEX idx_users_email_cover ON users(email, name, status);
-- Query only touches index, never reads table
SELECT name, status FROM users WHERE email = ?;
Bad: SELECT * with large rows
-- Forces table lookup even with index
SELECT * FROM users WHERE email = ?;
Good: Reuse connections with pool
from contextlib import contextmanager
import threading
class ConnectionPool:
def __init__(self, db_path, max_connections=5):
self._pool, self._lock = [], threading.Lock()
self._db_path, self._max = db_path, max_connections
@contextmanager
def get_connection(self):
conn = self._acquire()
try:
yield conn
finally:
self._release(conn)
Bad: Create new connection per query
def get_user(email):
conn = sqlite3.connect('app.db') # Expensive!
result = conn.execute("SELECT * FROM users WHERE email = ?", (email,)).fetchone()
conn.close()
return result
Good: Store computed values for read-heavy patterns
CREATE TABLE orders (
id INTEGER PRIMARY KEY,
item_count INTEGER NOT NULL DEFAULT 0, -- Denormalized
total_amount REAL NOT NULL DEFAULT 0 -- Denormalized
);
-- Use triggers to maintain denormalized values
Bad: Calculate aggregates on every read
SELECT o.id, COUNT(oi.id), SUM(oi.price * oi.quantity)
FROM orders o JOIN order_items oi ON oi.order_id = o.id GROUP BY o.id;
Good: Partition large tables by time
CREATE TABLE events_2024 (id INTEGER PRIMARY KEY, event_type TEXT, created_at TEXT CHECK(created_at LIKE '2024%'));
CREATE TABLE events_2025 (id INTEGER PRIMARY KEY, event_type TEXT, created_at TEXT CHECK(created_at LIKE '2025%'));
CREATE VIEW events AS SELECT * FROM events_2024 UNION ALL SELECT * FROM events_2025;
Bad: Single table with millions of rows (10M+ causes full table scans)
| Mistake | Bad | Good |
|---|---|---|
| Over-normalization | Separate tables for first_name, last_name | Store directly in users table |
| Missing FK | user_id INTEGER (no FK) | user_id INTEGER REFERENCES users(id) |
| Wrong index order | INDEX(created_at, user_id) for WHERE user_id=? AND created_at>? | INDEX(user_id, created_at) |
| CSV in column | tags TEXT -- "a,b,c" |
references/advanced-patterns.md, references/security-examples.md)pytest tests/test_schema.py -v passesYour goal is to create database schemas that are:
You understand that schema design requires balancing:
Design Reminder : Start with 3NF normalization, add indexes based on actual query patterns, and use EXPLAIN to verify your assumptions. When in doubt, consult references/advanced-patterns.md for complex relationship patterns.
Weekly Installs
–
Repository
GitHub Stars
29
First Seen
–
Security Audits
React 组合模式指南:Vercel 组件架构最佳实践,提升代码可维护性
109,600 周安装
| Junction table with proper FK |