ClickHouse Best Practices Guide: 28 Rules for Optimizing Schema Design, Query Performance, and Data Ingestion

clickhouse-best-practices by clickhouse/agent-skills

1,600 weekly installs

373 GitHub Stars

GitHub

Install command

npx skills add https://github.com/clickhouse/agent-skills --skill clickhouse-best-practices

Development, Data Analysis, DevOps

ClickHouse Best Practices

Comprehensive guidance for ClickHouse covering schema design, query optimization, and data ingestion. Contains 28 rules across 3 main categories (schema, query, insert), prioritized by impact.

Official docs: ClickHouse Best Practices

IMPORTANT: How to Apply This Skill

Before answering ClickHouse questions, follow this priority order:

Check for applicable rules in the rules/ directory
If rules exist: Apply them and cite them in your response using "Per rule-name..."
If no rule exists: Use the LLM's ClickHouse knowledge or search documentation
If uncertain: Use web search for current best practices
Always cite your source: rule name, "general ClickHouse guidance", or URL

Why rules take priority: ClickHouse has specific behaviors (columnar storage, sparse indexes, merge tree mechanics) where general database intuition can be misleading. The rules encode validated, ClickHouse-specific guidance.

For Formal Reviews

When performing a formal review of schemas, queries, or data ingestion:

Review Procedures

For Schema Reviews (CREATE TABLE, ALTER TABLE)

Read these rule files in order:

rules/schema-pk-plan-before-creation.md - ORDER BY is immutable
rules/schema-pk-cardinality-order.md - Column ordering in keys
rules/schema-pk-prioritize-filters.md - Filter column inclusion
rules/schema-types-native-types.md - Proper type selection
rules/schema-types-minimize-bitwidth.md - Numeric type sizing
rules/schema-types-lowcardinality.md - LowCardinality usage
rules/schema-types-avoid-nullable.md - Nullable vs DEFAULT
rules/schema-partition-low-cardinality.md - Partition count limits
rules/schema-partition-lifecycle.md - Partitioning purpose

Check for:

PRIMARY KEY / ORDER BY column order (low-to-high cardinality)
Data types match actual data ranges
LowCardinality applied to appropriate string columns
Partition key cardinality bounded (100-1,000 values)
ReplacingMergeTree has version column if used
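The first check above (low-to-high cardinality in the key) can be sketched in a few lines of Python. This is only an illustration: the column names and distinct-value estimates are hypothetical, and the helper `order_by_clause` is not part of the skill. It also ignores the separate question of which columns belong in the key at all (see `schema-pk-prioritize-filters`).

```python
def order_by_clause(cardinalities: dict[str, int]) -> str:
    """Build an ORDER BY clause with columns sorted from low to high cardinality."""
    ordered = sorted(cardinalities, key=lambda column: cardinalities[column])
    return "ORDER BY (" + ", ".join(ordered) + ")"


# Hypothetical columns with estimated distinct-value counts.
estimates = {"user_id": 5_000_000, "event_type": 40, "country": 200}
clause = order_by_clause(estimates)
print(clause)  # ORDER BY (event_type, country, user_id)
```

In practice the estimates would come from profiling real data; the sketch only shows the ordering step.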

For Query Reviews (SELECT, JOIN, aggregations)

Read these rule files:

rules/query-join-choose-algorithm.md - Algorithm selection
rules/query-join-filter-before.md - Pre-join filtering
rules/query-join-use-any.md - ANY vs regular JOIN
rules/query-index-skipping-indices.md - Secondary index usage
rules/schema-pk-filter-on-orderby.md - Filter alignment with ORDER BY

Check for:

Filters use ORDER BY prefix columns
JOINs filter tables before joining (not after)
Correct JOIN algorithm for table sizes
Skipping indices for non-ORDER BY filter columns

For Insert Strategy Reviews (data ingestion, updates, deletes)

Read these rule files:

rules/insert-batch-size.md - Batch sizing requirements
rules/insert-mutation-avoid-update.md - UPDATE alternatives
rules/insert-mutation-avoid-delete.md - DELETE alternatives
rules/insert-async-small-batches.md - Async insert usage
rules/insert-optimize-avoid-final.md - OPTIMIZE TABLE risks

Check for:

Batch size 10K-100K rows per INSERT
No ALTER TABLE UPDATE for frequent changes
ReplacingMergeTree or CollapsingMergeTree for update patterns
Async inserts enabled for high-frequency small batches
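As a rough illustration of the batch-size check, the following Python sketch groups an arbitrary row stream into batches within the 10K-100K range before each INSERT. The helper name `batched_rows` and the default of 50,000 are assumptions made for the example, and the actual send step is left to whichever ClickHouse client is in use.

```python
from itertools import islice
from typing import Iterable, Iterator


def batched_rows(rows: Iterable[tuple], batch_size: int = 50_000) -> Iterator[list[tuple]]:
    """Yield lists of at most batch_size rows, one list per INSERT."""
    if not 10_000 <= batch_size <= 100_000:
        raise ValueError("batch_size should stay within 10K-100K rows")
    iterator = iter(rows)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


# Hypothetical event stream of 120,000 rows.
events = ((i, f"event-{i}") for i in range(120_000))
sizes = [len(batch) for batch in batched_rows(events)]
print(sizes)  # [50000, 50000, 20000]
```

The final batch can fall below 10K rows when the stream ends; for sources that only ever produce small batches, the async insert rule is the relevant one instead.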

Output Format

Structure your response as follows:

## Rules Checked
- `rule-name-1` - Compliant / Violation found
- `rule-name-2` - Compliant / Violation found
...

## Findings

### Violations
- **`rule-name`**: Description of the issue
  - Current: [what the code does]
  - Required: [what it should do]
  - Fix: [specific correction]

### Compliant
- `rule-name`: Brief note on why it's correct

## Recommendations
[Prioritized list of changes, citing rules]
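For tooling that assembles review results programmatically, the template above could be rendered along these lines. This is a minimal sketch: the `Violation` dataclass, the `render_report` function, and the sample findings are all hypothetical and only mirror the section order of the template.

```python
from dataclasses import dataclass


@dataclass
class Violation:
    rule: str
    issue: str
    current: str
    required: str
    fix: str


def render_report(checked, violations, compliant, recommendations):
    """Render review findings in the section order of the output format."""
    lines = ["## Rules Checked"]
    for rule, ok in checked.items():
        lines.append(f"- `{rule}` - {'Compliant' if ok else 'Violation found'}")
    lines += ["", "## Findings", "", "### Violations"]
    for v in violations:
        lines += [
            f"- **`{v.rule}`**: {v.issue}",
            f"  - Current: {v.current}",
            f"  - Required: {v.required}",
            f"  - Fix: {v.fix}",
        ]
    lines += ["", "### Compliant"]
    lines += [f"- `{rule}`: {note}" for rule, note in compliant.items()]
    lines += ["", "## Recommendations"]
    lines += [f"{i}. {text}" for i, text in enumerate(recommendations, start=1)]
    return "\n".join(lines)


# Hypothetical findings from an insert strategy review.
report = render_report(
    checked={"insert-batch-size": False, "insert-async-small-batches": True},
    violations=[
        Violation(
            rule="insert-batch-size",
            issue="Rows are inserted one at a time",
            current="1 row per INSERT",
            required="10K-100K rows per INSERT",
            fix="Buffer rows client-side and flush in batches",
        )
    ],
    compliant={"insert-async-small-batches": "Async inserts cover the small-batch path"},
    recommendations=["Batch inserts per insert-batch-size"],
)
print(report)
```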

Rule Categories by Priority

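Priority	Category	Impact	Prefix	Rule Count
1	Primary Key Selection	CRITICAL	`schema-pk-`	4
2	Data Type Selection	CRITICAL	`schema-types-`	5
3	JOIN Optimization	CRITICAL	`query-join-`	5
4	Insert Batching	CRITICAL	`insert-batch-`	1
5	Avoid Mutations	CRITICAL	`insert-mutation-`	2
6	Partitioning Strategy	HIGH	`schema-partition-`	4
7	Skipping Indices	HIGH	`query-index-`	1
8	Materialized Views	HIGH	`query-mv-`	2
9	Async Inserts	HIGH	`insert-async-`	2
10	Avoid OPTIMIZE	HIGH	`insert-optimize-`	1
11	JSON Usage	MEDIUM	`schema-json-`	1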
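Because each category is identified by a rule-name prefix, a rule can be mapped to its priority and impact with a simple prefix match. The sketch below assumes the eleven prefixes listed in this guide's category table; the `categorize` helper is hypothetical and not part of the skill.

```python
# (prefix, category, impact) in priority order, as listed in the category table.
CATEGORIES = [
    ("schema-pk-", "Primary Key Selection", "CRITICAL"),
    ("schema-types-", "Data Type Selection", "CRITICAL"),
    ("query-join-", "JOIN Optimization", "CRITICAL"),
    ("insert-batch-", "Insert Batching", "CRITICAL"),
    ("insert-mutation-", "Avoid Mutations", "CRITICAL"),
    ("schema-partition-", "Partitioning Strategy", "HIGH"),
    ("query-index-", "Skipping Indices", "HIGH"),
    ("query-mv-", "Materialized Views", "HIGH"),
    ("insert-async-", "Async Inserts", "HIGH"),
    ("insert-optimize-", "Avoid OPTIMIZE", "HIGH"),
    ("schema-json-", "JSON Usage", "MEDIUM"),
]


def categorize(rule_name: str):
    """Return (priority, category, impact) for a rule name, or None if no prefix matches."""
    for priority, (prefix, category, impact) in enumerate(CATEGORIES, start=1):
        if rule_name.startswith(prefix):
            return priority, category, impact
    return None


print(categorize("query-join-use-any"))  # (3, 'JOIN Optimization', 'CRITICAL')
```

Sorting a set of violated rules by the returned priority gives one plausible ordering for the Recommendations section.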

Quick Reference

Schema Design - Primary Key (CRITICAL)

schema-pk-plan-before-creation - Plan ORDER BY before table creation (immutable)
schema-pk-cardinality-order - Order columns low-to-high cardinality
schema-pk-prioritize-filters - Include frequently filtered columns
schema-pk-filter-on-orderby - Query filters must use ORDER BY prefix

Schema Design - Data Types (CRITICAL)

schema-types-native-types - Use native types, not String for everything
schema-types-minimize-bitwidth - Use smallest numeric type that fits
schema-types-lowcardinality - LowCardinality for <10K unique strings
schema-types-enum - Enum for finite value sets with validation
schema-types-avoid-nullable - Avoid Nullable; use DEFAULT instead
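Two of these rules reduce to simple threshold checks, which the Python sketch below illustrates. The helper names are hypothetical; the unsigned integer ranges are the standard ones for `UInt8` through `UInt64`, and the 10K cutoff is the figure quoted in `schema-types-lowcardinality`. Signed and wider integer types are left out for brevity.

```python
# ClickHouse unsigned integer types with their maximum values.
UNSIGNED_TYPES = [
    ("UInt8", 2**8 - 1),
    ("UInt16", 2**16 - 1),
    ("UInt32", 2**32 - 1),
    ("UInt64", 2**64 - 1),
]


def smallest_unsigned_type(max_value: int) -> str:
    """Pick the narrowest unsigned type that can hold max_value."""
    if max_value < 0:
        raise ValueError("negative values need a signed type")
    for name, limit in UNSIGNED_TYPES:
        if max_value <= limit:
            return name
    raise ValueError("value does not fit in UInt64")


def string_column_type(distinct_values: int) -> str:
    """Suggest LowCardinality(String) below 10K distinct values, plain String otherwise."""
    return "LowCardinality(String)" if distinct_values < 10_000 else "String"


print(smallest_unsigned_type(200))     # UInt8
print(smallest_unsigned_type(70_000))  # UInt32
print(string_column_type(50))          # LowCardinality(String)
```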

Schema Design - Partitioning (HIGH)

schema-partition-low-cardinality - Keep partition count 100-1,000
schema-partition-lifecycle - Use partitioning for data lifecycle, not queries
schema-partition-query-tradeoffs - Understand partition pruning trade-offs
schema-partition-start-without - Consider starting without partitioning
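To make the partition-count guidance concrete, the sketch below estimates how many partitions a date-based key would create over a retention window. It treats 1,000 as an upper bound, which is one reading of the "100-1,000" figure, and the `partition_count` helper and the five-year window are hypothetical.

```python
from datetime import date


def partition_count(start: date, end: date, granularity: str) -> int:
    """Count distinct partition values between start and end, inclusive."""
    if granularity == "month":
        return (end.year - start.year) * 12 + (end.month - start.month) + 1
    if granularity == "day":
        return (end - start).days + 1
    raise ValueError("granularity must be 'month' or 'day'")


# Hypothetical five-year retention window.
start, end = date(2021, 1, 1), date(2025, 12, 31)
monthly = partition_count(start, end, "month")
daily = partition_count(start, end, "day")
print(monthly, monthly <= 1_000)  # 60 True
print(daily, daily <= 1_000)      # 1826 False
```

Under these assumptions a monthly key such as `toYYYYMM(...)` stays well inside the bound, while a daily key over the same window does not.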

Schema Design - JSON (MEDIUM)

schema-json-when-to-use - JSON for dynamic schemas; typed columns for known

Query Optimization - JOINs (CRITICAL)

query-join-choose-algorithm - Select algorithm based on table sizes
query-join-use-any - ANY JOIN when only one match needed
query-join-filter-before - Filter tables before joining
query-join-consider-alternatives - Dictionaries/denormalization vs JOIN
query-join-null-handling - join_use_nulls=0 for default values

Query Optimization - Indices (HIGH)

query-index-skipping-indices - Skipping indices for non-ORDER BY filters

Query Optimization - Materialized Views (HIGH)

query-mv-incremental - Incremental MVs for real-time aggregations
query-mv-refreshable - Refreshable MVs for complex joins

Insert Strategy - Batching (CRITICAL)

insert-batch-size - Batch 10K-100K rows per INSERT

Insert Strategy - Async (HIGH)

insert-async-small-batches - Async inserts for high-frequency small batches
insert-format-native - Native format for best performance

Insert Strategy - Mutations (CRITICAL)

insert-mutation-avoid-update - ReplacingMergeTree instead of ALTER UPDATE
insert-mutation-avoid-delete - Lightweight DELETE or DROP PARTITION

Insert Strategy - Optimization (HIGH)

insert-optimize-avoid-final - Let background merges work

When to Apply

This skill activates when you encounter:

CREATE TABLE statements
ALTER TABLE modifications
ORDER BY or PRIMARY KEY discussions
Data type selection questions
Slow query troubleshooting
JOIN optimization requests
Data ingestion pipeline design
Update/delete strategy questions
ReplacingMergeTree or other specialized engine usage
Partitioning strategy decisions

Rule File Structure

Each rule file in rules/ contains:

YAML frontmatter: title, impact level, tags
Brief explanation: Why this rule matters
Incorrect example: Anti-pattern with explanation
Correct example: Best practice with explanation
Additional context: Trade-offs, when to apply, references

Full Compiled Document

For the complete guide with all rules expanded inline: AGENTS.md

Use AGENTS.md when you need to check multiple rules quickly without reading individual files.

Weekly Installs

1.0K

Repository

clickhouse/agent-skills

GitHub Stars

349

First Seen

Jan 27, 2026

Security Audits

Gen Agent Trust Hub: Pass
Socket: Pass
Snyk: Warn

Installed on

opencode: 905
codex: 899
gemini-cli: 889
github-copilot: 886
amp: 859
kimi-cli: 859
