elasticsearch-esql by elastic/agent-skills
Install: npx skills add https://github.com/elastic/agent-skills --skill elasticsearch-esql
Execute ES|QL queries against Elasticsearch.
ES|QL (Elasticsearch Query Language) is a piped query language for Elasticsearch. It is NOT SQL, EQL, or KQL — do not reuse syntax from those languages.
ES|QL uses pipes (|) to chain commands: FROM index | WHERE condition | STATS aggregation BY field | SORT field | LIMIT n
Prerequisite: ES|QL requires `_source` to be enabled on queried indices. Indices with `_source` disabled (e.g., `"_source": { "enabled": false }`) will cause ES|QL queries to fail.

Version Compatibility: ES|QL was introduced in 8.11 (tech preview) and became GA in 8.14. Features like `LOOKUP JOIN` (8.18+), `MATCH` (8.17+), and `INLINE STATS` (9.2+) were added in later versions. On pre-8.18 clusters, use `ENRICH` as a fallback for `LOOKUP JOIN` (see generation tips). `INLINE STATS` and counter-field `RATE()` have no fallback before 9.2. Check references/esql-version-history.md for feature availability by version.

Cluster Detection: Use the `GET /` response to determine the cluster type and version:

* `build_flavor: "serverless"` — Elastic Cloud Serverless. Reports version `8.11.0` but supports all latest ES|QL features. Ignore the version number and assume all GA and preview features are available.
* `build_flavor: "default"` — Self-managed or Elastic Cloud Hosted. Use `version.number` for feature availability.
* Snapshot builds have a `version.number` like `9.4.0-SNAPSHOT`. Strip the `-SNAPSHOT` suffix and use the major.minor for version checks. Snapshot builds include all features from that version plus potentially unreleased features from development — if a query fails with an unknown function/command, it may simply not have landed yet. Elastic employees commonly use snapshot builds for testing.
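The version-gating logic above can be sketched in plain shell. The 8.18 threshold for `LOOKUP JOIN` comes from the compatibility notes; the sample version string is hypothetical, standing in for `version.number` from a real `GET /` response:

```shell
# Sketch: derive major.minor from version.number and gate a feature on it.
# "9.4.0-SNAPSHOT" is a hypothetical sample value, not from a live cluster.
version="9.4.0-SNAPSHOT"
version="${version%-SNAPSHOT}"   # strip snapshot suffix -> 9.4.0
major="${version%%.*}"           # -> 9
rest="${version#*.}"
minor="${rest%%.*}"              # -> 4
# LOOKUP JOIN requires 8.18+; otherwise fall back to ENRICH
if [ "$major" -gt 8 ] || { [ "$major" -eq 8 ] && [ "$minor" -ge 18 ]; }; then
  feature="LOOKUP JOIN"
else
  feature="ENRICH fallback"
fi
echo "$feature"
```

Remember this check only applies when `build_flavor` is `"default"`; on serverless, skip it entirely.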
Elasticsearch connection is configured via environment variables. Run node scripts/esql.js test to verify the connection. If the test fails, suggest these setup options to the user, then stop. Do not try to explore further until a successful connection test.
# Option A: Elastic Cloud — Cloud ID + API key
export ELASTICSEARCH_CLOUD_ID="deployment-name:base64encodedcloudid"
export ELASTICSEARCH_API_KEY="base64encodedapikey"

# Option B: URL + API key
export ELASTICSEARCH_URL="https://elasticsearch:9200"
export ELASTICSEARCH_API_KEY="base64encodedapikey"

# Option C: URL + basic auth
export ELASTICSEARCH_URL="https://elasticsearch:9200"
export ELASTICSEARCH_USERNAME="elastic"
export ELASTICSEARCH_PASSWORD="changeme"
For local development and testing, use start-local to quickly spin up Elasticsearch and Kibana using Docker or Podman:
curl -fsSL https://elastic.co/start-local | sh
After installation completes, Elasticsearch runs at http://localhost:9200 and Kibana at http://localhost:5601. The script generates a random password for the elastic user and an API key, both stored in the .env file inside the created elastic-start-local folder.
To configure the environment variables for this skill, source the .env file and export the connection settings:
source elastic-start-local/.env
export ELASTICSEARCH_URL="$ES_LOCAL_URL"
export ELASTICSEARCH_API_KEY="$ES_LOCAL_API_KEY"
Then run node scripts/esql.js test to verify the connection.
To skip TLS certificate verification (e.g., for clusters with self-signed certificates):

export ELASTICSEARCH_INSECURE="true"
node scripts/esql.js indices # List all indices
node scripts/esql.js indices "logs-*" # List matching indices
node scripts/esql.js schema "logs-2024.01.01" # Get field mappings for an index
node scripts/esql.js raw "FROM logs-* | STATS count = COUNT(*) BY host.name | SORT count DESC | LIMIT 5"
node scripts/esql.js raw "FROM logs-* | STATS count = COUNT(*) BY component | SORT count DESC" --tsv
TSV Output Options:
* `--tsv` or `-t`: Output as tab-separated values (clean, no decorations)
* `--no-header`: Omit the header row

1. Detect deployment type: Always run node scripts/esql.js test first. This detects whether the deployment is a Serverless project (all features available) or a versioned cluster (features depend on version). The build_flavor field from GET / is the authoritative signal — if it equals "serverless", ignore the reported version number and use all ES|QL features freely.

node scripts/esql.js test
2. Discover schema (required — never guess index or field names):
node scripts/esql.js indices "pattern*"
node scripts/esql.js schema "index-name"
Always run schema discovery before generating queries. Index names and field names vary across deployments and cannot be reliably guessed. Even common-sounding data (e.g., "logs") may live in indices named logs-test, logs-app-*, or application_logs. Field names may use ECS dotted notation (source.ip, service.name) or flat custom names — the only way to know is to check.
The schema command reports the index mode. If it shows Index mode: time_series, the output includes the data stream name and copy-pasteable TS syntax — use TS <data-stream> (not FROM), TBUCKET(interval) (not DATE_TRUNC), and wrap counter fields with SUM(RATE(...)). Read the full TS section in Generation Tips before writing any time series query. You can also check the index mode directly via the Elasticsearch index settings API:
curl -s "$ELASTICSEARCH_URL/<index-name>/_settings/index.mode" -H "Authorization: ApiKey $ELASTICSEARCH_API_KEY"
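As a rough sketch, the `index.mode` value can be pulled out of that settings response without extra tooling; the sample JSON below is illustrative of the response shape, not captured from a live cluster:

```shell
# Sketch: extract index.mode from a captured _settings response.
# The sample JSON is illustrative; a real response comes from the curl above.
response='{"metrics-cpu":{"settings":{"index":{"mode":"time_series"}}}}'
mode=$(printf '%s' "$response" | sed -n 's/.*"mode":"\([a-z_]*\)".*/\1/p')
echo "$mode"
```

If this prints time_series, use the TS command form described above instead of FROM.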
3. Choose the right ES|QL feature for the task: Before writing queries, match the user's intent to the most appropriate ES|QL feature. Prefer a single advanced query over multiple basic ones.
* "find patterns," "categorize," "group similar messages" → `CATEGORIZE(field)`
* "spike," "dip," "anomaly," "when did X change" → `CHANGE_POINT value ON key`
* "trend over time," "time series" → `STATS ... BY BUCKET(@timestamp, interval)` or `TS` for TSDB
* "search," "find documents matching" → `MATCH`, `QSTR`, `KQL`
* "count," "average," "breakdown" → `STATS` with aggregation functions
4. Read the references before generating queries:
* [Generation Tips](https://github.com/elastic/agent-skills/blob/HEAD/skills/elasticsearch/elasticsearch-esql/references/generation-tips.md) — key patterns (TS/TBUCKET/RATE, per-agg WHERE, LOOKUP JOIN, CIDR_MATCH), common templates, and ambiguity handling
* [Time Series Queries](https://github.com/elastic/agent-skills/blob/HEAD/skills/elasticsearch/elasticsearch-esql/references/time-series-queries.md) — **read before any TS query**: inner/outer aggregation model, TBUCKET syntax, RATE constraints
* [ES|QL Complete Reference](https://github.com/elastic/agent-skills/blob/HEAD/skills/elasticsearch/elasticsearch-esql/references/esql-reference.md) — full syntax for all commands and functions
* [ES|QL Search Reference](https://github.com/elastic/agent-skills/blob/HEAD/skills/elasticsearch/elasticsearch-esql/references/esql-search.md) — for full-text search queries (MATCH, QSTR, KQL, scoring)
5. Generate the query following ES|QL syntax:
* Start with `FROM index-pattern` (or `TS index-pattern` for time series indices)
* Add `WHERE` for filtering (use `TRANGE` for time ranges on 9.3+)
* Use `EVAL` for computed fields
* Use `STATS ... BY` for aggregations
* For time series metrics: `TS` with `SUM(RATE(...))` for counters, `AVG(...)` for gauges, and `TBUCKET(interval)` for time bucketing — see the TS section in [Generation Tips](https://github.com/elastic/agent-skills/blob/HEAD/skills/elasticsearch/elasticsearch-esql/references/generation-tips.md) for the three critical syntax rules
* For detecting spikes, dips, or anomalies, use `CHANGE_POINT` after time-bucketed aggregation
* Add `SORT` and `LIMIT` as needed
6. Execute with the TSV flag:
node scripts/esql.js raw "FROM index | STATS count = COUNT(*) BY field" --tsv
Version availability: This section omits version annotations for readability. Check ES|QL Version History for feature availability by Elasticsearch version.
FROM index-pattern
| WHERE condition
| EVAL new_field = expression
| STATS aggregation BY grouping
| SORT field DESC
| LIMIT n
Filter and limit:
FROM logs-*
| WHERE @timestamp > NOW() - 24 hours AND level == "error"
| SORT @timestamp DESC
| LIMIT 100
Aggregate by time:
FROM metrics-*
| WHERE @timestamp > NOW() - 7 days
| STATS avg_cpu = AVG(cpu.percent) BY bucket = DATE_TRUNC(1 hour, @timestamp)
| SORT bucket DESC
Top N with count:
FROM web-logs
| STATS count = COUNT(*) BY response.status_code
| SORT count DESC
| LIMIT 10
Text search: Use MATCH, QSTR, or KQL for full-text search instead of LIKE/RLIKE — they are significantly faster and support relevance scoring. See ES|QL Search Reference for the full guide.
FROM documents METADATA _score
| WHERE MATCH(content, "search terms")
| SORT _score DESC
| LIMIT 20
Log categorization (Platinum license): Use CATEGORIZE to auto-cluster log messages into pattern groups. Prefer this over running multiple STATS ... BY field queries when exploring or finding patterns in unstructured text.
FROM logs-*
| WHERE @timestamp > NOW() - 24 hours
| STATS count = COUNT(*) BY category = CATEGORIZE(message)
| SORT count DESC
| LIMIT 20
Change point detection (Platinum license): Use CHANGE_POINT to detect spikes, dips, and trend shifts in a metric series. Prefer this over manual inspection of time-bucketed counts.
FROM logs-*
| STATS c = COUNT(*) BY t = BUCKET(@timestamp, 30 seconds)
| SORT t
| CHANGE_POINT c ON t
| WHERE type IS NOT NULL
Time series metrics:
// Counter metric: SUM(RATE(...)) with TBUCKET(duration)
TS metrics-tsds
| WHERE TRANGE(1 hour)
| STATS SUM(RATE(requests)) BY TBUCKET(1 hour), host
// Gauge metric: AVG(...) — no RATE needed
TS metrics-tsds
| STATS avg_cpu = AVG(cpu) BY service.name, bucket = TBUCKET(5 minutes)
| SORT bucket
Data enrichment with LOOKUP JOIN: Use RENAME when the join key has a different name in the source vs the lookup index. Use multiple fields in ON when the lookup table has a composite key.
// Field name mismatch — RENAME before joining
FROM support_tickets
| RENAME product AS product_name
| LOOKUP JOIN knowledge_base ON product_name
// Multi-field join (9.2+)
FROM application_logs
| LOOKUP JOIN service_registry ON service_name, environment
| KEEP service_name, environment, owner_team
Multivalue field filtering: Use MV_CONTAINS to check if a multivalue field contains a specific value. Use MV_COUNT to count values.
// Filter by multivalue membership
FROM employees
| WHERE MV_CONTAINS(languages, "Python")
// Find entries matching multiple values
FROM employees
| WHERE MV_CONTAINS(languages, "Java") AND MV_CONTAINS(languages, "Python")
// Count multivalue entries
FROM employees
| EVAL num_languages = MV_COUNT(languages)
| SORT num_languages DESC
Change point detection (alternate example): Use when the user asks about spikes, dips, or anomalies. Requires time-bucketed aggregation, SORT, then CHANGE_POINT.
FROM logs-*
| STATS error_count = COUNT(*) BY bucket = DATE_TRUNC(1 hour, @timestamp)
| SORT bucket
| CHANGE_POINT error_count ON bucket AS type, pvalue
For complete ES|QL syntax including all commands, functions, and operators, read the ES|QL Complete Reference (references/esql-reference.md).
When query execution fails, the script returns the error reported by Elasticsearch.

Common issues:
* Unknown index or field names — verify them with node scripts/esql.js schema "index"

# Schema discovery
node scripts/esql.js test
node scripts/esql.js indices "logs-*"
node scripts/esql.js schema "logs-2024.01.01"
# Execute queries
node scripts/esql.js raw "FROM logs-* | STATS count = COUNT(*) BY host.name | LIMIT 10"
node scripts/esql.js raw "FROM metrics-* | STATS avg = AVG(cpu.percent) BY hour = DATE_TRUNC(1 hour, @timestamp)" --tsv
Weekly Installs
213
Repository
GitHub Stars
89
First Seen
11 days ago
Security Audits
Gen Agent Trust Hub: Fail · Socket: Pass · Snyk: Pass
Installed on
cursor: 184
github-copilot: 175
codex: 173
gemini-cli: 172
opencode: 172
amp: 171