npx skills add https://github.com/axiomhq/skills --skill controlling-costs
Dashboards, monitors, and waste identification for Axiom usage optimization.
Load required skills:
skill: axiom-sre
skill: building-dashboards
Building-dashboards provides: dashboard-list, dashboard-get, dashboard-create, dashboard-update, dashboard-delete
Find the audit dataset. Try axiom-audit first:
```apl
['axiom-audit']
| where _time > ago(1h)
| summarize count() by action
| where action in ('usageCalculated', 'runAPLQueryCost')
```
If axiom-audit is not found, try the alternatives axiom-audit-logs-view and audit-logs. No usageCalculated events → wrong dataset; ask the user. Then verify axiom-history access (required for Phase 4):
```apl
['axiom-history'] | where _time > ago(1h) | take 1
```
If not found, Phase 4 optimization will not work.
Confirm with user:
Replace <deployment> and <audit-dataset> in all commands below.
Tips:
- -h for full usage
- Avoid head or tail — causes SIGPIPE errors
- jq for JSON parsing
- axiom-query for ad-hoc APL, not direct CLI

| User request | Run these phases |
|---|---|
| "reduce costs" / "find waste" | 0 → 1 → 4 |
| "set up cost control" | 0 → 1 → 2 → 3 |
| "deploy dashboard" | 0 → 2 |
| "create monitors" | 0 → 3 |
| "check for drift" | 0 only |
```shell
# Existing dashboard?
dashboard-list <deployment> | grep -i cost

# Existing monitors?
axiom-api <deployment> GET "/v2/monitors" | jq -r '.[] | select(.name | startswith("Cost Control:")) | "\(.id)\t\(.name)"'
```
If found, fetch with dashboard-get and compare to templates/dashboard.json for drift.
scripts/baseline-stats -d <deployment> -a <audit-dataset>
Captures daily ingest stats and produces the Analysis Queue (needed for Phase 4).
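To illustrate the kind of aggregation baseline-stats performs, here is a minimal sketch of rolling hourly usageCalculated audit events up into daily ingest per dataset. The event shape follows the audit fields documented below; the function name and sample data are hypothetical, not the script's actual implementation.

```python
from collections import defaultdict

def daily_ingest_gb(events):
    """Sum hourly usageCalculated events into GB ingested per (dataset, day)."""
    totals = defaultdict(float)
    for e in events:
        if e["action"] != "usageCalculated":
            continue  # skip query-cost events
        day = e["_time"][:10]  # ISO timestamp prefix, e.g. "2026-01-24"
        totals[(e["properties"]["dataset"], day)] += (
            e["properties"]["hourly_ingest_bytes"] / 1e9
        )
    return dict(totals)

sample = [
    {"_time": "2026-01-24T00:00:00Z", "action": "usageCalculated",
     "properties": {"dataset": "k8s-logs", "hourly_ingest_bytes": 2_000_000_000}},
    {"_time": "2026-01-24T01:00:00Z", "action": "usageCalculated",
     "properties": {"dataset": "k8s-logs", "hourly_ingest_bytes": 3_000_000_000}},
    {"_time": "2026-01-24T00:30:00Z", "action": "runAPLQueryCost",
     "properties": {"dataset": "k8s-logs", "hourly_ingest_bytes": 0}},
]
print(daily_ingest_gb(sample))  # {('k8s-logs', '2026-01-24'): 5.0}
```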
scripts/deploy-dashboard -d <deployment> -a <audit-dataset>
Creates dashboard with: ingest trends, burn rate, projections, waste candidates, top users. See reference/dashboard-panels.md for details.
Contract info is required. You must have the contract limit from preflight step 4.
scripts/list-notifiers -d <deployment>
Present the list to the user and ask which notifier they want for cost alerts. If they don't want notifications, proceed without -n.
scripts/create-monitors -d <deployment> -a <audit-dataset> -c <contract_tb> [-n <notifier_id>]
Creates 3 monitors:
The spike monitors use notifyByGroup: true so each dataset triggers a separate alert.
See reference/monitor-strategy.md for threshold derivation.
Run scripts/baseline-stats if not already done. It outputs a prioritized list:
| Priority | Meaning |
|---|---|
| P0⛔ | Top 3 by ingest OR >10% of total — MANDATORY |
| P1 | Never queried — strong drop candidate |
| P2 | Rarely queried (Work/GB < 100) — likely waste |
Work/GB = query cost (GB·ms) / ingest (GB). Lower = less value from data.
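The priority rules above can be sketched as a small classifier. This is an illustration of the logic, not the baseline-stats implementation; the dataset dict shape is a hypothetical assumption.

```python
def work_per_gb(query_cost_gbms, ingest_gb):
    # Work/GB = query cost (GB*ms) / ingest (GB); lower = less value from data
    return query_cost_gbms / ingest_gb if ingest_gb else 0.0

def priority(dataset, total_ingest_gb, top3_names):
    """Assign P0/P1/P2 per the table above.

    dataset: {"name": str, "ingest_gb": float, "query_cost_gbms": float}
    """
    share = dataset["ingest_gb"] / total_ingest_gb
    if dataset["name"] in top3_names or share > 0.10:
        return "P0"  # top 3 by ingest OR >10% of total — mandatory
    w = work_per_gb(dataset["query_cost_gbms"], dataset["ingest_gb"])
    if w == 0:
        return "P1"  # never queried — strong drop candidate
    if w < 100:
        return "P2"  # rarely queried — likely waste
    return None      # not flagged

ds = {"name": "debug-logs", "ingest_gb": 50, "query_cost_gbms": 0}
print(priority(ds, total_ingest_gb=10_000, top3_names={"k8s-logs"}))  # P1
```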
Work top-to-bottom. For each dataset:
Step 1: Column analysis
scripts/analyze-query-coverage -d <deployment> -D <dataset> -a <audit-dataset>
If 0 queries → recommend DROP, move to next.
Step 2: Field value analysis
Pick a field from the suggested list (usually app, service, or kubernetes.labels.app):
scripts/analyze-query-coverage -d <deployment> -D <dataset> -a <audit-dataset> -f <field>
Note values with high volume but never queried (⚠️ markers).
Step 3: Handle empty values
If (empty) has >5% volume, you MUST drill down with an alternative field (e.g., kubernetes.namespace_name).
Step 4: Record recommendation
For each dataset, note: name, ingest volume, Work/GB, top unqueried values, action (DROP/SAMPLE/KEEP), estimated savings.
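Steps 2–3 above amount to two checks: flag field values with volume but zero queries, and trigger a drill-down when the (empty) bucket exceeds 5%. A minimal sketch, assuming a hypothetical {value: (volume_gb, query_count)} shape for the analyze-query-coverage output:

```python
def flag_waste(values, total_gb, empty_threshold=0.05):
    """Return (unqueried high-volume values, whether (empty) needs drill-down)."""
    # Values with volume but zero queries — the script's warning markers
    flagged = [v for v, (gb, queries) in values.items()
               if queries == 0 and gb > 0]
    flagged.sort(key=lambda v: values[v][0], reverse=True)  # biggest first
    # Step 3: (empty) above 5% of total volume forces an alternative-field pass
    empty_gb = values.get("(empty)", (0, 0))[0]
    needs_drilldown = empty_gb / total_gb > empty_threshold
    return flagged, needs_drilldown

vals = {"checkout": (120.0, 0), "api": (300.0, 5000), "(empty)": (80.0, 0)}
flagged, drill = flag_waste(vals, total_gb=1000.0)
print(flagged, drill)  # ['checkout', '(empty)'] True
```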
Done when all P0⛔ and P1 datasets are analyzed; then compile the report using reference/analysis-report-template.md.
```shell
# Delete monitors
axiom-api <deployment> GET "/v2/monitors" | jq -r '.[] | select(.name | startswith("Cost Control:")) | "\(.id)\t\(.name)"'
axiom-api <deployment> DELETE "/v2/monitors/<id>"

# Delete dashboard
dashboard-list <deployment> | grep -i cost
dashboard-delete <deployment> <id>
```
Note: Running create-monitors twice creates duplicates. Delete existing monitors first if re-deploying.
| Field | Description |
|---|---|
| action | usageCalculated or runAPLQueryCost |
| properties.hourly_ingest_bytes | Hourly ingest in bytes |
| properties.hourly_billable_query_gbms | Hourly query cost |
| properties.dataset | Dataset name |
| resource.id | Org ID |
| actor.email | User email |
| Dataset type | Primary field | Alternatives |
|---|---|---|
| Kubernetes logs | kubernetes.labels.app | kubernetes.namespace_name, kubernetes.container_name |
| Application logs | app or service | level, logger, component |
| Infrastructure | host | region, instance |
| Traces | service.name | span.kind, http.route |
| Contract | TB/day | GB/month |
|---|---|---|
| 5 PB/month | 167 | 5,000,000 |
| 10 PB/month | 333 | 10,000,000 |
| 15 PB/month | 500 | 15,000,000 |
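The table rows above follow from a straightforward conversion, assuming decimal units (1 PB = 1,000 TB = 1,000,000 GB) and a 30-day month. A sketch for checking other contract sizes:

```python
def contract_limits(pb_per_month, days=30):
    """Convert a PB/month contract into (TB/day, GB/month).

    Decimal units and a 30-day month — matching the table above.
    """
    gb_per_month = pb_per_month * 1_000_000
    tb_per_day = round(pb_per_month * 1_000 / days)
    return tb_per_day, gb_per_month

for pb in (5, 10, 15):
    tb_day, gb_month = contract_limits(pb)
    print(f"{pb} PB/month = {tb_day} TB/day = {gb_month:,} GB/month")
```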
| Signal | Action |
|---|---|
| Work/GB = 0 | Drop or stop ingesting |
| High-volume unqueried values | Sample or reduce log level |
| Empty values from system namespaces | Filter at ingest or accept |
| WoW spike | Check recent deploys |