answering-natural-language-questions-with-dbt by dbt-labs/dbt-agent-skills
npx skills add https://github.com/dbt-labs/dbt-agent-skills --skill answering-natural-language-questions-with-dbt使用最佳可用方法回答数据问题:优先使用语义层,然后是 SQL 修改,接着是模型发现,最后是清单分析。在说"无法回答"之前,务必穷尽所有选项。
适用于: 用户提出的需要数据答案的业务问题
不适用于:
dbt run、dbt test 或 dbt build 工作流flowchart TD
start([收到业务问题])
check_sl{语义层工具可用?}
list_metrics[list_metrics]
metric_exists{相关指标存在?}
get_dims[get_dimensions]
sl_sufficient{语义层能直接回答?}
query_metrics[query_metrics]
answer([返回答案])
try_compiled[get_metrics_compiled_sql<br/>修改 SQL,execute_sql]
check_discovery{模型发现工具可用?}
try_discovery[get_mart_models<br/>get_model_details<br/>编写 SQL,执行]
check_manifest{在 dbt 项目中?}
try_manifest[分析清单/目录<br/>编写 SQL]
cannot([无法回答])
suggest{在 dbt 项目中?}
improvements[建议语义层更改]
done([完成])
start --> check_sl
check_sl -->|是| list_metrics
check_sl -->|否| check_discovery
list_metrics --> metric_exists
metric_exists -->|是| get_dims
metric_exists -->|否| check_discovery
get_dims --> sl_sufficient
sl_sufficient -->|是| query_metrics
sl_sufficient -->|否| try_compiled
query_metrics --> answer
try_compiled -->|成功| answer
try_compiled -->|失败| check_discovery
check_discovery -->|是| try_discovery
check_discovery -->|否| check_manifest
try_discovery -->|成功| answer
try_discovery -->|失败| check_manifest
check_manifest -->|是| try_manifest
check_manifest -->|否| cannot
try_manifest -->|SQL 就绪| answer
answer --> suggest
cannot --> done
suggest -->|是| improvements
suggest -->|否| done
improvements --> done
广告位招租
在这里展示您的产品或服务
触达数万 AI 开发者,精准高效
| 优先级 | 条件 | 方法 | 工具 |
|---|---|---|---|
| 1 | 语义层激活 | 直接查询指标 | list_metrics, get_dimensions, query_metrics |
| 2 | 语义层激活但需要少量修改(缺少维度、自定义过滤器、case when、不同聚合方式) | 修改编译后的 SQL | get_metrics_compiled_sql, 然后 execute_sql |
| 3 | 无语义层,发现工具激活 | 探索模型,编写 SQL | get_mart_models, get_model_details, 然后 show/execute_sql |
| 4 | 无 MCP,在 dbt 项目中 | 分析工件,编写 SQL | 读取 target/manifest.json, target/catalog.json |
当 list_metrics 和 query_metrics 可用时:
list_metrics - 查找相关指标get_dimensions - 验证所需维度是否存在query_metrics - 使用适当的过滤器执行如果语义层不能直接回答(缺少维度、需要自定义逻辑)→ 转到方法 2。
当语义层有指标但需要少量修改时:
get_metrics_compiled_sql - 获取将要运行的 SQL(返回原始 SQL,不是 Jinja)execute_sql 以运行原始 SQL-- 示例:添加 sales_rep 维度
WITH base AS (
-- ... 编译后的指标逻辑(已解析为表名)...
)
SELECT base.*, reps.sales_rep_name
FROM base
JOIN analytics.dim_sales_reps reps ON base.rep_id = reps.id
GROUP BY ...
-- 示例:自定义过滤器
SELECT * FROM (compiled_metric_sql) WHERE region = 'EMEA'
-- 示例:Case when 分类
SELECT
CASE WHEN amount > 1000 THEN 'large' ELSE 'small' END as deal_size,
SUM(amount)
FROM (compiled_metric_sql)
GROUP BY 1
注意: 编译后的 SQL 包含已解析的表名,而不是 {{ ref() }}。请使用返回的原始 SQL。
当没有语义层但 get_all_models/get_model_details 可用时:
get_mart_models - 从集市模型开始,而不是暂存模型get_model_details 用于相关模型 - 理解模式{{ ref('model_name') }} 编写 SQLshow --inline "..." 或 execute_sql优先使用集市模型而非暂存模型 - 集市模型已应用业务逻辑。
当在 dbt 项目中但没有 MCP 服务器时:
target/manifest.json 和 target/catalog.json# 在清单中查找集市模型
jq '.nodes | to_entries | map(select(.key | startswith("model.") and contains("mart"))) | .[].value | {name: .name, schema: .schema, columns: .columns}' target/manifest.json
# 从目录获取列信息
jq '.nodes["model.project_name.model_name"].columns' target/catalog.json
3. 根据发现的模式编写 SQL 4. 解释:"此 SQL 应在您的数据仓库中运行。没有数据库访问权限我无法执行它。"
当在 dbt 项目中时,在回答后(或无法回答时)建议语义层更改:
| 差距 | 建议 |
|---|---|
| 指标不存在 | "将指标定义添加到您的语义模型中" |
| 缺少维度 | "将 dimension_name 添加到语义模型的维度列表中" |
| 无语义层 | "考虑为此数据添加语义层" |
保持在语义层级别。 不要建议:
| 您在想... | 现实情况 |
|---|---|
| "语义层不支持这个确切的查询" | 获取编译后的 SQL 并修改它(方法 2) |
| "没有 MCP 工具,无法帮助" | 检查本地是否有清单/目录 |
| "用户需要快速得到答案,跳过系统检查" | 系统方法才是最快的途径 |
| "直接写 SQL,更快" | 语义层存在是有原因的 - 优先使用它 |
| "数据中不存在该维度" | 也许它存在,但不在语义层配置中 |
| 错误 | 修复 |
|---|---|
| 当语义层不能直接回答时放弃 | 获取编译后的 SQL 并修改它 |
| 查询暂存模型 | 首先使用 get_mart_models |
| 读取完整的 manifest.json | 使用 jq 过滤 |
| 建议 ETL 更改 | 将建议保持在语义层级别 |
| 未检查工具可用性 | 在选择方法前列出可用工具 |
每周安装量
120
代码库
GitHub 星标
311
首次出现
2026年1月29日
安全审计
安装于
github-copilot75
opencode73
gemini-cli72
codex72
amp69
kimi-cli69
Answer data questions using the best available method: semantic layer first, then SQL modification, then model discovery, then manifest analysis. Always exhaust options before saying "cannot answer."
Use for: Business questions from users that need data answers
Not for:
dbt run, dbt test, or dbt build workflowsflowchart TD
start([Business question received])
check_sl{Semantic layer tools available?}
list_metrics[list_metrics]
metric_exists{Relevant metric exists?}
get_dims[get_dimensions]
sl_sufficient{SL can answer directly?}
query_metrics[query_metrics]
answer([Return answer])
try_compiled[get_metrics_compiled_sql<br/>Modify SQL, execute_sql]
check_discovery{Model discovery tools available?}
try_discovery[get_mart_models<br/>get_model_details<br/>Write SQL, execute]
check_manifest{In dbt project?}
try_manifest[Analyze manifest/catalog<br/>Write SQL]
cannot([Cannot answer])
suggest{In dbt project?}
improvements[Suggest semantic layer changes]
done([Done])
start --> check_sl
check_sl -->|yes| list_metrics
check_sl -->|no| check_discovery
list_metrics --> metric_exists
metric_exists -->|yes| get_dims
metric_exists -->|no| check_discovery
get_dims --> sl_sufficient
sl_sufficient -->|yes| query_metrics
sl_sufficient -->|no| try_compiled
query_metrics --> answer
try_compiled -->|success| answer
try_compiled -->|fail| check_discovery
check_discovery -->|yes| try_discovery
check_discovery -->|no| check_manifest
try_discovery -->|success| answer
try_discovery -->|fail| check_manifest
check_manifest -->|yes| try_manifest
check_manifest -->|no| cannot
try_manifest -->|SQL ready| answer
answer --> suggest
cannot --> done
suggest -->|yes| improvements
suggest -->|no| done
improvements --> done
| Priority | Condition | Approach | Tools |
|---|---|---|---|
| 1 | Semantic layer active | Query metrics directly | list_metrics, get_dimensions, query_metrics |
| 2 | SL active but minor modifications needed (missing dimension, custom filter, case when, different aggregation) | Modify compiled SQL | get_metrics_compiled_sql, then execute_sql |
| 3 | No SL, discovery tools active | Explore models, write SQL | , , then / |
When list_metrics and query_metrics are available:
list_metrics - find relevant metricget_dimensions - verify required dimensions existquery_metrics - execute with appropriate filtersIf semantic layer can't answer directly (missing dimension, need custom logic) → go to Approach 2.
When semantic layer has the metric but needs minor modifications:
get_metrics_compiled_sql - get the SQL that would run (returns raw SQL, not Jinja)execute_sql to run the raw SQL-- Example: Adding sales_rep dimension
WITH base AS (
-- ... compiled metric logic (already resolved to table names) ...
)
SELECT base.*, reps.sales_rep_name
FROM base
JOIN analytics.dim_sales_reps reps ON base.rep_id = reps.id
GROUP BY ...
-- Example: Custom filter
SELECT * FROM (compiled_metric_sql) WHERE region = 'EMEA'
-- Example: Case when categorization
SELECT
CASE WHEN amount > 1000 THEN 'large' ELSE 'small' END as deal_size,
SUM(amount)
FROM (compiled_metric_sql)
GROUP BY 1
Note: The compiled SQL contains resolved table names, not {{ ref() }}. Work with the raw SQL as returned.
When no semantic layer but get_all_models/get_model_details available:
get_mart_models - start with marts, not stagingget_model_details for relevant models - understand schema{{ ref('model_name') }}show --inline "..." or execute_sqlPrefer marts over staging - marts have business logic applied.
When in a dbt project but no MCP server:
target/manifest.json and target/catalog.json# Find mart models in manifest
jq '.nodes | to_entries | map(select(.key | startswith("model.") and contains("mart"))) | .[].value | {name: .name, schema: .schema, columns: .columns}' target/manifest.json
# Get column info from catalog
jq '.nodes["model.project_name.model_name"].columns' target/catalog.json
3. Write SQL based on discovered schema 4. Explain: "This SQL should run in your warehouse. I cannot execute it without database access."
When in a dbt project , suggest semantic layer changes after answering (or when cannot answer):
| Gap | Suggestion |
|---|---|
| Metric doesn't exist | "Add a metric definition to your semantic model" |
| Dimension missing | "Add dimension_name to the dimensions list in the semantic model" |
| No semantic layer | "Consider adding a semantic layer for this data" |
Stay at semantic layer level. Do NOT suggest:
| You're Thinking... | Reality |
|---|---|
| "Semantic layer doesn't support this exact query" | Get compiled SQL and modify it (Approach 2) |
| "No MCP tools, can't help" | Check for manifest/catalog locally |
| "User needs this quickly, skip the systematic check" | Systematic approach IS the fastest path |
| "Just write SQL, it's faster" | Semantic layer exists for a reason - use it first |
| "The dimension doesn't exist in the data" | Maybe it exists but not in semantic layer config |
| Mistake | Fix |
|---|---|
| Giving up when SL can't answer directly | Get compiled SQL and modify it |
| Querying staging models | Use get_mart_models first |
| Reading full manifest.json | Use jq to filter |
| Suggesting ETL changes | Keep suggestions at semantic layer |
| Not checking tool availability | List available tools before choosing approach |
Weekly Installs
120
Repository
GitHub Stars
311
First Seen
Jan 29, 2026
Security Audits
Gen Agent Trust HubPassSocketPassSnykPass
Installed on
github-copilot75
opencode73
gemini-cli72
codex72
amp69
kimi-cli69
Python PDF处理教程:合并拆分、提取文本表格、创建PDF文件
62,000 周安装
Vercel React 最佳实践指南 | 45条Next.js性能优化规则与代码规范
175 周安装
技能创建器指南:如何为Claude AI构建高效技能模块 | Anthropic插件开发
227 周安装
Angular编译器CLI (ngtsc) 架构详解:Ivy编译、模板类型检查与AOT
324 周安装
事件溯源(Event Sourcing)技术指南:实现审计追踪、时间点查询与CQRS架构
168 周安装
数据库索引策略指南:B-tree、Hash、GiST、BRIN索引优化查询性能与维护
164 周安装
Gradio:快速构建机器学习Web界面和交互式演示的Python库
171 周安装
get_mart_modelsget_model_detailsshowexecute_sql| 4 | No MCP, in dbt project | Analyze artifacts, write SQL | Read target/manifest.json, target/catalog.json |