data-context-extractor by anthropics/knowledge-work-plugins
npx skills add https://github.com/anthropics/knowledge-work-plugins --skill data-context-extractor
A meta-skill that extracts company-specific data knowledge from analysts and generates tailored data analysis skills.
This skill has two modes: one for creating a new data context skill, and one for adding context to an existing skill.
Use when: User wants to create a new data context skill for their warehouse.
Step 1: Identify the database type
Ask: "What data warehouse are you using?"
Common options:
Use ~~data warehouse tools (query and schema) to connect. If unclear, check available MCP tools in the current session.
Step 2: Explore the schema
Use ~~data warehouse schema tools to:
Sample exploration queries by dialect:
-- BigQuery: List datasets
SELECT schema_name FROM INFORMATION_SCHEMA.SCHEMATA;

-- BigQuery: List tables in a dataset
SELECT table_name FROM `project.dataset.INFORMATION_SCHEMA.TABLES`;

-- Snowflake: List schemas
SHOW SCHEMAS IN DATABASE my_database;

-- Snowflake: List tables
SHOW TABLES IN SCHEMA my_schema;
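When the warehouse type is only known at run time, the exploration queries above can be kept in a small lookup keyed by dialect. The following is a minimal sketch, not part of the skill itself: the dictionary layout, the `exploration_query` name, and the placeholder names (`project`, `dataset`, `database`, `schema`) are assumptions made for illustration. Only the two dialects listed above are covered.

```python
# Exploration queries keyed by dialect. The SQL text comes from the list above;
# the dictionary layout and the placeholder names are illustrative assumptions.
EXPLORATION_QUERIES = {
    "bigquery": {
        "list_schemas": "SELECT schema_name FROM INFORMATION_SCHEMA.SCHEMATA",
        "list_tables": "SELECT table_name FROM `{project}.{dataset}.INFORMATION_SCHEMA.TABLES`",
    },
    "snowflake": {
        "list_schemas": "SHOW SCHEMAS IN DATABASE {database}",
        "list_tables": "SHOW TABLES IN SCHEMA {schema}",
    },
}


def exploration_query(dialect: str, action: str, **names: str) -> str:
    """Return the exploration SQL for a dialect with its placeholders filled in."""
    try:
        template = EXPLORATION_QUERIES[dialect.lower()][action]
    except KeyError:
        raise ValueError(f"no {action!r} query recorded for dialect {dialect!r}") from None
    # Identifiers are substituted as plain text, so pass only trusted names.
    return template.format(**names)
```

For example, `exploration_query("snowflake", "list_tables", schema="my_schema")` yields the Snowflake statement shown above. Other dialects would need their own entries, checked against that warehouse's documentation.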
After schema discovery, ask these questions conversationally (not all at once):
Entity Disambiguation (Critical)
"When people here say 'user' or 'customer', what exactly do they mean? Are there different types?"
Listen for:
Primary Identifiers
"What's the main identifier for a [customer/user/account]? Are there multiple IDs for the same entity?"
Listen for:
Key Metrics
"What are the 2-3 metrics people ask about most? How is each one calculated?"
Listen for:
Data Hygiene
"What should ALWAYS be filtered out of queries? (test data, fraud, internal users, etc.)"
Listen for:
Common Gotchas
"What mistakes do new analysts typically make with this data?"
Listen for:
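Because the questions are asked conversationally rather than all at once, it can help to record answers as they arrive and to track which topics are still open. The sketch below is one possible shape for that record. The class and field names are assumptions chosen to mirror the five question groups above, not a format the skill prescribes.

```python
from dataclasses import dataclass, field


@dataclass
class Metric:
    name: str
    calculation: str  # how the analyst describes the computation, in words or SQL


@dataclass
class DataContext:
    """Answers gathered from the five question groups (hypothetical structure)."""
    entity_definitions: dict[str, str] = field(default_factory=dict)
    primary_identifiers: dict[str, list[str]] = field(default_factory=dict)
    key_metrics: list[Metric] = field(default_factory=list)
    standard_filters: list[str] = field(default_factory=list)
    gotchas: list[str] = field(default_factory=list)

    def open_topics(self) -> list[str]:
        """Names of question groups that have no recorded answer yet."""
        return [name for name, value in vars(self).items() if not value]
```

A session might start with an empty `DataContext()`, fill in `entity_definitions` after the first question, and consult `open_topics()` to decide what to ask next.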
Create a skill with this structure:
[company]-data-analyst/
├── SKILL.md
└── references/
    ├── entities.md          # Entity definitions and relationships
    ├── metrics.md           # KPI calculations
    ├── tables/              # One file per domain
    │   ├── [domain1].md
    │   └── [domain2].md
    └── dashboards.json      # Optional: existing dashboards catalog
SKILL.md Template: See references/skill-template.md
SQL Dialect Section: See references/sql-dialects.md and include the appropriate dialect notes.
Reference File Template: See references/domain-template.md
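The directory layout above can be created mechanically before any content is written. This is a minimal sketch under stated assumptions: `scaffold_skill` is a hypothetical helper, it creates empty placeholder files only, and it skips the optional dashboards.json. Filling the files from the templates referenced above is left to the later steps.

```python
from pathlib import Path


def scaffold_skill(root: Path, company: str, domains: list[str]) -> Path:
    """Create the empty skill layout under `root` and return the skill directory."""
    skill_dir = root / f"{company}-data-analyst"
    tables_dir = skill_dir / "references" / "tables"
    tables_dir.mkdir(parents=True, exist_ok=True)

    files = [
        skill_dir / "SKILL.md",
        skill_dir / "references" / "entities.md",
        skill_dir / "references" / "metrics.md",
        *(tables_dir / f"{domain}.md" for domain in domains),
    ]
    for path in files:
        path.touch(exist_ok=True)  # creates the file if absent, keeps existing content
    return skill_dir
```

Calling `scaffold_skill(Path("."), "acme", ["billing", "product"])` would produce an `acme-data-analyst/` directory with one table file per domain. The company and domain names here are placeholders.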
Use when: User has an existing skill but needs to add more context.
Ask user to upload their existing skill (zip or folder), or locate it if already in the session.
Read the current SKILL.md and reference files to understand what's already documented.
Ask: "What domain or topic needs more context? What queries are failing or producing wrong results?"
Common gaps:
For the identified domain:
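1. Explore relevant tables: Use ~~data warehouse schema tools to find tables in that domain
2. Ask domain-specific questions:
* "What tables are used for [domain] analysis?"
* "What are the key metrics for [domain]?"
* "Are there any special filters or gotchas for [domain] data?"
3. Generate new reference file: Create references/[domain].md using the domain template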
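Before generating a new reference file, it is useful to know which domains the existing skill already documents. The sketch below lists the Markdown files under references/ and reports which requested domains have no file yet. The function names are hypothetical, and matching a domain to a file by its file name is an assumption about how the skill was organized.

```python
from pathlib import Path


def documented_topics(skill_dir: Path) -> list[str]:
    """List reference files already present, as paths relative to references/."""
    references = skill_dir / "references"
    return sorted(
        path.relative_to(references).as_posix()
        for path in references.rglob("*.md")
    )


def missing_domains(skill_dir: Path, wanted: list[str]) -> list[str]:
    """Return the domains in `wanted` that have no reference file yet."""
    existing = {Path(name).stem for name in documented_topics(skill_dir)}
    return [domain for domain in wanted if domain not in existing]
```

The search is recursive, so it finds files placed directly under references/ as well as those under references/tables/.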
Each reference file should include:
Before delivering a generated skill, verify:
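One part of that verification, the file layout, can be checked mechanically. The following is a rough sketch that assumes the directory structure shown earlier. The `structural_problems` name and the choice to flag empty files are assumptions, and a clean result says nothing about whether the documented definitions are correct.

```python
from pathlib import Path

# Files the layout shown earlier always includes (dashboards.json is optional).
REQUIRED_FILES = ("SKILL.md", "references/entities.md", "references/metrics.md")


def structural_problems(skill_dir: Path) -> list[str]:
    """Return human-readable problems; an empty list means the layout looks complete."""
    problems = []
    for rel in REQUIRED_FILES:
        path = skill_dir / rel
        if not path.is_file() or path.stat().st_size == 0:
            problems.append(f"missing or empty: {rel}")
    tables = skill_dir / "references" / "tables"
    if not any(tables.glob("*.md")):
        problems.append("no domain files under references/tables/")
    return problems
```

An empty list from `structural_problems(skill_dir)` means only that the expected files exist and are non-empty. The content still needs review with the analyst.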