npx skills add https://github.com/borghei/claude-skills --skill data-analyst
Expert-level data analysis for business insights.
Aggregation:
SELECT
    date_trunc('month', created_at) AS month,
    COUNT(*) AS total_orders,
    COUNT(DISTINCT customer_id) AS unique_customers,
    SUM(amount) AS total_revenue,
    AVG(amount) AS avg_order_value
FROM orders
WHERE created_at >= '2024-01-01'
GROUP BY 1
ORDER BY 1;
Window Functions:
SELECT
    customer_id,
    order_date,
    amount,
    SUM(amount) OVER (PARTITION BY customer_id ORDER BY order_date) AS running_total,
    ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY order_date) AS order_number,
    LAG(amount) OVER (PARTITION BY customer_id ORDER BY order_date) AS previous_order
FROM orders;
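For analysts who work in pandas rather than SQL, the same three per-customer calculations can be sketched as follows. The sample `orders` frame and the helper name `add_window_columns` are illustrative assumptions, not part of the skill itself.

```python
import pandas as pd

# Hypothetical sample data standing in for the `orders` table.
orders = pd.DataFrame({
    "customer_id": [1, 1, 2, 1, 2],
    "order_date": pd.to_datetime(
        ["2024-01-05", "2024-02-10", "2024-01-20", "2024-03-01", "2024-02-15"]
    ),
    "amount": [100.0, 50.0, 200.0, 25.0, 80.0],
})


def add_window_columns(df):
    """Add running_total, order_number, and previous_order per customer."""
    # Sorting first plays the role of ORDER BY order_date inside each partition.
    out = df.sort_values(["customer_id", "order_date"]).reset_index(drop=True)
    grouped = out.groupby("customer_id")["amount"]
    out["running_total"] = grouped.cumsum()                        # SUM(...) OVER
    out["order_number"] = out.groupby("customer_id").cumcount() + 1  # ROW_NUMBER()
    out["previous_order"] = grouped.shift(1)                       # LAG(amount)
    return out


result = add_window_columns(orders)
print(result)
```

The first order of each customer has no predecessor, so `previous_order` is NaN there, mirroring the NULL that `LAG` returns in SQL.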
CTEs for Clarity:
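WITH monthly_metrics AS (
    SELECT
        date_trunc('month', created_at) AS month,
        SUM(amount) AS revenue
    FROM orders
    GROUP BY 1
),
growth_calc AS (
    SELECT
        month,
        revenue,
        LAG(revenue) OVER (ORDER BY month) AS prev_revenue
    FROM monthly_metrics
)
SELECT
    month,
    revenue,
    -- 100.0 avoids integer division; NULLIF avoids division by zero;
    -- the numeric cast lets ROUND accept a precision argument in PostgreSQL
    ROUND((100.0 * (revenue - prev_revenue) / NULLIF(prev_revenue, 0))::numeric, 1) AS growth_pct
FROM growth_calc
ORDER BY month;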
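The month-over-month growth calculation can also be sketched in pandas. The helper name `monthly_growth`, its default column names, and the sample data are assumptions made for illustration.

```python
import pandas as pd


def monthly_growth(df, date_col="created_at", amount_col="amount"):
    """Aggregate revenue by calendar month and compute percentage growth."""
    monthly = (
        df.assign(month=df[date_col].dt.to_period("M").dt.to_timestamp())
        .groupby("month", as_index=False)[amount_col]
        .sum()
        .rename(columns={amount_col: "revenue"})
        .sort_values("month")
        .reset_index(drop=True)
    )
    prev = monthly["revenue"].shift(1)  # equivalent of LAG(revenue)
    # A zero previous month becomes NaN so growth is undefined, not infinite.
    safe_prev = prev.where(prev != 0)
    monthly["growth_pct"] = ((monthly["revenue"] - prev) / safe_prev * 100).round(1)
    return monthly


# Hypothetical sample data.
orders = pd.DataFrame({
    "created_at": pd.to_datetime(
        ["2024-01-05", "2024-01-20", "2024-02-10", "2024-03-01"]
    ),
    "amount": [100.0, 50.0, 180.0, 90.0],
})
growth = monthly_growth(orders)
print(growth)
```

The first month has no prior period, so its `growth_pct` is NaN.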
Cohort Analysis:
WITH first_orders AS (
    SELECT
        customer_id,
        date_trunc('month', MIN(created_at)) AS cohort_month
    FROM orders
    GROUP BY 1
),
cohort_data AS (
    SELECT
        f.cohort_month,
        date_trunc('month', o.created_at) AS order_month,
        COUNT(DISTINCT o.customer_id) AS customers
    FROM orders o
    JOIN first_orders f ON o.customer_id = f.customer_id
    GROUP BY 1, 2
)
SELECT
    cohort_month,
    order_month,
    -- EXTRACT(MONTH ...) alone wraps around after 11, so add the year component
    EXTRACT(YEAR FROM AGE(order_month, cohort_month)) * 12
        + EXTRACT(MONTH FROM AGE(order_month, cohort_month)) AS months_since_cohort,
    customers
FROM cohort_data
ORDER BY 1, 2;
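A cohort table is often easier to read as a matrix with one row per cohort and one column per month offset. The following pandas sketch builds that shape; `cohort_table`, its default column names, and the sample orders are hypothetical.

```python
import pandas as pd


def cohort_table(df, customer_col="customer_id", date_col="created_at"):
    """Count distinct active customers per cohort month and month offset."""
    orders = df[[customer_col, date_col]].copy()
    # Each customer's first order date defines the cohort.
    first_order = orders.groupby(customer_col)[date_col].transform("min")
    orders["cohort_month"] = first_order.dt.strftime("%Y-%m")
    # Whole calendar months since the cohort month, counting across years.
    orders["months_since_cohort"] = (
        (orders[date_col].dt.year - first_order.dt.year) * 12
        + (orders[date_col].dt.month - first_order.dt.month)
    )
    return (
        orders.groupby(["cohort_month", "months_since_cohort"])[customer_col]
        .nunique()
        .unstack(fill_value=0)
    )


# Hypothetical sample data: customer 1 returns 14 months after the first order.
orders = pd.DataFrame({
    "customer_id": [1, 1, 1, 2, 2, 3],
    "created_at": pd.to_datetime([
        "2023-11-15", "2023-12-03", "2025-01-10",
        "2023-11-20", "2023-12-25", "2023-12-01",
    ]),
})
table = cohort_table(orders)
print(table)
```

Dividing each row by its month-0 value would turn the counts into retention rates.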
Use EXPLAIN:
EXPLAIN ANALYZE
SELECT * FROM orders WHERE customer_id = 123;
Best Practices:
| Data Type | Best Chart | Alternative |
|---|---|---|
| Trend over time | Line chart | Area chart |
| Part of whole | Pie/Donut | Stacked bar |
| Comparison | Bar chart | Column chart |
| Distribution | Histogram | Box plot |
| Correlation | Scatter plot | Heatmap |
| Geographic | Map | Choropleth |
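If chart selection needs to be applied programmatically, for example inside a report generator, the table above can be encoded as a simple lookup. This is only a sketch; `CHART_GUIDE` and `suggest_chart` are hypothetical names.

```python
# The chart selection table expressed as a lookup: data type -> (best, alternative).
CHART_GUIDE = {
    "trend over time": ("Line chart", "Area chart"),
    "part of whole": ("Pie/Donut", "Stacked bar"),
    "comparison": ("Bar chart", "Column chart"),
    "distribution": ("Histogram", "Box plot"),
    "correlation": ("Scatter plot", "Heatmap"),
    "geographic": ("Map", "Choropleth"),
}


def suggest_chart(data_type):
    """Return the best and alternative chart for a data type from the table."""
    key = data_type.strip().lower()
    if key not in CHART_GUIDE:
        raise KeyError(f"No chart guidance for data type: {data_type!r}")
    best, alternative = CHART_GUIDE[key]
    return {"best": best, "alternative": alternative}


print(suggest_chart("Distribution"))
```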
Do:
Don't:
┌─────────────────────────────────────────────────────────────┐
│ EXECUTIVE SUMMARY │
│ [KPI 1: $X] [KPI 2: X%] [KPI 3: X] [KPI 4: X%] │
├─────────────────────────────────────────────────────────────┤
│ TRENDS │ BREAKDOWN │
│ [Line Chart - Primary Metric] │ [Bar Chart - Segments] │
│ │ │
├──────────────────────────────────┼──────────────────────────┤
│ COMPARISON │ DETAIL TABLE │
│ [Bar Chart - vs Target/LY] │ [Top N with metrics] │
│ │ │
└──────────────────────────────────┴──────────────────────────┘
import pandas as pd
import numpy as np

def describe_data(df, column):
    """Return summary statistics for one numeric column of a DataFrame."""
    # Named `summary` rather than `stats` to avoid confusion with scipy.stats
    summary = {
        'count': df[column].count(),
        'mean': df[column].mean(),
        'median': df[column].median(),
        'std': df[column].std(),
        'min': df[column].min(),
        'max': df[column].max(),
        'q25': df[column].quantile(0.25),
        'q75': df[column].quantile(0.75),
        'skewness': df[column].skew(),
        'kurtosis': df[column].kurtosis()
    }
    return summary
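One common use of the quartiles reported above is flagging outliers with the interquartile range (IQR) rule. The sketch below is an illustrative addition; `iqr_outliers` and the multiplier default of 1.5 are assumptions, and other thresholds may suit your data better.

```python
import pandas as pd


def iqr_outliers(series, k=1.5):
    """Return values lying more than k * IQR outside the quartiles."""
    q25 = series.quantile(0.25)
    q75 = series.quantile(0.75)
    iqr = q75 - q25
    lower = q25 - k * iqr
    upper = q75 + k * iqr
    return series[(series < lower) | (series > upper)]


# Hypothetical order amounts with one obviously unusual value.
amounts = pd.Series([10, 12, 11, 13, 12, 95])
outliers = iqr_outliers(amounts)
print(outliers)
```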
import numpy as np
from scipy import stats

# T-test: Compare two groups
def compare_groups(group_a, group_b, alpha=0.05):
    # Independent two-sample t-test (SciPy's default assumes equal variances)
    stat, p_value = stats.ttest_ind(group_a, group_b)
    result = {
        't_statistic': stat,
        'p_value': p_value,
        'significant': p_value < alpha,
        # Cohen's d, using the average of the two group variances
        'effect_size': (group_a.mean() - group_b.mean()) / np.sqrt(
            (group_a.std()**2 + group_b.std()**2) / 2
        )
    }
    return result

# Chi-square: Test independence
def test_independence(observed, alpha=0.05):
    # `observed` is a contingency table of counts (rows x columns)
    stat, p_value, dof, expected = stats.chi2_contingency(observed)
    return {
        'chi2': stat,
        'p_value': p_value,
        'degrees_of_freedom': dof,
        'significant': p_value < alpha
    }
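When the two groups may have unequal variances, Welch's t-test is a common alternative to the equal-variance test. The sketch below swaps in that variant through SciPy's `equal_var=False` option. The function name `compare_groups_welch` and the sample measurements are hypothetical.

```python
import numpy as np
from scipy import stats


def compare_groups_welch(group_a, group_b, alpha=0.05):
    """Welch's t-test plus Cohen's d for two independent samples."""
    a = np.asarray(group_a, dtype=float)
    b = np.asarray(group_b, dtype=float)
    # equal_var=False selects Welch's test, which does not pool variances.
    stat, p_value = stats.ttest_ind(a, b, equal_var=False)
    # Sample standard deviations (ddof=1), averaged as variances for Cohen's d.
    pooled_sd = np.sqrt((a.std(ddof=1) ** 2 + b.std(ddof=1) ** 2) / 2)
    return {
        "t_statistic": float(stat),
        "p_value": float(p_value),
        "significant": bool(p_value < alpha),
        "effect_size": float((a.mean() - b.mean()) / pooled_sd),
    }


# Hypothetical measurements for a control group and a variant group.
control = [10.1, 9.8, 10.3, 10.0, 9.9, 10.2]
variant = [12.0, 11.7, 12.3, 12.1, 11.9, 12.2]
result = compare_groups_welch(control, variant)
print(result)
```

A negative `effect_size` here simply means the first group's mean is lower than the second's.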
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score, mean_absolute_error

def simple_regression(X, y):
    # Accept lists or Series as well as arrays; scikit-learn expects a 2-D feature matrix
    X = np.asarray(X).reshape(-1, 1)
    model = LinearRegression()
    model.fit(X, y)
    predictions = model.predict(X)
    return {
        'coefficient': model.coef_[0],
        'intercept': model.intercept_,
        'r_squared': r2_score(y, predictions),
        'mae': mean_absolute_error(y, predictions)
    }
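A quick way to sanity-check a single-feature regression helper is to fit it on data with a known slope and intercept. The sketch below does this with noiseless synthetic data; `fit_line` and the variables `x` and `y` are hypothetical names chosen for the example.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score, mean_absolute_error


def fit_line(x, y):
    """Fit y = coefficient * x + intercept and report fit quality."""
    features = np.asarray(x, dtype=float).reshape(-1, 1)  # 2-D feature matrix
    model = LinearRegression().fit(features, y)
    predictions = model.predict(features)
    return {
        "coefficient": float(model.coef_[0]),
        "intercept": float(model.intercept_),
        "r_squared": float(r2_score(y, predictions)),
        "mae": float(mean_absolute_error(y, predictions)),
    }


# Synthetic data generated from y = 2x + 1 with no noise.
x = np.arange(5.0)
y = 2.0 * x + 1.0
fit = fit_line(x, y)
print(fit)
```

On real data, R-squared and MAE computed on the training set tend to be optimistic, so a held-out sample gives a fairer estimate.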
# Analysis: [Topic]
## Business Question
[What are we trying to answer?]
## Hypothesis
[What do we expect to find?]
## Data Sources
- [Source 1]: [Description]
- [Source 2]: [Description]
## Methodology
1. [Step 1]
2. [Step 2]
3. [Step 3]
## Findings
### Finding 1: [Title]
[Description with supporting data]
### Finding 2: [Title]
[Description with supporting data]
## Recommendations
1. [Recommendation]: [Expected impact]
2. [Recommendation]: [Expected impact]
## Limitations
- [Limitation 1]
- [Limitation 2]
## Next Steps
- [Action item]
Acquisition:
Engagement:
Retention:
Revenue:
1. CONTEXT
- Why does this matter?
- What question are we answering?
2. KEY FINDING
- Lead with the insight
- Make it memorable
3. EVIDENCE
- Show the data
- Use effective visuals
4. IMPLICATIONS
- What does this mean?
- So what?
5. RECOMMENDATIONS
- What should we do?
- Clear next steps
## [Headline: Action-oriented finding]
**What:** [One sentence description of the finding]
**So What:** [Why this matters to the business]
**Now What:** [Recommended action]
**Evidence:**
[Chart or data supporting the finding]
**Confidence:** [High/Medium/Low]
- references/sql_patterns.md - Advanced SQL queries
- references/visualization.md - Chart selection guide
- references/statistics.md - Statistical methods
- references/storytelling.md - Presentation best practices
# Data profiler
python scripts/data_profiler.py --table orders --output profile.html
# SQL query analyzer
python scripts/query_analyzer.py --query query.sql --explain
# Dashboard generator
python scripts/dashboard_gen.py --config dashboard.yaml
# Report automation
python scripts/report_gen.py --template monthly --output report.pdf