statistical-analysis by davila7/claude-code-templates
npx skills add https://github.com/davila7/claude-code-templates --skill statistical-analysis
Statistical analysis is a systematic process for testing hypotheses and quantifying relationships. Conduct hypothesis tests (t-test, ANOVA, chi-square), regression, correlation, and Bayesian analyses with assumption checks and APA reporting. Apply this skill for academic research.
This skill should be used when selecting a statistical test, checking assumptions, running an analysis, or reporting results.
Use this decision tree to determine your analysis path:
START
│
├─ Need to SELECT a statistical test?
│ └─ YES → See "Test Selection Guide"
│ └─ NO → Continue
│
├─ Ready to check ASSUMPTIONS?
│ └─ YES → See "Assumption Checking"
│ └─ NO → Continue
│
├─ Ready to run ANALYSIS?
│ └─ YES → See "Running Statistical Tests"
│ └─ NO → Continue
│
└─ Need to REPORT results?
└─ YES → See "Reporting Results"
Use references/test_selection_guide.md for comprehensive guidance. Quick reference:
Comparing Two Groups: independent-samples t-test (or paired t-test for repeated measures); Mann-Whitney U or Wilcoxon signed-rank when normality is violated.
Comparing 3+ Groups: one-way ANOVA (or repeated-measures ANOVA); Kruskal-Wallis when assumptions fail.
Relationships: Pearson correlation (linear), Spearman correlation (monotonic), linear regression for prediction.
Bayesian Alternatives: All tests have Bayesian versions that provide posterior distributions, credible intervals, and direct probability statements about hypotheses. See references/bayesian_statistics.md.
ALWAYS check assumptions before interpreting test results.
Use the provided scripts/assumption_checks.py module for automated checking:
from scripts.assumption_checks import comprehensive_assumption_check
# Comprehensive check with visualizations
results = comprehensive_assumption_check(
data=df,
value_col='score',
group_col='group', # Optional: for group comparisons
alpha=0.05
)
This performs normality checks (overall and per group), Levene's test for homogeneity of variance, outlier detection, and diagnostic visualizations.
For targeted checks, use individual functions:
from scripts.assumption_checks import (
check_normality,
check_normality_per_group,
check_homogeneity_of_variance,
check_linearity,
detect_outliers
)
# Example: Check normality with visualization
result = check_normality(
data=df['score'],
name='Test Score',
alpha=0.05,
plot=True
)
print(result['interpretation'])
print(result['recommendation'])
Normality violated: use a non-parametric alternative (Mann-Whitney U, Wilcoxon signed-rank, Kruskal-Wallis), apply a transformation (e.g., log or square root), or rely on robustness to mild violations in large samples.
Homogeneity of variance violated: use Welch's t-test or Welch's ANOVA (pingouin's correction='auto' applies the Welch correction for t-tests).
Linearity violated (regression): transform variables, add polynomial terms, or fit a non-linear model.
See references/assumptions_and_diagnostics.md for comprehensive guidance.
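As a concrete sketch of the standard fallbacks when normality or equal-variance assumptions fail, using scipy alone (the skewed data below are simulated for illustration):

```python
import numpy as np
from scipy import stats

# Hypothetical right-skewed data where normality is likely violated
rng = np.random.default_rng(42)
group_a = rng.exponential(scale=2.0, size=40)
group_b = rng.exponential(scale=3.0, size=40)

# Non-parametric alternative to the independent t-test
u_stat, p_mwu = stats.mannwhitneyu(group_a, group_b, alternative='two-sided')

# Welch's t-test: does not assume equal variances
t_stat, p_welch = stats.ttest_ind(group_a, group_b, equal_var=False)

print(f"Mann-Whitney U = {u_stat:.1f}, p = {p_mwu:.3f}")
print(f"Welch's t = {t_stat:.2f}, p = {p_welch:.3f}")
```

Running both is a useful robustness check: if the parametric and rank-based tests agree, the conclusion does not hinge on the normality assumption.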
Primary libraries for statistical analysis:
import pingouin as pg
import numpy as np
# Run independent t-test
result = pg.ttest(group_a, group_b, correction='auto')
# Extract results
t_stat = result['T'].values[0]
df = result['dof'].values[0]
p_value = result['p-val'].values[0]
cohens_d = result['cohen-d'].values[0]
ci_lower = result['CI95%'].values[0][0]
ci_upper = result['CI95%'].values[0][1]
# Report
print(f"t({df:.0f}) = {t_stat:.2f}, p = {p_value:.3f}")
print(f"Cohen's d = {cohens_d:.2f}, 95% CI [{ci_lower:.2f}, {ci_upper:.2f}]")
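If pingouin is unavailable, the same Welch-corrected test can be cross-checked with scipy alone (the groups below are simulated stand-ins for the `group_a`/`group_b` arrays above):

```python
import numpy as np
from scipy import stats

# Simulated groups (hypothetical data for illustration)
rng = np.random.default_rng(5)
group_a = rng.normal(75, 8, 48)
group_b = rng.normal(68, 9, 52)

# equal_var=False requests Welch's t-test, analogous to what
# pingouin's correction='auto' applies when variances differ
t_stat, p_value = stats.ttest_ind(group_a, group_b, equal_var=False)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
```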
import pingouin as pg
# One-way ANOVA
aov = pg.anova(dv='score', between='group', data=df, detailed=True)
print(aov)
# If significant, conduct post-hoc tests
if aov['p-unc'].values[0] < 0.05:
    posthoc = pg.pairwise_tukey(dv='score', between='group', data=df)
    print(posthoc)
# Effect size
eta_squared = aov['np2'].values[0] # Partial eta-squared
print(f"Partial η² = {eta_squared:.3f}")
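When pingouin is not installed, or when normality fails, scipy offers both the parametric test and a rank-based fallback (the three conditions below are simulated for illustration):

```python
import numpy as np
from scipy import stats

# Simulated scores for three hypothetical conditions
rng = np.random.default_rng(0)
g1 = rng.normal(70, 8, 50)
g2 = rng.normal(72, 8, 50)
g3 = rng.normal(78, 8, 50)

# One-way ANOVA cross-check with scipy
f_stat, p_anova = stats.f_oneway(g1, g2, g3)

# Kruskal-Wallis: rank-based alternative when normality is violated
h_stat, p_kw = stats.kruskal(g1, g2, g3)

print(f"ANOVA: F = {f_stat:.2f}, p = {p_anova:.4f}")
print(f"Kruskal-Wallis: H = {h_stat:.2f}, p = {p_kw:.4f}")
```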
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor
# Fit model
X = sm.add_constant(X_predictors) # Add intercept
model = sm.OLS(y, X).fit()
# Summary
print(model.summary())
# Check multicollinearity (VIF)
vif_data = pd.DataFrame()
vif_data["Variable"] = X.columns
vif_data["VIF"] = [variance_inflation_factor(X.values, i) for i in range(X.shape[1])]
print(vif_data)
# Check assumptions
residuals = model.resid
fitted = model.fittedvalues
# Residual plots
import matplotlib.pyplot as plt
fig, axes = plt.subplots(2, 2, figsize=(12, 10))
# Residuals vs fitted
axes[0, 0].scatter(fitted, residuals, alpha=0.6)
axes[0, 0].axhline(y=0, color='r', linestyle='--')
axes[0, 0].set_xlabel('Fitted values')
axes[0, 0].set_ylabel('Residuals')
axes[0, 0].set_title('Residuals vs Fitted')
# Q-Q plot
from scipy import stats
stats.probplot(residuals, dist="norm", plot=axes[0, 1])
axes[0, 1].set_title('Normal Q-Q')
# Scale-Location
axes[1, 0].scatter(fitted, np.sqrt(np.abs(residuals / residuals.std())), alpha=0.6)
axes[1, 0].set_xlabel('Fitted values')
axes[1, 0].set_ylabel('√|Standardized residuals|')
axes[1, 0].set_title('Scale-Location')
# Residuals histogram
axes[1, 1].hist(residuals, bins=20, edgecolor='black', alpha=0.7)
axes[1, 1].set_xlabel('Residuals')
axes[1, 1].set_ylabel('Frequency')
axes[1, 1].set_title('Histogram of Residuals')
plt.tight_layout()
plt.show()
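The VIF used above has a simple definition: regress each predictor on the others and compute VIF = 1 / (1 − R²). A numpy-only sketch with two deliberately correlated, hypothetical predictors:

```python
import numpy as np

# Two correlated predictors (hypothetical data)
rng = np.random.default_rng(1)
n = 200
x1 = rng.normal(size=n)
x2 = 0.8 * x1 + rng.normal(scale=0.6, size=n)  # correlated with x1
X = np.column_stack([np.ones(n), x1, x2])
y = 2 + 1.5 * x1 - 0.5 * x2 + rng.normal(size=n)

# OLS via least squares (statsmodels' sm.OLS does this plus inference)
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

def vif(X_pred, j):
    """VIF for predictor j: regress x_j on the others, return 1 / (1 - R^2)."""
    others = np.delete(X_pred, j, axis=1)
    others = np.column_stack([np.ones(len(X_pred)), others])
    coef, *_ = np.linalg.lstsq(others, X_pred[:, j], rcond=None)
    resid = X_pred[:, j] - others @ coef
    r2 = 1 - resid.var() / X_pred[:, j].var()
    return 1.0 / (1.0 - r2)

preds = np.column_stack([x1, x2])
print("betas:", np.round(beta, 2))
print("VIFs:", [round(vif(preds, j), 2) for j in range(2)])
```

With a true correlation of about 0.8 between the predictors, the VIFs land near 1 / (1 − 0.64) ≈ 2.8, below the common concern threshold of 5–10.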
import pymc as pm
import arviz as az
import numpy as np
with pm.Model() as model:
    # Priors
    mu1 = pm.Normal('mu_group1', mu=0, sigma=10)
    mu2 = pm.Normal('mu_group2', mu=0, sigma=10)
    sigma = pm.HalfNormal('sigma', sigma=10)
    # Likelihood
    y1 = pm.Normal('y1', mu=mu1, sigma=sigma, observed=group_a)
    y2 = pm.Normal('y2', mu=mu2, sigma=sigma, observed=group_b)
    # Derived quantity
    diff = pm.Deterministic('difference', mu1 - mu2)
    # Sample
    trace = pm.sample(2000, tune=1000, return_inferencedata=True)
# Summarize
print(az.summary(trace, var_names=['difference']))
# Probability that group1 > group2
prob_greater = np.mean(trace.posterior['difference'].values > 0)
print(f"P(μ₁ > μ₂ | data) = {prob_greater:.3f}")
# Plot posterior
az.plot_posterior(trace, var_names=['difference'], ref_val=0)
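If PyMC is not installed, a rough numpy-only approximation of the same posterior comparison is possible under a flat prior and a normal approximation to each group mean. This is a sketch on simulated data, not a substitute for full MCMC:

```python
import numpy as np

# Hypothetical groups (simulated for illustration)
rng = np.random.default_rng(3)
group_a = rng.normal(75, 8, 48)
group_b = rng.normal(68, 9, 52)

# Under a flat prior, each mean's posterior is approximately
# Normal(sample mean, s^2 / n); draw from both and compare
draws = 100_000
post_a = rng.normal(group_a.mean(), group_a.std(ddof=1) / np.sqrt(len(group_a)), draws)
post_b = rng.normal(group_b.mean(), group_b.std(ddof=1) / np.sqrt(len(group_b)), draws)
diff = post_a - post_b

prob = np.mean(diff > 0)
lo, hi = np.percentile(diff, [2.5, 97.5])
print(f"P(mu_a > mu_b | data) ~ {prob:.3f}")
print(f"95% credible interval ~ [{lo:.2f}, {hi:.2f}]")
```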
Effect sizes quantify magnitude, while p-values only indicate existence of an effect.
See references/effect_sizes_and_power.md for comprehensive guidance.
| Test | Effect Size | Small | Medium | Large |
|---|---|---|---|---|
| T-test | Cohen's d | 0.20 | 0.50 | 0.80 |
| ANOVA | η²_p | 0.01 | 0.06 | 0.14 |
| Correlation | r | 0.10 | 0.30 | 0.50 |
| Regression | R² | 0.02 | 0.13 | 0.26 |
| Chi-square | Cramér's V | 0.07 | 0.21 | 0.35 |
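The table's last row lists Cramér's V, which can be derived from a chi-square statistic as sqrt(chi2 / (n * k)) with k = min(rows, cols) − 1. A scipy sketch using a made-up contingency table:

```python
import numpy as np
from scipy import stats

# Hypothetical 2x3 contingency table of counts
table = np.array([[30, 20, 10],
                  [15, 25, 30]])
chi2, p, dof, expected = stats.chi2_contingency(table)

# Cramér's V from chi-square
n = table.sum()
k = min(table.shape) - 1
cramers_v = np.sqrt(chi2 / (n * k))
print(f"chi2({dof}) = {chi2:.2f}, p = {p:.4f}, Cramér's V = {cramers_v:.2f}")
```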
Important: Benchmarks are guidelines. Context matters!
Most effect sizes are automatically calculated by pingouin:
# T-test returns Cohen's d
result = pg.ttest(x, y)
d = result['cohen-d'].values[0]
# ANOVA returns partial eta-squared
aov = pg.anova(dv='score', between='group', data=df)
eta_p2 = aov['np2'].values[0]
# Correlation: r is already an effect size
corr = pg.corr(x, y)
r = corr['r'].values[0]
Always report CIs to show precision:
from pingouin import compute_effsize_from_t, compute_esci
# For a t-test: compute_effsize_from_t returns only the effect size (a float);
# the confidence interval is computed separately with compute_esci
d = compute_effsize_from_t(
    t_statistic,
    nx=len(group1),
    ny=len(group2),
    eftype='cohen'
)
ci = compute_esci(stat=d, nx=len(group1), ny=len(group2), eftype='cohen', confidence=0.95)
print(f"d = {d:.2f}, 95% CI [{ci[0]:.2f}, {ci[1]:.2f}]")
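If the pingouin helpers are unavailable, a percentile-bootstrap CI for Cohen's d can be sketched from raw data with numpy alone (groups simulated for illustration):

```python
import numpy as np

# Hypothetical groups (simulated for illustration)
rng = np.random.default_rng(7)
group1 = rng.normal(75, 8, 48)
group2 = rng.normal(68, 9, 52)

def cohens_d(x, y):
    """Cohen's d using the pooled standard deviation."""
    pooled = np.sqrt(((len(x) - 1) * x.var(ddof=1) + (len(y) - 1) * y.var(ddof=1))
                     / (len(x) + len(y) - 2))
    return (x.mean() - y.mean()) / pooled

# Percentile bootstrap: resample each group with replacement, recompute d
boot = np.array([
    cohens_d(rng.choice(group1, len(group1)), rng.choice(group2, len(group2)))
    for _ in range(2000)
])
lo, hi = np.percentile(boot, [2.5, 97.5])
print(f"d = {cohens_d(group1, group2):.2f}, bootstrap 95% CI [{lo:.2f}, {hi:.2f}]")
```

The percentile bootstrap makes no normality assumption about the sampling distribution of d, at the cost of Monte Carlo noise in the interval endpoints.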
Determine required sample size before data collection:
from statsmodels.stats.power import (
tt_ind_solve_power,
FTestAnovaPower
)
# T-test: What n is needed to detect d = 0.5?
n_required = tt_ind_solve_power(
effect_size=0.5,
alpha=0.05,
power=0.80,
ratio=1.0,
alternative='two-sided'
)
print(f"Required n per group: {n_required:.0f}")
# ANOVA: What n is needed to detect f = 0.25?
anova_power = FTestAnovaPower()
n_per_group = anova_power.solve_power(
effect_size=0.25,
ngroups=3,
alpha=0.05,
power=0.80
)
print(f"Required n per group: {n_per_group:.0f}")
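As a cross-check of the statsmodels result, two-sided t-test power can be computed directly from the noncentral t distribution with scipy (a sketch; `ttest_power` is a hypothetical helper name):

```python
import numpy as np
from scipy import stats

def ttest_power(d, n_per_group, alpha=0.05):
    """Power of a two-sided independent t-test via the noncentral t distribution."""
    df = 2 * n_per_group - 2
    ncp = d * np.sqrt(n_per_group / 2)           # noncentrality parameter
    t_crit = stats.t.ppf(1 - alpha / 2, df)      # two-sided critical value
    return (1 - stats.nct.cdf(t_crit, df, ncp)) + stats.nct.cdf(-t_crit, df, ncp)

# d = 0.5 with n = 64 per group is the classic ~80% power configuration
print(f"power = {ttest_power(0.5, 64):.3f}")
```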
Determine what effect size you could detect:
# With n=50 per group, what effect could we detect?
detectable_d = tt_ind_solve_power(
effect_size=None, # Solve for this
nobs1=50,
alpha=0.05,
power=0.80,
ratio=1.0,
alternative='two-sided'
)
print(f"Study could detect d ≥ {detectable_d:.2f}")
Note: Post-hoc power analysis (calculating power after the study from the observed effect size) is generally not recommended. Use a sensitivity analysis instead.
See references/effect_sizes_and_power.md for detailed guidance.
Follow guidelines in references/reporting_standards.md.
Group A (n = 48, M = 75.2, SD = 8.5) scored significantly higher than
Group B (n = 52, M = 68.3, SD = 9.2), t(98) = 3.82, p < .001, d = 0.77,
95% CI [0.36, 1.18], two-tailed. Assumptions of normality (Shapiro-Wilk:
Group A W = 0.97, p = .18; Group B W = 0.96, p = .12) and homogeneity
of variance (Levene's F(1, 98) = 1.23, p = .27) were satisfied.
A one-way ANOVA revealed a significant main effect of treatment condition
on test scores, F(2, 147) = 8.45, p < .001, η²_p = .10. Post hoc
comparisons using Tukey's HSD indicated that Condition A (M = 78.2,
SD = 7.3) scored significantly higher than Condition B (M = 71.5,
SD = 8.1, p = .002, d = 0.87) and Condition C (M = 70.1, SD = 7.9,
p < .001, d = 1.07). Conditions B and C did not differ significantly
(p = .52, d = 0.18).
Multiple linear regression was conducted to predict exam scores from
study hours, prior GPA, and attendance. The overall model was significant,
F(3, 146) = 45.2, p < .001, R² = .48, adjusted R² = .47. Study hours
(B = 1.80, SE = 0.31, β = .35, t = 5.78, p < .001, 95% CI [1.18, 2.42])
and prior GPA (B = 8.52, SE = 1.95, β = .28, t = 4.37, p < .001,
95% CI [4.66, 12.38]) were significant predictors, while attendance was
not (B = 0.15, SE = 0.12, β = .08, t = 1.25, p = .21, 95% CI [-0.09, 0.39]).
Multicollinearity was not a concern (all VIF < 1.5).
A Bayesian independent samples t-test was conducted using weakly
informative priors (Normal(0, 1) for mean difference). The posterior
distribution indicated that Group A scored higher than Group B
(M_diff = 6.8, 95% credible interval [3.2, 10.4]). The Bayes Factor
BF₁₀ = 45.3 provided very strong evidence for a difference between
groups, with a 99.8% posterior probability that Group A's mean exceeded
Group B's mean. Convergence diagnostics were satisfactory (all R̂ < 1.01,
ESS > 1000).
Consider Bayesian approaches when sample sizes are small, prior information is available, or you want direct probability statements about hypotheses (e.g., P(μ₁ > μ₂ | data)).
See references/bayesian_statistics.md for comprehensive guidance.
This skill includes comprehensive reference materials:
- comprehensive_assumption_check(): Complete workflow
- check_normality(): Normality testing with Q-Q plots
- check_homogeneity_of_variance(): Levene's test with box plots
- check_linearity(): Regression linearity checks
- detect_outliers(): IQR and z-score outlier detection
When beginning a statistical analysis:
For questions about:
Key textbooks:
Online resources:
Weekly Installs: 225
Repository: davila7/claude-code-templates
GitHub Stars: 22.6K
First Seen: Jan 21, 2026
Security Audits: Gen Agent Trust Hub (Pass), Socket (Pass), Snyk (Pass)
Installed on: opencode (176), claude-code (175), gemini-cli (161), codex (152), cursor (149), github-copilot (132)