data-analyst by ailabs-393/ai-labs-claude-skills
npx skills add https://github.com/ailabs-393/ai-labs-claude-skills --skill data-analyst
This skill provides comprehensive capabilities for data analysis workflows on CSV datasets. It automatically analyzes missing value patterns, intelligently imputes missing data using appropriate statistical methods, and creates interactive Plotly Dash dashboards for visualizing trends and patterns. The skill combines automated missing value handling with rich interactive visualizations to support end-to-end exploratory data analysis.
The data-analyst skill provides three main capabilities that can be used independently or as a complete workflow:
Automatically detect and analyze missing values in datasets, identifying patterns and suggesting optimal imputation strategies.
Apply sophisticated imputation methods tailored to each column's data type and distribution characteristics.
Generate comprehensive Plotly Dash dashboards with multiple visualization types for trend analysis and exploration.
When a user requests complete data analysis with missing value handling and visualization, follow this workflow:
Run the missing value analysis script to understand the data quality:
python3 scripts/analyze_missing_values.py <input_file.csv> <output_analysis.json>
What this does:
Review the output to understand:
Apply automatic imputation based on the analysis:
python3 scripts/impute_missing_values.py <input_file.csv> <analysis.json> <output_imputed.csv>
What this does:
The script automatically:
Generate an interactive Plotly Dash dashboard:
python3 scripts/create_dashboard.py <imputed_file.csv> <output_dir> <port>
Example:
python3 scripts/create_dashboard.py data_imputed.csv ./visualizations 8050
What this does:
Access the dashboard at http://127.0.0.1:8050 (or specified port)
When the user wants to understand data quality without imputation:
python3 scripts/analyze_missing_values.py data.csv
Review the console output to understand missing value patterns and get recommendations.
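To make the kind of report involved more concrete, here is a minimal sketch of a per-column missing-value summary in pandas. It is not the code of analyze_missing_values.py, whose internals are not shown here; the function name summarize_missing, the column names, and the sample data are all assumptions made for illustration. The 50% flag mirrors the threshold this document mentions for heavily incomplete columns.

```python
import io

import pandas as pd


def summarize_missing(df: pd.DataFrame) -> pd.DataFrame:
    """Return per-column missing counts and percentages, most-missing first.

    Hypothetical helper for illustration; not part of the skill's scripts.
    """
    summary = pd.DataFrame(
        {
            "dtype": df.dtypes.astype(str),
            "missing_count": df.isna().sum(),
            "missing_pct": (df.isna().mean() * 100).round(1),
        }
    )
    # Flag columns where more than half of the values are missing.
    summary["high_missing"] = summary["missing_pct"] > 50
    return summary.sort_values("missing_pct", ascending=False)


# Tiny in-memory CSV standing in for data.csv (made-up values).
SAMPLE_CSV = """age,city,score
34,Paris,
,London,
29,,7.5
41,Berlin,
"""

df = pd.read_csv(io.StringIO(SAMPLE_CSV))
report = summarize_missing(df)
print(report)
```

In this sample, score is missing in three of four rows, so it is the only column flagged as high_missing.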
When the user has a dataset with missing values and wants cleaned data:
python3 scripts/impute_missing_values.py data.csv
This performs analysis and imputation in one step, producing data_imputed.csv.
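As a rough illustration of type-aware imputation, the sketch below fills numeric columns with the median and all other columns with the most frequent value. This is an assumption about what a simple strategy could look like, not the logic of impute_missing_values.py, which is not reproduced in this document and may use other methods such as KNN. The function name impute_simple and the sample data are hypothetical.

```python
import numpy as np
import pandas as pd


def impute_simple(df: pd.DataFrame) -> pd.DataFrame:
    """Fill numeric columns with the median and other columns with the mode.

    Hypothetical stand-in for illustration; returns a copy and leaves df untouched.
    """
    out = df.copy()
    for col in out.columns:
        if not out[col].isna().any():
            continue  # nothing to fill in this column
        if pd.api.types.is_numeric_dtype(out[col]):
            out[col] = out[col].fillna(out[col].median())
        else:
            modes = out[col].mode(dropna=True)
            if not modes.empty:  # an all-missing column has no mode; leave it as-is
                out[col] = out[col].fillna(modes.iloc[0])
    return out


# Made-up data standing in for data.csv.
df = pd.DataFrame(
    {
        "age": [34, np.nan, 29, 41],
        "city": ["Paris", "London", None, "Paris"],
    }
)
clean = impute_simple(df)
print(clean)
```

The median is used for numeric columns here because it is less sensitive to outliers than the mean; whether that suits a given column depends on its distribution.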
When the user has a clean dataset and wants interactive visualizations:
python3 scripts/create_dashboard.py clean_data.csv ./visualizations 8050
This creates a full dashboard without any preprocessing.
When the user wants to review and adjust imputation strategies:
Run analysis first:
python3 scripts/analyze_missing_values.py data.csv analysis.json
Review analysis.json and discuss strategies with the user
If needed, modify the imputation logic or parameters in the script
Run imputation:
python3 scripts/impute_missing_values.py data.csv analysis.json data_imputed.csv
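The review-and-adjust step can be pictured as editing a per-column strategy mapping before it is applied. The sketch below assumes a HYPOTHETICAL analysis structure with a "strategies" key mapping column names to method names; the real analysis.json schema is not shown in this document and may differ. The FILLERS table and apply_strategies function are likewise illustrative.

```python
import json

import numpy as np
import pandas as pd

# HYPOTHETICAL structure: the real analysis.json schema is not documented here.
analysis = json.loads(
    '{"strategies": {"age": "median", "income": "median", "city": "mode"}}'
)

# After discussing with the user, override one recommendation.
analysis["strategies"]["income"] = "mean"

# Map each strategy name to a function that computes the fill value for a column.
FILLERS = {
    "mean": lambda s: s.mean(),
    "median": lambda s: s.median(),
    "mode": lambda s: s.mode().iloc[0],
}


def apply_strategies(df: pd.DataFrame, strategies: dict) -> pd.DataFrame:
    """Fill each listed column using its named strategy; returns a copy."""
    out = df.copy()
    for col, name in strategies.items():
        if name not in FILLERS:
            raise ValueError(f"Unknown strategy {name!r} for column {col!r}")
        out[col] = out[col].fillna(FILLERS[name](out[col]))
    return out


# Made-up data standing in for data.csv.
df = pd.DataFrame(
    {
        "age": [30, np.nan, 40],
        "income": [100.0, 200.0, np.nan],
        "city": ["Rome", None, "Rome"],
    }
)
result = apply_strategies(df, analysis["strategies"])
print(result)
```

Keeping the strategy choice in data rather than in code means a reviewed mapping can be saved alongside the imputed file as a record of what was done.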
The skill uses intelligent imputation strategies based on data characteristics. Key methods include:
For detailed information about when each method is appropriate, refer to references/imputation_methods.md.
The interactive dashboard includes:
Before using the skill, ensure dependencies are installed:
pip install -r requirements.txt
Required packages:
pandas - Data manipulation and analysis
numpy - Numerical computing
scikit-learn - KNN imputation
plotly - Interactive visualizations
dash - Web dashboard framework
dash-bootstrap-components - Dashboard styling
The scripts automatically flag columns with >50% missing values. Options:
If a column contains mixed types (e.g., numbers and text):
For datasets with <50 rows:
For time series with irregular timestamps:
Install dependencies: pip install -r requirements.txt
Specify a different port: python3 scripts/create_dashboard.py data.csv ./viz 8051
KNN is computationally intensive for large datasets. For >50k rows, consider:
analyze_missing_values.py - Comprehensive missing value analysis with automatic strategy recommendation
impute_missing_values.py - Intelligent imputation using multiple methods tailored to data characteristics
create_dashboard.py - Interactive Plotly Dash dashboard generator with multiple visualization types
imputation_methods.md - Detailed guide to missing value imputation strategies, decision frameworks, and best practices
requirements.txt - Python dependencies for the skill
Weekly Installs: 99
Repository: ailabs-393/ai-labs-claude-skills
GitHub Stars: 322
First Seen: Jan 23, 2026
Security Audits: Gen Agent Trust Hub (Pass), Socket (Pass), Snyk (Pass)
Installed on: opencode (84), gemini-cli (72), codex (71), cursor (70), claude-code (64), github-copilot (57)