scikit-learn-best-practices by mindrally/skills
npx skills add https://github.com/mindrally/skills --skill scikit-learn-best-practices
Expert guidelines for scikit-learn development, focusing on machine learning workflows, model development, evaluation, and best practices.
Use train_test_split() with random_state for reproducibility
stratify=y to preserve class proportions when splitting classification data
StandardScaler for normally distributed features
MinMaxScaler for bounded features
RobustScaler for data with outliers
OneHotEncoder and OrdinalEncoder for categorical features; LabelEncoder for target labels
SimpleImputer, KNNImputer for missing values

Always use Pipeline to chain preprocessing and modeling:
Prevents data leakage by fitting transformers only on training data
Makes code cleaner and more reproducible
Enables easy deployment and serialization
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.ensemble import RandomForestClassifier

pipeline = Pipeline([
    ('scaler', StandardScaler()),
    ('classifier', RandomForestClassifier(random_state=42))
])
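The pipeline above can be exercised end to end. The following is a minimal sketch, not the guide's own code: it assumes a synthetic dataset from make_classification, and the sample count, class weights, fold count, n_estimators, and the F1 scoring choice are all illustrative assumptions. It shows a stratified hold-out split followed by cross-validation of the whole pipeline, so the scaler is refit on each fold's training portion only.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic, mildly imbalanced data (an assumption made for illustration only).
X, y = make_classification(
    n_samples=300, n_features=8, weights=[0.7, 0.3], random_state=42
)

# Hold out a test set; stratify=y keeps class proportions similar in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

pipeline = Pipeline([
    ("scaler", StandardScaler()),
    ("classifier", RandomForestClassifier(n_estimators=50, random_state=42)),
])

# cross_val_score clones and refits the whole pipeline per fold, so the
# scaler only ever sees that fold's training portion (no leakage).
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
cv_scores = cross_val_score(pipeline, X_train, y_train, cv=cv, scoring="f1")

# Final fit on all training data, then a single check on the held-out set.
pipeline.fit(X_train, y_train)
test_accuracy = pipeline.score(X_test, y_test)

print(f"CV F1: {cv_scores.mean():.3f} +/- {cv_scores.std():.3f}")
print(f"Test accuracy: {test_accuracy:.3f}")
```

Passing the pipeline itself to cross_val_score, rather than pre-scaled arrays, is what keeps the preprocessing inside each fold.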
ColumnTransformer for different preprocessing per feature type

cross_val_score() for quick evaluation
cross_validate() for multiple metrics
KFold for regression
StratifiedKFold for classification
TimeSeriesSplit for temporal data
GroupKFold for grouped data

GridSearchCV for exhaustive search
RandomizedSearchCV for large parameter spaces
n_jobs=-1 for parallel processing

accuracy_score for balanced classes
precision_score, recall_score, f1_score for imbalanced classes
roc_auc_score for ranking ability
classification_report() for comprehensive overview
confusion_matrix() for error analysis

mean_squared_error (MSE) for general use
mean_absolute_error (MAE) for interpretability
r2_score for explained variance

class_weight='balanced'

SelectKBest with statistical tests
RFE (Recursive Feature Elimination)
SelectFromModel

joblib for saving and loading models

n_jobs=-1 for parallel processing where available
warm_start=True for iterative training
partial_fit() for large data

from sklearn.ensemble import RandomForestClassifier
random_state for reproducibility

Weekly Installs: 99
Repository: mindrally/skills
GitHub Stars: 43
First Seen: Jan 25, 2026
Security Audits: Gen Agent Trust Hub (Pass), Socket (Pass), Snyk (Pass)
Installed on: gemini-cli 84, opencode 82, codex 78, cursor 78, github-copilot 74, claude-code 71