tooluniverse-crispr-screen-analysis by mims-harvard/tooluniverse
npx skills add https://github.com/mims-harvard/tooluniverse --skill tooluniverse-crispr-screen-analysis
Comprehensive skill for analyzing CRISPR-Cas9 genetic screens to identify essential genes, synthetic lethal interactions, and therapeutic targets through robust statistical analysis and pathway enrichment.
CRISPR screens enable genome-wide functional genomics by systematically perturbing genes and measuring fitness effects. This skill provides an 8-phase workflow for:
Load sgRNA count matrix (MAGeCK format or generic TSV). Expected columns: sgRNA, Gene, plus sample columns. Create experimental design table linking samples to conditions (baseline/treatment) with replicate assignments.
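As a rough illustration of this loading step, the sketch below reads a tab-separated count table with pandas and builds a small design table. The function names `load_counts` and `make_design` are hypothetical stand-ins, not the skill's own helpers, and the in-memory table is made-up example data.

```python
import io
import pandas as pd

def load_counts(source):
    """Read a tab-separated table with columns sgRNA, Gene, <sample columns>."""
    table = pd.read_csv(source, sep="\t")
    missing = {"sgRNA", "Gene"} - set(table.columns)
    if missing:
        raise ValueError(f"missing required columns: {sorted(missing)}")
    table = table.set_index("sgRNA")
    sgrna_to_gene = table["Gene"]          # sgRNA -> gene mapping
    counts = table.drop(columns="Gene")    # sgRNA x sample count matrix
    return counts, sgrna_to_gene

def make_design(samples, conditions):
    """Link samples to conditions; replicates are numbered within each condition."""
    if len(samples) != len(conditions):
        raise ValueError("samples and conditions must have the same length")
    design = pd.DataFrame({"sample": samples, "condition": conditions})
    design["replicate"] = design.groupby("condition").cumcount() + 1
    return design

# Tiny in-memory example standing in for a real count file (illustrative values).
example = io.StringIO(
    "sgRNA\tGene\tT0_1\tT0_2\tT14_1\tT14_2\n"
    "sg1\tGENE_A\t100\t120\t10\t12\n"
    "sg2\tGENE_A\t90\t110\t15\t9\n"
    "sg3\tGENE_B\t80\t95\t85\t90\n"
)
counts, sgrna_to_gene = load_counts(example)
design = make_design(["T0_1", "T0_2", "T14_1", "T14_2"],
                     ["baseline", "baseline", "treatment", "treatment"])
```

Keeping the sgRNA-to-gene mapping separate from the numeric matrix means later steps can operate on counts alone and join gene labels back only when scoring.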
Assess sgRNA distribution quality:
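A minimal sketch of such a check, assuming three commonly used per-sample metrics: total reads, the fraction of sgRNAs with zero counts, and the Gini index of the count distribution. The skill's own QC criteria are not shown in this page, so the metric choice and the name `qc_summary` are assumptions.

```python
import numpy as np
import pandas as pd

def gini_index(values):
    """Gini index of a count vector: 0 means perfectly even, values near 1 mean highly skewed."""
    x = np.sort(np.asarray(values, dtype=float))
    n = x.size
    total = x.sum()
    if n == 0 or total == 0:
        return 0.0
    ranks = np.arange(1, n + 1)
    return float(2 * np.sum(ranks * x) / (n * total) - (n + 1) / n)

def qc_summary(counts):
    """Per-sample QC table for an sgRNA x sample count matrix."""
    return pd.DataFrame({
        "total_reads": counts.sum(axis=0),
        "zero_fraction": (counts == 0).mean(axis=0),
        "gini": counts.apply(gini_index, axis=0),
    })

# Illustrative data: one evenly covered sample and one dominated by a single sgRNA.
counts = pd.DataFrame(
    {"even": [10, 10, 10, 10], "skewed": [0, 0, 0, 40]},
    index=["sg1", "sg2", "sg3", "sg4"],
)
qc = qc_summary(counts)
```

A sample with a high zero fraction or a high Gini index has uneven library representation and may need to be inspected before scoring.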
Normalize sgRNA counts to account for library size differences:
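One common way to do this is median-ratio normalization, sketched below. Whether the skill's `normalize_counts` uses this method is not stated here, so treat the technique and the name `median_ratio_normalize` as assumptions.

```python
import numpy as np
import pandas as pd

def median_ratio_normalize(counts):
    """Scale each sample by the median ratio of its counts to the per-sgRNA geometric mean."""
    # Use only sgRNAs detected in every sample so the logarithm is defined.
    positive = counts[(counts > 0).all(axis=1)]
    if positive.empty:
        raise ValueError("no sgRNA has nonzero counts in every sample")
    log_counts = np.log(positive.astype(float))
    log_geo_mean = log_counts.mean(axis=1)
    size_factors = np.exp(log_counts.sub(log_geo_mean, axis=0).median(axis=0))
    normalized = counts.div(size_factors, axis=1)
    return normalized, size_factors

# Illustrative data: sample B was sequenced twice as deeply as sample A.
counts = pd.DataFrame({"A": [10, 20, 30], "B": [20, 40, 60]},
                      index=["sg1", "sg2", "sg3"])
normalized, size_factors = median_ratio_normalize(counts)
```

After normalization the two example samples are identical, because their only difference was sequencing depth.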
Calculate log2 fold changes (LFC) between treatment and control conditions with pseudocount.
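The calculation can be sketched as LFC = log2((mean treatment count + pseudocount) / (mean control count + pseudocount)) per sgRNA. The function name, the default pseudocount of 1, and the condition labels below are illustrative assumptions rather than the skill's actual `calculate_lfc` signature.

```python
import numpy as np
import pandas as pd

def log2_fold_change(norm_counts, design, control="baseline",
                     treatment="treatment", pseudocount=1.0):
    """Per-sgRNA log2 fold change of mean treatment counts over mean control counts."""
    control_samples = design.loc[design["condition"] == control, "sample"].tolist()
    treatment_samples = design.loc[design["condition"] == treatment, "sample"].tolist()
    control_mean = norm_counts[control_samples].mean(axis=1)
    treatment_mean = norm_counts[treatment_samples].mean(axis=1)
    # The pseudocount keeps the ratio finite when an sgRNA drops to zero reads.
    return np.log2((treatment_mean + pseudocount) / (control_mean + pseudocount))

# Illustrative normalized counts: sg1 is depleted, sg2 is unchanged.
norm_counts = pd.DataFrame(
    {"T0_1": [99.0, 49.0], "T0_2": [99.0, 49.0],
     "T14_1": [24.0, 49.0], "T14_2": [24.0, 49.0]},
    index=["sg1", "sg2"],
)
design = pd.DataFrame({
    "sample": ["T0_1", "T0_2", "T14_1", "T14_2"],
    "condition": ["baseline", "baseline", "treatment", "treatment"],
})
lfc = log2_fold_change(norm_counts, design)
```

A negative LFC indicates that the sgRNA was depleted in the treatment condition, which is the signature expected for guides targeting essential genes.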
Two scoring approaches:
Compare essentiality scores between wildtype and mutant cell lines:
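A minimal sketch of such a comparison, assuming gene-level scores where more negative means more essential. The function name and both thresholds are hypothetical and would need tuning for a real screen.

```python
import pandas as pd

def differential_essentiality(wildtype_scores, mutant_scores,
                              min_delta=1.0, max_wildtype_effect=0.5):
    """Flag genes that are essential in the mutant line but not in the wildtype line."""
    scores = pd.DataFrame({"wildtype": wildtype_scores,
                           "mutant": mutant_scores}).dropna()
    scores["delta"] = scores["mutant"] - scores["wildtype"]
    # Candidate: clearly more depleted in mutant, with little effect in wildtype.
    scores["candidate"] = (
        (scores["delta"] <= -min_delta)
        & (scores["wildtype"] > -max_wildtype_effect)
    )
    return scores.sort_values("delta")

# Illustrative scores: GENE_A is selectively required in the mutant line,
# GENE_B is essential in both lines, GENE_C is essential in neither.
wildtype = pd.Series({"GENE_A": -0.1, "GENE_B": -2.0, "GENE_C": 0.0})
mutant = pd.Series({"GENE_A": -1.8, "GENE_B": -2.1, "GENE_C": 0.1})
result = differential_essentiality(wildtype, mutant)
```

Requiring a weak wildtype effect separates synthetic lethal candidates from genes that are essential in both backgrounds, such as GENE_B in the example.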
Query DepMap/literature for known dependencies using PubMed search.
Submit top essential genes to Enrichr for pathway enrichment:
Composite scoring combining:
Query DGIdb for each candidate gene to find existing drugs, interaction types, and sources.
Generate markdown report with:
Key Tools Used:
PubMed_search - Literature search for gene essentiality
Enrichr_submit_genelist - Pathway enrichment submission
Enrichr_get_results - Retrieve enrichment results
DGIdb_query_gene - Drug-gene interactions and druggability
STRING_get_network - Protein interaction networks
KEGG_get_pathway - Pathway visualization
Expression Integration:
GEO_get_dataset - Download expression data
ArrayExpress_get_experiment - Alternative expression source
Variant Integration:
ClinVar_query_gene - Known pathogenic variants
gnomAD_get_gene - Population allele frequencies
import pandas as pd
from tooluniverse import ToolUniverse
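# 1. Load data
# (the helper functions used in this example are assumed to be defined
#  as in the snippets from ANALYSIS_DETAILS.md; they are not imported here)
counts, meta = load_sgrna_counts("sgrna_counts.txt")
design = create_design_matrix(['T0_1', 'T0_2', 'T14_1', 'T14_2'],
                              ['baseline', 'baseline', 'treatment', 'treatment'])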
# 2. Process
filtered_counts, filtered_mapping = filter_low_count_sgrnas(counts, meta['sgrna_to_gene'])
norm_counts, _ = normalize_counts(filtered_counts)
lfc, _, _ = calculate_lfc(norm_counts, design)
# 3. Score genes
gene_scores = mageck_gene_scoring(lfc, filtered_mapping)
# 4. Enrich pathways
enrichment = enrich_essential_genes(gene_scores, top_n=100)
# 5. Find drug targets
drug_targets = prioritize_drug_targets(gene_scores)
# 6. Generate report
report = generate_crispr_report(gene_scores, enrichment, drug_targets)
ANALYSIS_DETAILS.md - Detailed code snippets for all 8 phases
USE_CASES.md - Complete use cases (essentiality screen, synthetic lethality, drug target discovery, expression integration) and best practices
EXAMPLES.md - Example usage and quick reference
QUICK_START.md - Quick start guide
FALLBACK_PATCH.md - Fallback patterns for API issues
Weekly Installs: 136
Repository: mims-harvard/tooluniverse
GitHub Stars: 1.2K
First Seen: Feb 12, 2026
Security Audits: Gen Agent Trust Hub (Pass), Socket (Pass), Snyk (Warn)
Installed on: gemini-cli (130), codex (130), opencode (129), github-copilot (127), kimi-cli (122), amp (122)