deepTools: NGS Data Analysis Toolkit | ChIP-seq/RNA-seq/ATAC-seq Quality Control and Visualization

deeptools by davila7/claude-code-templates

176 weekly installs

24,200 GitHub Stars

Install command

npx skills add https://github.com/davila7/claude-code-templates --skill deeptools

Data analysis, research tools, bioinformatics

January 21, 2026

🇺🇸English

deepTools: NGS Data Analysis Toolkit

Overview

deepTools is a comprehensive suite of Python command-line tools designed for processing and analyzing high-throughput sequencing data. Use deepTools to perform quality control, normalize data, compare samples, and generate publication-quality visualizations for ChIP-seq, RNA-seq, ATAC-seq, MNase-seq, and other NGS experiments.

Core capabilities:

Convert BAM alignments to normalized coverage tracks (bigWig/bedGraph)
Quality control assessment (fingerprint, correlation, coverage)
Sample comparison and correlation analysis
Heatmap and profile plot generation around genomic features
Enrichment analysis and peak region visualization

When to Use This Skill

This skill should be used when:

File conversion: "Convert BAM to bigWig", "generate coverage tracks", "normalize ChIP-seq data"
Quality control: "check ChIP quality", "compare replicates", "assess sequencing depth", "QC analysis"
Visualization: "create heatmap around TSS", "plot ChIP signal", "visualize enrichment", "generate profile plot"
Sample comparison: "compare treatment vs control", "correlate samples", "PCA analysis"
Analysis workflows: "analyze ChIP-seq data", "RNA-seq coverage", "ATAC-seq analysis", "complete workflow"
Working with specific file types: BAM, bigWig, and BED region files in a genomics context

Quick Start

For users new to deepTools, start with file validation and common workflows:

1. Validate Input Files

Before running any analysis, validate BAM, bigWig, and BED files using the validation script:

python scripts/validate_files.py --bam sample1.bam sample2.bam --bed regions.bed

This checks file existence, BAM indices, and format correctness.

2. Generate Workflow Template

For standard analyses, use the workflow generator to create customized scripts:

# List available workflows
python scripts/workflow_generator.py --list

# Generate ChIP-seq QC workflow
python scripts/workflow_generator.py chipseq_qc -o qc_workflow.sh \
    --input-bam Input.bam --chip-bams "ChIP1.bam ChIP2.bam" \
    --genome-size 2913022398

# Make executable and run
chmod +x qc_workflow.sh
./qc_workflow.sh

3. Most Common Operations

See assets/quick_reference.md for frequently used commands and parameters.

Installation

uv pip install deeptools

Core Workflows

deepTools workflows typically follow this pattern: QC → Normalization → Comparison/Visualization

ChIP-seq Quality Control Workflow

When users request ChIP-seq QC or quality assessment:

Generate a workflow script using scripts/workflow_generator.py chipseq_qc
Key QC steps:
- Sample correlation (multiBamSummary + plotCorrelation)
- PCA analysis (plotPCA)
- Coverage assessment (plotCoverage)
- Fragment size validation (bamPEFragmentSize)
- ChIP enrichment strength (plotFingerprint)

Interpreting results:

Correlation: Replicates should cluster together with high correlation (>0.9)
Fingerprint: A strong ChIP sample shows a steep rise at the high-coverage end of the cumulative curve; a curve that stays close to the diagonal indicates poor enrichment
Coverage: Assess whether sequencing depth is adequate for the planned analysis
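As a rough illustration of the correlation check, the sketch below computes a Pearson coefficient between two replicates in plain Python. The per-bin counts are made-up numbers and the `pearson` helper is hypothetical; in practice multiBamSummary and plotCorrelation perform this across all bins and samples.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-bin read counts for two replicates (illustrative only).
rep1 = [12, 40, 7, 95, 3, 60]
rep2 = [10, 44, 9, 90, 5, 58]

r = pearson(rep1, rep2)
print(f"Pearson r = {r:.3f}; above 0.9 threshold: {r > 0.9}")
```

Pearson is sensitive to a few very high-coverage bins, which is one reason to inspect the heatmap or scatterplot rather than rely on a single number.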

Full workflow details in references/workflows.md → "ChIP-seq Quality Control Workflow"

ChIP-seq Complete Analysis Workflow

For full ChIP-seq analysis from BAM to visualizations:

Generate coverage tracks with normalization (bamCoverage)
Create comparison tracks (bamCompare for log2 ratio)
Compute signal matrices around features (computeMatrix)
Generate visualizations (plotHeatmap, plotProfile)
Enrichment analysis at peaks (plotEnrichment)

Use scripts/workflow_generator.py chipseq_analysis to generate a template.

Complete command sequences in references/workflows.md → "ChIP-seq Analysis Workflow"

RNA-seq Coverage Workflow

For strand-specific RNA-seq coverage tracks:

Use bamCoverage with --filterRNAstrand to separate forward and reverse strands.

Important: NEVER use --extendReads for RNA-seq (would extend over splice junctions).

Normalization: use CPM for fixed-size bins and RPKM for gene-level analysis.
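The strand-specific setup can be sketched as a small Python helper that assembles one bamCoverage command per strand. The file names, output prefix, and the `stranded_coverage_commands` helper are illustrative assumptions; only flags already mentioned in this document are used, and `--extendReads` is deliberately left out.

```python
import shlex


def stranded_coverage_commands(bam, prefix, bin_size=10):
    """Build one bamCoverage argument list per strand (forward, reverse)."""
    commands = []
    for strand in ("forward", "reverse"):
        commands.append([
            "bamCoverage",
            "--bam", bam,
            "--outFileName", f"{prefix}.{strand}.bw",
            "--filterRNAstrand", strand,
            "--normalizeUsing", "CPM",
            "--binSize", str(bin_size),
        ])
    return commands


# Hypothetical input BAM; print the commands instead of running them.
for argv in stranded_coverage_commands("rnaseq.bam", "rnaseq"):
    print(shlex.join(argv))
```

Building argument lists rather than shell strings keeps file names with spaces safe if the commands are later passed to `subprocess.run`.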

Template available: scripts/workflow_generator.py rnaseq_coverage

Details in references/workflows.md → "RNA-seq Coverage Workflow"

ATAC-seq Analysis Workflow

ATAC-seq requires Tn5 offset correction:

Shift reads using alignmentSieve with --ATACshift
Generate coverage with bamCoverage
Analyze fragment sizes (expect nucleosome ladder pattern)
Visualize at peaks if available
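To make the Tn5 offset idea concrete, here is a minimal sketch that moves a fragment's ends inward by the commonly cited +4 bp (left end) and 5 bp (right end). The `atac_shift` helper and the half-open coordinate convention are assumptions for illustration; alignmentSieve applies its own per-mate logic, so treat this as a conceptual model rather than a reimplementation.

```python
def atac_shift(start, end, plus_shift=4, minus_shift=5):
    """Shift a fragment [start, end) inward to approximate Tn5 cut sites.

    The left end moves right by `plus_shift`, the right end moves left by
    `minus_shift`. Offsets are the conventional ATAC-seq values (assumed).
    """
    new_start = start + plus_shift
    new_end = end - minus_shift
    if new_start >= new_end:
        raise ValueError("fragment is too short to shift")
    return new_start, new_end


# Hypothetical fragment coordinates.
print(atac_shift(100, 300))
```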

Template: scripts/workflow_generator.py atacseq

Full workflow in references/workflows.md → "ATAC-seq Workflow"

Tool Categories and Common Tasks

BAM/bigWig Processing

Convert BAM to normalized coverage:

bamCoverage --bam input.bam --outFileName output.bw \
    --normalizeUsing RPGC --effectiveGenomeSize 2913022398 \
    --binSize 10 --numberOfProcessors 8

Compare two samples (log2 ratio):

bamCompare -b1 treatment.bam -b2 control.bam -o ratio.bw \
    --operation log2 --scaleFactorsMethod readCount
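The arithmetic behind a log2 ratio track can be sketched in a few lines of Python. This assumes that read-count scaling brings the deeper sample down to the depth of the shallower one and that a pseudocount of 1 is added to both sides to avoid division by zero; both are stated assumptions here, and the helper names are hypothetical.

```python
import math


def read_count_scale_factors(total_a, total_b):
    """Scale factors that bring both samples to the depth of the smaller one."""
    smaller = min(total_a, total_b)
    return smaller / total_a, smaller / total_b


def log2_ratio(count_a, count_b, scale_a, scale_b, pseudocount=1.0):
    """log2 of scaled treatment over scaled control for one bin."""
    numerator = count_a * scale_a + pseudocount
    denominator = count_b * scale_b + pseudocount
    return math.log2(numerator / denominator)


# Hypothetical totals: treatment has twice as many mapped reads as control.
scale_t, scale_c = read_count_scale_factors(20_000_000, 10_000_000)
print(scale_t, scale_c, log2_ratio(30, 7, scale_t, scale_c))
```

Without the depth correction, the bin above would look enriched simply because the treatment library was sequenced more deeply.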

Key tools: bamCoverage, bamCompare, multiBamSummary, multiBigwigSummary, correctGCBias, alignmentSieve

Complete reference: references/tools_reference.md → "BAM and bigWig File Processing Tools"

Quality Control

Check ChIP enrichment:

plotFingerprint -b input.bam chip.bam -o fingerprint.png \
    --extendReads 200 --ignoreDuplicates
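The shape of a fingerprint curve can be reproduced on toy data: sort bins by read count and accumulate the fraction of reads. The sketch below is a simplified model with made-up counts and a hypothetical `fingerprint_curve` helper; plotFingerprint samples the genome and handles many details not shown here.

```python
def fingerprint_curve(bin_counts):
    """Cumulative fraction of reads after sorting bins from low to high coverage."""
    ordered = sorted(bin_counts)
    total = sum(ordered)
    if total == 0:
        raise ValueError("no reads in any bin")
    curve = []
    running = 0
    for count in ordered:
        running += count
        curve.append(running / total)
    return curve


# Evenly covered sample (input-like) versus a sample with reads concentrated
# in one bin (ChIP-like). Counts are illustrative only.
uniform = fingerprint_curve([5, 5, 5, 5])
enriched = fingerprint_curve([0, 0, 1, 19])
print(uniform)
print(enriched)
```

The evenly covered sample follows the diagonal, while the enriched sample stays near zero and then rises steeply at the last bins.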

Sample correlation:

multiBamSummary bins --bamfiles *.bam -o counts.npz
plotCorrelation -in counts.npz --corMethod pearson \
    --whatToPlot heatmap -o correlation.png

Key tools: plotFingerprint, plotCoverage, plotCorrelation, plotPCA, bamPEFragmentSize

Complete reference: references/tools_reference.md → "Quality Control Tools"

Visualization

Create heatmap around TSS:

# Compute matrix
computeMatrix reference-point -S signal.bw -R genes.bed \
    -b 3000 -a 3000 --referencePoint TSS -o matrix.gz

# Generate heatmap
plotHeatmap -m matrix.gz -o heatmap.png \
    --colorMap RdBu --kmeans 3
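What reference-point mode does for each gene can be sketched as a strand-aware window calculation. This is a simplified illustration, assuming BED-style 0-based half-open coordinates; the `tss_window` helper is hypothetical and the exact boundary handling in computeMatrix may differ.

```python
def tss_window(start, end, strand, upstream=3000, downstream=3000):
    """Return (left, right) of the window around a gene's TSS.

    For '+' genes the TSS is the interval start; for '-' genes it is the
    interval end, and upstream lies at higher coordinates.
    """
    if strand == "+":
        tss = start
        left, right = tss - upstream, tss + downstream
    elif strand == "-":
        tss = end
        left, right = tss - downstream, tss + upstream
    else:
        raise ValueError(f"unexpected strand: {strand!r}")
    return max(left, 0), right  # clamp at the chromosome start


# Hypothetical gene coordinates on each strand.
print(tss_window(10_000, 15_000, "+"))
print(tss_window(10_000, 15_000, "-"))
```

A BED file without correct strand information would place minus-strand windows around the gene end instead of the TSS, which is worth checking before interpreting a heatmap.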

Create profile plot:

plotProfile -m matrix.gz -o profile.png \
    --plotType lines --colors blue red

Key tools: computeMatrix, plotHeatmap, plotProfile, plotEnrichment

Complete reference: references/tools_reference.md → "Visualization Tools"

Normalization Methods

Choosing the correct normalization is critical for valid comparisons. Consult references/normalization_methods.md for comprehensive guidance.

Quick selection guide:

ChIP-seq coverage: Use RPGC or CPM
ChIP-seq comparison: Use bamCompare with log2 and readCount
RNA-seq bins: Use CPM
RNA-seq genes: Use RPKM (accounts for gene length)
ATAC-seq: Use RPGC or CPM

Normalization methods:

RPGC: Reads per genomic content, scaled to 1× average genome coverage (requires --effectiveGenomeSize)
CPM: Counts per million mapped reads
RPKM: Reads per kilobase per million mapped reads (accounts for region length)
BPM: Bins per million mapped reads (analogous to TPM in RNA-seq)
None: Raw counts (not recommended for comparisons)
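The per-bin formulas behind these methods can be written out as a short Python sketch. The function names and example numbers are illustrative assumptions; the formulas follow the usual definitions (RPGC divides by the factor needed to reach 1× average coverage), not code taken from deepTools.

```python
def cpm(reads_in_bin, total_mapped):
    """Counts per million mapped reads."""
    return reads_in_bin / (total_mapped / 1e6)


def rpkm(reads_in_bin, total_mapped, bin_length_bp):
    """Reads per kilobase of bin per million mapped reads."""
    return reads_in_bin / ((total_mapped / 1e6) * (bin_length_bp / 1e3))


def bpm(reads_in_bin, sum_reads_all_bins):
    """Bins per million: bin count relative to the sum over all bins."""
    return reads_in_bin / (sum_reads_all_bins / 1e6)


def rpgc(reads_in_bin, total_mapped, fragment_length, effective_genome_size):
    """Scale so that the genome-wide average coverage becomes 1x."""
    current_coverage = (total_mapped * fragment_length) / effective_genome_size
    return reads_in_bin / current_coverage


# Hypothetical bin: 50 reads in a 10 bp bin, 10 million mapped reads,
# 200 bp fragments, human GRCh38 effective genome size from this document.
print(cpm(50, 10_000_000))
print(rpkm(50, 10_000_000, 10))
print(rpgc(50, 10_000_000, 200, 2913022398))
```

RPKM and CPM differ only by the bin-length term, so with a fixed bin size they rank bins identically; the choice matters mainly when regions have different lengths.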

Full explanation: references/normalization_methods.md

Effective Genome Sizes

RPGC normalization requires the effective genome size, i.e. the portion of the genome that is mappable. Common values:

Organism	Assembly	Size	Usage
Human	GRCh38/hg38	2,913,022,398	`--effectiveGenomeSize 2913022398`
Mouse	GRCm38/mm10	2,652,783,500	`--effectiveGenomeSize 2652783500`
Zebrafish	GRCz11	1,368,780,147	`--effectiveGenomeSize 1368780147`
Drosophila	dm6	142,573,017	`--effectiveGenomeSize 142573017`
C. elegans	ce10/ce11	100,286,401	`--effectiveGenomeSize 100286401`

Complete table with read-length-specific values: references/effective_genome_sizes.md
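One way to avoid mistyping these numbers is to keep them in a lookup table and generate the flags from the assembly name. The sketch below is a hypothetical helper, not part of the skill's scripts; the values are copied from the table above and do not include the read-length-specific variants.

```python
# Values copied from the table above (not read-length-specific).
EFFECTIVE_GENOME_SIZES = {
    "GRCh38": 2913022398,
    "GRCm38": 2652783500,
    "GRCz11": 1368780147,
    "dm6": 142573017,
}


def rpgc_args(assembly):
    """Return the bamCoverage arguments for RPGC normalization."""
    try:
        size = EFFECTIVE_GENOME_SIZES[assembly]
    except KeyError:
        raise KeyError(
            f"no effective genome size recorded for {assembly!r}"
        ) from None
    return ["--normalizeUsing", "RPGC", "--effectiveGenomeSize", str(size)]


print(" ".join(rpgc_args("GRCm38")))
```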

Common Parameters Across Tools

Many deepTools commands share these options:

Performance:

--numberOfProcessors, -p: Enable parallel processing; accepts a number, or "max" to use all available cores
--region: Process specific regions for testing (e.g., chr1:1-1000000)

Read Filtering:

--ignoreDuplicates: Count reads with the same start position and orientation only once, which removes likely PCR duplicates (recommended for most analyses)
--minMappingQuality: Filter by alignment quality (e.g., --minMappingQuality 10)
--minFragmentLength / --maxFragmentLength: Fragment length bounds
--samFlagInclude / --samFlagExclude: SAM flag filtering

Read Processing:

--extendReads: Extend reads to the fragment length (ChIP-seq: YES, RNA-seq: NO)
--centerReads: Center at fragment midpoint for sharper signals

Best Practices

File Validation

Always validate files first using scripts/validate_files.py to check:

File existence and readability
BAM indices present (.bai files)
BED format correctness
File sizes reasonable
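The checks listed above can be approximated in a few lines of Python. This is a minimal sketch under stated assumptions, not the contents of scripts/validate_files.py: the `check_bam` and `check_bed` helpers are hypothetical, and the index is looked for under both sample.bam.bai and sample.bai.

```python
import os


def check_bam(path):
    """Return a list of problems found for a BAM file (empty list if none)."""
    if not os.path.isfile(path):
        return [f"{path}: file not found"]
    problems = []
    if os.path.getsize(path) == 0:
        problems.append(f"{path}: file is empty")
    # The index may be named sample.bam.bai or sample.bai.
    candidates = [path + ".bai", os.path.splitext(path)[0] + ".bai"]
    if not any(os.path.isfile(c) for c in candidates):
        problems.append(f"{path}: no .bai index found")
    return problems


def check_bed(path):
    """Return a list of problems found for a BED file (empty list if none)."""
    if not os.path.isfile(path):
        return [f"{path}: file not found"]
    problems = []
    with open(path) as handle:
        for number, line in enumerate(handle, start=1):
            if not line.strip() or line.startswith(("#", "track", "browser")):
                continue  # skip blank lines, comments, and header lines
            fields = line.rstrip("\n").split("\t")
            if len(fields) < 3:
                problems.append(f"{path}:{number}: fewer than 3 columns")
                continue
            try:
                start, end = int(fields[1]), int(fields[2])
            except ValueError:
                problems.append(f"{path}:{number}: start/end are not integers")
                continue
            if start < 0 or end <= start:
                problems.append(f"{path}:{number}: invalid interval {start}-{end}")
    return problems


# Hypothetical path; a missing file is reported rather than raising.
print(check_bam("sample1.bam"))
```

Returning a list of messages instead of raising on the first error lets a caller report every problem across all input files in one pass.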

Analysis Strategy

Start with QC: Run correlation, coverage, and fingerprint analysis before proceeding
Test on small regions: Use --region chr1:1-10000000 for parameter testing
Document commands: Save full command lines for reproducibility
Use consistent normalization: Apply the same method across all samples in a comparison
Verify genome assembly: Ensure BAM and BED files use matching genome builds

ChIP-seq Specific

Extend reads for ChIP-seq: --extendReads 200 for single-end data (the value is the expected fragment length); for paired-end data, --extendReads without a value uses the actual fragment size
Remove duplicates: Use --ignoreDuplicates in most cases
Check enrichment first: Run plotFingerprint before detailed analysis
GC correction: Only apply if significant bias is detected; never use --ignoreDuplicates after GC correction

RNA-seq Specific

Never extend reads for RNA-seq (extended reads would span splice junctions)
Strand-specific: Use --filterRNAstrand forward/reverse for stranded libraries
Normalization: CPM for bins, RPKM for genes

ATAC-seq Specific

Apply Tn5 correction: Use alignmentSieve with --ATACshift
Fragment filtering: Set appropriate min/max fragment lengths
Check nucleosome pattern: The fragment size plot should show a ladder pattern

Performance Optimization

Use multiple processors: --numberOfProcessors 8 (or the number of available cores)
Increase bin size for faster processing and smaller files
Process chromosomes separately on memory-limited systems
Pre-filter BAM files using alignmentSieve to create reusable filtered files
Use bigWig over bedGraph: Compressed and faster to process

Troubleshooting

Common Issues

BAM index missing:

samtools index input.bam

Out of memory: Process chromosomes individually using --region:

bamCoverage --bam input.bam -o chr1.bw --region chr1
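Processing one chromosome at a time can be scripted by generating one command per chromosome. The sketch below only builds and prints the commands; the chromosome list, file names, and the `per_chromosome_commands` helper are illustrative assumptions.

```python
import shlex


def per_chromosome_commands(bam, chromosomes, threads=8):
    """Build one bamCoverage argument list per chromosome."""
    return [
        [
            "bamCoverage",
            "--bam", bam,
            "-o", f"{chrom}.bw",
            "--region", chrom,
            "--numberOfProcessors", str(threads),
        ]
        for chrom in chromosomes
    ]


# Hypothetical chromosome names; they must match the names in the BAM header.
for argv in per_chromosome_commands("input.bam", ["chr1", "chr2", "chrX"]):
    print(shlex.join(argv))
```

Each list can be passed to `subprocess.run(argv, check=True)` to execute the commands sequentially, which keeps peak memory bounded by the largest chromosome.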

Slow processing: Increase --numberOfProcessors and/or increase --binSize

bigWig files too large: Increase bin size: --binSize 50 or larger

Validation Errors

Run validation script to identify issues:

python scripts/validate_files.py --bam *.bam --bed regions.bed

Common errors and their solutions are explained in the script output.

Reference Documentation

This skill includes comprehensive reference documentation:

references/tools_reference.md

Complete documentation of all deepTools commands organized by category:

BAM and bigWig processing tools (9 tools)
Quality control tools (6 tools)
Visualization tools (3 tools)
Miscellaneous tools (2 tools)

Each tool includes:

Purpose and overview
Key parameters with explanations
Usage examples
Important notes and best practices

Use this reference when: Users ask about specific tools, parameters, or detailed usage.

references/workflows.md

Complete workflow examples for common analyses:

ChIP-seq quality control workflow
ChIP-seq complete analysis workflow
RNA-seq coverage workflow
ATAC-seq analysis workflow
Multi-sample comparison workflow
Peak region analysis workflow
Troubleshooting and performance tips

Use this reference when: Users need complete analysis pipelines or workflow examples.

references/normalization_methods.md

Comprehensive guide to normalization methods:

Detailed explanation of each method (RPGC, CPM, RPKM, BPM, etc.)
When to use each method
Formulas and interpretation
Selection guide by experiment type
Common pitfalls and solutions
Quick reference table

Use this reference when: Users ask about normalization, comparing samples, or which method to use.

references/effective_genome_sizes.md

Effective genome size values and usage:

Common organism values (human, mouse, fly, worm, zebrafish)
Read-length-specific values
Calculation methods
When and how to use in commands
Custom genome calculation instructions

Use this reference when: Users need genome size for RPGC normalization or GC bias correction.

Helper Scripts

scripts/validate_files.py

Validates BAM, bigWig, and BED files for deepTools analysis. Checks file existence, indices, and format.

Usage:

python scripts/validate_files.py --bam sample1.bam sample2.bam \
    --bed peaks.bed --bigwig signal.bw

When to use: Before starting any analysis, or when troubleshooting errors.

scripts/workflow_generator.py

Generates customizable bash script templates for common deepTools workflows.

Available workflows:

chipseq_qc: ChIP-seq quality control
chipseq_analysis: Complete ChIP-seq analysis
rnaseq_coverage: Strand-specific RNA-seq coverage
atacseq: ATAC-seq with Tn5 correction

Usage:

# List workflows
python scripts/workflow_generator.py --list

# Generate workflow
python scripts/workflow_generator.py chipseq_qc -o qc.sh \
    --input-bam Input.bam --chip-bams "ChIP1.bam ChIP2.bam" \
    --genome-size 2913022398 --threads 8

# Run generated workflow
chmod +x qc.sh
./qc.sh

When to use: Users request standard workflows or need template scripts to customize.

Assets

assets/quick_reference.md

Quick reference card with the most common commands, effective genome sizes, and typical workflow patterns.

When to use: Users need quick command examples without detailed documentation.

Handling User Requests

For New Users

Start with installation verification
Validate input files using scripts/validate_files.py
Recommend appropriate workflow based on experiment type
Generate workflow template using scripts/workflow_generator.py
Guide through customization and execution

For Experienced Users

Provide specific tool commands for requested operations
Reference appropriate sections in references/tools_reference.md
Suggest optimizations and best practices
Offer troubleshooting for issues

For Specific Tasks

"Convert BAM to bigWig":

Use bamCoverage with appropriate normalization
Recommend RPGC or CPM based on use case
Provide effective genome size for organism
Suggest relevant parameters (extendReads, ignoreDuplicates, binSize)

"Check ChIP quality":

Run full QC workflow or use plotFingerprint specifically
Explain interpretation of results
Suggest follow-up actions based on results

"Create heatmap":

Guide through two-step process: computeMatrix → plotHeatmap
Help choose appropriate matrix mode (reference-point vs scale-regions)
Suggest visualization parameters and clustering options

"Compare samples":

Recommend bamCompare for two-sample comparison
Suggest multiBamSummary + plotCorrelation for multiple samples
Guide normalization method selection

Referencing Documentation

When users need detailed information:

Tool details: Direct to specific sections in references/tools_reference.md
Workflows: Use references/workflows.md for complete analysis pipelines
Normalization: Consult references/normalization_methods.md for method selection
Genome sizes: See references/effective_genome_sizes.md

Search references using grep patterns:

# Find tool documentation
grep -A 20 "^### toolname" references/tools_reference.md

# Find workflow
grep -A 50 "^## Workflow Name" references/workflows.md

# Find normalization method
grep -A 15 "^### Method Name" references/normalization_methods.md

Example Interactions

User: "I need to analyze my ChIP-seq data"

Response approach:

Ask about files available (BAM files, peaks, genes)
Validate files using validation script
Generate chipseq_analysis workflow template
Customize for their specific files and organism
Explain each step as the script runs

User: "Which normalization should I use?"

Response approach:

Ask about experiment type (ChIP-seq, RNA-seq, etc.)
Ask about comparison goal (within-sample or between-sample)
Consult references/normalization_methods.md selection guide
Recommend appropriate method with justification
Provide command example with parameters

User: "Create a heatmap around TSS"

Response approach:

Verify that bigWig and gene BED files are available
Use computeMatrix with reference-point mode at TSS
Generate plotHeatmap with appropriate visualization parameters
Suggest clustering if dataset is large
Offer profile plot as complement

Key Reminders

File validation first: Always validate input files before analysis
Normalization matters: Choose the appropriate method for the comparison type
Extend reads carefully: YES for ChIP-seq, NO for RNA-seq
Use all cores: Set --numberOfProcessors to the number of available cores
Test on regions: Use --region for parameter testing
Check QC first: Run quality control before detailed analysis
Document everything: Save commands for reproducibility
Reference documentation: Use the comprehensive references for detailed guidance

Weekly Installs

117

Repository

davila7/claude-…emplates

GitHub Stars

22.6K

First Seen

Jan 21, 2026

Security Audits

Gen Agent Trust Hub: Pass, Socket: Pass, Snyk: Pass

Installed on

claude-code: 101
opencode: 92
gemini-cli: 85
cursor: 85
antigravity: 81
codex: 76
