gtars by davila7/claude-code-templates
npx skills add https://github.com/davila7/claude-code-templates --skill gtars
Gtars is a high-performance Rust toolkit for manipulating, analyzing, and processing genomic interval data. It provides specialized tools for overlap detection, coverage analysis, tokenization for machine learning, and reference sequence management.
Use this skill when working with:
Install gtars Python bindings:
uv pip install gtars
Install command-line tools (requires Rust/Cargo):
# Install with all features
cargo install gtars-cli --features "uniwig overlaprs igd bbcache scoring fragsplit"
# Or install specific features only
cargo install gtars-cli --features "uniwig overlaprs"
Add to Cargo.toml for Rust projects:
[dependencies]
gtars = { version = "0.1", features = ["tokenizers", "overlaprs"] }
Gtars is organized into specialized modules, each focused on specific genomic analysis tasks:
Efficiently detect overlaps between genomic intervals using the Integrated Genome Database (IGD) data structure.
When to use:
Quick example:
import gtars
# Build IGD index and query overlaps
igd = gtars.igd.build_index("regions.bed")
overlaps = igd.query("chr1", 1000, 2000)
See references/overlap.md for comprehensive overlap detection documentation.
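To make the overlap query concrete, here is a minimal pure-Python sketch of what such a query returns, assuming BED-style half-open intervals `[start, end)`. It does not use gtars or an IGD index: `Region`, `find_overlaps`, and the sample data are hypothetical names introduced for illustration, and a linear scan stands in for the indexed lookup.

```python
from typing import NamedTuple


class Region(NamedTuple):
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # exclusive


def find_overlaps(regions, chrom, start, end):
    """Return regions on `chrom` sharing at least one base with [start, end)."""
    return [
        r for r in regions
        if r.chrom == chrom and r.start < end and start < r.end
    ]


regions = [
    Region("chr1", 500, 1200),
    Region("chr1", 2000, 2500),  # starts exactly where the query ends: no overlap
    Region("chr2", 1000, 2000),
]
hits = find_overlaps(regions, "chr1", 1000, 2000)
print(hits)
```

With half-open coordinates, two intervals that merely touch (one ends where the other starts) do not overlap, which is why the second region is excluded.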
Generate coverage tracks from sequencing data with the uniwig module.
When to use:
Quick example:
# Generate BigWig coverage track
gtars uniwig generate --input fragments.bed --output coverage.bw --format bigwig
See references/coverage.md for detailed coverage analysis workflows.
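As a rough illustration of what a coverage track contains, the sketch below computes per-base depth from a list of fragments using a difference array. It is plain Python rather than the uniwig implementation; `coverage` and the toy fragments are assumptions made for this example, with fragments given as half-open `(start, end)` pairs on a single chromosome.

```python
def coverage(fragments, chrom_length):
    """Per-base depth for half-open (start, end) fragments on one chromosome."""
    diff = [0] * (chrom_length + 1)
    for start, end in fragments:
        diff[start] += 1                     # depth rises where a fragment begins
        diff[min(end, chrom_length)] -= 1    # and falls just past its last base
    depth, running = [], 0
    for delta in diff[:chrom_length]:
        running += delta
        depth.append(running)
    return depth


fragments = [(0, 4), (2, 6), (5, 8)]
depth = coverage(fragments, 10)
print(depth)
```

The difference-array approach touches each fragment once and each position once, so it scales linearly instead of incrementing every covered base per fragment.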
Convert genomic regions into discrete tokens for machine learning applications, particularly for deep learning models on genomic data.
When to use:
Quick example:
from gtars.tokenizers import TreeTokenizer
tokenizer = TreeTokenizer.from_bed_file("training_regions.bed")
token = tokenizer.tokenize("chr1", 1000, 2000)
See references/tokenizers.md for tokenization documentation.
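One way to picture region tokenization is as a lookup against a fixed vocabulary ("universe") of regions: a query region maps to the IDs of the vocabulary regions it overlaps. The sketch below assumes that scheme and is not the gtars tokenizer itself; `tokenize`, `UNK`, and the universe are hypothetical.

```python
UNK = -1  # hypothetical ID for regions that match nothing in the vocabulary


def tokenize(universe, chrom, start, end):
    """Map a half-open query region to the IDs of overlapping universe regions."""
    ids = [
        i for i, (c, s, e) in enumerate(universe)
        if c == chrom and s < end and start < e
    ]
    return ids or [UNK]


# The vocabulary: each entry's list index serves as its token ID.
universe = [("chr1", 0, 1500), ("chr1", 1500, 3000), ("chr2", 0, 1000)]
tokens = tokenize(universe, "chr1", 1000, 2000)
print(tokens)
```

A query that spans a vocabulary boundary yields more than one token, and a query outside the vocabulary falls back to the unknown token.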
Handle reference genome sequences and compute digests following the GA4GH refget protocol.
When to use:
Quick example:
import gtars
# Load reference and extract sequences
store = gtars.RefgetStore.from_fasta("hg38.fa")
sequence = store.get_subsequence("chr1", 1000, 2000)
See references/refget.md for reference sequence operations.
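For readers unfamiliar with GA4GH digests, the sketch below shows the `sha512t24u` scheme as I understand it from the GA4GH specification: SHA-512, truncated to the first 24 bytes, then base64url-encoded. It uses only the Python standard library, not gtars. The helper name `refget_sequence_id`, the uppercasing step, and the `SQ.` prefix convention are stated here as assumptions about how sequence identifiers are formed.

```python
import base64
import hashlib


def sha512t24u(data: bytes) -> str:
    """SHA-512 digest truncated to 24 bytes and base64url-encoded."""
    digest = hashlib.sha512(data).digest()
    return base64.urlsafe_b64encode(digest[:24]).decode("ascii")


def refget_sequence_id(sequence: str) -> str:
    """Hypothetical helper: uppercase the sequence and prefix the digest with 'SQ.'."""
    return "SQ." + sha512t24u(sequence.upper().encode("ascii"))


seq_id = refget_sequence_id("acgt")
print(seq_id)
```

Because 24 bytes encode to exactly 32 base64 characters, the digest never carries `=` padding, and identical sequences always produce the same identifier regardless of the name they carry in a FASTA file.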
Split and analyze fragment files, particularly useful for single-cell genomics data.
When to use:
Quick example:
# Split fragments by clusters
gtars fragsplit cluster-split --input fragments.tsv --clusters clusters.txt --output-dir ./by_cluster/
See references/cli.md for fragment processing commands.
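The idea behind splitting by cluster can be sketched in a few lines of plain Python: each fragment carries a cell barcode, and a separate table assigns barcodes to clusters. This is not the `fragsplit` implementation. The five-column fragment layout (chrom, start, end, barcode, count), `split_by_cluster`, and the sample data are assumptions for illustration.

```python
from collections import defaultdict


def split_by_cluster(fragments, barcode_to_cluster):
    """Group fragments by the cluster assigned to their barcode (column 4)."""
    groups = defaultdict(list)
    for frag in fragments:
        cluster = barcode_to_cluster.get(frag[3])
        if cluster is not None:  # skip barcodes with no cluster assignment
            groups[cluster].append(frag)
    return dict(groups)


fragments = [
    ("chr1", 100, 250, "AAAC", 1),
    ("chr1", 300, 480, "GGTT", 2),
    ("chr2", 50, 190, "AAAC", 1),
    ("chr2", 700, 820, "TTTT", 1),  # barcode absent from the cluster table
]
barcode_to_cluster = {"AAAC": "cluster_0", "GGTT": "cluster_1"}
by_cluster = split_by_cluster(fragments, barcode_to_cluster)
print(by_cluster)
```

In a file-based version, each group would be written to its own output file, giving one pseudobulk fragment file per cluster.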
Score fragment overlaps against reference datasets.
When to use:
Quick example:
# Score fragments against reference
gtars scoring score --fragments fragments.bed --reference reference.bed --output scores.txt
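The source does not spell out the scoring formula, so the following is only one plausible reading: for each reference region, count the fragments that overlap it. The function `score_regions`, the three-column tuples, and the counting rule are all assumptions, not the behavior of the `scoring` command.

```python
def score_regions(fragments, reference):
    """For each reference region, count the fragments overlapping it (half-open)."""
    scores = []
    for chrom, start, end in reference:
        n = sum(
            1 for c, s, e in fragments
            if c == chrom and s < end and start < e
        )
        scores.append(n)
    return scores


fragments = [("chr1", 100, 250), ("chr1", 200, 400), ("chr2", 50, 190)]
reference = [("chr1", 0, 220), ("chr1", 390, 500), ("chr2", 500, 600)]
scores = score_regions(fragments, reference)
print(scores)
```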
Identify overlapping genomic features:
import gtars
# Load two region sets
peaks = gtars.RegionSet.from_bed("chip_peaks.bed")
promoters = gtars.RegionSet.from_bed("promoters.bed")
# Find overlaps
overlapping_peaks = peaks.filter_overlapping(promoters)
# Export results
overlapping_peaks.to_bed("peaks_in_promoters.bed")
Generate coverage tracks for visualization:
# Step 1: Generate coverage
gtars uniwig generate --input atac_fragments.bed --output coverage.wig --resolution 10
# Step 2: Generate a BigWig track from the same fragments for genome browsers
gtars uniwig generate --input atac_fragments.bed --output coverage.bw --format bigwig
Prepare genomic data for machine learning:
from gtars.tokenizers import TreeTokenizer
import gtars
# Step 1: Load training regions
regions = gtars.RegionSet.from_bed("training_peaks.bed")
# Step 2: Create tokenizer
tokenizer = TreeTokenizer.from_bed_file("training_peaks.bed")
# Step 3: Tokenize regions
tokens = [tokenizer.tokenize(r.chromosome, r.start, r.end) for r in regions]
# Step 4: Use tokens in ML pipeline
# (integrate with geniml or custom models)
Use Python API when:
Use CLI when:
Comprehensive module documentation:
references/python-api.md - Complete Python API reference with RegionSet operations, NumPy integration, and data export
references/overlap.md - IGD indexing, overlap detection, and set operations
references/coverage.md - Coverage track generation with uniwig
references/tokenizers.md - Genomic tokenization for ML applications
references/refget.md - Reference sequence management and digests
references/cli.md - Complete command-line interface reference
Gtars serves as the foundation for the geniml Python package, providing core genomic interval operations for machine learning workflows. When working on geniml-related tasks, use gtars for data preprocessing and tokenization.
Gtars works with standard genomic formats:
Enable verbose logging for troubleshooting:
import gtars
# Enable debug logging
gtars.set_log_level("DEBUG")
# CLI verbose mode
gtars --verbose <command>
Weekly Installs: 138
Repository: davila7/claude-code-templates
GitHub Stars: 23.5K
First Seen: Jan 21, 2026
Security Audits: Gen Agent Trust Hub (Pass), Socket (Pass), Snyk (Warn)
Installed on: claude-code (119), opencode (113), gemini-cli (109), cursor (109), antigravity (102), codex (98)