biopython by davila7/claude-code-templates
npx skills add https://github.com/davila7/claude-code-templates --skill biopython
Biopython is a comprehensive set of freely available Python tools for biological computation. It provides functionality for sequence manipulation, file I/O, database access, structural bioinformatics, phylogenetics, and many other bioinformatics tasks. The current version is Biopython 1.85 (released January 2025), which supports Python 3 and requires NumPy.
Use this skill when:
Biopython is organized into modular sub-packages, each addressing specific bioinformatics domains:
Install Biopython from PyPI (requires Python 3 and NumPy), for example through uv's pip interface:
uv pip install biopython
For NCBI database access, always set your email address (required by NCBI):
from Bio import Entrez
Entrez.email = "your.email@example.com"
# Optional: API key for higher rate limits (10 req/s instead of 3 req/s)
Entrez.api_key = "your_api_key_here"
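Hard-coded credentials are easy to leak when scripts are shared. The following is a minimal sketch of reading the settings above from the environment instead. The variable names NCBI_EMAIL and NCBI_API_KEY are assumptions made for this example; neither Biopython nor NCBI defines them.

```python
import os


def entrez_settings(env=None):
    """Return Entrez settings read from environment variables.

    NCBI_EMAIL and NCBI_API_KEY are hypothetical names chosen for this sketch.
    """
    env = os.environ if env is None else env
    email = env.get("NCBI_EMAIL")
    if not email:
        raise ValueError("NCBI_EMAIL is not set; NCBI asks for a contact email")
    settings = {"email": email}
    api_key = env.get("NCBI_API_KEY")
    if api_key:  # the API key is optional
        settings["api_key"] = api_key
    return settings


# Usage with Biopython (commented out so the sketch runs without it):
# from Bio import Entrez
# for name, value in entrez_settings().items():
#     setattr(Entrez, name, value)
```

Passing a plain dict as env keeps the helper easy to exercise without touching the real environment.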
This skill provides comprehensive documentation organized by functionality area. When working on a task, consult the relevant reference documentation:
Reference: references/sequence_io.md
Use for:
Quick example:
from Bio import SeqIO
# Read sequences from FASTA file
for record in SeqIO.parse("sequences.fasta", "fasta"):
    print(f"{record.id}: {len(record.seq)} bp")
# Convert GenBank to FASTA
SeqIO.convert("input.gb", "genbank", "output.fasta", "fasta")
Reference: references/alignment.md
Use for:
Quick example:
from Bio import Align
# Pairwise alignment
aligner = Align.PairwiseAligner()
aligner.mode = 'global'
alignments = aligner.align("ACCGGT", "ACGGT")
print(alignments[0])
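To make "global" mode concrete, here is a plain-Python sketch of the Needleman-Wunsch recurrence that global alignment is based on. It is an illustration only, not Biopython's implementation, and the scoring values (match=1, mismatch=-1, gap=-1) are assumptions chosen for the example rather than PairwiseAligner defaults.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch score of the best global alignment of a and b."""
    # prev[j] is the best score for aligning the processed prefix of a with b[:j]
    prev = [j * gap for j in range(len(b) + 1)]
    for i, ca in enumerate(a, start=1):
        curr = [i * gap]  # aligning a[:i] against an empty prefix costs i gaps
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


print(global_alignment_score("ACCGGT", "ACGGT"))  # 4: five matches and one gap
```

Because every position of both sequences must be aligned or gapped, the one-letter length difference forces at least one gap into the result.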
Reference: references/databases.md
Use for:
Quick example:
from Bio import Entrez
Entrez.email = "your.email@example.com"
# Search PubMed
handle = Entrez.esearch(db="pubmed", term="biopython", retmax=10)
results = Entrez.read(handle)
handle.close()
print(f"Found {results['Count']} results")
Reference: references/blast.md
Use for:
Quick example:
from Bio.Blast import NCBIWWW, NCBIXML
# Run BLAST search
result_handle = NCBIWWW.qblast("blastn", "nt", "ATCGATCGATCG")
blast_record = NCBIXML.read(result_handle)
# Display top hits
for alignment in blast_record.alignments[:5]:
    print(f"{alignment.title}: E-value={alignment.hsps[0].expect}")
Reference: references/structure.md
Use for:
Quick example:
from Bio.PDB import PDBParser
# Parse structure
parser = PDBParser(QUIET=True)
structure = parser.get_structure("1crn", "1crn.pdb")
# Calculate distance between alpha carbons
chain = structure[0]["A"]
distance = chain[10]["CA"] - chain[20]["CA"]
print(f"Distance: {distance:.2f} Å")
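Subtracting one Bio.PDB atom from another yields the Euclidean distance between their coordinates. The sketch below shows the same calculation on plain tuples, so it runs without a structure file. The coordinates are made-up values for illustration, not taken from 1crn.

```python
import math


def atom_distance(coord_a, coord_b):
    """Euclidean distance between two (x, y, z) coordinates, in the input units."""
    return math.dist(coord_a, coord_b)


# Hypothetical coordinates in angstroms, chosen for illustration only
ca_10 = (0.0, 0.0, 0.0)
ca_20 = (3.0, 4.0, 12.0)
print(f"Distance: {atom_distance(ca_10, ca_20):.2f} angstrom")  # Distance: 13.00 angstrom
```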
Reference: references/phylogenetics.md
Use for:
Quick example:
from Bio import Phylo
# Read and visualize tree
tree = Phylo.read("tree.nwk", "newick")
Phylo.draw_ascii(tree)
# Calculate distance
distance = tree.distance("Species_A", "Species_B")
print(f"Distance: {distance:.3f}")
Reference: references/advanced.md
Use for:
Quick example:
from Bio.SeqUtils import gc_fraction, molecular_weight
from Bio.Seq import Seq
seq = Seq("ATCGATCGATCG")
print(f"GC content: {gc_fraction(seq):.2%}")
print(f"Molecular weight: {molecular_weight(seq, seq_type='DNA'):.2f} g/mol")
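As a sanity check on the GC figure, the calculation can be sketched in plain Python. This simplified version only counts G and C letters and does not handle ambiguity codes, so treat it as an illustration of the idea rather than a replacement for gc_fraction.

```python
def gc_fraction_plain(sequence):
    """Fraction of G and C letters in a sequence (0.0 for an empty sequence)."""
    seq = str(sequence).upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


print(f"GC content: {gc_fraction_plain('ATCGATCGATCG'):.2%}")  # GC content: 50.00%
```

The example sequence contains three G and three C letters out of twelve, which gives the 50% shown.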
When a user asks about a specific Biopython task:
Example search patterns for reference files:
# Find information about specific functions
grep -n "SeqIO.parse" references/sequence_io.md
# Find examples of specific tasks
grep -n "BLAST" references/blast.md
# Find information about specific concepts
grep -n "alignment" references/alignment.md
Follow these principles when writing Biopython code:
Import modules explicitly
from Bio import SeqIO, Entrez
from Bio.Seq import Seq
Set Entrez email when using NCBI databases
Entrez.email = "your.email@example.com"
Use appropriate file formats - Check which format best suits the task
# Common formats: "fasta", "genbank", "fastq", "clustal", "phylip"
Handle files properly - Close handles after use or use context managers
with open("file.fasta") as handle:
    # SeqIO.parse is lazy, so read the records before the handle closes
    records = list(SeqIO.parse(handle, "fasta"))
Use iterators for large files - Avoid loading everything into memory
for record in SeqIO.parse("large_file.fasta", "fasta"):
    # Process one record at a time
    print(record.id)
Handle errors gracefully - Network operations and file parsing can fail
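One way to apply the last principle to network calls is a small retry wrapper with exponential backoff. This is a sketch under stated assumptions: fetch_with_retry and its parameters are hypothetical names for this example, not part of Biopython, and the retry counts and delays are arbitrary.

```python
import time
from urllib.error import URLError  # HTTPError is a subclass of URLError


def fetch_with_retry(fetch, retries=3, base_delay=1.0, sleep=time.sleep):
    """Call fetch() and retry on network errors with exponential backoff."""
    if retries < 1:
        raise ValueError("retries must be at least 1")
    for attempt in range(retries):
        try:
            return fetch()
        except URLError:
            if attempt == retries - 1:
                raise  # out of attempts: let the caller see the error
            sleep(base_delay * 2 ** attempt)  # wait 1s, 2s, 4s, ...


# Usage with Biopython (commented out so the sketch runs offline):
# from Bio import Entrez
# handle = fetch_with_retry(
#     lambda: Entrez.efetch(db="nucleotide", id="EU490707", rettype="gb", retmode="text")
# )
```

The sleep argument is injectable so the backoff can be checked without actually waiting.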
from Bio import Entrez, SeqIO
Entrez.email = "your.email@example.com"
# Fetch sequence
handle = Entrez.efetch(db="nucleotide", id="EU490707", rettype="gb", retmode="text")
record = SeqIO.read(handle, "genbank")
handle.close()
print(f"Description: {record.description}")
print(f"Sequence length: {len(record.seq)}")
from Bio import SeqIO
from Bio.SeqUtils import gc_fraction
for record in SeqIO.parse("sequences.fasta", "fasta"):
    # Calculate statistics
    gc = gc_fraction(record.seq)
    length = len(record.seq)
    # Find ORFs, translate, etc.
    protein = record.seq.translate()
    print(f"{record.id}: {length} bp, GC={gc:.2%}")
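When a file is too large to handle record by record in one pass (for example, when results are written out in chunks), the loop can be fed fixed-size batches. The helper below is a hypothetical sketch that works on any iterator, including the one returned by SeqIO.parse. Stand-in strings are used here so it runs without an input file.

```python
from itertools import islice


def batched_records(records, batch_size):
    """Yield lists of at most batch_size items from any iterator of records."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(records)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


# Stand-in data; with Biopython, pass SeqIO.parse("sequences.fasta", "fasta") instead
for batch in batched_records(["rec1", "rec2", "rec3", "rec4", "rec5"], 2):
    print(len(batch), batch)
```

Only one batch is held in memory at a time, which preserves the benefit of iterating lazily.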
from Bio.Blast import NCBIWWW, NCBIXML
from Bio import Entrez, SeqIO
Entrez.email = "your.email@example.com"
# Run BLAST (replace the placeholder query with your own sequence)
sequence = "ATCGATCGATCG"
result_handle = NCBIWWW.qblast("blastn", "nt", sequence)
blast_record = NCBIXML.read(result_handle)
# Get top hit accessions
accessions = [aln.accession for aln in blast_record.alignments[:5]]
# Fetch sequences
for acc in accessions:
    handle = Entrez.efetch(db="nucleotide", id=acc, rettype="fasta", retmode="text")
    record = SeqIO.read(handle, "fasta")
    handle.close()
    print(f">{record.description}")
from Bio import AlignIO, Phylo
from Bio.Phylo.TreeConstruction import DistanceCalculator, DistanceTreeConstructor
# Read alignment
alignment = AlignIO.read("alignment.fasta", "fasta")
# Calculate distances
calculator = DistanceCalculator("identity")
dm = calculator.get_distance(alignment)
# Build tree
constructor = DistanceTreeConstructor()
tree = constructor.nj(dm)
# Visualize
Phylo.draw_ascii(tree)
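The "identity" model used above scores two aligned sequences by the share of positions that differ. A simplified plain-Python sketch of that idea follows. Biopython's exact treatment of gaps and other special characters may differ, so this is an illustration rather than a reimplementation.

```python
def identity_distance(seq_a, seq_b):
    """Return 1 - (identical positions / alignment length) for two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    # A gap paired with a gap is not counted as a match in this sketch
    matches = sum(a == b and a != "-" for a, b in zip(seq_a, seq_b))
    return 1 - matches / len(seq_a)


print(identity_distance("ACGT", "ACGA"))  # 0.25
```

The length check mirrors the requirement that sequences be aligned before distances are computed.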
Solution: This is just a warning. Set Entrez.email to suppress it.
Solution: Check that IDs/accessions are valid and properly formatted.
Solution: Verify file format matches the specified format string.
Solution: Ensure sequences are aligned before using AlignIO or MultipleSeqAlignment.
Solution: Use local BLAST for large-scale searches, or cache results.
Solution: Use PDBParser(QUIET=True) to suppress warnings, or investigate structure quality.
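The suggestion to cache results can be sketched as a small on-disk cache keyed by a hash of the query. The function name, key format, and file layout are assumptions made for this example. A real compute() might run a BLAST query and return its XML output as a string.

```python
import hashlib
import tempfile
from pathlib import Path


def cached_result(cache_dir, key, compute):
    """Return the text cached for key, calling compute() only on a cache miss."""
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    path = cache_dir / f"{digest}.txt"
    if path.exists():
        return path.read_text(encoding="utf-8")
    text = compute()
    path.write_text(text, encoding="utf-8")
    return text


# Demonstration with a stand-in for a slow remote search
with tempfile.TemporaryDirectory() as tmp:
    calls = []

    def slow_search():
        calls.append(1)
        return "<result/>"

    cached_result(tmp, "blastn|nt|ATCGATCGATCG", slow_search)
    cached_result(tmp, "blastn|nt|ATCGATCGATCG", slow_search)
    print(f"compute() ran {len(calls)} time(s)")  # compute() ran 1 time(s)
```

Including the program and database in the key keeps results from different searches of the same sequence apart.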
To locate information in reference files, use these search patterns:
# Search for specific functions
grep -n "function_name" references/*.md
# Find examples of specific tasks
grep -n "example" references/sequence_io.md
# Find all occurrences of a module
grep -n "Bio.Seq" references/*.md
Biopython provides comprehensive tools for computational molecular biology. When using this skill:
Consult the appropriate reference file in the references/ directory.
The modular reference documentation ensures detailed, searchable information for every major Biopython capability.
Weekly Installs
140
Repository
GitHub Stars
22.6K
First Seen
Jan 21, 2026
Security Audits
Gen Agent Trust Hub: Pass; Socket: Pass; Snyk: Warn
Installed on
claude-code: 112
opencode: 111
gemini-cli: 105
cursor: 100
codex: 95
antigravity: 92
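# Example for the "Handle errors gracefully" principle: catch HTTP failures from Entrez
from urllib.error import HTTPError
from Bio import Entrez
accession = "EU490707"  # example accession used elsewhere in this document
try:
    handle = Entrez.efetch(db="nucleotide", id=accession)
except HTTPError as e:
    print(f"Error: {e}")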