etetoolkit by davila7/claude-code-templates
npx skills add https://github.com/davila7/claude-code-templates --skill etetoolkit
ETE (Environment for Tree Exploration) is a toolkit for phylogenetic and hierarchical tree analysis. Manipulate trees, analyze evolutionary events, visualize results, and integrate with biological databases for phylogenomic research and clustering analysis.
Load, manipulate, and analyze hierarchical tree structures with support for:
Common patterns:
from ete3 import Tree
# Load tree from file
tree = Tree("tree.nw", format=1)
# Basic statistics
print(f"Leaves: {len(tree)}")
print(f"Total nodes: {len(list(tree.traverse()))}")
# Prune to taxa of interest
taxa_to_keep = ["species1", "species2", "species3"]
tree.prune(taxa_to_keep, preserve_branch_length=True)
# Midpoint root
midpoint = tree.get_midpoint_outgroup()
tree.set_outgroup(midpoint)
# Save modified tree
tree.write(outfile="rooted_tree.nw")
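To make the midpoint-rooting step above more concrete, here is a minimal pure-Python sketch of the underlying idea, not ETE's implementation: find the two most distant leaves and place the root halfway along the path between them. The adjacency-dict tree and the helper names (`distances_from`, `farthest_leaf`, `midpoint_info`) are assumptions made for illustration.

```python
# Toy unrooted tree as a weighted adjacency dict (hypothetical data).
EDGES = {
    "A": {"n1": 1.0},
    "B": {"n1": 2.0},
    "n1": {"A": 1.0, "B": 2.0, "n2": 1.0},
    "n2": {"n1": 1.0, "C": 4.0, "D": 1.0},
    "C": {"n2": 4.0},
    "D": {"n2": 1.0},
}

def distances_from(start, edges):
    """Return path-length distances from start to every node."""
    dist = {start: 0.0}
    stack = [start]
    while stack:
        node = stack.pop()
        for nbr, length in edges[node].items():
            if nbr not in dist:
                dist[nbr] = dist[node] + length
                stack.append(nbr)
    return dist

def farthest_leaf(start, edges):
    """Return (leaf, distance) for the leaf farthest from start."""
    dist = distances_from(start, edges)
    leaves = [n for n in edges if len(edges[n]) == 1 and n != start]
    best = max(leaves, key=lambda n: dist[n])
    return best, dist[best]

def midpoint_info(edges):
    """Find the two most distant leaves and half the distance between them."""
    any_leaf = next(n for n in edges if len(edges[n]) == 1)
    end1, _ = farthest_leaf(any_leaf, edges)
    end2, diameter = farthest_leaf(end1, edges)
    return end1, end2, diameter / 2

end1, end2, half = midpoint_info(EDGES)
print(f"Most distant leaves: {end1}, {end2}; root sits {half} from each")
```

In this toy tree the longest path runs between B and C, so the midpoint falls on the long branch leading to C.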
Use scripts/tree_operations.py for command-line tree manipulation:
# Display tree statistics
python scripts/tree_operations.py stats tree.nw
# Convert format
python scripts/tree_operations.py convert tree.nw output.nw --in-format 0 --out-format 1
# Reroot tree
python scripts/tree_operations.py reroot tree.nw rooted.nw --midpoint
# Prune to specific taxa
python scripts/tree_operations.py prune tree.nw pruned.nw --keep-taxa "sp1,sp2,sp3"
# Show ASCII visualization
python scripts/tree_operations.py ascii tree.nw
Analyze gene trees with evolutionary event detection:
Workflow for gene tree analysis:
from ete3 import PhyloTree
# Load gene tree with alignment
tree = PhyloTree("gene_tree.nw", alignment="alignment.fasta")
# Set species naming function
def get_species(gene_name):
    return gene_name.split("_")[0]
tree.set_species_naming_function(get_species)
# Detect evolutionary events
events = tree.get_descendant_evol_events()
# Analyze events
for node in tree.traverse():
    if hasattr(node, "evoltype"):
        if node.evoltype == "D":
            print(f"Duplication at {node.name}")
        elif node.evoltype == "S":
            print(f"Speciation at {node.name}")
# Extract ortholog groups
# get_speciation_trees() returns (tree count, duplication count, trees)
ntrees, ndups, ortho_groups = tree.get_speciation_trees()
for i, ortho_tree in enumerate(ortho_groups):
    ortho_tree.write(outfile=f"ortholog_group_{i}.nw")
Finding orthologs and paralogs:
# Find orthologs and paralogs of a query gene
# (event.in_seqs and event.out_seqs hold leaf names, not node objects)
query = "species1_gene1"
orthologs = []
paralogs = []
for event in events:
    # The query may sit on either side of the event
    if query in event.in_seqs:
        others = event.out_seqs
    elif query in event.out_seqs:
        others = event.in_seqs
    else:
        continue
    if event.etype == "S":
        orthologs.extend(others)
    elif event.etype == "D":
        paralogs.extend(others)
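The duplication and speciation labels used above can be understood through the species-overlap rule, which ETE's default event detection is based on: a node whose two sides share at least one species is treated as a duplication, otherwise as a speciation. The following is a minimal pure-Python sketch of that rule, assuming a nested-tuple tree and the same `species_gene` naming convention; `classify` and the sample gene names are illustrative and not part of the ETE API.

```python
def get_species(gene_name):
    """Species code is the text before the first underscore."""
    return gene_name.split("_")[0]

def classify(node, found):
    """Return the set of species under node; record one event per internal node.

    A node is a tuple of two children; a leaf is a gene-name string.
    """
    if isinstance(node, str):
        return {get_species(node)}
    left, right = node
    left_sp = classify(left, found)
    right_sp = classify(right, found)
    # Shared species on both sides suggests a duplication ("D");
    # disjoint species sets suggest a speciation ("S").
    etype = "D" if left_sp & right_sp else "S"
    found.append((etype, sorted(left_sp), sorted(right_sp)))
    return left_sp | right_sp

# Hypothetical gene tree containing two copies of the "Hsa" gene.
gene_tree = (("Hsa_gene1", "Ptr_gene1"), ("Hsa_gene2", "Mmu_gene1"))
found = []
classify(gene_tree, found)
for etype, left_sp, right_sp in found:
    print(etype, left_sp, right_sp)
```

Here the two inner nodes are speciations, while the root is flagged as a duplication because "Hsa" appears on both sides.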
Integrate taxonomic information from NCBI Taxonomy database:
Building taxonomy-based trees:
from ete3 import NCBITaxa
ncbi = NCBITaxa()
# Build tree from species names
species = ["Homo sapiens", "Pan troglodytes", "Mus musculus"]
name2taxid = ncbi.get_name_translator(species)
taxids = [name2taxid[sp][0] for sp in species]
# Get minimal tree connecting taxa
tree = ncbi.get_topology(taxids)
# Annotate nodes with taxonomy info
for node in tree.traverse():
    if hasattr(node, "sci_name"):
        print(f"{node.sci_name} - Rank: {node.rank} - TaxID: {node.taxid}")
Annotating existing trees:
# Get taxonomy info for tree leaves
# extract_species_from_name() is a user-defined helper (not part of ete3)
# that maps a leaf name to a scientific name such as "Homo sapiens"
for leaf in tree:
    species = extract_species_from_name(leaf.name)
    taxid = ncbi.get_name_translator([species])[species][0]
    # Get lineage
    lineage = ncbi.get_lineage(taxid)
    ranks = ncbi.get_rank(lineage)
    names = ncbi.get_taxid_translator(lineage)
    # Add to node
    leaf.add_feature("taxid", taxid)
    leaf.add_feature("lineage", [names[t] for t in lineage])
Create publication-quality tree visualizations:
Basic visualization workflow:
from ete3 import Tree, TreeStyle, NodeStyle
tree = Tree("tree.nw")
# Configure tree style
ts = TreeStyle()
ts.show_leaf_name = True
ts.show_branch_support = True
ts.scale = 50 # pixels per branch length unit
# Style nodes
for node in tree.traverse():
    nstyle = NodeStyle()
    if node.is_leaf():
        nstyle["fgcolor"] = "blue"
        nstyle["size"] = 8
    else:
        # Color by support
        if node.support > 0.9:
            nstyle["fgcolor"] = "darkgreen"
        else:
            nstyle["fgcolor"] = "red"
        nstyle["size"] = 5
    node.set_style(nstyle)
# Render to file
tree.render("tree.pdf", tree_style=ts)
tree.render("tree.png", w=800, h=600, units="px", dpi=300)
Use scripts/quick_visualize.py for rapid visualization:
# Basic visualization
python scripts/quick_visualize.py tree.nw output.pdf
# Circular layout with custom styling
python scripts/quick_visualize.py tree.nw output.pdf --mode c --color-by-support
# High-resolution PNG
python scripts/quick_visualize.py tree.nw output.png --width 1200 --height 800 --units px --dpi 300
# Custom title and styling
python scripts/quick_visualize.py tree.nw output.pdf --title "Species Phylogeny" --show-support
Advanced visualization with faces:
from ete3 import Tree, TreeStyle, TextFace, CircleFace
tree = Tree("tree.nw")
# Add features to nodes
for leaf in tree:
    leaf.add_feature("habitat", "marine" if "fish" in leaf.name else "land")
# Layout function
def layout(node):
    if node.is_leaf():
        # Add colored circle
        color = "blue" if node.habitat == "marine" else "green"
        circle = CircleFace(radius=5, color=color)
        node.add_face(circle, column=0, position="aligned")
        # Add label
        label = TextFace(node.name, fsize=10)
        node.add_face(label, column=1, position="aligned")
ts = TreeStyle()
ts.layout_fn = layout
ts.show_leaf_name = False
tree.render("annotated_tree.pdf", tree_style=ts)
Analyze hierarchical clustering results with data integration:
Clustering workflow:
from ete3 import ClusterTree
# Load tree with data matrix
matrix = """#Names\tSample1\tSample2\tSample3
Gene1\t1.5\t2.3\t0.8
Gene2\t0.9\t1.1\t1.8
Gene3\t2.1\t2.5\t0.5"""
tree = ClusterTree("((Gene1,Gene2),Gene3);", text_array=matrix)
# Evaluate cluster quality
for node in tree.traverse():
    if not node.is_leaf():
        silhouette = node.get_silhouette()
        dunn = node.get_dunn()
        print(f"Cluster: {node.name}")
        print(f"  Silhouette: {silhouette:.3f}")
        print(f"  Dunn index: {dunn:.3f}")
# Visualize with heatmap
tree.show("heatmap")
Quantify topological differences between trees:
Compare two trees:
from ete3 import Tree
tree1 = Tree("tree1.nw")
tree2 = Tree("tree2.nw")
# Calculate RF distance
# robinson_foulds() in ete3 returns more than five values; keep the first five
rf, max_rf, common_leaves, parts_t1, parts_t2 = tree1.robinson_foulds(tree2)[:5]
print(f"RF distance: {rf}/{max_rf}")
print(f"Normalized RF: {rf/max_rf:.3f}")
print(f"Common leaves: {len(common_leaves)}")
# Find unique partitions
unique_t1 = parts_t1 - parts_t2
unique_t2 = parts_t2 - parts_t1
print(f"Unique to tree1: {len(unique_t1)}")
print(f"Unique to tree2: {len(unique_t2)}")
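For intuition about what the RF distance counts, here is a minimal pure-Python sketch (not ETE's implementation) for two small trees written as nested tuples and treated as unrooted: each internal branch splits the leaves into two groups, and the distance is the number of splits present in only one of the trees. The tuple encoding and the function names are assumptions for illustration.

```python
def leaf_set(node):
    """Collect leaf names under a nested-tuple node."""
    if isinstance(node, str):
        return frozenset([node])
    return frozenset().union(*(leaf_set(child) for child in node))

def splits(tree):
    """Return the non-trivial bipartitions of a tree as canonical frozensets."""
    all_leaves = leaf_set(tree)
    ref = min(all_leaves)
    found = set()

    def visit(node):
        if isinstance(node, str):
            return
        clade = leaf_set(node)
        other = all_leaves - clade
        # Skip trivial splits (a single leaf or nothing on one side).
        if len(clade) > 1 and len(other) > 1:
            # Always store the side that lacks the reference leaf, so the
            # same split gets the same representation in both trees.
            found.add(other if ref in clade else clade)
        for child in node:
            visit(child)

    visit(tree)
    return found

def robinson_foulds(tree_a, tree_b):
    """Return (rf, max_rf) for two trees on the same leaf set."""
    if leaf_set(tree_a) != leaf_set(tree_b):
        raise ValueError("trees must share the same leaves")
    sa, sb = splits(tree_a), splits(tree_b)
    return len(sa ^ sb), len(sa) + len(sb)

t1 = ((("A", "B"), "C"), ("D", "E"))
t2 = ((("A", "C"), "B"), ("D", "E"))
rf, max_rf = robinson_foulds(t1, t2)
print(f"RF distance: {rf}/{max_rf}")
```

The two toy trees agree on the D/E grouping and disagree on how A, B and C are arranged, so one split from each tree is unmatched.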
Compare multiple trees:
import numpy as np
from ete3 import Tree
trees = [Tree(f"tree{i}.nw") for i in range(4)]
# Create distance matrix
n = len(trees)
dist_matrix = np.zeros((n, n))
for i in range(n):
    for j in range(i+1, n):
        # Only the first two return values (rf, max_rf) are needed here
        rf, max_rf = trees[i].robinson_foulds(trees[j])[:2]
        norm_rf = rf / max_rf if max_rf > 0 else 0
        dist_matrix[i, j] = norm_rf
        dist_matrix[j, i] = norm_rf
Install ETE toolkit:
# Basic installation
uv pip install ete3
# With external dependencies for rendering (optional but recommended)
# On macOS:
brew install qt@5
# On Ubuntu/Debian:
sudo apt-get install python3-pyqt5 python3-pyqt5.qtsvg
# For full features including GUI
uv pip install "ete3[gui]"
First-time NCBI Taxonomy setup:
The first time NCBITaxa is instantiated, it automatically downloads the NCBI taxonomy database (~300MB) to ~/.etetoolkit/taxa.sqlite. This happens only once:
from ete3 import NCBITaxa
ncbi = NCBITaxa() # Downloads database on first run
Update taxonomy database:
ncbi.update_taxonomy_database() # Download latest NCBI data
Complete workflow from gene tree to ortholog identification:
from ete3 import PhyloTree, NCBITaxa
# 1. Load gene tree with alignment
tree = PhyloTree("gene_tree.nw", alignment="alignment.fasta")
# 2. Configure species naming
tree.set_species_naming_function(lambda x: x.split("_")[0])
# 3. Detect evolutionary events
tree.get_descendant_evol_events()
# 4. Annotate with taxonomy
# species_to_taxid is a user-supplied dict mapping species codes to NCBI taxids
ncbi = NCBITaxa()
for leaf in tree:
    if leaf.species in species_to_taxid:
        taxid = species_to_taxid[leaf.species]
        lineage = ncbi.get_lineage(taxid)
        leaf.add_feature("lineage", lineage)
# 5. Extract ortholog groups
# get_speciation_trees() returns (tree count, duplication count, trees)
ntrees, ndups, ortho_groups = tree.get_speciation_trees()
# 6. Save and visualize
for i, ortho in enumerate(ortho_groups):
    ortho.write(outfile=f"ortho_{i}.nw")
Batch process trees for analysis:
# Convert format
python scripts/tree_operations.py convert input.nw output.nw --in-format 0 --out-format 1
# Root at midpoint
python scripts/tree_operations.py reroot input.nw rooted.nw --midpoint
# Prune to focal taxa
python scripts/tree_operations.py prune rooted.nw pruned.nw --keep-taxa taxa_list.txt
# Get statistics
python scripts/tree_operations.py stats pruned.nw
Create styled visualizations:
from ete3 import Tree, TreeStyle, NodeStyle, TextFace
tree = Tree("tree.nw")
# Define clade colors
clade_colors = {
    "Mammals": "red",
    "Birds": "blue",
    "Fish": "green"
}
def layout(node):
    # Highlight clades
    if node.is_leaf():
        for clade, color in clade_colors.items():
            if clade in node.name:
                nstyle = NodeStyle()
                nstyle["fgcolor"] = color
                nstyle["size"] = 8
                node.set_style(nstyle)
    else:
        # Add support values
        if node.support > 0.95:
            support = TextFace(f"{node.support:.2f}", fsize=8)
            node.add_face(support, column=0, position="branch-top")
ts = TreeStyle()
ts.layout_fn = layout
ts.show_scale = True
# Render for publication
tree.render("figure.pdf", w=200, units="mm", tree_style=ts)
tree.render("figure.svg", tree_style=ts) # Editable vector
Process multiple trees systematically:
from ete3 import Tree
import os
input_dir = "trees"
output_dir = "processed"
# Make sure the output directory exists before writing to it
os.makedirs(output_dir, exist_ok=True)
for filename in os.listdir(input_dir):
    if filename.endswith(".nw"):
        tree = Tree(os.path.join(input_dir, filename))
        # Standardize: midpoint root, resolve polytomies
        midpoint = tree.get_midpoint_outgroup()
        tree.set_outgroup(midpoint)
        tree.resolve_polytomy(recursive=True)
        # Filter low support branches (iterate over a snapshot, because
        # deleting nodes while traversing modifies the tree mid-iteration)
        for node in list(tree.traverse()):
            if not node.is_leaf() and not node.is_root() and node.support < 0.5:
                node.delete()
        # Save processed tree
        output_file = os.path.join(output_dir, f"processed_{filename}")
        tree.write(outfile=output_file)
For comprehensive API documentation, code examples, and detailed guides, refer to the following resources in the references/ directory:
api_reference.md: Complete API documentation for all ETE classes and methods (Tree, PhyloTree, ClusterTree, NCBITaxa), including parameters, return types, and code examples
workflows.md: Common workflow patterns organized by task (tree operations, phylogenetic analysis, tree comparison, taxonomy integration, clustering analysis)
visualization.md: Comprehensive visualization guide covering TreeStyle, NodeStyle, Faces, layout functions, and advanced visualization techniques
Load these references when detailed information is needed:
# To use API reference
# Read references/api_reference.md for complete method signatures and parameters
# To implement workflows
# Read references/workflows.md for step-by-step workflow examples
# To create visualizations
# Read references/visualization.md for styling and rendering options
Import errors:
# If "ModuleNotFoundError: No module named 'ete3'"
uv pip install ete3
# For GUI and rendering issues
uv pip install "ete3[gui]"
Rendering issues:
If tree.render() or tree.show() fails with Qt-related errors, install system dependencies:
# macOS
brew install qt@5
# Ubuntu/Debian
sudo apt-get install python3-pyqt5 python3-pyqt5.qtsvg
NCBI Taxonomy database:
If database download fails or becomes corrupted:
from ete3 import NCBITaxa
ncbi = NCBITaxa()
ncbi.update_taxonomy_database() # Redownload database
Memory issues with large trees:
For very large trees (>10,000 leaves), use the iterator methods instead of methods that build a full list:
# Memory-efficient iteration
# (process() stands for your own per-leaf function)
for leaf in tree.iter_leaves():
    process(leaf)
# Instead of
for leaf in tree.get_leaves():  # Builds the full list of leaves in memory first
    process(leaf)
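The difference between the two loops is the usual generator-versus-list trade-off in Python. A minimal sketch with a nested-tuple tree as an assumed stand-in for a real ETE tree; `iter_leaves` and `get_leaves` here are local toy functions that only mirror the ETE method names.

```python
def iter_leaves(node):
    """Yield leaf names one at a time without building a list."""
    if isinstance(node, str):
        yield node
    else:
        for child in node:
            yield from iter_leaves(child)

def get_leaves(node):
    """Return every leaf name at once as a list."""
    return list(iter_leaves(node))

toy_tree = ((("A", "B"), "C"), ("D", "E"))

# Lazy: stops at the first match, so the remaining leaves are never visited.
first_late = next(name for name in iter_leaves(toy_tree) if name > "B")
print(first_late)

# Eager: the whole list exists in memory before any processing starts.
print(get_leaves(toy_tree))
```

With a handful of leaves the difference is negligible; it only starts to matter when the list of leaves itself becomes large.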
ETE supports multiple Newick format specifications (formats 0-9 and 100):
Specify format when reading/writing:
tree = Tree("tree.nw", format=1)
tree.write(outfile="output.nw", format=5)
NHX (New Hampshire eXtended) format preserves custom features:
tree.write(outfile="tree.nhx", features=["habitat", "temperature", "depth"])
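In NHX, the extra features travel in a bracketed comment attached to each node, in the form `[&&NHX:key=value:key=value]`. Below is a minimal standard-library sketch of reading those annotations back; the sample string and the `parse_nhx_tags` helper are assumptions for illustration, and real files are better loaded with ETE itself.

```python
import re

# Matches one NHX annotation block and captures its key=value body.
NHX_BLOCK = re.compile(r"\[&&NHX:([^\]]*)\]")

def parse_nhx_tags(newick):
    """Return one dict of key/value strings per NHX annotation block."""
    tags = []
    for body in NHX_BLOCK.findall(newick):
        pairs = (item.split("=", 1) for item in body.split(":") if "=" in item)
        tags.append(dict(pairs))
    return tags

# Hypothetical two-leaf tree with habitat and depth features.
sample = (
    "(tuna:1.0[&&NHX:habitat=marine:depth=200],"
    "frog:2.0[&&NHX:habitat=land]);"
)
for tag in parse_nhx_tags(sample):
    print(tag)
```

All values come back as strings, so numeric features such as depth need an explicit conversion.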
Use preserve_branch_length=True when pruning for phylogenetic analysis
Use get_cached_content() for repeated access to node contents on large trees
Use iter_* methods for memory-efficient processing of large trees
Use tree.show() to test visualizations before rendering to file
Weekly installs: 117
Repository: davila7/claude-code-templates
GitHub stars: 22.6K
First seen: Jan 21, 2026
Security audits: Gen Agent Trust Hub (Pass), Socket (Pass), Snyk (Pass)
Installed on: claude-code (101), opencode (91), cursor (87), gemini-cli (86), antigravity (81), codex (75)