npx skills add https://github.com/davila7/claude-code-templates --skill esm
ESM provides state-of-the-art protein language models for understanding, generating, and designing proteins. This skill enables working with two model families: ESM3 for generative protein design across sequence, structure, and function, and ESM C for efficient protein representation learning and embeddings.
Generate novel protein sequences with desired properties using multimodal generative modeling.
When to use:
Basic usage:
from esm.models.esm3 import ESM3
from esm.sdk.api import ESM3InferenceClient, ESMProtein, GenerationConfig
# Load model locally
model: ESM3InferenceClient = ESM3.from_pretrained("esm3-sm-open-v1").to("cuda")
# Create protein prompt
protein = ESMProtein(sequence="MPRT___KEND") # '_' represents masked positions
# Generate completion
protein = model.generate(protein, GenerationConfig(track="sequence", num_steps=8))
print(protein.sequence)
For remote/cloud usage via Forge API:
from esm.sdk.forge import ESM3ForgeInferenceClient
from esm.sdk.api import ESMProtein, GenerationConfig
# Connect to Forge
model = ESM3ForgeInferenceClient(model="esm3-medium-2024-08", url="https://forge.evolutionaryscale.ai", token="<token>")
# Create a protein prompt, then generate
protein = ESMProtein(sequence="MPRT___KEND")  # '_' represents masked positions
protein = model.generate(protein, GenerationConfig(track="sequence", num_steps=8))
See references/esm3-api.md for detailed ESM3 model specifications, advanced generation configurations, and multimodal prompting examples.
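Prompts like `MPRT___KEND` can be built programmatically instead of typed by hand. The sketch below is a minimal illustration in plain Python; `mask_region` and `count_masked` are hypothetical helper names, not part of the ESM SDK, and the range is assumed to be 0-based and half-open.

```python
def mask_region(sequence: str, start: int, end: int, mask: str = "_") -> str:
    """Replace residues in the half-open range [start, end) with the mask character."""
    if not 0 <= start <= end <= len(sequence):
        raise ValueError(f"invalid range [{start}, {end}) for length {len(sequence)}")
    return sequence[:start] + mask * (end - start) + sequence[end:]


def count_masked(sequence: str, mask: str = "_") -> int:
    """Count the positions the model is being asked to fill in."""
    return sequence.count(mask)


prompt = mask_region("MPRTAAAKEND", 4, 7)
print(prompt)                # MPRT___KEND
print(count_masked(prompt))  # 3
```

The resulting string can be passed as `ESMProtein(sequence=prompt)`. The masked count is also a reasonable reference point when choosing `num_steps`, since there is little to gain from more decoding steps than masked positions.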
Use ESM3's structure track for structure prediction from sequence or inverse folding (sequence design from structure).
Structure prediction:
from esm.sdk.api import ESM3InferenceClient, ESMProtein, GenerationConfig
# Predict structure from sequence
protein = ESMProtein(sequence="MPRTKEINDAGLIVHSP...")
protein_with_structure = model.generate(
    protein,
    GenerationConfig(track="structure", num_steps=8)  # fixed step count: a fully specified sequence has no "_" to count
)
# Access predicted structure
coordinates = protein_with_structure.coordinates  # 3D coordinates
pdb_string = protein_with_structure.to_pdb_string()  # to_pdb(path) writes a file instead
Inverse folding (sequence from structure):
# Design sequence for a target structure
protein_with_structure = ESMProtein.from_pdb("target_structure.pdb")
protein_with_structure.sequence = None # Remove sequence
# Generate sequence that folds to this structure
designed_protein = model.generate(
    protein_with_structure,
    GenerationConfig(track="sequence", num_steps=50, temperature=0.7)
)
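A common sanity check for inverse folding is to compare the designed sequence with the native one that was removed from the structure. The sketch below assumes the native sequence was saved before `sequence` was set to `None`; `sequence_identity` is a hypothetical helper written for illustration, not an ESM SDK function, and it only handles equal-length, ungapped sequences.

```python
def sequence_identity(designed: str, reference: str) -> float:
    """Fraction of positions where two equal-length sequences agree."""
    if len(designed) != len(reference):
        raise ValueError("sequences must have the same length")
    if not reference:
        raise ValueError("sequences must be non-empty")
    matches = sum(a == b for a, b in zip(designed, reference))
    return matches / len(reference)


native = "MPRTKEIND"    # toy stand-in for the sequence stored in the PDB file
designed = "MPRSKEIND"  # toy stand-in for designed_protein.sequence
print(f"{sequence_identity(designed, native):.2f}")  # 0.89
```

High identity is not the goal in itself: a design can fold to the target structure with a quite different sequence, so this number is best read alongside a structure prediction of the designed sequence.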
Generate high-quality embeddings for downstream tasks like function prediction, classification, or similarity analysis.
When to use:
Basic usage:
from esm.models.esmc import ESMC
from esm.sdk.api import ESMProtein, LogitsConfig
# Load ESM C model (the local model name uses an underscore)
model = ESMC.from_pretrained("esmc_300m").to("cuda")
# Encode the protein into a tensor representation
protein = ESMProtein(sequence="MPRTKEINDAGLIVHSP...")
protein_tensor = model.encode(protein)
# Request embeddings alongside the logits
logits_output = model.logits(protein_tensor, LogitsConfig(sequence=True, return_embeddings=True))
embeddings = logits_output.embeddings
Batch processing:
# Encode multiple proteins
proteins = [
    ESMProtein(sequence="MPRTKEIND..."),
    ESMProtein(sequence="AGLIVHSPQ..."),
    ESMProtein(sequence="KTEFLNDGR...")
]
config = LogitsConfig(sequence=True, return_embeddings=True)
embeddings_list = [model.logits(model.encode(p), config).embeddings for p in proteins]
See references/esm-c-api.md for ESM C model details, efficiency comparisons, and advanced embedding strategies.
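For similarity analysis, per-residue embeddings are usually reduced to one vector per protein and then compared. The sketch below shows mean pooling followed by cosine similarity in plain Python on toy numbers. It assumes the embeddings have already been converted to nested lists of shape (residues x dimensions), for example with `.tolist()`; real model output may also carry a batch dimension and special-token positions, which this sketch does not handle. `mean_pool` and `cosine_similarity` are illustrative names, not ESM SDK functions.

```python
import math
from typing import List, Sequence


def mean_pool(residue_embeddings: Sequence[Sequence[float]]) -> List[float]:
    """Average per-residue vectors into a single per-protein vector."""
    n = len(residue_embeddings)
    if n == 0:
        raise ValueError("need at least one residue embedding")
    dim = len(residue_embeddings[0])
    return [sum(row[j] for row in residue_embeddings) / n for j in range(dim)]


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_u * norm_v)


# Toy 2-dimensional "embeddings" for two proteins of two residues each
protein_a = mean_pool([[1.0, 0.0], [3.0, 0.0]])  # [2.0, 0.0]
protein_b = mean_pool([[0.0, 2.0], [0.0, 4.0]])  # [0.0, 3.0]
print(cosine_similarity(protein_a, protein_a))  # 1.0
print(cosine_similarity(protein_a, protein_b))  # 0.0
```

For large collections, the same two steps are normally done with tensor operations on the GPU; the pure-Python version is only meant to make the arithmetic explicit.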
Use ESM3's function track to generate proteins with specific functional annotations or predict function from sequence.
Function-conditioned generation:
from esm.sdk.api import ESMProtein, FunctionAnnotation, GenerationConfig
# Create protein with desired function
protein = ESMProtein(
    sequence="_" * 200,  # Generate 200 residue protein
    function_annotations=[
        FunctionAnnotation(label="fluorescent_protein", start=50, end=150)
    ]
)
# Generate sequence with specified function
functional_protein = model.generate(
    protein,
    GenerationConfig(track="sequence", num_steps=200)
)
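Annotation ranges that run past the end of the prompt are an easy mistake when the sequence length changes. The sketch below checks ranges before they are handed to the model. `AnnotationSpan` is a hypothetical stand-in defined here so the example runs without the SDK, and the 1-indexed, inclusive convention is an assumption that should be checked against the `FunctionAnnotation` documentation.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class AnnotationSpan:
    """Hypothetical stand-in for a function annotation over a residue range."""
    label: str
    start: int  # assumed 1-indexed, inclusive
    end: int    # assumed 1-indexed, inclusive


def check_spans(spans: List[AnnotationSpan], sequence_length: int) -> List[str]:
    """Return a human-readable problem for every span that does not fit the sequence."""
    problems = []
    for span in spans:
        if span.start > span.end:
            problems.append(f"{span.label}: start {span.start} is after end {span.end}")
        elif span.start < 1 or span.end > sequence_length:
            problems.append(
                f"{span.label}: range {span.start}-{span.end} is outside 1-{sequence_length}"
            )
    return problems


spans = [AnnotationSpan("fluorescent_protein", 50, 150)]
print(check_spans(spans, sequence_length=200))  # []
print(check_spans(spans, sequence_length=120))
```

An empty list means every span fits; anything else can be reported or raised before spending generation steps on an inconsistent prompt.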
Iteratively refine protein designs using ESM3's chain-of-thought generation approach.
from esm.sdk.api import ESMProtein, GenerationConfig
# Multi-step refinement
protein = ESMProtein(sequence="MPRT" + "_" * 100 + "KEND")
# Step 1: Generate initial structure
config = GenerationConfig(track="structure", num_steps=50)
protein = model.generate(protein, config)
# Step 2: Refine sequence based on structure
config = GenerationConfig(track="sequence", num_steps=50, temperature=0.5)
protein = model.generate(protein, config)
# Step 3: Predict function
config = GenerationConfig(track="function", num_steps=20)
protein = model.generate(protein, config)
Process multiple proteins efficiently using Forge's async executor.
import asyncio
from esm.sdk.api import ESMProtein, GenerationConfig
from esm.sdk.forge import ESM3ForgeInferenceClient
client = ESM3ForgeInferenceClient(model="esm3-medium-2024-08", token="<token>")
# Async batch processing
async def batch_generate(proteins_list):
    # Build one request per protein, then await them all together
    tasks = [
        client.async_generate(protein, GenerationConfig(track="sequence"))
        for protein in proteins_list
    ]
    return await asyncio.gather(*tasks)
# Execute
proteins = [ESMProtein(sequence=f"MPRT{'_' * 50}KEND") for _ in range(10)]
results = asyncio.run(batch_generate(proteins))
See references/forge-api.md for detailed Forge API documentation, authentication, rate limits, and batch processing patterns.
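Launching every request at once can run into rate limits on a remote service. One way to cap the number of in-flight requests is an `asyncio.Semaphore`, sketched below with the standard library only. `fake_generate` is a stand-in for a remote call so the example runs offline; it is not a Forge function, and `gather_bounded` is a hypothetical helper name.

```python
import asyncio
from typing import Awaitable, Callable, List


async def gather_bounded(
    factories: List[Callable[[], Awaitable[str]]], max_concurrency: int
) -> List[str]:
    """Run all coroutines, keeping at most max_concurrency in flight at a time."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def run_one(factory: Callable[[], Awaitable[str]]) -> str:
        async with semaphore:
            return await factory()

    # gather returns results in the same order as the inputs
    return await asyncio.gather(*(run_one(f) for f in factories))


async def fake_generate(prompt: str) -> str:
    """Stand-in for a remote generation call: fills every mask with alanine."""
    await asyncio.sleep(0.01)
    return prompt.replace("_", "A")


prompts = [f"MPRT{'_' * 3}KEND" for _ in range(5)]
# Bind each prompt as a default argument so every factory keeps its own value
factories = [lambda p=p: fake_generate(p) for p in prompts]
results = asyncio.run(gather_bounded(factories, max_concurrency=2))
print(results)
```

Factories are used instead of ready-made coroutines so that no request starts until the semaphore admits it. In real use, each factory would wrap the client's asynchronous generation call.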
ESM3 Models (Generative):
esm3-sm-open-v1 (1.4B) - Open weights, local usage, good for experimentation
esm3-medium-2024-08 (7B) - Best balance of quality and speed (Forge only)
esm3-large-2024-03 (98B) - Highest quality, slower (Forge only)
ESM C Models (Embeddings):
esmc-300m (30 layers) - Lightweight, fast inference
esmc-600m (36 layers) - Balanced performance
esmc-6b (80 layers) - Maximum representation quality
Selection criteria:
esm3-sm-open-v1 or esmc-300m
esm3-medium-2024-08 via Forge
esm3-large-2024-03 or esmc-6b
Basic installation:
uv pip install esm
With Flash Attention (recommended for faster inference):
uv pip install esm
uv pip install flash-attn --no-build-isolation
For Forge API access:
uv pip install esm # SDK includes Forge client
No additional dependencies are needed. Obtain a Forge API token at https://forge.evolutionaryscale.ai.
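Rather than pasting the token into source code, it can be read from the environment at startup. The sketch below is one way to do that; the variable name `ESM_FORGE_TOKEN` and the helper `load_forge_token` are assumptions made for this example, not names defined by the ESM SDK.

```python
import os


def load_forge_token(env_var: str = "ESM_FORGE_TOKEN") -> str:
    """Read the Forge API token from the environment, failing early if it is missing."""
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(f"set the {env_var} environment variable to your Forge API token")
    return token
```

The returned string can then be passed as the `token` argument when constructing the Forge client, which keeps the secret out of version control.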
For detailed examples and complete workflows, see references/workflows.md.
This skill includes comprehensive reference documentation:
references/esm3-api.md - ESM3 model architecture, API reference, generation parameters, and multimodal prompting
references/esm-c-api.md - ESM C model details, embedding strategies, and performance optimization
references/forge-api.md - Forge platform documentation, authentication, batch processing, and deployment
references/workflows.md - Complete examples and common workflow patterns
These references contain detailed API specifications, parameter descriptions, and advanced usage patterns. Load them as needed for specific tasks.
For generation tasks:
esm3-sm-open-v1)
For embedding tasks:
For production deployment:
ESM is designed for beneficial applications in protein engineering, drug discovery, and scientific research. Follow the Responsible Biodesign Framework (https://responsiblebiodesign.ai/) when designing novel proteins. Consider biosafety and ethical implications of protein designs before experimental validation.
First seen: Jan 21, 2026