pdb-database by davila7/claude-code-templates
npx skills add https://github.com/davila7/claude-code-templates --skill pdb-database
RCSB PDB is the worldwide repository for 3D structural data of biological macromolecules. Use this skill to search for structures, retrieve coordinates and metadata, and run sequence and structure similarity searches across 200,000+ experimentally determined structures and computed models.
This skill should be used when:
Find PDB entries using various search criteria:
Text Search: Search by protein name, keywords, or descriptions
from rcsbapi.search import TextQuery
query = TextQuery("hemoglobin")
results = list(query())
print(f"Found {len(results)} structures")
Attribute Search: Query specific properties (organism, resolution, method, etc.)
from rcsbapi.search import AttributeQuery
from rcsbapi.search.attrs import rcsb_entity_source_organism
# Find human protein structures
query = AttributeQuery(
    attribute=rcsb_entity_source_organism.scientific_name,
    operator="exact_match",
    value="Homo sapiens"
)
results = list(query())
Sequence Similarity: Find structures similar to a given sequence
from rcsbapi.search import SequenceQuery
query = SequenceQuery(
    value="MTEYKLVVVGAGGVGKSALTIQLIQNHFVDEYDPTIEDSYRKQVVIDGETCLLDILDTAGQEEYSAMRDQYMRTGEGFLCVFAINNTKSFEDIHHYREQIKRVKDSEDVPMVLVGNKCDLPSRTVDTKQAQDLARSYGIPFIETSAKTRQGVDDAFYTLVREIRKHKEKMSKDGKKKKKKSKTKCVIM",
    evalue_cutoff=0.1,
    identity_cutoff=0.9
)
results = list(query())
Structure Similarity: Find structures with similar 3D geometry
from rcsbapi.search import StructSimilarityQuery
query = StructSimilarityQuery(
    structure_search_type="entry",
    entry_id="4HHB"  # Hemoglobin
)
results = list(query())
Combining Queries: Use logical operators to build complex searches
from rcsbapi.search import AttributeQuery
from rcsbapi.search.attrs import rcsb_entity_source_organism, rcsb_entry_info
# High-resolution human proteins
query1 = AttributeQuery(
    attribute=rcsb_entity_source_organism.scientific_name,
    operator="exact_match",
    value="Homo sapiens"
)
query2 = AttributeQuery(
    attribute=rcsb_entry_info.resolution_combined,
    operator="less",
    value=2.0
)
combined_query = query1 & query2  # AND operation: both conditions must hold
results = list(combined_query())
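To make the AND combination above more concrete, here is a minimal sketch of the JSON request body that such a query roughly corresponds to when sent to the RCSB Search API directly. The helper names (`attribute_node`, `and_group`, `build_payload`) are hypothetical, and the payload layout is an assumption based on the public Search API v2 request format, not something the client library is guaranteed to emit verbatim.

```python
import json

# Public Search API endpoint (assumed here; the payload is only built, not sent).
SEARCH_URL = "https://search.rcsb.org/rcsbsearch/v2/query"


def attribute_node(attribute, operator, value):
    """Build one terminal node for the attribute ("text") search service."""
    return {
        "type": "terminal",
        "service": "text",
        "parameters": {"attribute": attribute, "operator": operator, "value": value},
    }


def and_group(*nodes):
    """Combine nodes with a logical AND, mirroring `query1 & query2`."""
    return {"type": "group", "logical_operator": "and", "nodes": list(nodes)}


def build_payload(query, return_type="entry"):
    """Wrap a query tree in the top-level request object."""
    return {"query": query, "return_type": return_type}


payload = build_payload(
    and_group(
        attribute_node(
            "rcsb_entity_source_organism.scientific_name", "exact_match", "Homo sapiens"
        ),
        attribute_node("rcsb_entry_info.resolution_combined", "less", 2.0),
    )
)
print(json.dumps(payload, indent=2))
```

Seeing the tree form can help when debugging a combined query: each `AttributeQuery` becomes a terminal node, and each `&` adds a group around its operands.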
Access detailed information about specific PDB entries:
Basic Entry Information:
from rcsbapi.data import Schema, fetch
# Get entry-level data
entry_data = fetch("4HHB", schema=Schema.ENTRY)
print(entry_data["struct"]["title"])
print(entry_data["exptl"][0]["method"])
Polymer Entity Information:
# Get protein/nucleic acid information
entity_data = fetch("4HHB_1", schema=Schema.POLYMER_ENTITY)
print(entity_data["entity_poly"]["pdbx_seq_one_letter_code"])
Using GraphQL for Flexible Queries:
from rcsbapi.data import fetch
# Custom GraphQL query
query = """
{
  entry(entry_id: "4HHB") {
    struct {
      title
    }
    exptl {
      method
    }
    rcsb_entry_info {
      resolution_combined
      deposited_atom_count
    }
  }
}
"""
data = fetch(query_type="graphql", query=query)
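If the same GraphQL query is instead posted over HTTP to the Data API endpoint, the reply follows the usual GraphQL envelope with `data` and, on failure, `errors`. The sketch below shows one way to unwrap that envelope. `unwrap_graphql` is a hypothetical helper, and the sample response is illustrative data, not a real server reply.

```python
def unwrap_graphql(response_json, root="entry"):
    """Return the node under data[root], raising if the server reported errors."""
    errors = response_json.get("errors")
    if errors:
        messages = "; ".join(
            str(err.get("message", err)) if isinstance(err, dict) else str(err)
            for err in errors
        )
        raise RuntimeError(f"GraphQL errors: {messages}")
    node = (response_json.get("data") or {}).get(root)
    if node is None:
        raise LookupError(f"response contains no '{root}' object")
    return node


# Illustrative response shaped like the query above (values are placeholders).
sample = {"data": {"entry": {"struct": {"title": "EXAMPLE TITLE"},
                             "exptl": [{"method": "X-RAY DIFFRACTION"}]}}}
entry = unwrap_graphql(sample)
print(entry["struct"]["title"])
```

Raising on `errors` early keeps a malformed query from surfacing later as a confusing `KeyError`.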
Retrieve coordinate files in various formats:
Download Methods:
https://files.rcsb.org/download/{PDB_ID}.pdb
https://files.rcsb.org/download/{PDB_ID}.cif
https://files.rcsb.org/download/{PDB_ID}.pdb1 (for assembly 1)
Example Download:
import requests
pdb_id = "4HHB"
# Download PDB format
pdb_url = f"https://files.rcsb.org/download/{pdb_id}.pdb"
response = requests.get(pdb_url, timeout=30)
response.raise_for_status()  # fail loudly on 404 and other HTTP errors
with open(f"{pdb_id}.pdb", "w") as f:
    f.write(response.text)
# Download mmCIF format
cif_url = f"https://files.rcsb.org/download/{pdb_id}.cif"
response = requests.get(cif_url, timeout=30)
response.raise_for_status()
with open(f"{pdb_id}.cif", "w") as f:
    f.write(response.text)
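The URL patterns listed above can be wrapped in a small helper so that the format choice is validated before any request is made. This is a sketch only: `build_download_url` and the `SUFFIXES` keys are hypothetical names, and only the three suffixes shown in this section are covered.

```python
DOWNLOAD_BASE = "https://files.rcsb.org/download"

# Suffixes taken from the URL patterns in this section; the keys are made up here.
SUFFIXES = {"pdb": ".pdb", "cif": ".cif", "assembly1": ".pdb1"}


def build_download_url(pdb_id, fmt="cif"):
    """Return the download URL for an entry in the requested format."""
    if fmt not in SUFFIXES:
        raise ValueError(f"unsupported format {fmt!r}; choose from {sorted(SUFFIXES)}")
    return f"{DOWNLOAD_BASE}/{pdb_id.strip().upper()}{SUFFIXES[fmt]}"


for fmt in ("pdb", "cif", "assembly1"):
    print(build_download_url("4hhb", fmt))
```

The returned string can be passed to `requests.get` exactly as in the download example.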
Common operations with retrieved structures:
Parse and Analyze Coordinates: Use BioPython or other structural biology libraries to work with downloaded files:
from Bio.PDB import PDBParser
parser = PDBParser()
structure = parser.get_structure("protein", "4HHB.pdb")
# Iterate through atoms (structure > model > chain > residue > atom)
for model in structure:
    for chain in model:
        for residue in chain:
            for atom in residue:
                print(atom.get_coord())
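For quick checks where BioPython is not installed, coordinates can also be read straight from the fixed-width columns of a legacy PDB file. The sketch below is a dependency-free alternative to the BioPython loop, not a replacement for it: it ignores alternate locations, multiple models, and mmCIF files. `make_atom_line` is a hypothetical helper that only fabricates two demo records.

```python
def atom_coordinates(pdb_lines):
    """Yield (x, y, z) tuples from ATOM/HETATM records (columns 31-54)."""
    for line in pdb_lines:
        if line.startswith(("ATOM", "HETATM")):
            yield (float(line[30:38]), float(line[38:46]), float(line[46:54]))


def centroid(coords):
    """Return the arithmetic mean position of the given coordinates."""
    coords = list(coords)
    if not coords:
        raise ValueError("no atom records found")
    count = len(coords)
    return tuple(sum(point[axis] for point in coords) / count for axis in range(3))


def make_atom_line(serial, name, res, chain, seq, x, y, z):
    """Format a minimal ATOM record for the demo (hypothetical helper)."""
    return f"ATOM  {serial:>5} {name:<4} {res:>3} {chain}{seq:>4}    {x:8.3f}{y:8.3f}{z:8.3f}"


sample = [
    make_atom_line(1, "N", "VAL", "A", 1, 0.0, 0.0, 0.0),
    make_atom_line(2, "CA", "VAL", "A", 1, 2.0, 4.0, 6.0),
    "TER",
]
print(centroid(atom_coordinates(sample)))
```

With a downloaded file, the same functions accept an open file handle, for example `centroid(atom_coordinates(open("4HHB.pdb")))`.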
Extract Metadata:
from rcsbapi.data import fetch, Schema
# Get experimental details
data = fetch("4HHB", schema=Schema.ENTRY)
resolution = data.get("rcsb_entry_info", {}).get("resolution_combined")
method = data.get("exptl", [{}])[0].get("method")
deposition_date = data.get("rcsb_accession_info", {}).get("deposit_date")
print(f"Resolution: {resolution} Å")
print(f"Method: {method}")
print(f"Deposited: {deposition_date}")
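The chained `.get(...)` calls above guard against missing keys, but an expression such as `data.get("exptl", [{}])[0]` still raises `IndexError` when the key is present with an empty list. A small lookup helper is one way to cover both cases. `dig` is a hypothetical name, and the sample dictionary is illustrative data shaped like the fields used above.

```python
def dig(data, *path, default=None):
    """Walk nested dicts and lists along path, returning default on any miss."""
    current = data
    for key in path:
        if isinstance(current, dict):
            current = current.get(key)
        elif (
            isinstance(current, list)
            and isinstance(key, int)
            and -len(current) <= key < len(current)
        ):
            current = current[key]
        else:
            return default
        if current is None:
            return default
    return current


# Illustrative entry data; note the empty "exptl" list.
entry = {"rcsb_entry_info": {"resolution_combined": [1.74]}, "exptl": []}

print(dig(entry, "rcsb_entry_info", "resolution_combined", 0))
print(dig(entry, "exptl", 0, "method", default="unknown"))
print(dig(entry, "rcsb_accession_info", "deposit_date"))
```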
Process multiple structures efficiently:
from rcsbapi.data import fetch, Schema
pdb_ids = ["4HHB", "1MBN", "1GZX"]  # Hemoglobin, myoglobin, etc.
results = {}
for pdb_id in pdb_ids:
    try:
        data = fetch(pdb_id, schema=Schema.ENTRY)
        results[pdb_id] = {
            "title": data["struct"]["title"],
            "resolution": data.get("rcsb_entry_info", {}).get("resolution_combined"),
            "organism": data.get("rcsb_entity_source_organism", [{}])[0].get("scientific_name")
        }
    except Exception as e:
        # Report the failure and continue with the remaining IDs
        print(f"Error fetching {pdb_id}: {e}")
# Display results
for pdb_id, info in results.items():
    print(f"\n{pdb_id}: {info['title']}")
    print(f"  Resolution: {info['resolution']} Å")
    print(f"  Organism: {info['organism']}")
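The loop above issues one request per ID. For longer lists, one possible alternative is to ask the Data API's GraphQL endpoint for several entries at once and to split the ID list into batches. The sketch below only builds the query strings. `chunked` and `build_entries_query` are hypothetical helpers, the batch size is arbitrary, and the `entries(entry_ids: [...])` root field is assumed from the public GraphQL schema.

```python
import json

GRAPHQL_URL = "https://data.rcsb.org/graphql"  # assumed endpoint; nothing is sent here


def chunked(items, size):
    """Yield consecutive slices of items with at most size elements each."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def build_entries_query(pdb_ids):
    """Build one GraphQL query that requests title and resolution for many entries."""
    ids = json.dumps([pdb_id.upper() for pdb_id in pdb_ids])
    return (
        "{ entries(entry_ids: %s) { rcsb_id struct { title } "
        "rcsb_entry_info { resolution_combined } } }" % ids
    )


pdb_ids = ["4HHB", "1MBN", "1GZX"]
for batch in chunked(pdb_ids, 2):
    print(build_entries_query(batch))
```

Each query string could then be posted with `requests.post(GRAPHQL_URL, json={"query": query})`, keeping the per-batch `try`/`except` from the loop above.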
Install the official RCSB PDB Python API client:
# Current recommended package
uv pip install rcsb-api
# For legacy code (deprecated, use rcsb-api instead)
uv pip install rcsbsearchapi
The rcsb-api package provides unified access to both Search and Data APIs through the rcsbapi.search and rcsbapi.data modules.
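After installing, a quick way to confirm that the package is importable in the active environment is to look up its top-level module without importing it. This is a minimal sketch: `is_installed` is a hypothetical helper, and `rcsbapi` is the module name used by the imports in this document.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be found in the current environment."""
    return importlib.util.find_spec(module_name) is not None


for name in ("rcsbapi", "requests", "Bio"):
    status = "available" if is_installed(name) else "missing"
    print(f"{name}: {status}")
```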
PDB ID: Unique 4-character identifier (e.g., "4HHB") for each structure entry. AlphaFold and ModelArchive entries start with "AF_" or "MA_" prefixes.
mmCIF/PDBx: Modern file format that uses key-value structure, replacing legacy PDB format for large structures.
Biological Assembly: The functional form of a macromolecule, which may contain multiple copies of chains from the asymmetric unit.
Resolution: Measure of detail in crystallographic structures (lower values = higher detail). Typical range: 1.5-3.5 Å for high-quality structures.
Entity: A unique molecular component in a structure (protein chain, DNA, ligand, etc.).
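The identifier conventions above can be turned into a small classifier, which is handy before choosing between entry-level and entity-level lookups. This is a sketch under stated assumptions: `classify_identifier` is a hypothetical helper, a PDB ID is taken to be any four alphanumeric characters as described above, and the `<ID>_<number>` entity form is inferred from the "4HHB_1" example used earlier in this document.

```python
import re


def classify_identifier(identifier):
    """Classify an identifier as a PDB entry, polymer entity, or computed model."""
    ident = identifier.strip().upper()
    if ident.startswith("AF_"):
        return "alphafold"
    if ident.startswith("MA_"):
        return "modelarchive"
    if re.fullmatch(r"[0-9A-Z]{4}", ident):
        return "pdb_entry"
    if re.fullmatch(r"[0-9A-Z]{4}_[0-9]+", ident):
        return "polymer_entity"
    return "unknown"


for example in ("4HHB", "4hhb_1", "AF_EXAMPLE", "MA_EXAMPLE", "hemoglobin"):
    print(example, "->", classify_identifier(example))
```

The `AF_EXAMPLE` and `MA_EXAMPLE` strings are placeholders that only exercise the prefix check; they are not real model identifiers.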
This skill includes reference documentation in the references/ directory:
Comprehensive API documentation covering:
Use this reference when you need in-depth information about API capabilities, complex query construction, or detailed data schema information.