zinc-database by davila7/claude-code-templates
npx skills add https://github.com/davila7/claude-code-templates --skill zinc-database
ZINC is a freely accessible repository of 230M+ purchasable compounds maintained by UCSF. It supports lookup by ZINC ID or SMILES, similarity searches, download of 3D-ready structures for docking, and analog discovery for virtual screening and drug discovery.
This skill should be used when:
ZINC has evolved through multiple versions:
This skill primarily focuses on ZINC22, the most current and comprehensive version.
Primary access point: https://zinc.docking.org/
Interactive searching: https://cartblanche22.docking.org/
All ZINC22 searches can be performed programmatically via the CartBlanche22 API:
Base URL: https://cartblanche22.docking.org/
All API endpoints return data in text or JSON format with customizable fields.
Retrieve specific compounds using their ZINC identifiers.
Web interface: https://cartblanche22.docking.org/search/zincid
API endpoint:
curl "https://cartblanche22.docking.org/substances.txt:zinc_id=ZINC000000000001&output_fields=smiles,zinc_id"
Multiple IDs:
curl "https://cartblanche22.docking.org/substances.txt:zinc_id=ZINC000000000001,ZINC000000000002&output_fields=smiles,zinc_id,tranche"
Response fields: zinc_id, smiles, sub_id, supplier_code, catalogs, tranche (includes H-count, LogP, MW, phase)
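The response fields above arrive as plain text. As a minimal sketch, assuming the `.txt` endpoints return tab-separated text with a header row (the same assumption the pandas examples in this document make), the response can be turned into a list of records with the standard library. The helper name `parse_tsv_response` and the sample text are hypothetical and only illustrate the shape of the data.

```python
import csv
from io import StringIO


def parse_tsv_response(text):
    """Turn tab-separated text with a header row into a list of dicts."""
    reader = csv.DictReader(StringIO(text), delimiter="\t")
    return [dict(row) for row in reader]


# Hypothetical response text, for illustration only (not real ZINC output).
sample = "zinc_id\tsmiles\ttranche\nZINC000000000001\tCCO\tH01P000M046-0\n"
records = parse_tsv_response(sample)
print(records[0]["zinc_id"], records[0]["smiles"])
```

Each record is keyed by the names requested in output_fields, so downstream code does not depend on column order.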
Find compounds by chemical structure using SMILES notation, with optional distance parameters for analog searching.
Web interface: https://cartblanche22.docking.org/search/smiles
API endpoint:
curl "https://cartblanche22.docking.org/smiles.txt:smiles=<SMILES>&dist=4&adist=4"
Parameters:
smiles: Query SMILES string (URL-encoded if necessary)
dist: Tanimoto distance threshold (default: 0 for exact match)
adist: Alternative distance parameter for broader searches (default: 0)
output_fields: Comma-separated list of desired output fields
Example - Exact match:
curl "https://cartblanche22.docking.org/smiles.txt:smiles=c1ccccc1"
Example - Similarity search:
curl "https://cartblanche22.docking.org/smiles.txt:smiles=c1ccccc1&dist=3&output_fields=zinc_id,smiles,tranche"
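The parameter list notes that the SMILES string may need URL encoding, since characters such as `#`, `=`, `+` and `/` have their own meaning in a URL. The sketch below shows one way to do that, assuming the colon-style URL pattern used in the examples above and assuming the server decodes percent-encoded values (neither is verified here). `build_smiles_url` is a hypothetical helper name.

```python
from urllib.parse import quote

BASE_URL = "https://cartblanche22.docking.org"


def build_smiles_url(smiles, dist=0, adist=0, output_fields=("zinc_id", "smiles")):
    """Build a smiles.txt query URL with the SMILES percent-encoded."""
    # safe="" encodes every reserved character, including '#', '/', '+', '=' and '&',
    # which would otherwise be misread as URL syntax.
    encoded = quote(smiles, safe="")
    fields = ",".join(output_fields)
    return (
        f"{BASE_URL}/smiles.txt:smiles={encoded}"
        f"&dist={dist}&adist={adist}&output_fields={fields}"
    )


# Acetonitrile contains '#', which a URL would otherwise treat as a fragment marker.
print(build_smiles_url("CC#N", dist=2))
```

Purely aromatic or aliphatic SMILES such as `c1ccccc1` pass through unchanged, so encoding everything by default costs nothing for simple queries.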
Query compounds from specific chemical suppliers or retrieve all molecules from particular catalogs.
Web interface: https://cartblanche22.docking.org/search/catitems
API endpoint:
curl "https://cartblanche22.docking.org/catitems.txt:catitem_id=SUPPLIER-CODE-123"
Use cases:
Generate random compound sets for screening or benchmarking purposes.
Web interface: https://cartblanche22.docking.org/search/random
API endpoint:
curl "https://cartblanche22.docking.org/substance/random.txt:count=100"
Parameters:
count: Number of random compounds to retrieve (default: 100)
subset: Filter by subset (e.g., 'lead-like', 'drug-like', 'fragment')
output_fields: Customize returned data fields
Example - Random lead-like molecules:
curl "https://cartblanche22.docking.org/substance/random.txt:count=1000&subset=lead-like&output_fields=zinc_id,smiles,tranche"
1. Define search criteria based on target properties or the desired chemical space.
2. Query ZINC22 using the appropriate search method:
curl "https://cartblanche22.docking.org/substance/random.txt:count=10000&subset=drug-like&output_fields=zinc_id,smiles,tranche" > docking_library.txt
3. Parse the results to extract ZINC IDs and SMILES:
import pandas as pd
# Load results
df = pd.read_csv('docking_library.txt', sep='\t')
# Filter by properties in tranche data
# Tranche format: H##P###M###-phase
# H = H-bond donors, P = LogP*10, M = MW
4. Download 3D structures for docking, either by ZINC ID or from the file repositories.
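Many 3D preparation and docking tools accept a `.smi` file with one "SMILES name" pair per line. The following sketch converts the downloaded text into that layout. It assumes the file is tab-separated with `zinc_id` and `smiles` columns, as requested through output_fields in step 2. `tsv_to_smi_lines` is a hypothetical helper and the sample rows are made up.

```python
import csv
from io import StringIO


def tsv_to_smi_lines(text):
    """Convert tab-separated text with 'smiles' and 'zinc_id' columns to .smi lines."""
    reader = csv.DictReader(StringIO(text), delimiter="\t")
    lines = []
    for row in reader:
        smiles = (row.get("smiles") or "").strip()
        zinc_id = (row.get("zinc_id") or "").strip()
        if smiles and zinc_id:  # skip rows missing either value
            lines.append(f"{smiles} {zinc_id}")
    return lines


# Made-up rows for illustration; the second one has no SMILES and is dropped.
sample = (
    "zinc_id\tsmiles\ttranche\n"
    "ZINC000000000001\tCCO\tH01P000M046-0\n"
    "ZINC000000000002\t\tH01P000M046-0\n"
)
smi_lines = tsv_to_smi_lines(sample)
print("\n".join(smi_lines))
```

Writing `"\n".join(smi_lines)` to a file such as `docking_library.smi` then gives a plain input list for whichever structure preparation tool is used.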
1. Obtain the SMILES of the hit compound:
hit_smiles = "CC(C)Cc1ccc(cc1)C(C)C(=O)O"  # Example: ibuprofen
2. Perform a similarity search with a distance threshold:
curl "https://cartblanche22.docking.org/smiles.txt:smiles=CC(C)Cc1ccc(cc1)C(C)C(=O)O&dist=5&output_fields=zinc_id,smiles,catalogs" > analogs.txt
3. Analyze the results to identify purchasable analogs:
import pandas as pd
analogs = pd.read_csv('analogs.txt', sep='\t')
print(f"Found {len(analogs)} analogs")
print(analogs[['zinc_id', 'smiles', 'catalogs']].head(10))
4. Retrieve 3D structures for the most promising analogs.
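One possible reading of "purchasable" in step 3 is "listed in at least one catalog". The sketch below keeps only such records and puts the most widely stocked first. It assumes the `catalogs` field arrives as a comma-separated string, which this document does not specify, so the splitting logic may need adjusting. The function name and the vendor names are hypothetical.

```python
def purchasable(records):
    """Keep records that list at least one catalog, most widely stocked first."""

    def catalog_count(rec):
        # Assumption: 'catalogs' is a comma-separated string; adjust if the API differs.
        return len([c for c in (rec.get("catalogs") or "").split(",") if c.strip()])

    kept = [r for r in records if catalog_count(r) > 0]
    return sorted(kept, key=catalog_count, reverse=True)


# Made-up records for illustration.
analog_records = [
    {"zinc_id": "ZINC000000000001", "catalogs": "vendorA,vendorB"},
    {"zinc_id": "ZINC000000000002", "catalogs": ""},
    {"zinc_id": "ZINC000000000003", "catalogs": "vendorC"},
]
ranked_ids = [r["zinc_id"] for r in purchasable(analog_records)]
print(ranked_ids)
```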
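1. Compile a list of ZINC IDs from literature, databases, or previous screens:
zinc_ids = [
    "ZINC000000000001",
    "ZINC000000000002",
    "ZINC000000000003"
]
zinc_ids_str = ",".join(zinc_ids)
2. Query the ZINC22 API:
curl "https://cartblanche22.docking.org/substances.txt:zinc_id=ZINC000000000001,ZINC000000000002&output_fields=zinc_id,smiles,supplier_code,catalogs"
3. Process the results for downstream analysis or purchasing.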
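For long ID lists, a single comma-joined URL can become unwieldy. A simple way to handle this is to split the list into batches and issue one request per batch. The sketch below only builds the URLs. The batch size of 100 is an arbitrary assumption (this document states no server-side limit), and `batched_id_urls` is a hypothetical helper.

```python
BASE_URL = "https://cartblanche22.docking.org"


def batched_id_urls(zinc_ids, batch_size=100,
                    output_fields="zinc_id,smiles,supplier_code,catalogs"):
    """Yield one substances.txt URL per batch of ZINC IDs."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(zinc_ids), batch_size):
        batch = zinc_ids[start:start + batch_size]
        joined = ",".join(batch)
        yield f"{BASE_URL}/substances.txt:zinc_id={joined}&output_fields={output_fields}"


# Five placeholder IDs in the same zero-padded style used throughout this document.
ids = [f"ZINC{n:012d}" for n in range(1, 6)]
urls = list(batched_id_urls(ids, batch_size=2))
print(len(urls))
print(urls[0])
```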
1. Select subset parameters based on screening goals (for example 'lead-like', 'drug-like', or 'fragment').
2. Generate a random sample:
curl "https://cartblanche22.docking.org/substance/random.txt:count=5000&subset=lead-like&output_fields=zinc_id,smiles,tranche" > chemical_space_sample.txt
3. Analyze chemical diversity and prepare for virtual screening.
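A rough first look at diversity is to count how many sampled compounds fall into each tranche, since a sample concentrated in a few tranches covers little property space. This is only a coarse proxy, not a substitute for fingerprint-based diversity analysis. The helper name and the tranche codes below are illustrative.

```python
from collections import Counter


def tranche_histogram(tranches):
    """Count how many compounds fall into each tranche code, ignoring blanks."""
    return Counter(t for t in tranches if t)


# Illustrative tranche codes in the format described in this document.
sample_tranches = ["H05P035M400-0", "H05P035M400-0", "H03P020M300-0", ""]
hist = tranche_histogram(sample_tranches)
for code, n in hist.most_common():
    print(code, n)
```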
Customize API responses with the output_fields parameter:
Available fields:
zinc_id: ZINC identifier
smiles: SMILES string representation
sub_id: Internal substance ID
supplier_code: Vendor catalog number
catalogs: List of suppliers offering the compound
tranche: Encoded molecular properties (H-count, LogP, MW, reactivity phase)
Example:
curl "https://cartblanche22.docking.org/substances.txt:zinc_id=ZINC000000000001&output_fields=zinc_id,smiles,catalogs,tranche"
ZINC organizes compounds into "tranches" based on molecular properties:
Format: H##P###M###-phase
Example tranche: H05P035M400-0
Use tranche data to filter compounds by drug-likeness criteria.
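As a sketch of such a filter, the function below applies three Lipinski-style thresholds (at most 5 H-bond donors, LogP at most 5, MW at most 500) directly to a tranche code. It relies entirely on the tranche encoding as described in this document (H = H-bond donors, P = LogP*10, M = MW). If the live ZINC22 encoding differs, the regular expression and scaling need to change. The function name and default thresholds are choices made for this example.

```python
import re

# Pattern for the H##P###M###-phase layout described in this document.
TRANCHE_RE = re.compile(r"H(\d+)P(\d+)M(\d+)-(\d+)")


def passes_tranche_filter(tranche, max_donors=5, max_logp=5.0, max_mw=500):
    """Return True if the tranche code satisfies all three thresholds."""
    match = TRANCHE_RE.fullmatch(tranche)
    if match is None:
        return False  # unparseable codes are rejected rather than guessed at
    donors, logp_x10, mw, _phase = (int(g) for g in match.groups())
    return donors <= max_donors and logp_x10 / 10.0 <= max_logp and mw <= max_mw


print(passes_tranche_filter("H05P035M400-0"))  # True: 5 donors, LogP 3.5, MW 400
print(passes_tranche_filter("H07P062M650-0"))  # False: every threshold is exceeded
```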
For molecular docking, 3D structures are available via file repositories:
File repository: https://files.docking.org/zinc22/
Structures are organized by tranches and available in multiple formats:
Refer to ZINC documentation at https://wiki.docking.org for downloading protocols and batch access methods.
import subprocess


def query_zinc_by_id(zinc_id, output_fields="zinc_id,smiles,catalogs"):
    """Query ZINC22 by ZINC ID."""
    url = f"https://cartblanche22.docking.org/substances.txt:zinc_id={zinc_id}&output_fields={output_fields}"
    # Shell out to curl and return the raw response text
    result = subprocess.run(['curl', url], capture_output=True, text=True)
    return result.stdout


def search_by_smiles(smiles, dist=0, adist=0, output_fields="zinc_id,smiles"):
    """Search ZINC22 by SMILES with optional distance parameters."""
    url = f"https://cartblanche22.docking.org/smiles.txt:smiles={smiles}&dist={dist}&adist={adist}&output_fields={output_fields}"
    result = subprocess.run(['curl', url], capture_output=True, text=True)
    return result.stdout


def get_random_compounds(count=100, subset=None, output_fields="zinc_id,smiles,tranche"):
    """Get random compounds from ZINC22."""
    url = f"https://cartblanche22.docking.org/substance/random.txt:count={count}&output_fields={output_fields}"
    # Only append the subset filter when one was requested
    if subset:
        url += f"&subset={subset}"
    result = subprocess.run(['curl', url], capture_output=True, text=True)
    return result.stdout
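The helpers above depend on a `curl` binary being installed and return an empty string when the request fails. As an alternative sketch, the same fetch can be done with the standard library, which raises an exception on network or HTTP errors instead. `fetch_text` is a hypothetical name, and the `opener` parameter exists only so the function can be exercised without network access.

```python
import urllib.request


def fetch_text(url, timeout=30, opener=urllib.request.urlopen):
    """Fetch a URL and return the body decoded as UTF-8.

    Network and HTTP errors propagate as exceptions from the opener.
    """
    with opener(url, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

A call such as `fetch_text("https://cartblanche22.docking.org/substance/random.txt:count=10")` would then stand in for the `subprocess.run(['curl', url], ...)` step, assuming the endpoint responds to a plain GET request the same way it does to curl.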
import re
from io import StringIO

import pandas as pd

# Query ZINC and parse as DataFrame; request the tranche field so it can be decoded below
result = query_zinc_by_id("ZINC000000000001", output_fields="zinc_id,smiles,tranche")
df = pd.read_csv(StringIO(result), sep='\t')


# Extract tranche properties
def parse_tranche(tranche_str):
    """Parse ZINC tranche code to extract properties."""
    # Format: H##P###M###-phase
    match = re.match(r'H(\d+)P(\d+)M(\d+)-(\d+)', tranche_str)
    if match:
        return {
            'h_donors': int(match.group(1)),
            'logP': int(match.group(2)) / 10.0,
            'mw': int(match.group(3)),
            'phase': int(match.group(4))
        }
    return None


df['tranche_props'] = df['tranche'].apply(parse_tranche)
Comprehensive documentation including:
Consult this document for detailed technical information and advanced usage patterns.
ZINC explicitly states: "We do not guarantee the quality of any molecule for any purpose and take no responsibility for errors arising from the use of this database."
When using ZINC in publications, cite the appropriate version:
ZINC22: Tingle, B. I., et al. "ZINC-22 – A Free Multi-Billion-Scale Database of Tangible Compounds for Ligand Discovery." Journal of Chemical Information and Modeling 2023.
ZINC15: Sterling, T.; Irwin, J. J. "ZINC 15 – Ligand Discovery for Everyone." Journal of Chemical Information and Modeling 2015, 55, 2324–2337.