medchem by davila7/claude-code-templates
npx skills add https://github.com/davila7/claude-code-templates --skill medchem
Medchem is a Python library for molecular filtering and prioritization in drug discovery workflows. Apply hundreds of well-established and novel molecular filters, structural alerts, and medicinal chemistry rules to efficiently triage and prioritize compound libraries at scale. Rules and filters are context-specific—use as guidelines combined with domain expertise.
This skill should be used when:
uv pip install medchem
Apply established drug-likeness rules to molecules using the medchem.rules module.
Available Rules:
Single Rule Application:
import medchem as mc
# Apply Rule of Five to a SMILES string
smiles = "CC(=O)OC1=CC=CC=C1C(=O)O" # Aspirin
passes = mc.rules.basic_rules.rule_of_five(smiles)
# Returns: True
# Check specific rules
passes_oprea = mc.rules.basic_rules.rule_of_oprea(smiles)
passes_cns = mc.rules.basic_rules.rule_of_cns(smiles)
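To make the check above concrete, here is a minimal sketch of the thresholds commonly associated with Lipinski's Rule of Five, applied to precomputed descriptors. It does not call medchem: the `Descriptors` class, the `passes_rule_of_five` helper, and the descriptor values are illustrative assumptions, and this strict variant (all four thresholds must hold) may differ in detail from the library's implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptors:
    """Hypothetical container for precomputed molecular descriptors."""
    mw: float     # molecular weight in Da
    clogp: float  # calculated logP
    n_hbd: int    # hydrogen-bond donors
    n_hba: int    # hydrogen-bond acceptors


def passes_rule_of_five(d: Descriptors) -> bool:
    """Return True when all four classic thresholds are met."""
    return d.mw <= 500 and d.clogp <= 5 and d.n_hbd <= 5 and d.n_hba <= 10


# Approximate values for aspirin, for illustration only.
aspirin = Descriptors(mw=180.16, clogp=1.3, n_hbd=1, n_hba=3)
# A deliberately oversized, lipophilic hypothetical compound.
greasy = Descriptors(mw=712.4, clogp=6.8, n_hbd=2, n_hba=9)

print(passes_rule_of_five(aspirin))  # True
print(passes_rule_of_five(greasy))   # False
```

In practice the descriptors would come from a cheminformatics toolkit rather than being typed in by hand.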
Multiple Rules with RuleFilters:
import datamol as dm
import medchem as mc
# Load molecules
mols = [dm.to_mol(smiles) for smiles in smiles_list]
# Create filter with multiple rules
rfilter = mc.rules.RuleFilters(
    rule_list=[
        "rule_of_five",
        "rule_of_oprea",
        "rule_of_cns",
        "rule_of_leadlike_soft",
    ]
)
# Apply filters with parallelization
results = rfilter(
    mols=mols,
    n_jobs=-1,  # Use all CPU cores
    progress=True,
)
Result Format: Results are returned as dictionaries with pass/fail status and detailed information for each rule.
Detect potentially problematic structural patterns using the medchem.structural module.
Available Filters:
Common Alerts:
import datamol as dm  # needed for dm.to_mol below
import medchem as mc
# Create filter
alert_filter = mc.structural.CommonAlertsFilters()
# Check single molecule
mol = dm.to_mol("c1ccccc1")
has_alerts, details = alert_filter.check_mol(mol)
# Batch filtering with parallelization
results = alert_filter(
    mols=mol_list,
    n_jobs=-1,
    progress=True,
)
NIBR Filters:
import medchem as mc
# Apply NIBR filters
nibr_filter = mc.structural.NIBRFilters()
results = nibr_filter(mols=mol_list, n_jobs=-1)
Lilly Demerits:
import medchem as mc
# Calculate Lilly demerits
lilly = mc.structural.LillyDemeritsFilters()
results = lilly(mols=mol_list, n_jobs=-1)
# Each result includes demerit score and whether it passes (≤100 demerits)
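The cutoff mentioned in the comment above can be illustrated without the library. In a demerit scheme, each matched structural pattern contributes a penalty and a compound is rejected once the total exceeds the cutoff. The sketch below assumes the 100-demerit cutoff quoted above; the pattern names and per-pattern penalties are hypothetical and are not taken from the Lilly rule set.

```python
from typing import Dict

DEMERIT_CUTOFF = 100  # cutoff quoted in the text above


def total_demerits(pattern_hits: Dict[str, int]) -> int:
    """Sum the penalties of every pattern matched in one compound."""
    return sum(pattern_hits.values())


def passes_demerits(pattern_hits: Dict[str, int], cutoff: int = DEMERIT_CUTOFF) -> bool:
    """A compound passes when its accumulated demerits do not exceed the cutoff."""
    return total_demerits(pattern_hits) <= cutoff


# Hypothetical pattern matches and penalties for three compounds.
library = {
    "cmpd_a": {},
    "cmpd_b": {"pattern_x": 60, "pattern_y": 40},
    "cmpd_c": {"pattern_x": 60, "pattern_z": 75},
}

for name, hits in library.items():
    print(name, total_demerits(hits), passes_demerits(hits))
# cmpd_a 0 True
# cmpd_b 100 True
# cmpd_c 135 False
```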
The medchem.functional module provides convenient functions for common workflows.
Quick Filtering:
import medchem as mc
# Apply NIBR filters to a list
filter_ok = mc.functional.nibr_filter(
    mols=mol_list,
    n_jobs=-1,
)
# Apply common alerts
alert_results = mc.functional.common_alerts_filter(
    mols=mol_list,
    n_jobs=-1,
)
Identify specific chemical groups and functional groups using medchem.groups.
Available Groups:
Usage:
import medchem as mc
# Create group detector
group = mc.groups.ChemicalGroup(groups=["hinge_binders"])
# Check for matches
has_matches = group.has_match(mol_list)
# Get detailed match information
matches = group.get_matches(mol)
Access curated collections of chemical structures through medchem.catalogs.
Available Catalogs:
Usage:
import medchem as mc
# Access named catalogs
catalogs = mc.catalogs.NamedCatalogs
# Use catalog for matching
catalog = catalogs.get("functional_groups")
matches = catalog.get_matches(mol)
Calculate complexity metrics that approximate synthetic accessibility using medchem.complexity.
Common Metrics:
Usage:
import medchem as mc
# Calculate complexity
complexity_score = mc.complexity.calculate_complexity(mol)
# Filter by complexity threshold
complex_filter = mc.complexity.ComplexityFilter(max_complexity=500)
results = complex_filter(mols=mol_list)
Apply custom property-based constraints using medchem.constraints.
Example Constraints:
Usage:
import medchem as mc
# Define constraints
constraints = mc.constraints.Constraints(
    mw_range=(200, 500),
    logp_range=(-2, 5),
    tpsa_max=140,
    rotatable_bonds_max=10,
)
# Apply constraints
results = constraints(mols=mol_list, n_jobs=-1)
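The logic behind such property windows is simple enough to sketch in plain Python. The example below reuses the numeric bounds from the snippet above, but the `CONSTRAINTS` table, the `violations` helper, and the property values are assumptions made for illustration; it does not use medchem.

```python
from typing import Dict, List, Optional, Tuple

# (min, max) per property; None means unbounded on that side.
Bounds = Tuple[Optional[float], Optional[float]]

CONSTRAINTS: Dict[str, Bounds] = {
    "mw": (200, 500),
    "logp": (-2, 5),
    "tpsa": (None, 140),
    "rotatable_bonds": (None, 10),
}


def violations(props: Dict[str, float],
               constraints: Dict[str, Bounds] = CONSTRAINTS) -> List[str]:
    """Return the names of the properties that fall outside their bounds."""
    failed = []
    for name, (low, high) in constraints.items():
        value = props[name]
        too_low = low is not None and value < low
        too_high = high is not None and value > high
        if too_low or too_high:
            failed.append(name)
    return failed


# Hypothetical precomputed properties for two compounds.
ok = {"mw": 310.4, "logp": 2.1, "tpsa": 75.0, "rotatable_bonds": 4}
bad = {"mw": 150.2, "logp": 5.6, "tpsa": 90.0, "rotatable_bonds": 12}

print(violations(ok))   # []
print(violations(bad))  # ['mw', 'logp', 'rotatable_bonds']
```

Returning the list of violated properties, rather than a bare boolean, keeps the reason for each rejection available for later review.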
Use a specialized query language for complex filtering criteria.
Query Examples:
# Molecules passing Ro5 AND not having common alerts
"rule_of_five AND NOT common_alerts"
# CNS-like molecules with low complexity
"rule_of_cns AND complexity < 400"
# Leadlike molecules without Lilly demerits
"rule_of_leadlike AND lilly_demerits == 0"
Usage:
import medchem as mc
# Parse and apply query
query = mc.query.parse("rule_of_five AND NOT common_alerts")
results = query.apply(mols=mol_list, n_jobs=-1)
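As a rough illustration of how a boolean query such as the first example could be evaluated, the sketch below implements a small recursive-descent evaluator for `AND`, `OR`, `NOT`, and parentheses over precomputed flags. It is not medchem's query engine: the `evaluate` function and the flag dictionary are assumptions, and numeric comparisons such as `complexity < 400` are deliberately rejected to keep the example short.

```python
import re
from typing import Dict, List

TOKEN_RE = re.compile(r"\(|\)|[A-Za-z_][A-Za-z0-9_]*")
KEYWORDS = {"AND", "OR", "NOT"}


def tokenize(query: str) -> List[str]:
    tokens = TOKEN_RE.findall(query)
    # Reject anything the tokenizer skipped (e.g. comparison operators).
    if "".join(tokens) != re.sub(r"\s+", "", query):
        raise ValueError(f"unsupported syntax in query: {query!r}")
    return tokens


def evaluate(query: str, flags: Dict[str, bool]) -> bool:
    """Evaluate a boolean query against per-molecule flags."""
    tokens = tokenize(query)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        pos += 1

    def parse_or() -> bool:
        value = parse_and()
        while peek() == "OR":
            advance()
            rhs = parse_and()  # parse first so tokens are always consumed
            value = value or rhs
        return value

    def parse_and() -> bool:
        value = parse_not()
        while peek() == "AND":
            advance()
            rhs = parse_not()
            value = value and rhs
        return value

    def parse_not() -> bool:
        if peek() == "NOT":
            advance()
            return not parse_not()
        return parse_atom()

    def parse_atom() -> bool:
        tok = peek()
        if tok is None or tok in KEYWORDS or tok == ")":
            raise ValueError(f"unexpected token: {tok!r}")
        advance()
        if tok == "(":
            value = parse_or()
            if peek() != ")":
                raise ValueError("missing closing parenthesis")
            advance()
            return value
        return flags[tok]  # KeyError if the flag is unknown

    result = parse_or()
    if pos != len(tokens):
        raise ValueError(f"unexpected token: {tokens[pos]!r}")
    return result


flags = {"rule_of_five": True, "common_alerts": False}
print(evaluate("rule_of_five AND NOT common_alerts", flags))  # True
```

Precedence follows the usual convention: `NOT` binds tighter than `AND`, which binds tighter than `OR`.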
Filter a large compound collection to identify drug-like candidates.
import datamol as dm
import medchem as mc
import pandas as pd
# Load compound library
df = pd.read_csv("compounds.csv")
mols = [dm.to_mol(smi) for smi in df["smiles"]]
# Apply primary filters
rule_filter = mc.rules.RuleFilters(rule_list=["rule_of_five", "rule_of_veber"])
rule_results = rule_filter(mols=mols, n_jobs=-1, progress=True)
# Apply structural alerts
alert_filter = mc.structural.CommonAlertsFilters()
alert_results = alert_filter(mols=mols, n_jobs=-1, progress=True)
# Combine results
df["passes_rules"] = rule_results["pass"]
df["has_alerts"] = alert_results["has_alerts"]
df["drug_like"] = df["passes_rules"] & ~df["has_alerts"]
# Save filtered compounds
filtered_df = df[df["drug_like"]]
filtered_df.to_csv("filtered_compounds.csv", index=False)
Apply stricter criteria during lead optimization.
import medchem as mc
# Create comprehensive filter
filters = {
    "rules": mc.rules.RuleFilters(rule_list=["rule_of_leadlike_strict"]),
    "alerts": mc.structural.NIBRFilters(),
    "lilly": mc.structural.LillyDemeritsFilters(),
    "complexity": mc.complexity.ComplexityFilter(max_complexity=400),
}
# Apply all filters
results = {}
for name, filt in filters.items():
    results[name] = filt(mols=candidate_mols, n_jobs=-1)
# Identify compounds passing all filters: one boolean per compound
# (assumes each result exposes a per-compound "pass" sequence)
passes_all = [all(flags) for flags in zip(*(r["pass"] for r in results.values()))]
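Combining filters is a per-compound operation: a compound is kept only if every filter passes it. The sketch below shows that step on plain boolean lists. The filter names and values are hypothetical stand-ins for whatever pass/fail vector each filter returns, and `passes_all_filters` is an illustrative helper rather than a medchem function.

```python
from typing import Dict, List


def passes_all_filters(results: Dict[str, List[bool]]) -> List[bool]:
    """Return one boolean per compound: True only if every filter passed it."""
    lengths = {len(flags) for flags in results.values()}
    if len(lengths) > 1:
        raise ValueError("all filters must report on the same compounds")
    # zip(*...) regroups the per-filter lists into one tuple per compound.
    return [all(row) for row in zip(*results.values())]


# Hypothetical pass/fail vectors for four compounds.
results = {
    "rules":      [True, True, False, True],
    "alerts":     [True, False, True, True],
    "complexity": [True, True, True, True],
}

print(passes_all_filters(results))  # [True, False, False, True]
```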
Find molecules containing specific functional groups or scaffolds.
import medchem as mc
# Create group detector for multiple groups
group_detector = mc.groups.ChemicalGroup(
    groups=["hinge_binders", "phosphate_binders"]
)
# Screen library
matches = group_detector.get_all_matches(mol_list)
# Filter molecules with desired groups
mol_with_groups = [mol for mol, match in zip(mol_list, matches) if match]
Context Matters: Don't blindly apply filters. Understand the biological target and chemical space.
Combine Multiple Filters: Use rules, structural alerts, and domain knowledge together for better decisions.
Use Parallelization: For large datasets (>1000 molecules), use n_jobs=-1 for parallel processing.
Iterative Refinement: Start with broad filters (Ro5), then apply more specific criteria (CNS, leadlike) as needed.
Document Filtering Decisions: Track which molecules were filtered out and why for reproducibility.
Validate Results: Remember that marketed drugs often fail standard filters; use these as guidelines, not absolute rules.
Consider Prodrugs: Molecules designed as prodrugs may intentionally violate standard medicinal chemistry rules.
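Two of the practices above, iterative refinement and documenting filtering decisions, fit naturally together. The following sketch applies stages from broad to specific and records the first stage each rejected compound fails. The `run_funnel` helper, the thresholds, and the property values are hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, List, Tuple

Compound = Dict[str, float]
Stage = Tuple[str, Callable[[Compound], bool]]


def run_funnel(compounds: Dict[str, Compound],
               stages: List[Stage]) -> Tuple[List[str], Dict[str, str]]:
    """Apply stages in order and log why each rejected compound was dropped."""
    kept = dict(compounds)
    rejected: Dict[str, str] = {}
    for stage_name, predicate in stages:
        for cid in list(kept):  # copy the keys: we delete while iterating
            if not predicate(kept[cid]):
                rejected[cid] = stage_name
                del kept[cid]
    return list(kept), rejected


# Broad stage first, more specific stage second (illustrative thresholds).
stages: List[Stage] = [
    ("mw <= 500", lambda c: c["mw"] <= 500),
    ("tpsa <= 90", lambda c: c["tpsa"] <= 90),
]

library = {
    "cmpd_1": {"mw": 320.0, "tpsa": 60.0},
    "cmpd_2": {"mw": 540.0, "tpsa": 70.0},
    "cmpd_3": {"mw": 410.0, "tpsa": 120.0},
}

kept, rejected = run_funnel(library, stages)
print(kept)      # ['cmpd_1']
print(rejected)  # {'cmpd_2': 'mw <= 500', 'cmpd_3': 'tpsa <= 90'}
```

Writing the `rejected` mapping to disk alongside the filtered output gives a simple audit trail for later review.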
Comprehensive API reference covering all medchem modules with detailed function signatures, parameters, and return types.
Complete catalog of available rules, filters, and alerts with descriptions, thresholds, and literature references.
Production-ready script for batch filtering workflows. Supports multiple input formats (CSV, SDF, SMILES), configurable filter combinations, and detailed reporting.
Usage:
python scripts/filter_molecules.py input.csv --rules rule_of_five,rule_of_cns --alerts nibr --output filtered.csv
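The script itself is not reproduced here. As a rough illustration of how a command line like the one above could be parsed, the sketch below uses `argparse`. The option names mirror the example, but the defaults, help strings, and the `comma_list` helper are assumptions and may not match the actual script.

```python
import argparse
from typing import List


def comma_list(value: str) -> List[str]:
    """Split 'a,b,c' into ['a', 'b', 'c'], ignoring empty entries."""
    return [item.strip() for item in value.split(",") if item.strip()]


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        description="Batch-filter molecules (illustrative sketch)."
    )
    parser.add_argument("input", help="input file, e.g. a CSV with a SMILES column")
    parser.add_argument("--rules", type=comma_list, default=[],
                        help="comma-separated rule names")
    parser.add_argument("--alerts", type=comma_list, default=[],
                        help="comma-separated structural alert sets")
    parser.add_argument("--output", required=True, help="path for the filtered output")
    return parser


# Parse the example command line from the text above.
args = build_parser().parse_args([
    "input.csv",
    "--rules", "rule_of_five,rule_of_cns",
    "--alerts", "nibr",
    "--output", "filtered.csv",
])

print(args.rules)   # ['rule_of_five', 'rule_of_cns']
print(args.alerts)  # ['nibr']
```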
Official documentation: https://medchem-docs.datamol.io/
GitHub repository: https://github.com/datamol-io/medchem
First Seen: Jan 21, 2026