devtu-optimize-descriptions by mims-harvard/tooluniverse
npx skills add https://github.com/mims-harvard/tooluniverse --skill devtu-optimize-descriptions
Optimize tool descriptions in ToolUniverse JSON configuration files to ensure they are clear, complete, and user-friendly.
Use when:
Tool Description Review:
- [ ] Prerequisites stated (packages, API keys, accounts)
- [ ] Critical abbreviations expanded on first use
- [ ] Required vs optional parameters clear
- [ ] Mutually exclusive options numbered/labeled
- [ ] Parameter guidance includes trade-offs
- [ ] Filter syntax shows available fields
- [ ] File size warnings where relevant
- [ ] Examples show realistic usage
Problem: Users don't know if they need ONE input or ALL inputs.
Fix: Use "Required: Provide ONE input type" for mutually exclusive options.
// Before
"description": "Process BED regions, motifs, or gene lists..."
// After
"description": "Process genomic data. **Required: Provide ONE input type** - (1) BED regions, (2) DNA motif, or (3) gene list. Analyzes..."
Number the options and use bold for "Required".
Problem: Users don't know what to install or configure before use.
Fix: Add a prerequisites note to the first tool in each family.
"description": "Query single-cell data. Prerequisites: Requires 'package-name' (install: pip install tooluniverse[extra]). Returns..."
Include:
Problem: New users don't understand technical terms.
Fix: Expand each term on first use with the format "Abbreviation (Full Name)".
Common abbreviations to expand:
H5AD → HDF5-based AnnData
RPM → Reads Per Million
TSS → Transcription Start Site
TAD → Topologically Associating Domain
DRS → Data Repository Service
API names (MACS2, IUPAC, etc.)
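The expansions listed above lend themselves to a mechanical check. The following is a minimal sketch, not part of ToolUniverse: the function name `unexpanded_abbreviations` and the `EXPANSIONS` table are hypothetical, and the check only looks at the first occurrence of each abbreviation in a single description string.

```python
import re

# Abbreviations and full names copied from the list above.
EXPANSIONS = {
    "H5AD": "HDF5-based AnnData",
    "RPM": "Reads Per Million",
    "TSS": "Transcription Start Site",
    "TAD": "Topologically Associating Domain",
    "DRS": "Data Repository Service",
}


def unexpanded_abbreviations(description):
    """Return abbreviations whose first use is not followed by ' (Full Name)'."""
    missing = []
    for abbr, full in EXPANSIONS.items():
        match = re.search(rf"\b{re.escape(abbr)}\b", description)
        if match is None:
            continue  # abbreviation is not used in this description
        # The expansion must start immediately after the first occurrence.
        if not description.startswith(f" ({full})", match.end()):
            missing.append(abbr)
    return missing


if __name__ == "__main__":
    print(unexpanded_abbreviations("Download H5AD files..."))
    print(unexpanded_abbreviations("Download H5AD (HDF5-based AnnData) files..."))
```

A check like this only covers the abbreviations it knows about, so the table would need to grow alongside the tool families being documented.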
// Before
"description": "Download H5AD files..."
// After
"description": "Download H5AD (HDF5-based AnnData) files..."
Problem: Users don't know what fields are available or what syntax to use.
Fix: List operators, common fields, and provide multiple examples.
"parameter_name": {
"type": "string",
"description": "Filter using SQL-like syntax. Format: 'field == \"value\"'. Operators: ==, !=, in, <, >, <=, >=. Combine with 'and'/'or'. Common fields: tissue, cell_type, disease, assay, sex, ethnicity. Examples: 'tissue == \"lung\"', 'disease == \"COVID-19\" and tissue == \"lung\"', 'cell_type in [\"T cell\", \"B cell\"]'."
}
Include:
Problem: Users don't know which value to choose or what trade-offs exist.
Fix: Explain what each value means and provide recommendations.
// Before
"threshold": "Q-value threshold (05=1e-5, 10=1e-10, 20=1e-20)"
// After
"threshold": "Peak calling stringency. '05'=1e-5 (permissive, more peaks, broad features), '10'=1e-10 (moderate, balanced), '20'=1e-20 (strict, high confidence, narrow peaks). Default '05' suitable for most analyses. Higher values = fewer but more confident peaks."
For each parameter option, explain:
Problem: Users provide multiple options when only one is allowed.
Fix: Label options as "Option 1", "Option 2", etc.
"bed_data": {
"description": "**Option 1**: BED format regions (tab-separated: chr, start, end). Example: 'chr1\\t1000\\t2000'."
},
"motif": {
"description": "**Option 2**: DNA sequence motif in IUPAC notation. Use: A/T/G/C, W=A|T, S=G|C. Example: 'CANNTG'."
},
"gene_list": {
"description": "**Option 3**: Gene symbols as array. Example: ['TP53', 'MDM2']."
}
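Both labeling rules for mutually exclusive inputs (the "Required: Provide ONE input type" note in the tool description and the "**Option N**" prefix on each parameter) can be checked together. This is a rough sketch under stated assumptions: `check_exclusive_labels` is a hypothetical helper, the caller supplies the exclusive parameter names in option order, and the tool is assumed to be a dict shaped like the JSON shown in this guide.

```python
def check_exclusive_labels(tool, exclusive_names):
    """Return a list of problems with how mutually exclusive inputs are described."""
    problems = []

    # Rule 1: the tool-level description states that exactly one input is needed.
    if "Required: Provide ONE input type" not in tool.get("description", ""):
        problems.append("tool description lacks 'Required: Provide ONE input type'")

    # Rule 2: each exclusive parameter description starts with its option label.
    properties = tool.get("parameter", {}).get("properties", {})
    for index, name in enumerate(exclusive_names, start=1):
        text = properties.get(name, {}).get("description", "")
        label = f"**Option {index}**"
        if not text.startswith(label):
            problems.append(f"{name}: description should start with '{label}'")

    return problems


if __name__ == "__main__":
    tool = {
        "description": "Process genomic data. **Required: Provide ONE input type** - "
                       "(1) BED regions, (2) DNA motif, or (3) gene list.",
        "parameter": {"properties": {
            "bed_data": {"description": "**Option 1**: BED format regions."},
            "motif": {"description": "**Option 2**: DNA sequence motif."},
            "gene_list": {"description": "Gene symbols as array."},
        }},
    }
    for problem in check_exclusive_labels(tool, ["bed_data", "motif", "gene_list"]):
        print(problem)
```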
For tools that download or return large files:
"description": "Download contact matrices. Note: Files can be large (GBs), check file_size in metadata before downloading. Returns..."
When tool returns submission URL instead of direct results:
"description": "Perform enrichment analysis. Note: Returns submission URL (web form-based analysis). Analyzes..."
For tools with multiple format options:
"file_type": "File format. Common types: 'cooler' (multi-resolution contact matrices), 'pairs' (aligned read pairs), 'hic' (Juicer format), 'mcool' (multi-resolution cooler)."
{
"name": "Tool_operation_name",
"type": "ToolClassName",
"description": "[Action verb] to [purpose]. [Prerequisites if first tool]. [Key data/features]. [Required inputs if mutually exclusive]. [Note about limitations/requirements]. Use for: [use case 1], [use case 2], [use case 3].",
"parameter": {
"properties": {
"param_name": {
"type": "string",
"description": "[What it does]. [Format/syntax if applicable]. [Options with trade-offs]. [Examples]. [Recommendation if applicable]."
}
}
}
}
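The bracketed slots in the description template can be filled programmatically so that optional parts are skipped cleanly. The sketch below is illustrative only: `build_description` and its keyword names are hypothetical and simply mirror the slots in the template above.

```python
def build_description(action, *, prerequisites="", key_features="",
                      required_inputs="", note="", use_cases=()):
    """Join the non-empty template slots into one description string."""
    parts = [action, prerequisites, key_features, required_inputs, note]
    if use_cases:
        parts.append("Use for: " + ", ".join(use_cases))
    # Drop empty slots and make every remaining part end with exactly one period.
    sentences = [p.strip().rstrip(".") + "." for p in parts if p and p.strip()]
    return " ".join(sentences)


if __name__ == "__main__":
    print(build_description(
        "Query single-cell data",
        prerequisites="Prerequisites: Requires 'package-name'",
        use_cases=["discovery", "analysis"],
    ))
```

Keeping the slots as separate arguments makes it easy to see which parts of the template a given tool has left empty.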
To verify description quality, ask:
Can a new user understand what the tool does?
Can a user provide correct inputs on first try?
Can a user choose appropriate parameters?
Are prerequisites obvious?
"description": "Query [data type] from [source]. [Prerequisites]. Filter by [criteria]. Returns [output]. [Data scale]. Use for: [discovery], [analysis], [specific research tasks]."
Key elements:
"description": "Download [file types] from [source]. [Format details]. [File size warning]. [Authentication requirement]. Use for: [offline analysis], [custom processing], [integration]."
Key elements:
"description": "Analyze [input type] to find [results]. **Required: Provide ONE input type** - (1) [option], (2) [option], (3) [option]. Compares against [database/background]. [Result format]. Use for: [identifying], [discovering], [predicting]."
Key elements:
After updating descriptions, validate JSON syntax:
# Validate a single tool JSON file
python3 -m json.tool src/tooluniverse/data/your_tools.json > /dev/null && echo "✓ Valid"
# Validate every *_tools.json file in the data directory
for f in src/tooluniverse/data/*_tools.json; do
python3 -m json.tool "$f" > /dev/null && echo "✓ $f valid" || echo "✗ $f invalid"
done
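The same syntax check can be run from Python when a single summary of failures is more convenient than per-file shell output. This is a minimal sketch using only the standard `json` module; the helper names `validate_json_file` and `validate_all` are hypothetical, and the glob pattern is the one used in the shell loop above.

```python
import glob
import json
from pathlib import Path


def validate_json_file(path):
    """Return None if the file parses as JSON, otherwise a short error message."""
    try:
        json.loads(Path(path).read_text(encoding="utf-8"))
    except json.JSONDecodeError as err:
        return f"{path}: line {err.lineno}, column {err.colno}: {err.msg}"
    return None


def validate_all(paths):
    """Return the error messages for every file in paths that fails to parse."""
    messages = (validate_json_file(p) for p in paths)
    return [m for m in messages if m is not None]


if __name__ == "__main__":
    # Prints one line per invalid file; prints nothing if every file parses.
    for message in validate_all(sorted(glob.glob("src/tooluniverse/data/*_tools.json"))):
        print("✗", message)
```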
Before (Unclear):
{
"name": "Tool_enrichment",
"description": "Perform enrichment with tool to find factors.",
"parameter": {
"properties": {
"bed": {"description": "BED data"},
"motif": {"description": "Motif"},
"genes": {"description": "Genes"},
"threshold": {"description": "Threshold value"}
}
}
}
After (Clear):
{
"name": "Tool_enrichment_analysis",
"description": "Identify transcription factors enriched in your data. **Required: Provide ONE input type** - (1) BED genomic regions, (2) DNA sequence motif (IUPAC notation), or (3) gene symbol list. Compares against 400,000+ ChIP-seq experiments. Returns ranked proteins with enrichment scores. Note: Returns submission URL (web-based analysis). Use for: identifying regulators of regions, finding proteins bound to motifs, discovering transcription factors regulating genes.",
"parameter": {
"properties": {
"bed_data": {
"description": "**Option 1**: BED format regions (tab-separated: chr, start, end). For finding proteins bound to genomic regions. Example: 'chr1\\t1000\\t2000'."
},
"motif": {
"description": "**Option 2**: DNA motif in IUPAC notation (A/T/G/C, W=A|T, S=G|C, M=A|C, K=G|T, R=A|G, Y=C|T). Example: 'CANNTG' (E-box)."
},
"gene_list": {
"description": "**Option 3**: Gene symbols as array or single gene. Example: ['TP53', 'MDM2', 'CDKN1A']."
},
"threshold": {
"description": "Peak stringency. '05'=1e-5 (permissive, more peaks), '10'=1e-10 (moderate), '20'=1e-20 (strict, high confidence). Default '05' suitable for most analyses."
}
}
}
}
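One visible difference between the two versions is how little the "Before" parameter descriptions say. A crude word-count heuristic can surface such parameters for review. This is only a sketch: `terse_parameters` and the default threshold of five words are arbitrary assumptions, and a long description is not necessarily a good one.

```python
def terse_parameters(tool, min_words=5):
    """Return names of parameters whose description has fewer than min_words words."""
    properties = tool.get("parameter", {}).get("properties", {})
    return [
        name
        for name, spec in properties.items()
        if len(spec.get("description", "").split()) < min_words
    ]


if __name__ == "__main__":
    before = {
        "name": "Tool_enrichment",
        "parameter": {"properties": {
            "bed": {"description": "BED data"},
            "motif": {"description": "Motif"},
            "genes": {"description": "Genes"},
            "threshold": {"description": "Threshold value"},
        }},
    }
    print(terse_parameters(before))
```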
Priority order for optimization:
Critical (fix immediately):
High (fix soon):
Medium (nice to have):
Expected impact: 50-75% reduction in user errors, 50-67% faster time to first successful use.
Weekly Installs: 148
Repository: mims-harvard/tooluniverse
GitHub Stars: 1.2K
First Seen: Feb 4, 2026
Security Audits: Gen Agent Trust Hub (Pass), Socket (Pass), Snyk (Pass)
Installed on: codex (143), opencode (142), gemini-cli (138), github-copilot (136), amp (132), kimi-cli (131)