tool-design by guanyang/antigravity-skills
npx skills add https://github.com/guanyang/antigravity-skills --skill tool-design
Design every tool as a contract between a deterministic system and a non-deterministic agent. Unlike human-facing APIs, agent-facing tools must make the contract unambiguous through the description alone -- agents infer intent from descriptions and generate calls that must match expected formats. Every ambiguity becomes a potential failure mode that no amount of prompt engineering can fix.
Activate this skill when:
Design tools around the consolidation principle: if a human engineer cannot definitively say which tool should be used in a given situation, an agent cannot be expected to do better. Reduce the tool set until each tool has one unambiguous purpose, because agents select tools by comparing descriptions and any overlap introduces selection errors.
Treat every tool description as prompt engineering that shapes agent behavior. The description is not documentation for humans -- it is injected into the agent's context and directly steers reasoning. Write descriptions that answer what the tool does, when to use it, and what it returns, because these three questions are exactly what agents evaluate during tool selection.
Tools as Contracts Design each tool as a self-contained contract. When humans call APIs, they read docs, understand conventions, and make appropriate requests. Agents must infer the entire contract from a single description block. Make the contract unambiguous by including format examples, expected patterns, and explicit constraints. Omit nothing that a caller needs to know, because agents cannot ask clarifying questions before making a call.
Tool Description as Prompt Write tool descriptions knowing they load directly into agent context and collectively steer behavior. A vague description like "Search the database" with cryptic parameter names forces the agent to guess -- and guessing produces incorrect calls. Instead, include usage context, parameter format examples, and sensible defaults. Every word in the description either helps or hurts tool selection accuracy.
Namespacing and Organization Namespace tools under common prefixes as the collection grows, because agents benefit from hierarchical grouping. When an agent needs database operations, it routes to the db_* namespace; when it needs web interactions, it routes to web_*. Without namespacing, agents must evaluate every tool in a flat list, which degrades selection accuracy as the count grows.
Single Comprehensive Tools Build single comprehensive tools instead of multiple narrow tools that overlap. Rather than implementing list_users, list_events, and create_event separately, implement schedule_event that finds availability and schedules in one call. The comprehensive tool handles the full workflow internally, removing the agent's burden of chaining calls in the correct order.
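The consolidation above can be sketched in a few lines. This is a hedged illustration, not the skill's reference implementation: the in-memory `BUSY` calendar and the gap-finding logic are assumptions standing in for a real backend.

```python
from datetime import datetime, timedelta

# Illustrative in-memory busy calendar; a real tool would query a backend.
BUSY = {"alice": [(datetime(2026, 1, 5, 9), datetime(2026, 1, 5, 10))]}

def schedule_event(attendee: str, duration_minutes: int, earliest: datetime) -> dict:
    """Find the attendee's first free slot and book it in one call.

    Replaces the narrow list_users / list_events / create_event chain:
    the agent states intent once instead of sequencing three tools.
    """
    start = earliest
    for busy_start, busy_end in sorted(BUSY.get(attendee, [])):
        if start + timedelta(minutes=duration_minutes) <= busy_start:
            break  # the gap before this busy block fits the meeting
        start = max(start, busy_end)  # otherwise skip past the busy block
    end = start + timedelta(minutes=duration_minutes)
    BUSY.setdefault(attendee, []).append((start, end))
    return {"attendee": attendee, "start": start.isoformat(), "end": end.isoformat()}

booking = schedule_event("alice", 30, datetime(2026, 1, 5, 9))
# alice is busy 09:00-10:00, so the meeting lands at 10:00
```

The agent never sees the availability search or the ordering constraint between lookup and booking; the whole workflow sits behind one unambiguous entry point.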
Why Consolidation Works Apply consolidation because agents have limited context and attention. Each tool in the collection competes for attention during tool selection, each description consumes context budget tokens, and overlapping functionality creates ambiguity. Consolidation eliminates redundant descriptions, removes selection ambiguity, and shrinks the effective tool set. Vercel demonstrated this principle by reducing their agent from 17 specialized tools to 2 general-purpose tools and achieving better performance -- fewer tools meant less confusion and more reliable tool selection.
When Not to Consolidate Keep tools separate when they have fundamentally different behaviors, serve different contexts, or must be callable independently. Over-consolidation creates a different problem: a single tool with too many parameters and modes becomes hard for agents to parameterize correctly.
Push the consolidation principle to its logical extreme by removing most specialized tools in favor of primitive, general-purpose capabilities. Production evidence shows this approach can outperform sophisticated multi-tool architectures.
The File System Agent Pattern Provide direct file system access through a single command execution tool instead of building custom tools for data exploration, schema lookup, and query validation. The agent uses standard Unix utilities (grep, cat, find, ls) to explore and operate on the system. This works because file systems are a proven abstraction that models understand deeply, standard tools have predictable behavior, agents can chain primitives flexibly rather than being constrained to predefined workflows, and good documentation in files replaces summarization tools.
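A minimal sketch of that single command-execution tool. The allowlist is an assumption for illustration; a production system would sandbox execution rather than filter by utility name.

```python
import shlex
import subprocess

# Assumed allowlist of standard Unix utilities; illustrative only.
ALLOWED = {"grep", "cat", "find", "ls", "wc", "head"}

def run_command(command: str, timeout: float = 10.0) -> dict:
    """One generic execution tool: the agent composes standard Unix
    utilities instead of calling bespoke exploration tools."""
    argv = shlex.split(command)
    if not argv or argv[0] not in ALLOWED:
        return {"error": f"{command!r} does not start with an allowed utility"}
    proc = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
    return {"stdout": proc.stdout, "stderr": proc.stderr, "exit_code": proc.returncode}

blocked = run_command("rm -rf /")  # rejected by the allowlist
listing = run_command("ls")       # agents chain primitives like this freely
```

Because the primitives are ones the model already understands, the tool description can stay short: the contract is the Unix toolset itself.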
When Reduction Outperforms Complexity Choose reduction when the data layer is well-documented and consistently structured, the model has sufficient reasoning capability, specialized tools were constraining rather than enabling the model, or more time is spent maintaining scaffolding than improving outcomes. Avoid reduction when underlying data is messy or poorly documented, the domain requires specialized knowledge the model lacks, safety constraints must limit agent actions, or operations genuinely benefit from structured workflows.
Build for Future Models Design minimal architectures that benefit from model improvements rather than sophisticated architectures that lock in current limitations. Ask whether each tool enables new capabilities or constrains reasoning the model could handle on its own -- tools built as "guardrails" often become liabilities as models improve.
See Architectural Reduction Case Study for production evidence.
Description Structure Structure every tool description to answer four questions:
Default Parameter Selection Set defaults to reflect common use cases. Defaults reduce agent burden by eliminating unnecessary parameter specification and prevent errors from omitted parameters. Choose defaults that produce useful results without requiring the agent to understand every option.
Offer response format options (concise vs. detailed) because tool response size significantly impacts context usage. Concise format returns essential fields only, suitable for confirmations. Detailed format returns complete objects, suitable when full context drives decisions. Document when to use each format in the tool description so agents learn to select appropriately.
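As a sketch of the concise/detailed split, using a hypothetical customer record (the field names and the choice of essential fields are assumptions, not prescribed by the skill):

```python
FULL_RECORD = {
    "customer_id": "CUST-000001",
    "name": "Ada Lovelace",
    "email": "ada@example.com",
    "address": "12 Analytical Way",
    "order_history": ["ORD-1", "ORD-2"],
    "notes": "prefers email contact",
}

ESSENTIAL_FIELDS = ("customer_id", "name", "email")  # enough for confirmations

def render(record: dict, format: str = "concise") -> dict:
    """Default to the small payload; 'detailed' opts in to full context."""
    if format == "detailed":
        return dict(record)
    return {k: record[k] for k in ESSENTIAL_FIELDS}

concise = render(FULL_RECORD)              # default keeps context spend low
detailed = render(FULL_RECORD, "detailed")  # full object when decisions need it
```

Defaulting to `"concise"` also demonstrates the default-parameter guidance above: the agent gets a useful result without specifying anything.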
Design error messages for two audiences: developers debugging issues and agents recovering from failures. For agents, every error message must be actionable -- it must state what went wrong and how to correct it. Include retry guidance for retryable errors, corrected format examples for input errors, and specific missing fields for incomplete requests. An error that says only "failed" provides zero recovery signal.
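A hedged sketch of actionable error construction; the error codes and message templates are illustrative, not a fixed vocabulary:

```python
def actionable_error(code: str, **details) -> dict:
    """Build errors that tell the agent what went wrong AND how to recover."""
    templates = {
        "INVALID_FORMAT": (
            "customer_id {value!r} does not match CUST-###### "
            "(e.g. 'CUST-000001'). Re-send with a zero-padded 6-digit ID."
        ),
        "RATE_LIMITED": "Rate limited. Retry after {retry_after}s with the same arguments.",
        "MISSING_FIELD": "Required field {field!r} was omitted. Add it and retry.",
    }
    return {"error": code, "message": templates[code].format(**details)}

err = actionable_error("INVALID_FORMAT", value="cust-1")
retry = actionable_error("RATE_LIMITED", retry_after=30)
```

Each message names the failure and the concrete next step, which is exactly the recovery signal a bare "failed" withholds.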
Establish a consistent schema across all tools. Use verb-noun pattern for tool names (get_customer, create_order), consistent parameter names across tools (always customer_id, never sometimes id and sometimes identifier), and consistent return field names. Consistency reduces the cognitive load on agents and improves cross-tool generalization.
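Consistency can be audited mechanically. A rough sketch, assuming a hypothetical registry of tool names and parameter lists; the vague-name set and the verb_noun regex are illustrative heuristics:

```python
import re
from collections import defaultdict

TOOLS = {  # hypothetical collection: tool name -> parameter names
    "get_customer": ["customer_id", "format"],
    "create_order": ["customer_id", "items"],
    "delete_order": ["identifier"],  # drift from customer_id-style naming
    "fetchData": ["x"],              # breaks the verb_noun convention
}

VAGUE = {"id", "identifier", "val", "x", "param1", "data"}

def naming_report(tools: dict) -> dict:
    """Flag tool names that break verb_noun and parameters that force guessing."""
    report = {"bad_tool_names": [], "vague_params": defaultdict(list)}
    for name, params in tools.items():
        if not re.fullmatch(r"[a-z]+_[a-z_]+", name):
            report["bad_tool_names"].append(name)
        for p in params:
            if p in VAGUE:
                report["vague_params"][name].append(p)
    return report

report = naming_report(TOOLS)
```

Running such a check whenever a tool is added keeps drift like `identifier` vs `customer_id` from accumulating.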
Limit tool collections to 10-20 tools for most applications, because research shows description overlap causes model confusion and more tools do not always lead to better outcomes. When more tools are genuinely needed, use namespacing to create logical groupings. Implement selection mechanisms: tool grouping by domain, example-based selection hints, and umbrella tools that route to specialized sub-tools.
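The umbrella-tool idea can be sketched as a prefix router. The domain registry and lambda stand-ins are assumptions; the point is only the dispatch shape:

```python
# Hypothetical flat registry; names follow the namespaced verb_noun convention.
SUB_TOOLS = {
    "db": {"db_query": lambda sql: f"rows for {sql}"},
    "web": {"web_fetch": lambda url: f"body of {url}"},
}

def route(tool_name: str, *args):
    """Umbrella dispatcher: the agent sees one entry point per domain,
    and the name prefix routes to the specialized sub-tool."""
    domain = tool_name.split("_", 1)[0]
    try:
        return SUB_TOOLS[domain][tool_name](*args)
    except KeyError:
        known = sorted(t for group in SUB_TOOLS.values() for t in group)
        return {"error": f"Unknown tool {tool_name!r}. Known tools: {known}"}

result = route("db_query", "SELECT 1")
```

Note the failure path lists the known tools, so even a bad route returns an actionable, self-correcting error rather than a dead end.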
Always use fully qualified tool names with MCP (Model Context Protocol) to avoid "tool not found" errors.
Format: ServerName:tool_name
# Correct: Fully qualified names
"Use the BigQuery:bigquery_schema tool to retrieve table schemas."
"Use the GitHub:create_issue tool to create issues."
# Incorrect: Unqualified names
"Use the bigquery_schema tool..." # May fail with multiple servers
Without the server prefix, agents may fail to locate tools when multiple MCP servers are available. Establish naming conventions that include server context in all tool references.
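A client-side resolver can enforce that convention. The server/tool listing below is hypothetical (real MCP clients expose similar listings); the deliberate `create_issue` collision shows why unqualified names fail:

```python
SERVERS = {
    "BigQuery": {"bigquery_schema", "bigquery_query"},
    "GitHub": {"create_issue", "search"},
    "Jira": {"create_issue"},  # collides with GitHub's create_issue
}

def resolve(name: str) -> str:
    """Return 'ServerName:tool_name'. Unqualified names resolve only
    when exactly one server owns the tool."""
    if ":" in name:
        server, tool = name.split(":", 1)
        if tool in SERVERS.get(server, set()):
            return f"{server}:{tool}"
        raise LookupError(f"{name} not found")
    owners = [s for s, tools in SERVERS.items() if name in tools]
    if len(owners) == 1:
        return f"{owners[0]}:{name}"
    raise LookupError(
        f"{name!r} is ambiguous across {sorted(owners)}; use ServerName:tool_name"
        if owners else f"{name!r} not found on any server"
    )

qualified = resolve("GitHub:create_issue")
```

Auditing for collisions at registration time turns a silent "tool not found" at run time into a loud error when a new provider is added.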
Feed observed tool failures back to an agent to diagnose issues and improve descriptions. Production testing shows this approach achieves 40% reduction in task completion time by helping future agents avoid mistakes.
The Tool-Testing Agent Pattern:
def optimize_tool_description(tool_spec, failure_examples):
"""
Use an agent to analyze tool failures and improve descriptions.
Process:
1. Agent attempts to use tool across diverse tasks
2. Collect failure modes and friction points
3. Agent analyzes failures and proposes improvements
4. Test improved descriptions against same tasks
"""
prompt = f"""
Analyze this tool specification and the observed failures.
Tool: {tool_spec}
Failures observed:
{failure_examples}
Identify:
1. Why agents are failing with this tool
2. What information is missing from the description
3. What ambiguities cause incorrect usage
Propose an improved tool description that addresses these issues.
"""
return get_agent_response(prompt)
This creates a feedback loop: agents using tools generate failure data, which agents then use to improve tool descriptions, which reduces future failures.
Evaluate tool designs against five criteria: unambiguity, completeness, recoverability, efficiency, and consistency. Test by presenting representative agent requests and evaluating the resulting tool calls against expected behavior.
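The completeness criterion can be smoke-tested with a checklist over the description text. This is a crude string-probe sketch, not the full evaluation (which replays representative agent requests); the required section names mirror the well-designed example that follows:

```python
REQUIRED_SECTIONS = ("use when", "args", "returns", "errors")

def audit_description(description: str) -> dict:
    """Check a docstring-style description for the sections an agent needs
    before it can call the tool without guessing."""
    lowered = description.lower()
    missing = [s for s in REQUIRED_SECTIONS if s not in lowered]
    return {"complete": not missing, "missing_sections": missing}

good = audit_description(
    "Retrieve customer information by ID.\n"
    "Use when: the user asks about a specific customer.\n"
    "Args: customer_id (format CUST-######, e.g. CUST-000001)\n"
    "Returns: customer object\n"
    "Errors: NOT_FOUND, INVALID_FORMAT"
)
bad = audit_description("Search the database.")  # the anti-pattern below
```

A description that fails this trivial check will certainly fail the harder unambiguity and recoverability tests.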
When designing tool collections:
Example 1: Well-Designed Tool
def get_customer(customer_id: str, format: str = "concise"):
"""
Retrieve customer information by ID.
Use when:
- User asks about specific customer details
- Need customer context for decision-making
- Verifying customer identity
Args:
customer_id: Format "CUST-######" (e.g., "CUST-000001")
format: "concise" for key fields, "detailed" for complete record
Returns:
Customer object with requested fields
Errors:
NOT_FOUND: Customer ID not found
INVALID_FORMAT: ID must match CUST-###### pattern
"""
Example 2: Poor Tool Design
This example demonstrates several tool design anti-patterns:
def search(query):
"""Search the database."""
pass
Problems with this design:
Failure modes:
- Cryptic parameter names: x, val, or param1 force agents to guess meaning. Use descriptive names that convey purpose without reading further documentation.
- Inconsistent naming: id in one tool, identifier in another, and customer_id in a third creates confusion. Standardize parameter names across the entire tool collection.
- Colliding tool names: when multiple servers expose the same name (e.g., search), agents cannot disambiguate. Always use the fully qualified ServerName:tool_name format and audit for collisions when adding new providers.
Created: 2025-12-20 | Last Updated: 2026-03-17 | Author: Agent Skills for Context Engineering Contributors | Version: 2.0.0
Weekly Installs
59
Repository
GitHub Stars
595
First Seen
Jan 26, 2026
Security Audits
Gen Agent Trust Hub: Pass | Socket: Pass | Snyk: Warn
Installed on
opencode: 54
codex: 53
github-copilot: 52
gemini-cli: 51
cursor: 50
antigravity: 50