tool-design by sickn33/antigravity-awesome-skills
npx skills add https://github.com/sickn33/antigravity-awesome-skills --skill tool-design
Build tools that agents can use effectively, including architectural reduction patterns
Use this skill when building tools that agents can use effectively, including architectural reduction patterns.
Tools are the primary mechanism through which agents interact with the world. They define the contract between deterministic systems and non-deterministic agents. Unlike traditional software APIs designed for developers, tool APIs must be designed for language models that reason about intent, infer parameter values, and generate calls from natural language requests. Poor tool design creates failure modes that no amount of prompt engineering can fix. Effective tool design follows specific principles that account for how agents perceive and use tools.
Tools are contracts between deterministic systems and non-deterministic agents. The consolidation principle states that if a human engineer cannot definitively say which tool should be used in a given situation, an agent cannot be expected to do better. Effective tool descriptions are prompt engineering that shapes agent behavior.
Key principles include: clear descriptions that answer what, when, and what returns; response formats that balance completeness and token efficiency; error messages that enable recovery; and consistent conventions that reduce cognitive load.
Tools as Contracts Tools are contracts between deterministic systems and non-deterministic agents. When humans call APIs, they understand the contract and make appropriate requests. Agents must infer the contract from descriptions and generate calls that match expected formats.
This fundamental difference requires rethinking API design. The contract must be unambiguous, examples must illustrate expected patterns, and error messages must guide correction. Every ambiguity in tool definitions becomes a potential failure mode.
Tool Description as Prompt Tool descriptions are loaded into agent context and collectively steer behavior. The descriptions are not just documentation—they are prompt engineering that shapes how agents reason about tool use.
Poor descriptions like "Search the database" with cryptic parameter names force agents to guess. Optimized descriptions include usage context, examples, and defaults. The description answers: what the tool does, when to use it, and what it produces.
Namespacing and Organization As tool collections grow, organization becomes critical. Namespacing groups related tools under common prefixes, helping agents select appropriate tools at the right time.
Namespacing creates clear boundaries between functionality. When an agent needs database information, it routes to the database namespace. When it needs web search, it routes to web namespace.
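The prefix convention above can be sketched minimally. The tool names, descriptions, and the `tools_in_namespace` helper below are illustrative assumptions, not from any real registry:

```python
# Namespaced tool registry: a common prefix groups related tools,
# so an agent (or a router) can narrow the candidate set by function.
TOOLS = {
    "db_query": "Run a read-only SQL query against the analytics database.",
    "db_get_schema": "Return the schema for a named table.",
    "web_search": "Search the public web and return ranked snippets.",
    "web_fetch": "Fetch and return the text content of a URL.",
}

def tools_in_namespace(prefix: str) -> list[str]:
    """Return tool names under a common prefix, e.g. 'db' or 'web'."""
    return [name for name in TOOLS if name.startswith(prefix + "_")]
```

A request needing database information routes to `tools_in_namespace("db")`; a web lookup routes to the `web` group, keeping each selection step small.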
Single Comprehensive Tools The consolidation principle states that if a human engineer cannot definitively say which tool should be used in a given situation, an agent cannot be expected to do better. This leads to a preference for single comprehensive tools over multiple narrow tools.
Instead of implementing list_users, list_events, and create_event, implement schedule_event that finds availability and schedules. The comprehensive tool handles the full workflow internally rather than requiring agents to chain multiple calls.
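A minimal sketch of such a consolidated tool, using an invented in-memory calendar (the `BUSY` table and all names are assumptions for illustration, not a real scheduling API):

```python
# Hours (0-23) at which each user is busy; stands in for a calendar service.
BUSY = {
    "alice": {9, 10, 13},
    "bob": {9, 11, 13},
}

def schedule_event(attendees, earliest=8, latest=18):
    """Find the first hour slot free for all attendees and book it.

    Consolidates the list_users -> list_events -> create_event chain:
    availability lookup and event creation happen inside one call,
    so the agent never has to orchestrate intermediate steps.
    """
    for hour in range(earliest, latest):
        if all(hour not in BUSY.get(a, set()) for a in attendees):
            for a in attendees:  # book the slot for everyone
                BUSY.setdefault(a, set()).add(hour)
            return {"status": "scheduled", "hour": hour, "attendees": attendees}
    return {"status": "no_slot", "attendees": attendees}
```

The agent issues one call with one intent ("schedule a meeting"), and the deterministic workflow logic stays inside the tool where it belongs.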
Why Consolidation Works Agents have limited context and attention. Each tool in the collection competes for attention in the tool selection phase. Each tool adds description tokens that consume context budget. Overlapping functionality creates ambiguity about which tool to use.
Consolidation reduces token consumption by eliminating redundant descriptions. It eliminates ambiguity by having one tool cover each workflow. It reduces tool selection complexity by shrinking the effective tool set.
When Not to Consolidate Consolidation is not universally correct. Tools with fundamentally different behaviors should remain separate. Tools used in different contexts benefit from separation. Tools that might be called independently should not be artificially bundled.
The consolidation principle, taken to its logical extreme, leads to architectural reduction: removing most specialized tools in favor of primitive, general-purpose capabilities. Production evidence shows this approach can outperform sophisticated multi-tool architectures.
The File System Agent Pattern Instead of building custom tools for data exploration, schema lookup, and query validation, provide direct file system access through a single command execution tool. The agent uses standard Unix utilities (grep, cat, find, ls) to explore, understand, and operate on your system.
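The entire pattern can reduce to one primitive. A minimal sketch of such a command-execution tool (the function name, return shape, and timeout are assumptions; a production version would add sandboxing and access control):

```python
import subprocess

def run_command(command: str, timeout_s: int = 30) -> dict:
    """Single general-purpose tool: execute a shell command and return
    stdout, stderr, and the exit code. The agent composes standard Unix
    utilities (grep, cat, find, ls) instead of calling bespoke tools.
    """
    result = subprocess.run(
        command, shell=True, capture_output=True, text=True, timeout=timeout_s
    )
    return {
        "stdout": result.stdout,
        "stderr": result.stderr,
        "exit_code": result.returncode,
    }
```

With this one tool, "find the schema" becomes `run_command("cat schemas/orders.sql")` and "which files mention refunds" becomes `run_command("grep -rl refunds ./docs")`, with no custom tool per question.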
This works because:
When Reduction Outperforms Complexity Reduction works when:
Reduction fails when:
Stop Constraining Reasoning A common anti-pattern is building tools to "protect" the model from complexity. Pre-filtering context, constraining options, wrapping interactions in validation logic. These guardrails often become liabilities as models improve.
The question to ask: are your tools enabling new capabilities, or are they constraining reasoning the model could handle on its own?
Build for Future Models Models improve faster than tooling can keep up. An architecture optimized for today's model may be over-constrained for tomorrow's. Build minimal architectures that can benefit from model improvements rather than sophisticated architectures that lock in current limitations.
See Architectural Reduction Case Study for production evidence.
Description Structure Effective tool descriptions answer four questions:
What does the tool do? Clear, specific description of functionality. Avoid vague language like "helps with" or "can be used for." State exactly what the tool accomplishes.
When should it be used? Specific triggers and contexts. Include both direct triggers ("User asks about pricing") and indirect signals ("Need current market rates").
What inputs does it accept? Parameter descriptions with types, constraints, and defaults. Explain what each parameter controls.
What does it return? Output format and structure. Include examples of successful responses and error conditions.
Default Parameter Selection Defaults should reflect common use cases. They reduce agent burden by eliminating unnecessary parameter specification. They prevent errors from omitted parameters.
Tool response size significantly impacts context usage. Implementing response format options gives agents control over verbosity.
Concise format returns essential fields only, appropriate for confirmation or basic information. Detailed format returns complete objects with all fields, appropriate when full context is needed for decisions.
Include guidance in tool descriptions about when to use each format. Agents learn to select appropriate formats based on task requirements.
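A sketch of a format parameter with a token-frugal default; the record contents and the concise/detailed field split are invented for illustration:

```python
# Full record as the backing store; only a subset is returned by default.
RECORD = {
    "id": "CUST-000001", "name": "Ada Lovelace", "email": "ada@example.com",
    "address": "12 Analytical Way", "notes": "prefers email", "ltv": 1234.5,
}
CONCISE_FIELDS = ("id", "name", "email")

def get_customer(customer_id: str, format: str = "concise") -> dict:
    """Return essential fields by default; the full record only on request.

    The default favors the common case (confirmation, basic lookup) and
    keeps response tokens low; "detailed" is opt-in for decision-making.
    """
    if format == "detailed":
        return dict(RECORD)
    return {k: RECORD[k] for k in CONCISE_FIELDS}
```

The default also illustrates the parameter-selection principle above: an agent that omits `format` still gets a safe, inexpensive answer.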
Error messages serve two audiences: developers debugging issues and agents recovering from failures. For agents, error messages must be actionable. They must tell the agent what went wrong and how to correct it.
Design error messages that enable recovery. For retryable errors, include retry guidance. For input errors, include corrected format. For missing data, include what's needed.
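One way to make that concrete is a structured error payload; the field names below (`retryable`, `corrected_call_hint`) are assumptions for the sketch, not a standard schema:

```python
def invalid_input_error(param: str, got: str,
                        expected_format: str, example: str) -> dict:
    """Build an actionable error: what failed, the correct format, and a
    concrete corrected value the agent can retry with."""
    return {
        "error": "INVALID_FORMAT",
        "retryable": True,  # the agent can fix the input and call again
        "message": (
            f"Parameter '{param}' got '{got}'; expected format "
            f"{expected_format}, e.g. {example}."
        ),
        "corrected_call_hint": {param: example},
    }
```

Compare this with a bare "400 Bad Request": the structured version tells the agent which parameter failed, what shape is expected, and gives a retryable example, so recovery needs no human in the loop.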
Use a consistent schema across all tools. Establish naming conventions: verb-noun pattern for tool names, consistent parameter names across tools, consistent return field names.
Research shows tool description overlap causes model confusion. More tools do not always lead to better outcomes. A reasonable guideline is 10-20 tools for most applications. If more are needed, use namespacing to create logical groupings.
Implement mechanisms to help agents select the right tool: tool grouping, example-based selection, and hierarchy with umbrella tools that route to specialized sub-tools.
When using MCP (Model Context Protocol) tools, always use fully qualified tool names to avoid "tool not found" errors.
Format: ServerName:tool_name
# Correct: Fully qualified names
"Use the BigQuery:bigquery_schema tool to retrieve table schemas."
"Use the GitHub:create_issue tool to create issues."
# Incorrect: Unqualified names
"Use the bigquery_schema tool..." # May fail with multiple servers
Without the server prefix, agents may fail to locate tools, especially when multiple MCP servers are available. Establish naming conventions that include server context in all tool references.
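The qualification rule can also be enforced in code. A hypothetical resolver (the server/tool table is invented) that maps a bare name to its `ServerName:tool_name` form and refuses ambiguous or unknown names:

```python
# Illustrative MCP server -> tool listing; real deployments would query
# each server for its advertised tools.
SERVERS = {
    "BigQuery": ["bigquery_schema", "bigquery_query"],
    "GitHub": ["create_issue", "list_issues"],
}

def qualify(tool_name: str) -> str:
    """Resolve a bare tool name to ServerName:tool_name, or fail loudly
    when zero or multiple servers expose that name."""
    matches = [f"{server}:{tool_name}"
               for server, tools in SERVERS.items() if tool_name in tools]
    if len(matches) != 1:
        raise LookupError(
            f"'{tool_name}' matched {len(matches)} servers; "
            "use the ServerName:tool_name form explicitly."
        )
    return matches[0]
```

Failing loudly at resolution time surfaces the ambiguity to the developer instead of letting the agent silently pick the wrong server.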
Claude can optimize its own tools. When given a tool and observed failure modes, it diagnoses issues and suggests improvements. Production testing shows this approach achieves 40% reduction in task completion time by helping future agents avoid mistakes.
The Tool-Testing Agent Pattern:
def optimize_tool_description(tool_spec, failure_examples):
    """
    Use an agent to analyze tool failures and improve descriptions.

    Process:
    1. Agent attempts to use tool across diverse tasks
    2. Collect failure modes and friction points
    3. Agent analyzes failures and proposes improvements
    4. Test improved descriptions against same tasks
    """
    prompt = f"""
    Analyze this tool specification and the observed failures.

    Tool: {tool_spec}

    Failures observed:
    {failure_examples}

    Identify:
    1. Why agents are failing with this tool
    2. What information is missing from the description
    3. What ambiguities cause incorrect usage

    Propose an improved tool description that addresses these issues.
    """
    return get_agent_response(prompt)
This creates a feedback loop: agents using tools generate failure data, which agents then use to improve tool descriptions, which reduces future failures.
Evaluate tool designs against criteria: unambiguity, completeness, recoverability, efficiency, and consistency. Test tools by presenting representative agent requests and evaluating the resulting tool calls.
Vague descriptions: "Search the database for customer information" leaves too many questions unanswered.
Cryptic parameter names: Parameters named x, val, or param1 force agents to guess meaning.
Missing error handling: Tools that fail with generic errors provide no recovery guidance.
Inconsistent naming: Using id in some tools, identifier in others, and customer_id in some creates confusion.
When designing tool collections:
Example 1: Well-Designed Tool
def get_customer(customer_id: str, format: str = "concise"):
    """
    Retrieve customer information by ID.

    Use when:
    - User asks about specific customer details
    - Need customer context for decision-making
    - Verifying customer identity

    Args:
        customer_id: Format "CUST-######" (e.g., "CUST-000001")
        format: "concise" for key fields, "detailed" for complete record

    Returns:
        Customer object with requested fields

    Errors:
        NOT_FOUND: Customer ID not found
        INVALID_FORMAT: ID must match CUST-###### pattern
    """
Example 2: Poor Tool Design
This example demonstrates several tool design anti-patterns:
def search(query):
    """Search the database."""
    pass
Problems with this design: the description is vague ("Search the database" answers none of the four questions above), the lone query parameter carries no type, format, or constraint information, and there is no error contract. The likely failure modes follow directly: agents guess at query syntax, invoke the tool for unrelated lookups, and receive generic failures that offer no path to recovery.
Created: 2025-12-20 · Last Updated: 2025-12-23 · Author: Agent Skills for Context Engineering Contributors · Version: 1.1.0