designing-workflow-skills by trailofbits/skills
npx skills add https://github.com/trailofbits/skills --skill designing-workflow-skills
SKILL.md
Build workflow-based skills that execute reliably by following structural patterns, not prose.
<essential_principles>
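Claude decides whether to load a skill based solely on the description in its frontmatter. The body of SKILL.md, including the "When to Use" and "When NOT to Use" sections, is read only after the skill is already active. Put your trigger keywords, use cases, and exclusions in the description. A bad description means wrong or missed activations, no matter how good the body is.
"When to Use" and "When NOT to Use" sections still serve a purpose: they scope the LLM's behavior once the skill is active. "When NOT to Use" should name specific alternatives: "use Semgrep for simple pattern matching", not "not for simple tasks".
Unnumbered prose instructions produce an unreliable execution order. Every phase needs a number and explicit entry and exit criteria (see AP-6 and AP-7 in the anti-pattern table).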
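Skills use allowed-tools: in frontmatter. Agents use tools: in frontmatter. Subagents get their tools from their subagent_type. Never list tools the component doesn't use. Never use Bash for operations that have dedicated tools (Glob, Grep, Read, Write, Edit).
Most skills and agents should include TodoRead and TodoWrite in their tool list. These tools enable progress tracking during multi-step execution and are useful even for skills that don't explicitly manage tasks.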
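SKILL.md stays under 500 lines. It contains only what the LLM needs on every invocation: principles, routing, quick references, and links. Detailed patterns go in references/. Step-by-step processes go in workflows/. Keep everything one level deep, with no reference chains.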
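The size and depth rules above can be checked mechanically. The following is a minimal sketch, not part of the skill itself: the function name `check_skill_layout` and the link pattern `LINK_RE` are hypothetical, and it assumes supporting files reference each other with ordinary markdown links to `.md` files.

```python
import re
from pathlib import Path

MAX_LINES = 500  # limit stated in the text: SKILL.md stays under 500 lines

# Assumption: references are written as markdown links such as [x](other.md).
LINK_RE = re.compile(r"\]\(([^)#\s]+\.md)\)")


def check_skill_layout(skill_dir: Path) -> list[str]:
    """Return a list of layout problems found in a skill directory."""
    problems = []

    skill_md = skill_dir / "SKILL.md"
    line_count = len(skill_md.read_text(encoding="utf-8").splitlines())
    if line_count >= MAX_LINES:
        problems.append(f"SKILL.md has {line_count} lines (limit: under {MAX_LINES})")

    # One level deep: a supporting file that links onward to another .md file
    # creates a reference chain (SKILL.md -> A -> B).
    for sub in ("references", "workflows"):
        for path in sorted((skill_dir / sub).glob("*.md")):
            for target in LINK_RE.findall(path.read_text(encoding="utf-8")):
                problems.append(f"{sub}/{path.name} links onward to {target} (reference chain)")

    return problems
```

Calling `check_skill_layout(Path("path/to/skill"))` returns an empty list when both rules hold.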
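Every workflow instruction becomes tool calls at runtime. If a workflow searches N files for M patterns, combine the patterns into one regex instead of making N×M calls. If a workflow spawns a subagent per item, batch the items instead of spawning one subagent per file. Apply the 10,000-file test: mentally run the workflow against a large repo and check that the tool call count stays bounded. See AP-18 and AP-19 in anti-patterns.md.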
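The arithmetic behind the 10,000-file test can be sketched in a few lines. The example patterns, file names, and batch size of 500 below are illustrative assumptions, and `combine_patterns` and `batched` are hypothetical helper names; the point is only that one combined regex and fixed-size batches keep the call count bounded.

```python
import re
from typing import Iterable, Iterator


def combine_patterns(patterns: Iterable[str]) -> re.Pattern:
    """Join M patterns into one alternation so a single search covers all of them."""
    return re.compile("|".join(f"(?:{p})" for p in patterns))


def batched(items: list, size: int) -> Iterator[list]:
    """Yield fixed-size batches; each batch would go to one subagent."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


# Illustrative inputs (assumptions, not taken from the skill).
patterns = [r"eval\(", r"exec\(", r"pickle\.loads"]
files = [f"src/file_{i}.py" for i in range(10_000)]

combined = combine_patterns(patterns)
batches = list(batched(files, 500))

naive_calls = len(files) * len(patterns)  # one search per (file, pattern) pair
combined_calls = 1                        # one search with the combined regex

print(naive_calls, combined_calls, len(batches))  # 30000 1 20
```

The naive plan grows with both the file count and the pattern count, while the combined plan stays at one search plus one subagent per batch.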
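Not every step needs the same level of prescription. Calibrate the degree of freedom for each step.
A skill can mix freedom levels. A security audit skill might use high freedom for the discovery phase ("explore the codebase for auth patterns") and low freedom for the reporting phase ("use exactly this severity classification table").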
</essential_principles>
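Choose the right pattern for your skill's structure. Read the full pattern descriptions in workflow-patterns.md.
How many distinct paths does the skill have?
|
+-- One path, always the same
|   +-- Does it perform destructive actions?
|       +-- YES -> Safety Gate Pattern
|       +-- NO  -> Linear Progression Pattern
|
+-- Multiple independent paths from shared setup
|   +-- Routing Pattern
|
+-- Multiple dependent steps in sequence
    +-- Do steps have complex dependencies?
        +-- YES -> Task-Driven Pattern
        +-- NO  -> Sequential Pipeline Pattern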
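| Pattern | Use When | Key Feature |
|---|---|---|
| Routing | Multiple independent tasks from shared intake | Routing table maps intent to workflow files |
| Sequential Pipeline | Dependent steps, each feeding the next | Auto-detection may resume from partial progress |
| Linear Progression | Single path, same every time | Numbered phases with entry/exit criteria |
| Safety Gate | Destructive/irreversible actions | Two confirmation gates before execution |
| Task-Driven | Complex dependencies, partial failure tolerance | TaskCreate/TaskUpdate with dependency tracking |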
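The decision tree and table above amount to a small lookup. As a sketch only, the function below encodes the same branches; the name `choose_pattern` and the labels `"single"`, `"independent"`, and `"dependent"` are hypothetical shorthand for the three top-level branches.

```python
def choose_pattern(paths: str, destructive: bool = False,
                   complex_dependencies: bool = False) -> str:
    """Mirror the pattern-selection tree.

    `paths` is one of "single", "independent", or "dependent"
    (hypothetical labels for the three top-level branches).
    """
    if paths == "single":
        # One path, always the same: gate it only if it is destructive.
        return "Safety Gate" if destructive else "Linear Progression"
    if paths == "independent":
        # Multiple independent paths from shared setup.
        return "Routing"
    if paths == "dependent":
        # Multiple dependent steps in sequence.
        return "Task-Driven" if complex_dependencies else "Sequential Pipeline"
    raise ValueError(f"unknown path type: {paths!r}")
```

For example, `choose_pattern("single", destructive=True)` returns `"Safety Gate"`.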
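Every workflow skill needs this skeleton, regardless of pattern:
---
name: kebab-case-name
description: "Third-person description with trigger keywords; this is how Claude decides to activate the skill"
allowed-tools:
  - [minimum tools needed]
# Optional fields (see tool-assignment-guide.md for the full reference):
# disable-model-invocation: true   # Only the user can invoke (not Claude)
# user-invocable: false            # Only Claude can invoke (hidden from the / menu)
# context: fork                    # Run in an isolated subagent context
# agent: Explore                   # Subagent type (requires context: fork)
# model: [model-name]              # Switch model when the skill is active
# argument-hint: "[filename]"      # Hint shown during autocomplete
---
# Title
## Essential Principles
[3-5 non-negotiable rules with WHY explanations]
## When to Use
[4-6 specific scenarios; scopes behavior after activation]
## When NOT to Use
[3-5 scenarios with named alternatives; scopes behavior after activation]
## [Pattern-Specific Section]
[Routing table / Pipeline steps / Phase list / Gates]
## Quick Reference
[Compact tables for frequently needed info]
## Reference Index
[Links to all supporting files]
## Success Criteria
[Checklist for output validation]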
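A draft SKILL.md can be compared against the skeleton above before review. The checker below is a rough sketch under stated assumptions: `check_skeleton` is a hypothetical name, the frontmatter is scanned line by line rather than parsed as full YAML, and section headings are assumed to use the English titles shown in the skeleton.

```python
import re

REQUIRED_KEYS = ("name", "description", "allowed-tools")
# The pattern-specific section is omitted because its title varies by pattern.
REQUIRED_SECTIONS = (
    "Essential Principles", "When to Use", "When NOT to Use",
    "Quick Reference", "Reference Index", "Success Criteria",
)


def check_skeleton(text: str) -> list[str]:
    """Return the skeleton elements missing from a SKILL.md string."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return ["missing frontmatter opening '---'"]
    try:
        end = next(i for i, line in enumerate(lines[1:], start=1)
                   if line.strip() == "---")
    except StopIteration:
        return ["missing frontmatter closing '---'"]

    front, body = lines[1:end], lines[end + 1:]
    # Top-level keys only: commented or indented lines do not match.
    keys = {m.group(1) for line in front
            if (m := re.match(r"([A-Za-z-]+):", line))}
    headings = {line[3:].strip() for line in body if line.startswith("## ")}

    problems = [f"missing frontmatter key: {k}" for k in REQUIRED_KEYS if k not in keys]
    problems += [f"missing section: {s}" for s in REQUIRED_SECTIONS if s not in headings]

    name = next((line.split(":", 1)[1].strip()
                 for line in front if line.startswith("name:")), "")
    if name and not re.fullmatch(r"[a-z0-9]+(?:-[a-z0-9]+)*", name):
        problems.append(f"name is not kebab-case: {name}")
    return problems
```

An empty result means every required key and section is present; it says nothing about whether the content of each section is any good.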
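Skills support three types of string substitution: dollar-prefixed variables for arguments and for the session ID, and exclamation-backtick syntax for shell preprocessing. The skill loader processes these before Claude sees the file, even inside code fences, so never use the raw syntax in documentation text. See tool-assignment-guide.md for the full variable reference and usage guidance.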
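These are the most common mistakes. The full catalog, with before/after fixes, is in anti-patterns.md.
| AP | Anti-Pattern | One-Line Fix |
|---|---|---|
| AP-1 | Missing goals/anti-goals | Add When to Use AND When NOT to Use sections |
| AP-2 | Monolithic SKILL.md (>500 lines) | Split into references/ and workflows/ |
| AP-3 | Reference chains (A -> B -> C) | All files one hop from SKILL.md |
| AP-4 | Hardcoded paths | Use {baseDir} for all internal paths |
| AP-5 | Broken file references | Verify every path resolves before submitting |
| AP-6 | Unnumbered phases | Number every phase with entry/exit criteria |
| AP-7 | Missing exit criteria | Define what "done" means for every phase |
| AP-8 | No verification step | Add validation at the end of every workflow |
| AP-9 | Vague routing keywords | Use distinctive keywords per workflow route |
| AP-11 | Wrong tool for the job | Use Glob/Grep/Read, not Bash equivalents |
| AP-12 | Overprivileged tools | Remove tools not actually used |
| AP-13 | Vague subagent prompts | Specify what to analyze, look for, and return |
| AP-15 | Reference dumps | Teach judgment, not raw documentation |
| AP-16 | Missing rationalizations | Add "Rationalizations to Reject" for audit skills |
| AP-17 | No concrete examples | Show input -> output for key instructions |
| AP-18 | Cartesian product tool calls | Combine patterns into a single regex, grep once, then filter |
| AP-19 | Unbounded subagent spawning | Batch items into groups, one subagent per batch |
| AP-20 | Description summarizes workflow | Description = triggering conditions only, never workflow steps |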
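AP-10 (No Default/Fallback Route) and AP-14 (Missing Tool Justification in Agents) appear only in the full catalog. AP-20 (Description Summarizes Workflow) is also in the full catalog and is repeated in the quick reference above because of its high impact.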
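AP-5 in the table above asks that every path resolves before submission. One way to check this is sketched below. It assumes that internal paths are written as `{baseDir}/relative/path`, in line with AP-4; the function name `unresolved_references` and the exact path pattern are hypothetical.

```python
import re
from pathlib import Path

# Assumption: internal references look like {baseDir}/references/foo.md.
REF_RE = re.compile(r"\{baseDir\}/([\w./-]+)")


def unresolved_references(skill_text: str, base_dir: Path) -> list[str]:
    """Return referenced paths that do not exist under base_dir."""
    missing = []
    for rel in REF_RE.findall(skill_text):
        rel = rel.rstrip(".")  # drop a sentence-ending period captured by the pattern
        if not (base_dir / rel).exists():
            missing.append(rel)
    return missing
```

Running it against the SKILL.md text and the skill's directory gives a list of broken references to fix before submitting.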
Map your component type to the right tool set. The full guide is in tool-assignment-guide.md.
| Component Type | Typical Tools |
|---|---|
| Read-only analysis skill | Read, Glob, Grep, TodoRead, TodoWrite |
| Interactive analysis skill | Read, Glob, Grep, AskUserQuestion, TodoRead, TodoWrite |
| Code generation skill | Read, Glob, Grep, Write, Bash, TodoRead, TodoWrite |
| Pipeline skill | Read, Write, Glob, Grep, Bash, AskUserQuestion, Task, TaskCreate, TaskList, TaskUpdate, TodoRead, TodoWrite |
| Read-only agent | Read, Grep, Glob, TodoRead, TodoWrite |
| Action agent | Read, Grep, Glob, Write, Bash, TodoRead, TodoWrite |
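Key rules:
- Use Glob (not find), Grep (not grep), and Read (not cat); always prefer dedicated tools
- Skills declare their tools under allowed-tools:, agents under tools:
When designing workflow skills, reject these shortcuts: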
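| Rationalization | Why It's Wrong |
|---|---|
| "It's obvious which phase comes next" | LLMs don't infer ordering from prose. Number the phases. |
| "Exit criteria are implied" | Implied criteria are skipped criteria. Write them explicitly. |
| "One big SKILL.md is simpler" | Simpler to write, worse to execute. The LLM loses focus past 500 lines. |
| "The description doesn't matter much" | The description is how the skill gets triggered. A bad description means wrong or missed activations. |
| "Bash can do everything" | Bash file operations are fragile. Dedicated tools handle encoding, permissions, and formatting better. |
| "The LLM will figure out the tools" | It will guess wrong. Specify exactly which tool to use for each operation. |
| "I'll add details later" | Incomplete skills ship incomplete. Design fully before writing. |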
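The tool rules above (list only the tools a component uses, and prefer Glob, Grep, and Read over find, grep, and cat) lend themselves to a simple audit. This is a sketch with hypothetical helper names, `audit_tools` and `suggest_dedicated_tool`; it only inspects the first word of a Bash command, so pipelines and wrappers are out of scope.

```python
import shlex
from typing import Optional

# Mapping taken from the key rules: Glob (not find), Grep (not grep), Read (not cat).
DEDICATED_TOOL = {"find": "Glob", "grep": "Grep", "cat": "Read"}


def audit_tools(declared: list[str], used: list[str]) -> dict[str, list[str]]:
    """Compare the tools a component declares with the tools its workflow uses."""
    return {
        "unused": sorted(set(declared) - set(used)),      # overprivileged (AP-12)
        "undeclared": sorted(set(used) - set(declared)),  # used but never listed
    }


def suggest_dedicated_tool(bash_command: str) -> Optional[str]:
    """Return the dedicated tool that replaces a Bash command, if any (AP-11)."""
    words = shlex.split(bash_command)
    return DEDICATED_TOOL.get(words[0]) if words else None
```

For instance, a read-only skill that declares Bash but never uses it would show Bash under `"unused"`, and `suggest_dedicated_tool("grep -rn TODO src")` returns `"Grep"`.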
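| File | Content |
|---|---|
| workflow-patterns.md | 5 patterns with structural skeletons and examples |
| anti-patterns.md | 20 anti-patterns with before/after fixes |
| tool-assignment-guide.md | Tool selection matrix, component comparison, subagent guidance |
| progressive-disclosure-guide.md | Content splitting rules, the 500-line rule, sizing guidelines |

| Workflow | Purpose |
|---|---|
| design-a-workflow-skill.md | 6-phase creation process from scope to self-review |
| review-checklist.md | Structured self-review checklist for submission readiness |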
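A well-designed workflow skill avoids the anti-patterns listed above; for example, it uses {baseDir} for all internal paths instead of hardcoded ones.
Weekly installs: 630
Repository: trailofbits/skills
GitHub stars: 3.9K
First seen: Feb 19, 2026
Security audits: Gen Agent Trust Hub (Pass), Socket (Pass), Snyk (Pass)
Installed on: codex (568), opencode (568), github-copilot (564), gemini-cli (563), cursor (563), amp (558)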