# competency-builder by jwynia/agent-skills
`npx skills add https://github.com/jwynia/agent-skills --skill competency-builder`
Build and operate competency frameworks that produce capability—not just completion. Diagnose where competency development is stuck and guide the next step.
Competencies are observable capabilities, not knowledge states. If you can't watch someone demonstrate it, it's not a competency.
Symptoms: Have training content but no competency structure. People complete training but can't apply it. Same questions keep getting asked.
Test:
Intervention: Start with failure modes. List mistakes you've seen, questions that shouldn't need asking, things that take too long. Each failure mode suggests a competency that would prevent it.
Symptoms: Started by listing all the information people need to know. Training is comprehensive but competence is low. "We trained on that" but mistakes continue.
Test:
Intervention: Reframe each content chunk as "what decision/action does this enable?" Kill orphan content that doesn't support a competency. Work backward from actions to required knowledge.
Symptoms: Competencies are knowledge states ("understands X") not capabilities ("can evaluate X against Y"). Can't tell if someone has the competency or not.
Test:
Intervention: Rewrite each competency as observable behavior. Transform:
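The rewrite step above can be roughed out as a lint. This is a minimal Python sketch, assuming competencies are plain strings; the verb list and example statements are illustrative, not exhaustive:

```python
# Flag competency statements written as knowledge states rather than
# observable behaviors. The verb list is an illustrative assumption.
KNOWLEDGE_STATE_VERBS = ("understands", "knows", "is familiar with", "is aware of")

def is_observable(competency: str) -> bool:
    """True if the statement reads as an observable capability ("Can ...")."""
    text = competency.strip().lower()
    return text.startswith("can ") and not any(
        verb in text for verb in KNOWLEDGE_STATE_VERBS
    )

statements = [
    "Understands data classification",
    "Can classify data according to organizational categories",
]
flagged = [s for s in statements if not is_observable(s)]
print(flagged)  # ['Understands data classification']
```

Anything flagged gets rewritten by asking "what would I watch someone do to confirm this?"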
Symptoms: Competencies defined but no way to test them. Assessment is knowledge recall (quizzes, multiple choice). People pass but fail in real situations.
Test:
Intervention: For each core competency, create a scenario that:
Create variants: interview (generic), assessment (org-specific), ongoing (real situations).
Symptoms: Scenarios exist but have artificial clarity. All information needed is provided. There's an obvious "right answer." People pass but fail in messy real situations.
Test:
Intervention: Add ambiguity. Remove artificial clarity. Include information that might be relevant but isn't, and omit information that would make the answer obvious. Test with real people—if everyone gets the same answer immediately, it's too simple.
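The "too simple" test above can be made concrete: if pilot testers converge on one answer almost immediately, the scenario lacks real judgment. A minimal Python sketch, where the 0.9 agreement threshold is an assumption to tune per context:

```python
from collections import Counter

def too_simple(pilot_answers: list[str], threshold: float = 0.9) -> bool:
    """Flag a scenario if nearly all pilot testers give the same answer."""
    if not pilot_answers:
        return False
    top_count = Counter(pilot_answers).most_common(1)[0][1]
    return top_count / len(pilot_answers) >= threshold

print(too_simple(["escalate", "escalate", "escalate", "escalate"]))  # True
print(too_simple(["escalate", "contain", "escalate", "notify"]))     # False
```

Disagreement among competent pilots is a good sign: it means the scenario tests judgment, not recall.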
Symptoms: Everyone gets the same training. Specialists are bored by basics. Generalists are overwhelmed by detail. One-size-fits-none.
Test:
Intervention: Define audience layers (typically General / Practitioner / Specialist). Map competencies to audiences. Layer content by depth:
Symptoms: Competencies exist but no clear order. Prerequisites unclear. No skip logic. Everyone follows the same path regardless of prior knowledge.
Test:
Intervention: Map dependencies. Build progression tree:
Foundation (everyone)
├── Prerequisite competencies
├─► Intermediate (builds on foundation)
└─► Role-specific branches (parallel tracks)
Define skip logic: what evidence allows skipping which modules?
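The tree and skip logic above can be sketched as a small dependency check in Python. The COMP IDs mirror the placeholders in the tree and are hypothetical:

```python
# Prerequisite edges of the progression tree (hypothetical placeholders).
PREREQS = {
    "COMP-3": {"COMP-1"},
    "COMP-4": {"COMP-2"},
    "COMP-5": {"COMP-3", "COMP-4"},
}

def unlocked(demonstrated: set[str]) -> set[str]:
    """Modules a learner may start: all prerequisites demonstrated,
    module itself not yet demonstrated."""
    return {
        comp for comp, reqs in PREREQS.items()
        if reqs <= demonstrated and comp not in demonstrated
    }

print(sorted(unlocked({"COMP-1", "COMP-2"})))  # ['COMP-3', 'COMP-4']
print(sorted(unlocked({"COMP-1", "COMP-2", "COMP-3", "COMP-4"})))  # ['COMP-5']
```

"Evidence allows skipping" then reduces to adding verified competencies directly to the `demonstrated` set.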
Symptoms: Assessment exists but doesn't gate anything. People skip or game it. No consequence for demonstrating vs. not demonstrating competency.
Test:
Intervention: Connect each verification to a decision:
If verification doesn't connect to a decision, question whether it's worth doing.
Symptoms: Framework built once and never updated. Questions keep arising that weren't anticipated. No visibility into what's not working.
Test:
Intervention: Implement feedback loop:
Symptoms: Framework was built months/years ago. Reality has changed but framework hasn't. Questions reveal framework doesn't match current state.
Test:
Intervention: Define:
Symptoms: Competencies observable, scenarios tested, progression mapped, verification meaningful, feedback loop active, maintenance owned.
Indicators:
When someone presents a competency development need:
| Pattern | Problem | Fix |
|---|---|---|
| Document Dump | Converting existing documentation into "training" without restructuring | Identify the decisions documentation supports. Build backward from decisions to content. |
| Quiz Fallacy | Assessing competency with knowledge recall questions | Replace with scenarios requiring judgment. Can't be answered by Ctrl+F. |
| Universal Training | One training for all audiences | Layer content. Define minimum viable competency per role. |
| Orphan Scenario | Scenario doesn't map to any defined competency | Either add the competency it tests, or cut the scenario. |
| Orphan Content | Content doesn't support any competency | Either identify the competency it serves, or cut the content. |
| Checkbox Completion | "Completed training" without demonstrated competency | Tie completion to demonstrated competency, not time spent. |
| Paper Perfect | Framework exists but isn't used; training continues as before | Pilot with real people. Get feedback. Iterate. |
| Build-Once | Framework created, never updated | Define triggers, owners, and cadence for maintenance. |
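The two orphan checks lend themselves to automation. A minimal Python sketch, assuming competency IDs are strings and scenarios/content are mapped to the IDs they reference (all names here are hypothetical):

```python
def find_orphans(
    competencies: set[str],
    scenario_map: dict[str, set[str]],
    content_map: dict[str, set[str]],
) -> tuple[set[str], set[str], set[str]]:
    """Return (orphan scenarios, orphan content, untested competencies)."""
    orphan_scenarios = {s for s, ids in scenario_map.items() if not ids & competencies}
    orphan_content = {c for c, ids in content_map.items() if not ids & competencies}
    covered = set().union(*scenario_map.values()) if scenario_map else set()
    untested = competencies - covered
    return orphan_scenarios, orphan_content, untested

scenarios, content, untested = find_orphans(
    competencies={"COMP-1", "COMP-2"},
    scenario_map={"incident-triage": {"COMP-1"}, "legacy-demo": {"COMP-9"}},
    content_map={"module-a": {"COMP-2"}, "appendix": set()},
)
print(scenarios, content, untested)
```

Here `legacy-demo` is an orphan scenario, `appendix` is orphan content, and `COMP-2` has no scenario testing it.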
## [Cluster Name] Competencies
| ID | Competency | Description |
|----|------------|-------------|
| [PREFIX]-1 | [Action verb phrase] | [Observable capability starting with "Can..."] |
### Scenario: [Name]
**Core decision structure:** [What judgment is being tested]
**Interview variant:**
> [Generic situation]
**Assessment variant:**
> [Organization-specific situation]
**Competencies assessed:** [IDs]
**What good looks like:**
- [Consideration]
**Red flags:**
- [Weak response indicator]
Foundation (Role: Everyone)
├── [COMP-1]: [Name]
└── [COMP-2]: [Name]
├─► Intermediate (Role: [Role])
│ ├── [COMP-3]: [Name] (requires: COMP-1)
│ └── [COMP-4]: [Name] (requires: COMP-2)
└─► Specialist (Role: [Role])
└── [COMP-5]: [Name] (requires: COMP-3, COMP-4)
## Feedback Loop Design
**Observation mechanism:**
- How questions are logged
- What context is captured
- How they're tagged to competencies
**Analysis cadence:** [frequency]
**Pattern categories:**
- Training gap: [who handles]
- Framework gap: [who handles]
- Process gap: [who handles]
- Tooling gap: [who handles]
**Change tracking:**
- How changes are documented
- How effectiveness is measured
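The observation-and-analysis loop above can be sketched as a tiny log-and-count pipeline in Python. Field names and categories here are illustrative assumptions:

```python
from collections import Counter

# Each logged question is tagged with a competency ID and a pattern
# category (training / framework / process / tooling gap).
log = [
    {"competency": "COMP-1", "category": "training"},
    {"competency": "COMP-1", "category": "training"},
    {"competency": "COMP-4", "category": "tooling"},
]

def gap_report(entries: list[dict]) -> Counter:
    """Count questions per (competency, category) to surface hot spots."""
    return Counter((e["competency"], e["category"]) for e in entries)

print(gap_report(log).most_common(1))  # [(('COMP-1', 'training'), 2)]
```

Run the report at the analysis cadence; the top entries tell the relevant owner where the framework is leaking.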
If starting small:
Expand based on what you learn from using it.
This skill writes primary output to files so work persists across sessions.
Before doing any other work:
- Check for `context/output-config.md` in the project.
- If a context network exists, write output to `explorations/competency/` (or a sensible location for this project) and record the location in `context/output-config.md`.
- Otherwise, write to `.competency-builder-output.md` at project root.

For this skill, persist:
| Goes to File | Stays in Conversation |
|---|---|
| State diagnosis | Clarifying questions |
| Competency definitions | Discussion of failure modes |
| Scenario templates | Iteration on structure |
| Framework architecture | Real-time feedback |
Pattern: `{domain}-competency-{date}.md`
Example: `ai-literacy-competency-2025-01-15.md`
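The naming pattern is trivial to generate programmatically; a one-function Python sketch:

```python
from datetime import date

def output_filename(domain: str, on: date) -> str:
    """Build the {domain}-competency-{date}.md output filename."""
    return f"{domain}-competency-{on.isoformat()}.md"

print(output_filename("ai-literacy", date(2025, 1, 15)))
# ai-literacy-competency-2025-01-15.md
```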
During competency framework development, ask:
| Skill | Connection |
|---|---|
| research | Use when building L3 content that requires domain expertise |
| framework-development | Related but distinct: frameworks capture knowledge; competency frameworks build capability |
| framework-to-mastra | Competency framework + feedback loop = deployable agent |
User: "We have a 40-page security policy. Everyone 'completes' the training but keeps making mistakes."
Diagnosis: CF1 (Content-First Trap)
Questions to ask:
Guidance: "Each mistake suggests a competency gap. Let's work backward: if someone incorrectly handles sensitive data, the missing competency might be 'Can classify data according to organizational categories.' Once we have 3-5 competencies from failure modes, we'll design scenarios that test whether someone can actually apply the knowledge—not just recall it."
Derived from: references/competency-framework-development.md
- Weekly Installs: 105
- Repository: jwynia/agent-skills
- GitHub Stars: 42
- First Seen: Jan 20, 2026

Security audits:
- Gen Agent Trust Hub: Pass
- Socket: Pass
- Snyk: Warn

Installed on:
- opencode: 89
- codex: 86
- gemini-cli: 84
- github-copilot: 79
- cursor: 79
- claude-code: 74