# competency-builder by jwynia/agent-skills
`npx skills add https://github.com/jwynia/agent-skills --skill competency-builder`
Build and operate competency frameworks that produce capability—not just completion. Diagnose where competency development is stuck and guide the next step.
Competencies are observable capabilities, not knowledge states. If you can't watch someone demonstrate it, it's not a competency.
Symptoms: Have training content but no competency structure. People complete training but can't apply it. Same questions keep getting asked.
Test:
Intervention: Start with failure modes. List mistakes you've seen, questions that shouldn't need asking, things that take too long. Each failure mode suggests a competency that would prevent it.
Symptoms: Started by listing all the information people need to know. Training is comprehensive but competence is low. "We trained on that" but mistakes continue.
Test:
Intervention: Reframe each content chunk as "what decision/action does this enable?" Kill orphan content that doesn't support a competency. Work backward from actions to required knowledge.
Symptoms: Competencies are knowledge states ("understands X") not capabilities ("can evaluate X against Y"). Can't tell if someone has the competency or not.
Test:
Intervention: Rewrite each competency as observable behavior. Transform:
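The rewrite step above can be roughed out as a lint. This is a minimal Python sketch, assuming competencies are plain strings; the verb list and example statements are illustrative, not exhaustive:

```python
# Flag competency statements written as knowledge states rather than
# observable behaviors. The verb list is an illustrative assumption.
KNOWLEDGE_STATE_VERBS = ("understands", "knows", "is familiar with", "is aware of")

def is_observable(competency: str) -> bool:
    """True if the statement reads as an observable capability ("Can ...")."""
    text = competency.strip().lower()
    return text.startswith("can ") and not any(
        verb in text for verb in KNOWLEDGE_STATE_VERBS
    )

statements = [
    "Understands data classification",
    "Can classify data according to organizational categories",
]
flagged = [s for s in statements if not is_observable(s)]
print(flagged)  # ['Understands data classification']
```

Anything flagged gets rewritten by asking "what would I watch someone do to confirm this?"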
Symptoms: Competencies defined but no way to test them. Assessment is knowledge recall (quizzes, multiple choice). People pass but fail in real situations.
Test:
Intervention: For each core competency, create a scenario that:
Create variants: interview (generic), assessment (org-specific), ongoing (real situations).
Symptoms: Scenarios exist but have artificial clarity. All information needed is provided. There's an obvious "right answer." People pass but fail in messy real situations.
Test:
Intervention: Add ambiguity. Remove artificial clarity. Include information that might be relevant but isn't, and omit information that would make the answer obvious. Test with real people—if everyone gets the same answer immediately, it's too simple.
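The "too simple" test above can be made concrete: if pilot testers converge on one answer almost immediately, the scenario lacks real judgment. A minimal Python sketch, where the 0.9 agreement threshold is an assumption to tune per context:

```python
from collections import Counter

def too_simple(pilot_answers: list[str], threshold: float = 0.9) -> bool:
    """Flag a scenario if nearly all pilot testers give the same answer."""
    if not pilot_answers:
        return False
    top_count = Counter(pilot_answers).most_common(1)[0][1]
    return top_count / len(pilot_answers) >= threshold

print(too_simple(["escalate", "escalate", "escalate", "escalate"]))  # True
print(too_simple(["escalate", "contain", "escalate", "notify"]))     # False
```

Disagreement among competent pilots is a good sign: it means the scenario tests judgment, not recall.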
Symptoms: Everyone gets the same training. Specialists are bored by basics. Generalists are overwhelmed by detail. One-size-fits-none.
Test:
Intervention: Define audience layers (typically General / Practitioner / Specialist). Map competencies to audiences. Layer content by depth:
Symptoms: Competencies exist but no clear order. Prerequisites unclear. No skip logic. Everyone follows the same path regardless of prior knowledge.
Test:
Intervention: Map dependencies. Build progression tree:
Foundation (everyone)
├── Prerequisite competencies
├─► Intermediate (builds on foundation)
└─► Role-specific branches (parallel tracks)
Define skip logic: what evidence allows skipping which modules?
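The tree and skip logic above can be sketched as a small dependency check in Python. The COMP IDs mirror the placeholders in the tree and are hypothetical:

```python
# Prerequisite edges of the progression tree (hypothetical placeholders).
PREREQS = {
    "COMP-3": {"COMP-1"},
    "COMP-4": {"COMP-2"},
    "COMP-5": {"COMP-3", "COMP-4"},
}

def unlocked(demonstrated: set[str]) -> set[str]:
    """Modules a learner may start: all prerequisites demonstrated,
    module itself not yet demonstrated."""
    return {
        comp for comp, reqs in PREREQS.items()
        if reqs <= demonstrated and comp not in demonstrated
    }

print(sorted(unlocked({"COMP-1", "COMP-2"})))  # ['COMP-3', 'COMP-4']
print(sorted(unlocked({"COMP-1", "COMP-2", "COMP-3", "COMP-4"})))  # ['COMP-5']
```

"Evidence allows skipping" then reduces to adding verified competencies directly to the `demonstrated` set.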
Symptoms: Assessment exists but doesn't gate anything. People skip or game it. No consequence for demonstrating vs. not demonstrating competency.
Test:
Intervention: Connect each verification to a decision:
If verification doesn't connect to a decision, question whether it's worth doing.
Symptoms: Framework built once and never updated. Questions keep arising that weren't anticipated. No visibility into what's not working.
Test:
Intervention: Implement feedback loop:
Symptoms: Framework was built months/years ago. Reality has changed but framework hasn't. Questions reveal framework doesn't match current state.
Test:
Intervention: Define:
Symptoms: Competencies observable, scenarios tested, progression mapped, verification meaningful, feedback loop active, maintenance owned.
Indicators:
When someone presents a competency development need:
| Pattern | Problem | Fix |
|---|---|---|
| Document Dump | Converting existing documentation into "training" without restructuring | Identify the decisions documentation supports. Build backward from decisions to content. |
| Quiz Fallacy | Assessing competency with knowledge recall questions | Replace with scenarios requiring judgment. Can't be answered by Ctrl+F. |
| Universal Training | One training for all audiences | Layer content. Define minimum viable competency per role. |
| Orphan Scenario | Scenario doesn't map to any defined competency | Either add the competency it tests, or cut the scenario. |
| Orphan Content | Content doesn't support any competency | Either identify the competency it serves, or cut the content. |
| Checkbox Completion | "Completed training" without demonstrated competency | Tie completion to demonstrated competency, not time spent. |
| Paper Perfect | Framework exists but isn't used; training continues as before | Pilot with real people. Get feedback. Iterate. |
| Build-Once | Framework created, never updated | Define triggers, owners, and cadence for maintenance. |
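The two orphan checks lend themselves to automation. A minimal Python sketch, assuming competency IDs are strings and scenarios/content are mapped to the IDs they reference (all names here are hypothetical):

```python
def find_orphans(
    competencies: set[str],
    scenario_map: dict[str, set[str]],
    content_map: dict[str, set[str]],
) -> tuple[set[str], set[str], set[str]]:
    """Return (orphan scenarios, orphan content, untested competencies)."""
    orphan_scenarios = {s for s, ids in scenario_map.items() if not ids & competencies}
    orphan_content = {c for c, ids in content_map.items() if not ids & competencies}
    covered = set().union(*scenario_map.values()) if scenario_map else set()
    untested = competencies - covered
    return orphan_scenarios, orphan_content, untested

scenarios, content, untested = find_orphans(
    competencies={"COMP-1", "COMP-2"},
    scenario_map={"incident-triage": {"COMP-1"}, "legacy-demo": {"COMP-9"}},
    content_map={"module-a": {"COMP-2"}, "appendix": set()},
)
print(scenarios, content, untested)
```

Here `legacy-demo` is an orphan scenario, `appendix` is orphan content, and `COMP-2` has no scenario testing it.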
## [Cluster Name] Competencies
| ID | Competency | Description |
|----|------------|-------------|
| [PREFIX]-1 | [Action verb phrase] | [Observable capability starting with "Can..."] |
### Scenario: [Name]
**Core decision structure:** [What judgment is being tested]
**Interview variant:**
> [Generic situation]
**Assessment variant:**
> [Organization-specific situation]
**Competencies assessed:** [IDs]
**What good looks like:**
- [Consideration]
**Red flags:**
- [Weak response indicator]
Foundation (Role: Everyone)
├── [COMP-1]: [Name]
└── [COMP-2]: [Name]
├─► Intermediate (Role: [Role])
│ ├── [COMP-3]: [Name] (requires: COMP-1)
│ └── [COMP-4]: [Name] (requires: COMP-2)
└─► Specialist (Role: [Role])
└── [COMP-5]: [Name] (requires: COMP-3, COMP-4)
## Feedback Loop Design
**Observation mechanism:**
- How questions are logged
- What context is captured
- How they're tagged to competencies
**Analysis cadence:** [frequency]
**Pattern categories:**
- Training gap: [who handles]
- Framework gap: [who handles]
- Process gap: [who handles]
- Tooling gap: [who handles]
**Change tracking:**
- How changes are documented
- How effectiveness is measured
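The observation-and-analysis loop above can be sketched as a tiny log-and-count pipeline in Python. Field names and categories here are illustrative assumptions:

```python
from collections import Counter

# Each logged question is tagged with a competency ID and a pattern
# category (training / framework / process / tooling gap).
log = [
    {"competency": "COMP-1", "category": "training"},
    {"competency": "COMP-1", "category": "training"},
    {"competency": "COMP-4", "category": "tooling"},
]

def gap_report(entries: list[dict]) -> Counter:
    """Count questions per (competency, category) to surface hot spots."""
    return Counter((e["competency"], e["category"]) for e in entries)

print(gap_report(log).most_common(1))  # [(('COMP-1', 'training'), 2)]
```

Run the report at the analysis cadence; the top entries tell the relevant owner where the framework is leaking.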
If starting small:
Expand based on what you learn from using it.
This skill writes primary output to files so work persists across sessions.
Before doing any other work:
- Check for `context/output-config.md` in the project.
- If a context network exists, write output to `explorations/competency/` (or a sensible location for this project) and record the location in `context/output-config.md`.
- Otherwise, write to `.competency-builder-output.md` at project root.

For this skill, persist:
| Goes to File | Stays in Conversation |
|---|---|
| State diagnosis | Clarifying questions |
| Competency definitions | Discussion of failure modes |
| Scenario templates | Iteration on structure |
| Framework architecture | Real-time feedback |
Pattern: `{domain}-competency-{date}.md`
Example: `ai-literacy-competency-2025-01-15.md`
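The naming pattern is trivial to generate programmatically; a one-function Python sketch:

```python
from datetime import date

def output_filename(domain: str, on: date) -> str:
    """Build the {domain}-competency-{date}.md output filename."""
    return f"{domain}-competency-{on.isoformat()}.md"

print(output_filename("ai-literacy", date(2025, 1, 15)))
# ai-literacy-competency-2025-01-15.md
```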
During competency framework development, ask:
| Skill | Connection |
|---|---|
| research | Use when building L3 content that requires domain expertise |
| framework-development | Related but distinct: frameworks capture knowledge; competency frameworks build capability |
| framework-to-mastra | Competency framework + feedback loop = deployable agent |
User: "We have a 40-page security policy. Everyone 'completes' the training but keeps making mistakes."
Diagnosis: CF1 (Content-First Trap)
Questions to ask:
Guidance: "Each mistake suggests a competency gap. Let's work backward: if someone incorrectly handles sensitive data, the missing competency might be 'Can classify data according to organizational categories.' Once we have 3-5 competencies from failure modes, we'll design scenarios that test whether someone can actually apply the knowledge—not just recall it."
Derived from: references/competency-framework-development.md
- Weekly Installs: 105
- Repository: jwynia/agent-skills
- GitHub Stars: 42
- First Seen: Jan 20, 2026

Security audits:
- Gen Agent Trust Hub: Pass
- Socket: Pass
- Snyk: Warn

Installed on:
- opencode: 89
- codex: 86
- gemini-cli: 84
- github-copilot: 79
- cursor: 79
- claude-code: 74