Fishbone Diagram Cause-and-Effect Analysis Tool - Systematic Root-Cause Diagnosis and Solution Prioritization

kaizen:cause-and-effect by neolabhq/context-engineering-kit

208 weekly installs

699 GitHub Stars

Install command

npx skills add https://github.com/neolabhq/context-engineering-kit --skill kaizen:cause-and-effect

Quality management, Methodology, Project management

🇺🇸English

Cause and Effect Analysis

Apply Fishbone (Ishikawa) diagram analysis to systematically explore all potential causes of a problem across multiple categories.

Description

Systematically examine potential causes across six categories: People, Process, Technology, Environment, Methods, and Materials. This creates a structured "fishbone" view that identifies contributing factors.

Usage

/cause-and-effect [problem_description]

Variables

PROBLEM: Issue to analyze (default: prompt for input)
CATEGORIES: Categories to explore (default: all six)
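
As a rough illustration of how the two variables and their defaults could be resolved, here is a minimal sketch. `resolve_args` and `ALL_CATEGORIES` are hypothetical names invented for this example; the skill itself is a prompt, and nothing here is taken from its implementation.

```python
# Hypothetical constant: the six categories named in the Description.
ALL_CATEGORIES = ("People", "Process", "Technology", "Environment", "Methods", "Materials")


def resolve_args(problem=None, categories=None):
    """Hypothetical helper: apply the documented defaults for PROBLEM and CATEGORIES."""
    if not problem or not problem.strip():
        # The documented default for PROBLEM is to prompt for input;
        # this sketch only signals that a prompt is needed.
        raise ValueError("PROBLEM is required: prompt the user for a problem description")
    if not categories:
        # Documented default for CATEGORIES: all six.
        return problem.strip(), list(ALL_CATEGORIES)
    # Match requested names case-insensitively and reject anything outside the six.
    by_lower = {name.lower(): name for name in ALL_CATEGORIES}
    unknown = [c for c in categories if c.strip().lower() not in by_lower]
    if unknown:
        raise ValueError(f"Unknown categories: {unknown}")
    wanted = {c.strip().lower() for c in categories}
    # Keep the canonical order regardless of the order the caller used.
    return problem.strip(), [name for name in ALL_CATEGORIES if name.lower() in wanted]
```

Calling `resolve_args("API is slow")` returns the problem text together with all six categories, while passing `["process", "People"]` narrows the analysis to those two in canonical order.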

Steps

1. State the problem clearly (the "head" of the fish)
2. For each category, brainstorm potential causes:
   - People: Skills, training, communication, team dynamics
   - Process: Workflows, procedures, standards, reviews
   - Technology: Tools, infrastructure, dependencies, configuration
   - Environment: Workspace, deployment targets, external factors
   - Methods: Approaches, patterns, architectures, practices
   - Materials: Data, dependencies, third-party services, resources
3. For each potential cause, ask "why" to dig deeper
4. Distinguish contributing factors from root causes
5. Prioritize causes by impact and likelihood
6. Propose solutions for the highest-priority causes
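
The steps above produce a tree: a problem, categories under it, causes under each category, and "why" answers under each cause. One way to hold and print that structure is sketched below. The `Cause` and `Fishbone` classes are hypothetical names for illustration only; the output format simply mimics the text trees used in the examples that follow.

```python
from dataclasses import dataclass, field


@dataclass
class Cause:
    text: str
    whys: list[str] = field(default_factory=list)  # answers to successive "why?" questions


@dataclass
class Fishbone:
    problem: str  # the "head" of the fish
    categories: dict[str, list[Cause]] = field(default_factory=dict)

    def add(self, category: str, text: str, *whys: str) -> None:
        """Record one potential cause (and any 'why' answers) under a category."""
        self.categories.setdefault(category.upper(), []).append(Cause(text, list(whys)))

    def render(self) -> str:
        """Render the diagram as a plain-text tree."""
        lines = [f"Problem: {self.problem}"]
        for category, causes in self.categories.items():
            lines += ["", category]
            for i, cause in enumerate(causes):
                last = i == len(causes) - 1
                lines.append(("└─ " if last else "├─ ") + cause.text)
                # Keep the vertical bar under every cause except the last one.
                prefix = "   " if last else "│  "
                for why in cause.whys:
                    lines.append(f"{prefix}└─ Why: {why}")
        return "\n".join(lines)


diagram = Fishbone("API responses take 3+ seconds (target: <500ms)")
diagram.add("Technology", "N+1 queries in ORM", "Eager loading not configured")
diagram.add("Technology", "No caching layer", "Redis not in tech stack")
print(diagram.render())
```

Keeping the "why" answers attached to their cause makes it easier to separate contributing factors from root causes later, since the deepest answer in each chain is the natural root-cause candidate.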

Examples

Example 1: API Response Latency

Problem: API responses take 3+ seconds (target: <500ms)

PEOPLE
├─ Team unfamiliar with performance optimization
├─ No one owns performance monitoring
└─ Frontend team doesn't understand backend constraints

PROCESS
├─ No performance testing in CI/CD
├─ No SLA defined for response times
└─ Performance regression not caught in code review

TECHNOLOGY
├─ Database queries not optimized
│  └─ Why: No query analysis tools in place
├─ N+1 queries in ORM
│  └─ Why: Eager loading not configured
├─ No caching layer
│  └─ Why: Redis not in tech stack
└─ Synchronous external API calls
   └─ Why: No async architecture in place

ENVIRONMENT
├─ Production uses smaller database instance than needed
├─ No CDN for static assets
└─ Single region deployment (high latency for distant users)

METHODS
├─ REST API design requires multiple round trips
├─ No pagination on large datasets
└─ Full object serialization instead of selective fields

MATERIALS
├─ Large JSON payloads (unnecessary data)
├─ Uncompressed responses
└─ Third-party API (payment gateway) is slow
   └─ Why: Free tier with rate limiting

ROOT CAUSES:
- No performance requirements defined (Process)
- Missing performance monitoring tooling (Technology)
- Architecture doesn't support caching/async (Methods)

SOLUTIONS (Priority Order):
1. Add database indexes (quick win, high impact)
2. Implement Redis caching layer (medium effort, high impact)
3. Make external API calls async with webhooks (high effort, high impact)
4. Define and monitor performance SLAs (low effort, prevents regression)
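
To make solution 1 concrete, the sketch below uses Python's built-in `sqlite3` module to show how a query plan changes once an index exists. The `orders` table, its columns, and the index name are made up for this illustration; a production database would have its own schema and its own plan-inspection tooling.

```python
import sqlite3

# Hypothetical schema, populated with 1000 rows spread over 100 customers.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)")
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 100, float(i)) for i in range(1000)],
)


def query_plan(connection, sql, params=()):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[3] for row in rows)


sql = "SELECT * FROM orders WHERE customer_id = ?"

# Without an index, SQLite has to scan the whole table.
plan_before = query_plan(conn, sql, (42,))

# With an index on the filtered column, it can search the index instead.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = query_plan(conn, sql, (42,))

print("before:", plan_before)
print("after: ", plan_after)
```

The exact plan wording varies between SQLite versions, but the first plan reports a scan of `orders` and the second reports a search using `idx_orders_customer`. Checking the plan before and after is also a cheap way to address the "No query analysis tools in place" cause.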

Example 2: Flaky Test Suite

Problem: 15% of test runs fail but pass on retry

PEOPLE
├─ Test-writing skills vary across team
├─ New developers copy existing flaky patterns
└─ No one assigned to fix flaky tests

PROCESS
├─ Flaky tests marked as "known issue" and ignored
├─ No policy against merging with flaky tests
└─ Test failures don't block deployments

TECHNOLOGY
├─ Race conditions in async test setup
├─ Tests share global state
├─ Test database not isolated per test
├─ setTimeout used instead of proper waiting
└─ CI environment inconsistent (different CPU/memory)

ENVIRONMENT
├─ CI runner under heavy load
├─ Network timing varies (external API mocks flaky)
└─ Timezone differences between local and CI

METHODS
├─ Integration tests not properly isolated
├─ No retry logic for legitimate timing issues
└─ Tests depend on execution order

MATERIALS
├─ Test data fixtures overlap
├─ Shared test database polluted
└─ Mock data doesn't match production patterns

ROOT CAUSES:
- No test isolation strategy (Methods + Technology)
- Process accepts flaky tests (Process)
- Async timing not handled properly (Technology)

SOLUTIONS:
1. Implement per-test database isolation (high impact)
2. Replace setTimeout with proper async/await patterns (medium impact)
3. Add pre-commit hook blocking flaky test patterns (prevents new issues)
4. Enforce policy: flaky test = block merge (process change)
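
Solution 2 refers to JavaScript's `setTimeout`; the sketch below shows the same idea as a Python `asyncio` analogue, which is a substitution made for this illustration. All function names are hypothetical. The flaky check sleeps for a guessed duration, while the stable check waits on an explicit completion signal bounded by a timeout.

```python
import asyncio


async def background_job(done: asyncio.Event, results: list, delay: float) -> None:
    await asyncio.sleep(delay)  # simulated work of unpredictable duration
    results.append("saved")
    done.set()                  # explicit completion signal


async def check_with_fixed_sleep(delay: float) -> bool:
    """Flaky pattern: guess how long the job takes."""
    done, results = asyncio.Event(), []
    task = asyncio.create_task(background_job(done, results, delay))
    await asyncio.sleep(0.05)   # passes only if the job happens to finish in time
    ok = results == ["saved"]
    await task                  # let the job finish so no task is left pending
    return ok


async def check_with_explicit_wait(delay: float) -> bool:
    """Stable pattern: wait for the completion signal, bounded by a timeout."""
    done, results = asyncio.Event(), []
    task = asyncio.create_task(background_job(done, results, delay))
    await asyncio.wait_for(done.wait(), timeout=2.0)
    await task
    return results == ["saved"]


# The job takes 0.2 s, longer than the 0.05 s guess in the flaky check.
fixed_result = asyncio.run(check_with_fixed_sleep(0.2))
explicit_result = asyncio.run(check_with_explicit_wait(0.2))
print(fixed_result, explicit_result)
```

With a 0.2 s job the fixed-sleep check reports `False` and the explicit wait reports `True`. On a lightly loaded machine with a faster job, the fixed-sleep version would pass, which is how this pattern produces tests that fail only some of the time.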

Example 3: Feature Takes 3 Months Instead of 3 Weeks

Problem: Simple CRUD feature took 12 weeks against a 3-week estimate

PEOPLE
├─ Developer unfamiliar with codebase
├─ Key architect on vacation during critical phase
└─ Designer changed requirements mid-development

PROCESS
├─ Requirements not finalized before starting
├─ No code review for first 6 weeks (large diff)
├─ Multiple rounds of design revision
└─ QA started late (found issues in week 10)

TECHNOLOGY
├─ Codebase has high coupling (change ripple effects)
├─ No automated tests (manual testing slow)
├─ Legacy code required refactoring first
└─ Development environment setup took 2 weeks

ENVIRONMENT
├─ Staging environment broken for 3 weeks
├─ Production data needed for testing (compliance delay)
└─ Dependencies blocked by another team

METHODS
├─ No incremental delivery (big bang approach)
├─ Over-engineering (added future features "while we're at it")
└─ No design doc (discovered issues during implementation)

MATERIALS
├─ Third-party API changed during development
├─ Production data model differs from staging
└─ Missing design assets (waited for designer)

ROOT CAUSES:
- No requirements lock-down before start (Process)
- Architecture prevents incremental changes (Technology)
- Big bang approach vs. iterative (Methods)
- Development environment not automated (Technology)

SOLUTIONS:
1. Require design doc + finalized requirements before starting (Process)
2. Implement feature flags for incremental delivery (Methods)
3. Automate dev environment setup (Technology)
4. Refactor high-coupling areas (Technology, long-term)
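
Solution 2 can be sketched as a small percentage-based feature flag check. Everything here is an assumption made for illustration: the flag names, the `FLAGS` table, and the hashing scheme are hypothetical, and a real project would more likely use an existing flag service or library.

```python
import hashlib

# Hypothetical flag table: name -> percentage of users who see the feature (0-100).
FLAGS = {"new_crud_form": 25, "bulk_export": 0, "audit_log": 100}


def bucket(flag: str, user_id: str) -> int:
    """Map (flag, user) to a stable bucket in 0..99 so a user's experience does not flip."""
    digest = hashlib.sha256(f"{flag}:{user_id}".encode()).hexdigest()
    return int(digest, 16) % 100


def is_enabled(flag: str, user_id: str) -> bool:
    # Unknown flags default to off, so unfinished code paths stay dark.
    return bucket(flag, user_id) < FLAGS.get(flag, 0)


def render_form(user_id: str) -> str:
    """Ship the new code path behind the flag while the legacy path keeps working."""
    return "new form" if is_enabled("new_crud_form", user_id) else "legacy form"
```

Because the new path is merged but dark by default, work can land in small reviewed increments instead of one large diff, which targets the "big bang approach" and "No code review for first 6 weeks" causes together.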

Notes

- Fishbone analysis reveals systemic issues across domains.
- Multiple causes often combine to create a problem.
- Don't stop at the first cause in each category; dig deeper.
- Some causes span multiple categories; mark them as such.
- Root causes usually lie in Process or Methods, not just in Technology.
- Use with the /why command for deeper analysis of specific causes.
- Prioritize solutions by: impact × feasibility ÷ effort.
- Address root causes, not just symptoms.
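
The prioritization formula can be applied mechanically once each solution has been scored. The sketch below assumes a 1 to 5 scale for each factor; the scale and the individual scores are invented for illustration and do not come from the source. The solution names are borrowed from Example 1.

```python
from dataclasses import dataclass


@dataclass
class Solution:
    name: str
    impact: int       # 1 (low) to 5 (high), hypothetical scale
    feasibility: int  # 1 (hard to do) to 5 (easy to do)
    effort: int       # 1 (small) to 5 (large); must be at least 1 to avoid dividing by zero

    @property
    def priority(self) -> float:
        # impact × feasibility ÷ effort, as given in the notes above
        return self.impact * self.feasibility / self.effort


# Scores below are made up for the sake of the example.
candidates = [
    Solution("Add database indexes", impact=5, feasibility=5, effort=1),
    Solution("Implement Redis caching layer", impact=5, feasibility=4, effort=3),
    Solution("Make external API calls async with webhooks", impact=5, feasibility=3, effort=5),
    Solution("Define and monitor performance SLAs", impact=3, feasibility=5, effort=1),
]

ranked = sorted(candidates, key=lambda s: s.priority, reverse=True)
for s in ranked:
    print(f"{s.priority:5.2f}  {s.name}")
```

With these particular scores the SLA work ranks second rather than last, which differs from the order listed in Example 1. The formula is a starting point for discussion, and the ranking is only as good as the scores fed into it.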

First Seen

Feb 19, 2026

Installed on

opencode: 203
github-copilot: 201
codex: 201
gemini-cli: 200
kimi-cli: 198
cursor: 198
