kaizen:cause-and-effect by neolabhq/context-engineering-kit
npx skills add https://github.com/neolabhq/context-engineering-kit --skill kaizen:cause-and-effect
Apply Fishbone (Ishikawa) diagram analysis to systematically explore all potential causes of a problem across multiple categories.
Systematically examine potential causes across six categories: People, Process, Technology, Environment, Methods, and Materials. The result is a structured "fishbone" view of the contributing factors.
/cause-and-effect [problem_description]
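The six-category structure can be modeled directly in code. The following is a minimal Python sketch, not part of the skill itself: the `Cause` and `Fishbone` names and the `render` method are hypothetical, chosen only to show how causes grouped by category map onto the tree layout used in the examples.

```python
from dataclasses import dataclass, field
from typing import Optional

# The six categories named in the text, in the order they are listed.
CATEGORIES = ("People", "Process", "Technology", "Environment", "Methods", "Materials")


@dataclass
class Cause:
    """One bone on the diagram, with an optional deeper 'Why' explanation."""
    text: str
    why: Optional[str] = None


@dataclass
class Fishbone:
    """Hypothetical container: a problem statement plus causes per category."""
    problem: str
    causes: dict[str, list[Cause]] = field(
        default_factory=lambda: {name: [] for name in CATEGORIES}
    )

    def add(self, category: str, text: str, why: Optional[str] = None) -> None:
        if category not in self.causes:
            raise ValueError(f"unknown category: {category}")
        self.causes[category].append(Cause(text, why))

    def render(self) -> str:
        lines = [f"Problem: {self.problem}"]
        for category, causes in self.causes.items():
            if not causes:
                continue  # skip categories with no recorded causes
            lines.append(category.upper())
            for i, cause in enumerate(causes):
                last = i == len(causes) - 1
                lines.append(("└─ " if last else "├─ ") + cause.text)
                if cause.why:
                    # Keep the vertical rule only while sibling causes follow.
                    lines.append(("   " if last else "│  ") + "└─ Why: " + cause.why)
        return "\n".join(lines)


diagram = Fishbone("API responses take 3+ seconds (target: <500ms)")
diagram.add("Technology", "N+1 queries in ORM", why="Eager loading not configured")
diagram.add("Technology", "No caching layer")
diagram.add("Process", "No SLA defined for response times")
print(diagram.render())
```

Categories are rendered in the fixed order above regardless of insertion order, so diagrams for different problems stay comparable.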
Problem: API responses take 3+ seconds (target: <500ms)
PEOPLE
├─ Team unfamiliar with performance optimization
├─ No one owns performance monitoring
└─ Frontend team doesn't understand backend constraints
PROCESS
├─ No performance testing in CI/CD
├─ No SLA defined for response times
└─ Performance regression not caught in code review
TECHNOLOGY
├─ Database queries not optimized
│ └─ Why: No query analysis tools in place
├─ N+1 queries in ORM
│ └─ Why: Eager loading not configured
├─ No caching layer
│ └─ Why: Redis not in tech stack
└─ Synchronous external API calls
   └─ Why: No async architecture in place
ENVIRONMENT
├─ Production uses smaller database instance than needed
├─ No CDN for static assets
└─ Single region deployment (high latency for distant users)
METHODS
├─ REST API design requires multiple round trips
├─ No pagination on large datasets
└─ Full object serialization instead of selective fields
MATERIALS
├─ Large JSON payloads (unnecessary data)
├─ Uncompressed responses
└─ Third-party API (payment gateway) is slow
   └─ Why: Free tier with rate limiting
ROOT CAUSES:
- No performance requirements defined (Process)
- Missing performance monitoring tooling (Technology)
- Architecture doesn't support caching/async (Methods)
SOLUTIONS (Priority Order):
1. Add database indexes (quick win, high impact)
2. Implement Redis caching layer (medium effort, high impact)
3. Make external API calls async with webhooks (high effort, high impact)
4. Define and monitor performance SLAs (low effort, prevents regression)
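To make the caching solution concrete, here is a small cache-aside sketch in Python. It is an illustration under stated assumptions, not the skill's prescribed implementation: `TTLCache` is a hypothetical in-memory stand-in for a Redis-style cache, and `slow_query` is a placeholder for the unoptimized database call.

```python
import time
from typing import Any, Callable


class TTLCache:
    """In-memory stand-in for a cache such as Redis (assumption: single process)."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so expiry can be exercised without sleeping
        self._store: dict[str, tuple[float, Any]] = {}

    def get_or_load(self, key: str, loader: Callable[[], Any]) -> Any:
        now = self.clock()
        hit = self._store.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]  # fresh entry: skip the slow call
        value = loader()  # miss or expired: go back to the source of truth
        self._store[key] = (now + self.ttl, value)
        return value


calls = {"count": 0}


def slow_query() -> list:
    """Hypothetical expensive query; counts how often it really runs."""
    calls["count"] += 1
    return ["row-1", "row-2"]


cache = TTLCache(ttl_seconds=30)
first = cache.get_or_load("orders:recent", slow_query)
second = cache.get_or_load("orders:recent", slow_query)  # served from the cache
```

A real deployment would also need an invalidation strategy for writes; the time-to-live here only bounds how stale a cached response can get.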
Problem: 15% of test runs fail, passing on retry
PEOPLE
├─ Test-writing skills vary across team
├─ New developers copy existing flaky patterns
└─ No one assigned to fix flaky tests
PROCESS
├─ Flaky tests marked as "known issue" and ignored
├─ No policy against merging with flaky tests
└─ Test failures don't block deployments
TECHNOLOGY
├─ Race conditions in async test setup
├─ Tests share global state
├─ Test database not isolated per test
├─ setTimeout used instead of proper waiting
└─ CI environment inconsistent (different CPU/memory)
ENVIRONMENT
├─ CI runner under heavy load
├─ Network timing varies (external API mocks flaky)
└─ Timezone differences between local and CI
METHODS
├─ Integration tests not properly isolated
├─ No retry logic for legitimate timing issues
└─ Tests depend on execution order
MATERIALS
├─ Test data fixtures overlap
├─ Shared test database polluted
└─ Mock data doesn't match production patterns
ROOT CAUSES:
- No test isolation strategy (Methods + Technology)
- Process accepts flaky tests (Process)
- Async timing not handled properly (Technology)
SOLUTIONS:
1. Implement per-test database isolation (high impact)
2. Replace setTimeout with proper async/await patterns (medium impact)
3. Add pre-commit hook blocking flaky test patterns (prevents new issues)
4. Enforce policy: flaky test = block merge (process change)
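The second solution, replacing fixed delays with proper waiting, can be sketched as follows. The example uses Python's `asyncio` as an analogue of the JavaScript `setTimeout` pattern named above; `background_job` and `check_job_saves_record` are hypothetical names used only for illustration.

```python
import asyncio


async def background_job(done: asyncio.Event, result: list) -> None:
    """Hypothetical async work whose duration the test should not guess."""
    await asyncio.sleep(0.01)
    result.append("saved")
    done.set()  # explicit completion signal for anyone waiting


async def check_job_saves_record() -> list:
    done = asyncio.Event()
    result: list = []
    task = asyncio.create_task(background_job(done, result))

    # Flaky version: `await asyncio.sleep(0.005)` and hope the job has finished.
    # Deterministic version: wait for the completion signal, with an upper bound
    # so a genuine hang fails the test instead of blocking the whole suite.
    await asyncio.wait_for(done.wait(), timeout=1.0)

    await task  # surface any exception raised inside the job
    assert result == ["saved"]
    return result


outcome = asyncio.run(check_job_saves_record())
```

The timeout is a ceiling, not an expected duration: the test proceeds as soon as the event is set, so it stays fast on an idle machine and still passes on a heavily loaded CI runner.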
Problem: Simple CRUD feature took 12 weeks against a 3-week estimate
PEOPLE
├─ Developer unfamiliar with codebase
├─ Key architect on vacation during critical phase
└─ Designer changed requirements mid-development
PROCESS
├─ Requirements not finalized before starting
├─ No code review for first 6 weeks (large diff)
├─ Multiple rounds of design revision
└─ QA started late (found issues in week 10)
TECHNOLOGY
├─ Codebase has high coupling (change ripple effects)
├─ No automated tests (manual testing slow)
├─ Legacy code required refactoring first
└─ Development environment setup took 2 weeks
ENVIRONMENT
├─ Staging environment broken for 3 weeks
├─ Production data needed for testing (compliance delay)
└─ Dependencies blocked by another team
METHODS
├─ No incremental delivery (big bang approach)
├─ Over-engineering (added future features "while we're at it")
└─ No design doc (discovered issues during implementation)
MATERIALS
├─ Third-party API changed during development
├─ Production data model differs from staging
└─ Missing design assets (waited for designer)
ROOT CAUSES:
- No requirements lock-down before start (Process)
- Architecture prevents incremental changes (Technology)
- Big bang approach vs. iterative (Methods)
- Development environment not automated (Technology)
SOLUTIONS:
1. Require design doc + finalized requirements before starting (Process)
2. Implement feature flags for incremental delivery (Methods)
3. Automate dev environment setup (Technology)
4. Refactor high-coupling areas (Technology, long-term)
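As an illustration of the feature-flag solution, the sketch below shows one common way to ship unfinished work dark and then roll it out gradually. Everything in it is an assumption made for the example: the `FeatureFlags` class, the `new-crud-form` flag name, and the percentage-bucket scheme are not taken from the skill.

```python
import hashlib


class FeatureFlags:
    """Hypothetical in-process flag store; real systems usually load this from config."""

    def __init__(self) -> None:
        self._rollout: dict[str, int] = {}  # flag name -> percent of users enabled

    def set_rollout(self, flag: str, percent: int) -> None:
        if not 0 <= percent <= 100:
            raise ValueError("percent must be between 0 and 100")
        self._rollout[flag] = percent

    def is_enabled(self, flag: str, user_id: str) -> bool:
        percent = self._rollout.get(flag, 0)  # unknown flags are off
        # Hash flag + user so each user lands in a stable bucket (0..99) per flag.
        digest = hashlib.sha256(f"{flag}:{user_id}".encode()).digest()
        bucket = int.from_bytes(digest[:4], "big") % 100
        return bucket < percent


flags = FeatureFlags()
flags.set_rollout("new-crud-form", 0)  # code is merged and deployed, but dark


def render_form(user_id: str) -> str:
    """Both code paths live side by side until the rollout reaches 100%."""
    if flags.is_enabled("new-crud-form", user_id):
        return "new form"
    return "legacy form"
```

Raising the percentage in small steps lets each slice of the feature reach production and QA early, instead of waiting for a single large release.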
Use with the /why command for deeper analysis of specific causes
Weekly Installs: 208
Repository: neolabhq/context-engineering-kit
GitHub Stars: 699
First Seen: Feb 19, 2026
Installed on:
opencode: 203
github-copilot: 201
codex: 201
gemini-cli: 200
kimi-cli: 198
cursor: 198