Postmortem Framework: A Guide to 5 Whys Analysis and Root Cause Identification

postmortem by alirezarezvani/claude-skills

Install command

npx skills add https://github.com/alirezarezvani/claude-skills --skill postmortem

/em:postmortem — Honest Analysis of What Went Wrong

Command: /em:postmortem <event>

Not blame. Understanding. The failed deal, the missed quarter, the feature that flopped, the hire that didn't work out. What actually happened, why, and what changes as a result.

Why Most Post-Mortems Fail

They become one of two things:

The blame session — someone gets scapegoated, defensive walls go up, actual causes don't get examined, and the same problem happens again in a different form.

The whitewash — "We learned a lot, we're going to do better, here are 12 vague action items." Nothing changes. Same problem, different quarter.

A real post-mortem is neither. It's a rigorous investigation into a system failure. Not "whose fault was it" but "what conditions made this outcome predictable in hindsight?"

The purpose: extract the maximum learning value from a failure so you can prevent recurrence and improve the system.

The Framework

Step 1: Define the Event Precisely

Before analysis: describe exactly what happened.

What was the expected outcome?
What was the actual outcome?
When was the gap first visible?
What was the impact (financial, operational, reputational)?

Precision matters. "We missed Q3 revenue" is not precise enough. "We closed $420K in new ARR vs $680K target — a $260K miss driven primarily by three deals that slipped to Q4 and one deal that was lost to a competitor" is precise.
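The arithmetic behind a precise event statement can be sketched in a few lines of Python. This is an illustrative sketch only: the `EventGap` class and its field names are hypothetical and not part of the skill; the figures are the ones from the example above.

```python
from dataclasses import dataclass


@dataclass
class EventGap:
    """Expected vs. actual outcome for one event (hypothetical structure)."""
    metric: str
    expected: float
    actual: float

    @property
    def shortfall(self) -> float:
        # Absolute gap between target and result.
        return self.expected - self.actual

    @property
    def shortfall_pct(self) -> float:
        # Gap expressed as a percentage of the target.
        return 100 * self.shortfall / self.expected

    def statement(self) -> str:
        return (
            f"{self.metric}: ${self.actual:,.0f} actual vs "
            f"${self.expected:,.0f} target, a ${self.shortfall:,.0f} miss "
            f"({self.shortfall_pct:.0f}% of target)"
        )


gap = EventGap("New ARR", expected=680_000, actual=420_000)
print(gap.statement())
# New ARR: $420,000 actual vs $680,000 target, a $260,000 miss (38% of target)
```

A statement like this still needs the "driven primarily by" clause, which comes from the deal-level record rather than from arithmetic.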

Step 2: The 5 Whys — Done Properly

The goal: get from what happened (the symptom) to why it happened (the root cause).

Standard bad 5 Whys:

Why did we miss revenue? Because deals slipped.
Why did deals slip? Because the sales cycle was longer than expected.
Why? Because the customer buying process is complex.
Why? Because we're selling to enterprise.
Why? That's just how enterprise sales works.

→ Conclusion: Nothing to do. It's just enterprise.

Real 5 Whys:

Why did we miss revenue? Three deals slipped out of quarter.
Why did those deals slip? None of them had identified a champion with budget authority.
Why did we progress deals without a champion? Our qualification criteria didn't require it.
Why didn't our qualification criteria require it? When we built the criteria 8 months ago, we were in SMB, not enterprise.
Why haven't we updated qualification criteria as our ICP (ideal customer profile) shifted? No owner, no process for criteria review.

→ Root cause: Qualification criteria outdated, no owner, no review process.
→ Fix: Update criteria, assign owner, add quarterly review.

The test for a good root cause: Could you prevent recurrence with a specific, concrete change? If yes, you've found something real.
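One way to make that test explicit is to record the chain of whys together with the fix it leads to. The sketch below is a minimal illustration under stated assumptions: `WhyStep`, `FiveWhys`, and `has_actionable_root_cause` are hypothetical names, and the check only confirms that a fix was written down. Judging whether the fix is specific and concrete still takes a human reviewer.

```python
from dataclasses import dataclass, field


@dataclass
class WhyStep:
    question: str
    answer: str


@dataclass
class FiveWhys:
    """A chain of why-questions plus the fix it leads to (hypothetical structure)."""
    steps: list[WhyStep] = field(default_factory=list)
    proposed_fix: str = ""  # left empty when the chain ends in "nothing to do"

    def root_cause(self) -> str:
        # The answer to the last "why" is the candidate root cause.
        return self.steps[-1].answer

    def has_actionable_root_cause(self) -> bool:
        # Only checks that a fix exists; it does not judge its quality.
        return bool(self.steps) and bool(self.proposed_fix.strip())


bad = FiveWhys(steps=[
    WhyStep("Why did we miss revenue?", "Deals slipped."),
    WhyStep("Why did deals slip?", "The sales cycle was longer than expected."),
    WhyStep("Why?", "The customer buying process is complex."),
    WhyStep("Why?", "We're selling to enterprise."),
    WhyStep("Why?", "That's just how enterprise sales works."),
])

real = FiveWhys(
    steps=[
        WhyStep("Why did we miss revenue?", "Three deals slipped out of quarter."),
        WhyStep("Why did those deals slip?", "No champion with budget authority."),
        WhyStep("Why progress deals without a champion?", "Criteria didn't require it."),
        WhyStep("Why didn't the criteria require it?", "They were built for SMB."),
        WhyStep("Why weren't the criteria updated?", "No owner, no process for criteria review."),
    ],
    proposed_fix="Update criteria, assign owner, add quarterly review.",
)

print(bad.has_actionable_root_cause())   # False
print(real.has_actionable_root_cause())  # True
```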

Step 3: Distinguish Contributing Factors from Root Cause

Most events have multiple contributing factors. Not all are root causes.

Contributing factor: Made it worse, but isn't the core reason. If removed, the outcome might have been different — but the same class of problem would recur.

Root cause: The fundamental condition that made the outcome probable. Fix this, and this class of problem doesn't recur.

Example — failed hire:

Contributing factors: rushed process, reference checks skipped, team under pressure to staff up
Root cause: No defined competency framework, so the interview process varied depending on who happened to conduct the interviews

The distinction matters. If you address only contributing factors, you'll have a different-looking but structurally identical failure next time.

Step 4: Identify the Warning Signs That Were Ignored

Every failure has precursors. In hindsight, they're obvious. The value of this step is making them obvious prospectively.

Ask:

At what point was the negative outcome predictable?
What signals were visible at that point?
Who saw them? What happened when they raised them?
Why weren't they acted on?

Common patterns:

Signal was raised but dismissed by a senior person
Signal wasn't raised because nobody felt safe saying it
Signal was seen but no one had clear ownership to act on it
Data was available but nobody was looking at it
The team was too optimistic to take negative signals seriously

This step is particularly important for systemic issues — "we didn't feel safe raising the concern" is a much deeper root cause than "the deal qualification was off."

Step 5: Distinguish What Was in Control vs. Out of Control

Some failures happen despite correct decisions. Some happen because of incorrect decisions. Knowing the difference prevents both overcorrection and undercorrection.

In control: Process, criteria, team capability, resource allocation, decisions made
Out of control: Market conditions, customer decisions, competitor actions, macro events

For things out of control: what can be done to be more resilient to similar events? For things in control: what specifically needs to change?

Warning: "It was outside our control" is sometimes used to avoid accountability. Be rigorous.

Step 6: Build the Change Register

Every post-mortem ends with a change register — specific commitments, owned and dated.

Bad action items:

"We'll improve our qualification process"
"Communication will be better"
"We'll be more rigorous about forecasting"

Good action items:

"Ravi owns rewriting qualification criteria by March 15 to include champion identification as a hard requirement. New criteria reviewed in weekly sales standup starting March 22."
"By March 10, Elena adds a deal-slippage risk flag to the CRM for any opportunity that has been open more than 60 days without a product demo."
"Maria runs a 30-minute retrospective with the enterprise sales team every 6 weeks starting April 1 to review win/loss data."

For each action:

What exactly is changing?
Who owns it?
By when?
How will you verify it worked?
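The four questions above can double as a completeness check on each action item. The following is a rough sketch, not part of the skill: `ActionItem` and `missing` are hypothetical names, and the year in the due date is an assumed placeholder, since the examples give only month and day.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class ActionItem:
    """One row of a change register (hypothetical structure)."""
    change: str
    owner: Optional[str] = None
    due: Optional[date] = None
    verification: Optional[str] = None

    def missing(self) -> list[str]:
        # List which of the required answers have not been filled in.
        gaps = []
        if not self.owner:
            gaps.append("owner")
        if self.due is None:
            gaps.append("due date")
        if not self.verification:
            gaps.append("verification")
        return gaps


vague = ActionItem("We'll improve our qualification process")

specific = ActionItem(
    change="Rewrite qualification criteria with champion identification as a hard requirement",
    owner="Ravi",
    due=date(2025, 3, 15),  # year is an assumed placeholder
    verification="New criteria reviewed in weekly sales standup starting March 22",
)

print(vague.missing())     # ['owner', 'due date', 'verification']
print(specific.missing())  # []
```

An item that passes this check can still be vague in its wording, so the check is a floor and not a substitute for review.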

Step 7: Verification Date

The most commonly skipped step. Post-mortems are useless if nobody checks whether the changes actually happened and actually worked.

Set a verification date: "We'll review whether qualification criteria have been updated and whether deal slippage rate has improved at the June board meeting."

Without this, post-mortems are theater.
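A check-in on the verification date can be as simple as listing every action that is past due and not yet both completed and confirmed effective. The sketch below assumes a hypothetical `RegisterEntry` record; the completion states and the year are invented purely for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class RegisterEntry:
    """A change-register row with follow-up status (hypothetical structure)."""
    action: str
    owner: str
    due: date
    done: bool = False
    verified_effective: bool = False


def check_in(entries: list[RegisterEntry], today: date) -> list[RegisterEntry]:
    """Return entries that are due but not yet both done and verified."""
    return [
        e for e in entries
        if e.due <= today and not (e.done and e.verified_effective)
    ]


# Illustrative statuses only; the year is an assumed placeholder.
register = [
    RegisterEntry("Rewrite qualification criteria", "Ravi", date(2025, 3, 15),
                  done=True, verified_effective=True),
    RegisterEntry("Add deal-slippage risk flag to CRM", "Elena", date(2025, 3, 10),
                  done=True),
    RegisterEntry("Start 6-weekly win/loss retrospective", "Maria", date(2025, 4, 1)),
]

open_items = check_in(register, today=date(2025, 6, 1))
print([e.owner for e in open_items])  # ['Elena', 'Maria']
```

Separating `done` from `verified_effective` mirrors the two questions in this step: whether the change happened, and whether it worked.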

Post-Mortem Output Format

EVENT: [Name and date]
EXPECTED: [What was supposed to happen]
ACTUAL: [What happened]
IMPACT: [Quantified]

TIMELINE
[Date]: [What happened or was visible]
[Date]: ...

5 WHYS
1. [Why did X happen?] → Because [Y]
2. [Why did Y happen?] → Because [Z]
3. [Why did Z happen?] → Because [A]
4. [Why did A happen?] → Because [B]
5. [Why did B happen?] → Because [ROOT CAUSE]

ROOT CAUSE: [One clear sentence]

CONTRIBUTING FACTORS
• [Factor] — how it contributed
• [Factor] — how it contributed

WARNING SIGNS MISSED
• [Signal visible at what date] — why it wasn't acted on

WHAT WAS IN CONTROL: [List]
WHAT WASN'T: [List]

CHANGE REGISTER
| Action | Owner | Due Date | Verification |
|--------|-------|----------|-------------|
| [Specific change] | [Name] | [Date] | [How to verify] |

VERIFICATION DATE: [Date of check-in]
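If the change register is kept as structured data, the table in the format above can be generated rather than typed by hand. This is a minimal sketch with a hypothetical `render_change_register` helper. It does not escape pipe characters inside cells, and the verification text in the sample row is illustrative.

```python
def render_change_register(rows: list[tuple[str, str, str, str]]) -> str:
    """Render (action, owner, due date, verification) rows as the register table."""
    lines = [
        "| Action | Owner | Due Date | Verification |",
        "|--------|-------|----------|-------------|",
    ]
    for action, owner, due, verification in rows:
        lines.append(f"| {action} | {owner} | {due} | {verification} |")
    return "\n".join(lines)


table = render_change_register([
    ("Add deal-slippage risk flag to CRM", "Elena", "March 10",
     "Flag appears on qualifying open opportunities"),
])
print(table)
```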

The Tone of Good Post-Mortems

Blame is cheap. Understanding is hard.

The goal isn't to establish that someone made a mistake. The goal is to understand why the system produced that outcome — so the system can be improved.

"The salesperson didn't qualify the deal properly" is blame. "Our qualification framework hadn't been updated when we moved upmarket, and no one owned keeping it current" is understanding.

The first version fires or shames someone. The second version builds a more resilient organization.

Both might be true simultaneously. The distinction is: which one actually prevents recurrence?
