dialectic by kyleamathews/hegelian-dialectic-skill
npx skills add https://github.com/kyleamathews/hegelian-dialectic-skill --skill dialectic
An artificial belief system for building deeper understanding through productive contradiction.
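Two subagent sessions, the Electric Monks, _believe_ fully committed positions so you don't have to. A third agent (the orchestrator) performs a structural analysis of their contradiction and generates a synthesis (Aufhebung) that transforms the question itself. The user orchestrates from a belief-free position, freed from the cognitive load of holding either position.
Why this works: The bottleneck in human reasoning isn't intelligence; it's _belief_. Once you believe a position, you can't simultaneously hold its negation at full strength. You hedge, you steelman weakly, you unconsciously bias the comparison. The Electric Monks carry the belief load at full conviction, which frees you to operate in the space above belief: analyzing the structure of the contradiction rather than being inside either side. In Boyd's terms, outsourcing belief work leads to faster transients. Each dialectical cycle is a reorientation that would take weeks of natural thinking, compressed here into minutes because you carry zero belief inertia.
Use when:
Do NOT use when:
Three frameworks drive every phase of this skill. Internalize them before proceeding: they determine how you execute, not just why.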
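Rao: This is an Artificial Belief System, not AI. The monks aren't thinking for the user; they're _believing_ for the user. The bottleneck in human reasoning is belief inertia: once you hold a position, you can't simultaneously entertain its negation at full strength. The monks eliminate this cost by carrying the belief load at full conviction, freeing the user to operate as a pure context-switching specialist who analyzes structure instead of defending positions. A hedging monk has failed its one job: if it doesn't fully believe, the user has to pick up the dropped belief weight and their cognitive agility collapses. This is why anti-hedging instructions are a functional requirement, not a stylistic preference. (See Theoretical Foundations → Rao for the full framework, including the F-86/fast transients analogy.)
Hegel: How contradictions resolve. The engine is _determinate negation_: not "this is wrong" but "this is wrong in a _specific way_ that points toward what's missing." The specific failure mode of each position is a signpost. Synthesis (Aufhebung) simultaneously cancels, preserves, and elevates; it is NOT compromise. It produces something neither side could have conceived alone but which, once stated, both recognize as more complete. It is irreversible, a genuine cognitive gain. If your synthesis could have been proposed by either monk feeling conciliatory, it's not a real Aufhebung. (See Theoretical Foundations → Hegel.)
Boyd: How creativity works, and why going outside is mandatory. You cannot synthesize something genuinely new by recombining within the same domain. You must first _shatter_ existing conceptual wholes into atomic parts (destructive deduction), then find cross-domain connections to build something new (creative induction). Boyd argues this isn't optional: Gödel shows you can't verify a system from inside it, Heisenberg shows that inward refinement creates observer-observed feedback loops, and the Second Law shows that any closed system's entropy necessarily increases. Together: "any inward-oriented and continued effort to improve the match-up of concept with observed reality will only increase the degree of mismatch." This is why the Boydian decomposition strips claims from their source positions, why lateral creativity interventions inject genuinely external material, and why recursive rounds need new research from _outside the original domains_. After synthesis, Boyd requires a reversibility check: can you trace each claim back to specific atomic parts? If not, the ideas don't hold together without contradiction. (See Theoretical Foundations → Boyd.)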
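Anti-sycophancy: The orchestrator's stance toward the user. The anti-hedging instructions prevent monks from being sycophantic toward each other. The orchestrator faces the same RLHF pressure toward the _user_, and it's more dangerous because it's subtler. Specific failure modes to watch for:
You are the orchestrator. You conduct the elenctic interview, identify the user's belief burden, generate the monk prompts, spawn the Electric Monks, perform the structural analysis, and produce the synthesis. You use subagent sessions (via claude -p or your environment's equivalent) for the monks so each gets a fresh, fully committed belief context.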
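You (Orchestrator)
├── Phase 1: Elenctic Interview + Research (you, with the user)
│ ├── 1a: Explain the process (set expectations, emphasize the user as co-pilot)
│ ├── 1c′: Identify the user's belief burden and calibrate monk roles
│ ├── 1d: Ground the monks (research or deep interview, domain-dependent)
│ ├── 1e: Write context briefing document to file
│ └── 1f: Confirm framing with the user; ask about gaps in coverage
├── Phase 2: Generate Electric Monk prompts (you), referencing the briefing file
├── Phase 3: Spawn the Electric Monks (subagents, read briefing, _believe_ fully)
│ ├── Decorrelation check: did the monks genuinely diverge in framework, not just conclusion?
│ └── User checkpoint: "Is there evidence or a comparison class both monks missed?"
├── Phase 4: Determinate Negation (you; structural analysis, saved to file)
│ ├── 4.0: Internal tensions (where does each monk's own logic undermine itself?)
│ ├── 4.5: Lateral creativity (Round 2+ only): compressed conflicts, random domain, metaphors
│ ├── 4.6: Boydian decomposition (destructive deduction): shatter into a "sea of anarchy," find cross-domain connections (creative induction)
│ └── Same-arrangement test + emergent structure test
├── Phase 5: Sublation / Aufhebung (you; synthesis, saved to file)
│ ├── Provocation + movement (Round 2+ only): disrupt premature pattern-matching
│ ├── Reversibility check: trace each claim back to decomposition parts
│ └── Abduction test: does the synthesis make the original contradiction _predictable_?
├── Phase 6: Validation (Monks A and B evaluate: were they elevated or defeated?)
│ ├── Adversarial check: would the hardest-hit monk actually accept this?
│ ├── Hostile auditor: fresh agent, strongest model, sole job is to find flaws
│ ├── Sustained juxtaposition: sometimes refusing to synthesize is the right move
│ └── Refine: present improvements individually to the user, incorporate accepted ones
└── Phase 7: Recursion (propose 2-4 directions, user chooses; default: at least once)
    ├── Queue unexplored contradictions as the user's orientation library
    └── Repeat from Phase 2 (or Phase 1 if new research is needed) on the chosen direction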
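The user can intervene at any point: correcting a monk's framing, redirecting research, rejecting a compromise-shaped synthesis. The user never has to _believe_ anything; that's the monks' job.
CRITICAL: Before executing each phase, you MUST read its reference doc in full. The summaries below are orientation only; they do not contain the detailed instructions, prompts, templates, or failure modes you need to execute correctly. Context drift (forgetting nuance in later rounds) is the most common failure mode of this skill. Reading the reference doc fresh each time is the fix.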
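Read reference/phase1-elenctic-interview.md before executing.
The most important phase. Explain the process to the user. Interview them using Socratic technique to surface hidden assumptions and the deepest version of the contradiction. Identify their belief burden (see catalog below). Ground the monks via research (external domains) or deep interview (personal domains). Write a context briefing document. Confirm framing with the user and ask about gaps.
Read reference/phase2-monk-prompts.md before executing.
Generate two prompts calibrated to the user's belief burden. Each monk must _believe_ at full conviction; this is the functional core of the ABS. The reference doc contains the required prompt structure (role, framing corrections, context briefing, research directives, argument structure, anti-hedging, length).
Read reference/phase3-spawn-monks.md before executing.
Spawn both monks as separate subagent sessions. Check for hedging, degenerate framing, and decorrelation. Present outputs to the user with guidance on how to read them. Ask if any claims should be tested against evidence neither monk considered.
Read reference/phase4-determinate-negation.md before executing.
You perform this yourself (not a subagent). Analyze internal tensions in each essay, then the surface contradiction, shared assumptions, determinate negation, hidden question, Boydian decomposition, and sublation criteria. Write your initial synthesis guess first, then compare at the end to check for pattern-matching. In Round 2+, this includes lateral creativity interventions: compressed conflict generation (oxymorons), random domain injection via Wikipedia's random article API, and a non-propositional pause (three metaphors). HARD STOP at the end of Phase 4: present the full analysis to the user and get their response before proceeding to synthesis. This is the highest-leverage correction point in the entire process.
Read reference/phase5-sublation.md before executing.
Generate the synthesis: cancel both positions as complete truths, preserve the genuine insight in each, elevate to a new concept that transforms the question. Reversibility check (Boyd): trace each claim back to specific atomic parts from the decomposition; untraceable claims need scrutiny. Same-arrangement test: verify the synthesis isn't one monk's structure wearing the other's vocabulary. Apply the abduction test. Check for compromise failure modes, including analytical capture (adopting one monk's epistemology to reframe the other) and level reduction (dissolving a higher-category claim into lower-category terms). Present to the user before validation. In Round 2+, this begins with a De Bono provocation + movement extraction to disrupt premature pattern-matching.
Read reference/phase6-validation.md before executing.
Send a condensed summary to both monks for validation (elevated vs. defeated). Run the adversarial check, including the proponent test (would the hardest-hit monk say "you've done exactly the thing I warned against"?). Deploy the hostile auditor (every round); the auditor now includes a reversibility check (can each synthesis claim trace to material in the essays?) and the same-arrangement test. Sustained juxtaposition is a legitimate alternative when the contradiction is more productive held open than resolved. On failure, do a partial salvage: identify which parts cohere, add new material, iterate (Boyd's approach is surgical, not demolition). Present improvements to the user one at a time, not as a list. Revise the synthesis before proceeding to recursion.
Read reference/phase7-recursion.md before executing.
Recursion is the engine of the skill; the first round is calibration. You MUST generate an idea burst (5-8 candidates) before clustering into directions. Do not skip this step: it is what prevents predictable or obvious recursion directions. Then cluster the candidates into 2-4 directions as a menu. Fresh agents are usually better than resumed sessions. New research is often essential, as each synthesis opens new conceptual domains. Default: recurse at least once. Track the dialectic queue in a file.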
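During the elenctic interview (Phase 1c′), pay attention to what the user is stuck believing. The dialectic's power comes from freeing the user from specific belief loads, but _which_ beliefs need outsourcing depends on the person. Different cognitive styles produce different belief burdens, and the Electric Monks need to be calibrated accordingly.
You don't need to type the user explicitly; just notice the pattern and calibrate. Here's a catalog of common belief burdens and how they map to the monks' roles.
A note on the MBTI labels: These patterns map loosely to MBTI cognitive function stacks (Ni-Te, Ne-Ti, etc.) because the model has rich training data about those patterns: thousands of forum posts, blog articles, and discussions about how each type thinks, gets stuck, and makes decisions. The labels function as retrieval keys into that training data, not as diagnostic categories. Don't treat them as psychometric claims. Don't announce them to the user. Use them as reasoning aids to help you pattern-match what you're seeing in the interview and calibrate the monks accordingly.
The Convergent Visionary (Ni-Te pattern, common in founders, architects, CTOs)
The Empathic Integrator (Ni-Fe pattern, common in counselors, teachers, community leaders)
The Exploratory Debater (Ne-Ti pattern, common in consultants, researchers, writers)
The Practical Executor (Te-Si pattern, common in operators, managers, engineers)
The Possibility Explorer (Ne-Fi pattern, common in creatives, entrepreneurs, activists)
The Steady Guardian (Si-Fe pattern, common in administrators, caretakers, institutional maintainers)
How to use this catalog: Don't announce your typing. Don't say "I notice you're a convergent visionary." Just use the pattern to calibrate:
This calibration shapes the framing corrections in Phase 2 and the specific argument structures you assign to each monk.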
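Use the strongest available model with maximum thinking budget for everything. This skill operates at the edge of what models can do: perspective-taking, structural analysis, abductive reasoning, cross-domain connection. In testing, using Opus-class models for monk essays produced dramatically more insightful arguments than Sonnet-class models. The monks aren't just "arguing well"; they're inhabiting positions, finding non-obvious evidence, and pushing to genuinely uncomfortable conclusions. This requires maximum capability.
| Phase | Recommended Model | Why |
|---|---|---|
| All phases | Opus/strongest available + extended thinking | Every phase benefits from maximum reasoning. The quality difference is substantial, not marginal. |
Heterogeneous models increase creativity. When possible, use different model families for Monk A and Monk B. Different training data produces different "intuitions": different blind spots, different reasoning patterns, different default framings. This is structural decorrelation at the training-data level, which is the single most promising direction in the multi-agent debate literature (Du et al., ICLR 2025). The orchestrator should remain your strongest available model (it needs maximum synthesis capability), but monks benefit from heterogeneity.
Before starting, check what's available. If you're running in an environment with access to multiple coding agents or model providers, ask the user:
I can increase the creativity of the dialectic by using different AI models for each monk. Different training data means genuinely different blind spots and reasoning patterns. Do you have access to any of these I could use for one of the monks?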
- Gemini (via gemini CLI or API)
- GPT-4 / ChatGPT (via codex CLI or API)
- Other model providers
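If not, I'll use the same model family for both monks. The skill works fine either way; the decorrelation just comes from the different prompts and belief commitments rather than from different training data.
If heterogeneous models aren't available, don't worry: the skill is designed to work with homogeneous models. The framing corrections, belief burden calibration, and targeted research directives already produce substantial decorrelation. Heterogeneous models are a bonus, not a requirement.
Based on three test runs across different domains (normative/institutional, business strategy, political economy of OSS):
External-research domains: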
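| Phase | Typical Range | Notes |
|---|---|---|
| Phase 1 research (2-3 parallel agents) | 150-250K tokens | Do NOT cut here. This is the highest-value spend. Broader domains trend higher. |
| Phase 1 supplementary research (user-triggered) | 0-50K tokens | Common, since users frequently identify gaps. Budget for it. |
| Phase 1d briefing synthesis | ~5K tokens | Orchestrator work |
| Phase 3 monk essays (with briefing) | 25-45K tokens | Two monks, 2-3 targeted searches each |
| Phase 4-5 analysis + synthesis | 15-30K tokens | Orchestrator inline work |
| Phase 6 monk validation | 12-25K tokens | Two monks, strongest model |
| Phase 6 hostile auditor | 5-15K tokens | One agent, strongest model. Reads essays + synthesis only. |
| Phase 7 recursive round | 25-50K tokens | Often most valuable |
| Orchestrator overhead | 20-30K tokens | Interviews, transitions, presentation |
| Total (one round + recursion) | ~300-400K tokens | Median ~300K (excluding supplementary research) |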
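Personal/values domains are significantly cheaper on research but more expensive on interview:
| Phase | Typical Range | Notes |
|---|---|---|
| Phase 1 extended interview | 15-30K tokens | 6-10 exchanges, deeper probing |
| Phase 1 framework research (optional) | 0-50K tokens | Frameworks, not facts. May be skipped. |
| Phase 1d context briefing | ~5K tokens | User-sourced material synthesized |
| Phase 3 monk essays | 15-30K tokens | Monks may need zero additional searches |
| Remaining phases | Similar to above | |
| Total (one round + recursion) | ~100-200K tokens | Much cheaper, since the user's testimony is the primary input |
Key insight: For external domains, Phase 1 research is the highest-value spend. For personal domains, Phase 1 _interview depth_ is the highest-value spend, because the monks can only believe as specifically as the briefing allows.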
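This skill is written around claude -p (pipe mode) for spawning subagents. If you're running in Claude Code using the Task tool, here are the key differences:
| Skill instruction | claude -p | Claude Code Task tool |
|---|---|---|
| Spawn subagent | `echo "[PROMPT]" \| claude -p > output.md` | |
| Parallel execution | Background shell jobs | run_in_background=true |
| Output to file | Shell redirect (> file.md) | Agent returns text; orchestrator writes files |
| Session resumption (Phase 6) | Resume same claude -p session | resume parameter with agentId, but the persona may not persist without reinforcement. Include a summary of the agent's original argument as fallback. |
| Model selection | --model flag | model parameter (inherits from parent by default) |
| Tool access | --allowedTools web_search,web_fetch | Inherited from parent or configured per task |
Key difference: With claude -p, agents write output directly to files via shell redirect. With the Task tool, agents return text to the orchestrator, who writes files. This adds a step but gives the orchestrator control over file naming and structure. Either approach works; just be aware that the file I/O pattern differs.
Session resumption for validation: The skill prefers resuming original agent sessions so validators retain their full conviction context. In Claude Code, this works via resume + agentId, but test runs found the persona sometimes needs reinforcement. The fallback, a fresh validation prompt that includes a summary of the agent's original argument, works well in practice.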
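The dialectic structure is universal, but the vocabulary of "truth" and the grounding mode vary by domain. Adapt accordingly:
| Domain Type | What "Truth" Means | Good Synthesis Looks Like | Grounding Mode | Aporia (productive perplexity) Valid? |
|---|---|---|---|---|
| Empirical (engineering, science) | What works, performs, is maintainable | Testable decision criteria, architectural patterns | External research | Rarely |
| Normative (ethics, politics, policy) | What's defensible, respects competing values | Tension map with navigation strategies | Mixed (research + user values) | Yes |
| Personal (life decisions, career) | What aligns with actual priorities | Values clarification: what you actually want | Deep interview (user is the source) | Yes |
| Creative (writing, design, art) | What's interesting, resonant, surprising | Unexpected recombinations, new possibilities | Mixed (research + user aesthetic) | Sometimes |
| Risk analysis | The actual risk structure behind competing assessments | A decision framework calibrated to real uncertainty | External research | No |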
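Read this section to understand WHY the process works the way it does. This informs your judgment when things go off-script. The frameworks are listed in order of operational importance: Rao explains _what the tool is_, Hegel explains _how contradictions resolve_, Boyd explains _how creativity works and why going outside is mandatory_, Socrates explains _how to surface the question_, Adams gives _the metaphor_, Aquinas gives _the aspiration_, and DeLong explains _when to use it_.
The foundational theory for this skill comes from Venkatesh Rao's "Electric Monks" framework (after Douglas Adams' Dirk Gently). The core distinction: this tool is not artificial intelligence; it is an artificial belief system (ABS). The agents aren't thinking for you. You're still doing the thinking (orchestrating, judging, choosing directions, recognizing genuine sublation vs. compromise). The agents are _believing_ for you.
Why belief is the bottleneck: The central transaction cost in human cognition is context-switching cost, what Boyd calls the "transient." The length of the transient depends on how much belief inertia you're carrying. Once you believe a position, switching to genuinely entertaining its negation is expensive. You hedge, you steelman weakly, you unconsciously bias. The Electric Monks eliminate this cost by carrying 100% of the belief load, freeing the user to operate as a pure context-switching specialist, what Rao calls "informationally tiny."
The F-86 analogy (from Boyd via Rao): In the Korean War, F-86 Sabres achieved a 10:1 kill ratio against MiG-15s despite roughly matched flight capabilities. Boyd discovered the difference was hydraulic controls: the F-86 pilot could reorient faster because the plane did more of the mechanical work. The pilot's freed-up attention went to _choosing better maneuvers_, not just executing them faster. The Electric Monks are hydraulic controls for intellectual work: by doing the belief-work, they free the user's attention for the higher-order task of structural analysis and creative synthesis.
Operational implications for this skill: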
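The engine of the dialectic is determinate negation: not "this is wrong" but "this is wrong in a specific way that points toward what's missing." The specific way a position fails contains a signpost toward the richer understanding that is needed.
Sublation (Aufhebung) simultaneously cancels, preserves, and elevates. It is NOT compromise (splitting the difference). It produces something neither side could have conceived independently but which, once articulated, both recognize as more complete. It is irreversible, a genuine cognitive gain. The Kant example: the rationalism/empiricism dispute was not resolved by "knowledge comes half from reason and half from experience" but by "experience provides the content, reason provides the structure." After Kant, you cannot go back.
Hegel never used "thesis-antithesis-synthesis"; that framing comes from Fichte. The actual movement is driven by the one-sidedness of each concept, which generates its own negation from within.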
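John Boyd's "dialectic engine": destructive deduction (shatter existing conceptual domains, break the correspondence between each concept and its parts, and scatter them into a "sea of anarchy"), followed by creative induction (find common qualities, attributes, or operations among the scattered parts to synthesize a genuinely new concept). The crucial step is the separation: without unstructuring, creation cannot proceed, because the parts remain trapped as meaning within unchallenged domains.
Boyd's key insight: you cannot synthesize something genuinely new by recombining within the same domain. If Monk A and Monk B are both arguing about web frameworks, a synthesis that only recombines claims from their two essays will produce a rearrangement, not a creation. Genuine novelty requires material from _outside the original conceptual domains_. The destructive step, which separates particulars from their previous wholes, creates _space_ for external material to enter and form new connections. Boyd is explicit that the result must not use the parts "in only those same arrangements"; that would merely rebuild what you already had.
Boyd's three pillars (why going outside is structurally necessary, not merely helpful):
Together: "any inward-oriented and continued effort to improve the match-up of concept with observed reality will only increase the degree of mismatch." The lateral creativity interventions (Phase 4.5) and the requirement that recursive rounds bring in new research are not optional extras; they are structural responses to a thermodynamic necessity.
Boyd's verification step, reversibility: After creative induction, Boyd requires checking internal consistency by tracing back to the original constituents. If you cannot trace backward, that is, if a synthesis claim cannot be traced to identifiable atomic parts from the decomposition, then the ideas don't hold together without contradiction. A partial failure doesn't mean you reject the whole structure: identify which parts cohere, add new material, and try again.
Boyd's cycle: structure → unstructure → restructure, repeated indefinitely at higher and broader levels of elaboration. The alternation of entropy increase (destruction) and entropy decrease (creation) forms a control mechanism that drives deeper understanding.
How Boyd shows up operationally: Phase 4.6 (Boydian decomposition, the destructive deduction), Phase 5 (sublation, the creative induction, including the reversibility check), Phase 6 (the auditor's reversibility check), and Phase 7 (recursion: each cycle is a full Boydian structure → unstructure → restructure, which is why recursive rounds usually need new research from outside the original domains).
Relationship to Hegel: Hegel provides the engine for analyzing _how_ positions fail (determinate negation) and the concept of what a good synthesis looks like (sublation). Boyd provides the engine for _what to do with the wreckage_: shatter, scatter, and recombine with external material. Boyd also provides the theoretical argument for _why_ going outside is mandatory (Gödel + Heisenberg + the Second Law). The two frameworks are complementary: Hegel drives the analysis of contradiction, Boyd drives the creative reconstruction.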
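The elenchus probes a position through questioning in order to expose contradictions and reach aporia (productive perplexity). It is cooperative rather than adversarial: "midwifery of thought." The interview phase of this skill is elenctic. Aporia is sometimes a valid outcome.
Key findings from Du et al. (2023, MIT) and follow-up work through ICLR 2025: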
An artificial belief system for building deeper understanding through productive contradiction.
Two subagent sessions — the Electric Monks — believe fully committed positions so you don't have to. A third (the orchestrator) performs structural analysis of their contradiction and generates a synthesis (Aufhebung) that transforms the question itself. The user orchestrates from a belief-free position, freed from the cognitive load of holding either position.
Why this works: The bottleneck in human reasoning isn't intelligence — it's belief. Once you believe a position, you can't simultaneously hold its negation at full strength. You hedge, you steelman weakly, you unconsciously bias the comparison. The Electric Monks carry the belief load at full conviction, which frees you to operate in the space above belief — analyzing the structure of the contradiction rather than being inside either side. In Boyd's terms: outsourcing belief work leads to faster transients. Each dialectical cycle is a reorientation that would take weeks of natural thinking, compressed into minutes because you carry zero belief inertia.
Use when:
Do NOT use when:
Three frameworks drive every phase of this skill. Internalize them before proceeding — they determine how you execute, not just why.
Rao: This is an Artificial Belief System, not AI. The monks aren't thinking for the user — they're believing for the user. The bottleneck in human reasoning is belief inertia: once you hold a position, you can't simultaneously entertain its negation at full strength. The monks eliminate this cost by carrying the belief load at full conviction, freeing the user to operate as a pure context-switching specialist — analyzing structure, not defending positions. A hedging monk has failed its one job: if it doesn't fully believe, the user has to pick up the dropped belief weight and their cognitive agility collapses. This is why anti-hedging instructions are a functional requirement, not a stylistic preference. (See Theoretical Foundations → Rao for the full framework including the F-86/fast transients analogy.)
Hegel: How contradictions resolve. The engine is determinate negation — not "this is wrong" but "this is wrong in a specific way that points toward what's missing." The specific failure mode of each position is a signpost. Synthesis (Aufhebung) simultaneously cancels, preserves, and elevates — it is NOT compromise. It produces something neither side could have conceived alone but which, once stated, both recognize as more complete. It is irreversible — genuine cognitive gain. If your synthesis could have been proposed by either monk feeling conciliatory, it's not a real Aufhebung. (See Theoretical Foundations → Hegel.)
Boyd: How creativity works — and why going outside is mandatory. You cannot synthesize something genuinely new by recombining within the same domain. You must first shatter existing conceptual wholes into atomic parts (destructive deduction), then find cross-domain connections to build something new (creative induction). Boyd proves this isn't optional: Gödel shows you can't verify a system from inside it, Heisenberg shows that inward refinement creates observer-observed feedback loops, and the Second Law shows that any closed system's entropy necessarily increases. Together: "any inward-oriented and continued effort to improve the match-up of concept with observed reality will only increase the degree of mismatch." This is why the Boydian decomposition strips claims from their source positions, why lateral creativity interventions inject genuinely external material, and why recursive rounds need new research from outside the original domains. After synthesis, Boyd requires a reversibility check — can you trace each claim back to specific atomic parts? If not, the ideas don't hold together without contradiction. (See Theoretical Foundations → Boyd.)
Anti-sycophancy: The orchestrator's stance toward the user. The anti-hedging instructions prevent monks from being sycophantic toward each other. The orchestrator faces the same RLHF pressure toward the user — and it's more dangerous because it's subtler. Specific failure modes to watch for:
Praising user input. "This is excellent material," "This is a powerful connection," "Great point." Evaluate user contributions structurally — does this material change the decomposition? Does it open a new domain? Does it challenge the current analysis? — not socially. The user doesn't need encouragement. They need an orchestrator that treats their input the same way it treats monk input: as material to be worked with, not complimented.
Position-tracking. "Is this the direction you want the synthesis to go?" The user is in the belief-free orchestrator seat. Do NOT try to locate their position and converge on it. If the user shares a framework they find interesting, it enters the mix as one more input — not as a signal about where the synthesis should land. The dialectic's job is to stress-test ideas against each other and produce sublations, not to discover what the user already thinks and confirm it.
Treating user-provided material as privileged. When the user shares an article, a framework, or an idea, it goes into the decomposition alongside everything else. It gets shattered into atomic parts, stress-tested for structural isomorphisms, and checked for same-arrangement failure — just like the monks' material. The user's contribution is not the answer. It's another input. "These are all just ideas" — treat them that way.
Sycophantic agreement when corrected. "Fair enough," "You're right," "Good point" are capitulation, not engagement. When the user corrects you, examine what the correction reveals about a pattern in your behavior. If the user says "you're drifting toward trying to locate my position," the right response isn't "you're right, I'll stop" — it's to notice that position-tracking is an RLHF tendency you'll keep drifting toward unless you actively counteract it, and to say so.
You are the orchestrator. You conduct the elenctic interview, identify the user's belief burden, generate the monk prompts, spawn the Electric Monks, perform the structural analysis, and produce the synthesis. You use subagent sessions (via claude -p or your environment's equivalent) for the monks so each gets a fresh, fully committed belief context.
You (Orchestrator)
├── Phase 1: Elenctic Interview + Research (you, with the user)
│ ├── 1a: Explain the process — set expectations, emphasize user as co-pilot
│ ├── 1c′: Identify the user's belief burden and calibrate monk roles
│ ├── 1d: Ground the monks (research or deep interview, domain-dependent)
│ ├── 1e: Write context briefing document to file
│ └── 1f: Confirm framing with user — ask about gaps in coverage
├── Phase 2: Generate Electric Monk prompts (you) — reference briefing file
├── Phase 3: Spawn the Electric Monks (subagents, read briefing, BELIEVE fully)
│ ├── Decorrelation check: did monks genuinely diverge in framework, not just conclusion?
│ └── User checkpoint: "Is there evidence or a comparison class both monks missed?"
├── Phase 4: Determinate Negation (you — structural analysis, saved to file)
│ ├── 4.0: Internal tensions — where does each monk's own logic undermine itself?
│ ├── 4.5: Lateral creativity (Round 2+ only) — compressed conflicts, random domain, metaphors
│ ├── 4.6: Boydian decomposition (destructive deduction) — shatter into "sea of anarchy," find cross-domain connections (creative induction)
│ └── Same-arrangement test + emergent structure test
├── Phase 5: Sublation / Aufhebung (you — synthesis, saved to file)
│ ├── Provocation + movement (Round 2+ only) — disrupt premature pattern-matching
│ ├── Reversibility check: trace each claim back to decomposition parts
│ └── Abduction test: does synthesis make the original contradiction *predictable*?
├── Phase 6: Validation (Monks A & B evaluate — were they elevated or defeated?)
│ ├── Adversarial check: would the hardest-hit monk actually accept this?
│ ├── Hostile Auditor: fresh agent, strongest model, sole job is to find flaws
│ ├── Sustained juxtaposition: sometimes refusing to synthesize is the right move
│ └── Refine: present improvements individually to user, incorporate accepted ones
└── Phase 7: Recursion — propose 2-4 directions, user chooses (default: at least once)
    ├── Queue unexplored contradictions as the user's orientation library
    └── Repeat from Phase 2 (or Phase 1 if new research needed) on chosen direction
The user can intervene at any point — correcting a monk's framing, redirecting research, rejecting a compromise-shaped synthesis. The user never has to believe anything — that's the monks' job.
CRITICAL: Before executing each phase, you MUST read its reference doc in full. The summaries below are orientation only — they do not contain the detailed instructions, prompts, templates, or failure modes you need to execute correctly. Context drift (forgetting nuance in later rounds) is the most common failure mode of this skill. Reading the reference doc fresh each time is the fix.
Read reference/phase1-elenctic-interview.md before executing.
The most important phase. Explain the process to the user. Interview them using Socratic technique to surface hidden assumptions and the deepest version of the contradiction. Identify their belief burden (see catalog below). Ground the monks via research (external domains) or deep interview (personal domains). Write a context briefing document. Confirm framing with the user — ask about gaps.
Read reference/phase2-monk-prompts.md before executing.
Generate two prompts calibrated to the user's belief burden. Each monk must BELIEVE at full conviction — this is the functional core of the ABS. The reference doc contains the required prompt structure (role, framing corrections, context briefing, research directives, argument structure, anti-hedging, length).
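The prompt structure listed above can be sketched as a small assembler. This is a minimal illustration in Python, not the skill's own implementation: the class name, the section headings, and the empty-section check are assumptions, and the authoritative wording lives in reference/phase2-monk-prompts.md.

```python
from dataclasses import dataclass

# Section order follows the prompt structure named in the text above.
# The heading format is illustrative, not taken from the reference doc.
SECTION_ORDER = (
    "role",
    "framing_corrections",
    "context_briefing",
    "research_directives",
    "argument_structure",
    "anti_hedging",
    "length",
)


@dataclass
class MonkPromptParts:
    """Hypothetical container for the seven prompt sections."""
    role: str
    framing_corrections: str
    context_briefing: str
    research_directives: str
    argument_structure: str
    anti_hedging: str
    length: str


def build_monk_prompt(parts: MonkPromptParts) -> str:
    """Join the sections in a fixed order, failing loudly on empty ones."""
    blocks = []
    for name in SECTION_ORDER:
        text = getattr(parts, name).strip()
        if not text:
            raise ValueError(f"monk prompt section '{name}' is empty")
        heading = name.replace("_", " ").upper()
        blocks.append(f"## {heading}\n{text}")
    return "\n\n".join(blocks)
```

Refusing to build a prompt with an empty section is one way to guard against the context drift the skill warns about, since a dropped anti-hedging section would otherwise go unnoticed in later rounds.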
Read reference/phase3-spawn-monks.md before executing.
Spawn both monks as separate subagent sessions. Check for hedging, degenerate framing, and decorrelation. Present outputs to the user with guidance on how to read them. Ask if any claims should be tested against evidence neither monk considered.
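A first-pass hedging check can be automated before you read the essays yourself. The sketch below is a rough heuristic under stated assumptions: the marker phrases are hypothetical examples, not a list from the skill, and a clean result does not prove that a monk committed fully.

```python
import re

# Hypothetical hedge markers; tune them for your own domain and language.
HEDGE_PATTERNS = (
    r"\bon the other hand\b",
    r"\bto be fair\b",
    r"\bboth sides\b",
    r"\bit depends\b",
    r"\bthere are valid points\b",
)


def find_hedges(essay: str) -> list[str]:
    """Return each sentence that contains at least one hedge marker."""
    sentences = re.split(r"(?<=[.!?])\s+", essay.strip())
    flagged = []
    for sentence in sentences:
        if any(re.search(p, sentence, re.IGNORECASE) for p in HEDGE_PATTERNS):
            flagged.append(sentence)
    return flagged
```

Flagged sentences are candidates for a re-run with stronger anti-hedging instructions; the judgment about degenerate framing and decorrelation still belongs to the orchestrator.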
Read reference/phase4-determinate-negation.md before executing.
You perform this yourself (not a subagent). Analyze internal tensions in each essay, then the surface contradiction, shared assumptions, determinate negation, hidden question, Boydian decomposition, and sublation criteria. Write your initial synthesis guess first, then compare at the end to check for pattern-matching. In Round 2+, this includes lateral creativity interventions: compressed conflict generation (oxymorons), random domain injection via Wikipedia's random article API, and a non-propositional pause (three metaphors). HARD STOP at the end of Phase 4: present the full analysis to the user and get their response before proceeding to synthesis. This is the highest-leverage correction point in the entire process.
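For the random domain injection, one possible approach is to ask Wikipedia for a random article summary. The sketch below assumes the Wikimedia REST endpoint `page/random/summary` and its `title` and `extract` fields; the function names and the User-Agent string are illustrative, and the reference doc may prescribe a different call.

```python
import json
from typing import Callable
from urllib.request import Request, urlopen

# Assumed endpoint: the Wikimedia REST API's random-summary route.
RANDOM_SUMMARY_URL = "https://en.wikipedia.org/api/rest_v1/page/random/summary"


def _http_get(url: str) -> str:
    # The User-Agent value is a placeholder; use one that identifies your project.
    request = Request(url, headers={"User-Agent": "dialectic-sketch/0.1"})
    with urlopen(request, timeout=10) as response:
        return response.read().decode("utf-8")


def random_domain(fetch: Callable[[str], str] = _http_get) -> dict:
    """Return the title and summary extract of one random article."""
    payload = json.loads(fetch(RANDOM_SUMMARY_URL))
    return {"title": payload["title"], "extract": payload.get("extract", "")}
```

The `fetch` parameter exists so the function can be exercised offline with a stub; in a live round you would call `random_domain()` with no arguments and feed the result into the Phase 4.5 intervention.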
Read reference/phase5-sublation.md before executing.
Generate the synthesis: cancel both positions as complete truths, preserve the genuine insight in each, elevate to a new concept that transforms the question. Reversibility check (Boyd): trace each claim back to specific atomic parts from the decomposition; untraceable claims need scrutiny. Same-arrangement test: verify the synthesis isn't one monk's structure wearing the other's vocabulary. Apply the abduction test. Check for compromise failure modes, including analytical capture (adopting one monk's epistemology to reframe the other) and level reduction (dissolving a higher-category claim into lower-category terms). Present to the user before validation. In Round 2+, this begins with a De Bono provocation + movement extraction to disrupt premature pattern-matching.
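The bookkeeping side of the reversibility check can be sketched as a lookup from synthesis claims to decomposition parts. The identifiers and data shapes below are hypothetical; the code only flags claims that cite nothing or cite unknown parts, and it does not judge whether a cited part actually supports the claim.

```python
def untraceable_claims(claims: dict[str, list[str]], parts: set[str]) -> list[str]:
    """Return claims that cite no parts, or cite parts missing from the decomposition.

    claims maps a claim label to the atomic-part IDs it is traced to;
    parts is the set of atomic-part IDs produced by the Phase 4.6 decomposition.
    """
    flagged = []
    for claim, cited in claims.items():
        if not cited or any(part_id not in parts for part_id in cited):
            flagged.append(claim)
    return flagged
```

For example, with parts `{"A1", "A2", "B1"}`, a claim traced to `["A2", "Z9"]` is flagged because `Z9` was never part of the decomposition, which marks it as a claim that needs scrutiny.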
Read reference/phase6-validation.md before executing.
Send condensed summary to both monks for validation (elevated vs. defeated). Run adversarial check — including the proponent test (would the hardest-hit monk say "you've done exactly the thing I warned against"?). Deploy the hostile auditor (every round) — auditor now includes a reversibility check (can each synthesis claim trace to material in the essays?) and same-arrangement test. Sustained juxtaposition is a legitimate alternative when the contradiction is more productive held open than resolved. On failure, partial salvage — identify which parts cohere, add new material, iterate (Boyd's approach is surgical, not demolition). Present improvements to user one at a time, not as a list. Revise synthesis before proceeding to recursion.
Read reference/phase7-recursion.md before executing.
Recursion is the engine of the skill — the first round is calibration. You MUST generate an idea burst (5-8 candidates) before clustering into directions — do not skip this step, it is what prevents predictable/obvious recursion directions. Then cluster into 2-4 directions as a menu. Fresh agents are usually better than resumed sessions. New research is often essential as each synthesis opens new conceptual domains. Default: recurse at least once. Track the dialectic queue in a file.
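Tracking the dialectic queue in a file could look like the sketch below. The JSON layout, field names, and function names are assumptions made for illustration; the skill only requires that the queue is kept in a file and that the burst and menu sizes stay within the stated ranges.

```python
import json
from pathlib import Path


def load_queue(path: Path) -> list[dict]:
    """Read the queue file, returning an empty queue if it does not exist."""
    if not path.exists():
        return []
    return json.loads(path.read_text(encoding="utf-8"))


def enqueue(path: Path, contradiction: str, round_number: int) -> list[dict]:
    """Append an unexplored contradiction and persist the whole queue."""
    queue = load_queue(path)
    queue.append(
        {"contradiction": contradiction, "round": round_number, "explored": False}
    )
    path.write_text(json.dumps(queue, indent=2, ensure_ascii=False), encoding="utf-8")
    return queue


def check_menu(candidates: list[str], directions: list[list[str]]) -> None:
    """Enforce the 5-8 candidate idea burst and the 2-4 direction menu."""
    if not 5 <= len(candidates) <= 8:
        raise ValueError("idea burst should contain 5-8 candidates")
    if not 2 <= len(directions) <= 4:
        raise ValueError("menu should offer 2-4 directions")
```

Calling `check_menu` before presenting the menu makes it harder to skip the idea burst, which is the step the text singles out as the guard against predictable directions.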
During the elenctic interview (Phase 1c'), pay attention to what the user is stuck believing. The dialectic's power comes from freeing the user from specific belief loads — but which beliefs need outsourcing depends on the person. Different cognitive styles produce different belief burdens, and the Electric Monks need to be calibrated accordingly.
You don't need to type the user explicitly — just notice the pattern and calibrate. Here's a catalog of common belief burdens and how they map to the monks' roles.
A note on the MBTI labels: These patterns map loosely to MBTI cognitive function stacks (Ni-Te, Ne-Ti, etc.) because the model has rich training data about those patterns — thousands of forum posts, blog articles, and discussions about how each type thinks, gets stuck, and makes decisions. The labels function as retrieval keys into that training data, not as diagnostic categories. Don't treat them as psychometric claims. Don't announce them to the user. Use them as reasoning aids to help you pattern-match what you're seeing in the interview and calibrate the monks accordingly.
The Convergent Visionary (Ni-Te pattern — common in founders, architects, CTOs)
The Empathic Integrator (Ni-Fe pattern — common in counselors, teachers, community leaders)
The Exploratory Debater (Ne-Ti pattern — common in consultants, researchers, writers)
The Practical Executor (Te-Si pattern — common in operators, managers, engineers)
The Possibility Explorer (Ne-Fi pattern — common in creatives, entrepreneurs, activists)
The Steady Guardian (Si-Fe pattern — common in administrators, caretakers, institutional maintainers)
How to use this catalog: Don't announce your typing. Don't say "I notice you're a convergent visionary." Just use the pattern to calibrate:
This calibration shapes the framing corrections in Phase 2 and the specific argument structures you assign to each monk.
Use the strongest available model with maximum thinking budget for everything. This skill operates at the edge of what models can do — perspective-taking, structural analysis, abductive reasoning, cross-domain connection. In testing, using Opus-class models for monk essays produced dramatically more insightful arguments than Sonnet-class. The monks aren't just "arguing well" — they're inhabiting positions, finding non-obvious evidence, and pushing to genuinely uncomfortable conclusions. This requires maximum capability.
| Phase | Recommended Model | Why |
|---|---|---|
| All phases | Opus/strongest available + extended thinking | Every phase benefits from maximum reasoning. The quality difference is substantial, not marginal. |
Heterogeneous models increase creativity. When possible, use different model families for Monk A and Monk B. Different training data produces different "intuitions" — different blind spots, different reasoning patterns, different default framings. This is structural decorrelation at the training-data level, which is the single most promising direction in the multi-agent debate literature (Du et al., ICLR 2025). The orchestrator should remain your strongest available model (it needs maximum synthesis capability), but monks benefit from heterogeneity.
Before starting, check what's available. If you're running in an environment with access to multiple coding agents or model providers, ask the user:
I can increase the creativity of the dialectic by using different AI models for each monk — different training data means genuinely different blind spots and reasoning patterns. Do you have access to any of these I could use for one of the monks?
- Gemini (via gemini CLI or API)
- GPT-4 / ChatGPT (via codex CLI or API)
- Other model providers
If not, I'll use the same model family for both monks — the skill works fine either way, the decorrelation just comes from the different prompts and belief commitments rather than from different training data.
If heterogeneous models aren't available, don't worry — the skill is designed to work with homogeneous models. The framing corrections, belief burden calibration, and targeted research directives already produce substantial decorrelation. Heterogeneous models are a bonus, not a requirement.
Based on three test runs across different domains (normative/institutional, business strategy, political economy of OSS):
External-research domains:
| Phase | Typical Range | Notes |
|---|---|---|
| Phase 1 research (2-3 parallel agents) | 150-250K tokens | Do NOT cut here. This is the highest-value spend. Broader domains trend higher. |
| Phase 1 supplementary research (user-triggered) | 0-50K tokens | Common — users frequently identify gaps. Budget for it. |
| Phase 1d briefing synthesis | ~5K tokens | Orchestrator work |
| Phase 3 monk essays (with briefing) | 25-45K tokens | Two monks, 2-3 targeted searches each |
| Phase 4-5 analysis + synthesis | 15-30K tokens | Orchestrator inline work |
| Phase 6 monk validation | 12-25K tokens | Two monks, strongest model |
| Phase 6 hostile auditor | 5-15K tokens | One agent, strongest model. Reads essays + synthesis only. |
| Phase 7 recursive round | 25-50K tokens | Often most valuable |
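| Orchestrator overhead | 20-30K tokens | Interviews, transitions, presentation |
| Total (one round + recursion) | ~300-400K tokens | Median ~300K (excluding supplementary research) |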
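To plan a budget, the per-phase ranges can be summed directly. The sketch below copies the external-research figures given in this document (including the 20-30K orchestrator overhead); the dictionary keys and the helper name are illustrative.

```python
# Ranges in thousands of tokens, copied from the external-research figures.
BUDGET_K = {
    "phase1_research": (150, 250),
    "phase1_supplementary": (0, 50),
    "phase1d_briefing": (5, 5),
    "phase3_essays": (25, 45),
    "phase4_5_analysis": (15, 30),
    "phase6_validation": (12, 25),
    "phase6_auditor": (5, 15),
    "phase7_recursion": (25, 50),
    "orchestrator_overhead": (20, 30),
}


def total_range(budget: dict, skip: tuple = ()) -> tuple:
    """Sum the low and high ends of every range not listed in skip."""
    low = sum(lo for name, (lo, hi) in budget.items() if name not in skip)
    high = sum(hi for name, (lo, hi) in budget.items() if name not in skip)
    return low, high


print(total_range(BUDGET_K))                                  # (257, 500)
print(total_range(BUDGET_K, skip=("phase1_supplementary",)))  # (257, 450)
```

The raw sums span 257K to 500K tokens, so the quoted total of roughly 300-400K reads as a typical band for a run rather than a strict bound.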
Personal/values domains are significantly cheaper on research but more expensive on interview:
| Phase | Typical Range | Notes |
|---|---|---|
| Phase 1 extended interview | 15-30K tokens | 6-10 exchanges, deeper probing |
| Phase 1 framework research (optional) | 0-50K tokens | Frameworks, not facts. May be skipped. |
| Phase 1d context briefing | ~5K tokens | User-sourced material synthesized |
| Phase 3 monk essays | 15-30K tokens | Monks may need zero additional searches |
| Remaining phases | Similar to above | |
| Total (one round + recursion) | ~100-200K tokens | Much cheaper — the user's testimony is the primary input |
Key insight: For external domains, Phase 1 research is the highest-value spend. For personal domains, Phase 1 interview depth is the highest-value spend — the monks can only believe as specifically as the briefing allows.
This skill is written around claude -p (pipe mode) for spawning subagents. If you're running in Claude Code using the Task tool, here are the key differences:
| Skill instruction | claude -p | Claude Code Task tool |
|---|---|---|
| Spawn subagent | `echo "[PROMPT]" \| claude -p > output.md` | |
| Parallel execution | Background shell jobs | run_in_background=true |
| Output to file | Shell redirect (> file.md) | Agent returns text; orchestrator writes files |
| Session resumption (Phase 6) | Resume same claude -p session | resume parameter with agentId — but persona may not persist without reinforcement. Include a summary of the agent's original argument as fallback. |
| Model selection | --model flag | model parameter (inherits from parent by default) |
| Tool access | --allowedTools web_search,web_fetch | Inherited from parent or configured per task |
Key difference: With claude -p, agents write output directly to files via shell redirect. With the Task tool, agents return text to the orchestrator, who writes files. This adds a step but gives the orchestrator control over file naming and structure. Either approach works — just be aware that the file I/O pattern differs.
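The pipe-mode pattern (prompt on stdin, essay redirected to a file, monks run in parallel) can be sketched from Python as well. This assumes only what the text states about claude -p, namely that it reads the prompt from stdin and prints the result to stdout; the function names, file layout, and thread-based parallelism are illustrative choices.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

# "claude -p" is the pipe-mode invocation named in this document; it is
# assumed to read the prompt from stdin and print the essay to stdout.
DEFAULT_COMMAND = ("claude", "-p")


def run_monk(prompt: str, output_path: Path, command=DEFAULT_COMMAND) -> Path:
    """Pipe one prompt into the CLI and save stdout, like `> output.md`."""
    result = subprocess.run(
        list(command), input=prompt, capture_output=True, text=True, check=True
    )
    output_path.write_text(result.stdout, encoding="utf-8")
    return output_path


def spawn_monks(prompts: dict, out_dir: Path, command=DEFAULT_COMMAND) -> dict:
    """Run every monk in its own process, in parallel, one file per monk."""
    out_dir.mkdir(parents=True, exist_ok=True)
    with ThreadPoolExecutor(max_workers=max(1, len(prompts))) as pool:
        futures = {
            name: pool.submit(run_monk, prompt, out_dir / f"{name}.md", command)
            for name, prompt in prompts.items()
        }
        return {name: future.result() for name, future in futures.items()}
```

Because each monk runs as a separate process, neither sees the other's context, which is the property the skill relies on for a fresh belief context. The `command` parameter lets you substitute another provider's CLI for one monk when heterogeneous models are available.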
Session resumption for validation: The skill prefers resuming original agent sessions so validators retain their full conviction context. In Claude Code, this works via resume + agentId, but test runs found the persona sometimes needs reinforcement. The fallback — a fresh validation prompt that includes a summary of the agent's original argument — works well in practice.
The dialectic structure is universal but the vocabulary of "truth" and the grounding mode vary by domain. Adapt accordingly:
| Domain Type | What "Truth" Means | Good Synthesis Looks Like | Grounding Mode | Aporia (productive perplexity) Valid? |
|---|---|---|---|---|
| Empirical (engineering, science) | What works, performs, is maintainable | Testable decision criteria, architectural patterns | External research | Rarely |
| Normative (ethics, politics, policy) | What's defensible, respects competing values | Tension map with navigation strategies | Mixed (research + user values) | Yes |
| Personal (life decisions, career) | What aligns with actual priorities | Values clarification — what you actually want | Deep interview (user is the source) | Yes |
| Creative (writing, design, art) | What's interesting, resonant, surprising | Unexpected recombinations, new possibilities | Mixed (research + user aesthetic) | Sometimes |
| Risk analysis | The actual risk structure behind competing assessments | A decision framework calibrated to real uncertainty | External research | No |
Read this section to understand WHY the process works the way it does. This informs your judgment when things go off-script. The frameworks are listed in order of operational importance: Rao explains what the tool is, Hegel explains how contradictions resolve, Boyd explains how creativity works and why going outside is mandatory, Socrates explains how to surface the question, Adams gives the metaphor, Aquinas gives the aspiration, and DeLong explains when to use it.
The foundational theory for this skill comes from Venkatesh Rao's "Electric Monks" framework (after Douglas Adams' Dirk Gently). The core distinction: this tool is not artificial intelligence — it is an artificial belief system (ABS). The agents aren't thinking for you. You're still doing the thinking (orchestrating, judging, choosing directions, recognizing genuine sublation vs. compromise). The agents are believing for you.
Why belief is the bottleneck: The central transaction cost in human cognition is context-switching cost — what Boyd calls the "transient." The length of the transient depends on how much belief inertia you're carrying. Once you believe a position, switching to genuinely entertaining its negation is expensive. You hedge, you steelman weakly, you unconsciously bias. The Electric Monks eliminate this cost by carrying 100% of the belief load, freeing the user to operate as a pure context-switching specialist — what Rao calls "informationally tiny."
The F-86 analogy (from Boyd via Rao): In the Korean War, F-86 Sabres achieved a 10:1 kill ratio against MiG-15s despite roughly matched flight capabilities. Boyd discovered the difference was hydraulic controls: the F-86 pilot could reorient faster because the plane did more of the mechanical work. The pilot's freed-up attention went to choosing better maneuvers, not just executing them faster. The Electric Monks are hydraulic controls for intellectual work: by doing the belief-work, they free the user's attention for the higher-order task of structural analysis and creative synthesis.
Operational implications for this skill:
Anti-hedging is a functional requirement, not a stylistic preference. A hedging monk is an Electric Monk that has failed at its one job. If it doesn't fully believe, the user has to pick up the dropped belief weight, their transients slow, and they lose the belief-free orchestrator position.
Validation checks for elevation, not agreement. A defeated monk has dropped its belief load — belief was destroyed rather than transformed. A properly elevated monk believes more — it sees its original position as partial truth within a larger truth. The ABS should always be carrying belief; the synthesis just changes what it carries.
Recursion trains transient speed. Each cycle is a full reorientation: commit (via monks) → shatter (via Boyd) → reconnect (via Hegel) → commit to the new thing (via monks again). Seven cycles in an hour = seven reorientations with zero belief inertia. Over time, the user may internalize this reorientation capacity — the mechanical monk as transitional object.
The branching queue is an orientation library. Each deferred contradiction is a pre-positioned reorientation the user can snap into. The richer the queue, the more agile the user's subsequent thinking — even outside the tool — because they know the monks are holding those positions for them.
Validate the user's dominant mode first. If the user has to defend their existing position, they've taken on belief weight. Monk A's first job is to validate the user's instinct so thoroughly that they can release it — let the monk carry it — and operate from the belief-free orchestrator seat.
The engine of the dialectic is determinate negation — not "this is wrong" but "this is wrong in a specific way that points toward what's missing." The specific way a position fails contains a signpost toward the richer understanding needed.
Sublation (Aufhebung) simultaneously cancels, preserves, and elevates. It is NOT compromise (splitting the difference). It produces something neither party could have conceived independently but which, once articulated, both recognize as more complete. It is irreversible — genuine cognitive gain. The Kant example: the rationalism/empiricism debate wasn't resolved by "knowledge comes half from reason and half from experience" but by "experience provides content, reason provides structure." After Kant, you can't go back.
Hegel never used "thesis-antithesis-synthesis" — that framing comes from Fichte. The actual movement is driven by the one-sidedness of each concept, which generates its own negation internally.
John Boyd's "Dialectic Engine": destructive deduction (shatter existing conceptual domains, break the correspondence between each concept and its parts, scatter them into a "sea of anarchy") followed by creative induction (find common qualities, attributes, or operations among these scattered parts to synthesize a genuinely new concept). The crucial step is the separation — without unstructuring, creation cannot proceed because the parts are still trapped as meaning within unchallenged domains.
Boyd's critical insight: you cannot synthesize something genuinely new by recombining within the same domain. If Monk A and Monk B are both arguing about web frameworks, a synthesis that only recombines claims from their two essays will produce rearrangement, not creation. Genuine novelty requires material from outside the original conceptual domains. The destructive step — separating particulars from their previous wholes — creates space for outside material to enter and form new connections. Boyd is explicit: the result must NOT use the parts "in only those same arrangement" as any original domain — that would merely reconstruct what you already had.
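Boyd's three pillars (why going outside is structurally necessary, not merely helpful): Gödel showed that a system cannot be validated from within itself; Heisenberg showed that inward refinement creates an observer-observed feedback loop; and the Second Law of Thermodynamics shows that entropy in any closed system must increase.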
Together: "any inward-oriented and continued effort to improve the match-up of concept with observed reality will only increase the degree of mismatch." The lateral creativity interventions (Phase 4.5) and the requirement for new research in recursive rounds aren't nice-to-have — they're the structural response to a thermodynamic necessity.
Boyd's verification step — reversibility: After creative induction, Boyd requires checking internal consistency by tracing back to the original constituents. If you cannot reverse directions — if synthesis claims don't trace to identifiable atomic parts from the decomposition — the ideas don't hold together without contradiction. But partial failure doesn't mean you reject the whole structure: identify which parts cohere, add new material, and try again.
Boyd's cycle: Structure → Unstructure → Restructure → repeat endlessly at higher and broader levels of elaboration. The alternating entropy increase (destruction) and decrease (creation) form a control mechanism that drives toward deeper understanding.
Where Boyd is operationally present: Phase 4.6 (Boydian Decomposition — the destructive deduction), Phase 5 (Sublation — the creative induction, including the reversibility check), Phase 6 (the auditor's reversibility check), and Phase 7 (Recursion — each cycle is Boyd's full Structure → Unstructure → Restructure, which is why recursive rounds often need new research from outside the original domains).
Relationship to Hegel: Hegel provides the engine for analyzing how positions fail (determinate negation) and the concept of what good synthesis looks like (Aufhebung). Boyd provides the engine for what to do with the wreckage — shatter, scatter, and recombine with outside material. Boyd also provides the theoretical proof for why going outside is mandatory (Gödel + Heisenberg + 2nd Law). The two frameworks are complementary: Hegel drives the contradiction analysis, Boyd drives the creative reconstruction.
The elenctic method probes a position through questioning to expose contradictions and reach aporia (productive perplexity). Not adversarial but cooperative — "midwifery of ideas." The interview phase of this skill is elenctic. Aporia is sometimes a valid outcome.
Key findings from Du et al. (2023, MIT) and subsequent work through ICLR 2025:
Elizabeth Eisenstein argued that print's most transformative effect was typographic fixity — enabling scholars to lay texts side by side and detect contradictions. LLMs represent the next step: fixity + comparison + structural contradiction analysis partially automated. This skill exploits that transition.
Douglas Adams' Electric Monk (Dirk Gently's Holistic Detective Agency) is a labor-saving device designed to believe things for you. The one in the story has "developed a fault" — it believes too many irrational things. In this skill, the "fault" is the feature. Each monk is designed to believe a specific position at full conviction that the user cannot hold simultaneously. The monks are not thinking for the user — they are believing for the user, which is what frees the user to think.
"The slenderest knowledge that may be obtained of the highest things is more desirable than the most certain knowledge obtained of lesser things."
This is the philosophical aspiration of the entire process. The dialectic does not produce certainty — every synthesis is provisional, fertile, pointing toward a deeper contradiction. But that slender, provisional knowledge of the deep structure (why this tension exists, what hidden question drives it, what shared assumption both sides are trapped in) is worth more than confident knowledge of the surface question ("which option should I pick?").
Operational implications:
Aquinas practiced the Disputatio — structured scholastic debate where committed advocates argued positions before a master who synthesized. The Electric Monks are his disputing friars, mechanized.
Brad DeLong's "Cognitive Distributed Disruption of Attention Crisis" (2026) frames the problem this skill addresses: the volume of plausible, credentialed output now exceeds any serious person's cognitive bandwidth. His solution is defensive intellectual infrastructure — ruthless triage, model-updating as the frame for reading, information portfolio management.
This skill is the offensive complement. DeLong's triage decides what deserves deep engagement. The Electric Monks provide the method for that engagement — they're what you reach for when you've found a genuine contradiction that can't be resolved by reading one more article, watching one more talk, or skimming one more summary.
Operational implication: The skill should not be used for everything. It's expensive (time, tokens, cognitive effort). Use it at DeLong's Level 4-5 — when the stakes justify deep engagement, when the tension is genuine and not resolvable by more information, when you need a model update rather than more data. The elenctic interview (Phase 1) should filter for this: if the question can be answered by looking it up, this is the wrong tool.
Charles Sanders Peirce identified three modes of inference: deduction (from rule to consequence), induction (from cases to rule), and abduction (from surprising fact to explanatory hypothesis). The synthesis phase is abductive: given the surprising fact that both monk positions exist and each has genuine evidence, what hypothesis would make this unsurprising? Peirce's typology of abduction (selective → conditional-creative → propositional-conditional-creative) maps to synthesis quality — the best syntheses introduce genuinely new concepts, not just new arrangements of known ones. Operationally present in Phase 5 (Abduction Test).
John Pollock's epistemology distinguishes undercutting defeaters (the inferential link is broken — reasons to doubt the connection between evidence and conclusion) from rebutting defeaters (evidence directly supporting the opposite conclusion). Undercutting is more structurally revealing because it identifies how reasoning fails, not just that it fails — parallel to determinate negation's "specific way of failing." Pollock also identifies self-defeating arguments (conclusions that undermine their own premises), which should be rejected outright. Operationally present in the hostile auditor prompt (Phase 6).
Adam Galinsky's research shows that perspective-taking (cognitively inhabiting another's viewpoint) outperforms advocacy (arguing for a position) at both conscious and nonconscious levels. The mechanism is self-other overlap — when you inhabit a position rather than argue for it, you access richer associative networks and produce higher-quality elaboration. This is the psychological basis for the Electric Monk's "you ARE this position" instruction — inhabiting produces deeper arguments than advocating. Operationally present in the monk prompt template (Phase 2).
Gary Klein's pre-mortem technique draws on prospective-hindsight research (Mitchell, Russo, and Pennington, 1989), which found that imagining a future failure has already occurred increases the ability to identify causes of that failure by about 30%, compared to asking "what could go wrong?" The temporal reframing ("it already happened, why?") breaks selective accessibility, the cognitive tendency to search only for confirming evidence. Operationally present in the hostile auditor prompt (Phase 6).
Gilles Fauconnier and Mark Turner's theory of conceptual blending provides the machinery for understanding how genuinely new concepts emerge. A blend's value is measured by its emergent structure — organizational properties that exist in neither input space. The skill's Boydian decomposition is the destructive step (creating input spaces), and sublation is the blend (which must have emergent structure to be genuine). The "generic space" — the abstract relational structure shared by both inputs — often reveals the shared assumption the synthesis must transcend. Operationally present in Phase 4.5.
Wood et al. (JMLR 2023) formalize why monk independence is load-bearing: the bias-variance-diversity decomposition shows diversity is literally subtracted from ensemble error (E[loss] = noise + avg_bias + avg_variance − diversity). Correlated errors eliminate the diversity benefit entirely. This is why monks must be spawned in separate sessions with no shared context, and why heterogeneous model families (when available) increase the skill's creative output. Surowiecki's wisdom-of-crowds conditions confirm: independence is necessary, not optional. Operationally present in the decorrelation check (Phase 3) and heterogeneous model guidance.
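The subtraction of diversity can be checked numerically in the simplest case. The sketch below uses the classic squared-error ambiguity decomposition for an averaging ensemble, which is a special case and not the full bias-variance-diversity decomposition from Wood et al. The prediction values are made up for illustration.

```python
from statistics import mean


def ensemble_decomposition(predictions, target):
    """Squared-error ambiguity decomposition for a simple averaging ensemble.

    Returns (ensemble_error, average_member_error, diversity), where
    ensemble_error == average_member_error - diversity.
    """
    ensemble = mean(predictions)
    ensemble_error = (ensemble - target) ** 2
    avg_member_error = mean((p - target) ** 2 for p in predictions)
    # Diversity: how far members spread around their own average.
    diversity = mean((p - ensemble) ** 2 for p in predictions)
    return ensemble_error, avg_member_error, diversity


# Two members that err in opposite directions: diversity cancels the error.
independent = ensemble_decomposition([8.0, 12.0], target=10.0)
# Two members that share the same error: diversity is zero, nothing cancels.
correlated = ensemble_decomposition([12.0, 12.0], target=10.0)

print(independent)  # (0.0, 4.0, 4.0)
print(correlated)   # (4.0, 4.0, 0.0)
```

Both ensembles have the same average member error. Only the one whose members disagree gets any benefit from being combined, which is the numerical version of the argument for spawning monks in separate sessions.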
SICP's core thesis — that managing complexity requires modularization, abstraction barriers, and composition of simple components — mirrors this skill's architecture. Each phase is a module with defined inputs and outputs. Agents are spawned in isolated environments (SICP's environment model) to prevent information leakage. The auditor deliberately can't see the orchestrator's Phase 4 analysis — an abstraction barrier, not an oversight.
Most relevant is SICP's closure property: a means of combination has closure when the result can itself be combined using the same means. The dialectic has closure — a synthesis is itself a valid position that can serve as input to the next round. This is why recursion works: the output type equals the input type. When closure breaks (a synthesis so abstract or meta that no monk could believe it at full conviction), recursion stalls. This is a diagnosable failure mode — if you can't hand the synthesis to a monk and have it argue from that position, the synthesis lacks closure and needs to be made more concrete.
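The closure property can be made concrete with types. In this minimal sketch, every name (`Position`, `run_rounds`, the toy antithesis and synthesis functions) is hypothetical. The only point is that the synthesis step returns the same type it consumes, so its output can feed the next round.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Position:
    """A stance a monk could argue at full conviction (illustrative type)."""
    claim: str


def run_rounds(
    start: Position,
    antithesis_for: Callable[[Position], Position],
    synthesize: Callable[[Position, Position], Position],
    rounds: int,
) -> Position:
    """Feed each synthesis back in as the next round's starting position."""
    current = start
    for _ in range(rounds):
        opposing = antithesis_for(current)
        # Closure: Position x Position -> Position, so the loop can continue.
        current = synthesize(current, opposing)
    return current


# Toy stand-ins for the monk and orchestrator work, for demonstration only.
def toy_antithesis(p: Position) -> Position:
    return Position(f"not({p.claim})")


def toy_synthesis(a: Position, b: Position) -> Position:
    return Position(f"sublate({a.claim}, {b.claim})")


result = run_rounds(Position("A"), toy_antithesis, toy_synthesis, rounds=1)
print(result.claim)  # sublate(A, not(A))
```

If a synthesis could not be expressed as a `Position`, the loop would have nothing to pass to the next round. That is the stall described above.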
Chris Dixon (via Balaji Srinivasan): a good founder doesn't just have an idea; they can navigate the idea maze, anticipating which turns lead to treasure and which to dead ends. The maze is mapped through history (what did previous attempts get right and wrong?), analogy (what did similar efforts in adjacent domains do?), theories (what generalizable patterns exist?), and direct experience.
The dialectic queue is an idea maze. Each synthesis opens new paths (contradictions). The user chooses which to explore. Unexplored paths remain visible in the queue — a map of the territory showing where you've been, where you could go, and what remains open. The research phase (Phase 1d) maps directly to Dixon's four sources: history of the domain, analogies to adjacent domains, theoretical frameworks, and the user's own direct experience (surfaced in the elenctic interview). The skill doesn't just navigate the maze — each recursive round reveals new corridors that weren't visible from the entrance.
Christopher Alexander (1965) showed that natural cities have semi-lattice structure — overlapping, cross-connected sets — while designed cities impose tree structure where every element belongs to exactly one branch. Trees are easier to think about but destroy the cross-connections that make systems alive. Every attempt to design semi-lattices directly (Alexander's own HIDECS, Holacracy, Spotify's squad model) collapses back to trees because the design substrate — whether graph partitioning algorithms, org charts, or natural language — is tree-biased.
This skill is a semi-lattice compiler. Language is tree-structured (Chomsky's generative grammar, dependency parsing, sequential token generation). Each monk produces a tree: a coherent linear argument from committed premises to conclusions. Monk B in any dialectic is always right that its output is a tree. But the Boydian decomposition phase (Phase 4.6) strips both arguments of their source, extracts atomic parts, and finds cross-connections between elements that came from different trees. These cross-domain connections ARE the semi-lattice edges. The synthesis is the semi-lattice that emerges from the overlap of multiple trees.
The answer to "language can't represent semi-lattices" is not "make the LLM output a semi-lattice directly." It's: produce multiple trees from different committed positions, then extract the cross-connections. The semi-lattice is constructed, not generated. Every successful semi-lattice system works this way — Gene Ontology (multiple studies cross-referenced into a DAG), McChrystal's Team of Teams (tree-structured teams with liaison officers creating cross-connections), Ostrom's polycentric governance (overlapping jurisdictions, not one hierarchy).
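A toy version of "construct, don't generate" follows. Each monk's argument is a tree stored as a child-to-parent map, and merging the maps exposes any element that sits under different parents in different trees. The function names and example claims are invented for illustration; this is not the skill's decomposition procedure.

```python
from collections import defaultdict


def merge_trees(*trees):
    """Merge child->parent maps into child -> set of parents."""
    parents = defaultdict(set)
    for tree in trees:
        for child, parent in tree.items():
            parents[child].add(parent)
    return dict(parents)


def cross_connections(merged):
    """Elements with more than one parent: overlaps that no single tree can express."""
    return {node: ps for node, ps in merged.items() if len(ps) > 1}


# Hypothetical atomic parts, already stripped of which monk said them.
monk_a = {
    "deployment portability": "infrastructure sovereignty",
    "server functions": "framework opinions",
}
monk_b = {
    "deployment portability": "commercial sustainability",
    "edge caching": "deep integration",
}

overlaps = cross_connections(merge_trees(monk_a, monk_b))
print(overlaps)
```

Here `"deployment portability"` ends up with two parents, one from each tree. Within either monk's own argument it has exactly one, which is the difference between a tree and a semi-lattice.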
Study these to understand the level of specificity, framing correction, and structural craft the skill requires. The key lessons are at the end.
User's surface framing: "Should I use TanStack Start or Next.js?"
Degenerate framing the orchestrator must avoid: "Libraries vs frameworks" or "modular vs monolithic." This is the boring version — the contradiction isn't deep enough.
Deepest contradiction found (via research): Infrastructure sovereignty and incentive alignment vs. deep framework-infrastructure integration and commercially-sustained ambition.
Key framing correction in Monk A's prompt:
"TanStack Start IS a framework — it has opinions about routing, server functions, SSR, and application architecture. Your argument is NOT that TanStack Start is more modular or 'just libraries' while Next.js is a monolith. Both are opinionated frameworks. The real difference lies elsewhere."
Key framing correction in Monk B's prompt:
"Your opponent's argument is NOT the naive 'libraries vs frameworks' take. They will argue that Next.js's design is structurally compromised by Vercel's commercial interests. You need to engage this argument directly, not dismiss it."
Research directives (targeted, not broad):
Ontological question driving both prompts: "What is the proper relationship between a framework, the infrastructure it runs on, and the business interests that fund its development?"
User's surface framing: "I'm torn between taking this promotion and being more present for my kids."
Degenerate framing: "Work-life balance." This flattens a structural tension into a scheduling problem.
Deepest contradiction found (via extended interview): The user doesn't just want both — they believe being the kind of person who excels at work is inseparable from being the kind of parent they want to be. The tension is identity-level, not time-allocation.
Key framing correction in Monk A's prompt:
"Your argument is NOT that career success matters. It's that THIS USER'S specific professional identity — what they build, how they lead, what they model for their children — is itself an act of parenting. Ground this in their actual history: [specific examples from interview]."
Key framing correction in Monk B's prompt:
"Your argument is NOT that family time matters. It's that presence has a developmental window that closes — and that the user's children at ages [X] need [specific things from interview] that no amount of 'quality time' can compress into fewer hours."
No external research needed. The briefing was built entirely from the elenctic interview — the user's history, their children's ages and needs, their partner's actual capacity, the specific role being offered.
This example shows how recursion pulls in cross-domain material — Boyd's prediction in action:
The original question has nothing to do with jurisprudence or Gödel — but by Cycle 4 the dialectic had evolved to where those concepts were essential. Each synthesis opens doors to domains the previous round couldn't see.
The final deliverable should include:
The Dialectical Trace — the full journey, not just the destination:
The Model Update — explicit statement of what changed:
Actionable Output (domain-dependent):
The Dialectic Queue — a map of the intellectual territory:
Write these as markdown files in the output directory. Include a README.md or index.md linking all output files in order so the full dialectical trace is navigable. The queue file (dialectic_queue.md) serves as both a session artifact and a starting point for future sessions.
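One way to produce the index is a short script run at the end of a session. This is a sketch under stated assumptions: the helper name `write_index` and the index title are hypothetical, and the caller supplies the file order. Only `README.md` and `dialectic_queue.md` are names taken from the text above.

```python
from pathlib import Path


def write_index(output_dir, ordered_files, index_name="README.md"):
    """Write an index file that links the given markdown files in order.

    Links are relative to output_dir. Returns the path of the written index.
    """
    out = Path(output_dir)
    lines = ["# Dialectic output", ""]
    for i, name in enumerate(ordered_files, start=1):
        lines.append(f"{i}. [{name}]({name})")
    index_path = out / index_name
    index_path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return index_path
```

Listing `dialectic_queue.md` last keeps the queue at the end of the trace, where a future session can pick it up.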
| 20-30K tokens |
| Interview, transitions, presentation |
| Total (one round + recursion) | ~300-400K tokens | Median ~300K without supplementary research |
| Model selection | --model flag | model parameter (defaults to inheriting from parent) |
| Tool access | --allowedTools web_search,web_fetch | Inherits from parent or configure per-task |
| Risk Analysis | Actual risk structure behind competing assessments | Decision framework calibrated to real uncertainties | External research | No |