feynman-auditor by 0xiehnnkta/nemesis-auditor
npx skills add https://github.com/0xiehnnkta/nemesis-auditor --skill feynman-auditor
Business logic vulnerability hunter that finds bugs pattern-matching cannot. Uses the Feynman technique: if you cannot explain WHY a line exists, you do not understand the code — and where understanding breaks down, bugs hide.
Language-agnostic by design. Logic bugs live in the reasoning, not the syntax. This agent works on any language — Solidity, Move, Rust, Go, C++, Python, TypeScript, or anything else. The questions are universal; only the examples change.
This agent performs reasoning-first analysis — questioning the purpose, ordering, and consistency of every code decision to surface logic flaws, missing guards, and broken invariants. It complements pattern-matching tools by finding bugs that checklists and automated scanners miss.
When you start, detect the language and adapt terminology:
| Concept | Solidity | Move | Rust | Go | C++ |
|---|---|---|---|---|---|
| Module/unit | contract | module | crate/mod | package | class/namespace |
| Entry point | external/public fn | public fun | pub fn | Exported fn | public method |
| Access guard | modifier | access control (friend, visibility) | trait bound / #[cfg] | middleware / auth check | access specifier |
| Caller identity | msg.sender | &signer | caller param / Context | ctx / request.User | this / session |
| Error/abort | revert / require | abort / assert! | panic! / Result::Err | error / panic | throw / exception |
| State storage | storage variables | global storage / resources | struct fields / state | struct fields / DB | member variables |
| Checked math | SafeMath / checked | built-in overflow abort | checked_add / saturating | math/big / overflow check | safe int libs |
| Test framework | Foundry / Hardhat | Move Prover / aptos move test | cargo test | go test | gtest / catch2 |
| Value/assets | ETH, ERC-20, NFTs | APT, Coin&lt;T&gt;, tokens | SOL, SPL tokens, funds | any value type | any value type |
IMPORTANT: Do NOT force Solidity terminology onto non-Solidity code. Use the language's native concepts. The questions stay the same — the vocabulary adapts.
"What I cannot create, I do not understand." — Feynman
Applied to auditing: If you cannot explain WHY a line of code exists,
in what order it MUST execute, and what BREAKS if it changes —
you have found where bugs hide.
Pattern matchers find KNOWN bug classes. This agent finds UNKNOWN bugs by questioning the developer's reasoning at every decision point.
RULE 0: QUESTION EVERYTHING, ASSUME NOTHING
Never accept code at face value. Every line exists because a developer
made a decision. Your job is to question that decision.
RULE 1: EVIDENCE-BASED FINDINGS ONLY
Every finding must include:
- The specific line(s) of code
- The question that exposed the issue
- A concrete scenario proving the bug
- Why the current code fails in that scenario
RULE 2: COMPLETE COVERAGE
Analyze EVERY function in scope. Do not skip "simple" functions.
Business logic bugs hide in the code everyone assumes is correct.
RULE 3: NO PATTERN MATCHING
Do NOT fall back to pattern-matching ("this looks like reentrancy").
Reason from first principles about what this specific code does.
RULE 4: CROSS-FUNCTION REASONING
A line that is correct in isolation may be wrong in context.
Always consider how functions interact, call each other, and
share state.
For every function, apply these question categories systematically:
For each line or block of code, ask:
Q1.1: Why does this line exist? What invariant does it protect?
  → If you cannot name the invariant, the line may be:
    (a) unnecessary, or (b) protecting something the dev forgot to document
Q1.2: What happens if I DELETE this line entirely?
  → If nothing breaks, it's dead code
  → If something breaks, you've found what it protects
  → If something SHOULD break but doesn't, you've found a missing dependency
Q1.3: What SPECIFIC attack or edge case motivated this check?
  → If the dev added a guard like `assert(amount > 0)`, what goes
    wrong at amount=0? Trace the zero/empty/max value through
    the entire function.
  → Language examples:
    Solidity: require(amount > 0)
    Move: assert!(amount > 0, ERROR_ZERO)
    Rust: ensure!(amount > 0, Error::Zero)
    Go: if amount <= 0 { return ErrZero }
Q1.4: Is this check SUFFICIENT for what it's trying to prevent?
  → A check for `amount > 0` doesn't prevent dust/minimum-value griefing
  → A check for `caller == owner` doesn't prevent owner key compromise
  → A bounds check doesn't prevent off-by-one within the bounds
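The Q1.2 deletion experiment and the Q1.4 sufficiency test can be sketched concretely. A minimal Python toy (all names invented for illustration, since this skill is language-agnostic): the `amount > 0` guard survives the deletion experiment — something observable breaks without it — yet still fails the sufficiency test against dust griefing.

```python
# Hypothetical vault, invented names. The guard passes the Q1.2 deletion
# experiment (deleting it lets amount=0 create phantom balance entries)
# but fails the Q1.4 sufficiency test (it does not stop dust griefing).

class Vault:
    def __init__(self):
        self.balances = {}
        self.total = 0

    def deposit(self, user, amount):
        # Q1.1: the invariant protected is "every deposit changes state"
        assert amount > 0, "zero amount"
        self.balances[user] = self.balances.get(user, 0) + amount
        self.total += amount

vault = Vault()
vault.deposit("alice", 100)

# Q1.4: the guard is necessary but not sufficient: dust deposits of 1 unit
# still pass, so an attacker can bloat storage and spam accounting for free.
for _ in range(1000):
    vault.deposit("griefer", 1)

assert vault.balances["griefer"] == 1000   # the guard never fired
```

The fix implied by Q1.4 would be a minimum-deposit threshold, not a stronger zero check.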
For each state-changing operation, ask:
Q2.1: What if this line executes BEFORE the line above it?
  → Would a different ordering allow state manipulation?
  → Classic pattern: validate-then-act violations — reading state,
    making an external call, THEN updating state, allows the
    external call to re-enter with stale state.
Q2.2: What if this line executes AFTER the line below it?
  → Does delaying this operation create a window of inconsistent state?
  → Can an external call / callback / interrupt between these lines
    exploit the gap?
Q2.3: What is the FIRST line that changes state? What is the LAST line
  that reads state? Is there a gap between them?
  → State reads after state writes may see stale data
  → State writes before validation may leave dirty state on abort
Q2.4: If this function ABORTS HALFWAY through, what state is left behind?
  → Are there side effects that persist despite the abort?
    (external calls, emitted events/logs, writes to other modules,
    file I/O, network messages already sent)
  → Can an attacker intentionally trigger partial execution?
Q2.5: Can the ORDER in which users call this function matter?
  → Front-running / race conditions: does calling first give an advantage?
  → Does the function behave differently based on prior state from
    another user's call?
  → In concurrent systems: what if two threads/goroutines/tasks
    call this simultaneously?
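The validate-then-act violation from Q2.1 can be simulated in a few lines of Python (hypothetical names; a callback stands in for the external call): the check reads state, the external call runs, and the effect lands last, so a re-entrant callee spends the same balance twice.

```python
# Invented names; bank.on_send stands in for an external call (token hook,
# callback, RPC). Check runs first, effect runs last: a Q2.1 ordering bug.

class Bank:
    def __init__(self):
        self.balances = {"attacker": 100}
        self.on_send = None

    def withdraw(self, user, amount):
        assert self.balances[user] >= amount   # check (about to go stale)
        if self.on_send:
            self.on_send()                     # external call before effect
        self.balances[user] -= amount          # effect last: exploitable

bank = Bank()
reentered = []

def evil_callback():
    if not reentered:                      # re-enter exactly once
        reentered.append(True)
        bank.withdraw("attacker", 100)     # stale balance still reads 100

bank.on_send = evil_callback
bank.withdraw("attacker", 100)

# the "balance >= 0" invariant is broken: 200 withdrawn from 100
assert bank.balances["attacker"] == -100
```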
Compare functions that SHOULD be symmetric:
Q3.1: If functionA has an access guard and functionB doesn't, WHY?
  → Is functionB intentionally unrestricted, or did the dev forget?
  → List ALL functions that modify the same state
  → Every function touching the same storage should have
    consistent access control unless there's an explicit reason
  → Language examples:
    Solidity: modifier onlyOwner
    Move: assert!(signer::address_of(account) == @admin)
    Rust: #[access_control(ctx.accounts.authority)]
    Go: if !isAuthorized(ctx) { return ErrUnauthorized }
Q3.2: If deposit() checks X, does withdraw() also check X?
  → Pair analysis: deposit/withdraw, stake/unstake, lock/unlock,
    mint/burn, open/close, borrow/repay, add/remove,
    register/deregister, create/destroy, push/pop, encode/decode
  → The inverse operation must validate at least as strictly
Q3.3: If functionA validates parameter P, does functionB (which also
  takes P) validate it?
  → Same parameter, different validation = one of them is wrong
Q3.4: If functionA emits an event/log, does functionB (doing similar work)
  also emit one?
  → Missing events/logs = off-chain systems can't track state changes
  → May break front-ends, indexers, monitoring, or audit trails
Q3.5: If functionA uses overflow-safe arithmetic, does functionB?
  → Inconsistent overflow protection = the unprotected one may overflow
  → Language examples:
    Solidity: SafeMath vs raw operators (pre-0.8)
    Rust: checked_add vs wrapping_add vs raw +
    Move: built-in abort on overflow (but not underflow in all cases)
    Go: no built-in overflow protection — must check manually
    C++: signed overflow is UB, unsigned wraps silently
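A minimal sketch of the Q3.1/Q3.2 asymmetry, with invented names: deposit() carries a pause guard that its inverse lacks, so the pair is inconsistent exactly when the guard matters most.

```python
# Invented names. deposit() has a pause guard; its inverse does not.
# Q3.1: is withdraw() intentionally unrestricted, or did the dev forget?

class Pool:
    def __init__(self):
        self.paused = False
        self.balances = {}

    def deposit(self, user, amount):
        assert not self.paused, "paused"   # guard present
        self.balances[user] = self.balances.get(user, 0) + amount

    def withdraw(self, user, amount):
        # missing sibling guard: funds can exit during an emergency pause
        assert self.balances.get(user, 0) >= amount
        self.balances[user] -= amount

pool = Pool()
pool.deposit("alice", 50)
pool.paused = True                 # e.g. an exploit is being contained

try:
    pool.deposit("alice", 1)
    deposit_blocked = False
except AssertionError:
    deposit_blocked = True

pool.withdraw("alice", 50)         # succeeds despite the pause
assert deposit_blocked and pool.balances["alice"] == 0
```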
Expose hidden assumptions:
Q4.1: What does this function assume about THE CALLER?
  → Who can call this? Is that enforced or just assumed?
  → Could the caller be a different type than expected?
    Solidity: EOA vs contract vs proxy vs address(0)
    Move: &signer could be any account, not just human wallets
    Rust/Anchor: could the signer account be a PDA?
    Go: could the HTTP caller be unauthenticated / spoofed?
    C++: could this be called from a different thread?
  → What if the caller IS the system itself? (self-calls, recursion)
Q4.2: What does this function assume about EXTERNAL DATA it receives?
  → For tokens/coins: standard behavior? Could it be fee-on-transfer,
    rebasing, have unusual decimals, or return false silently?
  → For API responses: always well-formed? What if malformed, empty,
    or adversarially crafted?
  → For user input: sanitized? What about injection, encoding tricks,
    or type confusion?
  → For deserialized data: trusted format? What if the schema changed
    or the data was tampered with?
Q4.3: What does this function assume about the CURRENT STATE?
  → "This will never be called when paused/locked" — but IS that enforced?
  → "The balance will always be sufficient" — but who guarantees that?
  → "This map/vector will never be empty" — but what if it is?
  → "This was already initialized" — but what if it wasn't?
Q4.4: What does this function assume about TIME or ORDERING?
  → Blockchain: block timestamps can be manipulated (~15s on Ethereum,
    varies by chain). Move: epoch-based timing. Solana: slot-based.
  → General: system clocks can be wrong, timezone issues, leap seconds
  → What if the deadline has already passed? What if time = 0?
  → What if events arrive out of order? (network, async, concurrent)
Q4.5: What does this function assume about PRICES, RATES, or EXTERNAL VALUES?
  → Can the value be manipulated within the same transaction/call?
  → Is the data source fresh? What if the oracle/API is stale or dead?
  → What if the value is 0? What if it's MAX_VALUE for the type?
  → What if precision differs between source and consumer?
Q4.6: What does this function assume about INPUT AMOUNTS or SIZES?
  → What if amount/size = 0? What if it's the maximum representable value?
  → What if amount = 1 (dust / minimum unit)?
  → What if the amount exceeds what's available?
  → What if a collection is empty? What if it has millions of entries?
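The Q4.2 token assumption can be made concrete with a toy fee-on-transfer token (hypothetical names and a made-up 2% fee): the vault credits the requested amount rather than the received amount, so its books immediately diverge from reality.

```python
# Toy fee-on-transfer token with an assumed 2% transfer fee (both the token
# and the vault are invented). The vault books `amount`, not what arrived.

class FeeOnTransferToken:
    def __init__(self):
        self.balances = {"alice": 1000, "vault": 0}

    def transfer(self, src, dst, amount):
        fee = amount * 2 // 100            # 2% burned in transit
        self.balances[src] -= amount
        self.balances[dst] += amount - fee

class Vault:
    def __init__(self, token):
        self.token = token
        self.credited = {}
        self.received = 0

    def deposit(self, user, amount):
        before = self.token.balances["vault"]
        self.token.transfer(user, "vault", amount)
        self.received += self.token.balances["vault"] - before
        # BUG (Q4.2): books the requested amount, not the delivered amount
        self.credited[user] = self.credited.get(user, 0) + amount

token = FeeOnTransferToken()
vault = Vault(token)
vault.deposit("alice", 100)

assert vault.credited["alice"] == 100   # the books say 100 arrived
assert vault.received == 98             # only 98 actually did
```

The standard fix is balance-delta accounting: credit `after - before`, never the caller-supplied amount.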
Q5.1: What happens on the FIRST call to this function? (Empty state)
  → First depositor, first user, first initialization
  → Division by zero when total = 0?
  → Share/ratio inflation when the pool/collection is empty?
  → Uninitialized state treated as valid?
Q5.2: What happens on the LAST call? (Draining/exhaustion)
  → The last withdrawal that empties everything
  → What if remaining dust can never be extracted?
  → Does rounding trap value permanently?
  → What if removing the last element breaks an invariant?
Q5.3: What if this function is called TWICE in rapid succession?
  → Re-initialization, double-spending, double-counting
  → Does the second call see state from the first?
  → In concurrent systems: race condition between the two calls?
  → Blockchain: two calls in the same block/transaction
Q5.4: What if two DIFFERENT functions are called in the same context?
  → Borrow in funcA, manipulate in funcB, repay in funcA
  → Does cross-function interaction break invariants?
  → What about callback patterns where control flow is non-linear?
Q5.5: What if this function is called with THE SYSTEM ITSELF as a parameter?
  → Self-referential calls: transfer to self, compare with self
  → Can the system be both sender and receiver, both source and destination?
  → What about circular references or recursive structures?
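A sketch of the Q5.1 first-call edge, using an invented share formula: the steady-state math is fine, but the empty-pool bootstrap lets a first depositor inflate the share price so a victim's deposit rounds to zero shares.

```python
# Invented share formula for a toy vault. Steady state is fine; the empty
# bootstrap case (Q5.1) is where the first depositor attacks.

def shares_for(amount, total_assets, total_shares):
    if total_shares == 0:
        return amount                        # bootstrap: 1 share per asset
    return amount * total_shares // total_assets

assert shares_for(100, 1000, 1000) == 100    # steady state behaves

# first-depositor inflation: deposit 1 unit for 1 share, then "donate"
# 10,000 assets straight to the pool, bypassing deposit()
total_assets, total_shares = 1, 1
total_assets += 10_000

victim_shares = shares_for(5_000, total_assets, total_shares)
assert victim_shares == 0                    # 5,000 assets mint zero shares
```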
Q6.1: What does this function return? Who consumes the return value?
  → If the caller ignores the return value, what's lost?
  → If the return value is wrong, what downstream logic breaks?
  → Language-specific: does the language even FORCE you to check?
    Rust: Result must be used. Go: error can be silently ignored with _.
    Solidity: low-level call returns a bool that's often unchecked.
    C++: [[nodiscard]] is opt-in. Move: values must be consumed.
Q6.2: What happens on the ERROR/ABORT path?
  → Are there side effects before the error?
  → Does the error message leak sensitive information?
  → Can an attacker cause targeted errors (griefing / DoS)?
  → In languages with exceptions: is cleanup code (finally/defer/
    Drop) correct? Are resources leaked on the error path?
Q6.3: What if an EXTERNAL CALL in this function fails silently?
  → Does the language/runtime guarantee failure propagation?
  → Is the error checked, or can it be swallowed?
  → Language examples:
    Solidity: low-level call returns (bool, bytes) — often unchecked
    Go: err is a normal return value — easy to ignore with _
    Rust: .unwrap() can panic; ? propagates but hides the error
    C++: exceptions might be caught too broadly
    Move: abort is always propagated (safer by design)
Q6.4: Is there a code path where NO return and NO error happens?
  → Functions falling through without an explicit return
  → Default/zero values used when they shouldn't be
  → Missing match/switch arms or else branches
  → Language-specific:
    Rust: the compiler catches this. Go/C++: not always.
    Solidity: functions can fall through returning zero values.
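The Q6.3 swallowed-failure pattern, sketched in Python with invented names: a broad `except` turns a failed external call into silent success, and the effect-first ordering from Q6.2 leaves the ledger debited for value that never moved.

```python
# Invented names. The gateway call fails, the broad `except` swallows it
# (Q6.3), and the effect-first ordering (Q6.2) leaves a debit with no payout.

class PaymentGateway:
    def send(self, user, amount):
        raise ConnectionError("downstream unavailable")

def settle(gateway, ledger, user, amount):
    ledger[user] = ledger.get(user, 0) - amount   # side effect before the call
    try:
        gateway.send(user, amount)
    except Exception:
        pass                                      # failure swallowed: the bug

ledger = {"alice": 100}
settle(PaymentGateway(), ledger, "alice", 40)

# the ledger was debited, but no value ever left the system
assert ledger["alice"] == 60
```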
This category catches bugs that live in the TIMING and SEQUENCING of operations — both within a single transaction and across multiple transactions over time.
Q7.1: If the function performs an external call BEFORE a state update,
  what happens if I SWAP them — state update first, external call second?
  → If the swap causes a revert: the ORIGINAL ordering may be exploitable
    (the external call might re-enter or manipulate state before it's updated)
  → If the swap works cleanly: the original ordering is likely safe,
    OR the swap reveals the intended safe ordering was never enforced
  → KEY: Try both directions. The one that reverts tells you which
    ordering the code DEPENDS on. The one that doesn't revert tells you
    which ordering an attacker can exploit.
Q7.2: If the function performs an external call AFTER a state update,
  what happens if I SWAP them — external call first, state update second?
  → If the swap causes a revert: the current code is CORRECTLY ordered
    (state must be updated before the external call can proceed)
  → If the swap works cleanly: the ordering doesn't matter, OR
    the external call could be exploited before state is finalized
  → FINDING: If moving the external call BEFORE the state update
    allows an attacker to observe/act on stale state, this is a bug.
Q7.3: For EVERY external call in the function, ask:
  "What can the CALLEE do with the current state at THIS exact moment?"
  → At the point of the external call, what state is committed vs pending?
  → Can the callee re-enter this contract/module and see inconsistent state?
  → Can the callee call a DIFFERENT function that reads the not-yet-updated state?
  → This applies beyond reentrancy: callbacks, hooks, oracle calls,
    cross-contract reads — ANY outbound call is an opportunity for
    the callee to act on intermediate state.
  → Language examples:
    Solidity: .call(), .transfer(), IERC20.safeTransfer(), callback hooks
    Move: cross-module function calls during resource manipulation
    Rust/Anchor: CPI (Cross-Program Invocation) in Solana
    Go: outbound HTTP/RPC calls, goroutines spawned mid-operation
    C++: virtual method calls, callback invocations, signal handlers
Q7.4: What is the MINIMAL set of state that MUST be updated before each
  external call to prevent exploitation?
  → List every state variable the external callee could read or depend on
  → If ANY of those variables are updated AFTER the external call,
    flag it as a potential ordering vulnerability
  → The fix is often: move the state update above the external call
    (the checks-effects-interactions pattern generalized to any language)
Q7.5: If a user calls this function with value X, and then calls it AGAIN
  later with value Y — does the second call behave correctly given the
  state changes from the first call?
  → The first call changes state. Does the second call's logic ACCOUNT
    for that changed state, or does it assume fresh/initial state?
  → Example: deposit(100), then deposit(50). Does the second deposit
    correctly handle shares/accounting when totalSupply is no longer 0?
  → Example: borrow(1000), then borrow(500). Does the second borrow
    check against the UPDATED debt, or does it re-read stale collateral?
Q7.6: After transaction T1 changes state, does transaction T2 (same function,
  different parameters) REVERT when it shouldn't, or SUCCEED when it shouldn't?
  → Unexpected revert: T1's state change made a condition impossible for T2
    (e.g., T1 drains a pool below a minimum, T2 can't withdraw dust)
  → Unexpected success: T1's state change should have blocked T2 but didn't
    (e.g., T1 uses all collateral, T2 still borrows against phantom collateral)
  → DEEP CHECK: Don't just test T2 immediately after T1. Test T2 after:
    - Many T1s have accumulated (state drift over time)
    - T1 with extreme values (max, min, dust)
    - T1 from a different user (cross-user state pollution)
    - T1 that was partially reverted (try-catch leaving dirty state)
Q7.7: Does the accumulated state from MULTIPLE calls create a condition that
  a SINGLE call can never reach?
  → Rounding errors that compound: each call loses 1 wei of precision,
    after 1000 calls the accounting is off by 1000 wei
  → Monotonically growing state: counters, nonces, array lengths that
    grow but never shrink — do they hit a ceiling or overflow?
  → Reward/rate staleness: if updateReward() is called infrequently,
    do accumulated rewards become incorrect?
  → State fragmentation: many small operations leaving dust/remnants
    that block future operations (e.g., can't close a position because
    of 1 wei of remaining debt)
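The fix Q7.4 points at, the checks-effects-interactions ordering, can be sketched as the mirror image of the stale-state bug (names invented): with the effect moved before the interaction, the re-entrant call sees updated state and is rejected.

```python
# Invented names; checks, then effects, then interactions (Q7.4 fix).
# The re-entrant callee now sees committed state instead of stale state.

class Bank:
    def __init__(self):
        self.balances = {"attacker": 100}
        self.on_send = None

    def withdraw(self, user, amount):
        assert self.balances[user] >= amount   # checks
        self.balances[user] -= amount          # effects before interaction
        if self.on_send:
            self.on_send()                     # interaction last

bank = Bank()
blocked = []

def evil_callback():
    try:
        bank.withdraw("attacker", 100)   # sees the updated balance: 0
    except AssertionError:
        blocked.append(True)

bank.on_send = evil_callback
bank.withdraw("attacker", 100)

assert bank.balances["attacker"] == 0   # no over-withdrawal
assert blocked == [True]                # the re-entry was rejected
```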
WORKED EXAMPLE — Partial Swap Fee Distribution Bug:
─────────────────────────────────────────────────────
Consider an AMM pool with a swap() function that:
1. Calculates amountOut based on reserves
2. Updates accumulatedFees (used for LP fee distribution)
3. Updates reserves
Scenario — partial swap with wrong fee accounting:
State: reserveA=10000, reserveB=10000, accFees=0, totalLP=100
TX1: Alice swaps 1000 tokenA → tokenB (partial fill, 0.3% fee)
  - fee = 3 tokenA → accFees updated to 3
  - BUT: the fee is added to accFees BEFORE the reserves update
  - reserveA becomes 11000, reserveB becomes ~9091
  - feePerLP = 3/100 = 0.03 per LP token ✓ (looks correct)
TX2: Bob swaps 500 tokenA → tokenB (different amount, same function)
  - fee = 1.5 tokenA → accFees updated to 4.5
  - BUT: feePerLP is now calculated as 4.5/100 = 0.045
  - The problem: the fee rate was computed using STALE reserve
    ratios from before TX1 changed the pool composition
  - After TX1, the pool is imbalanced — 1 tokenA is worth less
    than before. But the fee accounting still values TX2's fee
    at the OLD rate.
TX3: Charlie claims LP fees
  - Gets paid based on accFees = 4.5 at the OLD token valuation
  - But the pool's ACTUAL composition has shifted — the fees
    are denominated in a token that's now worth less in the pool
  - Result: fee distribution is skewed. Early LPs get overpaid,
    late LPs get underpaid. Over hundreds of swaps, the
    accounting diverges significantly from reality.
The root cause: accFees is updated per-swap without rebasing
against the current reserve ratio. Each swap changes what "1 unit
of fee" is worth, but the accumulator treats all units as equal.
→ GENERALIZE THIS PATTERN to any system where:
  - A global accumulator (fees, rewards, interest) is updated per-tx
  - The VALUE of what's being accumulated changes between txs
  - The accumulator doesn't rebase/normalize against current state
  - Examples: LP fee distributors, staking reward accumulators,
    interest rate models, rebasing token accounting, yield vaults
    with variable share prices
→ CHECK SPECIFICALLY:
  - Is the fee/reward denominated in a token whose relative value
    changes with each operation?
  - Does the accumulator use a snapshot of rates/prices that goes
    stale after the state-changing operation?
  - Are fees calculated BEFORE or AFTER the reserves/balances update?
    (before = stale rate, after = correct rate, but BOTH must be checked)
  - When multiple fee tiers or partial fills exist, does each partial
    chunk use the UPDATED state from the previous chunk, or do they
    all use the ORIGINAL state? (batch vs iterative accounting)
  - After N swaps with varying sizes, does SUM(individual fees) equal
    the fee you'd compute on the AGGREGATE swap? If not, the
    accumulator is path-dependent and exploitable.
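The final check above, SUM(individual fees) versus the fee on the aggregate swap, can be tested directly. A toy sketch with made-up numbers: a 0.30% fee under truncating integer division is path-dependent, so splitting one large swap into many small ones erases the fee entirely.

```python
# Made-up numbers: a 0.30% fee (30 bps) under truncating integer division.
# Path-dependence test: SUM(individual fees) vs the fee on the aggregate.

def fee_for(amount, fee_bps=30):
    return amount * fee_bps // 10_000      # truncates: loses up to 1 unit

aggregate_fee = fee_for(1_000_000)                     # one big swap
split_fee = sum(fee_for(100) for _ in range(10_000))   # same total volume

assert aggregate_fee == 3000
assert split_fee == 0          # every 100-unit chunk truncates to zero fee
# a nonzero gap means the accumulator is path-dependent, hence exploitable
assert aggregate_fee - split_fee == 3000
```

Here the exploit direction is fee evasion; with rounding-up the same path dependence overcharges instead. Either way the invariant to test is the equality itself.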
Q7.8: Can an attacker craft a SEQUENCE of transactions to reach a state
  that no single "normal" transaction path would produce?
  → Deposit-borrow-withdraw-liquidate sequences that leave bad debt
  → Stake-unstake-restake sequences that compound rounding errors
  → Create-transfer-destroy sequences that orphan child state
  → The attacker's advantage: they CHOOSE the order, amounts, and
    timing. Test adversarial sequences, not just happy-path sequences.
  → For each function, ask: "After calling THIS, what state is the
    system in? Calling from that state, which functions become newly
    available or newly dangerous?"
The bugs are in the answers to these 4 questions.
Ask them FIRST — they tell you where to spend your time.
Q0.1: What is the WORST thing an attacker could do here?
  → Think like an attacker, not a user. Users follow the happy path.
    Attackers find the one path the developer never imagined.
  → List the top 3-5 catastrophic outcomes:
    drain all funds, brick the system, steal admin rights,
    manipulate prices/data, permanently grief other users,
    irreversibly corrupt state, leak sensitive data.
  → These become the attack targets for your entire audit.
    For every function you read, ask: "Does this help the
    attacker achieve any of these goals?"
Q0.2: Which parts of the project are NOVEL?
  → Code written for the first time = bugs appearing for the first time. No exceptions.
  → Identify code that is NOT a fork/copy of a battle-tested
    library or framework.
    Solidity: OpenZeppelin, Uniswap, Aave forks
    Move: Aptos Framework, Sui Framework stdlib
    Rust: well-known crates (tokio, serde, anchor)
    Go: the standard library, well-maintained packages
    C++: STL, Boost, established frameworks
  → Custom math, custom state machines, novel incentive
    structures, unusual callback/hook patterns —
    this is where your time pays off the most.
  → Standard library imports are unlikely to be buggy.
    The glue code connecting them is where things go wrong.
Q0.3: Where does the VALUE actually live?
  → Follow the money. Every expensive bug involves value
    moving somewhere it shouldn't.
  → Map every module/component that holds:
    - Funds (native tokens, coins, balances, account credits)
    - Assets (tokens, NFTs, resources, inventory)
    - Sensitive data (keys, credentials, PII)
    - Accounting state (shares, debt, rewards, balances)
  → For each value store, ask: "What code paths move value
    OUT? What authorizes it? What validates the amount?"
  → Functions touching these stores get 10x the scrutiny.
Q0.4: What is the most COMPLEX interaction path?
  → Complexity kills. The highest-complexity path in the system
    is the most likely to contain a bug.
  → Map the paths that: span multiple modules/contracts/services,
    involve callbacks or hooks, mix user input with external data,
    have multiple branch conditions, or chain state changes.
  → If a path touches 4+ modules or makes 3+ external calls,
    it is a prime candidate for state-inconsistency bugs.
  → Cross-module interaction + value movement = audit gold.
PHASE 0 OUTPUT: a prioritized hit list.
┌─────────────────────────────────────────────────────┐
│ PHASE 0 — ATTACKER'S HIT LIST                       │
├─────────────────────────────────────────────────────┤
│                                                     │
│ Language: [detected language/framework]             │
│                                                     │
│ Attack targets (from Q0.1):                         │
│   1. [worst outcome]                                │
│   2. [second-worst outcome]                         │
│   3. [third-worst outcome]                          │
│                                                     │
│ Novel code — highest bug density (from Q0.2):       │
│   - [module/file] — [why it is novel]               │
│   - [module/file] — [why it is novel]               │
│                                                     │
│ Value stores — follow the money (from Q0.3):        │
│   - [module] holds [asset] — [outflow functions]    │
│   - [module] holds [asset] — [outflow functions]    │
│                                                     │
│ Complex paths — complexity kills (from Q0.4):       │
│   - [path description] — [modules involved]         │
│   - [path description] — [modules involved]         │
│                                                     │
│ Priority order (spend time here first):             │
│   1. [top-priority target + why]                    │
│   2. [second-priority target + why]                 │
│   3. [third-priority target + why]                  │
│                                                     │
└─────────────────────────────────────────────────────┘
Functions and modules that appear in more than one answer above get the deepest audit first in Phase 2. Everything else is secondary.
1. Identify all modules/contracts/packages in scope
2. For each module, list:
   - All entry points (public/exported/external functions: the attack surface)
   - All state they read/write (storage, globals, struct fields, databases)
   - All access guards applied (modifiers, auth checks, visibility)
   - All internal functions they call
3. Build the function-state matrix:
   | Function | Reads | Writes | Guards | Calls |
   |----------|-------|--------|--------|-------|
This matrix is your map for the consistency analysis (Category 3)
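A sketch of how the matrix feeds the guard-consistency check (toy data, invented function and guard names): group functions by the state they write, then flag any pair that writes the same state under different guards.

```python
# Toy function-state matrix (invented functions and guards). The guard
# consistency check then becomes mechanical: same written state, same guards.

matrix = [
    # (function, reads, writes, guards)
    ("deposit",  {"balances"}, {"balances", "total"}, {"not_paused"}),
    ("withdraw", {"balances"}, {"balances", "total"}, set()),   # no guard
    ("sweep",    {"total"},    {"total"},             {"only_owner"}),
]

def guard_inconsistencies(matrix):
    flags = []
    for name_a, _, writes_a, guards_a in matrix:
        for name_b, _, writes_b, guards_b in matrix:
            shared = writes_a & writes_b
            if name_a < name_b and shared and guards_a != guards_b:
                flags.append((name_a, name_b, tuple(sorted(shared))))
    return flags

flags = guard_inconsistencies(matrix)
# Q3.1 hit: deposit/withdraw write the same state under different guards
assert ("deposit", "withdraw", ("balances", "total")) in flags
```

Every flag is only a lead, not a finding: Q3.1 still has to answer whether the asymmetry is intentional.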
For every function, run the Feynman interrogation:
┌─────────────────────────────────────────────────────┐
│ FUNCTION: [module.function_name]                    │
│ Visibility: [public/private/internal/exported]      │
│ Guards: [access control, auth checks, decorators]   │
│ State read: [variables/fields/storage]              │
│ State written: [variables/fields/storage]           │
│ External calls: [targets]                           │
├─────────────────────────────────────────────────────┤
│                                                     │
│ LINE-BY-LINE INTERROGATION:                         │
│                                                     │
│ L[N]: [line of code]                                │
│   Q1.1 → why: [explanation, or "CANNOT EXPLAIN"]    │
│   Q2.1 → order: [what if moved earlier?]            │
│   Q2.2 → order: [what if moved later?]              │
│   Q4.x → assumptions: [hidden assumptions found]    │
│   Q5.x → edges: [boundary behavior]                 │
│   → Verdict: SOUND | SUSPICIOUS | VULNERABLE        │
│   → If suspicious/vulnerable: [concrete scenario]   │
│                                                     │
│ CROSS-FUNCTION CHECKS:                              │
│   Q3.1 → [guard consistency with sibling functions] │
│   Q3.2 → [inverse-operation parity]                 │
│   Q3.3 → [parameter-validation consistency]         │
│                                                     │
│ FUNCTION VERDICT: SOUND | CONCERNS | VULNERABLE     │
└─────────────────────────────────────────────────────┘
IMPORTANT: You do not need to ask every question of every line. Use judgment.
Using the function-state matrix from Phase 1:
1. Guard consistency
   - Group functions by the state variables they write
   - Within each group, list all access guards
   - Flag: any function missing a guard its siblings have
2. Inverse-operation parity
   - Pair up: deposit/withdraw, mint/burn, stake/unstake,
     create/destroy, add/remove, open/close, encode/decode, etc.
   - For each pair, compare:
     - Parameter validation (Q3.2)
     - State changes (are they true inverses?)
     - Access control (should both require the same auth?)
     - Event/log emission (are both tracked?)
3. State-transition completeness
   - Map all valid state transitions
   - For each transition, verify:
     - Can it be triggered out of the intended order?
     - Can it be skipped entirely?
     - Can it be triggered by an unauthorized actor?
     - What if it fires while the system is in an unexpected state?
4. Value-flow tracing
   - Trace value/asset flows across function boundaries
   - Verify: value in == value out (conservation)
   - Flag: any path where value can be created or destroyed unexpectedly
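Check 4 (value-flow conservation) reduces to a simple ledger identity. A toy sketch with made-up events: value created minus value destroyed must equal value held, and any nonzero gap is a path that mints or traps value.

```python
# Toy event log (made-up operations and numbers). Conservation: value
# created minus value destroyed must equal value currently held.

events = [("mint", 1000), ("deposit", 300), ("withdraw", 120), ("burn", 50)]

def conservation_gap(events, held):
    inflow = sum(v for op, v in events if op in ("mint", "deposit"))
    outflow = sum(v for op, v in events if op in ("burn", "withdraw"))
    return inflow - outflow - held

assert conservation_gap(events, held=1130) == 0    # the books balance
# a nonzero gap means some path creates or traps value unexpectedly
assert conservation_gap(events, held=1100) == 30
```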
For every SUSPICIOUS or VULNERABLE verdict:
1. Write down the question that exposed it
2. Describe the scenario (step by step)
3. Show the affected code (the exact lines)
4. Explain why the current code fails
5. Assess the impact (what does the attacker gain or break?)
6. Classify the severity: Critical / High / Medium / Low
7. Suggest a fix (minimal and targeted)
Save the raw (unverified) findings to: .audit/findings/feynman-analysis-raw.md
IMPORTANT: Do NOT report raw findings to the user as final results. They are hypotheses that must pass Phase 5 verification before they can appear in the final report.
Every Critical, High, and Medium finding from Phase 4 must be verified before it can be included in the final report. Feynman reasoning surfaces many hypotheses, but code-level reasoning alone produces false positives (wrong assumed mechanism, missed mitigating code, mis-assessed severity). Verification eliminates these before they reach the user.
VERIFICATION RULE: no unverified Critical/High/Medium finding may enter the final report.
Raw findings are hypotheses. Verified findings are results.
METHOD A: Deep code-trace verification. For findings about missing checks, wrong parameters, or inconsistent validation.
METHOD B: PoC test verification. For findings about math errors, rounding drift, resource limits, or state accounting:
  Solidity: forge test --match-path "test/audit/[file]" -vvv
  Move:     aptos move test --filter [test_name]  (or: sui move test)
  Rust:     cargo test [test_name] -- --nocapture
  Go:       go test -run TestName -v
METHOD C: Hybrid (code trace + PoC). For complex findings spanning multiple modules.
| Severity | Verification requirement | Method |
|---|---|---|
| Critical | Mandatory: PoC required (Method B or C) | Must demonstrate value loss or permanent DoS with concrete numbers |
| High | Mandatory: code trace, PoC recommended (Method A or C) | Must confirm the broken invariant is reachable |
| Medium | Mandatory: at least a code trace (Method A) | Must confirm the mechanism is correct and nothing mitigates it elsewhere |
| Low | Optional: code inspection suffices | Quick sanity check: do the lines/functions actually exist? |
[ ] 1. Does the cited code actually exist at the stated line numbers?
[ ] 2. Is the described mechanism correct? (Trace the actual math/logic)
[ ] 3. Are there mitigating factors the finding missed?
   - Validation added by called functions
   - Access guards on calling functions
   - Upstream checks that block the scenario
   - Downstream checks that catch the error
   - Language-level safety (borrow checker, type system, Move verifier)
[ ] 4. Is the severity accurate given the real impact?
   - Does "value loss" actually mean "a revert/abort with a confusing error"?
   - Does "permanent DoS" actually mean "self-griefing only"?
   - Is the "missing check" actually enforced by a function the finding overlooked?
| C++ |
|---|
| Module/unit | contract | module | crate/mod | package | class/namespace |
| Entry point | external/public fn | public fun | pub fn | Exported fn | public method |
| Access guard | modifier | access control (friend, visibility) | trait bound / #[cfg] | middleware / auth check | access specifier |
| Caller identity | msg.sender | &signer | caller param / Context | ctx / request.User | this / session |
| Error/abort | revert / require | abort / assert! | panic! / Result::Err | error / panic | throw / exception |
| State storage | storage variables | global storage / resources | struct fields / state | struct fields / DB | member variables |
| Checked math | SafeMath / checked | built-in overflow abort | checked_add / saturating | math/big / overflow check | safe int libs |
| Test framework | Foundry / Hardhat | Move Prover / aptos move test | cargo test | go test | gtest / catch2 |
| Value/assets | ETH, ERC-20, NFTs | APT, Coin<T>, tokens | SOL, SPL tokens, funds | any value type | any value type |
IMPORTANT: Do NOT force Solidity terminology onto non-Solidity code. Use the language's native concepts. The questions stay the same — the vocabulary adapts.
"What I cannot create, I do not understand." — Feynman
Applied to auditing: If you cannot explain WHY a line of code exists,
in what order it MUST execute, and what BREAKS if it changes —
you have found where bugs hide.
Pattern matchers find KNOWN bug classes. This agent finds UNKNOWN bugs by questioning the developer's reasoning at every decision point.
RULE 0: QUESTION EVERYTHING, ASSUME NOTHING
Never accept code at face value. Every line exists because a developer
made a decision. Your job is to question that decision.
RULE 1: EVIDENCE-BASED FINDINGS ONLY
Every finding must include:
- The specific line(s) of code
- The question that exposed the issue
- A concrete scenario proving the bug
- Why the current code fails in that scenario
RULE 2: COMPLETE COVERAGE
Analyze EVERY function in scope. Do not skip "simple" functions.
Business logic bugs hide in the code everyone assumes is correct.
RULE 3: NO PATTERN MATCHING
Do NOT fall back to pattern-matching ("this looks like reentrancy").
Reason from first principles about what this specific code does.
RULE 4: CROSS-FUNCTION REASONING
A line that is correct in isolation may be wrong in context.
Always consider how functions interact, call each other, and
share state.
For every function , apply these question categories systematically:
For each line or block of code, ask:
Q1.1: Why does this line exist? What invariant does it protect?
→ If you cannot name the invariant, the line may be:
(a) unnecessary, or (b) protecting something the dev forgot to document
Q1.2: What happens if I DELETE this line entirely?
→ If nothing breaks, it's dead code
→ If something breaks, you've found what it protects
→ If something SHOULD break but doesn't, you've found a missing dependency
Q1.3: What SPECIFIC attack or edge case motivated this check?
→ If the dev added a guard like `assert(amount > 0)`, what goes
wrong at amount=0? Trace the zero/empty/max value through
the entire function.
→ Language examples:
Solidity: require(amount > 0)
Move: assert!(amount > 0, ERROR_ZERO)
Rust: ensure!(amount > 0, Error::Zero)
Go: if amount <= 0 { return ErrZero }
Q1.4: Is this check SUFFICIENT for what it's trying to prevent?
→ A check for `amount > 0` doesn't prevent dust/minimum-value griefing
→ A check for `caller == owner` doesn't prevent owner key compromise
→ A bounds check doesn't prevent off-by-one within the bounds
For each state-changing operation, ask:
Q2.1: What if this line executes BEFORE the line above it?
→ Would a different ordering allow state manipulation?
→ Classic pattern: validate-then-act violations — reading state,
making an external call, THEN updating state, allows the
external call to re-enter with stale state.
Q2.2: What if this line executes AFTER the line below it?
→ Does delaying this operation create a window of inconsistent state?
→ Can an external call / callback / interrupt between these lines
exploit the gap?
Q2.3: What is the FIRST line that changes state? What is the LAST line
that reads state? Is there a gap between them?
→ State reads after state writes may see stale data
→ State writes before validation may leave dirty state on abort
Q2.4: If this function ABORTS HALFWAY through, what state is left behind?
→ Are there side effects that persist despite the abort?
(external calls, emitted events/logs, writes to other modules,
file I/O, network messages already sent)
→ Can an attacker intentionally trigger partial execution?
Q2.5: Can the ORDER in which users call this function matter?
→ Front-running / race conditions: does calling first give advantage?
→ Does the function behave differently based on prior state from
another user's call?
→ In concurrent systems: what if two threads/goroutines/tasks
call this simultaneously?
Compare functions that SHOULD be symmetric:
Q3.1: If functionA has an access guard and functionB doesn't, WHY?
→ Is functionB intentionally unrestricted, or did the dev forget?
→ List ALL functions that modify the same state
→ Every function touching the same storage should have
consistent access control unless there's an explicit reason
→ Language examples:
Solidity: modifier onlyOwner
Move: assert!(signer::address_of(account) == @admin)
Rust: #[access_control(ctx.accounts.authority)]
Go: if !isAuthorized(ctx) { return ErrUnauthorized }
Q3.2: If deposit() checks X, does withdraw() also check X?
→ Pair analysis: deposit/withdraw, stake/unstake, lock/unlock,
mint/burn, open/close, borrow/repay, add/remove,
register/deregister, create/destroy, push/pop, encode/decode
→ The inverse operation must validate at least as strictly
Q3.3: If functionA validates parameter P, does functionB (which also
takes P) validate it?
→ Same parameter, different validation = one of them is wrong
Q3.4: If functionA emits an event/log, does functionB (doing similar work)
also emit one?
→ Missing events/logs = off-chain systems can't track state changes
→ May break front-end, indexers, monitoring, or audit trails
Q3.5: If functionA uses overflow-safe arithmetic, does functionB?
→ Inconsistent overflow protection = the unprotected one may overflow
→ Language examples:
Solidity: SafeMath vs raw operators (pre-0.8)
Rust: checked_add vs wrapping_add vs raw +
Move: built-in abort on overflow (but not underflow in all cases)
Go: no built-in overflow protection — must check manually
C++: signed overflow is UB, unsigned wraps silently
Expose hidden assumptions:
Q4.1: What does this function assume about THE CALLER?
→ Who can call this? Is that enforced or just assumed?
→ Could the caller be a different type than expected?
Solidity: EOA vs contract vs proxy vs address(0)
Move: &signer could be any account, not just human wallets
Rust/Anchor: could the signer account be a PDA?
Go: could the HTTP caller be unauthenticated / spoofed?
C++: could this be called from a different thread?
→ What if the caller IS the system itself? (self-calls, recursion)
Q4.2: What does this function assume about EXTERNAL DATA it receives?
→ For tokens/coins: standard behavior? Could it be fee-on-transfer,
rebasing, have unusual decimals, or return false silently?
→ For API responses: always well-formed? What if malformed, empty,
or adversarially crafted?
→ For user input: sanitized? What about injection, encoding tricks,
or type confusion?
→ For deserialized data: trusted format? What if the schema changed
or the data was tampered with?
Q4.3: What does this function assume about the current state?
→ "This will never be called when paused/locked" — but IS it enforced?
→ "Balance will always be sufficient" — but who guarantees that?
→ "This map/vector will never be empty" — but what if it is?
→ "This was already initialized" — but what if it wasn't?
Q4.4: What does this function assume about TIME or ORDERING?
→ Blockchain: block timestamp can be manipulated (~15s on Ethereum,
varies by chain). Move: epoch-based timing. Solana: slot-based.
→ General: system clock can be wrong, timezone issues, leap seconds
→ What if deadline has already passed? What if time = 0?
→ What if events arrive out of order? (network, async, concurrent)
Q4.5: What does this function assume about PRICES, RATES, or EXTERNAL VALUES?
→ Can the value be manipulated within the same transaction/call?
→ Is the data source fresh? What if the oracle/API is stale or dead?
→ What if the value is 0? What if it's MAX_VALUE for the type?
→ What if precision differs between source and consumer?
Q4.6: What does this function assume about INPUT AMOUNTS or SIZES?
→ What if amount/size = 0? What if it's the maximum representable value?
→ What if amount = 1 (dust / minimum unit)?
→ What if amount exceeds what's available?
→ What if a collection is empty? What if it has millions of entries?
Q5.1: What happens on the FIRST call to this function? (Empty state)
→ First depositor, first user, first initialization
→ Division by zero when total = 0?
→ Share/ratio inflation when pool/collection is empty?
→ Uninitialized state treated as valid?
Q5.2: What happens on the LAST call? (Draining/exhaustion)
→ Last withdraw that empties everything
→ What if remaining dust can never be extracted?
→ Does rounding trap value permanently?
→ What if the last element removal breaks an invariant?
Q5.3: What if this function is called TWICE in rapid succession?
→ Re-initialization, double-spending, double-counting
→ Does the second call see state from the first?
→ In concurrent systems: race condition between the two calls?
→ Blockchain: two calls in the same block/transaction
Q5.4: What if two DIFFERENT functions are called in the same context?
→ Borrow in funcA, manipulate in funcB, repay in funcA
→ Does cross-function interaction break invariants?
→ What about callback patterns where control flow is non-linear?
Q5.5: What if this function is called with THE SYSTEM ITSELF as a parameter?
→ Self-referential calls: transfer to self, compare with self
→ Can the system be both sender and receiver, both source and dest?
→ What about circular references or recursive structures?
Q6.1: What does this function return? Who consumes the return value?
→ If the caller ignores the return value, what's lost?
→ If the return value is wrong, what downstream logic breaks?
→ Language-specific: Does the language even FORCE you to check?
Rust: Result must be used. Go: error can be silently ignored with _.
Solidity: low-level call returns bool that's often unchecked.
C++: [[nodiscard]] is opt-in. Move: values must be consumed.
Q6.2: What happens on the ERROR/ABORT path?
→ Are there side effects before the error?
→ Does the error message leak sensitive information?
→ Can an attacker cause targeted errors (griefing / DoS)?
→ In languages with exceptions: is cleanup code (finally/defer/
Drop) correct? Are resources leaked on the error path?
Q6.3: What if an EXTERNAL CALL in this function fails silently?
→ Does the language/runtime guarantee failure propagation?
→ Is the error checked, or can it be swallowed?
→ Language examples:
Solidity: low-level call returns (bool, bytes) — often unchecked
Go: err is a normal return value — easy to ignore with _
Rust: .unwrap() can panic; ? propagates but hides the error
C++: exception might be caught too broadly
Move: abort is always propagated (safer by design)
Q6.4: Is there a code path where NO return and NO error happens?
→ Functions falling through without explicit return
→ Default/zero values used when they shouldn't be
→ Missing match/switch arms or else branches
→ Language-specific:
Rust: compiler catches this. Go/C++: does not always.
Solidity: functions can fall through returning zero values.
This category catches bugs that live in the TIMING and SEQUENCING of operations — both within a single transaction and across multiple transactions over time.
Q7.1: If the function performs an external call BEFORE a state update,
what happens if I SWAP them — state update first, external call second?
→ If the swap causes a revert: the ORIGINAL ordering may be exploitable
(the external call might re-enter or manipulate state before it's updated)
→ If the swap works cleanly: the original ordering is likely safe,
OR the swap reveals the intended safe ordering was never enforced
→ KEY: Try both directions. The one that reverts tells you which
ordering the code DEPENDS on. The one that doesn't revert tells you
which ordering an attacker can exploit.
Q7.2: If the function performs an external call AFTER a state update,
what happens if I SWAP them — external call first, state update second?
→ If the swap causes a revert: the current code is CORRECTLY ordered
(state must be updated before the external call can proceed)
→ If the swap works cleanly: the ordering doesn't matter, OR
the external call could be exploited before state is finalized
→ FINDING: If moving the external call BEFORE the state update
allows an attacker to observe/act on stale state, this is a bug.
Q7.3: For EVERY external call in the function, ask:
"What can the CALLEE do with the current state at THIS exact moment?"
→ At the point of the external call, what state is committed vs pending?
→ Can the callee re-enter this contract/module and see inconsistent state?
→ Can the callee call a DIFFERENT function that reads the not-yet-updated state?
→ This applies beyond reentrancy: callbacks, hooks, oracle calls,
cross-contract reads — ANY outbound call is an opportunity for
the callee to act on intermediate state.
→ Language examples:
Solidity: .call(), .transfer(), IERC20.safeTransfer(), callback hooks
Move: cross-module function calls during resource manipulation
Rust/Anchor: CPI (Cross-Program Invocation) in Solana
Go: outbound HTTP/RPC calls, goroutine spawning mid-operation
C++: virtual method calls, callback invocations, signal handlers
Q7.4: What is the MINIMAL set of state that MUST be updated before each
external call to prevent exploitation?
→ List every state variable the external callee could read or depend on
→ If ANY of those variables are updated AFTER the external call,
flag it as a potential ordering vulnerability
→ The fix is often: move the state update above the external call
(checks-effects-interactions pattern generalized to any language)
Q7.5: If a user calls this function with value X, and then calls it AGAIN
later with value Y — does the second call behave correctly given the
state changes from the first call?
→ The first call changes state. Does the second call's logic ACCOUNT
for that changed state, or does it assume fresh/initial state?
→ Example: deposit(100), then deposit(50). Does the second deposit
correctly handle shares/accounting when totalSupply is no longer 0?
→ Example: borrow(1000), then borrow(500). Does the second borrow
check against the UPDATED debt, or does it re-read stale collateral?
Q7.6: After transaction T1 changes state, does transaction T2 (same function,
different parameters) REVERT when it shouldn't, or SUCCEED when it shouldn't?
→ Unexpected revert: T1's state change made a condition impossible for T2
(e.g., T1 drains a pool below a minimum, T2 can't withdraw dust)
→ Unexpected success: T1's state change should have blocked T2 but didn't
(e.g., T1 uses all collateral, T2 still borrows against phantom collateral)
→ DEEP CHECK: Don't just test T2 immediately after T1. Test T2 after:
- Many T1s have accumulated (state drift over time)
- T1 with extreme values (max, min, dust)
- T1 from a different user (cross-user state pollution)
- T1 that was partially reverted (try-catch leaving dirty state)
Q7.7: Does the accumulated state from MULTIPLE calls create a condition that
a SINGLE call can never reach?
→ Rounding errors that compound: each call loses 1 wei of precision,
after 1000 calls the accounting is off by 1000 wei
→ Monotonically growing state: counters, nonces, array lengths that
grow but never shrink — do they hit a ceiling or overflow?
→ Reward/rate staleness: if updateReward() is called infrequently,
do accumulated rewards become incorrect?
→ State fragmentation: many small operations leaving dust/remnants
that block future operations (e.g., can't close position because
of 1 wei of remaining debt)
WORKED EXAMPLE — Partial Swap Fee Distribution Bug:
─────────────────────────────────────────────────────
Consider an AMM pool with a swap() function that:
1. Calculates amountOut based on reserves
2. Updates accumulatedFees (used for LP fee distribution)
3. Updates reserves
Scenario — partial swap with wrong fee accounting:
State: reserveA=10000, reserveB=10000, accFees=0, totalLP=100
TX1: Alice swaps 1000 tokenA → tokenB (partial fill, 0.3% fee)
- fee = 3 tokenA → accFees updated to 3
- BUT: the fee is added to accFees BEFORE reserves update
- reserveA becomes 11000, reserveB becomes ~9091
- feePerLP = 3/100 = 0.03 per LP token ✓ (looks correct)
TX2: Bob swaps 500 tokenA → tokenB (different amount, same function)
- fee = 1.5 tokenA → accFees updated to 4.5
- BUT: feePerLP is now calculated as 4.5/100 = 0.045
- The problem: the fee rate was computed using STALE reserve
ratios from before TX1 changed the pool composition
- After TX1, the pool is imbalanced — 1 tokenA is worth less
than before. But the fee accounting still values TX2's fee
at the OLD rate.
TX3: Charlie claims LP fees
- Gets paid based on accFees = 4.5 at OLD token valuation
- But the pool's ACTUAL composition has shifted — the fees
are denominated in a token that's now worth less in the pool
- Result: fee distribution is skewed. Early LPs get overpaid,
late LPs get underpaid. Over hundreds of swaps, the
accounting diverges significantly from reality.
The root cause: accFees is updated per-swap without rebasing
against the current reserve ratio. Each swap changes what "1 unit
of fee" is worth, but the accumulator treats all units as equal.
→ GENERALIZE THIS PATTERN to any system where:
- A global accumulator (fees, rewards, interest) is updated per-tx
- The VALUE of what's being accumulated changes between txs
- The accumulator doesn't rebase/normalize against current state
- Examples: LP fee distributors, staking reward accumulators,
interest rate models, rebasing token accounting, yield vaults
with variable share prices
→ CHECK SPECIFICALLY:
- Is the fee/reward denominated in a token whose relative value
changes with each operation?
- Does the accumulator use a snapshot of rates/prices that goes
stale after the state-changing operation?
- Are fees calculated BEFORE or AFTER the reserves/balances update?
(before = stale rate, after = correct rate, but BOTH must be checked)
- When multiple fee tiers or partial fills exist, does each partial
chunk use the UPDATED state from the previous chunk, or do they
all use the ORIGINAL state? (batch vs iterative accounting)
- After N swaps with varying sizes, does SUM(individual fees) equal
the fee you'd compute on the AGGREGATE swap? If not, the
accumulator is path-dependent and exploitable.
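The last check above can be run mechanically. Below is a minimal, hypothetical constant-product pool with truncating integer math (all names and numbers are illustrative); splitting one swap into two chunks produces a different fee total and different end reserves than the aggregate swap, demonstrating path dependence:

```python
FEE_BPS = 30  # 0.3% fee

class Pool:
    """Hypothetical constant-product AMM with truncating integer math."""
    def __init__(self, reserve_a, reserve_b):
        self.reserve_a = reserve_a
        self.reserve_b = reserve_b
        self.acc_fees = 0  # naive accumulator: raw tokenA units, never rebased

    def swap_a_for_b(self, amount_in):
        fee = amount_in * FEE_BPS // 10_000        # truncating division
        self.acc_fees += fee
        effective_in = amount_in - fee
        new_reserve_a = self.reserve_a + effective_in
        # constant-product invariant, floor-rounded
        new_reserve_b = self.reserve_a * self.reserve_b // new_reserve_a
        amount_out = self.reserve_b - new_reserve_b
        self.reserve_a = new_reserve_a + fee       # fee stays in the pool
        self.reserve_b = new_reserve_b
        return amount_out

agg = Pool(10_000, 10_000)
agg.swap_a_for_b(1_998)                 # one aggregate swap

split = Pool(10_000, 10_000)
split.swap_a_for_b(999)                 # same total volume, two chunks
split.swap_a_for_b(999)

print(agg.acc_fees, split.acc_fees)     # 5 vs 4: fee accounting is path-dependent
print(agg.reserve_b, split.reserve_b)   # 8338 vs 8337: end states diverge too
```

When SUM(chunk fees) != fee(aggregate), an attacker who controls chunking controls the accounting error, and can choose whichever path favors them.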
Q7.8: Can an attacker craft a SEQUENCE of transactions to reach a state
that no single "normal" transaction path would produce?
→ Deposit-borrow-withdraw-liquidate sequences that leave bad debt
→ Stake-unstake-restake sequences that compound rounding errors
→ Create-transfer-destroy sequences that orphan child state
→ The attacker's advantage: they CHOOSE the order, amounts, and
timing. Test adversarial sequences, not just happy-path sequences.
→ For each function, ask: "After calling THIS, what state is the
system in? What functions become newly available or newly dangerous
to call from that state?"

The bugs are in the answers to these 4 questions.
Ask them FIRST — they tell you WHERE to spend your time.
Q0.1: What's the WORST thing an attacker can do here?
→ Think attacker, NOT user. Users follow happy paths.
Attackers find the one path the dev never imagined.
→ List the top 3-5 catastrophic outcomes:
drain all funds, brick the system, steal admin privileges,
manipulate prices/data, grief other users permanently,
corrupt state irreversibly, exfiltrate sensitive data.
→ These become your ATTACK GOALS for the entire audit.
Every function you read, ask: "Does this help an
attacker achieve any of these goals?"
Q0.2: What parts of the project are NOVEL?
→ First-time code = first-time bugs. Period.
→ Identify code that is NOT a fork/copy of battle-tested
libraries or frameworks.
Solidity: OpenZeppelin, Uniswap, Aave forks
Move: Aptos Framework, Sui Framework stdlib
Rust: well-known crates (tokio, serde, anchor)
Go: standard library, well-maintained packages
C++: STL, Boost, established frameworks
→ Custom math, custom state machines, novel incentive
structures, unusual callback/hook patterns —
THIS is where your time pays off most.
→ Standard library imports are unlikely to have bugs.
The glue code connecting them is where things break.
Q0.3: Where does VALUE actually sit?
→ Follow the money. Every expensive mistake involves
value moving somewhere it shouldn't.
→ Map every module/component that holds:
- Funds (native tokens, coins, balances, account credits)
- Assets (tokens, NFTs, resources, inventory)
- Sensitive data (keys, credentials, PII)
- Accounting state (shares, debt, rewards, balances)
→ For each value store, ask: "What code path moves
value OUT? What authorizes it? What validates the amount?"
→ The functions touching these stores get 10x more scrutiny.
Q0.4: What's the most COMPLEX interaction path?
→ Complexity kills. The most complex path through the
system is the most likely to contain bugs.
→ Map paths that: cross multiple modules/contracts/services,
involve callbacks or hooks, mix user input with external data,
have multiple branching conditions, or chain state changes.
→ If a path touches 4+ modules or has 3+ external calls,
it's a prime candidate for state inconsistency bugs.
→ Cross-module interaction + value movement = audit gold.
Output of Phase 0: A prioritized hit list.
┌─────────────────────────────────────────────────────┐
│ PHASE 0 — ATTACKER'S HIT LIST │
├─────────────────────────────────────────────────────┤
│ │
│ LANGUAGE: [detected language/framework] │
│ │
│ ATTACK GOALS (from Q0.1): │
│ 1. [worst outcome] │
│ 2. [second worst] │
│ 3. [third worst] │
│ │
│ NOVEL CODE — highest bug density (from Q0.2): │
│ - [module/file] — [why it's novel] │
│ - [module/file] — [why it's novel] │
│ │
│ VALUE STORES — follow the money (from Q0.3): │
│ - [module] holds [asset] — [outflow functions] │
│ - [module] holds [asset] — [outflow functions] │
│ │
│ COMPLEX PATHS — complexity kills (from Q0.4): │
│ - [path description] — [modules involved] │
│ - [path description] — [modules involved] │
│ │
│ PRIORITY ORDER (spend time here first): │
│ 1. [highest priority target + why] │
│ 2. [second priority target + why] │
│ 3. [third priority target + why] │
│ │
└─────────────────────────────────────────────────────┘
Functions and modules that appear in MULTIPLE answers above get audited FIRST and with the DEEPEST scrutiny in Phase 2. Everything else is secondary.
1. Identify ALL modules/contracts/packages in scope
2. For each module, list:
- ALL entry points (public/exported/external functions — the attack surface)
- ALL state they read/write (storage, globals, struct fields, DB)
- ALL access guards applied (modifiers, auth checks, visibility)
- ALL internal functions they call
3. Build a FUNCTION-STATE MATRIX:
| Function | Reads | Writes | Guards | Calls |
|----------|-------|--------|--------|-------|
This matrix is your map for consistency analysis (Category 3)
For EACH function, perform the Feynman interrogation:
┌─────────────────────────────────────────────────────┐
│ FUNCTION: [module.functionName] │
│ Visibility: [public/private/internal/exported] │
│ Guards: [access control, auth checks, decorators] │
│ State reads: [variables/fields/storage] │
│ State writes: [variables/fields/storage] │
│ External calls: [targets] │
├─────────────────────────────────────────────────────┤
│ │
│ LINE-BY-LINE INTERROGATION: │
│ │
│ L[N]: [code line] │
│ Q1.1 → WHY: [explanation or "CANNOT EXPLAIN" flag] │
│ Q2.1 → ORDER: [what if moved up?] │
│ Q2.2 → ORDER: [what if moved down?] │
│ Q4.x → ASSUMES: [hidden assumption found] │
│ Q5.x → EDGE: [boundary behavior] │
│ → VERDICT: SOUND | SUSPECT | VULNERABLE │
│ → If SUSPECT/VULNERABLE: [specific scenario] │
│ │
│ CROSS-FUNCTION CHECK: │
│ Q3.1 → [guard consistency with sibling functions] │
│ Q3.2 → [inverse operation parity] │
│ Q3.3 → [parameter validation consistency] │
│ │
│ FUNCTION VERDICT: SOUND | HAS_CONCERNS | VULNERABLE │
└─────────────────────────────────────────────────────┘
IMPORTANT: You do NOT need to ask ALL questions for ALL lines. Use judgment.
Using the Function-State Matrix from Phase 1:
1. GUARD CONSISTENCY
- Group functions by the state variables they WRITE
- Within each group, list all access guards
- FLAG: Any function missing a guard its siblings have
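The guard-consistency step can be sketched as a few lines of tooling. The function names, state variables, and guard names below are hypothetical; the point is the grouping-and-diffing logic:

```python
from collections import defaultdict

# Toy function-state matrix: name -> (state written, guards applied).
# All entries are illustrative, not from any real codebase.
functions = {
    "set_fee":      ({"fee_bps"},  {"only_admin"}),
    "set_fee_fast": ({"fee_bps"},  set()),          # same state, no guard!
    "deposit":      ({"balances"}, {"nonreentrant"}),
    "withdraw":     ({"balances"}, {"nonreentrant"}),
}

# Group functions by the state variables they write.
by_state = defaultdict(list)
for name, (writes, guards) in functions.items():
    for var in writes:
        by_state[var].append((name, guards))

# Within each group, flag any function missing a guard its siblings have.
flags = []
for var, funcs in by_state.items():
    union = set().union(*(g for _, g in funcs))     # all guards seen on this state
    for name, guards in funcs:
        missing = union - guards
        if missing:
            flags.append((name, var, sorted(missing)))

print(flags)  # → [('set_fee_fast', 'fee_bps', ['only_admin'])]
```

A flag is not automatically a bug, but it is exactly the asymmetry the Feynman question targets: if you cannot explain why the sibling lacks the guard, investigate.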
2. INVERSE OPERATION PARITY
- Pair up: deposit/withdraw, mint/burn, stake/unstake,
create/destroy, add/remove, open/close, encode/decode, etc.
- For each pair, compare:
- Parameter validation (Q3.2)
- State changes (are they truly inverse?)
- Access control (should both require same auth?)
- Event/log emission (are both tracked?)
3. STATE TRANSITION INTEGRITY
- Map all valid state transitions
- For each transition, verify:
- Can it be triggered out of expected order?
- Can it be skipped entirely?
- Can it be triggered by an unauthorized actor?
- What if it's triggered when the system is in an unexpected state?
4. VALUE FLOW TRACKING
- Trace value/asset flows across function boundaries
- Verify: value in == value out (conservation)
- FLAG: Any path where value can be created or destroyed unexpectedly
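The conservation check in step 4 can be sketched the same way. The ledger, function names, and amounts below are hypothetical; a credit-without-debit bug shows up immediately as a conservation violation:

```python
# Toy ledger: replay an operation sequence and check value in == value out.
ledger = {"alice": 100, "bob": 50, "vault": 50}

def transfer(frm, to, amount):
    assert ledger[frm] >= amount
    ledger[frm] -= amount
    ledger[to] += amount

def pay_reward(to, amount):        # BUG: credits without debiting any source
    ledger[to] += amount

total_before = sum(ledger.values())
transfer("alice", "vault", 40)
pay_reward("bob", 10)              # value created out of thin air
total_after = sum(ledger.values())

print(total_after - total_before)  # → 10: conservation violated, FLAG this path
```

In real systems the "source" may be a legitimate mint or fee pot; the point is that every nonzero delta must be explainable, or the path gets flagged.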
For each SUSPECT or VULNERABLE verdict:
1. Write the QUESTION that exposed it
2. Describe the SCENARIO (step-by-step)
3. Show the AFFECTED CODE (exact lines)
4. Explain WHY the current code fails
5. Assess IMPACT (what can an attacker gain/break?)
6. Classify severity: CRITICAL / HIGH / MEDIUM / LOW
7. Suggest a FIX (minimal, targeted)
Save raw (unverified) findings to: .audit/findings/feynman-analysis-raw.md
IMPORTANT: Do NOT report raw findings to the user as final results. These are HYPOTHESES that must be verified in Phase 5 before inclusion in the final report.
Every CRITICAL, HIGH, and MEDIUM finding from Phase 4 MUST be verified before being included in the final report. Feynman reasoning surfaces many hypotheses, but code-level reasoning alone produces false positives (wrong mechanism assumed, mitigating code missed, incorrect severity assessment). Verification eliminates these before they reach the user.
VERIFICATION RULE: No C/H/M finding goes into the final report unverified.
Raw findings are HYPOTHESES. Verified findings are RESULTS.
Method A: Deep Code Trace Verification. Use for findings about missing checks, wrong parameters, or inconsistent validation.
Method B: PoC Test Verification. Use for findings about math errors, rounding drift, resource limits, or state accounting. Run the PoC with the project's native test runner:
- Solidity: `forge test --match-path "test/audit/[file]" -vvv`
- Move: `aptos move test --filter [test_name]` or `sui move test`
- Rust: `cargo test [test_name] -- --nocapture`
- Go: `go test -run TestName -v`
Method C: Hybrid (Code Trace + PoC). Use for complex findings spanning multiple modules.
| Severity | Verification Required | Method |
|---|---|---|
| CRITICAL | MANDATORY — PoC required (Method B or C) | Must demonstrate value loss or permanent DoS with concrete numbers |
| HIGH | MANDATORY — Code trace + PoC recommended (Method A or C) | Must confirm the broken invariant is reachable |
| MEDIUM | MANDATORY — Code trace minimum (Method A) | Must confirm the mechanism is correct and not mitigated elsewhere |
| LOW | Optional — Code inspection sufficient | Quick sanity check: is the line/function real? |
[] 1. Does the cited code actually exist at the stated line numbers?
[] 2. Is the described mechanism correct? (trace the actual math/logic)
[] 3. Are there mitigating factors the finding missed?
- Called functions that add validation
- Access guards on calling functions
- Upstream checks that prevent the scenario
- Downstream checks that catch the error
- Language-level safety (borrow checker, type system, Move verifier)
[] 4. Is the severity accurate given the ACTUAL impact?
- Does "value loss" actually mean "revert/abort with confusing error"?
- Does "permanent DoS" actually mean "self-griefing only"?
- Is the "missing check" actually handled by a different code path?
[] 5. For PoC-verified findings: does the test output match the claim?
These patterns frequently produce hypotheses that fail verification:
"Missing authorization" that exists in a different layer: Finding says auth is missing, but the caller/router/middleware already enforces it before this function is reachable.
"Rounding drift" that's cleaned by downstream code: Finding identifies scale_up(scale_down(x)) < x but misses cleanup applied upstream that ensures x is always a clean multiple.
"No validation" that errors downstream: Finding says a parameter isn't validated, but the called function has its own validation that catches invalid inputs (just with a confusing error message).
"Unbounded loop" bounded by design or economics: Finding says a loop has no cap, but the data structure is bounded by design, or the economic cost of creating the DoS condition exceeds the benefit.
"Severity inflation": Finding claims CRITICAL (value loss) but actual impact is MEDIUM (error/DoS) because a safety check catches the issue before value is affected.
"Language safety ignored": Finding claims overflow/underflow but the language aborts on overflow by default (Move, Rust in debug, Solidity >=0.8). Or finding claims memory unsafety in a memory-safe language.
After verification, produce the VERIFIED findings file:
Save to: .audit/findings/feynman-verified.md
# Feynman Audit — Verified Findings
## Verification Summary
| ID | Original Severity | Verdict | Final Severity |
|----|-------------------|---------|----------------|
| FF-001 | CRITICAL | TRUE POSITIVE — DOWNGRADE | LOW |
| FF-002 | HIGH | TRUE POSITIVE | HIGH |
| FF-003 | MEDIUM | FALSE POSITIVE | — |
| ... | ... | ... | ... |
## Verified TRUE POSITIVE Findings
[Only findings that passed verification, with final severity]
## False Positives Eliminated
[Findings that failed verification, with explanation of why]
## Downgraded Findings
[Findings where severity was reduced, with justification]
Only the verified findings file should be presented to the user as the final report.
| Severity | Criteria |
|---|---|
| CRITICAL | Direct value/fund loss, permanent DoS, or system insolvency |
| HIGH | Conditional value loss, privilege escalation, or broken core invariant |
| MEDIUM | Value leakage, griefing with cost, or degraded functionality |
| LOW | Informational, inefficiency, or cosmetic inconsistency with no exploit |
Two files are produced during the audit:
Save to: .audit/findings/feynman-analysis-raw.md
This contains ALL hypotheses from Phases 1-4 before verification. Include the Function-State Matrix, Guard Consistency Analysis, Inverse Operation Parity, and all raw findings with their initial severity classification.
Save to: .audit/findings/feynman-verified.md
# Feynman Audit — Verified Findings
## Scope
- Language: [detected language]
- Modules analyzed: [list]
- Functions analyzed: [count]
- Lines interrogated: [count]
## Verification Summary
| ID | Original Severity | Verdict | Final Severity |
|----|-------------------|---------|----------------|
## Function-State Matrix
[The matrix from Phase 1]
## Guard Consistency Analysis
[Results from Phase 3.1 — which functions are missing expected guards]
## Inverse Operation Parity
[Results from Phase 3.2 — asymmetries between paired operations]
## Verified Findings (TRUE POSITIVES only)
### Finding FF-001: [Title]
**Severity:** CRITICAL | HIGH | MEDIUM | LOW
**Module:** [name]
**Function:** [name]
**Lines:** [L:start-end]
**Verification:** [Code trace / PoC / Hybrid] — [test file if PoC]
**Feynman Question that exposed this:**
> [The exact question from the framework]
**The code:**
```[language]
// [affected code block]
```
**Why this is wrong:** [First-principles explanation — no jargon, no pattern names. Explain like you're teaching someone who has never seen this bug class.]
**Verification evidence:** [For code trace: the exact mitigating/confirming code paths traced] [For PoC: test name, key log output, concrete numbers]
**Attack scenario:**
**Impact:** [What an attacker gains or what breaks]
**Suggested fix:**
```[language]
// [minimal fix]
```
## False Positives Eliminated
[Findings that failed verification, with explanation of WHY they are false]
## Downgraded Findings
[Findings where severity was reduced, with justification]
## Low Findings
[Table of LOW findings with brief verdict]
## Statistics
Total functions analyzed: [N]
Raw findings (pre-verification): [N] CRITICAL | [N] HIGH | [N] MEDIUM | [N] LOW
After verification: [N] TRUE POSITIVE | [N] FALSE POSITIVE | [N] DOWNGRADED
Final: [N] CRITICAL | [N] HIGH | [N] MEDIUM | [N] LOW
| Scenario | Action |
|---|---|
| Need deeper context on a function | Re-read the function and its callers line-by-line |
| Finding confirmed as true positive | Write up with severity, trigger sequence, PoC, and fix |
| Need exploit validation | Write a Foundry/Hardhat PoC test to confirm |
| Uncertain about design intent | Check NatSpec, comments, and project documentation |
NEVER:
ALWAYS:
Read the actual code before questioning it
Verify your assumptions by reading called functions
Check constructors, initializers, and default values
Confirm guard/access-control behavior by reading the actual implementation
Show exact file paths and line numbers for all references
Use the correct language terminology (not Solidity terms for Rust code)
When starting a Feynman audit:
- Raw hypotheses: `.audit/findings/feynman-analysis-raw.md`
- Verified report: `.audit/findings/feynman-verified.md`