npx skills add https://github.com/trailofbits/skills --skill harness-writing
A fuzzing harness is the entrypoint function that receives random data from the fuzzer and routes it to your system under test (SUT). The quality of your harness directly determines which code paths get exercised and whether critical bugs are found. A poorly written harness can miss entire subsystems or produce non-reproducible crashes.
The harness is the bridge between the fuzzer's random byte generation and your application's API. It must parse raw bytes into meaningful inputs, call target functions, and handle edge cases gracefully. The most important part of any fuzzing setup is the harness—if written poorly, critical parts of your application may not be covered.
| Concept | Description |
|---|---|
| Harness | Function that receives fuzzer input and calls target code under test |
| SUT | System Under Test—the code being fuzzed |
| Entry point | Function signature required by the fuzzer (e.g., LLVMFuzzerTestOneInput) |
| FuzzedDataProvider | Helper class for structured extraction of typed data from raw bytes |
| Determinism | Property that ensures same input always produces same behavior |
| Interleaved fuzzing | Single harness that exercises multiple operations based on input |
Apply this technique when:
Skip this technique when:
| Task | Pattern |
|---|---|
| Minimal C++ harness | `extern "C" int LLVMFuzzerTestOneInput(const uint8_t* data, size_t size)` |
| Minimal Rust harness | `fuzz_target!(\|data: &[u8]\| { ... });` |
| Size validation | `if (size < MIN_SIZE) return 0;` |
| Cast to integers | `uint32_t val = *(uint32_t*)(data);` |
| Use FuzzedDataProvider | `FuzzedDataProvider fuzzed_data(data, size);` |
| Extract typed data (C++) | `auto val = fuzzed_data.ConsumeIntegral<uint32_t>();` |
| Extract string (C++) | `auto str = fuzzed_data.ConsumeBytesWithTerminator<char>(32, 0xFF);` |
Find functions in your codebase that:
Good targets are typically:
Start with the simplest possible harness that calls your target function:
C/C++:
extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    target_function(data, size);
    return 0;
}
Rust:
#![no_main]
use libfuzzer_sys::fuzz_target;

fuzz_target!(|data: &[u8]| {
    target_function(data);
});
Reject inputs that are too small or too large to be meaningful:
extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    // Skip inputs outside the size range the target can meaningfully handle
    if (size < MIN_INPUT_SIZE || size > MAX_INPUT_SIZE) {
        return 0;
    }
    target_function(data, size);
    return 0;
}
Rationale: The fuzzer generates random inputs of all sizes. Your harness must handle empty, tiny, huge, or malformed inputs without causing unexpected issues in the harness itself (crashes in the SUT are fine—that's what we're looking for).
For APIs that require typed data (integers, strings, etc.), use casting or helpers like FuzzedDataProvider:
Simple casting:
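#include <cstddef>
#include <cstdint>
#include <cstring>

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    // Require exactly two 4-byte values
    if (size != 2 * sizeof(uint32_t)) {
        return 0;
    }
    // memcpy avoids the unaligned, type-punned read of a direct pointer cast
    uint32_t numerator, denominator;
    memcpy(&numerator, data, sizeof(numerator));
    memcpy(&denominator, data + sizeof(numerator), sizeof(denominator));
    divide(numerator, denominator);
    return 0;
}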
Using FuzzedDataProvider:
#include <cstdlib>
#include "FuzzedDataProvider.h"

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    FuzzedDataProvider fuzzed_data(data, size);
    size_t allocation_size = fuzzed_data.ConsumeIntegral<size_t>();
    // Each vector ends with the 0xFF terminator appended by the provider
    std::vector<char> str1 = fuzzed_data.ConsumeBytesWithTerminator<char>(32, 0xFF);
    std::vector<char> str2 = fuzzed_data.ConsumeBytesWithTerminator<char>(32, 0xFF);
    char *result = concat(str1.data(), str1.size(), str2.data(), str2.size(), allocation_size);
    // Release the result so the harness itself does not leak
    free(result);
    return 0;
}
Run the fuzzer and monitor:
Iterate on the harness to improve these metrics.
Use Case: When target expects primitive types like integers or floats
Implementation:
#include <cstddef>
#include <cstdint>
#include <cstring>

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    // Ensure exactly 2 4-byte numbers
    if (size != 2 * sizeof(uint32_t)) {
        return 0;
    }
    // Split input into two integers; memcpy avoids unaligned, type-punned reads
    uint32_t numerator, denominator;
    memcpy(&numerator, data, sizeof(numerator));
    memcpy(&denominator, data + sizeof(numerator), sizeof(denominator));
    divide(numerator, denominator);
    return 0;
}
Rust equivalent:
fuzz_target!(|data: &[u8]| {
    if data.len() != 2 * std::mem::size_of::<i32>() {
        return;
    }
    let numerator = i32::from_ne_bytes([data[0], data[1], data[2], data[3]]);
    let denominator = i32::from_ne_bytes([data[4], data[5], data[6], data[7]]);
    divide(numerator, denominator);
});
Why it works: Any 8-byte input is valid. The fuzzer learns that inputs must be exactly 8 bytes, and every bit flip produces a new, potentially interesting input.
Use Case: When target requires multiple strings, integers, or variable-length data
Implementation:
#include <cstdlib>
#include "FuzzedDataProvider.h"

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    FuzzedDataProvider fuzzed_data(data, size);
    // Extract different types of data
    size_t allocation_size = fuzzed_data.ConsumeIntegral<size_t>();
    // Consume variable-length strings; each vector ends with the 0xFF terminator
    std::vector<char> str1 = fuzzed_data.ConsumeBytesWithTerminator<char>(32, 0xFF);
    std::vector<char> str2 = fuzzed_data.ConsumeBytesWithTerminator<char>(32, 0xFF);
    char *result = concat(str1.data(), str1.size(), str2.data(), str2.size(), allocation_size);
    if (result != NULL) {
        free(result);
    }
    return 0;
}
Why it helps: FuzzedDataProvider handles the complexity of extracting structured data from a byte stream. It's particularly useful for APIs that need multiple parameters of different types.
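To make the idea concrete, here is a minimal sketch of what such a helper does internally. `ByteReader` is a hypothetical, simplified stand-in written for illustration. It is not the real `FuzzedDataProvider` implementation, and its methods only loosely mirror that API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <type_traits>

// Hypothetical, simplified stand-in for FuzzedDataProvider (illustration only).
class ByteReader {
public:
    ByteReader(const uint8_t *data, size_t size) : data_(data), size_(size) {}

    // Reads up to sizeof(T) bytes as an integer; missing bytes stay zero.
    template <typename T>
    T ConsumeIntegral() {
        static_assert(std::is_integral<T>::value, "integral type required");
        T value = 0;
        size_t n = size_ < sizeof(T) ? size_ : sizeof(T);
        if (n > 0) std::memcpy(&value, data_, n);
        Advance(n);
        return value;
    }

    // Reads up to max_len bytes as a string; never reads past the input.
    std::string ConsumeString(size_t max_len) {
        size_t n = size_ < max_len ? size_ : max_len;
        if (n == 0) return std::string();
        std::string out(reinterpret_cast<const char *>(data_), n);
        Advance(n);
        return out;
    }

    size_t remaining() const { return size_; }

private:
    void Advance(size_t n) {
        data_ += n;
        size_ -= n;
    }

    const uint8_t *data_;
    size_t size_;
};
```

The key property is that every read is bounded by the remaining input, so any byte sequence, including an empty one, yields some valid set of typed values.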
Use Case: When multiple related operations should be tested in a single harness
Implementation:
#include <cstddef>
#include <cstdint>
#include <cstring>

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    if (size < 1 + 2 * sizeof(int32_t)) {
        return 0;
    }
    // First byte selects operation
    uint8_t mode = data[0];
    // Next bytes are operands
    int32_t numbers[2];
    memcpy(numbers, data + 1, 2 * sizeof(int32_t));
    int32_t result = 0;
    switch (mode % 4) {
        case 0:
            result = add(numbers[0], numbers[1]);
            break;
        case 1:
            result = subtract(numbers[0], numbers[1]);
            break;
        case 2:
            result = multiply(numbers[0], numbers[1]);
            break;
        case 3:
            result = divide(numbers[0], numbers[1]);
            break;
    }
    // Store to a volatile so the compiler keeps the calls; this is cheaper
    // than printing on every iteration
    volatile int32_t sink = result;
    (void)sink;
    return 0;
}
Advantages:
When to use:
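The same idea extends to stateful APIs, where the input drives a whole sequence of operations rather than a single call. The sketch below is an assumption-laden illustration: `BoundedStack` is a hypothetical stand-in for a real SUT, and the harness checks it against a simple `std::vector` model after every step.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical SUT: a fixed-capacity stack (stands in for your real API).
class BoundedStack {
public:
    static constexpr size_t kCapacity = 16;

    bool Push(uint8_t v) {
        if (size_ == kCapacity) return false;
        items_[size_++] = v;
        return true;
    }

    bool Pop(uint8_t *out) {
        if (size_ == 0) return false;
        *out = items_[--size_];
        return true;
    }

    size_t size() const { return size_; }

private:
    uint8_t items_[kCapacity] = {};
    size_t size_ = 0;
};

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    BoundedStack stack;          // fresh state every iteration
    std::vector<uint8_t> model;  // reference model to compare against

    // Each pair of bytes is one step: (operation selector, operand).
    for (size_t i = 0; i + 1 < size; i += 2) {
        uint8_t op = data[i] % 2;
        uint8_t arg = data[i + 1];
        if (op == 0) {
            bool expect_ok = model.size() < BoundedStack::kCapacity;
            bool ok = stack.Push(arg);
            assert(ok == expect_ok);
            if (ok) model.push_back(arg);
        } else {
            uint8_t got = 0;
            bool ok = stack.Pop(&got);
            assert(ok == !model.empty());
            if (ok) {
                assert(got == model.back());
                model.pop_back();
            }
        }
        assert(stack.size() == model.size());
    }
    return 0;
}
```

Because the stack and the model are both created inside the function, each iteration stays independent, and any assertion failure can be reproduced from its input alone.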
Use Case: When fuzzing Rust code that uses custom structs
Implementation:
use arbitrary::Arbitrary;
use std::process;

#[derive(Debug, Arbitrary)]
pub struct Name {
    data: String,
}

impl Name {
    pub fn check_buf(&self) {
        let data = self.data.as_bytes();
        // Nested checks give the fuzzer one coverage step per matched byte
        if data.len() > 0 && data[0] == b'a' {
            if data.len() > 1 && data[1] == b'b' {
                if data.len() > 2 && data[2] == b'c' {
                    process::abort();
                }
            }
        }
    }
}
Harness with arbitrary:
#![no_main]
use libfuzzer_sys::fuzz_target;

fuzz_target!(|data: your_project::Name| {
    data.check_buf();
});
Add to Cargo.toml:
[dependencies]
arbitrary = { version = "1", features = ["derive"] }
Why it helps: The arbitrary crate automatically handles deserialization of raw bytes into your Rust structs, reducing boilerplate and ensuring valid struct construction.
Limitation: The arbitrary crate doesn't offer reverse serialization, so you can't manually construct byte arrays that map to specific structs. This works best when starting from an empty corpus (fine for libFuzzer, problematic for AFL++).
| Tip | Why It Helps |
|---|---|
| Start with parsers | High bug density, clear entry points, easy to harness |
| Mock I/O operations | Prevents hangs from blocking I/O, enables determinism |
| Use FuzzedDataProvider | Simplifies extraction of structured data from raw bytes |
| Reset global state | Ensures each iteration is independent and reproducible |
| Free resources in harness | Prevents memory exhaustion during long campaigns |
| Avoid logging in harness | Logging is slow—fuzzing needs 100s-1000s exec/sec |
| Test harness manually first | Run harness with known inputs before starting campaign |
| Check coverage early | Ensure harness reaches expected code paths |
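As one way to act on the "test harness manually first" tip, the sketch below replays a single file through the harness without any fuzzer attached. `ReplayFile` is a hypothetical helper name, and the harness body is a placeholder for your real entry point.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Placeholder harness; in a real project this is your existing fuzz entry point.
extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    // Hypothetical target logic: touch every byte of the input.
    volatile uint8_t sink = 0;
    for (size_t i = 0; i < size; ++i) sink = sink ^ data[i];
    return 0;
}

// Reads one file and feeds its bytes to the harness, like one fuzzer iteration.
// Returns false if the file cannot be opened.
bool ReplayFile(const std::string &path) {
    std::ifstream in(path, std::ios::binary);
    if (!in) return false;
    std::vector<uint8_t> bytes((std::istreambuf_iterator<char>(in)),
                               std::istreambuf_iterator<char>());
    LLVMFuzzerTestOneInput(bytes.data(), bytes.size());
    return true;
}
```

A binary built with `-fsanitize=fuzzer` can also replay inputs directly when given file paths instead of a corpus directory, so a helper like this is mainly useful in unit tests or in builds without the fuzzing runtime.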
For highly structured input formats, consider using Protocol Buffers as an intermediate format with custom mutators:
// Define your input format in .proto file
// Use libprotobuf-mutator to generate valid mutations
// This ensures fuzzer mutates message contents, not the protobuf encoding itself
This approach requires more setup but prevents the fuzzer from wasting time on unparseable inputs. See structure-aware fuzzing documentation for details.
Problem: Random values or timing dependencies cause non-reproducible crashes.
Solutions:
Replace rand() with deterministic PRNG seeded from fuzzer input:
uint32_t seed = fuzzed_data.ConsumeIntegral<uint32_t>();
srand(seed);
Mock system calls that return time, PIDs, or random data
Avoid reading from /dev/random or /dev/urandom
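The first solution can be sketched as follows, assuming the SUT can accept its random engine as a parameter. `noisy_checksum` and `RunWithSeededRng` are hypothetical names invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <random>

// Hypothetical SUT that needs randomness; taking the engine as a parameter
// lets the harness control the seed.
uint32_t noisy_checksum(const uint8_t *data, size_t size, std::mt19937 &rng) {
    uint32_t acc = 0;
    for (size_t i = 0; i < size; ++i) {
        acc += static_cast<uint32_t>(data[i] ^ static_cast<uint8_t>(rng()));
    }
    return acc;
}

// Seeds a fresh engine from the first four input bytes, then runs the SUT on
// the rest. The same input therefore always yields the same "random" stream.
uint32_t RunWithSeededRng(const uint8_t *data, size_t size) {
    if (size < sizeof(uint32_t)) return 0;
    uint32_t seed = 0;
    std::memcpy(&seed, data, sizeof(seed));
    std::mt19937 rng(seed);
    return noisy_checksum(data + sizeof(seed), size - sizeof(seed), rng);
}

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    RunWithSeededRng(data, size);
    return 0;
}
```

Creating the engine inside the function, rather than keeping one global engine, means a crashing input replays identically no matter how many iterations ran before it.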
If your SUT uses global state (singletons, static variables), reset it between iterations:
extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    // Reset global state before each iteration
    global_reset();
    target_function(data, size);
    // Clean up resources
    global_cleanup();
    return 0;
}
Rationale: Global state can cause crashes after N iterations rather than on a specific input, making bugs non-reproducible.
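A small, self-contained illustration of that effect, under the assumption of a SUT that keeps a hidden global cache. The cache, `RunIteration`, and the bodies of `global_reset` and `target_function` are all invented for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>

// Hypothetical SUT with hidden global state: a cache that grows across calls.
static std::map<uint8_t, int> g_seen_counts;

void global_reset() { g_seen_counts.clear(); }

// Returns how many distinct byte values the SUT has seen so far.
size_t target_function(const uint8_t *data, size_t size) {
    for (size_t i = 0; i < size; ++i) ++g_seen_counts[data[i]];
    return g_seen_counts.size();
}

// One harness iteration; `reset` exists only so the difference can be shown.
size_t RunIteration(const uint8_t *data, size_t size, bool reset) {
    if (reset) global_reset();
    return target_function(data, size);
}

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    RunIteration(data, size, /*reset=*/true);
    return 0;
}
```

Without the reset, the result for a given input depends on every input that came before it. With the reset, the same input always produces the same result.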
Follow these rules to ensure effective fuzzing harnesses:
| Rule | Rationale |
|---|---|
| Handle all input sizes | Fuzzer generates empty, tiny, and huge inputs; the harness must handle them gracefully |
| Never call `exit()` | Calling `exit()` stops the fuzzer process. Use `abort()` in SUT if needed |
| Join all threads | Each iteration must run to completion before the next iteration starts |
| Be fast | Aim for 100s-1000s executions/sec. Avoid logging, high complexity, excess memory |
| Maintain determinism | Same input must always produce same behavior for reproducibility |
| Avoid global state | Global state reduces reproducibility; reset between iterations if unavoidable |
| Use narrow targets | Don't fuzz PNG and TCP in the same harness; different formats need separate targets |
| Free resources | Prevent memory leaks that cause resource exhaustion during long campaigns |
Note: These guidelines apply not just to harness code, but to the entire SUT. If the SUT violates these rules, consider patching it (see the fuzzing obstacles technique).
| Anti-Pattern | Problem | Correct Approach |
|---|---|---|
| Global state without reset | Non-deterministic crashes | Reset all globals at start of harness |
| Blocking I/O or network calls | Hangs fuzzer, wastes time | Mock I/O, use in-memory buffers |
| Memory leaks in harness | Resource exhaustion kills campaign | Free all allocations before returning |
| Calling `exit()` in SUT | Stops entire fuzzing process | Use `abort()` or return error codes |
| Heavy logging in harness | Reduces exec/sec by orders of magnitude | Disable logging during fuzzing |
| Too many operations per iteration | Slows down fuzzer | Keep iterations fast and focused |
| Mixing unrelated input formats | Corpus entries not useful across formats | Separate harnesses for different formats |
| Not validating input size | Harness crashes on edge cases | Check `size` before accessing `data` |
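For the blocking I/O row, one possible way to use in-memory buffers on POSIX systems is `fmemopen`, which wraps a byte buffer in a `FILE *`. The sketch assumes a SUT that reads from a stream; `count_lines` and `CountLinesInBuffer` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>  // fmemopen is POSIX and declared in <stdio.h>

// Hypothetical SUT that normally reads from an open file: counts newlines.
int count_lines(FILE *in) {
    int lines = 0;
    for (int c = std::fgetc(in); c != EOF; c = std::fgetc(in)) {
        if (c == '\n') ++lines;
    }
    return lines;
}

// Runs the SUT on an in-memory buffer; returns -1 if no stream could be made.
int CountLinesInBuffer(const uint8_t *data, size_t size) {
    if (size == 0) return 0;  // some fmemopen implementations reject empty buffers
    // Mode "r" never writes to the buffer, so dropping const is safe here.
    FILE *in = fmemopen(const_cast<uint8_t *>(data), size, "r");
    if (in == nullptr) return -1;
    int lines = count_lines(in);
    std::fclose(in);
    return lines;
}

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    CountLinesInBuffer(data, size);
    return 0;
}
```

No file or socket is touched, so the iteration cannot block and the stream is closed before the function returns.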
Harness signature:
extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    // Your code here
    // Return 0 normally; -1 asks libFuzzer not to add the input to the corpus.
    // Other values are reserved for future use.
    return 0;
}
Compilation:
clang++ -fsanitize=fuzzer,address -g harness.cc -o fuzz_target
Integration tips:
- Use `FuzzedDataProvider.h` for structured input extraction
- Compile with `-fsanitize=fuzzer` to link the fuzzing runtime
- Add sanitizers (`-fsanitize=address,undefined`) to detect more bugs
- Use `-g` for better stack traces when crashes occur

Running:
./fuzz_target corpus_dir/
Resources:
AFL++ supports multiple harness styles. For best performance, use persistent mode:
Persistent mode harness:
#include <unistd.h>

// MAX_SIZE and target_function are expected to be defined by your project
int main(int argc, char **argv) {
#ifdef __AFL_HAVE_MANUAL_CONTROL
    __AFL_INIT();
#endif
    unsigned char buf[MAX_SIZE];
    while (__AFL_LOOP(10000)) {
        // Read input from stdin
        ssize_t len = read(0, buf, sizeof(buf));
        if (len <= 0) break;
        // Call target function
        target_function(buf, len);
    }
    return 0;
}
Compilation:
afl-clang-fast++ -g harness.cc -o fuzz_target
Integration tips:
- Use persistent mode (`__AFL_LOOP`) for 10-100x speedup
- Consider deferred initialization (`__AFL_INIT()`) to skip setup overhead
- Use `AFL_USE_ASAN=1` or `AFL_USE_UBSAN=1` for sanitizer builds

Running:
afl-fuzz -i seeds/ -o findings/ -- ./fuzz_target
Harness signature:
#![no_main]
use libfuzzer_sys::fuzz_target;

fuzz_target!(|data: &[u8]| {
    // Your code here
});
With structured input (arbitrary crate):
#![no_main]
use libfuzzer_sys::fuzz_target;

fuzz_target!(|data: YourStruct| {
    data.check();
});
Creating harness:
cargo fuzz init
cargo fuzz add my_target
Integration tips:
- Use the `arbitrary` crate for automatic struct deserialization
- Harnesses live in the `fuzz/fuzz_targets/` directory

Running:
cargo +nightly fuzz run my_target
Resources:
Harness signature:
//go:build gofuzz
// +build gofuzz

package mypackage

func Fuzz(data []byte) int {
	// Call target function
	target(data)
	// Return codes:
	//  -1 if the input must not be added to the corpus
	//   0 for the default behavior
	//   1 if the fuzzer should raise the priority of this input
	//     (e.g., it parsed successfully)
	return 0
}
Building:
go-fuzz-build
Integration tips:
Running:
go-fuzz -bin=./mypackage-fuzz.zip -workdir=fuzz
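| Issue | Cause | Solution |
|---|---|---|
| Low executions/sec | Harness is too slow (logging, I/O, complexity) | Profile harness, remove bottlenecks, mock I/O |
| No crashes found | Coverage not reaching buggy code | Check coverage, improve harness to reach more paths |
| Non-reproducible crashes | Non-determinism or global state | Remove randomness, reset globals between iterations |
| Fuzzer exits immediately | Harness calls `exit()` | Replace `exit()` with `abort()` or return error |
| Out of memory errors | Memory leaks in harness or SUT | Free allocations, use leak sanitizer to find leaks |
| Crashes on empty input | Harness doesn't validate size | Add `if (size < MIN_SIZE) return 0;` |
| Corpus not growing | Inputs too constrained or format too strict | Use FuzzedDataProvider or structure-aware fuzzing |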
| Skill | How It Applies |
|---|---|
| libfuzzer | Uses LLVMFuzzerTestOneInput harness signature with FuzzedDataProvider |
| aflpp | Supports persistent mode harnesses with __AFL_LOOP for performance |
| cargo-fuzz | Uses Rust-specific fuzz_target! macro with arbitrary crate integration |
| atheris | Python harness takes bytes, calls Python functions |
| ossfuzz | Requires harnesses in specific directory structure for cloud fuzzing |
| Skill | Relationship |
|---|---|
| coverage-analysis | Measure harness effectiveness—are you reaching target code? |
| address-sanitizer | Detects bugs found by harness (buffer overflows, use-after-free) |
| fuzzing-dictionary | Provide tokens to help fuzzer pass format checks in harness |
| fuzzing-obstacles | Patch SUT when it violates harness rules (exit, non-determinism) |
- Split Inputs in libFuzzer (Google Fuzzing Docs): Explains techniques for handling multiple input parameters in a single fuzzing harness, including use of magic separators and FuzzedDataProvider.
- Structure-Aware Fuzzing with Protocol Buffers: Advanced technique using protobuf as an intermediate format with custom mutators, so that the fuzzer mutates message contents rather than the format encoding.
- libFuzzer Documentation: Official LLVM documentation covering harness requirements, best practices, and advanced features.
- cargo-fuzz Book: Comprehensive guide to writing Rust fuzzing harnesses with cargo-fuzz and the arbitrary crate.
Weekly installs: 1.1K
GitHub stars: 3.9K
First seen: Jan 19, 2026
Security audits: Gen Agent Trust Hub (Pass), Socket (Pass), Snyk (Pass)
Installed on: claude-code (970), opencode (926), gemini-cli (906), codex (901), cursor (879), github-copilot (847)