npx skills add https://github.com/trailofbits/skills --skill semgrep
Run a Semgrep scan with automatic language detection, parallel execution via Task subagents, and merged SARIF output.
Always pass --metrics=off: Semgrep sends telemetry by default, and --config auto also phones home. Every semgrep command must include --metrics=off to prevent data leakage during security audits.
See also: the semgrep-rule-creator skill and the semgrep-rule-variant-creator skill.
All scan results, SARIF files, and temporary data are stored in a single output directory.
- If the user specifies an output directory, use it as OUTPUT_DIR.
- Otherwise, default to ./static_analysis_semgrep_1. If that already exists, increment to _2, _3, etc.
In both cases, always create the directory with mkdir -p before writing any files.
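# Resolve output directory
if [ -n "$USER_SPECIFIED_DIR" ]; then
  # Use the directory the user asked for
  OUTPUT_DIR="$USER_SPECIFIED_DIR"
else
  # Otherwise pick the first unused static_analysis_semgrep_N
  BASE="static_analysis_semgrep"
  N=1
  while [ -e "${BASE}_${N}" ]; do
    N=$((N + 1))
  done
  OUTPUT_DIR="${BASE}_${N}"
fi
# Create the directory tree before writing any files
mkdir -p "$OUTPUT_DIR/raw" "$OUTPUT_DIR/results"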
The output directory is resolved once at the start of Step 1 and used throughout all subsequent steps.
$OUTPUT_DIR/
├── rulesets.txt          # Approved rulesets (logged after Step 3)
├── raw/                  # Per-scan raw output (unfiltered)
│   ├── python-python.json
│   ├── python-python.sarif
│   ├── python-django.json
│   ├── python-django.sarif
│   └── ...
└── results/              # Final merged output
    └── results.sarif
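The file names under raw/ appear to follow a language-ruleset pattern (python-python, python-django). A minimal Python sketch of deriving those paths, assuming that pattern holds; the helper name raw_output_paths is hypothetical and not part of the skill:

```python
from pathlib import Path


def raw_output_paths(output_dir: Path, language: str, ruleset: str) -> dict[str, Path]:
    """Return the JSON and SARIF paths for one scan under raw/.

    Assumes the "<language>-<ruleset>" naming seen in the layout above.
    """
    stem = f"{language}-{ruleset}"
    raw_dir = output_dir / "raw"
    return {
        "json": raw_dir / f"{stem}.json",
        "sarif": raw_dir / f"{stem}.sarif",
    }
```

Computing both paths in one place keeps every scanner writing to the same directory, so the merge step only needs to look in raw/.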
Required: Semgrep CLI (semgrep --version). If not installed, see Semgrep installation docs.
Optional: Semgrep Pro — enables cross-file taint tracking, inter-procedural analysis, and additional languages (Apex, C#, Elixir). Check with:
semgrep --pro --validate --config p/default 2>/dev/null && echo "Pro available" || echo "OSS only"
Limitations: OSS mode cannot track data flow across files. Pro mode uses -j 1 for cross-file analysis (slower per ruleset, but parallel rulesets compensate).
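One way to make the --metrics=off requirement hard to forget is to build every command through a single helper. The sketch below is illustrative only: build_semgrep_cmd is a hypothetical name, and the config value and paths passed to it are placeholders chosen by the caller.

```python
def build_semgrep_cmd(config: str, target: str, sarif_out: str, pro: bool = False) -> list[str]:
    """Build a semgrep invocation that always disables telemetry."""
    cmd = [
        "semgrep", "scan",
        "--config", config,
        "--metrics=off",        # required on every command
        "--sarif",
        "--output", sarif_out,
    ]
    if pro:
        # Only add --pro when the availability check succeeded.
        cmd.append("--pro")
    cmd.append(target)          # the target should be an absolute path
    return cmd
```

Returning a list rather than a shell string means the command can be passed to subprocess.run without quoting issues.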
Select mode in Step 2 of the workflow. Mode affects both scanner flags and post-processing.
| Mode | Coverage | Findings Reported |
|---|---|---|
| Run all | All rulesets, all severity levels | Everything |
| Important only | All rulesets, pre- and post-filtered | Security vulns only, medium-high confidence/impact |
Important only applies two filter layers:
- Pre-filter: --severity MEDIUM --severity HIGH --severity CRITICAL (CLI flags)
- Post-filter: category=security, confidence∈{MEDIUM,HIGH}, impact∈{MEDIUM,HIGH} (rule metadata)
See scan-modes.md for metadata criteria and jq filter commands.
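The two filter layers can be sketched in Python as follows. This is not the jq filter from scan-modes.md; it assumes Semgrep's JSON output keeps rule metadata under results[].extra.metadata, and the function names are hypothetical.

```python
IMPORTANT_SEVERITIES = ("MEDIUM", "HIGH", "CRITICAL")
KEEP_LEVELS = {"MEDIUM", "HIGH"}


def severity_flags() -> list[str]:
    """Pre-filter: one --severity flag per level to keep."""
    flags: list[str] = []
    for level in IMPORTANT_SEVERITIES:
        flags += ["--severity", level]
    return flags


def is_important(finding: dict) -> bool:
    """Post-filter: security category with medium or high confidence and impact."""
    meta = finding.get("extra", {}).get("metadata", {})
    return (
        str(meta.get("category", "")).lower() == "security"
        and str(meta.get("confidence", "")).upper() in KEEP_LEVELS
        and str(meta.get("impact", "")).upper() in KEEP_LEVELS
    )


def post_filter(report: dict) -> list[dict]:
    """Return only the findings that pass the metadata criteria."""
    return [f for f in report.get("results", []) if is_important(f)]
```

Findings with missing metadata are dropped by this sketch, which is the conservative reading of "security vulns only"; the real filter may treat them differently.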
┌──────────────────────────────────────────────────────────────────┐
│ MAIN AGENT (this skill) │
│ Step 1: Detect languages + check Pro availability │
│ Step 2: Select scan mode + rulesets (ref: rulesets.md) │
│ Step 3: Present plan + rulesets, get approval [⛔ HARD GATE] │
│ Step 4: Spawn parallel scan Tasks (approved rulesets + mode) │
│ Step 5: Merge results and report │
└──────────────────────────────────────────────────────────────────┘
│ Step 4
▼
┌─────────────────┐
│ Scan Tasks │
│ (parallel) │
├─────────────────┤
│ Python scanner │
│ JS/TS scanner │
│ Go scanner │
│ Docker scanner │
└─────────────────┘
Follow the detailed workflow in scan-workflow.md. Summary:
| Step | Action | Gate | Key Reference |
|---|---|---|---|
| 1 | Resolve output dir, detect languages + Pro availability | — | Use Glob, not Bash |
| 2 | Select scan mode + rulesets | — | rulesets.md |
| 3 | Present plan, get explicit approval | ⛔ HARD | AskUserQuestion |
| 4 | Spawn parallel scan Tasks | — | scanner-task-prompt.md |
| 5 | Merge results and report | — | Merge script (below) |
Task enforcement: On invocation, create 5 tasks with blockedBy dependencies (each step is blocked by the previous one). Step 3 is a HARD GATE: mark it complete ONLY after the user explicitly approves.
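The dependency chain and the hard gate can be modeled in a few lines. The sketch below is a plain Python illustration of the idea, not the agent's actual task API; Task, make_workflow, can_start, and complete are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    step: int
    name: str
    blocked_by: list[int] = field(default_factory=list)
    done: bool = False


def make_workflow() -> dict[int, Task]:
    """Create the 5 tasks; each one is blocked by the step before it."""
    names = [
        "Resolve output dir, detect languages + Pro availability",
        "Select scan mode + rulesets",
        "Present plan, get explicit approval",
        "Spawn parallel scan Tasks",
        "Merge results and report",
    ]
    return {
        i: Task(i, name, [i - 1] if i > 1 else [])
        for i, name in enumerate(names, start=1)
    }


def can_start(tasks: dict[int, Task], step: int) -> bool:
    """A step may start only when everything blocking it is done."""
    return all(tasks[b].done for b in tasks[step].blocked_by)


def complete(tasks: dict[int, Task], step: int, user_approved: bool = False) -> None:
    """Mark a step done; step 3 additionally requires explicit approval."""
    if not can_start(tasks, step):
        raise RuntimeError(f"step {step} is still blocked")
    if step == 3 and not user_approved:
        raise PermissionError("step 3 is a hard gate: explicit approval required")
    tasks[step].done = True
```

Because step 4 is blocked by step 3, no scan can start until complete(tasks, 3, user_approved=True) has succeeded.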
Merge command (Step 5):
uv run {baseDir}/scripts/merge_sarif.py $OUTPUT_DIR/raw $OUTPUT_DIR/results/results.sarif
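The contents of merge_sarif.py are not shown here. As a rough illustration of what merging SARIF files involves, the sketch below concatenates the runs arrays of every .sarif file in a directory into one SARIF 2.1.0 document. The real script may deduplicate or normalize results; this function does neither.

```python
import json
from pathlib import Path


def merge_sarif(raw_dir: Path, out_file: Path) -> int:
    """Concatenate the runs of all *.sarif files in raw_dir into out_file.

    Returns the number of runs written.
    """
    runs: list[dict] = []
    for path in sorted(raw_dir.glob("*.sarif")):
        doc = json.loads(path.read_text(encoding="utf-8"))
        runs.extend(doc.get("runs", []))
    merged = {"version": "2.1.0", "runs": runs}
    out_file.parent.mkdir(parents=True, exist_ok=True)
    out_file.write_text(json.dumps(merged, indent=2), encoding="utf-8")
    return len(runs)
```

Sorting the input files keeps the merged output stable across runs, which makes diffs between scans easier to read.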
| Agent | Tools | Purpose |
|---|---|---|
| static-analysis:semgrep-scanner | Bash | Executes parallel semgrep scans for a language category |
Use subagent_type: static-analysis:semgrep-scanner in Step 4 when spawning Task subagents.
| Shortcut | Why It's Wrong |
|---|---|
| "User asked for scan, that's approval" | Original request ≠ plan approval. Present plan, use AskUserQuestion, await explicit "yes" |
| "Step 3 task is blocking, just mark complete" | Lying about task status defeats enforcement. Only mark complete after real approval |
| "I already know what they want" | Assumptions cause scanning wrong directories/rulesets. Present plan for verification |
| "Just use default rulesets" | User must see and approve exact rulesets before scan |
| "Add extra rulesets without asking" | Modifying approved list without consent breaks trust |
| "Third-party rulesets are optional" | Trail of Bits, 0xdea, Decurity catch vulnerabilities not in official registry — REQUIRED |
| "Use --config auto" | Sends metrics; less control over rulesets |
| "One Task at a time" | Defeats parallelism; spawn all Tasks together |
| "Pro is too slow, skip --pro" | Cross-file analysis catches 250% more true positives; worth the time |
| "Semgrep handles GitHub URLs natively" | URL handling fails on repos with non-standard YAML; always clone first |
| "Cleanup is optional" | Cloned repos pollute the user's workspace and accumulate across runs |
| "Use . or relative path as target" | Subagents need absolute paths to avoid ambiguity |
| "Let the user pick an output dir later" | Output directory must be resolved at Step 1, before any files are created |

| File | Content |
|---|---|
| rulesets.md | Complete ruleset catalog and selection algorithm |
| scan-modes.md | Pre/post-filter criteria and jq commands |
| scanner-task-prompt.md | Template for spawning scanner subagents |

| Workflow | Purpose |
| --- | --- |
| scan-workflow.md | Complete 5-step scan execution process |

- All files are stored in $OUTPUT_DIR
- Every semgrep command used --metrics=off
- Approved rulesets are logged to $OUTPUT_DIR/rulesets.txt
- Raw per-scan output is stored in $OUTPUT_DIR/raw/
- results.sarif exists in $OUTPUT_DIR/results/ and is valid JSON
- Unfiltered per-scan output is preserved in raw/
- Cloned repos are cleaned up from $OUTPUT_DIR/repos/
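Several of the completion checks (rulesets logged to rulesets.txt, a results.sarif that parses as JSON, cloned repos removed) can be verified mechanically. A minimal sketch, assuming the directory layout described earlier; check_output_dir and its messages are hypothetical:

```python
import json
from pathlib import Path


def check_output_dir(output_dir: Path) -> list[str]:
    """Return a list of problems found; an empty list means all checks passed."""
    problems: list[str] = []
    if not (output_dir / "rulesets.txt").is_file():
        problems.append("missing rulesets.txt")
    sarif = output_dir / "results" / "results.sarif"
    try:
        json.loads(sarif.read_text(encoding="utf-8"))
    except FileNotFoundError:
        problems.append("missing results/results.sarif")
    except json.JSONDecodeError:
        problems.append("results/results.sarif is not valid JSON")
    if (output_dir / "repos").exists():
        problems.append("repos/ still present (cloned repos not cleaned up)")
    return problems
```

This only covers the checks that can be read off the file system; whether every command used --metrics=off has to be confirmed from the commands themselves.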