npx skills add https://github.com/trailofbits/skills --skill codeql
Supported languages: Python, JavaScript/TypeScript, Go, Java/Kotlin, C/C++, C#, Ruby, Swift.
Skill resources: Reference files and templates are located at {baseDir}/references/ and {baseDir}/workflows/.
Database quality is non-negotiable. A database that builds is not automatically good. Always run quality assessment (file counts, baseline LoC, extractor errors) and compare against expected source files. A cached build produces zero useful extraction.
Data extensions catch what CodeQL misses. Even projects using standard frameworks (Django, Spring, Express) have custom wrappers around database calls, request parsing, or shell execution. Skipping the create-data-extensions workflow means missing vulnerabilities in project-specific code paths.
Explicit suite references prevent silent query dropping. Never pass pack names directly to codeql database analyze — each pack's defaultSuiteFile applies hidden filters that can produce zero results. Always generate a custom .qls suite file.
Zero findings need investigation, not celebration. Zero results can indicate poor database quality, missing models, wrong query packs, or silent suite filtering. Investigate before reporting a clean result.
macOS Apple Silicon requires workarounds for compiled languages. Exit code 137 indicates an arm64e/arm64 mismatch, not a build failure. Try Homebrew arm64 tools or Rosetta before falling back to build-mode=none.
Follow workflows step by step. Once a workflow is selected, execute it step by step without skipping phases. Each phase gates the next — skipping quality assessment or data extensions leads to incomplete analysis.
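The explicit-suite principle above can be sketched as follows. This is an illustration only, not the skill's own template (that lives in references/important-only-suite.md, which is not reproduced here). The pack name codeql/python-queries, the file name custom.qls, the filter values, and the fallback scratch directory are all assumptions.

```shell
# Sketch: write an explicit suite file instead of passing a pack name.
# Assumptions: the OUTPUT_DIR fallback, pack name, and filters are illustrative.
OUTPUT_DIR="${OUTPUT_DIR:-$(mktemp -d)}"
mkdir -p "$OUTPUT_DIR/raw"
SUITE="$OUTPUT_DIR/raw/custom.qls"

cat > "$SUITE" <<'EOF'
- description: Explicit suite (illustrative)
- queries: .
  from: codeql/python-queries
- include:
    kind:
      - problem
      - path-problem
    tags contain:
      - security
EOF

# Pass the suite file, not the pack name, to the analyzer.
# Printed rather than executed, because it needs a real database.
echo codeql database analyze "$OUTPUT_DIR/codeql.db" "$SUITE" \
  --format=sarif-latest --output="$OUTPUT_DIR/raw/results.sarif"
```

Because the suite states its own selection rules, nothing is filtered that is not visible in the file itself.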
All generated files (database, build logs, diagnostics, extensions, results) are stored in a single output directory.
- If the user specifies an output directory in their prompt, use it as OUTPUT_DIR.
- Otherwise, default to ./static_analysis_codeql_1. If that already exists, increment to _2, _3, etc.
In both cases, always create the directory with mkdir -p before writing any files.
# Resolve output directory
if [ -n "$USER_SPECIFIED_DIR" ]; then
  OUTPUT_DIR="$USER_SPECIFIED_DIR"
else
  BASE="static_analysis_codeql"
  N=1
  while [ -e "${BASE}_${N}" ]; do
    N=$((N + 1))
  done
  OUTPUT_DIR="${BASE}_${N}"
fi
mkdir -p "$OUTPUT_DIR"
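To see the increment rule in action without touching a real project, the same logic can be wrapped in a function and run in a scratch directory. The function name resolve_output_dir and the directory name custom are hypothetical, chosen only for this sketch.

```shell
# Hypothetical wrapper around the resolution logic shown above.
# Prints the resolved directory after creating it.
resolve_output_dir() {
  local user_dir="${1:-}" base="static_analysis_codeql" n=1
  if [ -n "$user_dir" ]; then
    mkdir -p "$user_dir" && printf '%s\n' "$user_dir"
    return
  fi
  while [ -e "${base}_${n}" ]; do n=$((n + 1)); done
  mkdir -p "${base}_${n}" && printf '%s\n' "${base}_${n}"
}

# Demo in a scratch directory so no project files are touched.
SCRATCH=$(mktemp -d)
cd "$SCRATCH" || exit 1
mkdir static_analysis_codeql_1 static_analysis_codeql_2   # simulate two earlier runs
FIRST=$(resolve_output_dir)            # next free slot
SECOND=$(resolve_output_dir "custom")  # a user-specified directory wins
echo "$FIRST $SECOND"
# prints: static_analysis_codeql_3 custom
```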
The output directory is resolved once at the start before any workflow executes. All workflows receive $OUTPUT_DIR and store their artifacts there:
$OUTPUT_DIR/
├── rulesets.txt # Selected query packs (logged after Step 3)
├── codeql.db/ # CodeQL database (dir containing codeql-database.yml)
├── build.log # Build log
├── codeql-config.yml # Exclusion config (interpreted languages)
├── diagnostics/ # Diagnostic queries and CSVs
├── extensions/ # Data extension YAMLs
├── raw/ # Unfiltered analysis output
│ ├── results.sarif
│ └── <mode>.qls
└── results/ # Final results (filtered for important-only, copied for run-all)
    └── results.sarif
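Since raw/results.sarif and results/results.sarif are the files a run is judged by, a quick result count makes the zero-findings check concrete. The sketch below assumes jq is installed and uses a hand-written SARIF stand-in. The helper name count_sarif_results is hypothetical, and the skill's own jq recipes live in references/sarif-processing.md.

```shell
# Hypothetical helper: number of results across all runs in a SARIF file.
count_sarif_results() {
  jq '[.runs[]?.results[]?] | length' "$1"
}

# Tiny hand-written SARIF stand-in with zero results.
SAMPLE=$(mktemp)
printf '%s\n' '{"version":"2.1.0","runs":[{"results":[]}]}' > "$SAMPLE"

if command -v jq >/dev/null 2>&1; then
  COUNT=$(count_sarif_results "$SAMPLE")
  if [ "$COUNT" -eq 0 ]; then
    echo "0 results: check database quality, models, packs, and suite filters before reporting clean"
  fi
fi
```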
A CodeQL database is identified by the presence of a codeql-database.yml marker file inside its directory. When searching for existing databases, always collect all matches — there may be multiple databases from previous runs or for different languages.
Discovery command:
# Find ALL CodeQL databases (top-level and one subdirectory deep)
find . -maxdepth 3 -name "codeql-database.yml" -not -path "*/\.*" 2>/dev/null \
| while read -r yml; do dirname "$yml"; done
- Inside $OUTPUT_DIR: find "$OUTPUT_DIR" -maxdepth 2 -name "codeql-database.yml"
- Across the project: find . -maxdepth 3 -name "codeql-database.yml". This covers databases at the project top level (./db-name/) and one subdirectory deep (./subdir/db-name/). It does not search deeper.
Never assume a database is named codeql.db. Discover it by its marker file.
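The depth and hidden-directory behaviour of the discovery command can be checked on a throwaway tree. The directory names below are made up for the demonstration, and the marker files are empty stand-ins rather than real databases.

```shell
# Build a scratch tree with marker files at different depths.
ROOT=$(mktemp -d)
for d in db-top sub/db-nested a/b/db-too-deep .cache/db-hidden; do
  mkdir -p "$ROOT/$d"
  : > "$ROOT/$d/codeql-database.yml"   # empty stand-in for the marker file
done

# Same find expression as the discovery command, run from the scratch root.
FOUND=$(cd "$ROOT" && find . -maxdepth 3 -name "codeql-database.yml" -not -path "*/\.*" 2>/dev/null \
  | while read -r yml; do dirname "$yml"; done | sort)
echo "$FOUND"
# prints:
# ./db-top
# ./sub/db-nested
```

The database three directories down and the one under a hidden directory are both skipped, which is why a database stored deeper than one subdirectory has to be named explicitly.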
When multiple databases are found:
For each discovered database, collect metadata to help the user choose:
# For each database, extract language and creation time
# FOUND_DBS is a bash array, so iterate over all of its elements
for db in "${FOUND_DBS[@]}"; do
  CODEQL_LANG=$(codeql resolve database --format=json -- "$db" 2>/dev/null | jq -r '.languages[0]')
  CREATED=$(grep -A5 '^creationMetadata:' "$db/codeql-database.yml" 2>/dev/null | grep 'creationTime' | awk '{print $2}')
  echo "$db (language: $CODEQL_LANG, created: $CREATED)"
done
Then use AskUserQuestion to let the user select which database to use, or to build a new one. Skip AskUserQuestion if the user explicitly stated in their prompt which database to use or that a new one should be built.
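The creation-time extraction can be tried on a hand-written marker file. The file contents below are illustrative, not output from a real database. Reading primaryLanguage straight from the YAML is an assumed fallback for when the codeql CLI is unavailable, not something the skill prescribes.

```shell
# Hand-written stand-in for a database marker file; all values are illustrative.
DB=$(mktemp -d)
cat > "$DB/codeql-database.yml" <<'EOF'
---
primaryLanguage: python
creationMetadata:
  cliVersion: 0.0.0
  creationTime: 2026-02-24T10:15:00Z
finalised: true
EOF

# grep/awk pipeline for the creation time, as in the metadata loop.
CREATED=$(grep -A5 '^creationMetadata:' "$DB/codeql-database.yml" | grep 'creationTime' | awk '{print $2}')
# Assumed fallback: take the language from the primaryLanguage key.
LANG_FALLBACK=$(awk '$1 == "primaryLanguage:" {print $2}' "$DB/codeql-database.yml")
echo "$DB (language: $LANG_FALLBACK, created: $CREATED)"
```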
For the common case ("scan this codebase for vulnerabilities"):
# 1. Verify CodeQL is installed
if ! command -v codeql >/dev/null 2>&1; then
  echo "NOT INSTALLED: codeql binary not found on PATH"
else
  codeql --version || echo "ERROR: codeql found but --version failed (check installation)"
fi
# 2. Resolve output directory
BASE="static_analysis_codeql"; N=1
while [ -e "${BASE}_${N}" ]; do N=$((N + 1)); done
OUTPUT_DIR="${BASE}_${N}"; mkdir -p "$OUTPUT_DIR"
Then execute the full pipeline: build database → create data extensions → run analysis using the workflows below.
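These shortcuts lead to missed findings. Do not accept them:
- Relying on security-extended alone. Additional query packs catch categories that security-extended misses entirely.
- Treating exit code 137 on macOS as a broken build. It is caused by an arm64e/arm64 mismatch, not a fundamental build failure. See macos-arm64e-workaround.md.
- Passing pack names directly to codeql database analyze. Each pack's defaultSuiteFile applies hidden filters and can produce zero results. Always use an explicit suite reference.
- Writing files outside $OUTPUT_DIR. Scattering files in the working directory makes cleanup impossible and risks overwriting previous runs.
This skill has three workflows. Once a workflow is selected, execute it step by step without skipping phases.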
| Workflow | Purpose |
|---|---|
| build-database | Create CodeQL database using build methods in sequence |
| create-data-extensions | Detect or generate data extension models for project APIs |
| run-analysis | Select rulesets, execute queries, process results |
If the user explicitly specifies what to do (e.g., "build a database", "run analysis on ./my-db"), execute that workflow directly. Do NOT call AskUserQuestion for database selection if the user's prompt already makes their intent clear, for example "build a new database", "analyze the codeql database in static_analysis_codeql_2", or "run a full scan from scratch".
Default pipeline for "test", "scan", "analyze", or similar: Discover existing databases first, then decide.
# Find ALL CodeQL databases by looking for codeql-database.yml marker file
# Search top-level dirs and one subdirectory deep
FOUND_DBS=()
while IFS= read -r yml; do
  db_dir=$(dirname "$yml")
  codeql resolve database -- "$db_dir" >/dev/null 2>&1 && FOUND_DBS+=("$db_dir")
done < <(find . -maxdepth 3 -name "codeql-database.yml" -not -path "*/\.*" 2>/dev/null)
echo "Found ${#FOUND_DBS[@]} existing database(s)"
| Condition | Action |
|---|---|
| No databases found | Resolve new $OUTPUT_DIR, execute build → extensions → analysis (full pipeline) |
| One database found | Use AskUserQuestion: reuse it or build new? |
| Multiple databases found | Use AskUserQuestion: list all with metadata, let user pick one or build new |
| User explicitly stated intent | Skip AskUserQuestion, act on their instructions directly |
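The decision table can be read as a small dispatch function. The function name decide_action and the action labels it prints are hypothetical. They only name the table's rows and are not identifiers used by the skill.

```shell
# Hypothetical helper: map discovery results to the next action.
# $1 = number of databases found, $2 = "yes" if the prompt states intent explicitly.
decide_action() {
  local count="$1" explicit="${2:-no}"
  if [ "$explicit" = "yes" ]; then
    echo "follow-user-instructions"
  elif [ "$count" -eq 0 ]; then
    echo "full-pipeline"
  elif [ "$count" -eq 1 ]; then
    echo "ask-reuse-or-build"
  else
    echo "ask-pick-or-build"
  fi
}

decide_action 0        # prints: full-pipeline
decide_action 1        # prints: ask-reuse-or-build
decide_action 3        # prints: ask-pick-or-build
decide_action 2 yes    # prints: follow-user-instructions
```

Explicit intent is checked first, so it overrides whatever discovery found.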
When existing databases are found and the user did not explicitly specify which to use, present the choice via AskUserQuestion:
header: "Existing CodeQL Databases"
question: "I found existing CodeQL database(s). What would you like to do?"
options:
  - label: "<db_path_1> (language: python, created: 2026-02-24)"
    description: "Reuse this database"
  - label: "<db_path_2> (language: cpp, created: 2026-02-23)"
    description: "Reuse this database"
  - label: "Build a new database"
    description: "Create a fresh database in a new output directory"
After selection:
- If the user picks an existing database: set $OUTPUT_DIR to its parent directory (or the directory containing it), set $DB_NAME to the selected path, then proceed to extensions → analysis.
- If the user chooses to build a new database: resolve a new $OUTPUT_DIR, execute build → extensions → analysis.
If the user's intent is ambiguous (neither database selection nor workflow is clear), ask:
I can help with CodeQL analysis. What would you like to do?
1. **Full scan (Recommended)** - Build database, create extensions, then run analysis
2. **Build database** - Create a new CodeQL database from this codebase
3. **Create data extensions** - Generate custom source/sink models for project APIs
4. **Run analysis** - Run security queries on existing database
[If databases found: "I found N existing database(s): <list paths with language>"]
[Show output directory: "Output will be stored in <OUTPUT_DIR>"]
| File | Content |
|---|---|
| Workflows | |
| workflows/build-database.md | Database creation with build method sequence |
| workflows/create-data-extensions.md | Data extension generation pipeline |
| workflows/run-analysis.md | Query execution and result processing |
| References | |
| references/macos-arm64e-workaround.md | Apple Silicon build tracing workarounds |
| references/build-fixes.md | Build failure fix catalog |
| references/quality-assessment.md | Database quality metrics and improvements |
| references/extension-yaml-format.md | Data extension YAML column definitions and examples |
| references/sarif-processing.md | jq commands for SARIF output processing |
| references/diagnostic-query-templates.md | QL queries for source/sink enumeration |
| references/important-only-suite.md | Important-only suite template and generation |
| references/run-all-suite.md | Run-all suite template |
| references/ruleset-catalog.md | Available query packs by language |
| references/threat-models.md | Threat model configuration |
| references/language-details.md | Language-specific build and extraction details |
| references/performance-tuning.md | Memory, threading, and timeout configuration |
A complete CodeQL analysis run should satisfy:
- All generated files are stored inside $OUTPUT_DIR
- Database built (discovered via the codeql-database.yml marker) with quality assessment passed (baseline LoC > 0, errors < 5%)
- Data extensions created in $OUTPUT_DIR/extensions/ or explicitly skipped with justification
- Selected query packs logged in $OUTPUT_DIR/rulesets.txt
- Unfiltered results in $OUTPUT_DIR/raw/results.sarif
- Final results in $OUTPUT_DIR/results/results.sarif (filtered for important-only, copied for run-all)
- Build log at $OUTPUT_DIR/build.log with all commands, fixes, and quality assessments
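The quality thresholds named in this document (baseline LoC > 0, errors < 5%) can be expressed as a small gate. This is a sketch under stated assumptions: the function name quality_gate is hypothetical, and "errors < 5%" is read here as files with extractor errors divided by files extracted. The authoritative metrics are in references/quality-assessment.md.

```shell
# Hypothetical gate for the quality thresholds.
# $1 = baseline lines of code, $2 = files extracted, $3 = files with extractor errors.
# Assumption: "errors < 5%" means error files / extracted files.
quality_gate() {
  local loc="$1" files="$2" errors="$3"
  if [ "$loc" -le 0 ] || [ "$files" -le 0 ]; then
    echo "FAIL: nothing extracted"
    return 1
  fi
  # Integer arithmetic: errors/files < 5/100  <=>  errors*100 < files*5
  if [ $((errors * 100)) -lt $((files * 5)) ]; then
    echo "PASS"
  else
    echo "FAIL: extractor error rate is 5% or higher"
    return 1
  fi
}

quality_gate 12000 400 3             # prints: PASS
quality_gate 0 0 0 || true           # prints: FAIL: nothing extracted
quality_gate 12000 400 20 || true    # 20/400 is exactly 5%, so this fails the gate
```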