```bash
npx skills add https://github.com/semgrep/skills --skill semgrep
```

Fast, pattern-based static analysis for security scanning and custom rule creation.
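If Semgrep MCP tools are available in your environment, prefer them for scanning:

- `semgrep_scan`: Scan code files for security vulnerabilities using built-in rulesets. Pass absolute file paths and an optional config (e.g., `p/security-audit`, `auto`).
- `semgrep_scan_with_custom_rule`: Scan code with a custom YAML rule you've written. Pass the code content inline along with the rule.
- `semgrep_findings`: Fetch existing findings from the Semgrep AppSec Platform for a repository.
- `semgrep_rule_schema`: Get the full schema for writing Semgrep rules.
- `get_supported_languages`: List all languages Semgrep supports.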

When MCP tools aren't available, fall back to the CLI commands below.
Ideal scenarios:
```bash
# pip (recommended)
python3 -m pip install semgrep

# Homebrew
brew install semgrep

# Docker
docker run --rm -v "${PWD}:/src" semgrep/semgrep semgrep --config auto /src
```
```bash
semgrep --config auto .                                     # Auto-detect rules
semgrep --config p/<RULESET> .                              # Single ruleset
semgrep --config p/security-audit --config p/trailofbits .  # Multiple rulesets
```
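When scans are launched from a script, the repeated `--config` flags can be assembled programmatically. The following is a minimal sketch, not part of Semgrep itself: `build_scan_command` is a hypothetical helper name, and it only uses flags that appear in this document (`--config`, `--json`, `-o`).

```python
import shlex


def build_scan_command(rulesets, target=".", json_output=None):
    """Return an argv list for a Semgrep scan (hypothetical helper)."""
    if not rulesets:
        raise ValueError("at least one ruleset is required")
    argv = ["semgrep"]
    for ruleset in rulesets:
        # Each ruleset gets its own --config flag.
        argv += ["--config", ruleset]
    if json_output is not None:
        # Optional JSON report written to the given file.
        argv += ["--json", "-o", json_output]
    argv.append(target)
    return argv


command = build_scan_command(["p/security-audit", "p/trailofbits"])
print(shlex.join(command))
```

The resulting list can be passed to `subprocess.run(command, check=False)`; keeping it as a list avoids shell quoting issues.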
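| Ruleset | Description |
|---|---|
| `p/default` | General security and code quality |
| `p/security-audit` | Comprehensive security rules |
| `p/owasp-top-ten` | OWASP Top 10 vulnerabilities |
| `p/cwe-top-25` | CWE Top 25 vulnerabilities |
| `p/trailofbits` | Trail of Bits security rules |
| `p/python` | Python-specific |
| `p/javascript` | JavaScript-specific |
| `p/golang` | Go-specific |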
```bash
semgrep --config p/security-audit --sarif -o results.sarif .  # SARIF
semgrep --config p/security-audit --json -o results.json .    # JSON
```

```bash
semgrep --config p/python app.py                 # Single file
semgrep --config p/javascript src/               # Directory
semgrep --config auto --include='**/test/**' .   # Scan only paths matching the pattern (here, test directories)
```
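A JSON report such as the `results.json` written by the `--json -o` command can be summarized with a few lines of Python. This is a sketch under the assumption that the report has a top-level `results` list whose entries carry a `check_id` field; the sample data and the `summarize_findings` name are made up for illustration.

```python
import json
from collections import Counter


def summarize_findings(report):
    """Count findings per rule ID in a parsed Semgrep JSON report."""
    counts = Counter()
    for result in report.get("results", []):
        # Fall back to a placeholder if a result has no rule ID.
        counts[result.get("check_id", "<unknown>")] += 1
    return counts


# Hypothetical sample shaped like a Semgrep JSON report (assumed fields only).
sample_report = json.loads("""
{
  "results": [
    {"check_id": "hardcoded-password", "path": "app.py", "start": {"line": 3}},
    {"check_id": "command-injection", "path": "app.py", "start": {"line": 9}},
    {"check_id": "command-injection", "path": "cli.py", "start": {"line": 4}}
  ],
  "errors": []
}
""")

summary = summarize_findings(sample_report)
for rule_id, count in summary.most_common():
    print(f"{rule_id}: {count}")
```

For a real report, replace the sample with `json.load(open("results.json"))`.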
Paths to exclude from scans, for example in a `.semgrepignore` file:

```
tests/fixtures/
**/testdata/
generated/
vendor/
node_modules/
```
```python
password = get_from_vault()  # nosemgrep: hardcoded-password
dangerous_but_safe()  # nosemgrep
```
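Because inline suppressions hide findings, it can help to audit them periodically. The sketch below lists every `nosemgrep` comment in a piece of Python source and reports whether it names specific rules or suppresses everything on the line. The regular expression and the `find_suppressions` helper are illustrative assumptions and only handle `#`-style comments.

```python
import re

# Matches "# nosemgrep" optionally followed by ": rule-a, rule-b".
NOSEMGREP = re.compile(r"#\s*nosemgrep(?::\s*(?P<rules>[\w.\-, ]+))?")


def find_suppressions(source):
    """Return (line_number, rule_ids) pairs; rule_ids is [] for a blanket suppression."""
    found = []
    for number, line in enumerate(source.splitlines(), start=1):
        match = NOSEMGREP.search(line)
        if match:
            rules = match.group("rules")
            ids = [r.strip() for r in rules.split(",") if r.strip()] if rules else []
            found.append((number, ids))
    return found


sample = '''password = get_from_vault()  # nosemgrep: hardcoded-password
dangerous_but_safe()  # nosemgrep
'''

suppressions = find_suppressions(sample)
for number, ids in suppressions:
    print(number, ids or "ALL RULES")
```

Blanket suppressions (the second line of the sample) are the ones most worth reviewing, since they silence every rule on that line.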
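| Approach | Use When |
|---|---|
| Taint mode | Data flows from an untrusted source to a dangerous sink (injection vulnerabilities) |
| Pattern matching | Syntactic patterns without data flow requirements (deprecated APIs, hardcoded values) |

Prioritize taint mode for injection vulnerabilities. Pattern matching alone can't distinguish between `eval(user_input)` (vulnerable) and `eval("safe_literal")` (safe).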
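The limitation can be illustrated without Semgrep at all. The toy sketch below uses Python's standard `ast` module: a purely syntactic match on calls to `eval` flags both examples, while telling them apart requires looking at where the argument comes from. The literal-argument check here is only a crude stand-in for real taint tracking, and all names are illustrative.

```python
import ast

SOURCE = '''
eval(user_input)
eval("safe_literal")
'''


def eval_calls(source):
    """Yield every call to the bare name `eval`, whatever its arguments are."""
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id == "eval"
        ):
            yield node


def has_only_literal_args(call):
    """True when every positional argument is a constant such as a string literal."""
    return all(isinstance(arg, ast.Constant) for arg in call.args)


calls = sorted(eval_calls(SOURCE), key=lambda call: call.lineno)
syntactic_hits = [call.lineno for call in calls]
non_literal_hits = [call.lineno for call in calls if not has_only_literal_args(call)]

print("syntactic match:", syntactic_hits)
print("non-literal argument:", non_literal_hits)
```

The syntactic matcher reports both lines; only the second check narrows the result to the call that receives a variable.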
```yaml
rules:
  - id: hardcoded-password
    languages: [python]
    message: "Hardcoded password detected: $PASSWORD"
    severity: ERROR
    pattern: password = "$PASSWORD"
```
```yaml
rules:
  - id: command-injection
    languages: [python]
    message: User input flows to command execution
    severity: ERROR
    mode: taint
    pattern-sources:
      - pattern: request.args.get(...)
      - pattern: request.form[...]
    pattern-sinks:
      - pattern: os.system(...)
      - pattern: subprocess.call($CMD, shell=True, ...)
    pattern-sanitizers:
      - pattern: shlex.quote(...)
```
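A matching test file for the `command-injection` rule might look like the sketch below. The function names and the `ping` command are hypothetical, and the annotations state what the rule is expected to report under the sources, sinks and sanitizers listed in it, not output that has been verified here. The trailing demo uses a fake request object and a mocked `os.system`, so no command is actually executed.

```python
import os
import shlex
from types import SimpleNamespace
from unittest import mock


def ping_vulnerable(request):
    host = request.args.get("host")
    # ruleid: command-injection
    os.system("ping -c 1 " + host)


def ping_sanitized(request):
    host = request.args.get("host")
    # ok: command-injection
    os.system("ping -c 1 " + shlex.quote(host))


# Demo only: show the command strings each function would run.
fake_request = SimpleNamespace(args={"host": "example.com; rm -rf /tmp/x"})
with mock.patch("os.system") as fake_system:
    ping_vulnerable(fake_request)
    ping_sanitized(fake_request)

commands = [call.args[0] for call in fake_system.call_args_list]
for command in commands:
    print(command)
```

In the first command the injected `; rm -rf /tmp/x` would be interpreted by the shell; in the second, `shlex.quote` wraps the whole value in single quotes so it is passed as one argument.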
| Syntax | Description | Example |
|---|---|---|
| `...` | Match anything | `func(...)` |
| `$VAR` | Capture metavariable | `$FUNC($INPUT)` |
| `<... ...>` | Deep expression match | `<... user_input ...>` |

| Operator | Description |
|---|---|
| `pattern` | Match exact pattern |
| `patterns` | All must match (AND) |
| `pattern-either` | Any matches (OR) |
| `pattern-not` | Exclude matches |
| `pattern-inside` | Match only inside context |
| `pattern-not-inside` | Match only outside context |
| `metavariable-regex` | Regex on captured value |
Test-first is mandatory. Create test files with annotations:

```python
# test_rule.py
def test_vulnerable():
    user_input = request.args.get("id")
    # ruleid: my-rule-id
    cursor.execute("SELECT * FROM users WHERE id = " + user_input)

def test_safe():
    user_input = request.args.get("id")
    # ok: my-rule-id
    cursor.execute("SELECT * FROM users WHERE id = ?", (user_input,))
```

Run tests:

```bash
semgrep --test --config rule.yaml test-file
```
| Task | Command |
|---|---|
| Run tests | `semgrep --test --config rule.yaml test-file` |
| Validate YAML | `semgrep --validate --config rule.yaml` |
| Dump AST | `semgrep --dump-ast -l <lang> <file>` |
| Debug taint flow | `semgrep --dataflow-traces -f rule.yaml file` |
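Write the `ruleid:` and `ok:` annotations before the rule, and run `semgrep --dump-ast` to understand the code structure.

Output structure:

```
<rule-id>/
├── <rule-id>.yaml     # Semgrep rule
└── <rule-id>.<ext>    # Test file
```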
Too broad:

```yaml
# BAD: Matches any function call
pattern: $FUNC(...)

# GOOD: Specific dangerous function
pattern: eval(...)
```
Missing safe cases:

```python
# BAD: Only tests the vulnerable case
# ruleid: my-rule
dangerous(user_input)

# GOOD: Include safe cases
# ruleid: my-rule
dangerous(user_input)
# ok: my-rule
dangerous(sanitize(user_input))
```
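The "missing safe cases" mistake is easy to check mechanically before running the rule tests. The following sketch counts `ruleid:` and `ok:` annotations for a given rule ID and reports whether both kinds are present. The helper names are hypothetical, and the regular expression only recognizes `#`-style comments.

```python
import re

# Matches "# ruleid: <id>" and "# ok: <id>" annotation comments.
ANNOTATION = re.compile(r"#\s*(?P<kind>ruleid|ok):\s*(?P<rule>[\w.\-]+)")


def annotation_counts(test_source, rule_id):
    """Count ruleid/ok annotations that refer to rule_id."""
    counts = {"ruleid": 0, "ok": 0}
    for match in ANNOTATION.finditer(test_source):
        if match.group("rule") == rule_id:
            counts[match.group("kind")] += 1
    return counts


def has_safe_and_vulnerable_cases(test_source, rule_id):
    """True when the test file has at least one ruleid and one ok annotation."""
    counts = annotation_counts(test_source, rule_id)
    return counts["ruleid"] > 0 and counts["ok"] > 0


bad_test = "# ruleid: my-rule\ndangerous(user_input)\n"
good_test = bad_test + "# ok: my-rule\ndangerous(sanitize(user_input))\n"

print(has_safe_and_vulnerable_cases(bad_test, "my-rule"))
print(has_safe_and_vulnerable_cases(good_test, "my-rule"))
```

A check like this could run as a pre-commit step so that rule test files without any `ok:` case are caught early.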
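| Shortcut | Why It's Wrong |
|---|---|
| "Semgrep found nothing, code is clean" | Semgrep is pattern-based and cannot track complex data flow across functions |
| "The pattern looks complete" | Untested rules have hidden false positives and false negatives |
| "It matches the vulnerable case" | Matching vulnerabilities is half the job; verify that safe cases don't match |
| "Taint mode is overkill" | For injection vulnerabilities, taint mode gives better precision |
| "One test case is enough" | Include edge cases: different coding styles, sanitized inputs, safe alternatives |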
```yaml
name: Semgrep

on:
  push:
    branches: [main]
  pull_request:
  schedule:
    - cron: '0 0 1 * *'  # Monthly, on the first day of the month

jobs:
  semgrep:
    runs-on: ubuntu-latest
    container:
      image: semgrep/semgrep
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0  # Full history so the baseline commit is available
      - name: Run Semgrep
        run: |
          if [ "${{ github.event_name }}" = "pull_request" ]; then
            semgrep ci --baseline-commit ${{ github.event.pull_request.base.sha }}
          else
            semgrep ci
          fi
        env:
          SEMGREP_RULES: >-
            p/security-audit
            p/owasp-top-ten
            p/trailofbits
```