semgrep-rule-creator by trailofbits/skills
npx skills add https://github.com/trailofbits/skills --skill semgrep-rule-creator
Create production-quality Semgrep rules with proper testing and validation.
Ideal scenarios:
Do NOT use this skill for:
static-analysis skill)
When writing Semgrep rules, reject these common shortcuts:
semgrep --test --config <rule-id>.yaml <rule-id>.<ext> to verify. Untested rules have hidden false positives/negatives.
Too broad - matches everything, useless for detection:
# BAD: Matches any function call
pattern: $FUNC(...)
# GOOD: Specific dangerous function
pattern: eval(...)
Missing safe cases in tests - leads to undetected false positives:
# BAD: Only tests vulnerable case
# ruleid: my-rule
dangerous(user_input)
# GOOD: Include safe cases to verify no false positives
# ruleid: my-rule
dangerous(user_input)
# ok: my-rule
dangerous(sanitize(user_input))
# ok: my-rule
dangerous("hardcoded_safe_value")
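The safe-case requirement above can be checked mechanically before running Semgrep. The following is a minimal sketch, not part of the skill itself: the helper names `annotation_counts` and `has_safe_and_vulnerable_cases` are hypothetical, and the regular expression assumes annotations appear in `#` or `//` comments.

```python
import re

# Assumption: annotations look like "# ruleid: my-rule" or "// ok: my-rule".
# Prefixed variants such as "todoruleid" are intentionally not matched.
ANNOTATION = re.compile(r"(?:#|//)\s*(ruleid|ok):\s*([\w.-]+)")


def annotation_counts(test_source: str, rule_id: str) -> dict:
    """Count ruleid/ok annotations for one rule in a test file's text."""
    counts = {"ruleid": 0, "ok": 0}
    for kind, found_id in ANNOTATION.findall(test_source):
        if found_id == rule_id:
            counts[kind] += 1
    return counts


def has_safe_and_vulnerable_cases(test_source: str, rule_id: str) -> bool:
    """True only if the test file has at least one ruleid and one ok case."""
    counts = annotation_counts(test_source, rule_id)
    return counts["ruleid"] > 0 and counts["ok"] > 0
```

Run against the two snippets above, the "BAD" test file fails this check because it has no `ok` annotation, while the "GOOD" one passes.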
Overly specific patterns - misses variations:
# BAD: Only matches exact format
pattern: os.system("rm " + $VAR)
# GOOD: Matches all os.system calls with taint tracking
mode: taint
pattern-sources:
- pattern: input(...)
pattern-sinks:
- pattern: os.system(...)
This workflow is strict - do not skip steps:
languages: generic)
todook and todoruleid test annotations: todoruleid: <rule-id> and todook: <rule-id> annotations in test files for future rule improvements are forbidden
This skill guides creation of Semgrep rules that detect security vulnerabilities and code patterns. Rules are created iteratively: analyze the problem, write tests first, analyze AST structure, write the rule, iterate until all tests pass, optimize the rule.
Approach selection:
Why prioritize taint mode? Pattern matching finds syntax but misses context. A pattern eval($X) matches both eval(user_input) (vulnerable) and eval("safe_literal") (safe). Taint mode tracks data flow, so it only alerts when untrusted data actually reaches the sink—dramatically reducing false positives for injection vulnerabilities.
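To make the difference concrete, here is a toy illustration written with Python's `ast` module. It is not how Semgrep's engine works; it only mimics the two behaviors described above. The source set (`input`), the function names, and the restriction to direct calls and simple top-level assignments are all assumptions made for this sketch.

```python
import ast

SOURCE_FUNCS = {"input"}  # hypothetical untrusted sources for this toy example
SINK_FUNC = "eval"


def _is_call_to(node: ast.AST, names: set) -> bool:
    """True if node is a call to a bare function whose name is in names."""
    return (
        isinstance(node, ast.Call)
        and isinstance(node.func, ast.Name)
        and node.func.id in names
    )


def _is_tainted(expr: ast.AST, tainted: set) -> bool:
    """An expression is tainted if it is a source call or a tainted variable."""
    if _is_call_to(expr, SOURCE_FUNCS):
        return True
    return isinstance(expr, ast.Name) and expr.id in tainted


def pattern_matches(code: str) -> list:
    """Syntax-only view: line numbers of every eval(...) call."""
    tree = ast.parse(code)
    return sorted(n.lineno for n in ast.walk(tree) if _is_call_to(n, {SINK_FUNC}))


def taint_matches(code: str) -> list:
    """Toy data-flow view: eval(...) calls whose first argument is tainted.

    Only top-level statements are processed, in order, and only
    `name = <expr>` assignments propagate or clear taint.
    """
    tree = ast.parse(code)
    tainted = set()
    hits = []
    for stmt in tree.body:
        # Check sinks first, using the taint state from earlier statements.
        for node in ast.walk(stmt):
            if (
                _is_call_to(node, {SINK_FUNC})
                and node.args
                and _is_tainted(node.args[0], tainted)
            ):
                hits.append(node.lineno)
        # Then update the taint state for simple single-target assignments.
        if (
            isinstance(stmt, ast.Assign)
            and len(stmt.targets) == 1
            and isinstance(stmt.targets[0], ast.Name)
        ):
            name = stmt.targets[0].id
            if _is_tainted(stmt.value, tainted):
                tainted.add(name)
            else:
                tainted.discard(name)
    return sorted(hits)
```

For the three-line program `user_input = input()`, `eval(user_input)`, `eval("safe_literal")`, `pattern_matches` reports lines 2 and 3, while `taint_matches` reports only line 2.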
Iterating between approaches: It's okay to experiment. If you start with taint mode and it's not working well (e.g., taint doesn't propagate as expected, too many false positives/negatives), switch to pattern matching. Conversely, if pattern matching produces too many false positives on safe cases, try taint mode instead. The goal is a working rule—not rigid adherence to one approach.
Output structure - exactly 2 files in a directory named after the rule-id:
<rule-id>/
├── <rule-id>.yaml # Semgrep rule
└── <rule-id>.<ext> # Test file with ruleid/ok annotations
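The layout above can be verified with a few lines of Python. This is a sketch under stated assumptions: `check_rule_layout` is a hypothetical helper, and it treats any second entry named `<rule-id>.<something>` as the test file without checking that the extension matches the rule's language.

```python
from pathlib import Path


def check_rule_layout(rule_dir: Path) -> list:
    """Return a list of layout problems; an empty list means the layout is as expected."""
    rule_id = rule_dir.name  # the directory is named after the rule-id
    rule_file = f"{rule_id}.yaml"
    entries = sorted(p.name for p in rule_dir.iterdir())
    problems = []

    if len(entries) != 2:
        problems.append(f"expected exactly 2 files, found {len(entries)}")
    if rule_file not in entries:
        problems.append(f"missing rule file {rule_file}")

    # Any other entry named "<rule-id>.<ext>" counts as the test file.
    tests = [n for n in entries if n.startswith(f"{rule_id}.") and n != rule_file]
    if len(tests) != 1:
        problems.append(f"expected exactly 1 test file, found {len(tests)}")

    return problems
```

Calling `check_rule_layout(Path("insecure-eval"))` on a directory that holds only `insecure-eval.yaml` and `insecure-eval.py` returns an empty list.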
rules:
  - id: insecure-eval
    languages: [python]
    severity: HIGH
    message: User input passed to eval() allows code execution
    mode: taint
    pattern-sources:
      - pattern: request.args.get(...)
    pattern-sinks:
      - pattern: eval(...)
Test file (insecure-eval.py):
# ruleid: insecure-eval
eval(request.args.get('code'))
# ok: insecure-eval
eval("print('safe')")
Run tests (from rule directory): semgrep --test --config <rule-id>.yaml <rule-id>.<ext>
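If the test run needs to be scripted, the command above can be wrapped as follows. This is a minimal sketch: the function names are hypothetical, it assumes `semgrep` is installed and on `PATH`, and it treats the process exit code as the only pass/fail signal.

```python
import subprocess
from pathlib import Path


def semgrep_test_command(rule_id: str, ext: str) -> list:
    """Build the argument list for the test command shown above."""
    return ["semgrep", "--test", "--config", f"{rule_id}.yaml", f"{rule_id}.{ext}"]


def run_rule_tests(rule_dir: Path, ext: str) -> int:
    """Run the tests from inside the rule directory and return the exit code."""
    cmd = semgrep_test_command(rule_dir.name, ext)
    # cwd=rule_dir matches the "from rule directory" instruction.
    return subprocess.run(cmd, cwd=rule_dir, check=False).returncode
```

For example, `run_rule_tests(Path("insecure-eval"), "py")` runs `semgrep --test --config insecure-eval.yaml insecure-eval.py` inside the `insecure-eval/` directory.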
Copy this checklist and track progress:
Semgrep Rule Progress:
- [ ] Step 1: Analyze the Problem
- [ ] Step 2: Write Tests First
- [ ] Step 3: Analyze AST structure
- [ ] Step 4: Write the rule
- [ ] Step 5: Iterate until all tests pass (semgrep --test)
- [ ] Step 6: Optimize the rule (remove redundancies, re-test)
- [ ] Step 7: Final Run
REQUIRED: Before writing any rule, use WebFetch to read all 7 of these Semgrep documentation links:
Weekly Installs: 1.1K
Repository
GitHub Stars: 3.9K
First Seen: Jan 19, 2026
Security Audits: Gen Agent Trust Hub (Pass), Socket (Pass), Snyk (Warn)
Installed on:
claude-code: 995
opencode: 954
gemini-cli: 933
codex: 927
cursor: 901
github-copilot: 871