Security Suite (security-suite) - binary and repository security testing, behavioral contract analysis, and CI gating tool

security-suite by boshu2/agentops

172 weekly installs

220 GitHub Stars

Install command

npx skills add https://github.com/boshu2/agentops --skill security-suite

🇺🇸English

Security Suite

Purpose: Provide composable, repeatable security/internal-testing primitives for authorized binaries and repo-managed prompt surfaces.

This skill separates concerns into primitives so security workflows stay testable and reusable.

Guardrails

Use only on binaries you own or are explicitly authorized to assess.
Do not use this workflow to bypass legal restrictions or extract third-party proprietary content without authorization.
Prefer behavioral assurance and policy gating over ad hoc, one-off reverse engineering.

Primitive Model

collect-static — file metadata, runtime heuristics, linked libraries, embedded archive signatures.
collect-dynamic — sandboxed execution trace (processes, file changes, network endpoints).
collect-contract — machine-readable behavior contract from help-surface probing.
compare-baseline — current vs baseline contract drift (added/removed commands, runtime change).
enforce-policy — allowlist/denylist gates and severity-based verdict.
collect-redteam — offline repo-surface attack-pack scan for prompt-injection, tool-misuse, secret-exfiltration, and unsafe-shell regressions.
run — thin binary orchestrator that composes primitives and writes suite summary.
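
The drift comparison that compare-baseline describes can be pictured as a set difference over command names. The following is a minimal sketch only: the function name `diff_commands` and the flat list-of-names input are assumptions for illustration, not the suite's actual contract schema or implementation.

```python
def diff_commands(baseline, current):
    """Return command names added to or removed from the current contract.

    Both arguments are plain lists of command names (a hypothetical,
    simplified stand-in for whatever contract/contract.json contains).
    """
    base, cur = set(baseline), set(current)
    return {"added": sorted(cur - base), "removed": sorted(base - cur)}


# Hypothetical command lists for a baseline and a candidate build.
drift = diff_commands(["init", "run", "status"], ["init", "run", "doctor"])
print(drift)  # {'added': ['doctor'], 'removed': ['status']}
```

A gate such as --fail-on-removed would then only need to check whether the removed list is non-empty.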

Quick Start

Single run (default dynamic command is --help):

python3 skills/security-suite/scripts/security_suite.py run \
  --binary "$(command -v ao)" \
  --out-dir .tmp/security-suite/ao-current

Baseline regression gate:

python3 skills/security-suite/scripts/security_suite.py run \
  --binary "$(command -v ao)" \
  --out-dir .tmp/security-suite/ao-current \
  --baseline-dir .tmp/security-suite/ao-baseline \
  --fail-on-removed

Policy gate:

python3 skills/security-suite/scripts/security_suite.py run \
  --binary "$(command -v ao)" \
  --out-dir .tmp/security-suite/ao-current \
  --policy-file skills/security-suite/references/policy-example.json \
  --fail-on-policy-fail

Repo-surface redteam:

python3 skills/security-suite/scripts/prompt_redteam.py scan \
  --repo-root . \
  --pack-file skills/security-suite/references/agentops-redteam-pack.json \
  --out-dir .tmp/security-suite-redteam

Recommended Workflow

Capture a baseline on a known-good release.
Run the suite on the candidate binary in CI.
Compare against the baseline and enforce policy.
Block promotion on a failing verdict.
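
One way a CI job might wire these steps together is a small wrapper that assembles the `run` invocation and propagates its exit code. This is a sketch under assumptions: the helper names `build_gate_command` and `run_gate` are hypothetical, and only the script path and flags already shown in this document are used.

```python
import subprocess

SUITE = "skills/security-suite/scripts/security_suite.py"


def build_gate_command(binary, out_dir, baseline_dir=None, policy_file=None):
    """Assemble the argv for a suite run, adding gates only when inputs exist."""
    cmd = ["python3", SUITE, "run", "--binary", binary, "--out-dir", out_dir]
    if baseline_dir:
        cmd += ["--baseline-dir", baseline_dir, "--fail-on-removed"]
    if policy_file:
        cmd += ["--policy-file", policy_file, "--fail-on-policy-fail"]
    return cmd


def run_gate(cmd):
    """Run the suite and return its exit code; non-zero should block promotion."""
    return subprocess.run(cmd, check=False).returncode


# Example paths mirror the ones used elsewhere in this document.
cmd = build_gate_command(
    "./bin/ao-candidate",
    ".tmp/ao-candidate",
    baseline_dir=".tmp/security-suite/ao-v2.4",
    policy_file="skills/security-suite/references/policy-example.json",
)
print(" ".join(cmd))
```

In a pipeline, the job would call `sys.exit(run_gate(cmd))` so that a failing verdict fails the build.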

Output Contract

All outputs are written under --out-dir:

static/static-analysis.json
dynamic/dynamic-analysis.json
contract/contract.json
compare/baseline-diff.json (when baseline supplied)
policy/policy-verdict.json (when policy supplied)
suite-summary.json
redteam/redteam-results.json and redteam/redteam-results.md (when repo-surface redteam is run)

This output structure is intentionally machine-consumable for CI gates.
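
Because the layout is fixed, a CI step can verify that a run produced the files it expects before parsing any of them. The sketch below assumes the four unconditional artifacts are always written by `run`; the helper name `missing_artifacts` is hypothetical and is not part of the suite.

```python
import tempfile
from pathlib import Path

# Artifacts listed above without a condition attached.
ALWAYS = [
    "static/static-analysis.json",
    "dynamic/dynamic-analysis.json",
    "contract/contract.json",
    "suite-summary.json",
]


def missing_artifacts(out_dir, with_baseline=False, with_policy=False):
    """Return the expected output files that are absent under out_dir."""
    expected = list(ALWAYS)
    if with_baseline:
        expected.append("compare/baseline-diff.json")
    if with_policy:
        expected.append("policy/policy-verdict.json")
    root = Path(out_dir)
    return [rel for rel in expected if not (root / rel).is_file()]


# Demonstration against a throwaway directory holding only the summary file.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "suite-summary.json").write_text("{}")
    missing = missing_artifacts(tmp, with_policy=True)

print(missing)
```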

Policy Model

Use skills/security-suite/references/policy-example.json as a starting point.

Supported checks:

required_top_level_commands
deny_command_patterns
max_created_files
forbid_file_path_patterns
allow_network_endpoint_patterns
deny_network_endpoint_patterns
block_if_removed_commands
min_command_count
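
To make three of these checks concrete, here is a rough sketch of how they could be evaluated against a list of captured command names. The value shapes (lists of strings, an integer) and the matching semantics are assumptions made for illustration; consult policy-example.json for the real schema. The function `check_commands` is hypothetical.

```python
import re

# Hypothetical policy fragment; key names come from the list above,
# value shapes are assumed.
policy = {
    "required_top_level_commands": ["init", "run"],
    "deny_command_patterns": [r"^debug-"],
    "min_command_count": 2,
}


def check_commands(commands, policy):
    """Return human-readable failures for the command-related checks."""
    failures = []
    for name in policy.get("required_top_level_commands", []):
        if name not in commands:
            failures.append(f"missing required command: {name}")
    for pattern in policy.get("deny_command_patterns", []):
        for name in commands:
            if re.search(pattern, name):
                failures.append(f"denied command: {name} matches {pattern}")
    if len(commands) < policy.get("min_command_count", 0):
        failures.append("too few commands")
    return failures


failures = check_commands(["init", "debug-dump"], policy)
for line in failures:
    print(line)
```

An empty list would correspond to a passing verdict for these three checks.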

Redteam Pack Model

Use agentops-redteam-pack.json as the starting point for offline repo-surface redteam checks.

Supported target fields:

globs
require_groups
forbidden_any
applies_if_any

Each case expresses a concrete adversarial prompt or operator-bypass attempt and binds it to one or more repo-owned files. The first shipped pack covers instruction precedence, context overexposure, destructive git misuse, security gate bypass, and unsafe shell or secret-handling regressions.
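
The four target fields suggest a simple matching model, sketched below. The semantics here are assumptions, not the documented behavior of prompt_redteam.py: `globs` selects files, `applies_if_any` gates whether a file is checked at all, every group in `require_groups` needs at least one matching pattern, and no `forbidden_any` pattern may match. The function `evaluate_target` and the sample data are hypothetical.

```python
import re
from fnmatch import fnmatch


def evaluate_target(target, files):
    """Check one pack target against files (repo-relative path -> text).

    Returns a list of (path, problem) tuples; an empty list means the
    target's guardrail language was found and nothing forbidden appeared.
    """
    problems = []
    for path, text in files.items():
        if not any(fnmatch(path, g) for g in target["globs"]):
            continue  # file is outside this target's scope
        gate = target.get("applies_if_any", [])
        if gate and not any(re.search(p, text) for p in gate):
            continue  # target does not apply to this file
        for group in target.get("require_groups", []):
            if not any(re.search(p, text) for p in group):
                problems.append((path, f"no match for required group {group}"))
        for pattern in target.get("forbidden_any", []):
            if re.search(pattern, text):
                problems.append((path, f"forbidden pattern present: {pattern}"))
    return problems


# Hypothetical target and file contents.
target = {
    "globs": ["skills/*/SKILL.md"],
    "applies_if_any": [r"git"],
    "require_groups": [[r"never force-push", r"do not force-push"]],
    "forbidden_any": [r"git push --force"],
}
files = {
    "skills/demo/SKILL.md": "Use git carefully. Run git push --force when stuck.",
    "README.md": "git push --force",
}
problems = evaluate_target(target, files)
for path, problem in problems:
    print(path, "->", problem)
```

Pattern-based checks like this are deterministic and run offline, at the cost of being sensitive to rewording in the target files.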

Technique Coverage

This suite is designed for broad binary classes, not just CLI metadata:

static runtime/library fingerprinting
sandboxed behavior observation
command/contract capture
drift classification
policy enforcement and CI verdicting
repo-surface redteam checks for prompt and operator-contract regressions

It is intentionally modular so you can add deeper primitives later (syscall tracing, SBOM attestation verification, fuzz harnesses) without rewriting the workflow.

Validation

Run:

bash skills/security-suite/scripts/validate.sh
bash tests/scripts/test-security-suite-redteam.sh

Smoke test (recommended):

python3 skills/security-suite/scripts/security_suite.py run \
  --binary "$(command -v ao)" \
  --out-dir .tmp/security-suite-smoke \
  --policy-file skills/security-suite/references/policy-example.json

Repo-surface smoke test:

python3 skills/security-suite/scripts/prompt_redteam.py scan \
  --repo-root . \
  --pack-file skills/security-suite/references/agentops-redteam-pack.json \
  --out-dir .tmp/security-suite-redteam-smoke

Examples

Scenario: Capture a Baseline and Gate a New Release

User says: /security-suite run --binary $(command -v ao) --out-dir .tmp/security-suite/ao-v2.4

What happens:

The suite runs static analysis (file metadata, linked libraries, embedded archive signatures), dynamic tracing (sandboxed --help execution observing processes, file changes, network endpoints), and contract capture against the ao binary.
It writes static/static-analysis.json, dynamic/dynamic-analysis.json, contract/contract.json, and suite-summary.json under the output directory.

Result: A complete baseline snapshot is captured for ao v2.4, ready to be used as --baseline-dir for future release comparisons.

Scenario: CI Regression Gate With Baseline and Policy

User says: /security-suite run --binary ./bin/ao-candidate --out-dir .tmp/ao-candidate --baseline-dir .tmp/security-suite/ao-v2.4 --policy-file skills/security-suite/references/policy-example.json --fail-on-removed --fail-on-policy-fail

What happens:

The suite runs all three collection primitives on the candidate binary, then compares the resulting contract against the v2.4 baseline to produce compare/baseline-diff.json with any added, removed, or changed commands.
It evaluates the policy file checks (required commands, denied patterns, network allowlists, file limits) and writes policy/policy-verdict.json with a pass/fail verdict.

Result: The suite exits non-zero if any commands were removed or a policy check failed, blocking the candidate from promotion in the CI pipeline.

Scenario: Offline Redteam the Repo's Prompt and Skill Surfaces

User says: /security-suite collect-redteam --repo-root .

What happens:

The redteam scanner loads the attack pack from agentops-redteam-pack.json and evaluates repo-owned control surfaces against concrete attack cases.
It writes redteam/redteam-results.json and redteam/redteam-results.md under the chosen output directory, then exits non-zero if a fail-severity case is not resisted.

Result: The repo gets a deterministic redteam verdict for prompt-injection, tool misuse, context overexposure, secret-handling, and unsafe-shell regressions without needing hosted model scanning.

Troubleshooting

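Problem	Cause	Solution
Suite exits non-zero with no clear finding	`--fail-on-removed` or `--fail-on-policy-fail` triggered on a legitimate change	Review `compare/baseline-diff.json` and `policy/policy-verdict.json` to identify the specific delta, then update the baseline or policy file accordingly.
`dynamic/dynamic-analysis.json` is empty or minimal	Binary requires arguments beyond `--help`, or sandbox blocked execution	Supply a custom dynamic command if supported, or verify the binary runs in the sandboxed environment (check permissions, missing shared libraries).
`contract/contract.json` shows zero commands	Binary does not expose a `--help` surface or uses a non-standard help flag	Verify the binary supports `--help`; for binaries with a non-standard help surface, run `collect-contract` separately with the correct invocation.
Policy verdict fails on `deny_command_patterns`	A new subcommand matches a deny regex in the policy file	Either rename the subcommand or update `deny_command_patterns` in the policy JSON to exclude the legitimate pattern.
`baseline-diff.json` is not generated	`--baseline-dir` was not supplied or points to a directory that does not exist	Ensure the baseline directory exists and contains a valid `contract/contract.json` from a previous run.
Redteam scan fails after a wording cleanup	The attack pack no longer matches the expected guardrail language in the target files	Review `redteam/redteam-results.json`, confirm whether it is a control regression or an overly brittle regex, then deliberately update the target file or the attack pack.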

Weekly Installs: 123
Repository: boshu2/agentops
GitHub Stars: 198
First Seen: Feb 20, 2026
Security Audits: Gen Agent Trust Hub (Warn), Socket (Fail), Snyk (Pass)
Installed on: opencode (122), codex (119), github-copilot (118), kimi-cli (118), gemini-cli (118), cursor (118)