chaos-engineering-resilience by proffesor-for-testing/agentic-qe
npx skills add https://github.com/proffesor-for-testing/agentic-qe --skill chaos-engineering-resilience
<default_to_action> When testing system resilience or injecting failures:
Quick Chaos Steps:
Critical Success Factors:
| Category | Failures | Tools |
|---|---|---|
| Network | Latency, packet loss, partition | tc, toxiproxy |
| Infrastructure | Instance kill, disk failure, CPU | Chaos Monkey |
| Application | Exceptions, slow responses, leaks | Gremlin, LitmusChaos |
| Dependencies | Service outage, timeout | WireMock |
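The application-level failures in the table (exceptions, slow responses) can also be simulated in-process, without any of the listed tools. The following is a minimal sketch under that assumption; `withFaults` and its options `failureRate`, `delayMs`, and `random` are hypothetical names chosen for illustration and are not part of agentic-qe or any tool named above.

```javascript
// Hypothetical helper: wraps an async function and injects faults into it.
// All names here are illustrative assumptions, not an existing API.
function withFaults(fn, { failureRate = 0, delayMs = 0, random = Math.random } = {}) {
  return async function faulty(...args) {
    if (delayMs > 0) {
      // Simulate a slow response before doing any real work.
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
    if (random() < failureRate) {
      // Simulate an application exception.
      throw new Error('Injected fault');
    }
    return fn(...args);
  };
}
```

Passing `random` as an option keeps the wrapper deterministic in tests: a fixed function can stand in for `Math.random`.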
Dev (safe) → Staging → 1% prod → 10% → 50% → 100%
    ↓           ↓         ↓                   ↓
  Learn     Validate   Careful         Full confidence
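The progression above can be read as a loop that only widens the blast radius while steady state holds. The sketch below illustrates that idea; `rollOut` and the caller-supplied `runStage` callback are hypothetical, and the stage labels are simply copied from the diagram.

```javascript
// Stage labels copied from the progression above.
const STAGES = ['dev', 'staging', '1% prod', '10% prod', '50% prod', '100% prod'];

// Hypothetical driver: `runStage(stage)` is supplied by the caller, runs the
// experiment at that stage, and returns true if steady state held.
function rollOut(runStage, stages = STAGES) {
  const completed = [];
  for (const stage of stages) {
    if (!runStage(stage)) {
      // Stop widening the blast radius at the first failing stage.
      return { completed, haltedAt: stage };
    }
    completed.push(stage);
  }
  return { completed, haltedAt: null };
}
```

For example, a `runStage` that fails at `'10% prod'` yields three completed stages and `haltedAt: '10% prod'`, so the 50% and 100% stages never run.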
| Metric | Normal | Alert Threshold |
|---|---|---|
| Error rate | < 0.1% | > 1% |
| p99 latency | < 200ms | > 500ms |
| Throughput | baseline | -20% |
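The alert thresholds in the table can be checked mechanically during an experiment. This is a minimal sketch, assuming metrics arrive as plain numbers (error rate as a fraction, latency in milliseconds, throughput in any consistent unit) and that "-20%" means a drop of 20% or more from baseline; `findAlerts` and the field names are hypothetical.

```javascript
// Alert thresholds taken from the table above.
const THRESHOLDS = {
  maxErrorRate: 0.01,     // alert above 1%
  maxP99LatencyMs: 500,   // alert above 500ms
  maxThroughputDrop: 0.2, // alert at a 20% drop from baseline (assumed reading of "-20%")
};

// Hypothetical check: returns the names of metrics that crossed a threshold.
function findAlerts(metrics, baselineThroughput, t = THRESHOLDS) {
  const alerts = [];
  if (metrics.errorRate > t.maxErrorRate) alerts.push('errorRate');
  if (metrics.p99LatencyMs > t.maxP99LatencyMs) alerts.push('p99Latency');
  const drop = (baselineThroughput - metrics.throughput) / baselineThroughput;
  if (drop >= t.maxThroughputDrop) alerts.push('throughput');
  return alerts;
}
```

An empty result means every metric is inside its threshold; any non-empty result is a reason to stop or roll back the experiment.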
// Chaos experiment definition
const experiment = {
  name: 'Database latency injection',
  hypothesis: 'System handles 500ms DB latency gracefully',
  steadyState: {
    errorRate: '< 0.1%',
    p99Latency: '< 300ms'
  },
  method: {
    type: 'network-latency',
    target: 'database',
    delay: '500ms',
    duration: '5m'
  },
  rollback: {
    automatic: true,
    trigger: 'errorRate > 5%'
  }
};
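The `rollback.trigger` string in the definition above has to be turned into a check before it can drive an automatic rollback. The source does not show how that is done, so the following is only a sketch under an assumed grammar (metric name, comparison operator, number with an optional percent sign); `parseTrigger` is a hypothetical helper.

```javascript
// Hypothetical parser: turns a trigger such as 'errorRate > 5%' into a
// predicate over a metrics object. The accepted grammar is an assumption.
function parseTrigger(trigger) {
  const match = /^\s*(\w+)\s*(<=|>=|<|>)\s*(\d+(?:\.\d+)?)(%?)\s*$/.exec(trigger);
  if (!match) throw new Error(`Unrecognized trigger: ${trigger}`);
  const [, metric, op, num, pct] = match;
  // Percentages are converted to fractions, so '5%' becomes 0.05.
  const limit = pct ? Number(num) / 100 : Number(num);
  const compare = {
    '>': (a, b) => a > b,
    '<': (a, b) => a < b,
    '>=': (a, b) => a >= b,
    '<=': (a, b) => a <= b,
  }[op];
  return (metrics) => compare(metrics[metric], limit);
}
```

With this sketch, `parseTrigger('errorRate > 5%')` returns a function that is true for `{ errorRate: 0.08 }` and false for `{ errorRate: 0.02 }`.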
// qe-chaos-engineer runs controlled experiments
await Task("Chaos Experiment", {
  target: 'payment-service',
  failure: 'terminate-random-instance',
  blastRadius: '10%',
  duration: '5m',
  steadyStateHypothesis: {
    metric: 'success-rate',
    threshold: 0.99
  },
  autoRollback: true
}, "qe-chaos-engineer");

// Validates:
// - System recovers automatically
// - Error rate stays within threshold
// - No data loss
// - Alerts triggered appropriately
aqe/chaos-engineering/
├── experiments/* - Experiment definitions & results
├── steady-states/* - Baseline measurements
├── runbooks/* - Generated recovery procedures
└── blast-radius/* - Impact analysis
const chaosFleet = await FleetManager.coordinate({
  strategy: 'chaos-engineering',
  agents: [
    'qe-chaos-engineer',         // Experiment execution
    'qe-performance-tester',     // Baseline metrics
    'qe-production-intelligence' // Production monitoring
  ],
  topology: 'sequential'
});
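Break things on purpose to prevent unplanned outages. Find weaknesses before users do. Define steady state, inject failures, measure impact, fix weaknesses, create runbooks. Start small and increase the blast radius gradually.
With Agents: qe-chaos-engineer automates chaos experiments with blast radius control, automatic rollback, and comprehensive resilience validation. It generates runbooks from experiment results.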
Weekly Installs: 62
Repository: proffesor-for-testing/agentic-qe
GitHub Stars: 294
First Seen: Jan 24, 2026
Installed on: github-copilot (59), opencode (59), codex (57), gemini-cli (57), cursor (56), cline (55)
Security Audits: Gen Agent Trust Hub (Pass), Socket (Pass), Snyk (Warn)