devops-automator by erichowens/some_claude_skills
npx skills add https://github.com/erichowens/some_claude_skills --skill devops-automator
Expert DevOps engineer specializing in CI/CD pipelines, infrastructure as code, container orchestration, and deployment automation.
Activate on: "CI/CD", "GitHub Actions", "deployment pipeline", "Terraform", "infrastructure as code", "IaC", "Docker", "Kubernetes", "K8s", "Helm", "container orchestration", "GitOps", "ArgoCD", "deployment automation", "secrets management", "monitoring setup"
NOT for: Application development → language skills | Database design → data-pipeline-engineer | API design → api-architect
| Domain | Tools & Technologies |
|---|---|
| CI/CD | GitHub Actions, GitLab CI, Jenkins |
| IaC | Terraform, AWS CDK, Pulumi |
| Containers | Docker, Kubernetes, Helm |
| GitOps | ArgoCD, Flux, Kustomize |
| Monitoring | Prometheus, Grafana, ELK/EFK |
Code Commit → Build → Test → Security Scan → Package
                                                ↓
Monitor ← Release Staging ← Smoke Tests ← Deploy Dev
                 ↓
          Manual Approval
                 ↓
          Deploy Production
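The pipeline stages above could be sketched as a GitHub Actions workflow along the following lines. This is an illustrative skeleton, not the workflow shipped with the skill: the job names, the `make` targets, and the environment names (`dev`, `staging`, `production`) are all assumptions. The manual approval step relies on required reviewers being configured for the `production` environment in the repository settings.

```yaml
# Hypothetical workflow: job names, make targets, and environment names are placeholders.
name: delivery-pipeline
on:
  push:
    branches: [main]

jobs:
  build-test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: make build test            # assumed Makefile targets

  security-scan:
    needs: build-test                   # runs only after build and test succeed
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: make scan                  # assumed wrapper around a scanner

  package:
    needs: security-scan
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: make package

  deploy-dev:
    needs: package
    runs-on: ubuntu-latest
    environment: dev
    steps:
      - uses: actions/checkout@v4
      - run: make deploy ENV=dev
      - run: make smoke-test ENV=dev    # smoke tests gate promotion to staging

  deploy-staging:
    needs: deploy-dev
    runs-on: ubuntu-latest
    environment: staging
    steps:
      - uses: actions/checkout@v4
      - run: make deploy ENV=staging

  deploy-production:
    needs: deploy-staging
    runs-on: ubuntu-latest
    environment: production             # pauses for approval if required reviewers are set
    steps:
      - uses: actions/checkout@v4
      - run: make deploy ENV=production
```

Chaining jobs with `needs` keeps each stage from starting until the previous one has passed, which mirrors the left-to-right flow of the diagram.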
App Repo ──CI──▶ Config Repo ──ArgoCD──▶ K8s Cluster
                      ▲                        │
                      └────Continuous Sync─────┘
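As a rough illustration of the GitOps flow above, an ArgoCD `Application` can point at the config repository and keep the cluster in sync with it. The repository URL, path, and names below are hypothetical; only the field names come from the ArgoCD `Application` resource.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: example-app                     # hypothetical application name
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/example-org/config-repo.git   # hypothetical config repo
    targetRevision: main
    path: apps/example-app/overlays/prod                       # hypothetical path
  destination:
    server: https://kubernetes.default.svc                     # the cluster ArgoCD runs in
    namespace: example-app
  syncPolicy:
    automated:
      prune: true       # delete resources that were removed from the config repo
      selfHeal: true    # revert manual changes made directly in the cluster
    syncOptions:
      - CreateNamespace=true
```

With `selfHeal` enabled, changes made by hand in the cluster are overwritten by what the config repo declares, which is one way to get the continuous sync loop shown in the diagram.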
Full working examples are in ./references/:
| File | Description | Lines |
|---|---|---|
| github-actions-patterns.yaml | Complete CI/CD pipeline | 217 |
| terraform-eks-module.tf | Production EKS cluster | 282 |
| kubernetes-deployment.yaml | Deployment + HPA + ArgoCD | 200 |
| dockerfile-multistage.dockerfile | Optimized multi-stage build | 51 |
Symptom: Nearly identical workflow files duplicated across repositories
Fix: Reusable workflows, Helm charts, Kustomize bases, Terraform modules
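One way to remove duplicated workflow files is a GitHub Actions reusable workflow. The sketch below shows two separate files in one block, divided by `---` for readability only. The shared repository `example-org/shared-workflows`, the `v1` tag, and the `node-version` input are assumptions made for illustration.

```yaml
# File 1 (hypothetical shared repo): example-org/shared-workflows/.github/workflows/ci.yml
on:
  workflow_call:                  # makes this workflow callable from other workflows
    inputs:
      node-version:
        type: string
        default: "20"
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: ${{ inputs.node-version }}
      - run: npm ci && npm test
---
# File 2 (each consuming repo): .github/workflows/ci.yml
on: [push]
jobs:
  ci:
    uses: example-org/shared-workflows/.github/workflows/ci.yml@v1   # pinned to a tag
    with:
      node-version: "20"
```

Each consuming repository then carries only a few lines, and changes to the shared pipeline are released by moving or adding a tag.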
Symptom: API keys, passwords committed to git
Fix: Secret managers (Vault, AWS SM), sealed secrets, env vars from secure sources
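A minimal sketch of the "env vars from secure sources" option: the container reads its API key from a Kubernetes Secret rather than from a value committed to the repository. The Secret `example-app-secrets` is assumed to be created outside git (for example by a secret manager integration or a sealed secret), and all names here are placeholders.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: example-app                               # hypothetical
spec:
  containers:
    - name: app
      image: registry.example.com/app:1.4.2       # hypothetical image
      env:
        - name: API_KEY
          valueFrom:
            secretKeyRef:
              name: example-app-secrets           # Secret assumed to exist in the namespace
              key: api-key
```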
Symptom: No plan for deployment failure, manual intervention required
Fix: Blue/green, canary with automated rollback, ArgoCD auto-revert
Symptom: Single 45-minute pipeline rebuilding everything on every commit
Fix: Parallel jobs, caching, incremental builds, path-based triggers
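The following GitHub Actions sketch combines three of these fixes: a path filter so the workflow runs only when one service changes, dependency caching through `actions/setup-node`, and two jobs with no `needs` relationship so they run in parallel. The `services/api` layout and the npm scripts are assumptions.

```yaml
# Hypothetical workflow for a single service in a monorepo.
on:
  push:
    paths:
      - "services/api/**"                 # run only when this service changes
      - ".github/workflows/api.yml"

jobs:
  lint:
    runs-on: ubuntu-latest
    defaults:
      run:
        working-directory: services/api
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: "20"
          cache: npm                      # restores the npm cache between runs
          cache-dependency-path: services/api/package-lock.json
      - run: npm ci
      - run: npm run lint                 # assumed npm script

  test:                                   # no "needs", so it runs in parallel with lint
    runs-on: ubuntu-latest
    defaults:
      run:
        working-directory: services/api
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: "20"
          cache: npm
          cache-dependency-path: services/api/package-lock.json
      - run: npm ci
      - run: npm test
```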
Symptom: K8s pods without CPU/memory limits consuming all host resources
Fix: Always set requests/limits, use LimitRanges and ResourceQuotas
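As an illustration, a namespace-level `LimitRange` can supply defaults for containers that omit them, while each workload still declares its own values. The numbers and names below are placeholders, not recommendations from the skill.

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: default-limits
  namespace: example-app              # hypothetical namespace
spec:
  limits:
    - type: Container
      defaultRequest:                 # applied when a container sets no requests
        cpu: 100m
        memory: 128Mi
      default:                        # applied when a container sets no limits
        cpu: 500m
        memory: 256Mi
---
apiVersion: v1
kind: Pod
metadata:
  name: example-app
  namespace: example-app
spec:
  containers:
    - name: app
      image: registry.example.com/app:1.4.2   # hypothetical image
      resources:
        requests:
          cpu: 250m
          memory: 256Mi
        limits:
          cpu: "1"
          memory: 512Mi
```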
Symptom: Dockerfile without USER instruction, pods running privileged
Fix: Add USER instruction, set securityContext.runAsNonRoot: true
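On the Kubernetes side, the fix might look like the sketch below. Only `runAsNonRoot: true` comes from the text above; the UID and the additional hardening fields are assumptions that are commonly set alongside it.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: example-app                             # hypothetical
spec:
  securityContext:
    runAsNonRoot: true                          # kubelet refuses to start containers running as UID 0
    runAsUser: 10001                            # assumed non-root UID present in the image
  containers:
    - name: app
      image: registry.example.com/app:1.4.2     # hypothetical image
      securityContext:
        privileged: false
        allowPrivilegeEscalation: false
        readOnlyRootFilesystem: true
        capabilities:
          drop: ["ALL"]
```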
Symptom: FROM node:latest or image: app:latest in production
Fix: Pin specific versions, use immutable tags with SHA digests
Symptom: Missing HEALTHCHECK in Dockerfile, no liveness/readiness probes
Fix: Add health endpoints, configure probes with appropriate timeouts
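A sketch of the probe configuration, assuming the application exposes `/healthz/ready` and `/healthz/live` on port 8080. Those paths, the port, and the timing values are illustrative and should be tuned to the service's actual startup and response times.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: example-app                             # hypothetical
spec:
  containers:
    - name: app
      image: registry.example.com/app:1.4.2     # hypothetical image
      ports:
        - containerPort: 8080
      readinessProbe:                           # controls whether the pod receives traffic
        httpGet:
          path: /healthz/ready                  # assumed endpoint
          port: 8080
        initialDelaySeconds: 5
        periodSeconds: 10
        timeoutSeconds: 2
        failureThreshold: 3
      livenessProbe:                            # restarts the container when it stops responding
        httpGet:
          path: /healthz/live                   # assumed endpoint
          port: 8080
        initialDelaySeconds: 15
        periodSeconds: 20
        timeoutSeconds: 2
        failureThreshold: 3
```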
Symptom: replicas: 1, no pod anti-affinity, single availability zone
Fix: Multiple replicas, pod anti-affinity, topology spread constraints
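The three fixes can be combined in one Deployment, roughly as follows. The replica count, labels, and the choice of soft (preferred) anti-affinity are assumptions; a stricter setup could use `requiredDuringSchedulingIgnoredDuringExecution` and `whenUnsatisfiable: DoNotSchedule`.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-app                             # hypothetical
spec:
  replicas: 3
  selector:
    matchLabels:
      app: example-app
  template:
    metadata:
      labels:
        app: example-app
    spec:
      affinity:
        podAntiAffinity:                        # prefer placing replicas on different nodes
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              podAffinityTerm:
                labelSelector:
                  matchLabels:
                    app: example-app
                topologyKey: kubernetes.io/hostname
      topologySpreadConstraints:                # spread replicas evenly across zones
        - maxSkew: 1
          topologyKey: topology.kubernetes.io/zone
          whenUnsatisfiable: ScheduleAnyway
          labelSelector:
            matchLabels:
              app: example-app
      containers:
        - name: app
          image: registry.example.com/app:1.4.2   # hypothetical image
```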
Symptom: terraform.tfstate committed to git or stored locally
Fix: Remote backend (S3+DynamoDB, Terraform Cloud, GCS)
Symptom: Multiple CI runs for same branch, deployment race conditions
Fix: Use concurrency groups, implement deployment locks
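In GitHub Actions, a concurrency group is declared with the `concurrency` key. The sketch below assumes a hypothetical `./scripts/deploy.sh`; the group name is arbitrary as long as runs that must not overlap resolve to the same string.

```yaml
on:
  push:
    branches: [main]

concurrency:
  group: deploy-${{ github.ref }}     # one group per branch
  cancel-in-progress: false           # let the running deployment finish; a newer run waits

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: ./scripts/deploy.sh      # hypothetical deploy script
```

Setting `cancel-in-progress: true` instead is usually a better fit for test-only workflows, where an older run on the same branch has no value once a newer commit arrives.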
Symptom: No vulnerability scanning, no secret detection in CI
Fix: Trivy, Snyk, or Grype for vulnerabilities; TruffleHog for secrets
Symptom: Manual changes to infrastructure, config diverges from code
Fix: ArgoCD diff detection, terraform plan in CI, regular audits
Symptom: IAM roles with * actions, service accounts with cluster-admin
Fix: Principle of least privilege, IRSA for pods, audit permissions
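A least-privilege setup could look like the sketch below: a dedicated service account bound to a namespaced `Role` with read-only access to one resource type, plus the IRSA annotation that maps it to a narrowly scoped IAM role on EKS. The account ID, role name, and the choice of `configmaps` are placeholders.

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: example-app
  namespace: example-app
  annotations:
    # IRSA: hypothetical IAM role ARN with a narrowly scoped policy
    eks.amazonaws.com/role-arn: arn:aws:iam::111122223333:role/example-app-s3-read
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: configmap-reader
  namespace: example-app
rules:
  - apiGroups: [""]
    resources: ["configmaps"]
    verbs: ["get", "list", "watch"]     # read-only, instead of cluster-admin
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: example-app-configmap-reader
  namespace: example-app
subjects:
  - kind: ServiceAccount
    name: example-app
    namespace: example-app
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: configmap-reader
```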
Symptom: No metrics, logs only on stdout, no alerting
Fix: Export metrics, structured logging, define SLOs, configure alerts
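To make "configure alerts" concrete, here is a sketch of a Prometheus alerting rule file for an error-rate objective. It assumes the application exports a counter named `http_requests_total` with `job` and `status` labels; the metric name, the 1% threshold, and the severity label are illustrative.

```yaml
groups:
  - name: example-app-slo
    rules:
      - alert: HighErrorRate
        # Ratio of 5xx responses to all responses over the last 5 minutes.
        expr: |
          sum(rate(http_requests_total{job="example-app", status=~"5.."}[5m]))
            /
          sum(rate(http_requests_total{job="example-app"}[5m])) > 0.01
        for: 10m                        # must hold for 10 minutes before firing
        labels:
          severity: page
        annotations:
          summary: "example-app 5xx ratio above 1% for 10 minutes"
```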
Run ./scripts/validate-devops-skill.sh to check:
[ ] All secrets in secret management (not in code)
[ ] Resource limits defined for all containers
[ ] Health checks configured (liveness, readiness)
[ ] Horizontal pod autoscaling enabled
[ ] Security contexts set (non-root, read-only)
[ ] Monitoring and alerting configured
[ ] Rollback strategy documented
[ ] Multi-environment support (dev, staging, prod)
[ ] Concurrency controls in CI pipelines
[ ] Remote state backend for Terraform
[ ] Vulnerability scanning in pipeline
[ ] Version pinning for all dependencies
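For the horizontal pod autoscaling item in the checklist above, a minimal `HorizontalPodAutoscaler` might look like this. The target Deployment name, the replica bounds, and the 70% CPU target are assumptions. CPU-utilization scaling also depends on the target pods declaring CPU requests and on a metrics source such as metrics-server being available in the cluster.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: example-app                   # hypothetical
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: example-app                 # Deployment assumed to exist
  minReplicas: 3
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70      # percent of requested CPU, averaged across pods
```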
Read, Write, Edit - File operations for configs and manifests
Bash(docker:*) - Build and manage containers
Bash(kubectl:*) - Kubernetes operations
Bash(terraform:*) - Infrastructure provisioning
Bash(helm:*) - Helm chart management
Bash(gh:*) - GitHub CLI operations
Weekly Installs: 59
Repository: erichowens/some_claude_skills
GitHub Stars: 78
First Seen: Jan 24, 2026
Security Audits: Gen Agent Trust Hub (Pass), Socket (Pass), Snyk (Pass)
Installed on: opencode (50), gemini-cli (49), codex (48), cursor (47), github-copilot (45), claude-code (40)