senior-cloud-architect by borghei/claude-skills
npx skills add https://github.com/borghei/claude-skills --skill senior-cloud-architect
Expert-level cloud architecture and infrastructure design.
| Service | AWS | GCP | Azure |
|---|---|---|---|
| Compute | EC2, ECS, EKS | GCE, GKE | VMs, AKS |
| Serverless | Lambda | Cloud Functions | Azure Functions |
| Storage | S3 | Cloud Storage | Blob Storage |
| Database | RDS, DynamoDB | Cloud SQL, Spanner | SQL DB, CosmosDB |
| ML | SageMaker | Vertex AI | Azure ML |
| CDN | CloudFront | Cloud CDN | Azure CDN |
AWS Well-Architected Framework pillars:
Operational Excellence
Security
Reliability
Performance Efficiency
Cost Optimization
Sustainability
┌─────────────────────────────────────────────────────────────┐
│                       Route 53 (DNS)                        │
└─────────────────────────────┬───────────────────────────────┘
                              │
┌─────────────────────────────▼───────────────────────────────┐
│                      CloudFront (CDN)                       │
│               WAF (Web Application Firewall)                │
└─────────────────────────────┬───────────────────────────────┘
                              │
┌─────────────────────────────▼───────────────────────────────┐
│                  Application Load Balancer                  │
└──────────┬───────────────────────────────────┬──────────────┘
           │                                   │
┌──────────▼──────────┐             ┌──────────▼──────────┐
│   ECS/EKS Cluster   │             │   ECS/EKS Cluster   │
│       (AZ-a)        │             │       (AZ-b)        │
└──────────┬──────────┘             └──────────┬──────────┘
           │                                   │
┌──────────▼───────────────────────────────────▼──────────┐
│                   ElastiCache (Redis)                   │
└─────────────────────────────┬───────────────────────────┘
                              │
┌─────────────────────────────▼───────────────────────────┐
│                      RDS Multi-AZ                       │
│                   (Primary + Standby)                   │
└─────────────────────────────────────────────────────────┘
module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "~> 5.0"

  name = "${var.project}-${var.environment}"
  cidr = var.vpc_cidr

  azs             = ["${var.region}a", "${var.region}b", "${var.region}c"]
  private_subnets = var.private_subnets
  public_subnets  = var.public_subnets

  # Non-production environments share a single NAT gateway to reduce cost.
  enable_nat_gateway = true
  single_nat_gateway = var.environment != "production"

  enable_dns_hostnames = true
  enable_dns_support   = true

  tags = local.common_tags
}

module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  version = "~> 19.0"

  cluster_name    = "${var.project}-${var.environment}"
  cluster_version = "1.28"

  # Worker nodes run in the private subnets of the VPC defined above.
  vpc_id     = module.vpc.vpc_id
  subnet_ids = module.vpc.private_subnets

  # The public API endpoint is enabled; restrict the allowed CIDRs in production.
  cluster_endpoint_public_access  = true
  cluster_endpoint_private_access = true

  eks_managed_node_groups = {
    main = {
      instance_types = var.node_instance_types
      min_size       = var.node_min_size
      max_size       = var.node_max_size
      desired_size   = var.node_desired_size
    }
  }

  tags = local.common_tags
}

module "rds" {
  source  = "terraform-aws-modules/rds/aws"
  version = "~> 6.0"

  identifier = "${var.project}-${var.environment}"

  engine               = "postgres"
  engine_version       = "15"
  family               = "postgres15"
  major_engine_version = "15"
  instance_class       = var.db_instance_class

  allocated_storage     = var.db_allocated_storage
  max_allocated_storage = var.db_max_allocated_storage

  db_name  = var.db_name
  username = var.db_username
  port     = 5432

  # Multi-AZ standby only in production.
  multi_az = var.environment == "production"

  # This output is only populated when the vpc module is given database_subnets
  # (not set above). module.security_group is assumed to be defined elsewhere.
  db_subnet_group_name   = module.vpc.database_subnet_group
  vpc_security_group_ids = [module.security_group.security_group_id]

  # Keep longer backups and a final snapshot in production.
  backup_retention_period = var.environment == "production" ? 30 : 7
  skip_final_snapshot     = var.environment != "production"

  tags = local.common_tags
}
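The vpc module above takes `var.vpc_cidr`, `var.private_subnets`, and `var.public_subnets` as inputs, and a mismatch between them typically surfaces only at apply time. As a rough pre-flight sketch (the function name and the example CIDRs are hypothetical, not part of the skill), the standard-library `ipaddress` module can check that every subnet sits inside the VPC range and that no two subnets overlap:

```python
import ipaddress


def validate_subnets(vpc_cidr: str, subnet_cidrs: list[str]) -> list[str]:
    """Return a list of problems; an empty list means the layout is consistent."""
    vpc = ipaddress.ip_network(vpc_cidr)
    subnets = [ipaddress.ip_network(cidr) for cidr in subnet_cidrs]
    problems = []

    # Every subnet must be contained in the VPC CIDR block.
    for net in subnets:
        if not net.subnet_of(vpc):
            problems.append(f"{net} is outside {vpc}")

    # No pair of subnets may share addresses.
    for i, first in enumerate(subnets):
        for second in subnets[i + 1:]:
            if first.overlaps(second):
                problems.append(f"{first} overlaps {second}")

    return problems


if __name__ == "__main__":
    # Hypothetical values standing in for var.vpc_cidr and the subnet lists.
    print(validate_subnets("10.0.0.0/16", ["10.0.1.0/24", "10.0.2.0/24", "10.0.101.0/24"]))
```

Passing the private and public lists together in one call also catches overlaps between the two tiers.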
| Type | Discount | Commitment | Use Case |
|---|---|---|---|
| On-Demand | 0% | None | Variable workloads |
| Reserved | 30-72% | 1-3 years | Steady-state |
| Savings Plans | 30-72% | 1-3 years | Flexible compute |
| Spot | 60-90% | None | Fault-tolerant |
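To make the discount ranges in the table concrete, the sketch below turns an on-demand hourly rate into a best-case and worst-case monthly cost for each pricing type. The discount bounds come straight from the table; the 730 hours-per-month figure, the function name, and the example rate are illustrative assumptions rather than quoted AWS prices.

```python
# Discount ranges (low, high) copied from the pricing table above.
DISCOUNT_RANGES = {
    "on_demand": (0.0, 0.0),
    "reserved": (0.30, 0.72),
    "savings_plans": (0.30, 0.72),
    "spot": (0.60, 0.90),
}

HOURS_PER_MONTH = 730  # assumption: average hours in a month


def monthly_cost_range(on_demand_hourly: float, pricing_type: str,
                       hours: float = HOURS_PER_MONTH) -> tuple[float, float]:
    """Return (best_case, worst_case) monthly cost for one instance."""
    low_discount, high_discount = DISCOUNT_RANGES[pricing_type]
    full_price = on_demand_hourly * hours
    # The highest discount gives the lowest cost, and vice versa.
    return full_price * (1 - high_discount), full_price * (1 - low_discount)


if __name__ == "__main__":
    # Hypothetical on-demand rate of $0.10 per hour.
    for pricing_type in DISCOUNT_RANGES:
        best, worst = monthly_cost_range(0.10, pricing_type)
        print(f"{pricing_type:14s} ${best:7.2f} to ${worst:7.2f} per month")
```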
Right-sizing:
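from datetime import datetime, timedelta, timezone

import boto3


def analyze_utilization(instance_id: str, days: int = 14) -> str:
    """Analyze CPU utilization and return a right-sizing recommendation.

    Memory is not covered: EC2 does not publish memory metrics to the
    AWS/EC2 namespace unless the CloudWatch agent is installed.
    """
    cloudwatch = boto3.client('cloudwatch')
    end_time = datetime.now(timezone.utc)

    metrics = cloudwatch.get_metric_statistics(
        Namespace='AWS/EC2',
        MetricName='CPUUtilization',
        Dimensions=[{'Name': 'InstanceId', 'Value': instance_id}],
        StartTime=end_time - timedelta(days=days),
        EndTime=end_time,
        Period=3600,  # one datapoint per hour
        Statistics=['Average', 'Maximum'],
    )

    datapoints = metrics['Datapoints']
    if not datapoints:
        # Stopped or newly launched instances may have no data; avoid dividing by zero.
        return 'insufficient-data'

    avg_cpu = sum(p['Average'] for p in datapoints) / len(datapoints)
    max_cpu = max(p['Maximum'] for p in datapoints)

    if avg_cpu < 10 and max_cpu < 30:
        return 'downsize'
    elif avg_cpu > 80:
        return 'upsize'
    else:
        return 'optimal'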
Cost Allocation Tags:
required_tags:
  - Environment: production|staging|development
  - Project: project-name
  - Owner: team-name
  - CostCenter: cost-center-id
automation:
  - Untagged resources alert after 24 hours
  - Auto-terminate development resources after 7 days
  - Weekly cost reports by tag
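The tagging policy above lends itself to an automated check. The following is a minimal sketch of such a check, assuming resource tags are already available as a plain dictionary; the function name and the wording of the violation messages are hypothetical, and fetching tags from AWS is left out on purpose.

```python
# Tag keys and allowed Environment values taken from the policy above.
REQUIRED_TAGS = ("Environment", "Project", "Owner", "CostCenter")
ALLOWED_ENVIRONMENTS = {"production", "staging", "development"}


def tag_violations(tags: dict[str, str]) -> list[str]:
    """Return the policy violations for one resource's tags."""
    # A tag that is absent or empty counts as missing.
    violations = [f"missing tag: {key}" for key in REQUIRED_TAGS if not tags.get(key)]

    environment = tags.get("Environment")
    if environment and environment not in ALLOWED_ENVIRONMENTS:
        violations.append(f"invalid Environment: {environment}")

    return violations


if __name__ == "__main__":
    # Hypothetical resource that lacks an Owner and uses an unknown environment.
    print(tag_violations({"Environment": "qa", "Project": "billing", "CostCenter": "cc-42"}))
```

A resource with an empty result is compliant; anything else could feed the 24-hour untagged-resource alert.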
┌─────────────────────────────────────────────────────────────┐
│                    Monthly Cost Summary                     │
├─────────────────────────────────────────────────────────────┤
│  Total: $45,231                vs Last Month: +5%           │
│                                                             │
│  By Service:                   By Environment:              │
│  ├── EC2: $18,500 (41%)        ├── Production: $38,000      │
│  ├── RDS: $12,000 (27%)        ├── Staging: $4,500          │
│  ├── S3: $3,200 (7%)           └── Development: $2,731      │
│  ├── Lambda: $1,800 (4%)                                    │
│  └── Other: $9,731 (21%)       Savings Opportunity: $8,200  │
│                                                             │
│  Recommendations:                                           │
│  • Convert 12 instances to Reserved (save $4,200/mo)        │
│  • Delete 5 unused EBS volumes (save $180/mo)               │
│  • Resize 8 over-provisioned instances (save $1,800/mo)     │
└─────────────────────────────────────────────────────────────┘
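The per-service percentages in the summary add up to exactly 100, which plain rounding of each share would not guarantee. One way to get that property is the largest-remainder method, sketched below with integer arithmetic; this is an assumption about how such a dashboard could be computed, not a description of the skill's `cost_analyzer.py`.

```python
def percentage_shares(costs: dict[str, int]) -> dict[str, int]:
    """Split 100 percentage points across services using the largest-remainder method."""
    total = sum(costs.values())
    if total == 0:
        return {name: 0 for name in costs}

    shares, remainders = {}, {}
    for name, amount in costs.items():
        # Whole percentage points, plus the integer remainder left over.
        shares[name], remainders[name] = divmod(amount * 100, total)

    # Hand the points lost to flooring to the services with the largest remainders.
    leftover = 100 - sum(shares.values())
    for name in sorted(remainders, key=remainders.get, reverse=True)[:leftover]:
        shares[name] += 1

    return shares


if __name__ == "__main__":
    # Dollar figures from the monthly summary above.
    by_service = {"EC2": 18500, "RDS": 12000, "S3": 3200, "Lambda": 1800, "Other": 9731}
    print(percentage_shares(by_service))
    # -> {'EC2': 41, 'RDS': 27, 'S3': 7, 'Lambda': 4, 'Other': 21}
```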
| Strategy | RTO | RPO | Cost |
|---|---|---|---|
| Backup & Restore | Hours | Hours | $ |
| Pilot Light | Minutes | Minutes | $$ |
| Warm Standby | Minutes | Seconds | $$$ |
| Multi-Site Active | Seconds | Near-zero | $$$$ |
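Choosing from the table usually means picking the cheapest strategy that still meets the recovery targets. The sketch below encodes the table rows and compares them by order of magnitude only; the tier ordering (near-zero, seconds, minutes, hours) and the use of the number of `$` signs as a cost rank are simplifying assumptions, and the names are hypothetical.

```python
from typing import NamedTuple, Optional

# Ordinal tiers: a lower number means a tighter (faster) objective.
TIERS = {"near-zero": 0, "seconds": 1, "minutes": 2, "hours": 3}


class Strategy(NamedTuple):
    name: str
    rto: str
    rpo: str
    cost: int  # number of "$" signs in the table


# Rows copied from the DR strategy table above.
STRATEGIES = [
    Strategy("Backup & Restore", "hours", "hours", 1),
    Strategy("Pilot Light", "minutes", "minutes", 2),
    Strategy("Warm Standby", "minutes", "seconds", 3),
    Strategy("Multi-Site Active", "seconds", "near-zero", 4),
]


def cheapest_strategy(max_rto: str, max_rpo: str) -> Optional[Strategy]:
    """Return the lowest-cost strategy whose RTO and RPO tiers meet the targets."""
    candidates = [
        s for s in STRATEGIES
        if TIERS[s.rto] <= TIERS[max_rto] and TIERS[s.rpo] <= TIERS[max_rpo]
    ]
    return min(candidates, key=lambda s: s.cost, default=None)


if __name__ == "__main__":
    print(cheapest_strategy("minutes", "seconds"))
```

A result of `None` signals that no row in the table satisfies the requested targets.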
┌────────────────────────────────────────────────────────────┐
│                    Global Load Balancer                    │
│                   (Route 53 / Cloud DNS)                   │
└──────────────┬─────────────────────────────┬───────────────┘
               │                             │
┌──────────────▼─────────────┐  ┌────────────▼───────────────┐
│       Primary Region       │  │      Secondary Region      │
│        (us-east-1)         │  │        (us-west-2)         │
│                            │  │                            │
│  ┌──────────────────────┐  │  │  ┌──────────────────────┐  │
│  │  Application Layer   │  │  │  │  Application Layer   │  │
│  │       (Active)       │  │  │  │   (Standby/Active)   │  │
│  └──────────┬───────────┘  │  │  └──────────┬───────────┘  │
│             │              │  │             │              │
│  ┌──────────▼───────────┐  │  │  ┌──────────▼───────────┐  │
│  │       Database       │──┼──┼──│       Database       │  │
│  │      (Primary)       │  │  │  │    (Read Replica)    │  │
│  └──────────────────────┘  │  │  └──────────────────────┘  │
└────────────────────────────┘  └────────────────────────────┘
              │
              │ Cross-Region Replication
              ▼
   ┌──────────────────────┐
   │      S3 Backup       │
   │    (Multi-Region)    │
   └──────────────────────┘
backup_policy:
  database:
    frequency: continuous
    retention: 35 days
    cross_region: true
    encryption: aws/rds
  application_data:
    frequency: daily
    retention: 90 days
    versioning: enabled
    lifecycle:
      - transition_to_ia: 30 days
      - transition_to_glacier: 90 days
      - expiration: 365 days
  configuration:
    frequency: on_change
    retention: unlimited
    storage: git + s3
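The `lifecycle` rules under `application_data` describe where an object should live as it ages. A small sketch of that rule set as a function follows; the day thresholds come from the policy above, while the function name and the returned labels are hypothetical and are not S3 storage class identifiers.

```python
def storage_class_for_age(age_days: int) -> str:
    """Map an object's age in days to the tier implied by the lifecycle rules."""
    if age_days < 0:
        raise ValueError("age_days must be non-negative")

    # Check the oldest threshold first so each object lands in exactly one tier.
    if age_days >= 365:
        return "expired"
    if age_days >= 90:
        return "glacier"
    if age_days >= 30:
        return "infrequent_access"
    return "standard"


if __name__ == "__main__":
    for age in (0, 30, 90, 365):
        print(age, storage_class_for_age(age))
```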
┌─────────────────────────────────────────────────────────────┐
│                             VPC                             │
│  ┌───────────────────────────────────────────────────────┐  │
│  │                     Public Subnet                     │  │
│  │  ┌─────────────┐  ┌─────────────┐  ┌───────────────┐  │  │
│  │  │   NAT GW    │  │     ALB     │  │    Bastion    │  │  │
│  │  └─────────────┘  └─────────────┘  └───────────────┘  │  │
│  └───────────────────────────────────────────────────────┘  │
│                              │                              │
│  ┌───────────────────────────▼───────────────────────────┐  │
│  │                    Private Subnet                     │  │
│  │  ┌─────────────┐  ┌─────────────┐  ┌───────────────┐  │  │
│  │  │  App Tier   │  │  App Tier   │  │   App Tier    │  │  │
│  │  └─────────────┘  └─────────────┘  └───────────────┘  │  │
│  └───────────────────────────────────────────────────────┘  │
│                              │                              │
│  ┌───────────────────────────▼───────────────────────────┐  │
│  │                      Data Subnet                      │  │
│  │  ┌─────────────┐  ┌─────────────┐  ┌───────────────┐  │  │
│  │  │     RDS     │  │    Redis    │  │ Elasticsearch │  │  │
│  │  └─────────────┘  └─────────────┘  └───────────────┘  │  │
│  └───────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────┘
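{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "LeastPrivilegeExample",
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:PutObject"
      ],
      "Resource": "arn:aws:s3:::my-bucket/uploads/*",
      "Condition": {
        "StringEquals": {
          "aws:PrincipalTag/Team": "engineering"
        },
        "IpAddress": {
          "aws:VpcSourceIp": ["10.0.0.0/8"]
        }
      }
    }
  ]
}
Note: aws:VpcSourceIp is used for the private 10.0.0.0/8 range because aws:SourceIp matches only public addresses; it applies to requests that reach S3 through a VPC endpoint.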
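The policy above scopes both the actions and the resource narrowly. A quick way to spot statements that do not is a wildcard check over the policy document. The sketch below is a deliberately simple lint under stated assumptions: it only inspects `Allow` statements, only flags `*`, `service:*`, and a bare `*` resource, and is no substitute for a real policy analyzer. The function name and message format are hypothetical.

```python
def find_wildcards(policy: dict) -> list[str]:
    """Flag Allow statements that grant broad actions or a '*' resource."""
    findings = []
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may appear without a list
        statements = [statements]

    for stmt in statements:
        if stmt.get("Effect") != "Allow":
            continue
        sid = stmt.get("Sid", "<no Sid>")

        # Action and Resource may each be a string or a list of strings.
        actions = stmt.get("Action", [])
        resources = stmt.get("Resource", [])
        actions = [actions] if isinstance(actions, str) else actions
        resources = [resources] if isinstance(resources, str) else resources

        for action in actions:
            if action == "*" or action.endswith(":*"):
                findings.append(f"{sid}: broad action {action}")
        if "*" in resources:
            findings.append(f"{sid}: resource is *")

    return findings


if __name__ == "__main__":
    scoped = {
        "Version": "2012-10-17",
        "Statement": [{
            "Sid": "LeastPrivilegeExample",
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:PutObject"],
            "Resource": "arn:aws:s3:::my-bucket/uploads/*",
        }],
    }
    print(find_wildcards(scoped))  # -> []
```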
references/aws_patterns.md - AWS architecture patterns
references/gcp_patterns.md - GCP architecture patterns
references/multi_cloud.md - Multi-cloud strategies
references/cost_optimization.md - Cost optimization guide
# Infrastructure cost analyzer
python scripts/cost_analyzer.py --account production --period monthly
# DR validation
python scripts/dr_test.py --region us-west-2 --type failover
# Security audit
python scripts/security_audit.py --framework cis --output report.html
# Resource inventory
python scripts/inventory.py --accounts all --format csv
Weekly Installs: 63
Repository: borghei/claude-skills
GitHub Stars: 29
First Seen: Jan 24, 2026
Security Audits: Gen Agent Trust Hub (Pass), Socket (Pass), Snyk (Pass)
Installed on: claude-code (48), opencode (40), gemini-cli (38), codex (36), cursor (36), github-copilot (32)