deploying-airflow by astronomer/agents
npx skills add https://github.com/astronomer/agents --skill deploying-airflow
This skill covers deploying Airflow DAGs and projects to production, whether using Astro (Astronomer's managed platform) or open-source Airflow on Docker Compose or Kubernetes.
Choosing a path: Astro is a good fit for managed operations and faster CI/CD. For open-source, use Docker Compose for dev and the Helm chart for production.
Astro provides CLI commands and GitHub integration for deploying Airflow projects.
| Command | What It Does |
|---|---|
| `astro deploy` | Full project deploy — builds Docker image and deploys DAGs |
| `astro deploy --dags` | DAG-only deploy — pushes only DAG files (fast, no image build) |
| `astro deploy --image` | Image-only deploy — pushes only the Docker image (for multi-repo CI/CD) |
| `astro deploy --dbt` | dbt project deploy — deploys a dbt project to run alongside Airflow |
Builds a Docker image from your Astro project and deploys everything (DAGs, plugins, requirements, packages):
astro deploy
Use this when you've changed requirements.txt, Dockerfile, packages.txt, plugins, or any non-DAG file.
Pushes only files in the dags/ directory without rebuilding the Docker image:
astro deploy --dags
This is significantly faster than a full deploy since it skips the image build. Use this when you've only changed DAG files and haven't modified dependencies or configuration.
Pushes only the Docker image without updating DAGs:
astro deploy --image
This is useful in multi-repo setups where DAGs are deployed separately from the image, or in CI/CD pipelines that manage image and DAG deploys independently.
Deploys a dbt project to run with Cosmos on an Astro deployment:
astro deploy --dbt
Astro supports branch-to-deployment mapping for automated deploys (e.g., `main` -> production, `develop` -> staging). Configure this in the Astro UI under Deployment Settings > CI/CD.
Common CI/CD strategies on Astro:
- `astro deploy --dags` for fast iteration during development
- `astro deploy` on merge to `main` for production releases
- `--image` and `--dags` in separate CI jobs for independent release cycles (see the CI sketch below)

When multiple deploys are triggered in quick succession, Astro processes them sequentially in a deploy queue. Each deploy completes before the next one starts.
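As an illustration, a minimal CI step could branch between the two deploy modes based on what changed. This is a sketch, not Astronomer's documented pipeline; it assumes your CI system exports an `ASTRO_API_TOKEN` secret so the CLI can authenticate non-interactively.

# Sketch of a CI deploy step. Assumes the CI environment exports
# an ASTRO_API_TOKEN secret so the astro CLI can authenticate.
if git diff --quiet HEAD~1 -- . ':!dags'; then
  # Only files under dags/ changed: fast DAG-only deploy
  astro deploy --dags
else
  # Dependencies, Dockerfile, or plugins changed: full image deploy
  astro deploy
fi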
Deploy Airflow using the official Docker Compose setup. This is recommended for learning and exploration — for production, use Kubernetes with the Helm chart (see below).
The setup is based on the official `apache/airflow` Docker image. Download the official Airflow 3 Docker Compose file:
curl -LfO 'https://airflow.apache.org/docs/apache-airflow/stable/docker-compose.yaml'
This sets up the full Airflow 3 architecture:
| Service | Purpose |
|---|---|
| `airflow-apiserver` | REST API and UI (port 8080) |
| `airflow-scheduler` | Schedules DAG runs |
| `airflow-dag-processor` | Parses and processes DAG files |
| `airflow-worker` | Executes tasks (CeleryExecutor) |
| `airflow-triggerer` | Handles deferrable/async tasks |
| `postgres` | Metadata database |
| `redis` | Celery message broker |
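Before the first start, the official quickstart has you create the bind-mounted directories, set your host user ID so mounted files stay writable, and run the one-time init service (then start the stack with the commands shown further below):

# Create the host directories the compose file bind-mounts
mkdir -p ./dags ./logs ./plugins ./config
# Run containers as your host user so files in logs/ stay writable
echo "AIRFLOW_UID=$(id -u)" > .env
# One-time database migration and default user creation
docker compose up airflow-init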
For a simpler setup with LocalExecutor (no Celery/Redis), create a docker-compose.yaml:
x-airflow-common: &airflow-common
image: apache/airflow:3 # Use the latest Airflow 3.x release
environment: &airflow-common-env
AIRFLOW__CORE__EXECUTOR: LocalExecutor
AIRFLOW__DATABASE__SQL_ALCHEMY_CONN: postgresql+psycopg2://airflow:airflow@postgres/airflow
AIRFLOW__CORE__LOAD_EXAMPLES: 'false'
AIRFLOW__CORE__DAGS_FOLDER: /opt/airflow/dags
volumes:
- ./dags:/opt/airflow/dags
- ./logs:/opt/airflow/logs
- ./plugins:/opt/airflow/plugins
depends_on:
postgres:
condition: service_healthy
services:
postgres:
image: postgres:16
environment:
POSTGRES_USER: airflow
POSTGRES_PASSWORD: airflow
POSTGRES_DB: airflow
volumes:
- postgres-db-volume:/var/lib/postgresql/data
healthcheck:
test: ["CMD", "pg_isready", "-U", "airflow"]
interval: 10s
retries: 5
start_period: 5s
airflow-init:
<<: *airflow-common
entrypoint: /bin/bash
command:
- -c
- |
airflow db migrate
airflow users create \
--username admin \
--firstname Admin \
--lastname User \
--role Admin \
--email admin@example.com \
--password admin
depends_on:
postgres:
condition: service_healthy
airflow-apiserver:
<<: *airflow-common
command: airflow api-server
ports:
- "8080:8080"
healthcheck:
test: ["CMD", "curl", "--fail", "http://localhost:8080/health"]
interval: 30s
timeout: 10s
retries: 5
start_period: 30s
airflow-scheduler:
<<: *airflow-common
command: airflow scheduler
airflow-dag-processor:
<<: *airflow-common
command: airflow dag-processor
airflow-triggerer:
<<: *airflow-common
command: airflow triggerer
volumes:
postgres-db-volume:
Airflow 3 architecture note: The webserver has been replaced by the API server (`airflow api-server`), and the DAG processor now runs as a standalone process separate from the scheduler.
# Start all services
docker compose up -d
# Stop all services
docker compose down
# View logs
docker compose logs -f airflow-scheduler
# Restart after requirements change
docker compose down && docker compose up -d --build
# Run a one-off Airflow CLI command
docker compose exec airflow-apiserver airflow dags list
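To confirm the API server is up from the host, you can hit the same endpoint the compose healthcheck uses:

# Reports component health (scheduler, metadatabase, ...)
curl --fail http://localhost:8080/health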
Add packages to requirements.txt and rebuild:
# Add to requirements.txt, then:
docker compose down
docker compose up -d --build
Or use a custom Dockerfile:
# Pin to a specific version (e.g., apache/airflow:3.1.7) for reproducibility
FROM apache/airflow:3
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
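To sanity-check the image before wiring it into Compose, you can build and inspect it directly; the `custom-airflow` tag here is just an example name:

# Build the image from the Dockerfile above
docker build -t custom-airflow .
# Verify your extra packages are installed in the image
docker run --rm custom-airflow bash -c "pip list"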
Update docker-compose.yaml to build from the Dockerfile:
x-airflow-common: &airflow-common
build:
context: .
dockerfile: Dockerfile
# ... rest of config
Configure Airflow settings via environment variables in docker-compose.yaml:
environment:
# Core settings
AIRFLOW__CORE__EXECUTOR: LocalExecutor
AIRFLOW__CORE__PARALLELISM: 32
AIRFLOW__CORE__MAX_ACTIVE_TASKS_PER_DAG: 16
# Email
AIRFLOW__EMAIL__EMAIL_BACKEND: airflow.utils.email.send_email_smtp
AIRFLOW__SMTP__SMTP_HOST: smtp.example.com
# Connections (as URI)
AIRFLOW_CONN_MY_DB: postgresql://user:pass@host:5432/db
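You can confirm that a setting was actually picked up by asking a running container for the effective value:

# Prints the effective value, including env-var overrides
docker compose exec airflow-scheduler airflow config get-value core executor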
Deploy Airflow on Kubernetes using the official Apache Airflow Helm chart.
Prerequisites: `kubectl` configured for your cluster and `helm` installed.
# Add the Airflow Helm repo
helm repo add apache-airflow https://airflow.apache.org
helm repo update
# Install with default values
helm install airflow apache-airflow/airflow \
--namespace airflow \
--create-namespace
# Install with custom values
helm install airflow apache-airflow/airflow \
--namespace airflow \
--create-namespace \
-f values.yaml
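After installing, standard Helm commands show the release state and the values you supplied:

# Release status and deployed chart revision
helm status airflow -n airflow
# Values you overrode (add --all to include chart defaults)
helm get values airflow -n airflow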
# Executor type
executor: KubernetesExecutor # or CeleryExecutor, LocalExecutor
# Airflow image (pin to your desired version)
defaultAirflowRepository: apache/airflow
defaultAirflowTag: "3" # Or pin: "3.1.7"
# Git-sync for DAGs (recommended for production)
dags:
gitSync:
enabled: true
repo: https://github.com/your-org/your-dags.git
branch: main
subPath: dags
wait: 60 # seconds between syncs
# API server (replaces webserver in Airflow 3)
apiServer:
resources:
requests:
cpu: "250m"
memory: "512Mi"
limits:
cpu: "500m"
memory: "1Gi"
replicas: 1
# Scheduler
scheduler:
resources:
requests:
cpu: "500m"
memory: "1Gi"
limits:
cpu: "1000m"
memory: "2Gi"
# Standalone DAG processor
dagProcessor:
enabled: true
resources:
requests:
cpu: "250m"
memory: "512Mi"
limits:
cpu: "500m"
memory: "1Gi"
# Triggerer (for deferrable tasks)
triggerer:
resources:
requests:
cpu: "250m"
memory: "512Mi"
limits:
cpu: "500m"
memory: "1Gi"
# Worker resources (CeleryExecutor only)
workers:
resources:
requests:
cpu: "500m"
memory: "1Gi"
limits:
cpu: "2000m"
memory: "4Gi"
replicas: 2
# Log persistence
logs:
persistence:
enabled: true
size: 10Gi
# PostgreSQL (built-in)
postgresql:
enabled: true
# Or use an external database
# postgresql:
# enabled: false
# data:
# metadataConnection:
# user: airflow
# pass: airflow
# host: your-rds-host.amazonaws.com
# port: 5432
# db: airflow
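Before applying a values file, you can render the chart locally or do a dry run against the cluster to catch mistakes early; this is generic Helm workflow, nothing chart-specific:

# Render manifests locally without touching the cluster
helm template airflow apache-airflow/airflow -f values.yaml
# Validate the install server-side without applying it
helm install airflow apache-airflow/airflow \
  --namespace airflow --create-namespace \
  -f values.yaml --dry-run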
# Upgrade with new values
helm upgrade airflow apache-airflow/airflow \
--namespace airflow \
-f values.yaml
# Upgrade to a new Airflow version
helm upgrade airflow apache-airflow/airflow \
--namespace airflow \
--set defaultAirflowTag="<version>"
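If an upgrade misbehaves, Helm can roll the release back to an earlier revision:

# List revisions, then roll back to one of them
helm history airflow -n airflow
helm rollback airflow 1 -n airflow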
# Check pod status
kubectl get pods -n airflow
# View scheduler logs
kubectl logs -f deployment/airflow-scheduler -n airflow
# Port-forward the API server
kubectl port-forward svc/airflow-apiserver 8080:8080 -n airflow
# Run a one-off CLI command
kubectl exec -it deployment/airflow-scheduler -n airflow -- airflow dags list
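A quick way to catch DAGs that fail to parse after a deploy or git-sync is to ask Airflow for import errors directly:

# Lists files the DAG processor failed to parse, with tracebacks
kubectl exec -it deployment/airflow-scheduler -n airflow -- \
  airflow dags list-import-errors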
Tip: use `astro dev` for local development.