runpod by digitalsamba/claude-code-video-toolkit
npx skills add https://github.com/digitalsamba/claude-code-video-toolkit --skill runpod
Run open-source AI models on cloud GPUs via RunPod serverless. Pay-per-second, no minimums.
# 1. Create account at https://runpod.io
# 2. Add API key to .env
echo "RUNPOD_API_KEY=your_key_here" >> .env
# 3. Deploy any tool with --setup
python tools/image_edit.py --setup
python tools/upscale.py --setup
python tools/dewatermark.py --setup
python tools/sadtalker.py --setup
python tools/qwen3_tts.py --setup
Each --setup command deploys the tool's serverless endpoint and saves its endpoint ID to .env (e.g. RUNPOD_QWEN_EDIT_ENDPOINT_ID). All images are public on GHCR — no authentication needed.
| Tool | Docker Image | GPU | VRAM | Typical Cost |
|---|---|---|---|---|
| image_edit | ghcr.io/conalmullan/video-toolkit-qwen-edit:latest | A6000/L40S | 48GB+ | ~$0.05-0.15/job |
| upscale | ghcr.io/conalmullan/video-toolkit-realesrgan:latest | RTX 3090/4090 | 24GB | ~$0.01-0.05/job |
| dewatermark | ghcr.io/conalmullan/video-toolkit-propainter:latest | RTX 3090/4090 | 24GB | ~$0.05-0.30/job |
| sadtalker | ghcr.io/conalmullan/video-toolkit-sadtalker:latest | RTX 4090 | 24GB | ~$0.05-0.15/job |
| qwen3_tts | ghcr.io/conalmullan/video-toolkit-qwen3-tts:latest | ADA 24GB | 24GB | ~$0.01-0.05/job |
Total monthly cost: Rarely exceeds $10 even with heavy use.
All tools follow the same pattern:
Local CLI → Upload input to cloud storage → RunPod API → Poll for result → Download output
- Inputs are uploaded to Cloudflare R2 when credentials are configured (R2_ACCOUNT_ID, R2_ACCESS_KEY_ID, R2_SECRET_ACCESS_KEY, R2_BUCKET_NAME), falling back to free upload services otherwise
- Jobs are submitted to the /run endpoint, then polled at /status/{job_id} until complete

Endpoint configuration:
- workersMin: 0 — Scale to zero when idle (no cost)
- workersMax: 1 — Max concurrent jobs (increase for throughput)
- idleTimeout: 5 — Seconds before worker scales down
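The submit-and-poll flow above can be sketched as follows. `run_job` is a hypothetical helper (not part of the toolkit); the /run, /status/{job_id}, and /cancel/{job_id} routes are the real serverless endpoints, and the payload shape is an assumption:

```python
import time

RUNPOD_V2 = "https://api.runpod.ai/v2"


def run_job(endpoint_id, api_key, payload, poll_s=2.0, timeout_s=300.0, session=None):
    """Submit a job to /run, then poll /status/{job_id} until it finishes.

    `session` is any object with requests-style .post/.get (injectable for
    testing); by default the real `requests` library is used.
    """
    if session is None:
        import requests  # deferred so callers can inject a fake session
        session = requests.Session()
    headers = {"Authorization": f"Bearer {api_key}"}
    base = f"{RUNPOD_V2}/{endpoint_id}"
    job_id = session.post(f"{base}/run", json={"input": payload}, headers=headers).json()["id"]
    deadline = time.time() + timeout_s
    while time.time() < deadline:
        status = session.get(f"{base}/status/{job_id}", headers=headers).json()
        if status["status"] == "COMPLETED":
            return status["output"]
        if status["status"] in ("FAILED", "CANCELLED"):
            raise RuntimeError(f"job {job_id} ended as {status['status']}")
        time.sleep(poll_s)
    # Mirror the toolkit's 5-minute queue timeout: cancel instead of billing forever
    session.post(f"{base}/cancel/{job_id}", headers=headers)
    raise TimeoutError(f"job {job_id} not done within {timeout_s}s")
```

The default timeout_s matches the 300-second queue timeout the toolkit enforces.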
Across all endpoints, you share a total worker pool based on your RunPod plan. If you hit limits, reduce workersMax on endpoints you're not actively using.
Each tool stores its endpoint ID in .env:
| Tool | Env Var |
|---|---|
| image_edit | RUNPOD_QWEN_EDIT_ENDPOINT_ID |
| upscale | RUNPOD_UPSCALE_ENDPOINT_ID |
| dewatermark | RUNPOD_DEWATERMARK_ENDPOINT_ID |
| sadtalker | RUNPOD_SADTALKER_ENDPOINT_ID |
| qwen3_tts | RUNPOD_QWEN3_TTS_ENDPOINT_ID |
To free worker slots without deleting the endpoint, set workersMax=0 via the RunPod dashboard or GraphQL API.
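For illustration, a minimal sketch of reading those endpoint IDs back out of .env (the toolkit's actual loader is not shown in this doc and may use python-dotenv instead):

```python
def load_env(path=".env"):
    """Parse simple KEY=value lines, skipping comments and blanks."""
    env = {}
    with open(path) as f:
        for raw in f:
            line = raw.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            env[key.strip()] = value.strip()
    return env
```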
Use these to query and manage endpoints programmatically. RunPod disables GraphQL introspection, so these field names are verified and must be exact.
All API calls require Authorization: Bearer $RUNPOD_API_KEY.
Management (GraphQL) API: POST https://api.runpod.io/graphql. Serverless job API: https://api.runpod.ai/v2/{endpoint_id}/...

List all endpoints:
query { myself { endpoints { id name gpuIds templateId workersMax workersMin } } }
Current spend rate:
query { myself { currentSpendPerHr spendDetails { localStoragePerHour networkStoragePerHour gpuComputePerHour } } }
List pods:
query { myself { pods { id name runtime { uptimeInSeconds } machine { gpuDisplayName } desiredStatus } } }
Common mistakes: Field names are camelCase with full words — localStoragePerHour, not localStoragePerHr. Endpoints are endpoints, not serverlessWorkers. spending is not a field — use currentSpendPerHr and spendDetails.
Update endpoint GPU or config:
mutation { saveEndpoint(input: {
id: "endpoint_id",
name: "endpoint-name",
templateId: "template_id",
gpuIds: "AMPERE_24",
workersMin: 0,
workersMax: 1
}) { id gpuIds } }
saveEndpoint requires name and templateId even for updates — query first to get current values.
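The query-then-mutate pattern can be sketched with the stdlib. `gql` and `set_workers_max` are hypothetical helpers; the GraphQL URL and the endpoints/saveEndpoint field names come from the queries above:

```python
import json
import urllib.request

GRAPHQL_URL = "https://api.runpod.io/graphql"


def gql(query, api_key, opener=None):
    """POST one GraphQL document; `opener` replaces urllib.request.urlopen in tests."""
    req = urllib.request.Request(
        GRAPHQL_URL,
        data=json.dumps({"query": query}).encode(),
        headers={"Authorization": f"Bearer {api_key}",
                 "Content-Type": "application/json"},
    )
    with (opener or urllib.request.urlopen)(req) as resp:
        return json.loads(resp.read())["data"]


def set_workers_max(endpoint_id, workers_max, api_key, opener=None):
    """Query current endpoint fields first, since saveEndpoint requires name/templateId."""
    endpoints = gql(
        "query { myself { endpoints { id name templateId gpuIds } } }",
        api_key, opener,
    )["myself"]["endpoints"]
    ep = next(e for e in endpoints if e["id"] == endpoint_id)
    mutation = (
        "mutation { saveEndpoint(input: { "
        f"id: \"{ep['id']}\", name: \"{ep['name']}\", "
        f"templateId: \"{ep['templateId']}\", gpuIds: \"{ep['gpuIds']}\", "
        f"workersMin: 0, workersMax: {workers_max} "
        "}) { id workersMax } }"
    )
    return gql(mutation, api_key, opener)["saveEndpoint"]
```

Calling `set_workers_max(endpoint_id, 0, api_key)` frees the worker slot without deleting the endpoint, as described above.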
| Action | Method | URL |
|---|---|---|
| Submit job | POST | /v2/{id}/run |
| Check status | GET | /v2/{id}/status/{job_id} |
| Cancel job | POST | /v2/{id}/cancel/{job_id} |
| List pending | GET | /v2/{id}/requests |
| Health/stats | GET | /v2/{id}/health |
Health response includes job counts and worker state:
{
"jobs": { "completed": 16, "failed": 1, "inProgress": 0, "inQueue": 2, "retried": 0 },
"workers": { "idle": 0, "initializing": 1, "ready": 0, "running": 0, "throttled": 0 }
}
Note: /requests only returns pending/queued jobs. Completed job history is not available via the API — check the RunPod web console for logs.
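The health payload above can serve as a simple scaling signal. `queue_pressure` is a hypothetical helper, not part of the RunPod API; the field names match the /health response shown earlier:

```python
def queue_pressure(health):
    """Queued jobs per worker that could pick one up — a rough signal for
    when to raise workersMax on an endpoint."""
    workers = health["workers"]
    can_serve = workers["idle"] + workers["ready"] + workers["initializing"]
    return health["jobs"]["inQueue"] / max(can_serve, 1)
```

For the sample response above (2 queued jobs, 1 initializing worker) this yields 2.0, suggesting the single worker is oversubscribed.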
| ID | GPU | VRAM | Typical Cost |
|---|---|---|---|
| AMPERE_24 | RTX 3090 | 24GB | ~$0.34/hr |
| ADA_24 | RTX 4090 | 24GB | ~$0.69/hr |
| AMPERE_48 | A6000 | 48GB | ~$0.76/hr |
| AMPERE_80 | A100 | 80GB | ~$1.99/hr |
Availability note: ADA_24 (4090) is frequently throttled/unavailable on RunPod. Always configure endpoints with multiple fallback GPU types (comma-separated) to avoid jobs getting stuck in queue indefinitely:
gpuIds: "AMPERE_24,ADA_24" # Try 3090 first, fall back to 4090
All toolkit tools also enforce a 5-minute queue timeout — if no GPU is available within 300 seconds, the job is automatically cancelled to prevent runaway billing from failed initialization cycles.
R2 uses the S3-compatible API but requires --region auto:
AWS_ACCESS_KEY_ID="$R2_ACCESS_KEY_ID" \
AWS_SECRET_ACCESS_KEY="$R2_SECRET_ACCESS_KEY" \
aws s3api list-objects-v2 \
--bucket "$R2_BUCKET_NAME" \
--endpoint-url "https://${R2_ACCOUNT_ID}.r2.cloudflarestorage.com" \
--region auto
Common mistake: Omitting --region auto causes an InvalidRegionName error. Valid R2 regions: wnam, enam, weur, eeur, apac, oc, auto.
When you push a new Docker image version, RunPod may still use the cached old one. To force a pull:
Update the endpoint's imageName to use @sha256:DIGEST notation, returning to the :latest tag after confirming the new image works. If cold starts are a problem, set workersMin: 1 (costs money when idle).
The model needs more VRAM than the GPU provides. Options:
- Lower --resize-ratio (default 0.5 for safety)
- Reduce --steps

You've hit your plan's concurrent worker limit. Set workersMax=0 on endpoints you're not using.

All Dockerfiles live in docker/runpod-*/. Images use runpod/pytorch as the base to share layers across tools.
Building for RunPod (from Apple Silicon Mac):
docker buildx build --platform linux/amd64 -t ghcr.io/conalmullan/video-toolkit-<name>:latest docker/runpod-<name>/
docker push ghcr.io/conalmullan/video-toolkit-<name>:latest
GHCR packages default to private — you must manually make them public for RunPod to pull them. Go to GitHub > Packages > Package Settings > Change Visibility.
To keep costs down:
- workersMin: 0 on all endpoints (scale to zero)
- workersMax=0 to disable idle endpoints without deleting them