baoyu-image-gen: a multi-platform AI image generation tool supporting mainstream APIs such as OpenAI, Google, and Alibaba Tongyi

baoyu-image-gen by jimliu/baoyu-skills

14,400 weekly installs

11,600 GitHub Stars

GitHub

Install command

npx skills add https://github.com/jimliu/baoyu-skills --skill baoyu-image-gen

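AI/Machine Learning, Development, Design

🇨🇳 Chinese introduction

Image Generation (AI SDK)

Official API-based image generation. Supports the OpenAI, Azure OpenAI, Google, OpenRouter, DashScope (阿里通义万象), Jimeng (即梦), Seedream (豆包), and Replicate providers.

Script Directory

Agent Execution:

{baseDir} = the directory containing this SKILL.md file
Script path = {baseDir}/scripts/main.ts
Resolve the ${BUN_X} runtime: if bun is installed → bun; if npx is available → npx -y bun; otherwise suggest installing bun

Step 0: Load Preferences ⛔ BLOCKING

CRITICAL: This step MUST complete BEFORE any image generation. Do NOT skip or defer it.

Check whether EXTEND.md exists (priority: project → user):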


# macOS, Linux, WSL, Git Bash
test -f .baoyu-skills/baoyu-image-gen/EXTEND.md && echo "project"
test -f "${XDG_CONFIG_HOME:-$HOME/.config}/baoyu-skills/baoyu-image-gen/EXTEND.md" && echo "xdg"
test -f "$HOME/.baoyu-skills/baoyu-image-gen/EXTEND.md" && echo "user"



# PowerShell (Windows)
if (Test-Path .baoyu-skills/baoyu-image-gen/EXTEND.md) { "project" }
$xdg = if ($env:XDG_CONFIG_HOME) { $env:XDG_CONFIG_HOME } else { "$HOME/.config" }
if (Test-Path "$xdg/baoyu-skills/baoyu-image-gen/EXTEND.md") { "xdg" }
if (Test-Path "$HOME/.baoyu-skills/baoyu-image-gen/EXTEND.md") { "user" }

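Result	Action
Found	Load, parse, and apply settings. If `default_model.[provider]` is null → ask for the model only (Flow 2)
Not found	⛔ Run first-time setup (references/config/first-time-setup.md) → save EXTEND.md → then continue

CRITICAL: If not found, complete the full setup (provider + model + quality + save location) using AskUserQuestion BEFORE generating any images. Generation is BLOCKED until EXTEND.md is created.

Path	Location
`.baoyu-skills/baoyu-image-gen/EXTEND.md`	Project directory
`$HOME/.baoyu-skills/baoyu-image-gen/EXTEND.md`	User home

Schema: references/config/preferences-schema.md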

Usage

# Basic
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image cat.png

# With aspect ratio
${BUN_X} {baseDir}/scripts/main.ts --prompt "A landscape" --image out.png --ar 16:9

# High quality
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image out.png --quality 2k

# From prompt files
${BUN_X} {baseDir}/scripts/main.ts --promptfiles system.md content.md --image out.png

# With reference images (Google, OpenAI, Azure OpenAI, OpenRouter, Replicate, or Seedream 4.0/4.5/5.0)
${BUN_X} {baseDir}/scripts/main.ts --prompt "Make blue" --image out.png --ref source.png

# With reference images (explicit provider/model)
${BUN_X} {baseDir}/scripts/main.ts --prompt "Make blue" --image out.png --provider google --model gemini-3-pro-image-preview --ref source.png

# Azure OpenAI (model means deployment name)
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image out.png --provider azure --model gpt-image-1.5

# OpenRouter (recommended default model)
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image out.png --provider openrouter

# OpenRouter with reference images
${BUN_X} {baseDir}/scripts/main.ts --prompt "Make blue" --image out.png --provider openrouter --model google/gemini-3.1-flash-image-preview --ref source.png

# Specific provider
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image out.png --provider openai

# DashScope (阿里通义万象)
${BUN_X} {baseDir}/scripts/main.ts --prompt "一只可爱的猫" --image out.png --provider dashscope

# DashScope Qwen-Image 2.0 Pro (recommended for custom sizes and text rendering)
${BUN_X} {baseDir}/scripts/main.ts --prompt "为咖啡品牌设计一张 21:9 横幅海报，包含清晰中文标题" --image out.png --provider dashscope --model qwen-image-2.0-pro --size 2048x872

# DashScope legacy Qwen fixed-size model
${BUN_X} {baseDir}/scripts/main.ts --prompt "一张电影感海报" --image out.png --provider dashscope --model qwen-image-max --size 1664x928

# Replicate (google/nano-banana-pro)
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image out.png --provider replicate

# Replicate with specific model
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image out.png --provider replicate --model google/nano-banana

# Batch mode with saved prompt files
${BUN_X} {baseDir}/scripts/main.ts --batchfile batch.json

# Batch mode with explicit worker count
${BUN_X} {baseDir}/scripts/main.ts --batchfile batch.json --jobs 4 --json

Batch File Format

{
  "jobs": 4,
  "tasks": [
    {
      "id": "hero",
      "promptFiles": ["prompts/hero.md"],
      "image": "out/hero.png",
      "provider": "replicate",
      "model": "google/nano-banana-pro",
      "ar": "16:9",
      "quality": "2k"
    },
    {
      "id": "diagram",
      "promptFiles": ["prompts/diagram.md"],
      "image": "out/diagram.png",
      "ref": ["references/original.png"]
    }
  ]
}

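Paths in promptFiles, image, and ref are resolved relative to the batch file's directory. jobs is optional (overridden by CLI --jobs). A top-level array format (without the jobs wrapper) is also accepted.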

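Options

Option	Description
`--prompt <text>`, `-p`	Prompt text
`--promptfiles <files...>`	Read prompt from files (concatenated)
`--image <path>`	Output image path (required in single-image mode)
`--batchfile <path>`	JSON batch file for multi-image generation
`--jobs <count>`	Worker count for batch mode (default: auto, max from config, built-in default 10)
`--provider google|openai|…`	Provider to use
`--model <id>`, `-m`	Model ID (Google: `gemini-3-pro-image-preview`; OpenAI: `gpt-image-1.5`; Azure: deployment name such as `gpt-image-1.5` or `image-prod`; OpenRouter: `google/gemini-3.1-flash-image-preview`; DashScope: `qwen-image-2.0-pro`)
`--ar <ratio>`	Aspect ratio (e.g., `16:9`, `1:1`, `4:3`)
`--size <WxH>`	Size (e.g., `1024x1024`)
`--quality normal|2k`	Quality preset (default: `2k`)
`--imageSize 1K|2K|4K`	Image size for Google/OpenRouter
`--ref <files...>`	Reference images. Supported by Google multimodal, OpenAI GPT Image edits, Azure OpenAI edits (PNG/JPG only), OpenRouter multimodal models, Replicate, and Seedream 5.0/4.5/4.0. Not supported by Jimeng, Seedream 3.0, or the removed SeedEdit 3.0
`--n <count>`	Number of images
`--json`	JSON output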

Environment Variables

Variable	Description
`OPENAI_API_KEY`	OpenAI API key
`AZURE_OPENAI_API_KEY`	Azure OpenAI API key
`OPENROUTER_API_KEY`	OpenRouter API key
`GOOGLE_API_KEY`	Google API key
`DASHSCOPE_API_KEY`	DashScope API key (阿里云)
`REPLICATE_API_TOKEN`	Replicate API token
`JIMENG_ACCESS_KEY_ID`	Jimeng (即梦) Volcengine access key ID
`JIMENG_SECRET_ACCESS_KEY`	Jimeng (即梦) Volcengine secret access key
`ARK_API_KEY`	Seedream (豆包) Volcengine ARK API key
`OPENAI_IMAGE_MODEL`	OpenAI model override
`AZURE_OPENAI_DEPLOYMENT`	Azure default deployment name
`AZURE_OPENAI_IMAGE_MODEL`	Backward-compatible alias for the Azure default deployment/model name
`OPENROUTER_IMAGE_MODEL`	OpenRouter model override (default: `google/gemini-3.1-flash-image-preview`)
`GOOGLE_IMAGE_MODEL`	Google model override
`DASHSCOPE_IMAGE_MODEL`	DashScope model override (default: `qwen-image-2.0-pro`)
`REPLICATE_IMAGE_MODEL`	Replicate model override (default: google/nano-banana-pro)
`JIMENG_IMAGE_MODEL`	Jimeng model override (default: jimeng_t2i_v40)
`SEEDREAM_IMAGE_MODEL`	Seedream model override (default: doubao-seedream-5-0-260128)
`OPENAI_BASE_URL`	Custom OpenAI endpoint
`AZURE_OPENAI_BASE_URL`	Azure resource endpoint or deployment endpoint
`AZURE_API_VERSION`	Azure image API version (default: `2025-04-01-preview`)
`OPENROUTER_BASE_URL`	Custom OpenRouter endpoint (default: `https://openrouter.ai/api/v1`)
`OPENROUTER_HTTP_REFERER`	Optional app/site URL for OpenRouter attribution
`OPENROUTER_TITLE`	Optional app name for OpenRouter attribution
`GOOGLE_BASE_URL`	Custom Google endpoint
`DASHSCOPE_BASE_URL`	Custom DashScope endpoint
`REPLICATE_BASE_URL`	Custom Replicate endpoint
`JIMENG_BASE_URL`	Custom Jimeng endpoint (default: `https://visual.volcengineapi.com`)
`JIMENG_REGION`	Jimeng region (default: `cn-north-1`)
`SEEDREAM_BASE_URL`	Custom Seedream endpoint (default: `https://ark.cn-beijing.volces.com/api/v3`)
`BAOYU_IMAGE_GEN_MAX_WORKERS`	Override the batch worker cap
`BAOYU_IMAGE_GEN_<PROVIDER>_CONCURRENCY`	Override provider concurrency, e.g. `BAOYU_IMAGE_GEN_REPLICATE_CONCURRENCY`
`BAOYU_IMAGE_GEN_<PROVIDER>_START_INTERVAL_MS`	Override provider start interval, e.g. `BAOYU_IMAGE_GEN_REPLICATE_START_INTERVAL_MS`

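Load Priority: CLI args > EXTEND.md > env vars > <cwd>/.baoyu-skills/.env > ~/.baoyu-skills/.env

Model Resolution

Model priority (highest → lowest), applies to all providers:

CLI flag: --model <id>
EXTEND.md: default_model.[provider]
Env var: <PROVIDER>_IMAGE_MODEL (e.g., GOOGLE_IMAGE_MODEL)
Built-in default

For Azure, --model / default_model.azure should be the Azure deployment name. AZURE_OPENAI_DEPLOYMENT is the preferred env var, and AZURE_OPENAI_IMAGE_MODEL remains as a backward-compatible alias.

EXTEND.md overrides env vars. If both EXTEND.md default_model.google: "gemini-3-pro-image-preview" and env var GOOGLE_IMAGE_MODEL=gemini-3.1-flash-image-preview exist, EXTEND.md wins.

Agent MUST display model info before each generation:

Show: Using [provider] / [model]
Show switch hint: Switch model: --model <id> | EXTEND.md default_model.[provider] | env <PROVIDER>_IMAGE_MODEL

DashScope Models

Use --model qwen-image-2.0-pro or set default_model.dashscope / DASHSCOPE_IMAGE_MODEL when the user wants official Qwen-Image behavior.

Official DashScope model families:

qwen-image-2.0-pro, qwen-image-2.0-pro-2026-03-03, qwen-image-2.0, qwen-image-2.0-2026-03-03
- Free-form size in width*height format
- Total pixels must stay between 512*512 and 2048*2048
- Default size is approximately 1024*1024
- Best choice for custom ratios such as 21:9 and text-heavy Chinese/English layouts
qwen-image-max, qwen-image-max-2025-12-30, qwen-image-plus, qwen-image-plus-2026-01-09, qwen-image
- Fixed sizes only: 1664*928, 1472*1104, 1328*1328, 1104*1472, 928*1664
- Default size is 1664*928
- qwen-image currently has the same capability as qwen-image-plus
Legacy DashScope models such as z-image-turbo, z-image-ultra, wanx-v1
- Keep using them only when the user explicitly asks for legacy behavior or compatibility

When translating CLI args into DashScope behavior:

--size wins over --ar
For qwen-image-2.0*, prefer explicit --size; otherwise infer from --ar and use the official recommended resolutions below
For qwen-image-max/plus/image, only use the five official fixed sizes; if the requested ratio is not covered, switch to qwen-image-2.0-pro
--quality is a baoyu-image-gen compatibility preset, not a native DashScope API field. Mapping normal / 2k onto the qwen-image-2.0* table below is an implementation inference, not an official API guarantee

Recommended qwen-image-2.0* sizes for common aspect ratios:

Ratio	`normal`	`2k`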
`1:1`	`1024*1024`	`1536*1536`
`2:3`	`768*1152`	`1024*1536`
`3:2`	`1152*768`	`1536*1024`
`3:4`	`960*1280`	`1080*1440`
`4:3`	`1280*960`	`1440*1080`
`9:16`	`720*1280`	`1080*1920`
`16:9`	`1280*720`	`1920*1080`
`21:9`	`1344*576`	`2048*872`

DashScope official APIs also expose negative_prompt, prompt_extend, and watermark, but baoyu-image-gen does not currently expose them as dedicated CLI flags.

OpenRouter Models

Use full OpenRouter model IDs, e.g.:

google/gemini-3.1-flash-image-preview (recommended; supports image output and reference-image workflows)
google/gemini-2.5-flash-image-preview
black-forest-labs/flux.2-pro
Other OpenRouter image-capable model IDs

Notes:

OpenRouter image generation uses /chat/completions, not the OpenAI /images endpoints
If --ref is used, choose a multimodal model that supports image input and image output
--imageSize maps to OpenRouter imageGenerationOptions.size; --size <WxH> is converted to the nearest OpenRouter size and inferred aspect ratio when possible

Replicate Models

Supported model formats:

owner/name (recommended for official models), e.g. google/nano-banana-pro
owner/name:version (community models by version), e.g. stability-ai/sdxl:<version>

Examples:

# Use Replicate default model
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image out.png --provider replicate

# Override model explicitly
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image out.png --provider replicate --model google/nano-banana

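Provider Selection

--ref provided + no --provider → auto-select Google first, then OpenAI, then OpenRouter, then Replicate (Jimeng and Seedream do not support reference images)
--provider specified → use it (if --ref is used, it must be google, openai, openrouter, or replicate)
Only one API key available → use that provider
Multiple available → default to Google

Quality Presets

Preset	Google imageSize	OpenAI Size	OpenRouter size	Replicate resolution	Use Case
`normal`	1K	1024px	1K	1K	Quick previews
`2k` (default)	2K	2048px	2K	2K	Covers, illustrations, infographics

Google/OpenRouter imageSize: can be overridden with --imageSize 1K|2K|4K

Aspect Ratios

Supported: 1:1, 16:9, 9:16, 4:3, 3:4, 2.35:1

Google multimodal: uses imageConfig.aspectRatio
OpenAI: maps to closest supported size
OpenRouter: sends imageGenerationOptions.aspect_ratio; if only --size <WxH> is given, aspect ratio is inferred automatically
Replicate: passes aspect_ratio to model; when --ref is provided without --ar, defaults to match_input_image

Generation Mode

Default: Sequential generation.

Batch Parallel Generation: When --batchfile contains 2 or more pending tasks, the script automatically enables parallel generation.

Mode	When to Use
Sequential (default)	Normal usage, single images, small batches
Parallel batch	Batch mode with 2+ tasks

Execution choice:

Situation	Preferred approach	Why
One image, or 1-2 simple images	Sequential	Lower coordination overhead and easier debugging
Multiple images already have saved prompt files	Batch (`--batchfile`)	Reuses finalized prompts, applies shared throttling/retries, and gives predictable throughput
Each image still needs separate reasoning, prompt writing, or style exploration	Subagents	The work is still exploratory, so each image may need independent analysis before generation
Output comes from `baoyu-article-illustrator` with `outline.md` + `prompts/`	Batch (`build-batch.ts` -> `--batchfile`)	That workflow already produces prompt files, so direct batch execution is the intended path

Rule of thumb:

Prefer batch over subagents once prompt files are already saved and the task is "generate all of these"
Use subagents only when generation is coupled with per-image thinking, rewriting, or divergent creative exploration

Parallel behavior:

Default worker count is automatic, capped by config, built-in default 10
Provider-specific throttling is applied only in batch mode, and the built-in defaults are tuned for faster throughput while still avoiding obvious RPM bursts
You can override worker count with --jobs <count>
Each image retries automatically up to 3 attempts
Final output includes success count, failure count, and per-image failure reasons

Error Handling

Missing API key → error with setup instructions
Generation failure → auto-retry up to 3 attempts per image
Invalid aspect ratio → warning, proceed with default
Reference images with unsupported provider/model → error with fix hint

Extension Support

Custom configurations via EXTEND.md. See the Preferences section for paths and supported options.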

🇺🇸English

Image Generation (AI SDK)

Official API-based image generation. Supports OpenAI, Azure OpenAI, Google, OpenRouter, DashScope (阿里通义万象), Jimeng (即梦), Seedream (豆包) and Replicate providers.

Script Directory

Agent Execution:

{baseDir} = this SKILL.md file's directory
Script path = {baseDir}/scripts/main.ts
Resolve ${BUN_X} runtime: if bun installed → bun; if npx available → npx -y bun; else suggest installing bun
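
The runtime resolution rule above can be read as a small decision function. The following is only a sketch: `resolveBunX` and its two boolean inputs are hypothetical names, and how the agent actually detects `bun` or `npx` is not specified here.

```typescript
// Hypothetical helper: picks the ${BUN_X} command from what is installed.
// The booleans stand in for whatever detection the agent performs.
function resolveBunX(hasBun: boolean, hasNpx: boolean): string | null {
  if (hasBun) return "bun";          // bun installed → use it directly
  if (hasNpx) return "npx -y bun";   // otherwise run bun through npx
  return null;                       // neither found → suggest installing bun
}
```

A `null` result corresponds to the "suggest installing bun" branch.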

Step 0: Load Preferences ⛔ BLOCKING

CRITICAL: This step MUST complete BEFORE any image generation. Do NOT skip or defer.

Check EXTEND.md existence (priority: project → user):

# macOS, Linux, WSL, Git Bash
test -f .baoyu-skills/baoyu-image-gen/EXTEND.md && echo "project"
test -f "${XDG_CONFIG_HOME:-$HOME/.config}/baoyu-skills/baoyu-image-gen/EXTEND.md" && echo "xdg"
test -f "$HOME/.baoyu-skills/baoyu-image-gen/EXTEND.md" && echo "user"



# PowerShell (Windows)
if (Test-Path .baoyu-skills/baoyu-image-gen/EXTEND.md) { "project" }
$xdg = if ($env:XDG_CONFIG_HOME) { $env:XDG_CONFIG_HOME } else { "$HOME/.config" }
if (Test-Path "$xdg/baoyu-skills/baoyu-image-gen/EXTEND.md") { "xdg" }
if (Test-Path "$HOME/.baoyu-skills/baoyu-image-gen/EXTEND.md") { "user" }
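
The same lookup can be sketched in TypeScript. This is an illustration under assumptions: `extendMdCandidates` and `findExtendMd` are hypothetical names, paths are joined POSIX-style, and the search order simply mirrors the order of the shell checks above (project, then XDG config, then user home).

```typescript
type ExtendLocation = { label: "project" | "xdg" | "user"; file: string };

// Builds the three candidate paths checked by the shell snippets above.
// xdgConfigHome falls back to "<home>/.config" when unset or empty.
function extendMdCandidates(cwd: string, home: string, xdgConfigHome?: string): ExtendLocation[] {
  const tail = "baoyu-skills/baoyu-image-gen/EXTEND.md";
  const xdg = xdgConfigHome ? xdgConfigHome : home + "/.config";
  return [
    { label: "project", file: cwd + "/." + tail },
    { label: "xdg", file: xdg + "/" + tail },
    { label: "user", file: home + "/." + tail },
  ];
}

// Returns the first candidate that exists, or null when first-time setup is needed.
// `exists` is injected so the sketch stays independent of any file-system API.
function findExtendMd(candidates: ExtendLocation[], exists: (file: string) => boolean): ExtendLocation | null {
  for (const candidate of candidates) {
    if (exists(candidate.file)) return candidate;
  }
  return null;
}
```

A `null` result maps to the "Not found" row: run first-time setup before generating anything.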

Result	Action
Found	Load, parse, apply settings. If `default_model.[provider]` is null → ask model only (Flow 2)
Not found	⛔ Run first-time setup (references/config/first-time-setup.md) → Save EXTEND.md → Then continue

CRITICAL: If not found, complete the full setup (provider + model + quality + save location) using AskUserQuestion BEFORE generating any images. Generation is BLOCKED until EXTEND.md is created.

Path	Location
`.baoyu-skills/baoyu-image-gen/EXTEND.md`	Project directory
`$HOME/.baoyu-skills/baoyu-image-gen/EXTEND.md`	User home

Schema: references/config/preferences-schema.md

Usage

# Basic
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image cat.png

# With aspect ratio
${BUN_X} {baseDir}/scripts/main.ts --prompt "A landscape" --image out.png --ar 16:9

# High quality
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image out.png --quality 2k

# From prompt files
${BUN_X} {baseDir}/scripts/main.ts --promptfiles system.md content.md --image out.png

# With reference images (Google, OpenAI, Azure OpenAI, OpenRouter, Replicate, or Seedream 4.0/4.5/5.0)
${BUN_X} {baseDir}/scripts/main.ts --prompt "Make blue" --image out.png --ref source.png

# With reference images (explicit provider/model)
${BUN_X} {baseDir}/scripts/main.ts --prompt "Make blue" --image out.png --provider google --model gemini-3-pro-image-preview --ref source.png

# Azure OpenAI (model means deployment name)
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image out.png --provider azure --model gpt-image-1.5

# OpenRouter (recommended default model)
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image out.png --provider openrouter

# OpenRouter with reference images
${BUN_X} {baseDir}/scripts/main.ts --prompt "Make blue" --image out.png --provider openrouter --model google/gemini-3.1-flash-image-preview --ref source.png

# Specific provider
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image out.png --provider openai

# DashScope (阿里通义万象)
${BUN_X} {baseDir}/scripts/main.ts --prompt "一只可爱的猫" --image out.png --provider dashscope

# DashScope Qwen-Image 2.0 Pro (recommended for custom sizes and text rendering)
${BUN_X} {baseDir}/scripts/main.ts --prompt "为咖啡品牌设计一张 21:9 横幅海报，包含清晰中文标题" --image out.png --provider dashscope --model qwen-image-2.0-pro --size 2048x872

# DashScope legacy Qwen fixed-size model
${BUN_X} {baseDir}/scripts/main.ts --prompt "一张电影感海报" --image out.png --provider dashscope --model qwen-image-max --size 1664x928

# Replicate (google/nano-banana-pro)
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image out.png --provider replicate

# Replicate with specific model
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image out.png --provider replicate --model google/nano-banana

# Batch mode with saved prompt files
${BUN_X} {baseDir}/scripts/main.ts --batchfile batch.json

# Batch mode with explicit worker count
${BUN_X} {baseDir}/scripts/main.ts --batchfile batch.json --jobs 4 --json

Batch File Format

{
  "jobs": 4,
  "tasks": [
    {
      "id": "hero",
      "promptFiles": ["prompts/hero.md"],
      "image": "out/hero.png",
      "provider": "replicate",
      "model": "google/nano-banana-pro",
      "ar": "16:9",
      "quality": "2k"
    },
    {
      "id": "diagram",
      "promptFiles": ["prompts/diagram.md"],
      "image": "out/diagram.png",
      "ref": ["references/original.png"]
    }
  ]
}

Paths in promptFiles, image, and ref are resolved relative to the batch file's directory. jobs is optional (overridden by CLI --jobs). Top-level array format (without jobs wrapper) is also accepted.
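
To make the two accepted shapes and the path rule concrete, here is a minimal sketch of a normalizer. It is not the script's actual code: `normalizeBatch`, `resolveFrom`, and the `BatchTask` type are hypothetical, and path joining is simplified to POSIX-style strings.

```typescript
interface BatchTask {
  id?: string;
  promptFiles: string[];
  image: string;
  provider?: string;
  model?: string;
  ar?: string;
  quality?: string;
  ref?: string[];
}

// Resolves a path against the batch file's directory (absolute POSIX paths pass through).
function resolveFrom(baseDir: string, p: string): string {
  return p.charAt(0) === "/" ? p : baseDir.replace(/\/+$/, "") + "/" + p;
}

// Accepts either { jobs, tasks } or a bare top-level array of tasks.
// A CLI --jobs value, when given, overrides the jobs field from the file.
function normalizeBatch(raw: unknown, batchDir: string, cliJobs?: number): { jobs: number | undefined; tasks: BatchTask[] } {
  const rawTasks: unknown = Array.isArray(raw) ? raw : (raw as { tasks?: unknown }).tasks;
  const fileJobs: number | undefined = Array.isArray(raw) ? undefined : (raw as { jobs?: number }).jobs;
  if (!Array.isArray(rawTasks)) throw new Error("batch file needs a tasks array");
  const tasks = (rawTasks as BatchTask[]).map((t) => {
    const resolved: BatchTask = {
      ...t,
      promptFiles: t.promptFiles.map((p) => resolveFrom(batchDir, p)),
      image: resolveFrom(batchDir, t.image),
    };
    if (t.ref) resolved.ref = t.ref.map((p) => resolveFrom(batchDir, p));
    return resolved;
  });
  return { jobs: cliJobs ?? fileJobs, tasks };
}
```

Fields such as `provider`, `model`, `ar`, and `quality` pass through untouched; only the three path-bearing fields are rewritten.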

Options

Option	Description
`--prompt <text>`, `-p`	Prompt text
`--promptfiles <files...>`	Read prompt from files (concatenated)
`--image <path>`	Output image path (required in single-image mode)
`--batchfile <path>`	JSON batch file for multi-image generation
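`--jobs <count>`	Worker count for batch mode (default: auto, max from config, built-in default 10)
`--provider google|openai|…`	Provider to use
`--model <id>`, `-m`	Model ID (Google: `gemini-3-pro-image-preview`; OpenAI: `gpt-image-1.5`; Azure: deployment name such as `gpt-image-1.5` or `image-prod`; OpenRouter: `google/gemini-3.1-flash-image-preview`; DashScope: `qwen-image-2.0-pro`)
`--ar <ratio>`	Aspect ratio (e.g., `16:9`, `1:1`, `4:3`)
`--size <WxH>`	Size (e.g., `1024x1024`)
`--quality normal|2k`	Quality preset (default: `2k`)
`--imageSize 1K|2K|4K`	Image size for Google/OpenRouter
`--ref <files...>`	Reference images. Supported by Google multimodal, OpenAI GPT Image edits, Azure OpenAI edits (PNG/JPG only), OpenRouter multimodal models, Replicate, and Seedream 5.0/4.5/4.0. Not supported by Jimeng, Seedream 3.0, or the removed SeedEdit 3.0
`--n <count>`	Number of images
`--json`	JSON output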

Environment Variables

Variable	Description
`OPENAI_API_KEY`	OpenAI API key
`AZURE_OPENAI_API_KEY`	Azure OpenAI API key
`OPENROUTER_API_KEY`	OpenRouter API key
`GOOGLE_API_KEY`	Google API key
`DASHSCOPE_API_KEY`	DashScope API key (阿里云)
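`REPLICATE_API_TOKEN`	Replicate API token
`JIMENG_ACCESS_KEY_ID`	Jimeng (即梦) Volcengine access key ID
`JIMENG_SECRET_ACCESS_KEY`	Jimeng (即梦) Volcengine secret access key
`ARK_API_KEY`	Seedream (豆包) Volcengine ARK API key
`OPENAI_IMAGE_MODEL`	OpenAI model override
`AZURE_OPENAI_DEPLOYMENT`	Azure default deployment name
`AZURE_OPENAI_IMAGE_MODEL`	Backward-compatible alias for the Azure default deployment/model name
`OPENROUTER_IMAGE_MODEL`	OpenRouter model override (default: `google/gemini-3.1-flash-image-preview`)
`GOOGLE_IMAGE_MODEL`	Google model override
`DASHSCOPE_IMAGE_MODEL`	DashScope model override (default: `qwen-image-2.0-pro`)
`REPLICATE_IMAGE_MODEL`	Replicate model override (default: google/nano-banana-pro)
`JIMENG_IMAGE_MODEL`	Jimeng model override (default: jimeng_t2i_v40)
`SEEDREAM_IMAGE_MODEL`	Seedream model override (default: doubao-seedream-5-0-260128)
`OPENAI_BASE_URL`	Custom OpenAI endpoint
`AZURE_OPENAI_BASE_URL`	Azure resource endpoint or deployment endpoint
`AZURE_API_VERSION`	Azure image API version (default: `2025-04-01-preview`)
`OPENROUTER_BASE_URL`	Custom OpenRouter endpoint (default: `https://openrouter.ai/api/v1`)
`OPENROUTER_HTTP_REFERER`	Optional app/site URL for OpenRouter attribution
`OPENROUTER_TITLE`	Optional app name for OpenRouter attribution
`GOOGLE_BASE_URL`	Custom Google endpoint
`DASHSCOPE_BASE_URL`	Custom DashScope endpoint
`REPLICATE_BASE_URL`	Custom Replicate endpoint
`JIMENG_BASE_URL`	Custom Jimeng endpoint (default: `https://visual.volcengineapi.com`)
`JIMENG_REGION`	Jimeng region (default: `cn-north-1`)
`SEEDREAM_BASE_URL`	Custom Seedream endpoint (default: `https://ark.cn-beijing.volces.com/api/v3`)
`BAOYU_IMAGE_GEN_MAX_WORKERS`	Override the batch worker cap
`BAOYU_IMAGE_GEN_<PROVIDER>_CONCURRENCY`	Override provider concurrency, e.g. `BAOYU_IMAGE_GEN_REPLICATE_CONCURRENCY`
`BAOYU_IMAGE_GEN_<PROVIDER>_START_INTERVAL_MS`	Override provider start interval, e.g. `BAOYU_IMAGE_GEN_REPLICATE_START_INTERVAL_MS`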

Load Priority: CLI args > EXTEND.md > env vars > <cwd>/.baoyu-skills/.env > ~/.baoyu-skills/.env

Model Resolution

Model priority (highest → lowest), applies to all providers:

CLI flag: --model <id>
EXTEND.md: default_model.[provider]
Env var: <PROVIDER>_IMAGE_MODEL (e.g., GOOGLE_IMAGE_MODEL)
Built-in default

For Azure, --model / default_model.azure should be the Azure deployment name. AZURE_OPENAI_DEPLOYMENT is the preferred env var, and AZURE_OPENAI_IMAGE_MODEL remains as a backward-compatible alias.

EXTEND.md overrides env vars. If both EXTEND.md default_model.google: "gemini-3-pro-image-preview" and env var GOOGLE_IMAGE_MODEL=gemini-3.1-flash-image-preview exist, EXTEND.md wins.
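
The priority chain can be sketched as a first-match lookup. `resolveModel`, `modelBanner`, and the `ModelSources` shape are hypothetical names for illustration; the sketch also leaves out the case where `default_model.[provider]` is null and the agent asks the user.

```typescript
interface ModelSources {
  cliModel?: string;       // --model <id>
  extendDefault?: string;  // default_model.[provider] from EXTEND.md
  envModel?: string;       // <PROVIDER>_IMAGE_MODEL
  builtinDefault: string;  // provider's built-in default
}

// Walks the sources from highest to lowest priority and returns the first one set.
function resolveModel(s: ModelSources): { model: string; source: string } {
  if (s.cliModel) return { model: s.cliModel, source: "cli" };
  if (s.extendDefault) return { model: s.extendDefault, source: "EXTEND.md" };
  if (s.envModel) return { model: s.envModel, source: "env" };
  return { model: s.builtinDefault, source: "builtin" };
}

// Formats the line the agent is required to show before each generation.
function modelBanner(provider: string, model: string): string {
  return "Using " + provider + " / " + model;
}
```

With the example above (EXTEND.md set to `gemini-3-pro-image-preview`, env var set to `gemini-3.1-flash-image-preview`), the sketch returns the EXTEND.md value.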

Agent MUST display model info before each generation:

Show: Using [provider] / [model]
Show switch hint: Switch model: --model <id> | EXTEND.md default_model.[provider] | env <PROVIDER>_IMAGE_MODEL

DashScope Models

Use --model qwen-image-2.0-pro or set default_model.dashscope / DASHSCOPE_IMAGE_MODEL when the user wants official Qwen-Image behavior.

Official DashScope model families:

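qwen-image-2.0-pro, qwen-image-2.0-pro-2026-03-03, qwen-image-2.0, qwen-image-2.0-2026-03-03
- Free-form size in width*height format
- Total pixels must stay between 512*512 and 2048*2048
- Default size is approximately 1024*1024
- Best choice for custom ratios such as 21:9 and text-heavy Chinese/English layouts
qwen-image-max, qwen-image-max-2025-12-30, qwen-image-plus, qwen-image-plus-2026-01-09, qwen-image
- Fixed sizes only: 1664*928, 1472*1104, 1328*1328, 1104*1472, 928*1664
- Default size is 1664*928
- qwen-image currently has the same capability as qwen-image-plus
Legacy DashScope models such as z-image-turbo, z-image-ultra, wanx-v1
- Keep using them only when the user explicitly asks for legacy behavior or compatibility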

When translating CLI args into DashScope behavior:

--size wins over --ar
For qwen-image-2.0*, prefer explicit --size; otherwise infer from --ar and use the official recommended resolutions below
For qwen-image-max/plus/image, only use the five official fixed sizes; if the requested ratio is not covered, switch to qwen-image-2.0-pro
--quality is a baoyu-image-gen compatibility preset, not a native DashScope API field. Mapping normal / 2k onto the table below is an implementation inference, not an official API guarantee

Recommended qwen-image-2.0* sizes for common aspect ratios:

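Ratio	`normal`	`2k`
`1:1`	`1024*1024`	`1536*1536`
`2:3`	`768*1152`	`1024*1536`
`3:2`	`1152*768`	`1536*1024`
`3:4`	`960*1280`	`1080*1440`
`4:3`	`1280*960`	`1440*1080`
`9:16`	`720*1280`	`1080*1920`
`16:9`	`1280*720`	`1920*1080`
`21:9`	`1344*576`	`2048*872`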
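
The translation rules for qwen-image-2.0* can be sketched as follows. This is an assumption-laden illustration, not the script's code: `pickQwen2Size` and `toDashScopeSize` are hypothetical names, the `x` to `*` conversion of `--size` is inferred from the CLI examples, and the default quality of `2k` is taken from the Quality Presets table.

```typescript
type Quality = "normal" | "2k";

// Recommended qwen-image-2.0* sizes per aspect ratio, copied from the table in this document.
const QWEN2_SIZES: { [ratio: string]: { normal: string; "2k": string } } = {
  "1:1": { normal: "1024*1024", "2k": "1536*1536" },
  "2:3": { normal: "768*1152", "2k": "1024*1536" },
  "3:2": { normal: "1152*768", "2k": "1536*1024" },
  "3:4": { normal: "960*1280", "2k": "1080*1440" },
  "4:3": { normal: "1280*960", "2k": "1440*1080" },
  "9:16": { normal: "720*1280", "2k": "1080*1920" },
  "16:9": { normal: "1280*720", "2k": "1920*1080" },
  "21:9": { normal: "1344*576", "2k": "2048*872" },
};

const MIN_PIXELS = 512 * 512;
const MAX_PIXELS = 2048 * 2048;

// Converts a CLI size such as "2048x872" to "2048*872" and enforces the total-pixel bounds.
function toDashScopeSize(cliSize: string): string {
  const m = /^(\d+)[xX*](\d+)$/.exec(cliSize.trim());
  if (!m) throw new Error("invalid size: " + cliSize);
  const pixels = Number(m[1]) * Number(m[2]);
  if (pixels < MIN_PIXELS || pixels > MAX_PIXELS) throw new Error("total pixels out of range: " + cliSize);
  return m[1] + "*" + m[2];
}

// --size wins over --ar; otherwise the table is consulted.
// Returns null when nothing applies, leaving the choice to the caller or the API default.
function pickQwen2Size(opts: { size?: string; ar?: string; quality?: Quality }): string | null {
  if (opts.size) return toDashScopeSize(opts.size);
  if (!opts.ar) return null;
  const row = QWEN2_SIZES[opts.ar];
  if (!row) return null;
  return row[opts.quality ?? "2k"];
}
```

For example, `pickQwen2Size({ ar: "21:9", quality: "2k" })` yields `2048*872`, matching the banner example in the usage section.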

DashScope official APIs also expose negative_prompt, prompt_extend, and watermark, but baoyu-image-gen does not expose them as dedicated CLI flags today.

OpenRouter Models

Use full OpenRouter model IDs, e.g.:

google/gemini-3.1-flash-image-preview (recommended, supports image output and reference-image workflows)
google/gemini-2.5-flash-image-preview
black-forest-labs/flux.2-pro
Other OpenRouter image-capable model IDs

Notes:

OpenRouter image generation uses /chat/completions, not the OpenAI /images endpoints
If --ref is used, choose a multimodal model that supports image input and image output
--imageSize maps to OpenRouter imageGenerationOptions.size; --size <WxH> is converted to the nearest OpenRouter size and inferred aspect ratio when possible

Replicate Models

Supported model formats:

owner/name (recommended for official models), e.g. google/nano-banana-pro
owner/name:version (community models by version), e.g. stability-ai/sdxl:<version>

Examples:

# Use Replicate default model
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image out.png --provider replicate

# Override model explicitly
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image out.png --provider replicate --model google/nano-banana

Provider Selection

--ref provided + no --provider → auto-select Google first, then OpenAI, then OpenRouter, then Replicate (Jimeng and Seedream do not support reference images)
--provider specified → use it (if --ref, must be google, openai, openrouter, or replicate)
Only one API key available → use that provider
Multiple available → default to Google
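
The selection rules above can be sketched as a single function. `selectProvider` and its option names are hypothetical; falling back to the first available provider when several keys exist but Google's is not among them is an assumption, since the rules only say "default to Google".

```typescript
// Order used when --ref is given without --provider, per the rules above.
const REF_PRIORITY = ["google", "openai", "openrouter", "replicate"];

// `available` lists the providers that have an API key configured.
function selectProvider(opts: { provider?: string; hasRef: boolean; available: string[] }): string {
  if (opts.provider) {
    if (opts.hasRef && REF_PRIORITY.indexOf(opts.provider) === -1) {
      throw new Error("provider " + opts.provider + " does not support --ref");
    }
    return opts.provider;
  }
  if (opts.hasRef) {
    for (const candidate of REF_PRIORITY) {
      if (opts.available.indexOf(candidate) !== -1) return candidate;
    }
    throw new Error("no reference-capable provider has an API key");
  }
  const first = opts.available[0];
  if (first === undefined) throw new Error("missing API key");
  if (opts.available.length === 1) return first;
  // Several keys: default to Google; otherwise (assumption) take the first available.
  return opts.available.indexOf("google") !== -1 ? "google" : first;
}
```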

Quality Presets

Preset	Google imageSize	OpenAI Size	OpenRouter size	Replicate resolution	Use Case
`normal`	1K	1024px	1K	1K	Quick previews
`2k` (default)	2K	2048px	2K	2K	Covers, illustrations, infographics

Google/OpenRouter imageSize: Can be overridden with --imageSize 1K|2K|4K

Aspect Ratios

Supported: 1:1, 16:9, 9:16, 4:3, 3:4, 2.35:1

Google multimodal: uses imageConfig.aspectRatio
OpenAI: maps to closest supported size
OpenRouter: sends imageGenerationOptions.aspect_ratio; if only --size <WxH> is given, aspect ratio is inferred automatically
Replicate: passes aspect_ratio to model; when --ref is provided without --ar, defaults to match_input_image
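
One way to picture "aspect ratio is inferred automatically" from `--size <WxH>` is a nearest-match search over the supported list. This is only a guess at the approach: `nearestSupportedRatio` and `isSupportedRatio` are hypothetical names, and comparing ratios on a log scale is a design choice of this sketch, not something the source states.

```typescript
// Supported ratios as listed above.
const SUPPORTED_RATIOS = ["1:1", "16:9", "9:16", "4:3", "3:4", "2.35:1"];

// Converts "16:9" or "2.35:1" into a numeric width/height ratio.
function ratioValue(ratio: string): number {
  const parts = ratio.split(":");
  return Number(parts[0]) / Number(parts[1]);
}

function isSupportedRatio(ar: string): boolean {
  return SUPPORTED_RATIOS.indexOf(ar) !== -1;
}

// Picks the supported ratio closest to width/height.
// Distances are measured on a log scale so 2:1 and 1:2 are equally far from 1:1.
function nearestSupportedRatio(width: number, height: number): string {
  const target = Math.log(width / height);
  let best = "1:1";
  let bestDistance = Infinity;
  for (const ratio of SUPPORTED_RATIOS) {
    const distance = Math.abs(Math.log(ratioValue(ratio)) - target);
    if (distance < bestDistance) {
      best = ratio;
      bestDistance = distance;
    }
  }
  return best;
}
```

An unsupported `--ar` value would, per the error-handling rules, produce a warning and fall back to the default rather than fail.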

Generation Mode

Default: Sequential generation.

Batch Parallel Generation: When --batchfile contains 2 or more pending tasks, the script automatically enables parallel generation.

Mode	When to Use
Sequential (default)	Normal usage, single images, small batches
Parallel batch	Batch mode with 2+ tasks

Execution choice:

Situation	Preferred approach	Why
One image, or 1-2 simple images	Sequential	Lower coordination overhead and easier debugging
Multiple images already have saved prompt files	Batch (`--batchfile`)	Reuses finalized prompts, applies shared throttling/retries, and gives predictable throughput
Each image still needs separate reasoning, prompt writing, or style exploration	Subagents	The work is still exploratory, so each image may need independent analysis before generation
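Output comes from `baoyu-article-illustrator` with `outline.md` + `prompts/`	Batch (`build-batch.ts` -> `--batchfile`)	That workflow already produces prompt files, so direct batch execution is the intended path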

Rule of thumb:

Prefer batch over subagents once prompt files are already saved and the task is "generate all of these"
Use subagents only when generation is coupled with per-image thinking, rewriting, or divergent creative exploration

Parallel behavior:

Default worker count is automatic, capped by config, built-in default 10
Provider-specific throttling is applied only in batch mode, and the built-in defaults are tuned for faster throughput while still avoiding obvious RPM bursts
You can override worker count with --jobs <count>
Each image retries automatically up to 3 attempts
Final output includes success count, failure count, and per-image failure reasons
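
The worker-count rules can be sketched as a clamp. Several details here are assumptions rather than documented behavior: `resolveWorkerCount` is a hypothetical name, "automatic" is read as one worker per pending task, and the cap is applied to explicit `--jobs` values as well.

```typescript
// Built-in cap from the list above; BAOYU_IMAGE_GEN_MAX_WORKERS or config may override it.
const BUILTIN_MAX_WORKERS = 10;

function resolveWorkerCount(opts: {
  cliJobs?: number;      // --jobs <count>
  fileJobs?: number;     // "jobs" field in the batch file
  pendingTasks: number;  // tasks still to generate
  maxWorkers?: number;   // configured cap, if any
}): number {
  const cap = opts.maxWorkers ?? BUILTIN_MAX_WORKERS;
  // CLI beats the batch file; with neither set, start from one worker per task.
  const requested = opts.cliJobs ?? opts.fileJobs ?? opts.pendingTasks;
  // Never exceed the cap or the number of tasks, and never drop below one worker.
  return Math.max(1, Math.min(requested, cap, opts.pendingTasks));
}
```

With a single pending task the result is 1, which matches the sequential default described above.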

Error Handling

Missing API key → error with setup instructions
Generation failure → auto-retry up to 3 attempts per image
Invalid aspect ratio → warning, proceed with default
Reference images with unsupported provider/model → error with fix hint

Extension Support

Custom configurations via EXTEND.md. See Preferences section for paths and supported options.

Weekly Installs

14.4K

Repository

jimliu/baoyu-skills

GitHub Stars

11.6K

First Seen

Jan 21, 2026

Security Audits

Gen Agent Trust Hub: Pass
Socket: Pass
Snyk: Pass

Installed on

opencode: 13.2K
gemini-cli: 12.9K
codex: 12.8K
cursor: 12.3K
github-copilot: 12.0K
amp: 11.6K


