baoyu-image-gen by jimliu/baoyu-skills
npx skills add https://github.com/jimliu/baoyu-skills --skill baoyu-image-gen

Official API-based image generation. Supports the OpenAI, Azure OpenAI, Google, OpenRouter, DashScope (阿里通义万象), Jimeng (即梦), Seedream (豆包), and Replicate providers.
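Agent execution:
- {baseDir} = the directory containing this SKILL.md file
- Script path: {baseDir}/scripts/main.ts
- ${BUN_X} runtime: if bun is installed → bun; if npx is available → npx -y bun; otherwise suggest installing bun

CRITICAL: The following step MUST complete BEFORE any image generation. Do NOT skip or defer it.
Check whether EXTEND.md exists (priority: project → user):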
# macOS, Linux, WSL, Git Bash
test -f .baoyu-skills/baoyu-image-gen/EXTEND.md && echo "project"
test -f "${XDG_CONFIG_HOME:-$HOME/.config}/baoyu-skills/baoyu-image-gen/EXTEND.md" && echo "xdg"
test -f "$HOME/.baoyu-skills/baoyu-image-gen/EXTEND.md" && echo "user"
# PowerShell (Windows)
if (Test-Path .baoyu-skills/baoyu-image-gen/EXTEND.md) { "project" }
$xdg = if ($env:XDG_CONFIG_HOME) { $env:XDG_CONFIG_HOME } else { "$HOME/.config" }
if (Test-Path "$xdg/baoyu-skills/baoyu-image-gen/EXTEND.md") { "xdg" }
if (Test-Path "$HOME/.baoyu-skills/baoyu-image-gen/EXTEND.md") { "user" }
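| Result | Action |
|---|---|
| Found | Load, parse, and apply the settings. If default_model.[provider] is null → ask for the model only (Flow 2) |
| Not found | ⛔ Run first-time setup (references/config/first-time-setup.md) → save EXTEND.md → then continue |

CRITICAL: If EXTEND.md is not found, complete the full setup (provider + model + quality + save location) with AskUserQuestion BEFORE generating any images. Generation is BLOCKED until EXTEND.md is created.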
| Path | Location |
|---|---|
| .baoyu-skills/baoyu-image-gen/EXTEND.md | Project directory |
| $HOME/.baoyu-skills/baoyu-image-gen/EXTEND.md | User home |
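
The lookup order used by the shell and PowerShell checks above (project, then XDG config, then user home) can be sketched in TypeScript. This is only an illustration under stated assumptions: `findExtendMd`, `extendCandidates`, and the injected `exists` callback are hypothetical names, not functions exported by baoyu-image-gen, and paths are joined POSIX-style.

```typescript
// Hypothetical helpers: not part of baoyu-image-gen's published API.
type Env = Record<string, string | undefined>;
type ExtendLocation = { scope: "project" | "xdg" | "user"; path: string };

const REL = "baoyu-skills/baoyu-image-gen/EXTEND.md";

// Candidate paths in the same order as the shell checks: project, XDG, user home.
function extendCandidates(cwd: string, env: Env): ExtendLocation[] {
  const home = env.HOME ?? "";
  // Mirrors ${XDG_CONFIG_HOME:-$HOME/.config}: unset or empty falls back to ~/.config.
  const xdg = env.XDG_CONFIG_HOME || `${home}/.config`;
  return [
    { scope: "project", path: `${cwd}/.${REL}` },
    { scope: "xdg", path: `${xdg}/${REL}` },
    { scope: "user", path: `${home}/.${REL}` },
  ];
}

// Returns the first candidate that exists, or null when first-time setup is needed.
// `exists` is injected so the lookup can be exercised without touching the filesystem.
function findExtendMd(
  cwd: string,
  env: Env,
  exists: (path: string) => boolean,
): ExtendLocation | null {
  return extendCandidates(cwd, env).find((c) => exists(c.path)) ?? null;
}
```

A `null` result corresponds to the "Not found" row: generation stays blocked until first-time setup has written EXTEND.md.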
EXTEND.md supports: default provider | default quality | default aspect ratio | default image size | default models | batch worker cap | provider-specific batch limits
Schema: references/config/preferences-schema.md
# Basic
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image cat.png
# With aspect ratio
${BUN_X} {baseDir}/scripts/main.ts --prompt "A landscape" --image out.png --ar 16:9
# High quality
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image out.png --quality 2k
# From prompt files
${BUN_X} {baseDir}/scripts/main.ts --promptfiles system.md content.md --image out.png
# With reference images (Google, OpenAI, Azure OpenAI, OpenRouter, Replicate, or Seedream 4.0/4.5/5.0)
${BUN_X} {baseDir}/scripts/main.ts --prompt "Make blue" --image out.png --ref source.png
# With reference images (explicit provider/model)
${BUN_X} {baseDir}/scripts/main.ts --prompt "Make blue" --image out.png --provider google --model gemini-3-pro-image-preview --ref source.png
# Azure OpenAI (--model is the deployment name)
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image out.png --provider azure --model gpt-image-1.5
# OpenRouter (recommended default model)
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image out.png --provider openrouter
# OpenRouter with reference images
${BUN_X} {baseDir}/scripts/main.ts --prompt "Make blue" --image out.png --provider openrouter --model google/gemini-3.1-flash-image-preview --ref source.png
# Specific provider
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image out.png --provider openai
# DashScope (阿里通义万象)
${BUN_X} {baseDir}/scripts/main.ts --prompt "一只可爱的猫" --image out.png --provider dashscope
# DashScope Qwen-Image 2.0 Pro (recommended for custom sizes and text rendering)
${BUN_X} {baseDir}/scripts/main.ts --prompt "为咖啡品牌设计一张 21:9 横幅海报,包含清晰中文标题" --image out.png --provider dashscope --model qwen-image-2.0-pro --size 2048x872
# DashScope legacy Qwen fixed-size model
${BUN_X} {baseDir}/scripts/main.ts --prompt "一张电影感海报" --image out.png --provider dashscope --model qwen-image-max --size 1664x928
# Replicate (default model: google/nano-banana-pro)
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image out.png --provider replicate
# Replicate with a specific model
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image out.png --provider replicate --model google/nano-banana
# Batch mode with saved prompt files
${BUN_X} {baseDir}/scripts/main.ts --batchfile batch.json
# Batch mode with an explicit worker count
${BUN_X} {baseDir}/scripts/main.ts --batchfile batch.json --jobs 4 --json
{
  "jobs": 4,
  "tasks": [
    {
      "id": "hero",
      "promptFiles": ["prompts/hero.md"],
      "image": "out/hero.png",
      "provider": "replicate",
      "model": "google/nano-banana-pro",
      "ar": "16:9",
      "quality": "2k"
    },
    {
      "id": "diagram",
      "promptFiles": ["prompts/diagram.md"],
      "image": "out/diagram.png",
      "ref": ["references/original.png"]
    }
  ]
}
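Paths in promptFiles, image, and ref are resolved relative to the batch file's directory. jobs is optional and is overridden by the CLI --jobs flag. A top-level array of tasks (without the jobs wrapper) is also accepted.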
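
The three batch-file rules above (paths relative to the batch file, CLI `--jobs` taking precedence, and the bare-array form) could be handled roughly as follows. This is a sketch, not the script's actual code: `normalizeBatch` and `fromBatchDir` are hypothetical names, and the path join is a simplified POSIX-style one rather than a full path resolver.

```typescript
// Field names follow the batch JSON above; the helper names are hypothetical.
interface BatchTask {
  id?: string;
  promptFiles?: string[];
  image: string;
  ref?: string[];
  provider?: string;
  model?: string;
  ar?: string;
  quality?: string;
}

interface Batch {
  jobs?: number;
  tasks: BatchTask[];
}

// Simplified join: absolute paths are kept, relative ones are prefixed with the batch directory.
function fromBatchDir(batchDir: string, p: string): string {
  return p.startsWith("/") ? p : `${batchDir.replace(/\/+$/, "")}/${p}`;
}

function normalizeBatch(raw: unknown, batchDir: string, cliJobs?: number): Batch {
  // Accept both `{ jobs, tasks }` and a bare top-level array of tasks.
  const wrapped: Batch = Array.isArray(raw) ? { tasks: raw as BatchTask[] } : (raw as Batch);
  const tasks = wrapped.tasks.map((t) => ({
    ...t,
    image: fromBatchDir(batchDir, t.image),
    promptFiles: t.promptFiles?.map((p) => fromBatchDir(batchDir, p)),
    ref: t.ref?.map((p) => fromBatchDir(batchDir, p)),
  }));
  // The CLI --jobs value overrides the file's `jobs` field.
  return { jobs: cliJobs ?? wrapped.jobs, tasks };
}
```

The sketch does no validation; a real loader would also need to reject tasks that lack both a prompt and an `image` path.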
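| Option | Description |
|---|---|
| --prompt <text>, -p | Prompt text |
| --promptfiles <files...> | Read the prompt from files (concatenated) |
| --image <path> | Output image path (required in single-image mode) |
| --batchfile <path> | JSON batch file for multi-image generation |
| --jobs <count> | Worker count for batch mode (default: auto, max from config, built-in default 10) |
| --provider google\|openai\|… | Provider to use; the examples here also pass azure, openrouter, dashscope, and replicate |
| --model <id>, -m | Model ID (Google: gemini-3-pro-image-preview; OpenAI: gpt-image-1.5; Azure: deployment name such as gpt-image-1.5 or image-prod; OpenRouter: google/gemini-3.1-flash-image-preview; DashScope: qwen-image-2.0-pro) |
| --ar <ratio> | Aspect ratio (e.g., 16:9, 1:1, 4:3) |
| --size <WxH> | Size (e.g., 1024x1024) |
| --quality normal\|2k | Quality preset (2k is the default) |
| --imageSize 1K\|2K\|4K | Image size for Google/OpenRouter |
| --ref <files...> | Reference images. Supported by Google multimodal, OpenAI GPT Image edits, Azure OpenAI edits (PNG/JPG only), OpenRouter multimodal models, Replicate, and Seedream 5.0/4.5/4.0. Not supported by Jimeng, Seedream 3.0, or the removed SeedEdit 3.0 |
| --n <count> | Number of images |
| --json | JSON output |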
| Variable | Description |
|---|---|
| OPENAI_API_KEY | OpenAI API key |
| AZURE_OPENAI_API_KEY | Azure OpenAI API key |
| OPENROUTER_API_KEY | OpenRouter API key |
| GOOGLE_API_KEY | Google API key |
| DASHSCOPE_API_KEY | DashScope API key (阿里云) |
| REPLICATE_API_TOKEN | Replicate API token |
| JIMENG_ACCESS_KEY_ID | Jimeng (即梦) Volcengine access key ID |
| JIMENG_SECRET_ACCESS_KEY | Jimeng (即梦) Volcengine secret access key |
| ARK_API_KEY | Seedream (豆包) Volcengine ARK API key |
| OPENAI_IMAGE_MODEL | OpenAI model override |
| AZURE_OPENAI_DEPLOYMENT | Azure default deployment name |
| AZURE_OPENAI_IMAGE_MODEL | Backward-compatible alias for the Azure default deployment/model name |
| OPENROUTER_IMAGE_MODEL | OpenRouter model override (default: google/gemini-3.1-flash-image-preview) |
| GOOGLE_IMAGE_MODEL | Google model override |
| DASHSCOPE_IMAGE_MODEL | DashScope model override (default: qwen-image-2.0-pro) |
| REPLICATE_IMAGE_MODEL | Replicate model override (default: google/nano-banana-pro) |
| JIMENG_IMAGE_MODEL | Jimeng model override (default: jimeng_t2i_v40) |
| SEEDREAM_IMAGE_MODEL | Seedream model override (default: doubao-seedream-5-0-260128) |
| OPENAI_BASE_URL | Custom OpenAI endpoint |
| AZURE_OPENAI_BASE_URL | Azure resource endpoint or deployment endpoint |
| AZURE_API_VERSION | Azure image API version (default: 2025-04-01-preview) |
| OPENROUTER_BASE_URL | Custom OpenRouter endpoint (default: https://openrouter.ai/api/v1) |
| OPENROUTER_HTTP_REFERER | Optional app/site URL for OpenRouter attribution |
| OPENROUTER_TITLE | Optional app name for OpenRouter attribution |
| GOOGLE_BASE_URL | Custom Google endpoint |
| DASHSCOPE_BASE_URL | Custom DashScope endpoint |
| REPLICATE_BASE_URL | Custom Replicate endpoint |
| JIMENG_BASE_URL | Custom Jimeng endpoint (default: https://visual.volcengineapi.com) |
| JIMENG_REGION | Jimeng region (default: cn-north-1) |
| SEEDREAM_BASE_URL | Custom Seedream endpoint (default: https://ark.cn-beijing.volces.com/api/v3) |
| BAOYU_IMAGE_GEN_MAX_WORKERS | Override the batch worker cap |
| BAOYU_IMAGE_GEN_<PROVIDER>_CONCURRENCY | Override provider concurrency, e.g. BAOYU_IMAGE_GEN_REPLICATE_CONCURRENCY |
| BAOYU_IMAGE_GEN_<PROVIDER>_START_INTERVAL_MS | Override the provider start interval, e.g. BAOYU_IMAGE_GEN_REPLICATE_START_INTERVAL_MS |
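
The per-provider variables at the end of the table follow a fixed naming pattern, which the sketch below turns into code. The helper names are hypothetical, and the fallback numbers passed by a caller are placeholders: the table does not state the built-in per-provider defaults.

```typescript
type Env = Record<string, string | undefined>;

// Builds names such as BAOYU_IMAGE_GEN_REPLICATE_CONCURRENCY from a provider id.
function providerEnvName(
  provider: string,
  suffix: "CONCURRENCY" | "START_INTERVAL_MS",
): string {
  return `BAOYU_IMAGE_GEN_${provider.toUpperCase()}_${suffix}`;
}

// Reads an integer override of at least `min`; falls back when the variable is unset or invalid.
// The fallback is supplied by the caller because the real defaults are not documented here.
function readIntOverride(env: Env, name: string, fallback: number, min = 1): number {
  const raw = env[name];
  const parsed = raw === undefined ? NaN : Number.parseInt(raw, 10);
  return Number.isInteger(parsed) && parsed >= min ? parsed : fallback;
}
```

For example, `readIntOverride(env, providerEnvName("replicate", "CONCURRENCY"), 5)` returns the override when it is a valid positive integer and the placeholder `5` otherwise.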
Load priority: CLI args > EXTEND.md > env vars > <cwd>/.baoyu-skills/.env > ~/.baoyu-skills/.env
Model priority (highest → lowest), applies to all providers:
1. --model <id>
2. default_model.[provider] in EXTEND.md
3. <PROVIDER>_IMAGE_MODEL env var (e.g., GOOGLE_IMAGE_MODEL)

For Azure, --model / default_model.azure should be the Azure deployment name. AZURE_OPENAI_DEPLOYMENT is the preferred env var; AZURE_OPENAI_IMAGE_MODEL remains as a backward-compatible alias.
EXTEND.md overrides env vars. If both the EXTEND.md setting default_model.google: "gemini-3-pro-image-preview" and the env var GOOGLE_IMAGE_MODEL=gemini-3.1-flash-image-preview exist, EXTEND.md wins.
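
The model priority above amounts to a short fallback chain. The following sketch shows one way to express it; `resolveModel` and `ModelSources` are hypothetical names, and the Azure branch simply encodes the two variable names given above.

```typescript
// Hypothetical shape for the three sources named in the priority list.
interface ModelSources {
  cliModel?: string;                              // --model <id>
  extendDefaults?: Record<string, string | null>; // default_model.[provider] from EXTEND.md
  env?: Record<string, string | undefined>;       // environment variables
}

// Returns the first model found in priority order, or undefined when none is configured.
function resolveModel(provider: string, src: ModelSources): string | undefined {
  if (src.cliModel) return src.cliModel;

  const fromExtend = src.extendDefaults?.[provider];
  if (fromExtend) return fromExtend;

  // Azure uses deployment-name variables; the other providers use <PROVIDER>_IMAGE_MODEL.
  const envKeys =
    provider === "azure"
      ? ["AZURE_OPENAI_DEPLOYMENT", "AZURE_OPENAI_IMAGE_MODEL"]
      : [`${provider.toUpperCase()}_IMAGE_MODEL`];
  for (const key of envKeys) {
    const value = src.env?.[key];
    if (value) return value;
  }
  return undefined;
}
```

With `default_model.google` set to `gemini-3-pro-image-preview` and `GOOGLE_IMAGE_MODEL=gemini-3.1-flash-image-preview`, the function returns the EXTEND.md value, matching the example in the text.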
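The agent MUST display model info before each generation:
- Using [provider] / [model]
- Switch model: --model <id> | EXTEND.md default_model.[provider] | env <PROVIDER>_IMAGE_MODEL

Use --model qwen-image-2.0-pro, or set default_model.dashscope / DASHSCOPE_IMAGE_MODEL, when the user wants official Qwen-Image behavior.
Official DashScope model families:
- qwen-image-2.0-pro, qwen-image-2.0-pro-2026-03-03, qwen-image-2.0, qwen-image-2.0-2026-03-03
  - size in 宽*高 (width*height) format
  - between 512*512 and 2048*2048
  - 1024*1024
  - best choice for 21:9 and text-heavy Chinese/English layouts
- qwen-image-max, qwen-image-max-2025-12-30, qwen-image-plus, qwen-image-plus-2026-01-09, qwen-image
  - fixed sizes: 1664*928, 1472*1104, 1328*1328, 1104*1472, 928*1664
  - 1664*928
  - qwen-image currently has the same capability as qwen-image-plus
- z-image-turbo, z-image-ultra, wanx-v1

When translating CLI args into DashScope behavior:
- --size wins over --ar
- For qwen-image-2.0*, prefer an explicit --size; otherwise infer the size from --ar and use the official recommended resolutions below
- For qwen-image-max/plus/image, use only the five official fixed sizes; if the requested ratio is not covered, switch to qwen-image-2.0-pro
- --quality is a baoyu-image-gen compatibility preset, not a native DashScope API field. Mapping normal / 2k onto the qwen-image-2.0* table below is an implementation inference, not an official API guarantee

Recommended qwen-image-2.0* sizes for common aspect ratios: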
| Ratio | normal | 2k |
|---|---|---|
| 1:1 | 1024*1024 | 1536*1536 |
| 2:3 | 768*1152 | 1024*1536 |
| 3:2 | 1152*768 | 1536*1024 |
| 3:4 | 960*1280 | 1080*1440 |
| 4:3 | 1280*960 | 1440*1080 |
| 9:16 | 720*1280 | 1080*1920 |
| 16:9 | 1280*720 | 1920*1080 |
| 21:9 | 1344*576 | 2048*872 |
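
The DashScope translation rules and the size table can be combined into a small lookup, sketched below. `dashscopeSize` is a hypothetical name; the table values and the five fixed sizes are copied from this section, and returning `undefined` is this sketch's own way of signalling "not covered, consider switching to qwen-image-2.0-pro".

```typescript
type Quality = "normal" | "2k";

// Recommended qwen-image-2.0* sizes, copied from the table above (width*height).
const QWEN2_SIZES: Record<string, Record<Quality, string>> = {
  "1:1": { normal: "1024*1024", "2k": "1536*1536" },
  "2:3": { normal: "768*1152", "2k": "1024*1536" },
  "3:2": { normal: "1152*768", "2k": "1536*1024" },
  "3:4": { normal: "960*1280", "2k": "1080*1440" },
  "4:3": { normal: "1280*960", "2k": "1440*1080" },
  "9:16": { normal: "720*1280", "2k": "1080*1920" },
  "16:9": { normal: "1280*720", "2k": "1920*1080" },
  "21:9": { normal: "1344*576", "2k": "2048*872" },
};

// The five fixed sizes listed for qwen-image-max / qwen-image-plus / qwen-image.
const LEGACY_FIXED_SIZES = ["1664*928", "1472*1104", "1328*1328", "1104*1472", "928*1664"];

function dashscopeSize(
  model: string,
  opts: { size?: string; ar?: string; quality?: Quality },
): string | undefined {
  const isQwen2 = model.startsWith("qwen-image-2.0");
  // --size wins over --ar; the CLI's WxH form becomes DashScope's W*H form.
  if (opts.size) {
    const size = opts.size.replace(/x/i, "*");
    if (isQwen2) return size;
    return LEGACY_FIXED_SIZES.indexOf(size) !== -1 ? size : undefined;
  }
  // Without --size, only qwen-image-2.0* is inferred from --ar here; 2k is the default preset.
  if (isQwen2 && opts.ar) {
    return QWEN2_SIZES[opts.ar]?.[opts.quality ?? "2k"];
  }
  return undefined;
}
```

The sketch does not map `--ar` onto the legacy fixed sizes, since the section lists those sizes without pairing them with ratios.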
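The official DashScope APIs also expose negative_prompt, prompt_extend, and watermark, but baoyu-image-gen does not currently expose them as dedicated CLI flags.
Official references:
Use full OpenRouter model IDs, e.g.:
- google/gemini-3.1-flash-image-preview (recommended; supports image output and reference-image workflows)
- google/gemini-2.5-flash-image-preview
- black-forest-labs/flux.2-pro

Notes:
- Uses /chat/completions, not the OpenAI /images endpoints
- If --ref is used, choose a multimodal model that supports both image input and image output
- --imageSize maps to OpenRouter imageGenerationOptions.size; --size <WxH> is converted to the nearest OpenRouter size and an inferred aspect ratio when possible

Supported Replicate model formats:
- owner/name (recommended for official models), e.g. google/nano-banana-pro
- owner/name:version (community models pinned by version), e.g. stability-ai/sdxl:<version>

Examples: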
# Use the Replicate default model
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image out.png --provider replicate
# Override the model explicitly
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image out.png --provider replicate --model google/nano-banana
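
The two Replicate model formats, `owner/name` and `owner/name:version`, can be told apart with a single pattern. This is an illustrative parser only; `parseReplicateModel` is a hypothetical name, and the character rules in the regular expression are an assumption rather than Replicate's documented grammar.

```typescript
interface ReplicateModelRef {
  owner: string;
  name: string;
  version?: string; // present only for the owner/name:version form
}

// Assumption: owner, name, and version contain no "/", ":" or whitespace.
function parseReplicateModel(id: string): ReplicateModelRef | null {
  const match = /^([^/:\s]+)\/([^/:\s]+)(?::([^/:\s]+))?$/.exec(id);
  if (!match) return null;
  const owner = match[1] ?? "";
  const name = match[2] ?? "";
  const version = match[3];
  return version ? { owner, name, version } : { owner, name };
}
```

A `null` result means the value passed to `--model` matches neither format and should be rejected before a request is made.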
- --ref provided and no --provider → auto-select Google first, then OpenAI, then OpenRouter, then Replicate (Jimeng and Seedream do not support reference images)
- --provider specified → use it (with --ref, it must be google, openai, openrouter, or replicate)

| Preset | Google imageSize | OpenAI size | OpenRouter size | Replicate resolution | Use case |
|---|---|---|---|---|---|
| normal | 1K | 1024px | 1K | 1K | Quick previews |
| 2k (default) | 2K | 2048px | 2K | 2K | Covers, illustrations, infographics |

Google/OpenRouter imageSize: can be overridden with --imageSize 1K|2K|4K
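
The provider rule for `--ref` stated before the preset table (Google, then OpenAI, then OpenRouter, then Replicate) could be sketched like this. `pickRefProvider` is a hypothetical name, and treating "API key variable is set" as the test for availability is an assumption; the text gives the order but not the availability check.

```typescript
// Order taken from the --ref rule: Google, OpenAI, OpenRouter, Replicate.
const REF_PROVIDER_ORDER = ["google", "openai", "openrouter", "replicate"] as const;
type RefProvider = (typeof REF_PROVIDER_ORDER)[number];

// Assumption: a provider counts as available when its API key variable is set.
const KEY_VARS: Record<RefProvider, string> = {
  google: "GOOGLE_API_KEY",
  openai: "OPENAI_API_KEY",
  openrouter: "OPENROUTER_API_KEY",
  replicate: "REPLICATE_API_TOKEN",
};

function pickRefProvider(
  env: Record<string, string | undefined>,
  explicit?: string,
): RefProvider {
  // An explicit --provider is used as-is, but it must be one that accepts --ref.
  if (explicit !== undefined) {
    const found = REF_PROVIDER_ORDER.find((p) => p === explicit);
    if (!found) throw new Error(`--ref is not supported with provider "${explicit}"`);
    return found;
  }
  // Otherwise take the first provider in the documented order that has a key configured.
  const auto = REF_PROVIDER_ORDER.find((p) => Boolean(env[KEY_VARS[p]]));
  if (!auto) throw new Error("no provider with reference-image support is configured");
  return auto;
}
```

Failing early keeps a `--ref` request from being sent to a provider that would ignore or reject the reference image.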
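Supported aspect ratios: 1:1, 16:9, 9:16, 4:3, 3:4, 2.35:1
- imageConfig.aspectRatio
- imageGenerationOptions.aspect_ratio; if only --size <WxH> is given, the aspect ratio is inferred automatically
- aspect_ratio is passed to the model; when --ref is provided without --ar, it defaults to match_input_image

Default: sequential generation.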
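Batch parallel generation: when --batchfile contains 2 or more pending tasks, the script automatically enables parallel generation.

| Mode | When to use |
|---|---|
| Sequential (default) | Normal usage, single images, small batches |
| Parallel batch | Batch mode with 2+ tasks |

Execution choice:

| Situation | Preferred approach | Why |
|---|---|---|
| One image, or 1-2 simple images | Sequential | Lower coordination overhead and easier debugging |
| Multiple images that already have saved prompt files | Batch (--batchfile) | Reuses finalized prompts, applies shared throttling/retries, and gives predictable throughput |
| Each image still needs separate reasoning, prompt writing, or style exploration | Subagents | The work is still exploratory, so each image may need independent analysis before generation |
| Output comes from baoyu-article-illustrator with outline.md + prompts/ | Batch (build-batch.ts -> --batchfile) | That workflow already produces prompt files, so direct batch execution is the intended path |

Rule of thumb:
Parallel behavior:
- --jobs <count> overrides the worker count

Custom configuration is done through EXTEND.md. See the Preferences section for paths and supported options.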
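Weekly installs: 14.4K
GitHub stars: 11.6K
First seen: Jan 21, 2026
Security audits: Gen Agent Trust Hub (Pass), Socket (Pass), Snyk (Pass)
Installed on: opencode 13.2K, gemini-cli 12.9K, codex 12.8K, cursor 12.3K, github-copilot 12.0K, amp 11.6K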