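OpenAI Image Generation Skill: AI drawing, image editing, and batch generation for website assets, UI design, and product mockups

imagegen by openai/skills

602 weekly installs

15,300 GitHub Stars

Install command

npx skills add https://github.com/openai/skills --skill imagegen

AI/Machine Learning, Content Creation, Design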


🇺🇸English

Image Generation Skill

Generates or edits images for the current project (for example website assets, game assets, UI mockups, product mockups, wireframes, logo design, photorealistic images, or infographics).

Top-level modes and rules

This skill has exactly two top-level modes:

Default built-in tool mode (preferred): built-in image_gen tool for normal image generation and editing. Does not require OPENAI_API_KEY.
Fallback CLI mode (explicit-only): scripts/image_gen.py CLI. Use only when the user explicitly asks for the CLI path. Requires OPENAI_API_KEY.

Within the explicit CLI fallback only, the CLI exposes three subcommands:

generate
edit
generate-batch

Rules:

Use the built-in image_gen tool by default for all normal image generation and editing requests.
Never switch to CLI fallback automatically.
If the built-in tool fails or is unavailable, tell the user the CLI fallback exists and that it requires OPENAI_API_KEY. Proceed only if the user explicitly asks for that fallback.
If the user explicitly asks for CLI mode, use the bundled scripts/image_gen.py workflow. Do not create one-off SDK runners.
Never modify scripts/image_gen.py. If something is missing, ask the user before doing anything else.
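
The rules above can be read as a small decision function. The following is a minimal sketch, not part of the skill itself: the `Mode` enum and `choose_mode` helper are hypothetical names introduced only to illustrate that the CLI fallback is reachable solely through an explicit user request.

```python
from enum import Enum


class Mode(Enum):
    # Hypothetical labels for the two top-level modes plus the "ask first" state.
    BUILTIN = "built-in image_gen tool"
    CLI_FALLBACK = "scripts/image_gen.py CLI"
    ASK_USER = "tell the user the CLI fallback exists and wait"


def choose_mode(user_asked_for_cli: bool, builtin_available: bool = True) -> Mode:
    """Pick a top-level mode following the rules listed above."""
    if user_asked_for_cli:
        # An explicit user request is the only way into the CLI fallback.
        return Mode.CLI_FALLBACK
    if builtin_available:
        # Default path for all normal generation and editing requests.
        return Mode.BUILTIN
    # Never switch automatically: surface the option and wait for consent.
    return Mode.ASK_USER
```

Note that a failed or unavailable built-in tool maps to `ASK_USER`, never directly to `CLI_FALLBACK`.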

Built-in save-path policy:

In built-in tool mode, Codex saves generated images under $CODEX_HOME/generated_images/... by default.
Do not describe or rely on OS temp as the default built-in destination.
Do not describe or rely on a destination-path argument (if any) on the built-in image_gen tool. If a specific location is needed, generate first and then move or copy the selected output from $CODEX_HOME/generated_images/....
Save-path precedence in built-in mode:
1. If the user names a destination, move or copy the selected output there.
2. If the image is meant for the current project, move or copy the final selected image into the workspace before finishing.
3. If the image is only for preview or brainstorming, render it inline; the underlying file can remain at the default $CODEX_HOME/generated_images/... path.
Never leave a project-referenced asset only at the default $CODEX_HOME/generated_images/... path.
Do not overwrite an existing asset unless the user explicitly asked for replacement; otherwise create a sibling versioned filename such as hero-v2.png or item-icon-edited.png.
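
One way to honor the no-overwrite rule is to compute a sibling versioned filename before copying anything into the workspace. This is a sketch under assumptions: `non_destructive_path` is a hypothetical helper, and the `-vN` suffix simply follows the hero-v2.png example above rather than any naming scheme the skill mandates.

```python
from pathlib import Path


def non_destructive_path(target: Path, replace_ok: bool = False) -> Path:
    """Return a path that can be written without clobbering an existing asset."""
    if replace_ok or not target.exists():
        # Either the user explicitly asked for replacement or nothing is there yet.
        return target
    version = 2
    while True:
        # hero.png -> hero-v2.png, hero-v3.png, ... until a free name is found.
        candidate = target.with_name(f"{target.stem}-v{version}{target.suffix}")
        if not candidate.exists():
            return candidate
        version += 1
```

A caller would typically pass the result to `shutil.copy2(source, destination)` when moving the selected output into the project.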

Shared prompt guidance for both modes lives in references/prompting.md and references/sample-prompts.md.

Fallback-only docs/resources for CLI mode:

references/cli.md
references/image-api.md
references/codex-network.md
scripts/image_gen.py

When to use

Generate a new image (concept art, product shot, cover, website hero)
Generate a new image using one or more reference images for style, composition, or mood
Edit an existing image (inpainting, lighting or weather transformations, background replacement, object removal, compositing, transparent background)
Produce many assets or variants for one task

When not to use

Extending or matching an existing SVG/vector icon set, logo system, or illustration library inside the repo
Creating simple shapes, diagrams, wireframes, or icons that are better produced directly in SVG, HTML/CSS, or canvas
Making a small project-local asset edit when the source file already exists in an editable native format
Any task where the user clearly wants deterministic code-native output instead of a generated bitmap

Decision tree

Think about two separate questions:

Intent: is this a new image or an edit of an existing image?
Execution strategy: is this one asset or many assets/variants?

Intent:

If the user wants to modify an existing image while preserving parts of it, treat the request as edit.
If the user provides images only as references for style, composition, mood, or subject guidance, treat the request as generate.
If the user provides no images, treat the request as generate.

Built-in edit semantics:

Built-in edit mode is for images already visible in the conversation context, such as attached images or images generated earlier in the thread.
If the user wants to edit a local image file with the built-in tool, first load it with the built-in view_image tool so the image is visible in the conversation context, then proceed with the built-in edit flow.
Do not promise arbitrary filesystem-path editing through the built-in tool.
If a local file still needs direct file-path control, masks, or other explicit CLI-only parameters, use the explicit CLI fallback only when the user asks for it.
For edits, preserve invariants aggressively and save non-destructively by default.

Execution strategy:

In the built-in default path, produce many assets or variants by issuing one image_gen call per requested asset or variant.
In the explicit CLI fallback path, use the CLI generate-batch subcommand only when the user explicitly chose CLI mode and needs many prompts/assets.

Assume the user wants a new image unless they clearly ask to change an existing one.
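
The two questions in the decision tree can be kept separate in code as well. The sketch below is illustrative only: the `Request` dataclass, its field names, and the returned strings are assumptions made for this example, while the role labels mirror the ones used in the workflow.

```python
from dataclasses import dataclass, field


@dataclass
class Request:
    # Hypothetical container; roles are "reference", "edit target", and so on.
    image_roles: list[str] = field(default_factory=list)
    asset_count: int = 1
    cli_mode: bool = False


def classify_intent(req: Request) -> str:
    """Only an explicit edit target makes the request an edit."""
    # No images, or reference-only images, default to generate.
    return "edit" if "edit target" in req.image_roles else "generate"


def execution_strategy(req: Request) -> str:
    """Decide how many calls are needed and through which path."""
    if req.asset_count <= 1:
        return "single call"
    # Batching via the CLI is reserved for explicitly chosen CLI mode.
    return "CLI generate-batch" if req.cli_mode else "one image_gen call per asset"
```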

Workflow

Decide the top-level mode: built-in by default, fallback CLI only if explicitly requested.
Decide the intent: generate or edit.
Decide whether the output is preview-only or meant to be consumed by the current project.
Decide the execution strategy: single asset vs repeated built-in calls vs CLI generate-batch.
Collect inputs up front: prompt(s), exact text (verbatim), constraints/avoid list, and any input images.
For every input image, label its role explicitly:
- reference image
- edit target
- supporting insert/style/compositing input
If the edit target is only on the local filesystem and you are staying on the built-in path, inspect it with view_image first so the image is available in conversation context.
If the user asked for a photo, illustration, sprite, product image, banner, or other explicitly raster-style asset, use image_gen rather than substituting SVG/HTML/CSS placeholders. If the request is for an icon, logo, or UI graphic that should match existing repo-native SVG/vector/code assets, prefer editing those directly instead.
Augment the prompt based on specificity:
- If the user's prompt is already specific and detailed, normalize it into a clear spec without adding creative requirements.

Prompt augmentation

Reformat user prompts into a structured, production-oriented spec. Make the user's goal clearer and more actionable, but do not blindly add detail.

Treat this as prompt-shaping guidance, not a closed schema. Use only the lines that help, and add a short extra labeled line when it materially improves clarity.

Specificity policy

Use the user's prompt specificity to decide how much augmentation is appropriate:

If the prompt is already specific and detailed, preserve that specificity and only normalize/structure it.
If the prompt is generic, you may add tasteful augmentation when it will materially improve the result.

Allowed augmentations:

composition or framing hints
polish level or intended-use hints
practical layout guidance
reasonable scene concreteness that supports the stated request

Not allowed augmentations:

extra characters or objects that are not implied by the request
brand names, slogans, palettes, or narrative beats that are not implied
arbitrary side-specific placement unless the surrounding layout supports it

Use-case taxonomy (exact slugs)

Classify each request into one of these buckets and keep the slug consistent across prompts and references.

Generate:

photorealistic-natural — candid/editorial lifestyle scenes with real texture and natural lighting.
product-mockup — product/packaging shots, catalog imagery, merch concepts.
ui-mockup — app/web interface mockups and wireframes; specify the desired fidelity.
infographic-diagram — diagrams/infographics with structured layout and text.
logo-brand — logo/mark exploration, vector-friendly.
illustration-story — comics, children’s book art, narrative scenes.
stylized-concept — style-driven concept art, 3D/stylized renders.
historical-scene — period-accurate/world-knowledge scenes.

Edit:

text-localization — translate/replace in-image text, preserve layout.
identity-preserve — try-on, person-in-scene; lock face/body/pose.
precise-object-edit — remove/replace a specific element (including interior swaps).
lighting-weather — time-of-day/season/atmosphere changes only.
background-extraction — transparent background / clean cutout.
style-transfer — apply reference style while changing subject/scene.
compositing — multi-image insert/merge with matched lighting/perspective.
sketch-to-render — drawing/line art to photoreal render.
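
Because the slugs are meant to stay exact, it can help to validate them before they reach a prompt. A minimal sketch, assuming the two groups above are the complete set; the constant and function names are hypothetical.

```python
GENERATE_SLUGS = {
    "photorealistic-natural", "product-mockup", "ui-mockup", "infographic-diagram",
    "logo-brand", "illustration-story", "stylized-concept", "historical-scene",
}
EDIT_SLUGS = {
    "text-localization", "identity-preserve", "precise-object-edit", "lighting-weather",
    "background-extraction", "style-transfer", "compositing", "sketch-to-render",
}


def intent_for_slug(slug: str) -> str:
    """Map a taxonomy slug to its group, rejecting anything not listed above."""
    if slug in GENERATE_SLUGS:
        return "generate"
    if slug in EDIT_SLUGS:
        return "edit"
    raise ValueError(f"unknown use-case slug: {slug!r}")
```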

Shared prompt schema

Use the following labeled spec as shared prompt scaffolding for both top-level modes:

Use case: <taxonomy slug>
Asset type: <where the asset will be used>
Primary request: <user's main prompt>
Input images: <Image 1: role; Image 2: role> (optional)
Scene/backdrop: <environment>
Subject: <main subject>
Style/medium: <photo/illustration/3D/etc>
Composition/framing: <wide/close/top-down; placement>
Lighting/mood: <lighting + mood>
Color palette: <palette notes>
Materials/textures: <surface details>
Text (verbatim): "<exact text>"
Constraints: <must keep/must avoid>
Avoid: <negative constraints>

Notes:

Asset type and Input images are prompt scaffolding, not dedicated CLI flags.
Scene/backdrop refers to the visual setting. It is not the same as the fallback CLI background parameter, which controls output transparency behavior.
Fallback-only execution notes such as Quality:, Input fidelity:, masks, output format, and output paths belong in the explicit CLI path only. Do not treat them as built-in image_gen tool arguments.

Augmentation rules:

Keep it short.
Add only the details needed to improve the prompt materially.
For edits, explicitly list invariants (change only X; keep Y unchanged).
If any critical detail is missing and blocks success, ask a question; otherwise proceed.
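
The labeled spec can be assembled programmatically so that empty lines are dropped and the label order stays stable. This is a sketch, not the skill's implementation: `render_spec` and `FIELD_ORDER` are hypothetical names, and appending unknown labels at the end is one possible reading of "not a closed schema".

```python
FIELD_ORDER = [
    "Use case", "Asset type", "Primary request", "Input images", "Scene/backdrop",
    "Subject", "Style/medium", "Composition/framing", "Lighting/mood",
    "Color palette", "Materials/textures", "Text (verbatim)", "Constraints", "Avoid",
]


def render_spec(fields: dict[str, str]) -> str:
    """Render only the lines that carry content, in the schema's label order."""
    lines = []
    for label in FIELD_ORDER:
        value = fields.get(label, "").strip()
        if not value:
            continue  # Keep it short: skip lines that add nothing.
        if label == "Text (verbatim)":
            value = f'"{value}"'  # Quote exact text so it is rendered verbatim.
        lines.append(f"{label}: {value}")
    # Extra labeled lines are allowed when they materially improve clarity.
    for label, value in fields.items():
        if label not in FIELD_ORDER and value.strip():
            lines.append(f"{label}: {value.strip()}")
    return "\n".join(lines)
```

For an edit, the invariants would go in the Constraints line, for example "change only the background; keep the product and its edges unchanged".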

Examples

Generation example (hero image)

Use case: product-mockup
Asset type: landing page hero
Primary request: a minimal hero image of a ceramic coffee mug
Style/medium: clean product photography
Composition/framing: wide composition with usable negative space for page copy if needed
Lighting/mood: soft studio lighting
Constraints: no logos, no text, no watermark

Edit example (invariants)

Use case: precise-object-edit
Asset type: product photo background replacement
Primary request: replace only the background with a warm sunset gradient
Constraints: change only the background; keep the product and its edges unchanged; no text; no watermark

Prompting best practices

Structure the prompt as scene/backdrop -> subject -> details -> constraints.
Include intended use (ad, UI mock, infographic) to set the mode and polish level.
Use camera/composition language for photorealism.
Only use SVG/vector stand-ins when the user explicitly asked for vector output or a non-image placeholder.
Quote exact text and specify typography + placement.
For tricky words, spell them letter-by-letter and require verbatim rendering.
For multi-image inputs, reference images by index and describe how they should be used.
For edits, repeat invariants every iteration to reduce drift.
Iterate with single-change follow-ups.
If the prompt is generic, add only the extra detail that will materially help.
If the prompt is already detailed, normalize it instead of expanding it.
For explicit CLI fallback only, see references/cli.md and references/image-api.md for quality, input_fidelity, masks, output format, and output-path guidance.

More principles shared by both modes: references/prompting.md. Copy/paste specs shared by both modes: references/sample-prompts.md.

Guidance by asset type

Asset-type templates (website assets, game assets, wireframes, logo) are consolidated in references/sample-prompts.md.

Fallback CLI mode only

Temp and output conventions

These conventions apply only to the explicit CLI fallback. They do not describe built-in image_gen output behavior.

Use tmp/imagegen/ for intermediate files (for example JSONL batches); delete them when done.
Write final artifacts under output/imagegen/.
Use --out or --out-dir to control output paths; keep filenames stable and descriptive.
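
The directory conventions above could be wrapped in two small helpers. This is a sketch under assumptions: the `"prompt"` key and the `batch.jsonl` filename are placeholders chosen for illustration, and the actual batch format accepted by generate-batch should be taken from references/cli.md.

```python
import json
import shutil
from pathlib import Path

TMP_DIR = Path("tmp/imagegen")      # intermediate files, deleted when done
OUT_DIR = Path("output/imagegen")   # final artifacts


def write_batch_file(prompts: list[str], root: Path = Path(".")) -> Path:
    """Write one JSON object per line; the "prompt" key is an assumed field name."""
    tmp = root / TMP_DIR
    tmp.mkdir(parents=True, exist_ok=True)
    (root / OUT_DIR).mkdir(parents=True, exist_ok=True)
    batch = tmp / "batch.jsonl"
    batch.write_text(
        "".join(json.dumps({"prompt": p}) + "\n" for p in prompts),
        encoding="utf-8",
    )
    return batch


def cleanup_tmp(root: Path = Path(".")) -> None:
    """Remove intermediate files once the final artifacts are written."""
    shutil.rmtree(root / TMP_DIR, ignore_errors=True)
```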

Dependencies

Prefer uv for dependency management in this repo.

Required Python package:

uv pip install openai

Optional for downscaling only:

uv pip install pillow

Portability note:

If you are using the installed skill outside this repo, install dependencies into that environment with its package manager.
In uv-managed environments, uv pip install ... remains the preferred path.

Environment

OPENAI_API_KEY must be set for live API calls.
Do not ask the user for OPENAI_API_KEY when using the built-in image_gen tool.
Never ask the user to paste the full key in chat. Ask them to set it locally and confirm when ready.

If the key is missing, give the user these steps:

Create an API key in the OpenAI platform UI: https://platform.openai.com/api-keys
Set OPENAI_API_KEY as an environment variable in their system.
Offer to guide them through setting the environment variable for their OS/shell if needed.
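
A readiness check for CLI mode only needs to know whether the variable is present, never its value. The helper below is a hypothetical sketch; it reads the environment through the standard `os.environ` mapping and deliberately avoids echoing the key.

```python
import os


def api_key_status(env=None) -> str:
    """Report whether OPENAI_API_KEY is set without revealing it."""
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY", "").strip()
    if key:
        # Confirm presence only; the key itself is never printed or returned.
        return "OPENAI_API_KEY is set."
    return (
        "OPENAI_API_KEY is not set. Create a key at "
        "https://platform.openai.com/api-keys, set it as an environment "
        "variable locally, and confirm when ready."
    )
```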

If installation is not possible in this environment, tell the user which dependency is missing and how to install it into their active environment.

Script-mode notes

CLI commands + examples: references/cli.md
API parameter quick reference: references/image-api.md
Network approvals / sandbox settings for CLI mode: references/codex-network.md

Reference map

references/prompting.md: shared prompting principles for both modes.
references/sample-prompts.md: shared copy/paste prompt recipes for both modes.
references/cli.md: fallback-only CLI usage via scripts/image_gen.py.
references/image-api.md: fallback-only API/CLI parameter reference.
references/codex-network.md: fallback-only network/sandbox troubleshooting for CLI mode.
scripts/image_gen.py: fallback-only CLI implementation. Do not load or use it unless the user explicitly chooses CLI mode.

Weekly Installs: 602
Repository: openai/skills
GitHub Stars: 15.3K
First Seen: Jan 28, 2026
Security Audits: Gen Agent Trust Hub (Pass), Socket (Pass), Snyk (Pass)
Installed on: codex (541), opencode (516), gemini-cli (497), github-copilot (486), cursor (474), kimi-cli (465)


Workflow (continued)

If the user's prompt is generic, add tasteful augmentation only when it materially improves output quality.

Use the built-in image_gen tool by default.

If the user explicitly chooses the CLI fallback, then and only then use the fallback-only docs for quality, input_fidelity, masks, output format, output paths, and network setup.

Inspect outputs and validate: subject, style, composition, text accuracy, and invariants/avoid items.

Iterate with a single targeted change, then re-check.

For preview-only work, render the image inline; the underlying file may remain at the default $CODEX_HOME/generated_images/... path.

For project-bound work, move or copy the selected artifact into the workspace and update any consuming code or references. Never leave a project-referenced asset only at the default $CODEX_HOME/generated_images/... path.

For batches, persist only the selected finals in the workspace unless the user explicitly asked to keep discarded variants.

Always report the final saved path for any workspace-bound asset, plus the final prompt and whether the built-in tool or fallback CLI mode was used.