⚠️ Important prerequisite

Installing AI Skills requires unrestricted internet access through a proxy or VPN with TUN mode enabled. Without it, the installation may fail.

Veo 3.2 Prompt Designer - Turn multimodal assets into high-quality video generation prompts

veo-3.2-prompter by pexoai/pexo-skills

50 weekly installs

388 GitHub Stars

Install command

npx skills add https://github.com/pexoai/pexo-skills --skill veo-3.2-prompter

AI/Machine Learning, Content Creation, Prompt Engineering

Veo 3.2 Prompt Designer Skill

This skill transforms a user's scattered multimodal assets (images, videos, audio) and creative intent into a structured, executable prompt for the Google Veo 3.2 video generation model (Artemis engine). It acts as an expert prompt engineer, ensuring the highest quality output from the underlying model.

When to Use

When the user provides assets (images, videos, audio) for video generation with Veo 3.2.
When the user's request is complex and requires careful prompt construction for the Veo model.
When using any Google Veo 3.x model for video generation.

Core Function

This skill analyzes all user inputs and generates a single, optimized JSON object containing the final prompt and recommended parameters. The internal workflow (Recognition, Mapping, Construction) is handled automatically and should not be exposed to the user.

Internal Workflow

Phase 1: Recognition — Analyze uploaded assets and user intent. Use the atomic_element_mapping.md to classify each asset into its atomic element role(s).
Phase 2: Mapping — For each atomic element, determine the optimal reference method (reference image, text prompt, or hybrid). Use the mapping table to decide.
Phase 3: Construction — Assemble the final prompt using the 5-Part Framework (Shot → Subject → Environment → Camera → Style) and attach reference images via the Gemini API's RawReferenceImage system.
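
The Construction phase above can be illustrated with a minimal sketch. It assumes the 5-Part Framework amounts to joining five text slots in the listed order; the `PromptParts` class and `build_prompt` function are hypothetical names for illustration, not part of the skill, and the way the skill actually attaches reference images through the Gemini API is not shown here.

```python
from dataclasses import dataclass


@dataclass
class PromptParts:
    """Hypothetical container for the five framework slots, in listed order."""
    shot: str
    subject: str
    environment: str
    camera: str
    style: str


def build_prompt(parts: PromptParts, audio: str = "") -> str:
    """Join the five parts in framework order and append an optional audio cue."""
    ordered = [parts.shot, parts.subject, parts.environment, parts.camera, parts.style]
    # Skip empty slots and trim trailing punctuation so the comma-joined text stays clean.
    visual = ", ".join(p.strip().rstrip(".,") for p in ordered if p.strip())
    prompt = visual + "."
    if audio.strip():
        prompt += " " + audio.strip()
    return prompt


parts = PromptParts(
    shot="Hero shot",
    subject="a frosted glass perfume bottle with gold cap rotating slowly",
    environment="reflective dark surface with three-point studio lighting",
    camera="smooth 180-degree arc",
    style="hyper-realistic luxury commercial style with shallow depth of field",
)
print(build_prompt(parts, audio="Crystalline chime, soft ambient pad."))
```

Keeping the slots separate until the last step makes it easy to leave one out (for example, when a reference image already carries the subject) without rewriting the rest of the prompt.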

Usage Example

User Request: "Make a cinematic shot of this perfume bottle rotating on a dark surface, like a luxury commercial." User uploads perfume.png

Agent using veo-3.2-prompter: The agent internally processes the request and assets, then outputs the final JSON to the next skill in the chain.

Final Output (for internal use):

{
  "final_prompt": "Hero shot, a frosted glass perfume bottle with gold cap rotating slowly on a reflective dark surface, three-point studio lighting with soft key and rim light creating subtle caustics, smooth 180-degree arc, hyper-realistic luxury commercial style with shallow depth of field. Crystalline chime, soft ambient pad.",
  "reference_images": [
    {
      "file": "perfume.png",
      "reference_type": "SUBJECT"
    }
  ],
  "recommended_parameters": {
    "model": "veo-3.2-generate",
    "duration_seconds": 8,
    "aspect_ratio": "16:9",
    "resolution": "1080p",
    "generate_audio": true
  }
}
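
As a rough illustration of how the next skill in the chain might consume this object, the sketch below parses the JSON and checks the three top-level keys seen in the example. The `load_prompter_output` function and the `REQUIRED_KEYS` set are assumptions made for this sketch; the skill itself does not document a schema or a validator, and the prompt string is shortened for brevity.

```python
import json

# Top-level keys taken from the example output above; treating them as
# required is an assumption of this sketch.
REQUIRED_KEYS = {"final_prompt", "reference_images", "recommended_parameters"}


def load_prompter_output(raw: str) -> dict:
    """Parse the prompter's JSON text and check its basic shape."""
    data = json.loads(raw)
    missing = REQUIRED_KEYS - data.keys()
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    if not isinstance(data["final_prompt"], str) or not data["final_prompt"].strip():
        raise ValueError("final_prompt must be a non-empty string")
    for ref in data["reference_images"]:
        if "file" not in ref or "reference_type" not in ref:
            raise ValueError(f"incomplete reference image entry: {ref}")
    return data


raw = """
{
  "final_prompt": "Hero shot, a frosted glass perfume bottle with gold cap rotating slowly on a reflective dark surface.",
  "reference_images": [{"file": "perfume.png", "reference_type": "SUBJECT"}],
  "recommended_parameters": {
    "model": "veo-3.2-generate",
    "duration_seconds": 8,
    "aspect_ratio": "16:9",
    "resolution": "1080p",
    "generate_audio": true
  }
}
"""

output = load_prompter_output(raw)
print(output["recommended_parameters"]["model"])
print([ref["file"] for ref in output["reference_images"]])
```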

Veo 3.2 Key Differentiators

Feature	Capability
Engine	Artemis — world-model physics simulation (not pixel prediction)
Max duration	~30s native continuous generation
Audio	Native dialogue + synchronized SFX
Reference images	Up to 3 (`STYLE`, `SUBJECT`, `SUBJECT_FACE`)
Video extension	Chain clips via previous video input
First/last frame	Specify start and/or end keyframes
Resolutions	720p, 1080p, 4K (with upscaling)
Aspect ratios	16:9, 9:16
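
The limits in this table could be turned into a small pre-flight check, sketched below. Reading "Up to 3" as a cap on the number of reference images, treating "~30s" as a hard maximum, and the `check_request` function itself are all assumptions for illustration; the model's actual enforcement rules are not described on this page.

```python
# Limits transcribed from the table above; treating them as hard rules is an assumption.
ALLOWED_REFERENCE_TYPES = {"STYLE", "SUBJECT", "SUBJECT_FACE"}
ALLOWED_RESOLUTIONS = {"720p", "1080p", "4K"}
ALLOWED_ASPECT_RATIOS = {"16:9", "9:16"}
MAX_REFERENCE_IMAGES = 3
MAX_DURATION_SECONDS = 30  # the table says "~30s"; used here as a hard cap


def check_request(reference_images, params):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    if len(reference_images) > MAX_REFERENCE_IMAGES:
        problems.append(
            f"too many reference images: {len(reference_images)} > {MAX_REFERENCE_IMAGES}"
        )
    for ref in reference_images:
        ref_type = ref.get("reference_type")
        if ref_type not in ALLOWED_REFERENCE_TYPES:
            problems.append(f"unknown reference_type: {ref_type!r}")
    if params.get("resolution") not in ALLOWED_RESOLUTIONS:
        problems.append(f"unsupported resolution: {params.get('resolution')!r}")
    if params.get("aspect_ratio") not in ALLOWED_ASPECT_RATIOS:
        problems.append(f"unsupported aspect ratio: {params.get('aspect_ratio')!r}")
    duration = params.get("duration_seconds", 0)
    if not 0 < duration <= MAX_DURATION_SECONDS:
        problems.append(f"duration out of range: {duration}")
    return problems


print(check_request(
    [{"file": "perfume.png", "reference_type": "SUBJECT"}],
    {"duration_seconds": 8, "aspect_ratio": "16:9", "resolution": "1080p"},
))
```

Returning a list of problems rather than raising on the first one lets an agent report every issue to the user in a single pass.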

Knowledge Base

This skill relies on an internal knowledge base to make informed decisions. The agent MUST consult these files during execution.

references/atomic_element_mapping.md: Core knowledge. Contains the "Asset Type → Atomic Element" and "Atomic Element → Optimal Reference Method" mapping tables, adapted for Veo 3.2's reference image system.
references/veo_syntax_guide.md: Veo 3.2 Gemini API syntax reference, covering RawReferenceImage, GenerateVideosConfig, video extension, and first/last frame specification.

Repository: pexoai/pexo-skills
GitHub Stars: 388
First Seen: Mar 9, 2026
Installed on: gemini-cli (48), codex (48), github-copilot (47), amp (47), cline (47), kimi-cli (47)
