AI Explainer Video Generator - Create tutorial and product introduction videos in one step, with AI-generated visuals

explainer by marswaveai/skills

379 weekly installs

28 GitHub Stars

Install command

npx skills add https://github.com/marswaveai/skills --skill explainer

AI/Machine Learning, Content Creation, Automation


When to Use

User wants to create an explainer or tutorial video
User asks to "explain" something in video form
User wants narrated content with AI-generated visuals
User says "explainer video", "解说视频", "tutorial video"

When NOT to Use

User wants audio-only content without visuals (use /speech or /podcast)
User wants a podcast-style discussion (use /podcast)
User wants to generate a standalone image (use /image-gen)
User wants to read text aloud without video (use /speech)

Purpose

Generate explainer videos that combine a single narrator's voiceover with AI-generated visuals. Ideal for product introductions, concept explanations, and tutorials. Supports text-only script generation or full text + video output.

Hard Constraints

No shell scripts. Construct curl commands from the API reference files listed in Resources
Always read shared/authentication.md for API key and headers
Follow shared/common-patterns.md for polling, errors, and interaction patterns
Always read config following shared/config-pattern.md before any interaction
Never hardcode speaker IDs — always fetch from the speakers API
Never save files to ~/Downloads/ or .listenhub/ — save artifacts to the current working directory with friendly topic-based names (see shared/config-pattern.md § Artifact Naming)
Explainer uses exactly 1 speaker
Mode must be info (for Info style) or story (for Story style) — never slides (use /slides skill instead)

Step -1: API Key Check

Follow shared/config-pattern.md § API Key Check. If the key is missing, stop immediately.

Step 0: Config Setup

Follow shared/config-pattern.md Step 0 (Zero-Question Boot).

If file doesn't exist — silently create with defaults and proceed:

mkdir -p ".listenhub/explainer"
echo '{"outputMode":"inline","language":null,"defaultStyle":null,"defaultSpeakers":{}}' > ".listenhub/explainer/config.json"
CONFIG_PATH=".listenhub/explainer/config.json"
CONFIG=$(cat "$CONFIG_PATH")

Do NOT ask any setup questions. Proceed directly to the Interaction Flow.

If file exists — read config silently and proceed:

CONFIG_PATH=".listenhub/explainer/config.json"
[ ! -f "$CONFIG_PATH" ] && CONFIG_PATH="$HOME/.listenhub/explainer/config.json"
CONFIG=$(cat "$CONFIG_PATH")
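The two branches above can be folded into a single helper. This is a minimal sketch, assuming the same paths and defaults as the snippets above; the function name `load_explainer_config` and the lookup order (project directory first, then `$HOME`, then create defaults) are assumptions, not something shared/config-pattern.md is known to prescribe.

```shell
# Hypothetical helper: load the explainer config, creating defaults if absent.
# Sets CONFIG_PATH and CONFIG for the rest of the session.
load_explainer_config() {
  local local_path=".listenhub/explainer/config.json"
  local home_path="$HOME/.listenhub/explainer/config.json"

  if [ -f "$local_path" ]; then
    CONFIG_PATH="$local_path"
  elif [ -f "$home_path" ]; then
    CONFIG_PATH="$home_path"
  else
    # Neither file exists: write the defaults silently, ask no questions.
    mkdir -p "$(dirname "$local_path")"
    echo '{"outputMode":"inline","language":null,"defaultStyle":null,"defaultSpeakers":{}}' > "$local_path"
    CONFIG_PATH="$local_path"
  fi
  CONFIG=$(cat "$CONFIG_PATH")
}
```

Calling `load_explainer_config` once at the start of a session leaves `CONFIG_PATH` and `CONFIG` set the same way the separate snippets do.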

Setup Flow (user-initiated reconfigure only)

Only run when the user explicitly asks to reconfigure. Display current settings:

当前配置 (explainer)：
  输出方式：{inline / download / both}
  语言偏好：{zh / en / 未设置}
  默认风格：{info / story / 未设置}
  默认主播：{speakerName / 使用内置默认}

Then ask:

outputMode: Follow shared/output-mode.md § Setup Flow Question.
Language (optional): "默认语言？"
- "中文 (zh)"
- "English (en)"
- "每次手动选择" → keep null
Style (optional): "默认风格？"
- "Info — 信息展示型"
- "Story — 故事叙述型"
- "每次手动选择" → keep null

After collecting answers, save immediately:

NEW_CONFIG=$(echo "$CONFIG" | jq --arg m "$OUTPUT_MODE" '. + {"outputMode": $m}')
echo "$NEW_CONFIG" > "$CONFIG_PATH"
CONFIG=$(cat "$CONFIG_PATH")

Interaction Flow

Step 1: Topic / Content

Free text input. Ask the user:

What would you like to explain or introduce?

Accept: topic description, text content, or concept to explain.

Step 2: Language

If config.language is set, pre-fill and show in summary — skip this question. Otherwise ask:

Question: "What language?"
Options:
  - "Chinese (zh)" — Content in Mandarin Chinese
  - "English (en)" — Content in English

Step 3: Style

If config.defaultStyle is set, pre-fill and show in summary — skip this question. Otherwise ask:

Question: "What style of explainer?"
Options:
  - "Info" — Informational, factual presentation style
  - "Story" — Narrative, storytelling approach

Step 4: Speaker Selection

Follow shared/speaker-selection.md:

If config.defaultSpeakers.{language} is set → use saved speaker silently
If not set → use built-in default from shared/speaker-selection.md for the language
Show the speaker in the confirmation summary (Step 6) — user can change from there if desired
Only show the full speaker list if the user explicitly asks to change voice

Only 1 speaker is supported for explainer videos.
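Steps 2 to 4 each begin by checking whether the saved config already answers the question. A minimal sketch of that lookup, assuming the config shape written in Step 0 and speakers stored as a one-element array per language (as in the post-generation update). The sample `CONFIG` value is made up for illustration.

```shell
# Sample config for illustration only; in a real session CONFIG comes from Step 0.
CONFIG='{"outputMode":"inline","language":"en","defaultStyle":null,"defaultSpeakers":{"en":["cozy-man-english"]}}'

# "// empty" turns null or missing values into an empty string.
LANGUAGE=$(echo "$CONFIG" | jq -r '.language // empty')
STYLE=$(echo "$CONFIG" | jq -r '.defaultStyle // empty')
SPEAKER_ID=$(echo "$CONFIG" | jq -r --arg lang "$LANGUAGE" '.defaultSpeakers[$lang][0] // empty')

# An empty value means the question still has to be asked
# (or, for the speaker, that the built-in default applies).
if [ -z "$LANGUAGE" ]; then echo "ask: language"; fi
if [ -z "$STYLE" ]; then echo "ask: style"; fi
if [ -z "$SPEAKER_ID" ]; then echo "use built-in default speaker"; fi
```

With the sample config above, only the style question remains open; language and speaker are pre-filled and shown in the Step 6 summary.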

Step 5: Output Type

Question: "What output do you want?"
Options:
  - "Text script only" — Generate narration script, no video
  - "Text + Video" — Generate full explainer video with AI visuals

Step 6: Confirm & Generate

Summarize all choices:

Ready to generate explainer:

  Topic: {topic}
  Language: {language}
  Style: {info/story}
  Speaker: {speaker name}
  Output: {text only / text + video}

  Proceed?

Wait for explicit confirmation before calling any API.

Workflow

1. Submit (foreground): POST /storybook/episodes with content, speaker, language, mode → extract episodeId
2. Tell the user the task is submitted

3. Poll (background): Run the following exact bash command with run_in_background: true and timeout: 600000. Do NOT use python3, awk, or any other JSON parser; use jq as shown:

EPISODE_ID="<id-from-step-1>"
for i in $(seq 1 30); do
  RESULT=$(curl -sS "https://api.marswave.ai/openapi/v1/storybook/episodes/$EPISODE_ID" \
    -H "Authorization: Bearer $LISTENHUB_API_KEY" \
    -H "X-Source: skills" 2>/dev/null)
  STATUS=$(echo "$RESULT" | tr -d '\000-\037\177' | jq -r '.data.processStatus // "pending"')
  case "$STATUS" in
    success|completed) echo "$RESULT"; exit 0 ;;
    failed|error) echo "FAILED: $RESULT" >&2; exit 1 ;;
    *) sleep 10 ;;
  esac
done
echo "TIMEOUT" >&2; exit 2
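The loop above makes at most 30 requests 10 seconds apart, so it gives up after roughly 300 seconds, well inside the 600000 ms tool timeout. Its status handling can be tried offline with the sketch below. The helper name `classify_status` and the sample response bodies are made up for illustration; only the `.data.processStatus` and `.data.videoStatus` fields come from the commands in this document.

```shell
# Hypothetical helper: map a response body to done / failed / pending,
# using the same tr + jq pipeline as the polling loop.
classify_status() {
  local body="$1" field="$2" status
  # tr strips raw control characters that would otherwise make jq reject the JSON.
  status=$(printf '%s' "$body" | tr -d '\000-\037\177' | jq -r --arg f "$field" '.data[$f] // "pending"')
  case "$status" in
    success|completed) echo "done" ;;
    failed|error)      echo "failed" ;;
    *)                 echo "pending" ;;
  esac
}

classify_status '{"data":{"processStatus":"success"}}' processStatus   # prints "done"
classify_status '{"data":{}}' processStatus                            # prints "pending"
classify_status '{"data":{"videoStatus":"failed"}}' videoStatus        # prints "failed"
```

A missing or null status field falls through to "pending", which is why the loop keeps sleeping until the API reports a terminal state.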

4. When notified, download and present script:

Read OUTPUT_MODE from config. Follow shared/output-mode.md for behavior.

inline or both: Present the script inline.

Present:

     解说脚本已生成！
     
     「{title}」
     
     在线查看：https://listenhub.ai/app/explainer/{episodeId}

download or both: Also save the script file. Generate a topic slug following shared/config-pattern.md § Artifact Naming.

 * If text-only output: save as `{slug}-explainer.md` in cwd (dedup if exists)
 * If text+video output: create `{slug}-explainer/` folder (dedup if exists), write `script.md` inside
 * Present the save path in addition to the above summary.
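The "dedup if exists" rule can be sketched as follows. The actual naming scheme lives in shared/config-pattern.md § Artifact Naming, which is not reproduced here, so the numeric suffix (`-2`, `-3`, ...) and the helper name `dedup_path` are assumptions; the slug `claude-code` is only an example.

```shell
# Hypothetical helper: return the first unused path for a base name.
# Assumed dedup scheme: append -2, -3, ... before the extension.
dedup_path() {
  local stem="$1" ext="$2" candidate n=2
  candidate="${stem}${ext}"
  while [ -e "$candidate" ]; do
    candidate="${stem}-${n}${ext}"
    n=$((n + 1))
  done
  printf '%s\n' "$candidate"
}

# Text-only output: a single markdown file in the current directory.
SCRIPT_PATH=$(dedup_path "claude-code-explainer" ".md")
# Text + video output: a folder that will hold script.md.
OUT_DIR=$(dedup_path "claude-code-explainer" "")
```

Neither call creates anything; the caller still writes the file or runs `mkdir -p "$OUT_DIR"` once the path is chosen.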

5. If video requested: POST /storybook/episodes/{episodeId}/video (foreground) → poll again (background) using the exact bash command below with run_in_background: true and timeout: 600000. Poll for videoStatus, not processStatus:

     EPISODE_ID="<id-from-step-1>"
     for i in $(seq 1 30); do
       RESULT=$(curl -sS "https://api.marswave.ai/openapi/v1/storybook/episodes/$EPISODE_ID" \
         -H "Authorization: Bearer $LISTENHUB_API_KEY" \
         -H "X-Source: skills" 2>/dev/null)
       STATUS=$(echo "$RESULT" | tr -d '\000-\037\177' | jq -r '.data.videoStatus // "pending"')
       case "$STATUS" in
         success|completed) echo "$RESULT"; exit 0 ;;
         failed|error) echo "FAILED: $RESULT" >&2; exit 1 ;;
         *) sleep 10 ;;
       esac
     done
     echo "TIMEOUT" >&2; exit 2

6. When notified, download and present result:

Read OUTPUT_MODE from config. Follow shared/output-mode.md for behavior.

inline or both: Display video URL and audio URL as clickable links.

Present:

解说视频已生成！

视频链接：{videoUrl}
音频链接：{audioUrl}
时长：{duration}s
消耗积分：{credits}

download or both: Also download the audio file into the {slug}-explainer/ folder.

curl -sS -o "{slug}-explainer/audio.mp3" "{audioUrl}"

Present:

已保存到当前目录：
  {slug}-explainer/
    script.md
    audio.mp3

After Successful Generation

Update config with the choices made this session:

NEW_CONFIG=$(echo "$CONFIG" | jq \
  --arg lang "{language}" \
  --arg style "{info/story}" \
  --arg speakerId "{speakerId}" \
  '. + {"language": $lang, "defaultStyle": $style, "defaultSpeakers": (.defaultSpeakers + {($lang): [$speakerId]})}')
echo "$NEW_CONFIG" > "$CONFIG_PATH"
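To see what the merge above does to existing entries, here is the same jq filter run against a sample config with concrete values. The sample config and the speaker ID `some-zh-speaker` are made up for illustration.

```shell
# Sample starting config: a Chinese default speaker is already saved.
CONFIG='{"outputMode":"inline","language":null,"defaultStyle":null,"defaultSpeakers":{"zh":["some-zh-speaker"]}}'

# Same filter as above, with the placeholders filled in for an English "info" session.
NEW_CONFIG=$(echo "$CONFIG" | jq -c \
  --arg lang "en" \
  --arg style "info" \
  --arg speakerId "cozy-man-english" \
  '. + {"language": $lang, "defaultStyle": $style, "defaultSpeakers": (.defaultSpeakers + {($lang): [$speakerId]})}')

echo "$NEW_CONFIG"
```

Because `.defaultSpeakers + {...}` merges objects, the existing `zh` entry is kept and an `en` entry is added, while `outputMode` is left untouched.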

Estimated times:

Text script only: 2-3 minutes
Text + Video: 3-5 minutes

API Reference

Speaker list: shared/api-speakers.md
Speaker selection guide: shared/speaker-selection.md
Episode creation: shared/api-storybook.md
Polling: shared/common-patterns.md § Async Polling
Config pattern: shared/config-pattern.md

Composability

Invokes: speakers API (for speaker selection); may invoke /speech for voiceover
Invoked by: content-planner (Phase 3)

Example

User: "Create an explainer video introducing Claude Code"

Agent workflow:

Topic: "Claude Code introduction"
Ask language → "English"
Ask style → "Info"
Fetch speakers, user picks "cozy-man-english"
Ask output → "Text + Video"

curl -sS -X POST "https://api.marswave.ai/openapi/v1/storybook/episodes" \
  -H "Authorization: Bearer $LISTENHUB_API_KEY" \
  -H "Content-Type: application/json" \
  -H "X-Source: skills" \
  -d '{
    "sources": [{"type": "text", "content": "Introduce Claude Code: what it is, key features, and how to get started"}],
    "speakers": [{"speakerId": "cozy-man-english"}],
    "language": "en",
    "mode": "info"
  }'

Poll until text is ready, then generate video if requested.
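Between submitting and polling, the episode ID has to be pulled out of the POST response. A minimal sketch, assuming the ID is returned at `.data.episodeId` by analogy with `.data.processStatus`; the real response shape is defined in shared/api-storybook.md, and the sample body and the ID `ep_123` are made up.

```shell
# Sample response body for illustration only.
RESPONSE='{"data":{"episodeId":"ep_123"}}'

# Assumed field path; "// empty" yields an empty string when the field is missing.
EPISODE_ID=$(echo "$RESPONSE" | jq -r '.data.episodeId // empty')

if [ -z "$EPISODE_ID" ]; then
  echo "No episodeId in response" >&2
else
  echo "Submitted episode $EPISODE_ID"
  # Endpoint used later if the user asked for Text + Video.
  VIDEO_URL="https://api.marswave.ai/openapi/v1/storybook/episodes/$EPISODE_ID/video"
fi
```

The resulting `EPISODE_ID` is the value that replaces `<id-from-step-1>` in the polling commands.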

Weekly Installs: 379
Repository: marswaveai/skills
GitHub Stars: 28
First Seen: 12 days ago
Security Audits: Gen Agent Trust Hub: Pass, Socket: Pass, Snyk: Pass
Installed on: codex (375), gemini-cli (373), cursor (373), opencode (373), cline (372), github-copilot (372)
