podcast-generation by bytedance/deer-flow
npx skills add https://github.com/bytedance/deer-flow --skill podcast-generation
This skill generates high-quality podcast audio from text content. The workflow includes creating a structured JSON script (conversational dialogue) and executing audio generation through text-to-speech synthesis.
When a user requests podcast generation, identify the relevant folders under /mnt/user-data.
Generate a structured JSON script file in /mnt/user-data/workspace/ with the naming pattern {descriptive-name}-script.json.
The JSON structure:
{
"locale": "en",
"lines": [
{"speaker": "male", "paragraph": "dialogue text"},
{"speaker": "female", "paragraph": "dialogue text"}
]
}
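As a minimal sketch, the script file could be produced from Python as follows. The helper names `script_path` and `write_script` are hypothetical and not part of the skill; only the workspace directory, the naming pattern, and the JSON fields come from the text above.

```python
import json
from pathlib import Path

# Workspace directory named in this document.
WORKSPACE = Path("/mnt/user-data/workspace")


def script_path(descriptive_name, workspace=WORKSPACE):
    """Apply the {descriptive-name}-script.json naming pattern (hypothetical helper)."""
    return Path(workspace) / f"{descriptive_name}-script.json"


def write_script(descriptive_name, locale, lines, workspace=WORKSPACE):
    """Write a podcast script file and return its path (hypothetical helper).

    `lines` is a list of (speaker, paragraph) pairs.
    """
    script = {
        "locale": locale,
        "lines": [{"speaker": s, "paragraph": p} for s, p in lines],
    }
    path = script_path(descriptive_name, workspace)
    path.parent.mkdir(parents=True, exist_ok=True)
    # ensure_ascii=False keeps Chinese dialogue readable in the file.
    path.write_text(
        json.dumps(script, ensure_ascii=False, indent=2), encoding="utf-8"
    )
    return path
```

Calling `write_script("ai-history", "en", [("male", "Hello Deer!")])` would write `ai-history-script.json` into the workspace directory.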
Call the Python script:
python /mnt/skills/public/podcast-generation/scripts/generate.py \
--script-file /mnt/user-data/workspace/script-file.json \
--output-file /mnt/user-data/outputs/generated-podcast.mp3 \
--transcript-file /mnt/user-data/outputs/generated-podcast-transcript.md
Parameters:
- --script-file: Absolute path to JSON script file (required)
- --output-file: Absolute path to output MP3 file (required)
- --transcript-file: Absolute path to output transcript markdown file (optional, but recommended)

[!IMPORTANT]
- Execute the script in one complete call. Do NOT split the workflow into separate steps.
- The script handles all TTS API calls and audio generation internally.
- Do NOT read the Python file, just call it with the parameters.
- Always include --transcript-file to generate a readable transcript for the user.
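When the call is made from Python rather than a shell, the single-invocation rule can be sketched with `subprocess`. This is an illustration under assumptions: `build_command` and `run_generation` are hypothetical helper names, the interpreter defaults to `sys.executable` rather than the bare `python` shown above, and only the three documented flags are used.

```python
import subprocess
import sys

# Script location as given in this document.
GENERATE_SCRIPT = "/mnt/skills/public/podcast-generation/scripts/generate.py"


def build_command(script_file, output_file, transcript_file, python=sys.executable):
    """Return the argv list for one complete generate.py invocation."""
    return [
        python,
        GENERATE_SCRIPT,
        "--script-file", str(script_file),
        "--output-file", str(output_file),
        "--transcript-file", str(transcript_file),
    ]


def run_generation(script_file, output_file, transcript_file):
    """Run generation in a single call; raises CalledProcessError on a non-zero exit."""
    cmd = build_command(script_file, output_file, transcript_file)
    return subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems when a path contains spaces.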
The script JSON file must follow this structure:
{
"title": "The History of Artificial Intelligence",
"locale": "en",
"lines": [
{"speaker": "male", "paragraph": "Hello Deer! Welcome back to another episode."},
{"speaker": "female", "paragraph": "Hey everyone! Today we have an exciting topic to discuss."},
{"speaker": "male", "paragraph": "That's right! We're going to talk about..."}
]
}
Fields:
- title: Title of the podcast episode (optional, used as heading in transcript)
- locale: Language code - "en" for English or "zh" for Chinese
- lines: Array of dialogue lines
  - speaker: Either "male" or "female"
  - paragraph: The dialogue text for this speaker

When creating the script JSON, follow these guidelines:
User request: "Generate a podcast about the history of artificial intelligence"
Step 1: Create script file /mnt/user-data/workspace/ai-history-script.json:
{
"title": "The History of Artificial Intelligence",
"locale": "en",
"lines": [
{"speaker": "male", "paragraph": "Hello Deer! Welcome back to another fascinating episode. Today we're diving into something that's literally shaping our future - the history of artificial intelligence."},
{"speaker": "female", "paragraph": "Oh, I love this topic! You know, AI feels so modern, but it actually has roots going back over seventy years."},
{"speaker": "male", "paragraph": "Exactly! It all started back in the 1950s. The term artificial intelligence was actually coined by John McCarthy in 1956 at a famous conference at Dartmouth."},
{"speaker": "female", "paragraph": "Wait, so they were already thinking about machines that could think back then? That's incredible!"},
{"speaker": "male", "paragraph": "Right? The early pioneers were so optimistic. They thought we'd have human-level AI within a generation."},
{"speaker": "female", "paragraph": "But things didn't quite work out that way, did they?"},
{"speaker": "male", "paragraph": "No, not at all. The 1970s brought what's called the first AI winter..."}
]
}
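Before running generation, a script like the one above could be sanity-checked against the documented fields. The sketch below is not part of the skill: `validate_script` is a hypothetical helper, and treating `locale` as required and empty paragraphs as errors are assumptions, since the source only lists the allowed values.

```python
ALLOWED_LOCALES = {"en", "zh"}
ALLOWED_SPEAKERS = {"male", "female"}


def validate_script(script):
    """Return a list of problems; an empty list means the script looks valid."""
    problems = []
    # title is optional, but should be text when present.
    if "title" in script and not isinstance(script["title"], str):
        problems.append("title must be a string when present")
    locale = script.get("locale")
    if not isinstance(locale, str) or locale not in ALLOWED_LOCALES:
        problems.append('locale must be "en" or "zh"')
    lines = script.get("lines")
    if not isinstance(lines, list) or not lines:
        problems.append("lines must be a non-empty array")
        return problems
    for i, line in enumerate(lines):
        if not isinstance(line, dict):
            problems.append(f"lines[{i}] must be an object")
            continue
        speaker = line.get("speaker")
        if not isinstance(speaker, str) or speaker not in ALLOWED_SPEAKERS:
            problems.append(f'lines[{i}].speaker must be "male" or "female"')
        paragraph = line.get("paragraph")
        if not isinstance(paragraph, str) or not paragraph.strip():
            problems.append(f"lines[{i}].paragraph must be non-empty text")
    return problems
```

Loading the file with `json.load` and passing the result to `validate_script` gives a quick check before any TTS call is made.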
Step 2: Execute generation:
python /mnt/skills/public/podcast-generation/scripts/generate.py \
--script-file /mnt/user-data/workspace/ai-history-script.json \
--output-file /mnt/user-data/outputs/ai-history-podcast.mp3 \
--transcript-file /mnt/user-data/outputs/ai-history-transcript.md
This will generate:
- ai-history-podcast.mp3: The audio podcast file
- ai-history-transcript.md: A readable markdown transcript of the podcast

Read the following template file only when matching the user request.
The generated podcast follows the "Hello Deer" format:
After generation:
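- The generated files are located in /mnt/user-data/outputs/
- Share both the podcast MP3 and the transcript MD with the user using the present_files tool

The following environment variables must be set: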
- VOLCENGINE_TTS_APPID: Volcengine TTS application ID
- VOLCENGINE_TTS_ACCESS_TOKEN: Volcengine TTS access token
- VOLCENGINE_TTS_CLUSTER: Volcengine TTS cluster (optional, defaults to "volcano_tts")
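A pre-flight check for these variables might look like the sketch below. It is only an illustration: `tts_settings` is a hypothetical helper, and how generate.py itself reads the variables is not described in the source. The variable names and the default cluster value are the ones listed above.

```python
import os

REQUIRED_VARS = ("VOLCENGINE_TTS_APPID", "VOLCENGINE_TTS_ACCESS_TOKEN")
DEFAULT_CLUSTER = "volcano_tts"


def tts_settings(env=None):
    """Collect TTS settings from the environment, or raise if a required one is missing."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {
        "appid": env["VOLCENGINE_TTS_APPID"],
        "access_token": env["VOLCENGINE_TTS_ACCESS_TOKEN"],
        # The cluster is optional and falls back to the documented default.
        "cluster": env.get("VOLCENGINE_TTS_CLUSTER") or DEFAULT_CLUSTER,
    }
```

Running this check before generation surfaces a missing credential early instead of partway through audio synthesis.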

Weekly Installs: 140
Repository: bytedance/deer-flow
GitHub Stars: 27.8K
First Seen: Feb 17, 2026
Security Audits: Gen Agent Trust Hub (Pass), Socket (Fail), Snyk (Pass)
Installed on: gemini-cli (137), github-copilot (137), opencode (137), kimi-cli (136), amp (136), cursor (136)