
ElevenLabs Text-to-Speech and Podcast Generation Skill - High-Quality AI Speech Synthesis and Conversational Audio Production

elevenlabs by sanjay3290/ai-skills

65 weekly installs

172 GitHub Stars

GitHub

Install command

npx skills add https://github.com/sanjay3290/ai-skills --skill elevenlabs

AI/Machine Learning, Content Creation, Audio Processing

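🇨🇳 Chinese Introduction

ElevenLabs - Text-to-Speech & Podcast Skill

Overview

This skill uses the ElevenLabs TTS API to convert text and documents into high-quality audio. It supports two modes: single-voice narration and two-host conversational podcast generation.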

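When to Use This Skill

Activate when the user mentions:

"create podcast", "generate podcast", "podcast from document"
"narrate document", "narrate this file", "read aloud"
"text to speech", "TTS", "convert to audio"
"audio from document", "audio version"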

Setup

The config file is at skills/elevenlabs/config.json:

{
  "api_key": "your-elevenlabs-api-key",
  "default_voice": "JBFqnCBsd6RMkjVDRZzb",
  "default_model": "eleven_multilingual_v2",
  "podcast_voice1": "JBFqnCBsd6RMkjVDRZzb",
  "podcast_voice2": "EXAVITQu4vr4xnSDxMaL"
}

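Only api_key is required. Alternatively, set the ELEVENLABS_API_KEY environment variable.

Dependencies: pip install PyPDF2 python-docx (only needed when processing PDF/DOCX files).

Multi-chunk narration and podcasts require ffmpeg.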


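Podcast Workflow (for Claude)

When the user asks to create a podcast from a document:

1. Extract the document text:

     python skills/elevenlabs/scripts/extract.py /path/to/document.pdf

2. Generate a two-host conversation script from the extracted text. Follow these guidelines:

 * Write as a natural, engaging discussion between two hosts
 * Host 1 typically leads/introduces topics, Host 2 adds analysis and reactions
 * Start with a brief intro welcoming listeners and stating the topic
 * End with a summary/outro
 * Keep each turn under 3000 characters
 * Vary turn lengths - mix short reactions with longer explanations
 * Use conversational language: "That's a great point", "What I found interesting was..."
 * Reference specific details from the source document
 * Avoid reading the document verbatim - discuss and interpret it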

3. Write the script as a JSON array to a temp file:

     # Write to /tmp/podcast_script.json
     [
       {"speaker": "host1", "text": "Welcome to today's episode..."},
       {"speaker": "host2", "text": "Thanks for having me..."},
       ...
     ]

4. Generate the podcast:

     python skills/elevenlabs/scripts/elevenlabs.py podcast --script /tmp/podcast_script.json --output ~/Downloads/podcast.mp3

5. Clean up the temp script file.

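Tips

Run voices first to let the user pick voices they like
For podcasts, suggest voice pairs with contrasting qualities (e.g., one deep, one bright)
Default output to ~/Downloads/ unless the user specifies otherwise
For large documents, warn the user about character usage on their ElevenLabs plan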

🇺🇸 English

ElevenLabs - Text-to-Speech & Podcast Skill

Overview

This skill converts text and documents into high-quality audio using the ElevenLabs TTS API. It supports two modes: single-voice narration and two-host conversational podcast generation.

When to Use This Skill

Activate when the user mentions:

"create podcast", "generate podcast", "podcast from document"
"narrate document", "narrate this file", "read aloud"
"text to speech", "TTS", "convert to audio"
"audio from document", "audio version of"

Setup

Config at skills/elevenlabs/config.json:

{
  "api_key": "your-elevenlabs-api-key",
  "default_voice": "JBFqnCBsd6RMkjVDRZzb",
  "default_model": "eleven_multilingual_v2",
  "podcast_voice1": "JBFqnCBsd6RMkjVDRZzb",
  "podcast_voice2": "EXAVITQu4vr4xnSDxMaL"
}

Only api_key is required. Alternatively, set the ELEVENLABS_API_KEY environment variable.
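The two ways of supplying the key can be combined in one small lookup. The sketch below is not the skill's actual loader: the function name resolve_api_key is hypothetical, and the precedence (environment variable first, then config.json) is an assumption, since the document does not say which source wins.

```python
import json
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_api_key(config_path: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Return the ElevenLabs API key from the environment or the config file.

    Assumption: the environment variable takes priority over config.json.
    The real script may use the opposite order.
    """
    env = os.environ if env is None else env
    key = env.get("ELEVENLABS_API_KEY", "").strip()
    if key:
        return key

    path = Path(config_path)
    if path.is_file():
        config = json.loads(path.read_text(encoding="utf-8"))
        key = str(config.get("api_key", "")).strip()
        # Treat the placeholder from the sample config as "not set".
        if key and key != "your-elevenlabs-api-key":
            return key

    raise RuntimeError(
        "No API key found: set ELEVENLABS_API_KEY or add api_key to " + config_path
    )
```

Calling resolve_api_key("skills/elevenlabs/config.json") before any TTS request gives an early, readable error instead of a failed API call.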

Dependencies: pip install PyPDF2 python-docx (only needed for PDF/DOCX files).

Requires ffmpeg for multi-chunk narration and podcasts.
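A quick preflight check can confirm these prerequisites before a long narration job starts. This is a minimal sketch using only the Python standard library; the helper names find_ffmpeg and missing_modules are hypothetical and not part of the skill. PyPDF2 is imported under the module name PyPDF2 and python-docx under docx.

```python
import importlib.util
import shutil
from typing import List, Optional, Sequence


def find_ffmpeg() -> Optional[str]:
    """Return the full path of the ffmpeg executable, or None if it is not on PATH."""
    return shutil.which("ffmpeg")


def missing_modules(names: Sequence[str] = ("PyPDF2", "docx")) -> List[str]:
    """Return the module names from `names` that cannot be imported."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    if find_ffmpeg() is None:
        print("ffmpeg not found: multi-chunk narration and podcasts need it")
    for name in missing_modules():
        print(f"optional module {name} is missing (only needed for PDF/DOCX input)")
```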

Commands

List Voices

python skills/elevenlabs/scripts/elevenlabs.py voices
python skills/elevenlabs/scripts/elevenlabs.py voices --json

Use this to find voice IDs for the user.

Single-Voice TTS

# From text
python skills/elevenlabs/scripts/elevenlabs.py tts --text "Hello world" --output ~/Downloads/hello.mp3

# From document
python skills/elevenlabs/scripts/elevenlabs.py tts --file /path/to/doc.pdf --output ~/Downloads/narration.mp3

# With specific voice
python skills/elevenlabs/scripts/elevenlabs.py tts --file doc.md --voice VOICE_ID --output out.mp3

The script handles text extraction, chunking at sentence boundaries (~4000 chars), TTS per chunk with voice continuity, and ffmpeg concatenation automatically.
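To illustrate what chunking at sentence boundaries means, here is a minimal sketch. It is not the script's own implementation: the function name chunk_text, the regular expression used to detect sentence ends, and the hard-split fallback for overlong sentences are all assumptions chosen for illustration.

```python
import re
from typing import List


def chunk_text(text: str, max_chars: int = 4000) -> List[str]:
    """Split text into chunks of at most max_chars, breaking after sentences."""
    # Split after ., ! or ? followed by whitespace; the punctuation stays
    # attached to its sentence.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: List[str] = []
    current = ""
    for sentence in sentences:
        if not sentence:
            continue
        # Fallback: a single sentence longer than the limit is hard-split.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= max_chars:
            current = candidate
        else:
            # Adding this sentence would overflow, so close the current chunk.
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk could then be sent to the TTS API separately and the resulting audio files joined with ffmpeg, as described above.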

Podcast Generation

Podcast mode requires a JSON script file with conversation segments:

[
  {"speaker": "host1", "text": "Welcome to our podcast! Today we're diving into..."},
  {"speaker": "host2", "text": "That's right! I found the section on..."},
  {"speaker": "host1", "text": "Let's break that down..."}
]



python skills/elevenlabs/scripts/elevenlabs.py podcast --script /tmp/script.json --voice1 ID1 --voice2 ID2 --output ~/Downloads/podcast.mp3
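Before spending API characters, the script file can be checked for shape. The sketch below assumes the only valid speaker labels are host1 and host2 (the two shown in the example) and applies the "under 3000 characters" turn limit from the workflow guidelines in this document; validate_script and total_characters are hypothetical helper names, not part of the skill.

```python
import json
from typing import Any, List

ALLOWED_SPEAKERS = {"host1", "host2"}  # assumption: labels taken from the example
MAX_TURN_CHARS = 3000  # turns should stay under this, per the workflow guidelines


def validate_script(segments: Any) -> List[str]:
    """Return a list of problems found in a podcast script; empty means OK."""
    if not isinstance(segments, list) or not segments:
        return ["script must be a non-empty JSON array"]
    problems: List[str] = []
    for i, seg in enumerate(segments):
        if not isinstance(seg, dict):
            problems.append(f"segment {i}: not an object")
            continue
        if seg.get("speaker") not in ALLOWED_SPEAKERS:
            problems.append(f"segment {i}: unknown speaker {seg.get('speaker')!r}")
        text = seg.get("text")
        if not isinstance(text, str) or not text.strip():
            problems.append(f"segment {i}: missing or empty text")
        elif len(text) >= MAX_TURN_CHARS:
            problems.append(f"segment {i}: {len(text)} characters, must be under {MAX_TURN_CHARS}")
    return problems


def total_characters(segments: List[dict]) -> int:
    """Sum of text lengths, a rough proxy for ElevenLabs character usage."""
    return sum(len(seg["text"]) for seg in segments)


if __name__ == "__main__":
    sample = json.loads('[{"speaker": "host1", "text": "Welcome to our podcast!"}]')
    print(validate_script(sample), total_characters(sample))
```

The character total is only an estimate of plan usage; how ElevenLabs actually bills characters is not covered here.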

Podcast Workflow (for Claude)

When the user asks to create a podcast from a document:

Extract the document text:

python skills/elevenlabs/scripts/extract.py /path/to/document.pdf

Generate a two-host conversation script from the extracted text. Follow these guidelines:
- Write as a natural, engaging discussion between two hosts
- Host 1 typically leads/introduces topics, Host 2 adds analysis and reactions
- Start with a brief intro welcoming listeners and stating the topic
- End with a summary/outro
- Keep each turn under 3000 characters
- Vary turn lengths - mix short reactions with longer explanations
- Use conversational language: "That's a great point", "What I found interesting was..."
- Reference specific details from the source document
- Avoid reading the document verbatim - discuss and interpret it

Write the script as a JSON array to a temp file:

# Write to /tmp/podcast_script.json
[
  {"speaker": "host1", "text": "Welcome to today's episode..."},
  {"speaker": "host2", "text": "Thanks for having me..."},
  ...
]
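Writing the script file and removing it afterwards can be tied together so the cleanup is never forgotten. This is a sketch under assumptions: temp_script and generate_podcast are hypothetical helpers, and it uses a uniquely named temp file from tempfile.mkstemp rather than the fixed /tmp/podcast_script.json path shown above. The command line itself is the one given in this document.

```python
import json
import os
import subprocess
import tempfile
from contextlib import contextmanager
from typing import Dict, Iterator, List


@contextmanager
def temp_script(segments: List[Dict[str, str]]) -> Iterator[str]:
    """Write segments to a temp JSON file, yield its path, delete it afterwards."""
    fd, path = tempfile.mkstemp(prefix="podcast_script_", suffix=".json")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(segments, handle, ensure_ascii=False, indent=2)
        yield path
    finally:
        # Clean up the temp script file even if generation fails.
        if os.path.exists(path):
            os.remove(path)


def generate_podcast(segments: List[Dict[str, str]], output_path: str) -> None:
    """Run the podcast command from this document against a temp script file."""
    with temp_script(segments) as script_path:
        subprocess.run(
            [
                "python", "skills/elevenlabs/scripts/elevenlabs.py", "podcast",
                "--script", script_path,
                "--output", os.path.expanduser(output_path),
            ],
            check=True,  # raise if the script exits with an error
        )
```

A call such as generate_podcast(segments, "~/Downloads/podcast.mp3") would need to run from the directory that contains skills/, since the script path is relative.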

Generate the podcast:

python skills/elevenlabs/scripts/elevenlabs.py podcast --script /tmp/podcast_script.json --output ~/Downloads/podcast.mp3

Clean up the temp script file.

Tips

Run voices first to let the user pick voices they like
For podcasts, suggest voice pairs with contrasting qualities (e.g., one deep, one bright)
Default output to ~/Downloads/ unless the user specifies otherwise
For large documents, warn the user about character usage on their ElevenLabs plan

Weekly Installs

65

Repository

sanjay3290/ai-skills

GitHub Stars

172

First Seen

Feb 14, 2026

Security Audits

Gen Agent Trust Hub: Pass
Socket: Pass
Snyk: Pass

Installed on

codex: 62
gemini-cli: 62
opencode: 60
github-copilot: 60
cursor: 59
amp: 58