gemini-tts by akrindev/google-studio-skills
npx skills add https://github.com/akrindev/google-studio-skills --skill gemini-tts
Generate natural-sounding speech from text using Gemini's TTS models through executable scripts with support for multiple voices and multi-speaker conversations.
Use this skill when you need to:
Purpose: Convert text to speech using Gemini TTS models
When to use:
Key parameters:
| Parameter | Description | Example |
|---|---|---|
| text | Text to convert (required) | "Hello, world!" |
| --voice, -v | Voice name | Kore |
| --output, -o | Base name for output file | welcome |
| --output-dir | Output directory for audio | audio/ |
| --no-timestamp | Disable auto timestamp | Flag |
| --model, -m | TTS model | gemini-2.5-flash-preview-tts |
| --stream, -s | Enable streaming | Flag |
| --speakers | Multi-speaker mapping | "Joe:Kore,Jane:Puck" |

Output: WAV audio file path
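When the script is called from another Node program, the parameters above can be assembled into an argument list rather than a hand-written command string. The following is a minimal sketch, not part of the skill itself: `buildTtsArgs` and its option names (`voice`, `output`, `outputDir`, `model`, `speakers`, `noTimestamp`, `stream`) are hypothetical, and it uses only the flags listed in the table.

```javascript
// Hypothetical helper (not part of the skill): builds the argument list for
// `node scripts/tts.js` from an options object, using only documented flags.
function buildTtsArgs(text, options = {}) {
  if (typeof text !== "string" || text.length === 0) {
    throw new Error("text is required");
  }
  const args = ["scripts/tts.js", text];

  // Flags that take a value, keyed by the (assumed) option name.
  const valueFlags = {
    voice: "--voice",
    output: "--output",
    outputDir: "--output-dir",
    model: "--model",
    speakers: "--speakers",
  };
  for (const [key, flag] of Object.entries(valueFlags)) {
    if (options[key] !== undefined) {
      args.push(flag, String(options[key]));
    }
  }

  // Boolean flags are appended only when enabled.
  if (options.noTimestamp) args.push("--no-timestamp");
  if (options.stream) args.push("--stream");
  return args;
}

const args = buildTtsArgs("Welcome to our podcast about technology trends", {
  voice: "Puck",
  output: "welcome",
});
console.log(args);
```

An array like this can be handed to `child_process.execFile("node", args)`, which passes each element as a separate argument and so avoids shell quoting problems with apostrophes or newlines in the text.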
node scripts/tts.js "Hello, world! Have a wonderful day."
Voice: Kore (default, clear and professional)
Output: audio/tts_output_YYYYMMDD_HHMMSS.wav (auto timestamp)

node scripts/tts.js "Welcome to our podcast about technology trends" --voice Puck --output welcome
Output: audio/welcome_YYYYMMDD_HHMMSS.wav

node scripts/tts.js "TTS the following conversation:
Joe: How's it going today?
Jane: Not too bad, how about you?
Joe: I'm working on a new project.
Jane: Sounds exciting, tell me more!" --speakers "Joe:Kore,Jane:Puck" --output conversation
Output: audio/conversation_YYYYMMDD_HHMMSS.wav

node scripts/tts.js "This is a very long text that would benefit from streaming..." --stream --output long-form
Output: audio/long-form_YYYYMMDD_HHMMSS.wav

node scripts/tts.js "Welcome to our quarterly earnings presentation. Today we'll discuss our growth metrics and future plans." --voice Charon --output voiceover
Voice: Charon (deep, authoritative)

node scripts/tts.js "Save to specific folder." --output-dir ./my-projects/podcasts/ --output episode1
Output: ./my-projects/podcasts/episode1_YYYYMMDD_HHMMSS.wav

# 1. Generate script (gemini-text skill)
node skills/gemini-text/scripts/generate.js "Write a 2-minute podcast intro about sustainable energy"
# 2. Generate audio (this skill)
node scripts/tts.js "[Paste generated script]" --voice Fenrir --output podcast-intro
# 3. Use in video or podcast

node scripts/tts.js "Welcome to our accessible website. This audio describes our main navigation options." --voice Aoede --output accessibility
Voice: Aoede (melodic, pleasant)

node scripts/tts.js "Chapter 1: Introduction to Quantum Computing. Let's explore the fundamental principles..." --voice Zephyr --output chapter1
Voice: Zephyr (light, airy)

node scripts/tts.js "Fixed filename." --output my-audio --no-timestamp
Output: audio/my-audio.wav (no timestamp)

| Model | Quality | Speed | Best For |
|---|---|---|---|
| gemini-2.5-flash-preview-tts | Good | Fast | General use, high volume |
| gemini-2.5-pro-preview-tts | Higher | Slower | Premium content, voiceovers |
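The output paths in the usage examples follow one pattern: an output directory (`audio/` by default), a base name (`tts_output` by default), an optional `_YYYYMMDD_HHMMSS` suffix, and a `.wav` extension. The sketch below reproduces that pattern for illustration only. `buildOutputPath` and `formatTimestamp` are hypothetical names, and the use of local time for the timestamp is an assumption; the source does not show how `scripts/tts.js` builds the path.

```javascript
// Formats a Date as YYYYMMDD_HHMMSS. Local time is an assumption; the
// skill's documentation does not say which time zone the script uses.
function formatTimestamp(date) {
  const pad = (n) => String(n).padStart(2, "0");
  return (
    `${date.getFullYear()}${pad(date.getMonth() + 1)}${pad(date.getDate())}` +
    `_${pad(date.getHours())}${pad(date.getMinutes())}${pad(date.getSeconds())}`
  );
}

// Hypothetical reconstruction of the naming pattern seen in the examples:
// <outputDir>/<output>[_<timestamp>].wav
function buildOutputPath({
  output = "tts_output",
  outputDir = "audio/",
  noTimestamp = false,
  now = new Date(),
} = {}) {
  const dir = outputDir.endsWith("/") ? outputDir : `${outputDir}/`;
  const suffix = noTimestamp ? "" : `_${formatTimestamp(now)}`;
  return `${dir}${output}${suffix}.wav`;
}

console.log(buildOutputPath({ output: "my-audio", noTimestamp: true }));
// audio/my-audio.wav
```

Keeping the timestamp on by default means repeated runs with the same `--output` base name do not overwrite each other, which is why `--no-timestamp` is best reserved for cases that need a fixed filename.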

| Voice | Characteristics | Best For |
|---|---|---|
| Kore | Clear, professional | Announcements, general purpose (default) |
| Puck | Friendly, conversational | Casual content, interviews |
| Charon | Deep, authoritative | Corporate, serious content |
| Fenrir | Warm, expressive | Storytelling, narratives |
| Aoede | Melodic, pleasant | Educational, accessibility |
| Zephyr | Light, airy | Gentle content, tutorials |
| Sulafat | Neutral, balanced | Documentaries, factual content |

| Specification | Value |
|---|---|
| Format | WAV (PCM) |
| Sample rate | 24000 Hz |
| Channels | 1 (mono) |
| Bit depth | 16-bit |
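These four values fully determine the WAV header and the relationship between file size and playback time. As an illustration, here is a minimal sketch that builds a standard 44-byte RIFF/WAVE header for raw PCM data with this format and computes duration from the PCM byte count. `buildWavHeader` and `pcmDurationSeconds` are hypothetical names; whether `scripts/tts.js` writes its header this way is not shown in the source.

```javascript
// Audio format from the specification table above.
const SAMPLE_RATE = 24000; // Hz
const CHANNELS = 1; // mono
const BITS_PER_SAMPLE = 16;

// Builds a canonical 44-byte RIFF/WAVE header for `dataLength` bytes of PCM.
function buildWavHeader(dataLength) {
  const blockAlign = CHANNELS * (BITS_PER_SAMPLE / 8); // bytes per sample frame
  const byteRate = SAMPLE_RATE * blockAlign; // bytes per second
  const header = new ArrayBuffer(44);
  const view = new DataView(header);
  const writeAscii = (offset, text) => {
    for (let i = 0; i < text.length; i++) {
      view.setUint8(offset + i, text.charCodeAt(i));
    }
  };

  writeAscii(0, "RIFF");
  view.setUint32(4, 36 + dataLength, true); // file size minus 8 bytes
  writeAscii(8, "WAVE");
  writeAscii(12, "fmt ");
  view.setUint32(16, 16, true); // fmt chunk size for PCM
  view.setUint16(20, 1, true); // audio format 1 = uncompressed PCM
  view.setUint16(22, CHANNELS, true);
  view.setUint32(24, SAMPLE_RATE, true);
  view.setUint32(28, byteRate, true);
  view.setUint16(32, blockAlign, true);
  view.setUint16(34, BITS_PER_SAMPLE, true);
  writeAscii(36, "data");
  view.setUint32(40, dataLength, true);
  return new Uint8Array(header);
}

// Duration in seconds of `dataLength` bytes of PCM in this format.
function pcmDurationSeconds(dataLength) {
  return dataLength / (SAMPLE_RATE * CHANNELS * (BITS_PER_SAMPLE / 8));
}

console.log(pcmDurationSeconds(48000)); // 1
```

At 24000 Hz, mono, 16-bit, one second of audio occupies 48,000 bytes, so a one-minute clip is roughly 2.88 MB before the header.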

| Limit | Type | Description |
|---|---|---|
| 8,192 | Input | Maximum input text tokens |
| 16,384 | Output | Maximum output audio tokens |
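Text longer than the input limit has to be split before it is sent. One possible approach, sketched below, is to split on sentence boundaries and pack sentences into chunks that stay under a token budget. Everything here is an assumption for illustration: `splitIntoChunks` and `estimateTokens` are hypothetical names, and the estimate of roughly four characters per token is a crude heuristic, not the model's real tokenizer, so a conservative budget is advisable.

```javascript
const MAX_INPUT_TOKENS = 8192; // input limit from the table above

// Rough heuristic (assumption): about four characters per token.
// The real tokenizer will give different counts.
function estimateTokens(text, charsPerToken = 4) {
  return Math.ceil(text.length / charsPerToken);
}

// Packs whole sentences into chunks whose estimated size stays within
// maxTokens. A single sentence larger than the budget becomes its own
// (oversized) chunk rather than being cut mid-sentence.
function splitIntoChunks(text, maxTokens = MAX_INPUT_TOKENS) {
  // Each match is a sentence with its end punctuation and trailing
  // whitespace, or a final fragment without end punctuation.
  const sentences = text.match(/[^.!?]*[.!?]+\s*|[^.!?]+$/g) || [];
  const chunks = [];
  let current = "";
  for (const sentence of sentences) {
    if (current && estimateTokens(current + sentence) > maxTokens) {
      chunks.push(current.trim());
      current = "";
    }
    current += sentence;
  }
  if (current.trim()) chunks.push(current.trim());
  return chunks;
}

console.log(splitIntoChunks("One. Two. Three.", 3));
// [ 'One. Two.', 'Three.' ]
```

Each chunk would then be synthesized separately, ideally with a distinct `--output` base name per chunk so the resulting files can be concatenated in order.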

Use the --speakers parameter to map speakers to voices.

npm install @google/genai@latest dotenv@latest

Mapping format: SpeakerName:VoiceName,Speaker2:Voice2
Example: "Joe:Kore,Jane:Puck,Host:Charon"

Use a distinct --output filename to avoid conflicts.

| Voice | Ideal Use Cases |
|---|---|
| Kore | Announcements, navigation, general info |
| Puck | Podcasts, interviews, casual content |
| Charon | Corporate, news, formal presentations |
| Fenrir | Audiobooks, stories, emotional content |
| Aoede | Accessibility, educational, gentle content |
| Zephyr | Tutorials, explanations, guides |
| Sulafat | Documentaries, factual presentations |
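The `--speakers` value is a comma-separated list of `SpeakerName:VoiceName` pairs, so it can be parsed with two splits. The sketch below shows one way to do that and to shape the result like the `speakerVoiceConfigs` entries used in multi-speaker requests with `@google/genai`. `parseSpeakers` and `toSpeakerVoiceConfigs` are hypothetical names, and whether `scripts/tts.js` parses the flag this way is an assumption.

```javascript
// Parses "Joe:Kore,Jane:Puck" into [{ speaker, voice }, ...].
// Throws on entries that are not exactly "SpeakerName:VoiceName".
function parseSpeakers(mapping) {
  return mapping.split(",").map((pair) => {
    const [speaker, voice, ...rest] = pair.split(":").map((part) => part.trim());
    if (!speaker || !voice || rest.length > 0) {
      throw new Error(`Invalid speaker mapping: "${pair}"`);
    }
    return { speaker, voice };
  });
}

// Shapes the parsed pairs like the per-speaker voice entries of a
// multi-speaker speech config (assumed to be what the script sends).
function toSpeakerVoiceConfigs(pairs) {
  return pairs.map(({ speaker, voice }) => ({
    speaker,
    voiceConfig: { prebuiltVoiceConfig: { voiceName: voice } },
  }));
}

const pairs = parseSpeakers("Joe:Kore,Jane:Puck");
console.log(pairs);
// [ { speaker: 'Joe', voice: 'Kore' }, { speaker: 'Jane', voice: 'Puck' } ]
console.log(JSON.stringify(toSpeakerVoiceConfigs(pairs), null, 2));
```

The speaker names in the mapping need to match the names used as line prefixes in the input text (for example `Joe:` and `Jane:`), since that is how each line is tied to a voice.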
# Basic
node scripts/tts.js "Your text here"
# Custom voice
node scripts/tts.js "Your text" --voice Puck --output audio.wav
# Multi-speaker
node scripts/tts.js "Joe: Hi. Jane: Hello!" --speakers "Joe:Kore,Jane:Puck"
# Streaming
node scripts/tts.js "Long text..." --stream --output long.wav
# Professional
node scripts/tts.js "Corporate announcement" --voice Charon
See references/voices.md for complete voice documentation.

Weekly installs: 42
GitHub stars: 1
First seen: Jan 29, 2026
Security audits: Gen Agent Trust Hub (Pass), Socket (Fail), Snyk (Pass)
Installed on: gemini-cli (32), opencode (28), codex (27), cursor (23), github-copilot (22), openclaw (21)