speech by openai/skills
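npx skills add https://github.com/openai/skills --skill speech

Generate spoken audio for the current project (narration, product demo voiceover, IVR prompts, accessibility reads). Defaults to gpt-4o-mini-tts-2025-12-15 and built-in voices, and prefers the bundled CLI for deterministic, reproducible runs.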
- For batches, write a temporary JSONL file under tmp/ (one job per line), run once, then delete the JSONL file.
- Run the bundled CLI (scripts/text_to_speech.py) with sensible defaults (see references/cli.md).
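- Use tmp/speech/ for intermediate files (for example JSONL batches); delete them when done.
- Write final output under output/speech/ when working in this repo.
- Use --out or --out-dir to control output paths; keep filenames stable and descriptive.

Prefer uv for dependency management.
Python packages: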
uv pip install openai
If uv is unavailable:
python3 -m pip install openai
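OPENAI_API_KEY must be set for live API calls. If the key is missing, give the user these steps:
- Set OPENAI_API_KEY as an environment variable on their system.

If installation isn't possible in this environment, tell the user which dependency is missing and how to install it locally.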
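The dependency and key checks described above can be sketched as a small preflight helper. This is only an illustration, not part of the bundled CLI: the function name preflight and the wording of its messages are assumptions.

```python
import importlib.util
import os


def preflight(env=None):
    """Return a list of human-readable problems; an empty list means ready.

    Hypothetical helper: `env` defaults to os.environ and exists so the check
    can be exercised without touching the real environment.
    """
    env = os.environ if env is None else env
    problems = []
    # find_spec returns None when the package cannot be imported.
    if importlib.util.find_spec("openai") is None:
        problems.append(
            "Missing dependency: openai (install with 'uv pip install openai' "
            "or 'python3 -m pip install openai')."
        )
    if not env.get("OPENAI_API_KEY"):
        problems.append(
            "OPENAI_API_KEY is not set; set it as an environment variable."
        )
    return problems


if __name__ == "__main__":
    for line in preflight() or ["Ready for live API calls."]:
        print(line)
```

Returning the problems as a list, rather than raising on the first one, lets the caller tell the user about a missing package and a missing key in a single message.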
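- Use gpt-4o-mini-tts-2025-12-15 unless the user requests another model.
- Default voice: cedar. If the user wants a brighter tone, prefer marin.
- The instructions parameter is supported for GPT-4o mini TTS models, but not for tts-1 or tts-1-hd.
- Cap --rpm at 50.
- Require OPENAI_API_KEY before any live API call.
- Use the openai package for all API calls; do not use raw HTTP.
- Prefer the bundled CLI (scripts/text_to_speech.py) over writing new one-off scripts. If it is missing a feature, ask the user before doing anything else.

Reformat user direction into a short, labeled spec. Only make implicit details explicit; do not invent new requirements.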
Quick clarification (augmentation vs invention):
Template (include only relevant lines):
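Voice Affect: <overall character and texture of the voice>
Tone: <attitude, formality, warmth>
Pacing: <slow, steady, brisk>
Emotion: <key emotions to convey>
Pronunciation: <words to enunciate or emphasize>
Pauses: <where to add intentional pauses>
Emphasis: <key words or phrases to stress>
Delivery: <cadence or rhythm notes>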
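One way to apply "include only relevant lines" is to render the spec from a mapping and skip any label the user gave no direction for. The sketch below assumes a hypothetical helper named render_instructions; only the label names come from the template above.

```python
# Labels in the order the template lists them.
LABELS = (
    "Voice Affect", "Tone", "Pacing", "Emotion",
    "Pronunciation", "Pauses", "Emphasis", "Delivery",
)


def render_instructions(spec):
    """Render a labeled spec, omitting labels that are absent or empty."""
    unknown = set(spec) - set(LABELS)
    if unknown:
        # Rejecting unknown labels keeps the spec from growing new requirements.
        raise ValueError(f"Unknown labels: {sorted(unknown)}")
    return "\n".join(
        f"{label}: {spec[label]}" for label in LABELS if spec.get(label)
    )


if __name__ == "__main__":
    print(render_instructions({
        "Pacing": "Steady and moderate.",
        "Tone": "Friendly and confident.",
    }))
    # Tone: Friendly and confident.
    # Pacing: Steady and moderate.
```

The output follows the template order regardless of the order of the input mapping, which keeps repeated runs stable.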
Augmentation rules:
Input text: "Welcome to the demo. Today we'll show how it works."
Instructions:
Voice Affect: Warm and composed.
Tone: Friendly and confident.
Pacing: Steady and moderate.
Emphasis: Stress "demo" and "show".
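For illustration, the example above maps onto a speech request roughly as follows. This is a sketch under assumptions, not how scripts/text_to_speech.py is implemented, and the bundled CLI remains the preferred path: build_request and synthesize are hypothetical names, and the defaults mirror the rules stated earlier (model, cedar voice, no instructions for tts-1 or tts-1-hd).

```python
DEFAULT_MODEL = "gpt-4o-mini-tts-2025-12-15"
NO_INSTRUCTIONS_MODELS = {"tts-1", "tts-1-hd"}


def build_request(text, instructions=None, voice="cedar",
                  model=DEFAULT_MODEL, response_format="mp3"):
    """Build keyword arguments for a speech request without calling the API."""
    if instructions and model in NO_INSTRUCTIONS_MODELS:
        raise ValueError(f"{model} does not support instructions")
    request = {
        "model": model,
        "voice": voice,
        "input": text,  # passed through verbatim, never rewritten
        "response_format": response_format,
    }
    if instructions:
        request["instructions"] = instructions
    return request


def synthesize(request, out_path):
    """Send the request with the openai package and write the audio file."""
    from openai import OpenAI  # lazy import so build_request works offline

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    with client.audio.speech.with_streaming_response.create(**request) as response:
        response.stream_to_file(out_path)


if __name__ == "__main__":
    request = build_request(
        "Welcome to the demo. Today we'll show how it works.",
        instructions=(
            "Voice Affect: Warm and composed.\n"
            "Tone: Friendly and confident.\n"
            "Pacing: Steady and moderate.\n"
            'Emphasis: Stress "demo" and "show".'
        ),
    )
    print(request["model"], request["voice"])
```

Keeping request construction separate from the network call means the defaults can be checked without an API key.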
{"input":"Thank you for calling. Please hold.","voice":"cedar","response_format":"mp3","out":"hold.mp3"}
{"input":"For sales, press 1. For support, press 2.","voice":"marin","instructions":"Tone: Clear and neutral. Pacing: Slow.","response_format":"wav"}
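The batch convention (write a temporary JSONL file with one job per line, run once, then delete it) can be sketched as below. The helper name run_batch, the file name batch.jsonl, and the run callback are assumptions; the actual CLI command and flags are documented in references/cli.md and are deliberately not guessed here.

```python
import json
from pathlib import Path


def run_batch(jobs, run, tmp_dir="tmp/speech"):
    """Write jobs as JSONL, call run(path) once, then delete the file.

    `run` is a caller-supplied callable that would invoke the bundled CLI on
    the JSONL path. The file is removed even if `run` raises.
    """
    tmp = Path(tmp_dir)
    tmp.mkdir(parents=True, exist_ok=True)
    batch_path = tmp / "batch.jsonl"  # hypothetical file name
    with batch_path.open("w", encoding="utf-8") as handle:
        for job in jobs:
            handle.write(json.dumps(job, ensure_ascii=False) + "\n")
    try:
        return run(batch_path)
    finally:
        batch_path.unlink(missing_ok=True)  # requires Python 3.8+


if __name__ == "__main__":
    jobs = [
        {"input": "Thank you for calling. Please hold.", "voice": "cedar",
         "response_format": "mp3", "out": "hold.mp3"},
        {"input": "For sales, press 1. For support, press 2.", "voice": "marin",
         "instructions": "Tone: Clear and neutral. Pacing: Slow.",
         "response_format": "wav"},
    ]
    # Stand-in for the CLI call: report how many jobs were written.
    print(run_batch(jobs, lambda path: len(path.read_text(encoding="utf-8").splitlines())))
```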
More principles: references/prompting.md. Copy/paste specs: references/sample-prompts.md.
Use these modules when the request is for a specific delivery style. They provide targeted defaults and templates.
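- references/narration.md
- references/voiceover.md
- references/ivr.md
- references/accessibility.md
- references/cli.md
- references/audio-api.md
- references/voice-directions.md
- references/codex-network.md

- references/cli.md: how to run speech generation/batches via scripts/text_to_speech.py (commands, flags, recipes).
- references/audio-api.md: API parameters, limits, voice list.
- references/voice-directions.md: instruction patterns and examples.
- references/prompting.md: instruction best practices (structure, constraints, iteration patterns).
- references/sample-prompts.md: copy/paste instruction recipes (examples only; no extra theory).
- references/narration.md: templates + defaults for narration and explainers.
- references/voiceover.md: templates + defaults for product demo voiceovers.
- references/ivr.md: templates + defaults for IVR/phone prompts.
- references/accessibility.md: templates + defaults for accessibility reads.
- references/codex-network.md: environment/sandbox/network-approval troubleshooting.

Weekly installs: 527
Repository: openai/skills
GitHub stars: 15.3K
First seen: Jan 28, 2026
Security audits: Gen Agent Trust Hub (Pass), Socket (Pass), Snyk (Pass)
Installed on: codex (468), opencode (445), gemini-cli (436), github-copilot (425), cursor (421), kimi-cli (410)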