voicebox-voice-synthesis by aradotso/trending-skills
npx skills add https://github.com/aradotso/trending-skills --skill voicebox-voice-synthesis
Skill by ara.so — Daily 2026 Skills collection.
Voicebox is a local-first, open-source voice cloning and TTS studio — a self-hosted alternative to ElevenLabs. It runs entirely on your machine (macOS MLX/Metal, Windows/Linux CUDA, CPU fallback), exposes a REST API on localhost:17493, and ships with 5 TTS engines, 23 languages, post-processing effects, and a multi-track Stories editor.
| Platform | Link |
|---|---|
| macOS Apple Silicon | https://voicebox.sh/download/mac-arm |
| macOS Intel | https://voicebox.sh/download/mac-intel |
| Windows | https://voicebox.sh/download/windows |
| Docker | docker compose up |
Linux requires building from source: https://voicebox.sh/linux-install
Prerequisites: Bun, Rust, Python 3.11+, Tauri prerequisites
git clone https://github.com/jamiepine/voicebox.git
cd voicebox
# Install just task runner
brew install just # macOS
cargo install just # any platform
# Set up Python venv + all dependencies
just setup
# Start backend + desktop app in dev mode
just dev
# List all available commands
just --list
| Layer | Technology |
|---|---|
| Desktop App | Tauri (Rust) |
| Frontend | React + TypeScript + Tailwind CSS |
| State | Zustand + React Query |
| Backend | FastAPI (Python) on port 17493 |
| TTS Engines | Qwen3-TTS, LuxTTS, Chatterbox, Chatterbox Turbo, TADA |
| Effects | Pedalboard (Spotify) |
| Transcription | Whisper / Whisper Turbo |
| Inference | MLX (Apple Silicon) / PyTorch (CUDA/ROCm/XPU/CPU) |
| Database | SQLite |
The Python FastAPI backend handles all ML inference. The Tauri Rust shell wraps the frontend and manages the backend process lifecycle. The API is accessible directly at http://localhost:17493 even when using the desktop app.
Base URL: http://localhost:17493
Interactive docs: http://localhost:17493/docs
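Before scripting against the API, it is worth confirming the backend is actually listening. A minimal reachability check against the `/health` endpoint (used again in the troubleshooting commands); this sketch assumes Node 18+, where `fetch` and `AbortSignal.timeout` are built in, and that `/health` returns 200 when the backend is up:

```typescript
const VOICEBOX_URL = process.env.VOICEBOX_API_URL ?? "http://localhost:17493";

// Returns true if the backend answers /health within 2 seconds.
// fetchImpl is injectable so the check can be unit-tested offline.
async function isVoiceboxUp(fetchImpl: typeof fetch = fetch): Promise<boolean> {
  try {
    const res = await fetchImpl(`${VOICEBOX_URL}/health`, {
      signal: AbortSignal.timeout(2000),
    });
    return res.ok;
  } catch {
    return false; // connection refused, DNS failure, or timeout
  }
}
```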
# Basic generation
curl -X POST http://localhost:17493/generate \
-H "Content-Type: application/json" \
-d '{
"text": "Hello world, this is a voice clone.",
"profile_id": "abc123",
"language": "en"
}'
# With engine selection
curl -X POST http://localhost:17493/generate \
-H "Content-Type: application/json" \
-d '{
"text": "Speak slowly and with gravitas.",
"profile_id": "abc123",
"language": "en",
"engine": "qwen3-tts"
}'
# With paralinguistic tags (Chatterbox Turbo only)
curl -X POST http://localhost:17493/generate \
-H "Content-Type: application/json" \
-d '{
"text": "That is absolutely hilarious! [laugh] I cannot believe it.",
"profile_id": "abc123",
"engine": "chatterbox-turbo",
"language": "en"
}'
# List all profiles
curl http://localhost:17493/profiles
# Create a new profile
curl -X POST http://localhost:17493/profiles \
-H "Content-Type: application/json" \
-d '{
"name": "Narrator",
"language": "en",
"description": "Deep narrative voice"
}'
# Upload audio sample to a profile
curl -X POST http://localhost:17493/profiles/{profile_id}/samples \
-F "file=@/path/to/voice-sample.wav"
# Export a profile
curl http://localhost:17493/profiles/{profile_id}/export \
--output narrator-profile.zip
# Import a profile
curl -X POST http://localhost:17493/profiles/import \
-F "file=@narrator-profile.zip"
# Get generation status (SSE stream)
curl -N http://localhost:17493/generate/{generation_id}/status
# List recent generations
curl http://localhost:17493/generations
# Retry a failed generation
curl -X POST http://localhost:17493/generations/{generation_id}/retry
# Download generated audio
curl http://localhost:17493/generations/{generation_id}/audio \
--output output.wav
# List available models and download status
curl http://localhost:17493/models
# Unload a model from GPU memory (without deleting)
curl -X POST http://localhost:17493/models/{model_id}/unload
const VOICEBOX_URL = process.env.VOICEBOX_API_URL ?? "http://localhost:17493";
interface GenerateRequest {
  text: string;
  profile_id: string;
  language?: string;
  engine?: "qwen3-tts" | "luxtts" | "chatterbox" | "chatterbox-turbo" | "tada";
}

interface GenerateResponse {
  generation_id: string;
  status: "queued" | "processing" | "complete" | "failed";
  audio_url?: string;
}

async function generateSpeech(req: GenerateRequest): Promise<GenerateResponse> {
  const response = await fetch(`${VOICEBOX_URL}/generate`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(req),
  });
  if (!response.ok) {
    throw new Error(`Voicebox API error: ${response.status} ${await response.text()}`);
  }
  return response.json();
}
// Usage
const result = await generateSpeech({
  text: "Welcome to our application.",
  profile_id: "abc123",
  language: "en",
  engine: "qwen3-tts",
});
console.log("Generation ID:", result.generation_id);
async function waitForGeneration(
  generationId: string,
  timeoutMs = 60_000
): Promise<string> {
  const start = Date.now();
  while (Date.now() - start < timeoutMs) {
    const res = await fetch(`${VOICEBOX_URL}/generations/${generationId}`);
    const data = await res.json();
    if (data.status === "complete") {
      return `${VOICEBOX_URL}/generations/${generationId}/audio`;
    }
    if (data.status === "failed") {
      throw new Error(`Generation failed: ${data.error}`);
    }
    await new Promise((r) => setTimeout(r, 1000));
  }
  throw new Error("Generation timed out");
}
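`waitForGeneration` polls at a fixed one-second interval; for long jobs, a capped exponential backoff cuts request churn. A sketch with the status fetcher and sleep injected so it can be tested offline (`waitWithBackoff` and the `StatusResponse` shape are assumptions modeled on the polling code above):

```typescript
interface StatusResponse { status: string; error?: string }

// Poll until complete/failed, doubling the delay up to maxDelayMs.
async function waitWithBackoff(
  getStatus: () => Promise<StatusResponse>, // e.g. () => fetch(`${VOICEBOX_URL}/generations/${id}`).then((r) => r.json())
  { timeoutMs = 120_000, baseDelayMs = 500, maxDelayMs = 8_000 } = {},
  sleep: (ms: number) => Promise<void> = (ms) => new Promise((r) => setTimeout(r, ms))
): Promise<void> {
  const start = Date.now();
  let delay = baseDelayMs;
  while (Date.now() - start < timeoutMs) {
    const { status, error } = await getStatus();
    if (status === "complete") return;
    if (status === "failed") throw new Error(`Generation failed: ${error}`);
    await sleep(delay);
    delay = Math.min(delay * 2, maxDelayMs); // exponential backoff, capped
  }
  throw new Error("Generation timed out");
}
```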
function streamGenerationStatus(
  generationId: string,
  onStatus: (status: string) => void
): () => void {
  const eventSource = new EventSource(
    `${VOICEBOX_URL}/generate/${generationId}/status`
  );
  eventSource.onmessage = (event) => {
    const data = JSON.parse(event.data);
    onStatus(data.status);
    if (data.status === "complete" || data.status === "failed") {
      eventSource.close();
    }
  };
  eventSource.onerror = () => eventSource.close();
  // Return cleanup function
  return () => eventSource.close();
}
// Usage
const cleanup = streamGenerationStatus("gen_abc123", (status) => {
  console.log("Status update:", status);
});
async function downloadAudio(generationId: string): Promise<Blob> {
  const response = await fetch(
    `${VOICEBOX_URL}/generations/${generationId}/audio`
  );
  if (!response.ok) {
    throw new Error(`Failed to download audio: ${response.status}`);
  }
  return response.blob();
}

// Play in browser
async function playGeneratedAudio(generationId: string): Promise<void> {
  const blob = await downloadAudio(generationId);
  const url = URL.createObjectURL(blob);
  const audio = new Audio(url);
  audio.onended = () => URL.revokeObjectURL(url);
  await audio.play(); // play() returns a promise; awaiting surfaces autoplay rejections
}
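In Node (18+, where `fetch` and `Blob` are built in) the same download can be written to disk instead of played. A minimal sketch; `saveAudio` is a name invented here:

```typescript
import { writeFile } from "node:fs/promises";

// Persist a downloaded generation to disk (Node 18+).
async function saveAudio(blob: Blob, path: string): Promise<void> {
  const buffer = Buffer.from(await blob.arrayBuffer());
  await writeFile(path, buffer);
}
```

For example: `await saveAudio(await downloadAudio("gen_abc123"), "output.wav");`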
import httpx
import asyncio

VOICEBOX_URL = "http://localhost:17493"

async def generate_speech(
    text: str,
    profile_id: str,
    language: str = "en",
    engine: str = "qwen3-tts",
) -> bytes:
    async with httpx.AsyncClient(timeout=120.0) as client:
        # Submit generation
        resp = await client.post(
            f"{VOICEBOX_URL}/generate",
            json={
                "text": text,
                "profile_id": profile_id,
                "language": language,
                "engine": engine,
            },
        )
        resp.raise_for_status()
        generation_id = resp.json()["generation_id"]

        # Poll until complete
        for _ in range(120):
            status_resp = await client.get(
                f"{VOICEBOX_URL}/generations/{generation_id}"
            )
            status_data = status_resp.json()
            if status_data["status"] == "complete":
                audio_resp = await client.get(
                    f"{VOICEBOX_URL}/generations/{generation_id}/audio"
                )
                return audio_resp.content
            if status_data["status"] == "failed":
                raise RuntimeError(f"Generation failed: {status_data.get('error')}")
            await asyncio.sleep(1.0)
        raise TimeoutError("Generation timed out after 120s")

# Usage
audio_bytes = asyncio.run(
    generate_speech(
        text="The quick brown fox jumps over the lazy dog.",
        profile_id="your-profile-id",
        language="en",
        engine="chatterbox",
    )
)
with open("output.wav", "wb") as f:
    f.write(audio_bytes)
| Engine | Best For | Languages | VRAM | Notes |
|---|---|---|---|---|
| qwen3-tts (0.6B/1.7B) | Quality + instructions | 10 | Medium | Supports delivery instructions in text |
| luxtts | Fast CPU generation | English only | ~1GB | 150x realtime on CPU, 48kHz |
| chatterbox | Multilingual coverage | 23 | Medium | Arabic, Hindi, Swahili, CJK + more |
| chatterbox-turbo | Expressive/emotion | English only | Low (350M) | Use [laugh], [sigh], [gasp] tags |
| tada (1B/3B) | Long-form coherence | 10 | High | 700s+ audio, HumeAI model |
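The engine table can be collapsed into a small default-choice helper. This is an illustrative sketch, not an official API: the `EngineNeeds` shape and `pickEngine` name are invented here, but the engine strings match the `engine` field accepted by `/generate`.

```typescript
type Engine = "qwen3-tts" | "luxtts" | "chatterbox" | "chatterbox-turbo" | "tada";

interface EngineNeeds {
  language: string;   // e.g. "en", "hi", "sw"
  longForm?: boolean; // 700s+ narration
  expressive?: boolean; // needs [laugh]/[sigh] tags
  cpuOnly?: boolean;  // no GPU available
}

// Illustrative mapping of the table above to a default engine choice.
function pickEngine(needs: EngineNeeds): Engine {
  const english = needs.language === "en";
  if (needs.expressive && english) return "chatterbox-turbo"; // tag support is English-only
  if (needs.cpuOnly && english) return "luxtts";              // ~150x realtime on CPU
  if (needs.longForm) return "tada";                          // long-form coherence
  if (!english) return "chatterbox";                          // broadest coverage (23 languages)
  return "qwen3-tts";                                         // quality default
}
```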
Embed natural language instructions directly in the text:
await generateSpeech({
  text: "(whisper) I have a secret to tell you.",
  profile_id: "abc123",
  engine: "qwen3-tts",
});

await generateSpeech({
  text: "(speak slowly and clearly) Step one: open the application.",
  profile_id: "abc123",
  engine: "qwen3-tts",
});

const tags = [
  "[laugh]", "[chuckle]", "[gasp]", "[cough]",
  "[sigh]", "[groan]", "[sniff]", "[shush]", "[clear throat]"
];

await generateSpeech({
  text: "Oh really? [gasp] I had no idea! [laugh] That's incredible.",
  profile_id: "abc123",
  engine: "chatterbox-turbo",
});
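Since the bracketed tags are only honored by chatterbox-turbo, text reused with other engines may have them read aloud. A sketch of a client-side stripper (the helper name is illustrative; the tag list mirrors the `tags` array above):

```typescript
const PARALINGUISTIC_TAGS = [
  "laugh", "chuckle", "gasp", "cough",
  "sigh", "groan", "sniff", "shush", "clear throat",
];

// Remove known [tag] markers and collapse the double spaces they leave,
// so non-turbo engines don't vocalize the literal brackets.
function stripParalinguisticTags(text: string): string {
  const pattern = new RegExp(`\\[(?:${PARALINGUISTIC_TAGS.join("|")})\\]`, "gi");
  return text.replace(pattern, "").replace(/\s{2,}/g, " ").trim();
}
```

Unknown bracketed text (e.g. stage directions you want kept) passes through untouched.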
# Custom models directory (set before launching)
export VOICEBOX_MODELS_DIR=/path/to/models
# For AMD ROCm GPU (auto-configured, but can override)
export HSA_OVERRIDE_GFX_VERSION=11.0.0
Docker configuration (docker-compose.yml override):
services:
  voicebox:
    environment:
      - VOICEBOX_MODELS_DIR=/models
    volumes:
      - /host/models:/models
    ports:
      - "17493:17493"
    # For NVIDIA GPU passthrough:
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
// 1. Create profile
const profile = await fetch(`${VOICEBOX_URL}/profiles`, {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ name: "My Voice", language: "en" }),
}).then((r) => r.json());

// 2. Upload audio sample (WAV/MP3, ideally 5–30 seconds of clean speech)
const formData = new FormData();
formData.append("file", audioBlob, "sample.wav");
await fetch(`${VOICEBOX_URL}/profiles/${profile.id}/samples`, {
  method: "POST",
  body: formData,
});

// 3. Generate with the new profile
const gen = await generateSpeech({
  text: "Testing my cloned voice.",
  profile_id: profile.id,
});
async function batchGenerate(
  items: Array<{ text: string; profileId: string }>,
  engine = "qwen3-tts"
): Promise<string[]> {
  // Submit all — Voicebox queues them serially to avoid GPU contention
  const submissions = await Promise.all(
    items.map((item) =>
      generateSpeech({ text: item.text, profile_id: item.profileId, engine })
    )
  );
  // Wait for all completions
  const audioUrls = await Promise.all(
    submissions.map((s) => waitForGeneration(s.generation_id))
  );
  return audioUrls;
}
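`batchGenerate` submits everything at once and relies on the server-side queue. If you would rather cap in-flight HTTP requests client-side, a generic, order-preserving concurrency limiter can wrap the same calls. This is a standalone utility sketch (the name `mapWithLimit` is invented here):

```typescript
// Run async tasks with at most `limit` in flight, preserving result order.
async function mapWithLimit<T, R>(
  items: T[],
  limit: number,
  task: (item: T) => Promise<R>
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0; // shared cursor; safe because workers only touch it synchronously
  async function worker(): Promise<void> {
    while (next < items.length) {
      const i = next++;
      results[i] = await task(items[i]);
    }
  }
  await Promise.all(
    Array.from({ length: Math.min(limit, items.length) }, worker)
  );
  return results;
}
```

For example, `mapWithLimit(items, 2, (item) => generateSpeech({ text: item.text, profile_id: item.profileId }))` would keep at most two submissions in flight.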
Voicebox auto-chunks at sentence boundaries — just send the full text:
// Up to 50,000 characters are supported per request
const longScript = `
Chapter one. The morning fog rolled across the valley floor...
`;

await generateSpeech({
  text: longScript,
  profile_id: "narrator-profile-id",
  engine: "tada", // Best for long-form coherence
  language: "en",
});
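Voicebox handles chunking server-side, but if you want to split client-side anyway (for per-chunk progress reporting, say), a naive sentence-boundary splitter is enough for many scripts. This is an illustrative sketch; real sentence segmentation (abbreviations, quotes, decimals) is harder than this regex:

```typescript
// Greedily pack whole sentences into chunks of at most maxChars each.
function chunkAtSentences(text: string, maxChars = 1_000): string[] {
  // A "sentence" is text up to .!? (plus trailing space), or a final fragment.
  const sentences = text.match(/[^.!?]+[.!?]+(\s+|$)|[^.!?]+$/g) ?? [text];
  const chunks: string[] = [];
  let current = "";
  for (const sentence of sentences) {
    if (current && current.length + sentence.length > maxChars) {
      chunks.push(current.trim());
      current = "";
    }
    current += sentence;
  }
  if (current.trim()) chunks.push(current.trim());
  return chunks;
}
```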
# Check if backend is running
curl http://localhost:17493/health
# Restart backend only (dev mode)
just backend
# Check logs
just logs
# Check detected backend
curl http://localhost:17493/system/info
# Force CPU mode (set before launch)
export VOICEBOX_FORCE_CPU=1
# Set custom models directory with more space
export VOICEBOX_MODELS_DIR=/path/with/space
just dev
# Cancel stuck download via API
curl -X DELETE http://localhost:17493/models/{model_id}/download
# List loaded models
curl http://localhost:17493/models | jq '.[] | select(.loaded == true)'
# Unload specific model
curl -X POST http://localhost:17493/models/{model_id}/unload
- Try the chatterbox engine
- Try luxtts for the highest output quality (48kHz) in English

Voicebox auto-recovers stale generations on startup. If the issue persists:
curl -X POST http://localhost:17493/generations/{generation_id}/retry
import { useState } from "react";
const VOICEBOX_URL = import.meta.env.VITE_VOICEBOX_URL ?? "http://localhost:17493";
export function VoiceGenerator({ profileId }: { profileId: string }) {
  const [text, setText] = useState("");
  const [audioUrl, setAudioUrl] = useState<string | null>(null);
  const [loading, setLoading] = useState(false);

  const handleGenerate = async () => {
    setLoading(true);
    try {
      const res = await fetch(`${VOICEBOX_URL}/generate`, {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify({ text, profile_id: profileId, language: "en" }),
      });
      if (!res.ok) throw new Error(`Voicebox API error: ${res.status}`);
      const { generation_id } = await res.json();
      // Poll for completion
      let done = false;
      while (!done) {
        await new Promise((r) => setTimeout(r, 1000));
        const statusRes = await fetch(`${VOICEBOX_URL}/generations/${generation_id}`);
        const { status } = await statusRes.json();
        if (status === "complete") {
          setAudioUrl(`${VOICEBOX_URL}/generations/${generation_id}/audio`);
          done = true;
        } else if (status === "failed") {
          throw new Error("Generation failed");
        }
      }
    } catch (err) {
      console.error("Voice generation failed:", err);
    } finally {
      setLoading(false);
    }
  };

  return (
    <div>
      <textarea value={text} onChange={(e) => setText(e.target.value)} />
      <button onClick={handleGenerate} disabled={loading}>
        {loading ? "Generating..." : "Generate Speech"}
      </button>
      {audioUrl && <audio controls src={audioUrl} />}
    </div>
  );
}
Weekly Installs: 271
GitHub Stars: 8
First Seen: 6 days ago
Security Audits: Gen Agent Trust Hub (Pass), Socket (Pass), Snyk (Pass)
Installed on: gemini-cli (270), github-copilot (270), codex (270), amp (270), cline (270), kimi-cli (270)