ai-avatar-video by creatify-ai/ai-avatar-video
npx skills add https://github.com/creatify-ai/ai-avatar-video --skill ai-avatar-video
Complete framework for creating realistic AI talking-head videos — from script writing to multi-scene production.
Avatar scripts must feel like natural speech, not written copy. Follow these rules:
| Tone | Words per Second | Words per 30s | Style |
|---|---|---|---|
| Conversational | 2.5-3.0 | 75-90 | Natural pauses, filler words ok |
| Professional | 2.0-2.5 | 60-75 | Clean, measured delivery |
| Energetic/Sales | 3.0-3.5 | 90-105 | Fast, punchy, short sentences |
| Educational | 1.8-2.2 | 54-66 | Slower, with pauses for comprehension |
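The pacing table reduces to one multiplication: target word count equals words per second times duration. A minimal sketch of that arithmetic follows; the `PACING` dict and the `word_budget` helper are illustrative names, not part of any Creatify API.

```python
# Words-per-second ranges copied from the pacing table above.
PACING = {
    "conversational": (2.5, 3.0),
    "professional": (2.0, 2.5),
    "energetic": (3.0, 3.5),
    "educational": (1.8, 2.2),
}

def word_budget(tone: str, seconds: float) -> tuple[int, int]:
    """Return the (min, max) word count for a script of the given length."""
    low, high = PACING[tone]
    return round(low * seconds), round(high * seconds)

print(word_budget("conversational", 30))  # (75, 90)
print(word_budget("educational", 15))     # (27, 33)
```

The 30-second results match the "Words per 30s" column, which is a quick way to sanity-check the helper.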
Scripts that sound like real people include:
15-second script template:
HOOK (0-3s): [Pattern interrupt or question — 8-12 words]
BRIDGE (3-7s): [Connect hook to product — 15-20 words]
BENEFIT (7-12s): [Core value proposition — 15-20 words]
CTA (12-15s): [Clear next step — 8-12 words]
30-second script template:
HOOK (0-3s): [Attention grab — 8-12 words]
PROBLEM (3-8s): [Relatable pain point — 15-25 words]
SOLUTION (8-15s): [Product introduction + key feature — 20-30 words]
PROOF (15-22s): [Social proof or demonstration — 15-25 words]
CTA (22-30s): [Urgency + next step — 15-25 words]
60-second script template:
HOOK (0-5s): [Strong opening — 12-18 words]
STORY/PROBLEM (5-15s): [Relatable scenario — 25-40 words]
DISCOVERY (15-25s): [How you found the product — 25-35 words]
FEATURES (25-40s): [2-3 key benefits with specifics — 35-50 words]
PROOF (40-50s): [Results, testimonials, data — 25-35 words]
CTA (50-60s): [Compelling close — 20-30 words]
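It can help to check how fast each template asks the avatar to speak. The sketch below sums the per-section word budgets from the three templates and divides by the video length; `TEMPLATES` and `implied_rate` are hypothetical names used only for this illustration.

```python
# (section, min_words, max_words), copied from the templates above.
TEMPLATES = {
    15: [("HOOK", 8, 12), ("BRIDGE", 15, 20), ("BENEFIT", 15, 20), ("CTA", 8, 12)],
    30: [("HOOK", 8, 12), ("PROBLEM", 15, 25), ("SOLUTION", 20, 30),
         ("PROOF", 15, 25), ("CTA", 15, 25)],
    60: [("HOOK", 12, 18), ("STORY/PROBLEM", 25, 40), ("DISCOVERY", 25, 35),
         ("FEATURES", 35, 50), ("PROOF", 25, 35), ("CTA", 20, 30)],
}

def implied_rate(duration: int) -> tuple[float, float]:
    """Return the (min, max) words per second implied by a template's budgets."""
    sections = TEMPLATES[duration]
    low = sum(min_words for _, min_words, _ in sections)
    high = sum(max_words for _, _, max_words in sections)
    return round(low / duration, 2), round(high / duration, 2)

for duration in TEMPLATES:
    print(duration, implied_rate(duration))
# 15 (3.07, 4.27)
# 30 (2.43, 3.9)
# 60 (2.37, 3.47)
```

The upper ends of the 15- and 30-second budgets work out faster than the 3.5 words per second ceiling in the pacing table, so the lower end of each range is probably the safer target for those lengths.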
| Do | Don't |
|---|---|
| Use short sentences (8-15 words) | Write long compound sentences |
| Include natural pauses with "..." | Rush from point to point |
| Write phonetically for hard words | Use jargon or acronyms without context |
| End on a clear action | Trail off or end abruptly |
| Match script tone to avatar age/style | Use Gen Z slang with a professional avatar |
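The sentence-length rule in the first row is easy to check automatically. This is a rough sketch, assuming sentences end with `.`, `!` or `?` and that `...` marks a pause rather than a sentence end; `long_sentences` is an illustrative helper, not an existing tool.

```python
import re

def long_sentences(script: str, max_words: int = 15) -> list[str]:
    """Return the sentences in a script that exceed max_words."""
    # Treat "..." as a pause inside a sentence, not as a sentence boundary.
    text = script.replace("...", " ")
    # Split after ., ! or ? when followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return [s for s in sentences if len(s.split()) > max_words]

script = (
    "Stop scrolling. This one habit saved me three hours a week, and it costs "
    "nothing, needs no app, and takes two minutes to learn."
)
for sentence in long_sentences(script):
    print("Too long:", sentence)
```

Here the second sentence is flagged; splitting it at the commas would bring each piece under the limit.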
Choosing the right avatar is as important as the script. Match demographics to your target audience.
| Vertical | Ideal Avatar Profile | Why |
|---|---|---|
| Healthcare/Supplements | 30-50, professional appearance | Credibility and trust |
| Beauty/Skincare | 20-35, relatable, well-groomed | Peer recommendation effect |
| Tech/SaaS | 25-40, casual-professional | Approachable expertise |
| Finance/Insurance | 35-55, suited, authoritative | Trust and stability |
| Fitness | 25-35, athletic, energetic | Aspirational but attainable |
| Food/Beverage | 25-45, warm, approachable | Relatable lifestyle |
| Education | 30-50, friendly, professional | Authority without intimidation |
| DTC/E-commerce | 20-30, casual, authentic | UGC/peer recommendation |
Multi-scene videos feel more dynamic and retain attention better than single-shot talking heads.
2-Scene (15s):
Scene 1: Hook + Problem (avatar talking, neutral background)
Scene 2: Solution + CTA (avatar talking, product-relevant background)
3-Scene (30s):
Scene 1: Hook + Problem (avatar A, office background)
Scene 2: Solution + Features (avatar A, product demo background)
Scene 3: Social Proof + CTA (avatar A or B, branded background)
5-Scene (60s):
Scene 1: Hook (avatar, eye-catching background)
Scene 2: Problem deep-dive (avatar, relatable setting)
Scene 3: Product introduction (product B-roll or demo)
Scene 4: Features + Proof (avatar with data/reviews overlay)
Scene 5: CTA (avatar, clean branded background)
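The scene structures above map naturally onto a list of scene dictionaries. The sketch below builds such a list using the `text`, `creator` and optional `background` keys described in the multi-scene section of this document; `build_scenes` is a hypothetical helper and the example URLs and persona ID are placeholders.

```python
def build_scenes(scripts, creator_id, backgrounds=None):
    """Pair each scene script with an avatar and an optional background URL."""
    backgrounds = backgrounds or [None] * len(scripts)
    if len(backgrounds) != len(scripts):
        raise ValueError("Provide one background (or None) per scene")
    scenes = []
    for text, background in zip(scripts, backgrounds):
        scene = {"text": text, "creator": creator_id}
        if background:  # omit the key entirely when no background is given
            scene["background"] = background
        scenes.append(scene)
    return scenes

# 3-scene (30s) structure: hook + problem, solution + features, proof + CTA.
scenes = build_scenes(
    [
        "Tired of editing videos all weekend?",
        "This tool turns a script into a finished video in minutes.",
        "Thousands of teams already use it. Try it free today.",
    ],
    creator_id="your-persona-id",  # placeholder
    backgrounds=[None, "https://example.com/demo.jpg", "https://example.com/brand.jpg"],
)
print(len(scenes), scenes[1]["background"])
```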
For product names, brand names, or technical terms:
AI avatars with transparent backgrounds can be overlaid on:
| Use Case | Application |
|---|---|
| Website widgets | Avatar explaining features on your landing page |
| Product demos | Avatar narrating over screen recordings |
| Email thumbnails | Avatar thumbnail that links to full video |
| Presentations | Avatar presenter in corner of slides |
| Social ads | Avatar over product imagery or B-roll |
Making AI avatars feel like authentic user-generated content:
Automate avatar video production at scale.
```python
import requests

CREATIFY_API_ID = "your-api-id"
CREATIFY_API_KEY = "your-api-key"

# Every request authenticates with these two headers.
HEADERS = {
    "Content-Type": "application/json",
    "X-API-ID": CREATIFY_API_ID,
    "X-API-KEY": CREATIFY_API_KEY,
}

BASE_URL = "https://api.creatify.ai/api"
```
Don't have an API key yet? No problem — grab one in under 2 minutes:
- Sign up free at creatify.ai
- Go to Settings → API
- Copy your API ID and API Key — that's it. New accounts get free credits to start.
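Once you have the two values, one option is to read them from environment variables rather than pasting them into source files. This is a minimal sketch; the variable names `CREATIFY_API_ID` and `CREATIFY_API_KEY` are an assumption that mirrors the constants used in this document, and `load_headers` is an illustrative helper.

```python
import os

def load_headers(env=os.environ):
    """Build the request headers from environment variables."""
    try:
        api_id = env["CREATIFY_API_ID"]    # assumed variable name
        api_key = env["CREATIFY_API_KEY"]  # assumed variable name
    except KeyError as missing:
        raise RuntimeError(f"Set {missing.args[0]} before calling the API") from None
    return {
        "Content-Type": "application/json",
        "X-API-ID": api_id,
        "X-API-KEY": api_key,
    }

# Example with an explicit mapping instead of the real environment.
headers = load_headers({"CREATIFY_API_ID": "demo-id", "CREATIFY_API_KEY": "demo-key"})
print(sorted(headers))
```

Keeping credentials out of the code makes it harder to commit them to a repository by accident.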
```python
import time

def poll_until_done(url, headers, max_wait=600, interval=10):
    """Poll a status endpoint until the job completes."""
    elapsed = 0
    while elapsed < max_wait:
        resp = requests.get(url, headers=headers)
        resp.raise_for_status()  # surface HTTP errors instead of parsing an error body
        data = resp.json()
        if data.get("status") == "done":
            return data
        elif data.get("status") in ("failed", "error"):
            raise Exception(f"Job failed: {data.get('failed_reason', 'Unknown')}")
        time.sleep(interval)
        elapsed += interval
    raise TimeoutError(f"Job did not complete within {max_wait}s")
```
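A fixed 10-second interval is simple, but short jobs wait longer than necessary and long jobs generate many requests. One possible variation, sketched here under stated assumptions, doubles the delay after each attempt. `poll_with_backoff` is a hypothetical helper; it takes any zero-argument `fetch` callable that returns the status dictionary, and the intermediate status values in the demo ("pending", "running") are made up for illustration.

```python
import time

def poll_with_backoff(fetch, max_wait=600, first_delay=2, max_delay=30, sleep=time.sleep):
    """Call fetch() until it reports a terminal status, doubling the delay each time."""
    waited, delay = 0, first_delay
    while True:
        data = fetch()
        status = data.get("status")
        if status == "done":
            return data
        if status in ("failed", "error"):
            raise RuntimeError(f"Job failed: {data.get('failed_reason', 'Unknown')}")
        if waited >= max_wait:
            raise TimeoutError(f"Job did not complete within {max_wait}s")
        sleep(delay)
        waited += delay
        delay = min(delay * 2, max_delay)  # cap the delay so polling never stalls

# Demo with canned responses and a fake sleep, so nothing touches the network.
responses = iter([{"status": "pending"}, {"status": "running"},
                  {"status": "done", "output": "https://example.com/video.mp4"}])
delays = []
result = poll_with_backoff(lambda: next(responses), sleep=delays.append)
print(result["output"], delays)
```

In real use, `fetch` could wrap one of the status helpers from this document, for example `lambda: check_avatar_status(video["id"])`.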
Generate a video of a single avatar speaking from text. Simple, fast, great for short content.
Cost: 5 credits per 30 seconds
```python
def list_personas():
    """Get all 1,500+ available avatar personas."""
    resp = requests.get(f"{BASE_URL}/personas/", headers=HEADERS)
    resp.raise_for_status()
    return resp.json()  # Each has: id, name, gender, thumbnail, etc.


def create_avatar_video(text, creator_id, aspect_ratio="9:16", model_version="aurora_v1_fast"):
    """Generate a single-scene avatar video from text."""
    resp = requests.post(f"{BASE_URL}/lipsyncs/", headers=HEADERS, json={
        "text": text,
        "creator": creator_id,
        "aspect_ratio": aspect_ratio,
        "model_version": model_version,
    })
    resp.raise_for_status()
    return resp.json()


def check_avatar_status(lipsync_id):
    """Check avatar video generation status."""
    resp = requests.get(f"{BASE_URL}/lipsyncs/{lipsync_id}/", headers=HEADERS)
    resp.raise_for_status()
    return resp.json()


def create_transparent_avatar(text, creator_id, aspect_ratio="9:16"):
    """Generate avatar with transparent background (WebM format)."""
    resp = requests.post(f"{BASE_URL}/lipsyncs/", headers=HEADERS, json={
        "text": text,
        "creator": creator_id,
        "aspect_ratio": aspect_ratio,
        "transparent_background": True,
    })
    resp.raise_for_status()
    return resp.json()
```
Create multi-scene videos with different avatars, voices, backgrounds, and CTAs per scene.
Cost: 5 credits per 30 seconds
```python
def create_multi_scene_video(scenes, aspect_ratio="9:16", webhook_url=None):
    """Create a multi-scene avatar video.

    scenes: list of dicts, each with:
        - text (str): Script for this scene
        - creator (str): Avatar persona ID
        - voice_id (str, optional): Override voice
        - background (str, optional): Background image/video URL
    """
    payload = {
        "scenes": scenes,
        "aspect_ratio": aspect_ratio,
    }
    if webhook_url:
        payload["webhook_url"] = webhook_url
    resp = requests.post(f"{BASE_URL}/lipsyncs_v2/", headers=HEADERS, json=payload)
    resp.raise_for_status()
    return resp.json()


# Example: 3-scene product ad
scenes = [
    {
        "text": "Stop what you're doing. I need to tell you about something.",
        "creator": "18fccce8-86e7-5f31-abc8-18915cb872be",
    },
    {
        "text": "This serum literally transformed my skin in two weeks. No exaggeration.",
        "creator": "18fccce8-86e7-5f31-abc8-18915cb872be",
    },
    {
        "text": "Link is in my bio. Trust me, your future self will thank you.",
        "creator": "18fccce8-86e7-5f31-abc8-18915cb872be",
    },
]
video = create_multi_scene_video(scenes, aspect_ratio="9:16")
```
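When `webhook_url` is set, the service presumably posts a notification to that URL when the job finishes, which avoids polling. This document does not describe the webhook payload, so the sketch below assumes it is a JSON body carrying the same fields the status endpoints return (`id`, `status`, `output` or `video_output`, `failed_reason`). Treat both that shape and the `parse_webhook` helper as assumptions to verify against the official API reference.

```python
import json

def parse_webhook(body: bytes) -> dict:
    """Extract the fields of interest from a webhook body (assumed JSON shape)."""
    event = json.loads(body)
    return {
        "id": event.get("id"),
        "status": event.get("status"),
        # Assumption: the finished video URL arrives under one of these keys.
        "video_url": event.get("output") or event.get("video_output"),
        "error": event.get("failed_reason"),
    }

# Demo with a hand-written body; a real request body may differ.
sample = b'{"id": "abc123", "status": "done", "output": "https://example.com/v.mp4"}'
print(parse_webhook(sample))
```

A function like this can be called from whatever web framework receives the POST request.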
Generate studio-grade avatar videos from a reference image and audio file. Best-in-class lip sync.
Cost: 5 credits per 30 seconds
```python
def create_aurora_video(image_url, audio_url, model_version="aurora_v1_fast", webhook_url=None):
    """Generate a studio-grade avatar video from image + audio."""
    payload = {
        "image": image_url,
        "audio": audio_url,
        "model_version": model_version,
    }
    if webhook_url:
        payload["webhook_url"] = webhook_url
    resp = requests.post(f"{BASE_URL}/aurora/", headers=HEADERS, json=payload)
    resp.raise_for_status()
    return resp.json()


def check_aurora_status(aurora_id):
    """Check Aurora generation status."""
    resp = requests.get(f"{BASE_URL}/aurora/{aurora_id}/", headers=HEADERS)
    resp.raise_for_status()
    return resp.json()
```
Convert scripts into studio-quality voiceover audio.
Cost: 1 credit per 30 seconds
```python
def list_voices():
    """List all available TTS voices and accents."""
    resp = requests.get(f"{BASE_URL}/voices/", headers=HEADERS)
    resp.raise_for_status()
    return resp.json()


def generate_tts(script, accent_id, webhook_url=None):
    """Generate voiceover audio from a script."""
    payload = {
        "script": script,
        "accent": accent_id,
    }
    if webhook_url:
        payload["webhook_url"] = webhook_url
    resp = requests.post(f"{BASE_URL}/text_to_speech/", headers=HEADERS, json=payload)
    resp.raise_for_status()
    return resp.json()


def check_tts_status(tts_id):
    """Check TTS generation status."""
    resp = requests.get(f"{BASE_URL}/text_to_speech/{tts_id}/", headers=HEADERS)
    resp.raise_for_status()
    return resp.json()
```
Clone a custom voice for brand consistency.
```python
def clone_voice(audio_url, name):
    """Clone a voice from an audio sample."""
    resp = requests.post(f"{BASE_URL}/voices/clone/", headers=HEADERS, json={
        "audio_url": audio_url,
        "name": name,
    })
    resp.raise_for_status()
    return resp.json()
```
Upload your own video to create a custom avatar persona.
Note: Custom avatar creation takes 1-2 days for processing/approval.
```python
def create_custom_avatar(lipsync_video_url, name, gender="m", scene="office"):
    """Create a custom avatar from your own video."""
    resp = requests.post(f"{BASE_URL}/personas/", headers=HEADERS, json={
        "lipsync_input": lipsync_video_url,
        "creator_name": name,
        "gender": gender,
        "video_scene": scene,
    })
    resp.raise_for_status()
    return resp.json()


def check_custom_avatar_status(persona_id):
    """Check custom avatar creation status."""
    resp = requests.get(f"{BASE_URL}/personas/{persona_id}/", headers=HEADERS)
    resp.raise_for_status()
    return resp.json()
```
Generate the audio first, then pair it with any image to produce an avatar video.
```python
def tts_to_aurora(script, accent_id, image_url):
    """Pipeline: script → audio → avatar video."""
    # Step 1: Generate audio
    tts = generate_tts(script, accent_id)
    tts_result = poll_until_done(
        f"{BASE_URL}/text_to_speech/{tts['id']}/", HEADERS, max_wait=120
    )
    audio_url = tts_result["output"]

    # Step 2: Generate Aurora video
    aurora = create_aurora_video(image_url, audio_url)
    aurora_result = poll_until_done(
        f"{BASE_URL}/aurora/{aurora['id']}/", HEADERS, max_wait=600
    )
    return aurora_result
```
Test multiple avatars with the same script to find the best performer.
```python
def batch_avatar_ab_test(script, creator_ids, aspect_ratio="9:16"):
    """Generate the same script with multiple avatars for A/B testing."""
    # Submit every job first so the videos render in parallel on the server.
    jobs = []
    for creator_id in creator_ids:
        video = create_avatar_video(script, creator_id, aspect_ratio)
        jobs.append({"creator_id": creator_id, "video_id": video["id"]})

    # Then collect results; one failed job does not abort the rest.
    results = []
    for job in jobs:
        try:
            result = poll_until_done(
                f"{BASE_URL}/lipsyncs/{job['video_id']}/", HEADERS, max_wait=600
            )
            results.append({
                "creator_id": job["creator_id"],
                "video_url": result.get("output") or result.get("video_output"),
                "status": "done",
            })
        except Exception as e:
            results.append({
                "creator_id": job["creator_id"],
                "error": str(e),
                "status": "failed",
            })
    return results
```
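Because the jobs above are polled one after another, the worst-case wait grows with the number of jobs: each `poll_until_done` call can take up to `max_wait` on its own. One way to bound the total wait is to poll concurrently. The sketch below uses the standard library's `ThreadPoolExecutor`; `poll_all` and the `poll_one` callable are hypothetical names, and the demo substitutes a fake poller for real API calls.

```python
from concurrent.futures import ThreadPoolExecutor

def poll_all(jobs, poll_one, max_workers=5):
    """Poll every job concurrently; poll_one(job) returns a result dict or raises."""
    def safe_poll(job):
        try:
            return {**job, "result": poll_one(job), "status": "done"}
        except Exception as exc:  # record the failure and keep the other jobs going
            return {**job, "error": str(exc), "status": "failed"}

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() returns results in the same order as the input jobs.
        return list(pool.map(safe_poll, jobs))

# Demo with a fake poller: the second job fails, the others succeed.
def fake_poll(job):
    if job["video_id"] == "v2":
        raise RuntimeError("Job failed: render error")
    return {"output": f"https://example.com/{job['video_id']}.mp4"}

jobs = [{"video_id": "v1"}, {"video_id": "v2"}, {"video_id": "v3"}]
for row in poll_all(jobs, fake_poll):
    print(row["video_id"], row["status"])
```

In real use, `poll_one` could wrap the document's `poll_until_done` with the job's status URL.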
Generate multiple scripts with the same avatar for hook testing.
```python
def multi_script_batch(scripts, creator_id, aspect_ratio="9:16"):
    """Generate multiple scripts with the same avatar."""
    # Submit every script first, keeping a 50-character preview as a label.
    jobs = []
    for script in scripts:
        video = create_avatar_video(script, creator_id, aspect_ratio)
        jobs.append({"script": script[:50], "video_id": video["id"]})

    results = []
    for job in jobs:
        try:
            result = poll_until_done(
                f"{BASE_URL}/lipsyncs/{job['video_id']}/", HEADERS, max_wait=600
            )
            results.append({
                "script_preview": job["script"],
                "video_url": result.get("output") or result.get("video_output"),
                "status": "done",
            })
        except Exception as e:
            results.append({
                "script_preview": job["script"],
                "error": str(e),
                "status": "failed",
            })
    return results
```
| Endpoint | Credits | Typical Latency |
|---|---|---|
| AI Avatar v1 | 5 per 30s | ~1:10 ratio (15s video ≈ 150s) |
| AI Avatar v2 (multi-scene) | 5 per 30s | ~2-5 minutes |
| Aurora | 5 per 30s | ~2-3 minutes |
| Text to Speech | 1 per 30s | ~30-60 seconds |
| Voice Cloning | Varies | Minutes |
| Custom Avatar Creation | Free (slot required) | 1-2 days |
| Preview (v1 or v2) | 1 per 30s | ~1-2 minutes |
| Render (v2) | 4 per 30s | ~2-3 minutes |
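The per-30-second rates in this table can be turned into a rough budget estimate before launching a batch. The sketch below is only an approximation: it assumes usage is billed in whole 30-second blocks, rounded up, which this document does not state. `CREDITS_PER_30S` and `estimate_credits` are illustrative names.

```python
import math

# Rates copied from the table above (credits per 30 seconds of output).
CREDITS_PER_30S = {
    "avatar_v1": 5,
    "avatar_v2": 5,
    "aurora": 5,
    "tts": 1,
    "preview": 1,
    "render_v2": 4,
}

def estimate_credits(endpoint: str, seconds: float, variants: int = 1) -> int:
    """Estimate credits for `variants` videos of `seconds` length each."""
    blocks = math.ceil(seconds / 30)  # assumption: partial blocks round up
    return CREDITS_PER_30S[endpoint] * blocks * variants

print(estimate_credits("avatar_v1", 30, variants=5))  # 25
print(estimate_credits("tts", 45))                    # 2
```

The first call reproduces the 25 credits per 30 seconds quoted for an A/B test of five avatar styles.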
| I want to... | Use this | Credits |
|---|---|---|
| Quick single-avatar video | AI Avatar v1 | 5/30s |
| Multi-scene video with transitions | AI Avatar v2 | 5/30s |
| Best possible lip sync quality | Aurora | 5/30s |
| Just generate audio narration | Text to Speech | 1/30s |
| Use my own face/person | Custom Avatar | Free (slot) |
| Use my own voice | Voice Cloning | Varies |
| Avatar over custom background | Transparent + overlay | 5/30s |
| A/B test 5 avatar styles | Batch Avatar v1 x5 | 25/30s |