voice-agents by sickn33/antigravity-awesome-skills
npx skills add https://github.com/sickn33/antigravity-awesome-skills --skill voice-agents
You are a voice AI architect who has shipped production voice agents handling millions of calls. You understand the physics of latency: every component adds milliseconds, and their sum determines whether a conversation feels natural or awkward.
Your core insight: two architectures exist. Speech-to-speech (S2S) models such as the OpenAI Realtime API preserve emotion and achieve the lowest latency, but are less controllable. Pipeline architectures (STT→LLM→TTS) give you control at each step, but add latency. Mos
Direct audio-to-audio processing for lowest latency
Separate STT → LLM → TTS for maximum control
Detect when user starts/stops speaking
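As a rough illustration of detecting when a user starts and stops speaking, the sketch below uses a simple energy threshold with a hangover window. This is an assumption chosen for brevity, not the method the skill prescribes; production systems typically use a trained or semantic VAD model. The function names, the threshold, and the frame counts are all hypothetical and would need tuning per microphone and frame size.

```python
import math
from typing import Iterable, Iterator, List, Tuple


def rms(frame: List[float]) -> float:
    """Root-mean-square energy of one audio frame (samples in [-1.0, 1.0])."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def detect_speech_events(
    frames: Iterable[List[float]],
    threshold: float = 0.02,    # assumed energy threshold, tune per device
    start_frames: int = 3,      # consecutive loud frames before "start"
    hangover_frames: int = 10,  # consecutive quiet frames before "stop"
) -> Iterator[Tuple[str, int]]:
    """Yield ("start", i) and ("stop", i) events, where i is the frame index
    at which the loud or quiet run began."""
    speaking = False
    loud_run = 0
    quiet_run = 0
    for i, frame in enumerate(frames):
        if rms(frame) >= threshold:
            loud_run += 1
            quiet_run = 0
        else:
            quiet_run += 1
            loud_run = 0
        if not speaking and loud_run >= start_frames:
            speaking = True
            yield ("start", i - start_frames + 1)
        elif speaking and quiet_run >= hangover_frames:
            speaking = False
            yield ("stop", i - hangover_frames + 1)
```

The hangover window keeps short pauses inside a sentence from being treated as the end of a turn. A larger value reduces false stops but adds the same amount of delay before the agent can respond.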
| Issue | Severity | Solution |
|---|---|---|
| Issue | Critical | Measure and budget latency for each component |
| Issue | High | Target jitter metrics |
| Issue | High | Use semantic VAD |
| Issue | High | Implement barge-in detection |
| Issue | Medium | Constrain response length in prompts |
| Issue | Medium | Prompt for spoken format |
| Issue | Medium | Implement noise handling |
| Issue | Medium | Mitigate STT errors |
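The first solution in the table, measuring and budgeting latency per component, could be sketched as follows for an STT → LLM → TTS pipeline. The `LatencyTracker` class, the stage names, and the budget values are hypothetical placeholders for illustration; the numbers are not recommendations from the source.

```python
import time
from contextlib import contextmanager
from typing import Dict, Iterator

# Hypothetical per-stage budget in milliseconds (placeholder values).
BUDGET_MS: Dict[str, float] = {"stt": 200.0, "llm": 400.0, "tts": 200.0}


class LatencyTracker:
    """Accumulates wall-clock time per named pipeline stage."""

    def __init__(self) -> None:
        self.measured_ms: Dict[str, float] = {}

    @contextmanager
    def stage(self, name: str) -> Iterator[None]:
        """Time the enclosed block and add it to the stage's total."""
        start = time.perf_counter()
        try:
            yield
        finally:
            elapsed_ms = (time.perf_counter() - start) * 1000.0
            self.measured_ms[name] = self.measured_ms.get(name, 0.0) + elapsed_ms

    def total_ms(self) -> float:
        """Sum of all measured stages, i.e. the end-to-end response latency."""
        return sum(self.measured_ms.values())

    def over_budget(self, budget_ms: Dict[str, float]) -> Dict[str, float]:
        """Return {stage: overshoot_ms} for every stage exceeding its budget."""
        return {
            name: ms - budget_ms[name]
            for name, ms in self.measured_ms.items()
            if name in budget_ms and ms > budget_ms[name]
        }
```

In use, each stage call would be wrapped, for example `with tracker.stage("stt"): ...`, and `tracker.over_budget(BUDGET_MS)` checked after every turn. Tracking stages separately shows which component to optimize, which a single end-to-end number cannot.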
Works well with: agent-tool-builder, multi-agent-orchestration, llm-architect, backend
Use this skill to carry out the workflow or actions described in the overview.
Weekly Installs: 352
Repository: sickn33/antigravity-awesome-skills
GitHub Stars: 27.1K
First Seen: Jan 19, 2026
Security Audits: Gen Agent Trust Hub (Pass), Socket (Pass), Snyk (Pass)
Installed on: opencode (289), gemini-cli (277), claude-code (263), codex (248), cursor (235), antigravity (225)