agents-py by codestackr/livekit-skills
npx skills add https://github.com/codestackr/livekit-skills --skill agents-py
Build voice AI agents with LiveKit's Python Agents SDK.
This skill works alongside the LiveKit MCP server, which provides direct access to the latest LiveKit documentation, code examples, and changelogs. Use these tools when you need up-to-date information that may have changed since this skill was created.
Available MCP tools:
docs_search - Search the LiveKit docs site
get_pages - Fetch specific documentation pages by path
get_changelog - Get recent releases and updates for LiveKit packages
code_search - Search LiveKit repositories for code examples
get_python_agent_example - Browse 100+ Python agent examples

When to use MCP tools:
When to use local references:
Use MCP tools and local references together for the best experience.
Consult these resources as needed:
uv add "livekit-agents[silero,turn-detector]~=1.3" \
"livekit-plugins-noise-cancellation~=0.2" \
"python-dotenv"
Use the LiveKit CLI to load your credentials into a .env.local file:
lk app env -w
Or manually create a .env.local file:
LIVEKIT_API_KEY=your_api_key
LIVEKIT_API_SECRET=your_api_secret
LIVEKIT_URL=wss://your-project.livekit.cloud
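For illustration, load_dotenv (used in the agent examples below) reads these KEY=value lines into the process environment. A minimal sketch of that behavior, not the python-dotenv implementation itself:

```python
import os

def load_env_file(text: str) -> dict[str, str]:
    """Parse KEY=value lines (ignoring blanks and # comments) and
    export them via os.environ, mirroring what python-dotenv does."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    os.environ.update(values)
    return values

env = load_env_file(
    "LIVEKIT_API_KEY=your_api_key\n"
    "LIVEKIT_URL=wss://your-project.livekit.cloud\n"
)
```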
from dotenv import load_dotenv

from livekit import agents, rtc
from livekit.agents import AgentSession, Agent, AgentServer, room_io
from livekit.plugins import noise_cancellation, silero
from livekit.plugins.turn_detector.multilingual import MultilingualModel

load_dotenv(".env.local")


class Assistant(Agent):
    def __init__(self) -> None:
        super().__init__(
            instructions="""You are a helpful voice AI assistant.
            Keep responses concise, 1-3 sentences. No markdown or emojis.""",
        )


server = AgentServer()


@server.rtc_session()
async def entrypoint(ctx: agents.JobContext):
    session = AgentSession(
        stt="assemblyai/universal-streaming:en",
        llm="openai/gpt-4.1-mini",
        tts="cartesia/sonic-3:9626c31c-bec5-4cca-baa8-f8ba9e84c8bc",
        vad=silero.VAD.load(),
        turn_detection=MultilingualModel(),
    )

    await session.start(
        room=ctx.room,
        agent=Assistant(),
        room_options=room_io.RoomOptions(
            audio_input=room_io.AudioInputOptions(
                noise_cancellation=lambda params: noise_cancellation.BVCTelephony()
                if params.participant.kind == rtc.ParticipantKind.PARTICIPANT_KIND_SIP
                else noise_cancellation.BVC(),
            ),
        ),
    )

    await session.generate_reply(
        instructions="Greet the user and offer your assistance."
    )


if __name__ == "__main__":
    agents.cli.run_app(server)
from dotenv import load_dotenv

from livekit import agents, rtc
from livekit.agents import AgentSession, Agent, AgentServer, room_io
from livekit.plugins import openai, noise_cancellation

load_dotenv(".env.local")


class Assistant(Agent):
    def __init__(self) -> None:
        super().__init__(
            instructions="You are a helpful voice AI assistant."
        )


server = AgentServer()


@server.rtc_session()
async def entrypoint(ctx: agents.JobContext):
    session = AgentSession(
        llm=openai.realtime.RealtimeModel(voice="coral")
    )

    await session.start(
        room=ctx.room,
        agent=Assistant(),
        room_options=room_io.RoomOptions(
            audio_input=room_io.AudioInputOptions(
                noise_cancellation=lambda params: noise_cancellation.BVCTelephony()
                if params.participant.kind == rtc.ParticipantKind.PARTICIPANT_KIND_SIP
                else noise_cancellation.BVC(),
            ),
        ),
    )

    await session.generate_reply(
        instructions="Greet the user and offer your assistance."
    )


if __name__ == "__main__":
    agents.cli.run_app(server)
Define agent behavior by subclassing Agent:
from livekit.agents import Agent, function_tool


class MyAgent(Agent):
    def __init__(self) -> None:
        super().__init__(
            instructions="Your system prompt here",
        )

    async def on_enter(self) -> None:
        """Called when agent becomes active."""
        await self.session.generate_reply(
            instructions="Greet the user"
        )

    async def on_exit(self) -> None:
        """Called before agent hands off to another agent."""
        pass

    @function_tool()
    async def my_tool(self, param: str) -> str:
        """Tool description for the LLM."""
        return f"Result: {param}"
The session orchestrates the voice pipeline:
session = AgentSession(
    stt="assemblyai/universal-streaming:en",
    llm="openai/gpt-4.1-mini",
    tts="cartesia/sonic-3:voice_id",
    vad=silero.VAD.load(),
    turn_detection=MultilingualModel(),
)
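The session object also exposes control methods (say, interrupt, update_agent, and so on). A toy stand-in illustrating the call flow of those methods; this is not the LiveKit API, just a self-contained model of how the calls relate:

```python
import asyncio

class ToySession:
    """Toy stand-in for AgentSession; every method here is a
    simplified illustration, not LiveKit's real behavior."""
    def __init__(self):
        self.log = []
        self.current_agent = None

    async def start(self, room, agent):
        # Attach an agent and begin handling the room's audio.
        self.current_agent = agent
        self.log.append(f"started in {room}")

    async def say(self, text):
        # Speak text directly, bypassing the LLM.
        self.log.append(f"say: {text}")

    def interrupt(self):
        # Stop whatever the agent is currently saying.
        self.log.append("interrupted")

    def update_agent(self, new_agent):
        # Hand the session off to a different agent.
        self.current_agent = new_agent

async def demo():
    s = ToySession()
    await s.start("room-1", "greeter")
    await s.say("Hello there!")
    s.interrupt()
    s.update_agent("support")
    return s

session_demo = asyncio.run(demo())
```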
Key methods:
session.start(room, agent) - Start the session
session.say(text) - Speak text directly
session.generate_reply(instructions) - Generate an LLM response
session.interrupt() - Stop current speech
session.update_agent(new_agent) - Switch to a different agent

Use the @function_tool decorator:
from livekit.agents import function_tool, RunContext

@function_tool()
async def get_weather(self, context: RunContext, location: str) -> str:
    """Get the current weather for a location."""
    return f"Weather in {location}: Sunny, 72°F"
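Conceptually, the decorator exposes the function's name, docstring, and typed parameters to the LLM as a tool schema. A rough, self-contained sketch of that idea, not LiveKit's actual implementation:

```python
import inspect

def tool_schema(fn):
    """Derive a minimal tool description from a function's
    signature and docstring; roughly the information a tool
    decorator hands to the LLM (shape is illustrative only)."""
    sig = inspect.signature(fn)
    params = {
        name: getattr(p.annotation, "__name__", "any")
        for name, p in sig.parameters.items()
        if name not in ("self", "context")  # skip framework-supplied args
    }
    return {
        "name": fn.__name__,
        "description": inspect.getdoc(fn),
        "parameters": params,
    }

async def get_weather(location: str) -> str:
    """Get the current weather for a location."""
    return f"Weather in {location}: Sunny, 72°F"

schema = tool_schema(get_weather)
```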
# Development mode with auto-reload
uv run agent.py dev
# Console mode (local testing)
uv run agent.py console
# Production mode
uv run agent.py start
# Download required model files
uv run agent.py download-files
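The dev/console/start/download-files modes are subcommands read from argv; the real dispatch lives inside agents.cli.run_app. A hypothetical sketch of that pattern, with the descriptions taken from the comments above and the default mode being an assumption:

```python
import sys

def dispatch_mode(argv):
    """Pick a run mode from the first CLI argument. The mode names
    match the commands above; the returned strings and the 'start'
    default are purely illustrative, not LiveKit's CLI behavior."""
    modes = {
        "dev": "development mode with auto-reload",
        "console": "console mode (local testing)",
        "start": "production mode",
        "download-files": "download required model files",
    }
    mode = argv[1] if len(argv) > 1 else "start"
    if mode not in modes:
        raise SystemExit(f"unknown mode: {mode}")
    return modes[mode]

result = dispatch_mode(["agent.py", "dev"])
```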
Use model strings for simple configuration without API keys:
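The strings listed below appear to follow a provider/model[:variant] convention, where the variant is a language code or voice id. A small sketch that splits them apart; the format is inferred from the examples rather than from an official specification:

```python
def parse_model_string(s: str):
    """Split 'provider/model[:variant]' into its parts. The format
    is inferred from the example strings, not a documented LiveKit
    contract."""
    provider, _, rest = s.partition("/")
    model, _, variant = rest.partition(":")
    return provider, model, variant or None

stt = parse_model_string("assemblyai/universal-streaming:en")
tts = parse_model_string("cartesia/ink")
```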
STT (Speech-to-Text):
"assemblyai/universal-streaming:en" - AssemblyAI streaming
"deepgram/nova-3:en" - Deepgram Nova
"cartesia/ink" - Cartesia STT

LLM (Large Language Model):
"openai/gpt-4.1-mini" - GPT-4.1 mini (recommended)
"openai/gpt-4.1" - GPT-4.1
"openai/gpt-5" - GPT-5
"gemini/gemini-3-flash" - Gemini 3 Flash
"gemini/gemini-2.5-flash" - Gemini 2.5 Flash

TTS (Text-to-Speech):
"cartesia/sonic-3:{voice_id}" - Cartesia Sonic 3
"elevenlabs/eleven_turbo_v2_5:{voice_id}" - ElevenLabs
"deepgram/aura:{voice}" - Deepgram Aura

Use lk app env -w to load LiveKit Cloud credentials into your environment.

Weekly Installs: 46
Repository: github.com/codestackr/livekit-skills
GitHub Stars: 3
First Seen: Feb 3, 2026
Security Audits: Gen Agent Trust Hub: Pass, Socket: Pass, Snyk: Pass
Installed on: opencode (44), codex (44), gemini-cli (43), github-copilot (41), amp (35), kimi-cli (35)