agents-py by codestackr/livekit-skills
npx skills add https://github.com/codestackr/livekit-skills --skill agents-py
Build voice AI agents with LiveKit's Python Agents SDK.
This skill works alongside the LiveKit MCP server, which provides direct access to the latest LiveKit documentation, code examples, and changelogs. Use these tools when you need up-to-date information that may have changed since this skill was created.
Available MCP tools:
docs_search - Search the LiveKit docs site
get_pages - Fetch specific documentation pages by path
get_changelog - Get recent releases and updates for LiveKit packages
code_search - Search LiveKit repositories for code examples
get_python_agent_example - Browse 100+ Python agent examples

When to use MCP tools:
When to use local references:
Use MCP tools and local references together for the best experience.
Consult these resources as needed:
uv add "livekit-agents[silero,turn-detector]~=1.3" \
"livekit-plugins-noise-cancellation~=0.2" \
"python-dotenv"
Use the LiveKit CLI to load your credentials into a .env.local file:
lk app env -w
Or manually create a .env.local file:
LIVEKIT_API_KEY=your_api_key
LIVEKIT_API_SECRET=your_api_secret
LIVEKIT_URL=wss://your-project.livekit.cloud
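For illustration, load_dotenv (used in the agent examples below) reads these KEY=value lines into the process environment. A minimal sketch of that behavior, not the python-dotenv implementation itself:

```python
import os

def load_env_file(text: str) -> dict[str, str]:
    """Parse KEY=value lines (ignoring blanks and # comments) and
    export them via os.environ, mirroring what python-dotenv does."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    os.environ.update(values)
    return values

env = load_env_file(
    "LIVEKIT_API_KEY=your_api_key\n"
    "LIVEKIT_URL=wss://your-project.livekit.cloud\n"
)
```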
from dotenv import load_dotenv

from livekit import agents, rtc
from livekit.agents import AgentSession, Agent, AgentServer, room_io
from livekit.plugins import noise_cancellation, silero
from livekit.plugins.turn_detector.multilingual import MultilingualModel

load_dotenv(".env.local")


class Assistant(Agent):
    def __init__(self) -> None:
        super().__init__(
            instructions="""You are a helpful voice AI assistant.
            Keep responses concise, 1-3 sentences. No markdown or emojis.""",
        )


server = AgentServer()


@server.rtc_session()
async def entrypoint(ctx: agents.JobContext):
    session = AgentSession(
        stt="assemblyai/universal-streaming:en",
        llm="openai/gpt-4.1-mini",
        tts="cartesia/sonic-3:9626c31c-bec5-4cca-baa8-f8ba9e84c8bc",
        vad=silero.VAD.load(),
        turn_detection=MultilingualModel(),
    )

    await session.start(
        room=ctx.room,
        agent=Assistant(),
        room_options=room_io.RoomOptions(
            audio_input=room_io.AudioInputOptions(
                noise_cancellation=lambda params: noise_cancellation.BVCTelephony()
                if params.participant.kind == rtc.ParticipantKind.PARTICIPANT_KIND_SIP
                else noise_cancellation.BVC(),
            ),
        ),
    )

    await session.generate_reply(
        instructions="Greet the user and offer your assistance."
    )


if __name__ == "__main__":
    agents.cli.run_app(server)
from dotenv import load_dotenv

from livekit import agents, rtc
from livekit.agents import AgentSession, Agent, AgentServer, room_io
from livekit.plugins import openai, noise_cancellation

load_dotenv(".env.local")


class Assistant(Agent):
    def __init__(self) -> None:
        super().__init__(
            instructions="You are a helpful voice AI assistant."
        )


server = AgentServer()


@server.rtc_session()
async def entrypoint(ctx: agents.JobContext):
    session = AgentSession(
        llm=openai.realtime.RealtimeModel(voice="coral")
    )

    await session.start(
        room=ctx.room,
        agent=Assistant(),
        room_options=room_io.RoomOptions(
            audio_input=room_io.AudioInputOptions(
                noise_cancellation=lambda params: noise_cancellation.BVCTelephony()
                if params.participant.kind == rtc.ParticipantKind.PARTICIPANT_KIND_SIP
                else noise_cancellation.BVC(),
            ),
        ),
    )

    await session.generate_reply(
        instructions="Greet the user and offer your assistance."
    )


if __name__ == "__main__":
    agents.cli.run_app(server)
Define agent behavior by subclassing Agent:
from livekit.agents import Agent, function_tool


class MyAgent(Agent):
    def __init__(self) -> None:
        super().__init__(
            instructions="Your system prompt here",
        )

    async def on_enter(self) -> None:
        """Called when agent becomes active."""
        await self.session.generate_reply(
            instructions="Greet the user"
        )

    async def on_exit(self) -> None:
        """Called before agent hands off to another agent."""
        pass

    @function_tool()
    async def my_tool(self, param: str) -> str:
        """Tool description for the LLM."""
        return f"Result: {param}"
The session orchestrates the voice pipeline:
session = AgentSession(
    stt="assemblyai/universal-streaming:en",
    llm="openai/gpt-4.1-mini",
    tts="cartesia/sonic-3:voice_id",
    vad=silero.VAD.load(),
    turn_detection=MultilingualModel(),
)
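The session object also exposes control methods (say, interrupt, update_agent, and so on). A toy stand-in illustrating the call flow of those methods; this is not the LiveKit API, just a self-contained model of how the calls relate:

```python
import asyncio

class ToySession:
    """Toy stand-in for AgentSession; every method here is a
    simplified illustration, not LiveKit's real behavior."""
    def __init__(self):
        self.log = []
        self.current_agent = None

    async def start(self, room, agent):
        # Attach an agent and begin handling the room's audio.
        self.current_agent = agent
        self.log.append(f"started in {room}")

    async def say(self, text):
        # Speak text directly, bypassing the LLM.
        self.log.append(f"say: {text}")

    def interrupt(self):
        # Stop whatever the agent is currently saying.
        self.log.append("interrupted")

    def update_agent(self, new_agent):
        # Hand the session off to a different agent.
        self.current_agent = new_agent

async def demo():
    s = ToySession()
    await s.start("room-1", "greeter")
    await s.say("Hello there!")
    s.interrupt()
    s.update_agent("support")
    return s

session_demo = asyncio.run(demo())
```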
Key methods:
session.start(room, agent) - Start the session
session.say(text) - Speak text directly
session.generate_reply(instructions) - Generate an LLM response
session.interrupt() - Stop current speech
session.update_agent(new_agent) - Switch to a different agent

Use the @function_tool decorator:
from livekit.agents import function_tool, RunContext

@function_tool()
async def get_weather(self, context: RunContext, location: str) -> str:
    """Get the current weather for a location."""
    return f"Weather in {location}: Sunny, 72°F"
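Conceptually, the decorator exposes the function's name, docstring, and typed parameters to the LLM as a tool schema. A rough, self-contained sketch of that idea, not LiveKit's actual implementation:

```python
import inspect

def tool_schema(fn):
    """Derive a minimal tool description from a function's
    signature and docstring; roughly the information a tool
    decorator hands to the LLM (shape is illustrative only)."""
    sig = inspect.signature(fn)
    params = {
        name: getattr(p.annotation, "__name__", "any")
        for name, p in sig.parameters.items()
        if name not in ("self", "context")  # skip framework-supplied args
    }
    return {
        "name": fn.__name__,
        "description": inspect.getdoc(fn),
        "parameters": params,
    }

async def get_weather(location: str) -> str:
    """Get the current weather for a location."""
    return f"Weather in {location}: Sunny, 72°F"

schema = tool_schema(get_weather)
```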
# Development mode with auto-reload
uv run agent.py dev
# Console mode (local testing)
uv run agent.py console
# Production mode
uv run agent.py start
# Download required model files
uv run agent.py download-files
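The dev/console/start/download-files modes are subcommands read from argv; the real dispatch lives inside agents.cli.run_app. A hypothetical sketch of that pattern, with the descriptions taken from the comments above and the default mode being an assumption:

```python
import sys

def dispatch_mode(argv):
    """Pick a run mode from the first CLI argument. The mode names
    match the commands above; the returned strings and the 'start'
    default are purely illustrative, not LiveKit's CLI behavior."""
    modes = {
        "dev": "development mode with auto-reload",
        "console": "console mode (local testing)",
        "start": "production mode",
        "download-files": "download required model files",
    }
    mode = argv[1] if len(argv) > 1 else "start"
    if mode not in modes:
        raise SystemExit(f"unknown mode: {mode}")
    return modes[mode]

result = dispatch_mode(["agent.py", "dev"])
```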
Use model strings for simple configuration without API keys:
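The strings listed below appear to follow a provider/model[:variant] convention, where the variant is a language code or voice id. A small sketch that splits them apart; the format is inferred from the examples rather than from an official specification:

```python
def parse_model_string(s: str):
    """Split 'provider/model[:variant]' into its parts. The format
    is inferred from the example strings, not a documented LiveKit
    contract."""
    provider, _, rest = s.partition("/")
    model, _, variant = rest.partition(":")
    return provider, model, variant or None

stt = parse_model_string("assemblyai/universal-streaming:en")
tts = parse_model_string("cartesia/ink")
```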
STT (Speech-to-Text):
"assemblyai/universal-streaming:en" - AssemblyAI streaming
"deepgram/nova-3:en" - Deepgram Nova
"cartesia/ink" - Cartesia STT

LLM (Large Language Model):
"openai/gpt-4.1-mini" - GPT-4.1 mini (recommended)
"openai/gpt-4.1" - GPT-4.1
"openai/gpt-5" - GPT-5
"gemini/gemini-3-flash" - Gemini 3 Flash
"gemini/gemini-2.5-flash" - Gemini 2.5 Flash

TTS (Text-to-Speech):
"cartesia/sonic-3:{voice_id}" - Cartesia Sonic 3
"elevenlabs/eleven_turbo_v2_5:{voice_id}" - ElevenLabs
"deepgram/aura:{voice}" - Deepgram Aura

Use lk app env -w to load LiveKit Cloud credentials into your environment.

Weekly Installs: 46
Repository: github.com/codestackr/livekit-skills
GitHub Stars: 3
First Seen: Feb 3, 2026
Security Audits: Gen Agent Trust Hub: Pass, Socket: Pass, Snyk: Pass
Installed on: opencode (44), codex (44), gemini-cli (43), github-copilot (41), amp (35), kimi-cli (35)