langfuse by sickn33/antigravity-awesome-skills
npx skills add https://github.com/sickn33/antigravity-awesome-skills --skill langfuse
Role: LLM Observability Architect
You are an expert in LLM observability and evaluation. You think in terms of traces, spans, and metrics. You know that LLM applications need monitoring just like traditional software - but with different dimensions (cost, quality, latency). You use data to drive prompt improvements and catch regressions.
Instrument LLM calls with Langfuse
When to use: Any LLM application
from langfuse import Langfuse
import openai

# Initialize the client
langfuse = Langfuse(
    public_key="pk-...",
    secret_key="sk-...",
    host="https://cloud.langfuse.com"  # or self-hosted URL
)

# Create a trace for a user request
trace = langfuse.trace(
    name="chat-completion",
    user_id="user-123",
    session_id="session-456",  # Groups related traces
    metadata={"feature": "customer-support"},
    tags=["production", "v2"]
)

# Log a generation (LLM call)
generation = trace.generation(
    name="gpt-4o-response",
    model="gpt-4o",
    model_parameters={"temperature": 0.7},
    input={"messages": [{"role": "user", "content": "Hello"}]},
    metadata={"attempt": 1}
)

# Make the actual LLM call
response = openai.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello"}]
)

# Complete the generation with the output and token usage
generation.end(
    output=response.choices[0].message.content,
    usage={
        "input": response.usage.prompt_tokens,
        "output": response.usage.completion_tokens
    }
)

# Score the trace
trace.score(
    name="user-feedback",
    value=1,  # 1 = positive, 0 = negative
    comment="User clicked helpful"
)

# Flush before exit (important in serverless environments)
langfuse.flush()
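The token usage recorded in `generation.end()` is what cost dashboards are built from. As a minimal sketch of deriving a per-call cost metric (the per-1K-token prices below are illustrative placeholders, not actual OpenAI pricing, and `call_cost` is a hypothetical helper):

```python
# Illustrative price table; real prices must come from your provider.
PRICES_PER_1K = {"gpt-4o": {"input": 0.005, "output": 0.015}}

def call_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Compute the dollar cost of one call from its logged token usage."""
    p = PRICES_PER_1K[model]
    return (input_tokens / 1000) * p["input"] + (output_tokens / 1000) * p["output"]

# e.g. the usage dict passed to generation.end() above
cost = call_cost("gpt-4o", input_tokens=1000, output_tokens=500)
```

A value like this can be attached to the trace as metadata or a score, so cost regressions show up alongside quality and latency.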
Automatic tracing with OpenAI SDK
When to use: OpenAI-based applications
from langfuse.openai import openai  # drop-in replacement for the OpenAI client

# All calls are traced automatically
response = openai.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello"}],
    # Langfuse-specific parameters
    name="greeting",  # Trace name
    session_id="session-123",
    user_id="user-456",
    tags=["test"],
    metadata={"feature": "chat"}
)

# Works with streaming
stream = openai.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Tell me a story"}],
    stream=True,
    name="story-generation"
)
for chunk in stream:
    # delta.content can be None on the final chunk
    print(chunk.choices[0].delta.content or "", end="")

# Works with async
import asyncio
from langfuse.openai import AsyncOpenAI

async_client = AsyncOpenAI()

async def main():
    response = await async_client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Hello"}],
        name="async-greeting"
    )
    return response

asyncio.run(main())
Trace LangChain applications
When to use: LangChain-based applications
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langfuse.callback import CallbackHandler

# Create a Langfuse callback handler
langfuse_handler = CallbackHandler(
    public_key="pk-...",
    secret_key="sk-...",
    host="https://cloud.langfuse.com",
    session_id="session-123",
    user_id="user-456"
)

# Use it with any LangChain component
llm = ChatOpenAI(model="gpt-4o")
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant."),
    ("user", "{input}")
])
chain = prompt | llm

# Pass the handler to invoke
response = chain.invoke(
    {"input": "Hello"},
    config={"callbacks": [langfuse_handler]}
)

# Or register it as a default handler (availability varies by LangChain version)
import langchain
langchain.callbacks.manager.set_handler(langfuse_handler)

# Then all calls are traced
response = chain.invoke({"input": "Hello"})

# Works with agents, retrievers, etc.
from langchain.agents import AgentExecutor, create_openai_tools_agent

agent = create_openai_tools_agent(llm, tools, prompt)  # `tools` defined elsewhere
agent_executor = AgentExecutor(agent=agent, tools=tools)
result = agent_executor.invoke(
    {"input": "What's the weather?"},
    config={"callbacks": [langfuse_handler]}
)
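Since the handler above is configured with `session_id` and `user_id` up front, it helps to derive those identifiers deterministically so the same conversation always maps to the same trace grouping. A sketch under the assumption that sessions are keyed by user and conversation; `stable_session_id` is a hypothetical helper, not a Langfuse API:

```python
import hashlib

def stable_session_id(user_id: str, conversation_key: str) -> str:
    """Derive a deterministic session id from a user and conversation key."""
    digest = hashlib.sha256(f"{user_id}:{conversation_key}".encode()).hexdigest()
    return f"session-{digest[:12]}"

sid = stable_session_id("user-456", "support-thread-1")
```

Because the id is a pure function of its inputs, retries and parallel workers all attach their traces to the same session.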
Anti-pattern: relying on traces being delivered automatically.
Why bad: Traces are batched and sent asynchronously. A serverless function may exit before the batch is flushed, and that data is lost.
Instead: Always call langfuse.flush() before exit. Use context managers where available. Consider synchronous mode for critical traces.

Anti-pattern: instrumenting everything.
Why bad: Noisy traces and performance overhead make the important information hard to find.
Instead: Focus on LLM calls, key logic, and user actions. Group related operations. Use meaningful span names.

Anti-pattern: omitting user and session identifiers.
Why bad: You can't debug specific users, you can't track sessions, and analytics are limited.
Instead: Always pass user_id and session_id. Use consistent identifiers. Add relevant metadata.
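The first anti-pattern can be guarded against by registering the flush at interpreter shutdown. A minimal sketch with a dummy client standing in for the real Langfuse object; note that serverless runtimes may freeze rather than exit, so an explicit flush at the end of each handler invocation is still needed:

```python
import atexit

class DummyClient:
    """Stand-in for the Langfuse client, exposing only flush()."""
    def __init__(self):
        self.flushed = False

    def flush(self):
        self.flushed = True

client = DummyClient()
atexit.register(client.flush)  # runs even if the script returns early

# ... application work ...

client.flush()  # in serverless handlers, also flush explicitly per invocation
```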
Works well with: langgraph, crewai, structured-output, autonomous-agents
This skill applies when executing the workflows or actions described in the overview.
Weekly Installs: 314
Repository: GitHub (27.1K stars)
First Seen: Jan 19, 2026
Security Audits: Gen Agent Trust Hub: Pass · Socket: Pass · Snyk: Fail
Installed on: claude-code (262), opencode (258), gemini-cli (247), codex (219), antigravity (213), cursor (212)