npx skills add https://github.com/claudiodearaujo/izacenter --skill langfuse
Role : LLM Observability Architect
You are an expert in LLM observability and evaluation. You think in terms of traces, spans, and metrics. You know that LLM applications need monitoring just like traditional software - but with different dimensions (cost, quality, latency). You use data to drive prompt improvements and catch regressions.
Instrument LLM calls with Langfuse
When to use : Any LLM application
import openai
from langfuse import Langfuse

# Initialize client
langfuse = Langfuse(
    public_key="pk-...",
    secret_key="sk-...",
    host="https://cloud.langfuse.com"  # or self-hosted URL
)

# Create a trace for a user request
trace = langfuse.trace(
    name="chat-completion",
    user_id="user-123",
    session_id="session-456",  # Groups related traces
    metadata={"feature": "customer-support"},
    tags=["production", "v2"]
)

# Log a generation (LLM call)
generation = trace.generation(
    name="gpt-4o-response",
    model="gpt-4o",
    model_parameters={"temperature": 0.7},
    input={"messages": [{"role": "user", "content": "Hello"}]},
    metadata={"attempt": 1}
)

# Make the actual LLM call
response = openai.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello"}]
)

# Complete the generation with output and token usage
generation.end(
    output=response.choices[0].message.content,
    usage={
        "input": response.usage.prompt_tokens,
        "output": response.usage.completion_tokens
    }
)

# Score the trace
trace.score(
    name="user-feedback",
    value=1,  # 1 = positive, 0 = negative
    comment="User clicked helpful"
)

# Flush before exit (important in serverless)
langfuse.flush()
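One pitfall in the pattern above: if the LLM call raises, generation.end() never runs and the generation stays open. A small context manager guarantees the end call; this is a generic sketch (the ended helper is not part of the Langfuse SDK, and it works with any object exposing an end() method):

```python
from contextlib import contextmanager

@contextmanager
def ended(observation):
    """Ensure observation.end() runs even if the body raises.
    Generic sketch: 'observation' is any object with an end() method,
    e.g. the generation returned by trace.generation() above."""
    try:
        yield observation
    finally:
        observation.end()
```

Usage: wrap the LLM call in `with ended(trace.generation(...)) as generation:` so the observation is closed with an end time even on failure paths.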
Automatic tracing with OpenAI SDK
When to use : OpenAI-based applications
from langfuse.openai import openai

# Drop-in replacement for the OpenAI client:
# all calls are traced automatically
response = openai.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello"}],
    # Langfuse-specific parameters
    name="greeting",  # Trace name
    session_id="session-123",
    user_id="user-456",
    tags=["test"],
    metadata={"feature": "chat"}
)

# Works with streaming
stream = openai.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Tell me a story"}],
    stream=True,
    name="story-generation"
)
for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="")  # delta may be None on the final chunk

# Works with async
import asyncio
from langfuse.openai import AsyncOpenAI

async_client = AsyncOpenAI()

async def main():
    response = await async_client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Hello"}],
        name="async-greeting"
    )

asyncio.run(main())
Trace LangChain applications
When to use : LangChain-based applications
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langfuse.callback import CallbackHandler

# Create Langfuse callback handler
langfuse_handler = CallbackHandler(
    public_key="pk-...",
    secret_key="sk-...",
    host="https://cloud.langfuse.com",
    session_id="session-123",
    user_id="user-456"
)

# Use with any LangChain component
llm = ChatOpenAI(model="gpt-4o")
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant."),
    ("user", "{input}")
])
chain = prompt | llm

# Pass the handler per invocation
response = chain.invoke(
    {"input": "Hello"},
    config={"callbacks": [langfuse_handler]}
)

# Or register it globally (older LangChain versions; passing
# callbacks via config, as above, is the portable approach)
import langchain
langchain.callbacks.manager.set_handler(langfuse_handler)

# Then all calls are traced
response = chain.invoke({"input": "Hello"})

# Works with agents, retrievers, etc.
from langchain.agents import AgentExecutor, create_openai_tools_agent

agent = create_openai_tools_agent(llm, tools, prompt)  # tools: your list of Tool objects
agent_executor = AgentExecutor(agent=agent, tools=tools)
result = agent_executor.invoke(
    {"input": "What's the weather?"},
    config={"callbacks": [langfuse_handler]}
)
Mistake : Relying on background batching in short-lived processes.
Why bad : Traces are sent in batches, so a serverless function may exit before the batch is flushed, and the data is lost.
Instead : Always call langfuse.flush() before exit. Use context managers where available. Consider sync mode for critical traces.
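The failure mode is easy to reproduce with a toy stand-in for a batching client (pure Python, not the real SDK; the class and its behavior are illustrative assumptions):

```python
class ToyBatchingClient:
    """Minimal stand-in for a client that buffers events and only
    transmits them on flush(). Not the Langfuse SDK."""
    def __init__(self):
        self._buffer = []
        self.delivered = []  # what the backend actually received

    def trace(self, name):
        self._buffer.append(name)  # queued, NOT yet delivered

    def flush(self):
        self.delivered.extend(self._buffer)
        self._buffer.clear()

client = ToyBatchingClient()
client.trace("chat-completion")
# If the process exits here (e.g. a frozen Lambda), nothing was delivered:
assert client.delivered == []
client.flush()
assert client.delivered == ["chat-completion"]
```

The real SDK also sends on a background timer, but a process that exits before the next send window hits exactly this gap, which is why the explicit flush matters.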
Mistake : Instrumenting every function call.
Why bad : Noisy traces, performance overhead, and the important information becomes hard to find.
Instead : Focus on LLM calls, key business logic, and user actions. Group related operations. Use meaningful span names.
Mistake : Omitting user and session identifiers.
Why bad : You can't debug a specific user's issue, can't follow a session across requests, and analytics are limited.
Instead : Always pass user_id and session_id. Use consistent identifiers across services. Add relevant metadata.
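One way to keep identifiers consistent across services, without logging raw PII, is to derive a stable pseudonymous user_id from an internal key. A minimal sketch (the helper name and "user-" prefix are assumptions, not a Langfuse convention):

```python
import hashlib

def stable_user_id(raw_identifier: str) -> str:
    """Derive a deterministic, pseudonymous ID from any internal
    identifier (email, account key) so every service reports the
    same user_id for the same user."""
    digest = hashlib.sha256(raw_identifier.encode("utf-8")).hexdigest()
    return "user-" + digest[:16]

# Same input always yields the same id, across processes and services
assert stable_user_id("alice@example.com") == stable_user_id("alice@example.com")
```

Because the mapping is deterministic, traces emitted by different services for the same user land under one user_id in Langfuse, while the raw identifier never leaves your infrastructure.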
Works well with: langgraph, crewai, structured-output, autonomous-agents
Weekly installs: 1 · GitHub stars: 1 · First seen: today
Security audits: Gen Agent Trust Hub: Pass · Socket: Pass · Snyk: Fail
Installed on: zencoder (1), amp (1), cline (1), openclaw (1), opencode (1), cursor (1)