langsmith-dataset by langchain-ai/langsmith-skills
npx skills add https://github.com/langchain-ai/langsmith-skills --skill langsmith-dataset

LANGSMITH_API_KEY=lsv2_pt_your_api_key_here  # Required
LANGSMITH_PROJECT=your-project-name          # Check this to know which project has traces
LANGSMITH_WORKSPACE_ID=your-workspace-id     # Optional: for org-scoped keys
IMPORTANT: Always check the environment variables or .env file for LANGSMITH_PROJECT before querying or interacting with LangSmith. This tells you which project contains the relevant traces and data. If the LangSmith project is not available, use your best judgement to identify the right one.
Python Dependencies

pip install langsmith

JavaScript Dependencies

npm install langsmith

CLI Tool

curl -sSL https://raw.githubusercontent.com/langchain-ai/langsmith-cli/main/scripts/install.sh | sh
- `langsmith dataset list` - List datasets in LangSmith
- `langsmith dataset get <name-or-id>` - View dataset details
- `langsmith dataset create --name <name>` - Create a new empty dataset
- `langsmith dataset delete <name-or-id>` - Delete a dataset
- `langsmith dataset export <name-or-id> <output-file>` - Export a dataset to a local JSON file
- `langsmith dataset upload <file> --name <name>` - Upload a local JSON file as a dataset
- `langsmith example list --dataset <name>` - List examples in a dataset
- `langsmith example create --dataset <name> --inputs <json>` - Add an example to a dataset
- `langsmith example delete <example-id>` - Delete an example
- `langsmith experiment list --dataset <name>` - List experiments for a dataset
- `langsmith experiment get <name>` - View experiment results
- `--limit N` - Limit the number of results
- `--yes` - Skip confirmation prompts (use with caution)

IMPORTANT - Safety Prompts:
Never pass `--yes` unless the user explicitly requests it; ask for confirmation before destructive operations instead of using `--yes` to skip confirmation prompts.

<dataset_types_overview>
Common evaluation dataset types: final response (query and final answer), single step (inputs/outputs of one node), trajectory (expected sequence of tool calls), and RAG (answer plus retrieved and cited chunks). See <dataset_structures> below for the corresponding example shapes.
</dataset_types_overview>
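When driving the CLI from code, the safety rule above can be enforced mechanically. This wrapper is a hypothetical sketch (the `run_langsmith` helper and `DESTRUCTIVE` set are not part of the CLI), assuming the `langsmith` binary from the install script is on PATH:

```python
import subprocess

# Destructive subcommands that should always keep their confirmation prompt.
DESTRUCTIVE = {"delete"}

def run_langsmith(*args: str) -> str:
    """Run a langsmith CLI command, refusing --yes on destructive commands."""
    if DESTRUCTIVE & set(args) and "--yes" in args:
        raise ValueError("refusing to skip confirmation on a destructive command")
    result = subprocess.run(
        ["langsmith", *args], capture_output=True, text=True, check=True
    )
    return result.stdout
```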
<creating_datasets>
Datasets are JSON files containing an array of examples. Each example has `inputs` and `outputs`.

Export traces first, then process them into dataset format with code:

```bash
# 1. Export traces to JSONL files
langsmith trace export ./traces --project my-project --limit 20 --full
```

<python>
```python
import json
from pathlib import Path

from langsmith import Client

client = Client()

# 2. Process traces into dataset examples
examples = []
for jsonl_file in Path("./traces").glob("*.jsonl"):
    runs = [json.loads(line) for line in jsonl_file.read_text().strip().split("\n")]
    root = next((r for r in runs if r.get("parent_run_id") is None), None)
    if root and root.get("inputs") and root.get("outputs"):
        examples.append({
            "trace_id": root.get("trace_id"),
            "inputs": root["inputs"],
            "outputs": root["outputs"],
        })

# 3. Save locally
with open("/tmp/dataset.json", "w") as f:
    json.dump(examples, f, indent=2)
```
</python>
<typescript>
```typescript
import { Client } from "langsmith";
import { readFileSync, writeFileSync, readdirSync } from "fs";
import { join } from "path";

const client = new Client();

// 2. Process traces into dataset examples
const examples: Array<{ trace_id?: string; inputs: Record<string, any>; outputs: Record<string, any> }> = [];
const files = readdirSync("./traces").filter(f => f.endsWith(".jsonl"));
for (const file of files) {
  const lines = readFileSync(join("./traces", file), "utf-8").trim().split("\n");
  const runs = lines.map(line => JSON.parse(line));
  const root = runs.find(r => r.parent_run_id == null);
  if (root?.inputs && root?.outputs) {
    examples.push({ trace_id: root.trace_id, inputs: root.inputs, outputs: root.outputs });
  }
}

// 3. Save locally
writeFileSync("/tmp/dataset.json", JSON.stringify(examples, null, 2));
```
</typescript>

```bash
# Upload the local JSON file as a dataset
langsmith dataset upload /tmp/dataset.json --name "My Evaluation Dataset"
```
<python>
```python
from langsmith import Client

client = Client()

# Create a dataset and add examples directly
dataset = client.create_dataset("My Dataset", description="Evaluation dataset")
client.create_examples(
    inputs=[{"query": "What is AI?"}, {"query": "Explain RAG"}],
    outputs=[{"answer": "AI is..."}, {"answer": "RAG is..."}],
    dataset_name="My Dataset",
)
```
</python>
<typescript>
```typescript
import { Client } from "langsmith";

const client = new Client();

// Create dataset and add examples
const dataset = await client.createDataset("My Dataset", {
  description: "Evaluation dataset",
});
await client.createExamples({
  inputs: [{ query: "What is AI?" }, { query: "Explain RAG" }],
  outputs: [{ answer: "AI is..." }, { answer: "RAG is..." }],
  datasetName: "My Dataset",
});
```
</typescript>
</creating_datasets>
<dataset_structures>
Final response:
{"trace_id": "...", "inputs": {"query": "What are the top genres?"}, "outputs": {"response": "The top genres are..."}}

Single step:
{"trace_id": "...", "inputs": {"messages": [...]}, "outputs": {"content": "..."}, "metadata": {"node_name": "model"}}

Trajectory:
{"trace_id": "...", "inputs": {"query": "..."}, "outputs": {"expected_trajectory": ["tool_a", "tool_b", "tool_c"]}}

RAG:
{"trace_id": "...", "inputs": {"question": "How do I..."}, "outputs": {"answer": "...", "retrieved_chunks": ["..."], "cited_chunks": ["..."]}}
</dataset_structures>
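As a rough illustration, an example row can be mapped to one of these types by inspecting its keys. `classify_example` is a hypothetical helper, not part of the skill or SDK:

```python
def classify_example(example: dict) -> str:
    """Guess the dataset type of an example row from its output/metadata keys."""
    outputs = example.get("outputs", {})
    if "expected_trajectory" in outputs:
        return "trajectory"
    if "retrieved_chunks" in outputs or "cited_chunks" in outputs:
        return "rag"
    if example.get("metadata", {}).get("node_name"):
        return "single_step"
    # Plain query/response pairs fall through to final response.
    return "final_response"
```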
<script_usage>
```bash
# List all datasets
langsmith dataset list

# Get dataset details
langsmith dataset get "My Dataset"

# Create an empty dataset
langsmith dataset create --name "New Dataset" --description "For evaluation"

# Upload a local JSON file
langsmith dataset upload /tmp/dataset.json --name "My Dataset"

# Export a dataset to a local file
langsmith dataset export "My Dataset" /tmp/exported.json --limit 100

# Delete a dataset
langsmith dataset delete "My Dataset"

# List examples in a dataset
langsmith example list --dataset "My Dataset" --limit 10

# Add an example
langsmith example create --dataset "My Dataset" \
  --inputs '{"query": "test"}' \
  --outputs '{"answer": "result"}'

# List experiments
langsmith experiment list --dataset "My Dataset"
langsmith experiment get "eval-v1"
```
</script_usage>
<example_workflow> Complete workflow from traces to an uploaded LangSmith dataset:

```bash
# 1. Export traces from LangSmith
langsmith trace export ./traces --project my-project --limit 20 --full

# 2. Process traces into dataset format (using the Python/JS code above)
# See the "Creating Datasets" section

# 3. Upload to LangSmith
langsmith dataset upload /tmp/final_response.json --name "Skills: Final Response"
langsmith dataset upload /tmp/trajectory.json --name "Skills: Trajectory"

# 4. Verify the upload
langsmith dataset list
langsmith dataset get "Skills: Final Response"
langsmith example list --dataset "Skills: Final Response" --limit 3

# 5. Run experiments
langsmith experiment list --dataset "Skills: Final Response"
```
</example_workflow>
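Before step 3, a quick sanity check of the generated file can catch empty or malformed datasets before they reach LangSmith. `validate_dataset` is an assumed helper, not part of the CLI:

```python
import json

def validate_dataset(path: str) -> int:
    """Return the number of examples in a dataset file, raising if malformed."""
    with open(path) as f:
        examples = json.load(f)
    if not isinstance(examples, list):
        raise ValueError("dataset must be a JSON array of examples")
    for i, ex in enumerate(examples):
        # Every example needs non-empty inputs and outputs to be useful.
        if not ex.get("inputs"):
            raise ValueError(f"example {i} has no inputs")
        if not ex.get("outputs"):
            raise ValueError(f"example {i} has no outputs")
    return len(examples)
```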
Troubleshooting:

- Empty dataset after upload: make sure the file is a JSON array of objects with an `inputs` key, then verify with `langsmith example list --dataset "Name"`.
- Export has no data: export traces with the `--full` flag to include inputs/outputs, and check that the root runs have `inputs` and `outputs` populated.
- Example count mismatch: run `langsmith dataset get "Name"` to check the remote count.