OpenAI Responses API 指南：有状态对话、内置工具与 Chat Completions 对比

openai-responses by jezweb/claude-skills

346 周安装量

670 GitHub Stars

GitHub

安装命令

npx skills add https://github.com/jezweb/claude-skills --skill openai-responses

AI/机器学习 Node.js API

🇨🇳中文介绍

OpenAI Responses API

状态 : 生产就绪 最后更新 : 2026-01-21 API 发布时间 : 2025年3月 依赖项 : openai@6.16.0 (Node.js) 或 fetch API (Cloudflare Workers)

什么是 Responses API？

OpenAI 为智能体应用提供的统一接口，于 2025年3月 发布。提供具有 跨轮次保留推理状态 的 有状态对话。

关键创新： 与 Chat Completions（推理在轮次间被丢弃）不同，Responses 保留了模型的推理笔记，在 TAUBench 上性能提升 5%，并实现了更好的多轮交互。

与 Chat Completions 对比：

特性	Chat Completions	Responses API
状态	手动历史记录跟踪	自动（对话 ID）
推理	轮次间丢弃	跨轮次保留（TAUBench +5%）
工具	客户端往返	服务器端托管
输出	单一消息

广告位招租

在这里展示您的产品或服务

触达数万 AI 开发者，精准高效

联系我们

何时使用 Responses 与 Chat Completions

使用 Responses：

智能体应用（推理 + 行动）
多轮对话（保留推理 = TAUBench +5%）
内置工具（代码解释器、文件搜索、网络搜索、MCP）
后台处理（60秒标准，10分钟延长超时）

使用 Chat Completions：

简单的一次性生成
完全无状态的交互
遗留集成

使用对话 ID 进行自动状态管理：

// 创建对话
const conv = await openai.conversations.create({
  metadata: { user_id: 'user_123' },
});

// 第一轮
const response1 = await openai.responses.create({
  model: 'gpt-5',
  conversation: conv.id,
  input: 'What are the 5 Ds of dodgeball?',
});

// 第二轮 - 模型记住上下文 + 推理
const response2 = await openai.responses.create({
  model: 'gpt-5',
  conversation: conv.id,
  input: 'Tell me more about the first one',
});

优势： 无需手动跟踪历史记录，推理被保留，缓存利用率提升 40-80%

对话限制： 90 天过期

内置工具（服务器端）

服务器端托管工具消除了后端往返：

工具	用途	备注
`code_interpreter`	执行 Python 代码	沙盒化，30秒超时（更长时间请使用 `background: true`）
`file_search`	无需向量存储的 RAG	每个文件最大 512MB，支持 PDF/Word/Markdown/HTML/代码
`web_search`	实时网络信息	自动来源引用
`image_generation`	DALL-E 集成	默认 DALL-E 3
`mcp`	连接外部工具	支持 OAuth，令牌不存储

const response = await openai.responses.create({
  model: 'gpt-5',
  input: 'Calculate mean of: 10, 20, 30, 40, 50',
  tools: [{ type: 'code_interpreter' }],
});

Web Search TypeScript 说明

TypeScript 限制：web_search 工具的 external_web_access 选项在 SDK 类型中缺失（截至 v6.16.0）。

const response = await openai.responses.create({
  model: 'gpt-5',
  input: 'Search for recent news',
  tools: [{
    type: 'web_search',
    external_web_access: true,
  } as any],  // ✅ 类型断言以抑制错误
});

内置支持模型上下文协议 (MCP) 服务器以连接外部工具（Stripe、数据库、自定义 API）。

默认情况下，在将任何数据共享给远程 MCP 服务器之前，需要明确的用户批准（安全特性）。

const response = await openai.responses.create({
  model: 'gpt-5',
  input: 'Get my Stripe balance',
  tools: [{
    type: 'mcp',
    server_label: 'stripe',
    server_url: 'https://mcp.stripe.com',
    authorization: process.env.STRIPE_TOKEN,
  }],
});

if (response.status === 'requires_approval') {
  // 向用户显示："此操作需要与 Stripe 共享数据。批准吗？"
  // 用户批准后，使用批准令牌重试
}

替代方案：在 OpenAI 仪表板中预先批准 MCP 服务器（用户通过设置配置受信任的服务器）

const response = await openai.responses.create({
  model: 'gpt-5',
  input: 'Roll 2d6 dice',
  tools: [{
    type: 'mcp',
    server_label: 'dice',
    server_url: 'https://example.com/mcp',
    authorization: process.env.TOKEN, // ⚠️ 不存储，每次请求都需要
  }],
});

MCP 输出类型：

mcp_list_tools - 在服务器上发现的工具
mcp_call - 工具调用 + 结果
message - 最终响应

关键创新： 模型的内部推理状态在轮次间得以保留（与 Chat Completions 丢弃它不同）。

Chat Completions：模型在响应前撕掉草稿纸页面
Responses API：草稿纸保持打开状态供下一轮使用

性能： 在 TAUBench 上提升 5%（GPT-5），纯粹来自保留的推理

推理摘要（免费）：

response.output.forEach(item => {
  if (item.type === 'reasoning') console.log(item.summary[0].text);
  if (item.type === 'message') console.log(item.content[0].text);
});

重要：推理痕迹隐私

您获得的内容：推理摘要（非完整内部痕迹） OpenAI 保留的内容：完整的思维链推理（专有，出于安全/隐私考虑）

对于 GPT-5-Thinking 模型：

OpenAI 在其后端内部保留推理
这种保留的推理提高了多轮性能（TAUBench +5%）
但开发者只收到摘要，而非实际的思维链
完整的推理痕迹不会暴露（OpenAI 的知识产权保护）

对于长时间运行的任务，使用 background: true：

const response = await openai.responses.create({
  model: 'gpt-5',
  input: 'Analyze 500-page document',
  background: true,
  tools: [{ type: 'file_search', file_ids: [fileId] }],
});

// 轮询完成状态（每 5 秒检查一次）
const result = await openai.responses.retrieve(response.id);
if (result.status === 'completed') console.log(result.output_text);

标准：60 秒
后台：10 分钟

首令牌时间 (TTFT) 延迟： 后台模式目前比同步响应具有更高的 TTFT。OpenAI 正在努力缩小这一差距。

对于面向用户的实时响应，使用同步模式（延迟更低）
对于长时间运行的异步任务，使用后台模式（延迟可接受）

数据保留和隐私

默认保留：当 store: true（默认）时为 30 天 零数据保留 (ZDR)：具有 ZDR 的组织自动强制执行 store: false 后台模式：不兼容 ZDR（存储数据约 10 分钟用于轮询）

2025年9月26日：OpenAI 法院命令的保留期结束
当前：使用 store: true 时默认保留 30 天

// 禁用存储（无保留）
const response = await openai.responses.create({
  model: 'gpt-5',
  input: 'Hello!',
  store: false,  // ✅ 无保留
});

// ZDR 组织：store 始终被视为 false
const response = await openai.responses.create({
  model: 'gpt-5',
  input: 'Hello!',
  store: true,  // ⚠️ 对于 ZDR 组织，OpenAI 会忽略，视为 false
});

ZDR 合规性：

避免后台模式（需要临时存储）
为清晰起见，明确设置 store: false
注意：同步模式下适用 60 秒超时

返回 8 种输出类型，而非单一消息：

类型	示例
`message`	最终答案、解释
`reasoning`	逐步思考过程（免费！）
`code_interpreter_call`	Python 代码 + 结果
`mcp_call`	工具名称、参数、输出
`mcp_list_tools`	来自 MCP 服务器的工具定义
`file_search_call`	匹配的块、引用
`web_search_call`	URL、片段
`image_generation_call`	图片 URL

response.output.forEach(item => {
  if (item.type === 'reasoning') console.log(item.summary[0].text);
  if (item.type === 'web_search_call') console.log(item.results);
  if (item.type === 'message') console.log(item.content[0].text);
});

// 或使用仅文本的辅助函数
console.log(response.output_text);

从 Chat Completions 迁移

特性	Chat Completions	Responses API
端点	`/v1/chat/completions`	`/v1/responses`
参数	`messages`	`input`
角色	`system`	`developer`
输出	`choices[0].message.content`	`output_text`
状态	手动数组	自动（对话 ID）
流式传输	`data: {"choices":[...]}`	包含 8 种项目类型的 SSE

// 之前
const response = await openai.chat.completions.create({
  model: 'gpt-5',
  messages: [
    { role: 'system', content: 'You are a helpful assistant.' },
    { role: 'user', content: 'Hello!' },
  ],
});
console.log(response.choices[0].message.content);

// 之后
const response = await openai.responses.create({
  model: 'gpt-5',
  input: [
    { role: 'developer', content: 'You are a helpful assistant.' },
    { role: 'user', content: 'Hello!' },
  ],
});
console.log(response.output_text);

从 Assistants API 迁移

关键：Assistants API 停用时间线

2025年8月26日：Assistants API 正式弃用
2025-2026年：OpenAI 提供迁移工具
2026年8月26日：Assistants API 停用（停止工作）

在 2026年8月26日之前迁移以避免重大变更。

关键重大变更：

Assistants API	Responses API
助手（通过 API 创建）	提示（在仪表板中创建）
线程	对话（存储项目，不仅仅是消息）
运行（服务器端生命周期）	响应（无状态调用）
运行步骤	项目（多态输出）

// 之前（Assistants API - 已弃用）
const assistant = await openai.beta.assistants.create({
  model: 'gpt-4',
  instructions: 'You are helpful.',
});

const thread = await openai.beta.threads.create();

const run = await openai.beta.threads.runs.create(thread.id, {
  assistant_id: assistant.id,
});

// 之后（Responses API - 当前）
const conversation = await openai.conversations.create({
  metadata: { purpose: 'customer_support' },
});

const response = await openai.responses.create({
  model: 'gpt-5',
  conversation: conversation.id,
  input: [
    { role: 'developer', content: 'You are helpful.' },
    { role: 'user', content: 'Hello!' },
  ],
});

此技能预防 11 个已记录的错误：

1. 会话状态未持久化

原因：未使用对话 ID 或每轮使用不同的 ID
修复：创建一次对话（const conv = await openai.conversations.create()），在所有轮次中重用 conv.id

2. MCP 服务器连接失败 (mcp_connection_error)

原因：无效 URL、缺失/过期的认证令牌、服务器宕机
修复：验证 URL 是否正确，使用 fetch() 手动测试，检查令牌过期

3. 代码解释器超时 (code_interpreter_timeout)

原因：代码运行超过 30 秒
修复：使用 background: true 以延长超时（最长 10 分钟）

4. 图像生成速率限制 (rate_limit_error)

原因：DALL-E 请求过多
修复：实现指数退避重试（1秒、2秒、3秒延迟）

5. 文件搜索相关性问题

原因：模糊查询返回不相关的结果
修复：使用具体查询（"2024年第四季度定价"而非"查找定价"），按 chunk.score > 0.7 过滤

6. 成本跟踪混淆

原因：Responses 对输入 + 输出 + 工具 + 存储的对话计费（对比 Chat Completions：仅输入 + 输出）
修复：如果不需要，设置 store: false，监控 response.usage.tool_tokens

7. 对话未找到 (invalid_request_error)

原因：ID 拼写错误、对话被删除或已过期（90 天限制）
修复：在使用前通过 openai.conversations.list() 验证存在性

8. 工具输出解析失败

原因：访问了错误的输出结构
修复：使用 response.output_text 辅助函数或迭代 response.output.forEach(item => ...) 并检查 item.type

9. Zod v4 与结构化输出不兼容

错误：Invalid schema for response_format 'name': schema must be a JSON Schema of 'type: "object"', got 'type: "string"'.
来源：GitHub Issue #1597
发生原因：SDK 的 vendored zod-to-json-schema 库不支持 Zod v4（缺少 ZodFirstPartyTypeKind 导出）
预防：固定使用 Zod v3（"zod": "^3.23.8"）或使用自定义 zodTextFormat 并设置 z.toJSONSchema({ target: "draft-7" })

// 解决方法：固定使用 Zod v3（推荐） { "dependencies": { "openai": "^6.16.0", "zod": "^3.23.8" // 暂时不要升级到 v4 } }

10. 后台模式网络搜索缺少来源

错误：web_search_call 输出项目包含查询但没有来源/结果
来源：GitHub Issue #1676
发生原因：当使用 background: true + web_search 工具时，OpenAI 不会在响应中返回来源
预防：当需要网络搜索来源时，使用同步模式（background: false）

// ✅ 同步模式下来源可用 const response = await openai.responses.create({ model: 'gpt-5', input: 'Latest AI news?', background: false, // 获取来源所必需 tools: [{ type: 'web_search' }], });

11. 流式模式缺少 output_text 辅助函数

错误：finalResponse().output_text 在流式模式下是 undefined
来源：GitHub Issue #1662
发生原因：stream.finalResponse() 不包含 output_text 便利字段（仅在非流式响应中可用）
预防：监听 output_text.done 事件或手动从 output 项目中提取

// 解决方法：监听事件 const stream = openai.responses.stream({ model: 'gpt-5', input: 'Hello!' }); let outputText = ''; for await (const event of stream) { if (event.type === 'output_text.done') { outputText = event.output_text; // ✅ 在事件中可用 } }

为多轮对话使用对话 ID（缓存提升 40-80%）
在多态响应中处理所有 8 种输出类型
对超过 30 秒的任务使用 background: true
提供 MCP authorization 令牌（不存储，每次请求都需要）
监控 response.usage.total_tokens 以控制成本

在客户端代码中暴露 API 密钥
假设是单一消息输出（使用 response.output_text 辅助函数）
跨用户重用对话 ID（安全风险）
忽略错误类型（特别处理 rate_limit_error、mcp_connection_error）
对后台任务的轮询快于 1 秒（使用 5 秒间隔）

技能资源： templates/, references/responses-vs-chat-completions.md, references/mcp-integration-guide.md, references/built-in-tools-guide.md, references/migration-guide.md, references/top-errors.md

最后验证 : 2026-01-21 | 技能版本 : 2.1.0 | 变更 : 添加了 3 个 TIER 1 问题（Zod v4、后台网络搜索、流式 output_text），2 个 TIER 2 发现（MCP 批准、推理隐私），数据保留与 ZDR 部分，Assistants API 停用时间线，后台模式 TTFT 说明，网络搜索 TypeScript 限制。更新 SDK 版本至 6.16.0。

🇺🇸English

OpenAI Responses API

Status : Production Ready Last Updated : 2026-01-21 API Launch : March 2025 Dependencies : openai@6.16.0 (Node.js) or fetch API (Cloudflare Workers)

What Is the Responses API?

OpenAI's unified interface for agentic applications, launched March 2025. Provides stateful conversations with preserved reasoning state across turns.

Key Innovation: Unlike Chat Completions (reasoning discarded between turns), Responses preserves the model's reasoning notebook , improving performance by 5% on TAUBench and enabling better multi-turn interactions.

vs Chat Completions:

Feature	Chat Completions	Responses API
State	Manual history tracking	Automatic (conversation IDs)
Reasoning	Dropped between turns	Preserved across turns (+5% TAUBench)
Tools	Client-side round trips	Server-side hosted
Output	Single message	Polymorphic (8 types)
Cache	Baseline	40-80% better utilization
MCP	Manual	Built-in

Quick Start

npm install openai@6.16.0



import OpenAI from 'openai';

const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

const response = await openai.responses.create({
  model: 'gpt-5',
  input: 'What are the 5 Ds of dodgeball?',
});

console.log(response.output_text);

Key differences from Chat Completions:

Endpoint: /v1/responses (not /v1/chat/completions)
Parameter: input (not messages)
Role: developer (not system)
Output: response.output_text (not choices[0].message.content)

When to Use Responses vs Chat Completions

Use Responses:

Agentic applications (reasoning + actions)
Multi-turn conversations (preserved reasoning = +5% TAUBench)
Built-in tools (Code Interpreter, File Search, Web Search, MCP)
Background processing (60s standard, 10min extended timeout)

Use Chat Completions:

Simple one-off generation
Fully stateless interactions
Legacy integrations

Stateful Conversations

Automatic State Management using conversation IDs:

// Create conversation
const conv = await openai.conversations.create({
  metadata: { user_id: 'user_123' },
});

// First turn
const response1 = await openai.responses.create({
  model: 'gpt-5',
  conversation: conv.id,
  input: 'What are the 5 Ds of dodgeball?',
});

// Second turn - model remembers context + reasoning
const response2 = await openai.responses.create({
  model: 'gpt-5',
  conversation: conv.id,
  input: 'Tell me more about the first one',
});

Benefits: No manual history tracking, reasoning preserved, 40-80% better cache utilization

Conversation Limits: 90-day expiration

Built-in Tools (Server-Side)

Server-side hosted tools eliminate backend round trips:

Tool	Purpose	Notes
`code_interpreter`	Execute Python code	Sandboxed, 30s timeout (use `background: true` for longer)
`file_search`	RAG without vector stores	Max 512MB per file, supports PDF/Word/Markdown/HTML/code
`web_search`	Real-time web information	Automatic source citations
`image_generation`	DALL-E integration	DALL-E 3 default

Usage:

const response = await openai.responses.create({
  model: 'gpt-5',
  input: 'Calculate mean of: 10, 20, 30, 40, 50',
  tools: [{ type: 'code_interpreter' }],
});

Web Search TypeScript Note

TypeScript Limitation : The web_search tool's external_web_access option is missing from SDK types (as of v6.16.0).

Workaround :

const response = await openai.responses.create({
  model: 'gpt-5',
  input: 'Search for recent news',
  tools: [{
    type: 'web_search',
    external_web_access: true,
  } as any],  // ✅ Type assertion to suppress error
});

Source : GitHub Issue #1716

MCP Server Integration

Built-in support for Model Context Protocol (MCP) servers to connect external tools (Stripe, databases, custom APIs).

User Approval Requirement

By default, explicit user approval is required before any data is shared with a remote MCP server (security feature).

Handling Approval :

const response = await openai.responses.create({
  model: 'gpt-5',
  input: 'Get my Stripe balance',
  tools: [{
    type: 'mcp',
    server_label: 'stripe',
    server_url: 'https://mcp.stripe.com',
    authorization: process.env.STRIPE_TOKEN,
  }],
});

if (response.status === 'requires_approval') {
  // Show user: "This action requires sharing data with Stripe. Approve?"
  // After user approves, retry with approval token
}

Alternative : Pre-approve MCP servers in OpenAI dashboard (users configure trusted servers via settings)

Source : Official MCP Guide

Basic MCP Usage

const response = await openai.responses.create({
  model: 'gpt-5',
  input: 'Roll 2d6 dice',
  tools: [{
    type: 'mcp',
    server_label: 'dice',
    server_url: 'https://example.com/mcp',
    authorization: process.env.TOKEN, // ⚠️ NOT stored, required each request
  }],
});

MCP Output Types:

mcp_list_tools - Tools discovered on server
mcp_call - Tool invocation + result
message - Final response

Reasoning Preservation

Key Innovation: Model's internal reasoning state survives across turns (unlike Chat Completions which discards it).

Visual Analogy:

Chat Completions: Model tears out scratchpad page before responding
Responses API: Scratchpad stays open for next turn

Performance: +5% on TAUBench (GPT-5) purely from preserved reasoning

Reasoning Summaries (free):

response.output.forEach(item => {
  if (item.type === 'reasoning') console.log(item.summary[0].text);
  if (item.type === 'message') console.log(item.content[0].text);
});

Important: Reasoning Traces Privacy

What You Get : Reasoning summaries (not full internal traces) What OpenAI Keeps : Full chain-of-thought reasoning (proprietary, for security/privacy)

For GPT-5-Thinking models:

OpenAI preserves reasoning internally in their backend
This preserved reasoning improves multi-turn performance (+5% TAUBench)
But developers only receive summaries , not the actual chain-of-thought
Full reasoning traces are not exposed (OpenAI's IP protection)

Source : Sean Goedecke Analysis

Background Mode

For long-running tasks, use background: true:

const response = await openai.responses.create({
  model: 'gpt-5',
  input: 'Analyze 500-page document',
  background: true,
  tools: [{ type: 'file_search', file_ids: [fileId] }],
});

// Poll for completion (check every 5s)
const result = await openai.responses.retrieve(response.id);
if (result.status === 'completed') console.log(result.output_text);

Timeout Limits:

Standard: 60 seconds
Background: 10 minutes

Performance Considerations

Time-to-First-Token (TTFT) Latency: Background mode currently has higher TTFT compared to synchronous responses. OpenAI is working to reduce this gap.

Recommendation:

For user-facing real-time responses, use sync mode (lower latency)
For long-running async tasks, use background mode (latency acceptable)

Source : OpenAI Background Mode Docs

Data Retention and Privacy

Default Retention : 30 days when store: true (default) Zero Data Retention (ZDR) : Organizations with ZDR automatically enforce store: false Background Mode : NOT ZDR compatible (stores data ~10 minutes for polling)

Timeline :

September 26, 2025: OpenAI court-ordered retention ended
Current: 30-day default retention with store: true

Control Storage :

// Disable storage (no retention)
const response = await openai.responses.create({
  model: 'gpt-5',
  input: 'Hello!',
  store: false,  // ✅ No retention
});

// ZDR organizations: store always treated as false
const response = await openai.responses.create({
  model: 'gpt-5',
  input: 'Hello!',
  store: true,  // ⚠️ Ignored by OpenAI for ZDR orgs, treated as false
});

ZDR Compliance :

Avoid background mode (requires temporary storage)
Explicitly set store: false for clarity
Note: 60s timeout applies in sync mode

Source : OpenAI Data Controls

Polymorphic Outputs

Returns 8 output types instead of single message:

Type	Example
`message`	Final answer, explanation
`reasoning`	Step-by-step thought process (free!)
`code_interpreter_call`	Python code + results
`mcp_call`	Tool name, args, output
`mcp_list_tools`	Tool definitions from MCP server
`file_search_call`	Matched chunks, citations

Processing:

response.output.forEach(item => {
  if (item.type === 'reasoning') console.log(item.summary[0].text);
  if (item.type === 'web_search_call') console.log(item.results);
  if (item.type === 'message') console.log(item.content[0].text);
});

// Or use helper for text-only
console.log(response.output_text);

Migration from Chat Completions

Breaking Changes:

Feature	Chat Completions	Responses API
Endpoint	`/v1/chat/completions`	`/v1/responses`
Parameter	`messages`	`input`
Role	`system`	`developer`
Output	`choices[0].message.content`

Example:

// Before
const response = await openai.chat.completions.create({
  model: 'gpt-5',
  messages: [
    { role: 'system', content: 'You are a helpful assistant.' },
    { role: 'user', content: 'Hello!' },
  ],
});
console.log(response.choices[0].message.content);

// After
const response = await openai.responses.create({
  model: 'gpt-5',
  input: [
    { role: 'developer', content: 'You are a helpful assistant.' },
    { role: 'user', content: 'Hello!' },
  ],
});
console.log(response.output_text);

Migration from Assistants API

CRITICAL: Assistants API Sunset Timeline

August 26, 2025 : Assistants API officially deprecated
2025-2026 : OpenAI providing migration utilities
August 26, 2026 : Assistants API sunset (stops working)

Migrate before August 26, 2026 to avoid breaking changes.

Source : Assistants API Sunset Announcement

Key Breaking Changes:

Assistants API	Responses API
Assistants (created via API)	Prompts (created in dashboard)
Threads	Conversations (store items, not just messages)
Runs (server-side lifecycle)	Responses (stateless calls)
Run-Steps	Items (polymorphic outputs)

Migration Example:

// Before (Assistants API - deprecated)
const assistant = await openai.beta.assistants.create({
  model: 'gpt-4',
  instructions: 'You are helpful.',
});

const thread = await openai.beta.threads.create();

const run = await openai.beta.threads.runs.create(thread.id, {
  assistant_id: assistant.id,
});

// After (Responses API - current)
const conversation = await openai.conversations.create({
  metadata: { purpose: 'customer_support' },
});

const response = await openai.responses.create({
  model: 'gpt-5',
  conversation: conversation.id,
  input: [
    { role: 'developer', content: 'You are helpful.' },
    { role: 'user', content: 'Hello!' },
  ],
});

Migration Guide : Official Assistants Migration Docs

Known Issues Prevention

This skill prevents 11 documented errors:

1. Session State Not Persisting

Cause: Not using conversation IDs or using different IDs per turn
Fix: Create conversation once (const conv = await openai.conversations.create()), reuse conv.id for all turns

2. MCP Server Connection Failed (mcp_connection_error)

Causes: Invalid URL, missing/expired auth token, server down
Fix: Verify URL is correct, test manually with fetch(), check token expiration

3. Code Interpreter Timeout (code_interpreter_timeout)

Cause: Code runs longer than 30 seconds
Fix: Use background: true for extended timeout (up to 10 min)

4. Image Generation Rate Limit (rate_limit_error)

Cause: Too many DALL-E requests
Fix: Implement exponential backoff retry (1s, 2s, 3s delays)

5. File Search Relevance Issues

Cause: Vague queries return irrelevant results
Fix: Use specific queries ("pricing in Q4 2024" not "find pricing"), filter by chunk.score > 0.7

6. Cost Tracking Confusion

Cause: Responses bills for input + output + tools + stored conversations (vs Chat Completions: input + output only)
Fix: Set store: false if not needed, monitor response.usage.tool_tokens

7. Conversation Not Found (invalid_request_error)

Causes: ID typo, conversation deleted, or expired (90-day limit)
Fix: Verify exists with openai.conversations.list() before using

8. Tool Output Parsing Failed

Cause: Accessing wrong output structure
Fix: Use response.output_text helper or iterate response.output.forEach(item => ...) checking item.type

9. Zod v4 Incompatibility with Structured Outputs

Error : Invalid schema for response_format 'name': schema must be a JSON Schema of 'type: "object"', got 'type: "string"'.
Source : GitHub Issue #1597
Why It Happens : SDK's vendored zod-to-json-schema library doesn't support Zod v4 (missing ZodFirstPartyTypeKind export)
Prevention : Pin to Zod v3 ("zod": "^3.23.8") or use custom zodTextFormat with z.toJSONSchema({ target: "draft-7" })

// Workaround: Pin to Zod v3 (recommended) { "dependencies": { "openai": "^6.16.0", "zod": "^3.23.8" // DO NOT upgrade to v4 yet } }

10. Background Mode Web Search Missing Sources

Error : web_search_call output items contain query but no sources/results
Source : GitHub Issue #1676
Why It Happens : When using background: true + web_search tool, OpenAI doesn't return sources in the response
Prevention : Use synchronous mode (background: false) when web search sources are needed

// ✅ Sources available in sync mode const response = await openai.responses.create({ model: 'gpt-5', input: 'Latest AI news?', background: false, // Required for sources tools: [{ type: 'web_search' }], });

11. Streaming Mode Missing output_text Helper

Error : finalResponse().output_text is undefined in streaming mode
Source : GitHub Issue #1662
Why It Happens : stream.finalResponse() doesn't include output_text convenience field (only available in non-streaming responses)
Prevention : Listen for output_text.done event or manually extract from output items

// Workaround: Listen for event const stream = openai.responses.stream({ model: 'gpt-5', input: 'Hello!' }); let outputText = ''; for await (const event of stream) { if (event.type === 'output_text.done') { outputText = event.output_text; // ✅ Available in event } }

Critical Patterns

✅ Always:

Use conversation IDs for multi-turn (40-80% better cache)
Handle all 8 output types in polymorphic responses
Use background: true for tasks >30s
Provide MCP authorization tokens (NOT stored, required each request)
Monitor response.usage.total_tokens for cost control

❌ Never:

Expose API keys in client-side code
Assume single message output (use response.output_text helper)
Reuse conversation IDs across users (security risk)
Ignore error types (handle rate_limit_error, mcp_connection_error specifically)
Poll faster than 1s for background tasks (use 5s intervals)

References

Official Docs:

Responses API Guide: https://platform.openai.com/docs/guides/responses
API Reference: https://platform.openai.com/docs/api-reference/responses
MCP Integration: https://platform.openai.com/docs/guides/tools-connectors-mcp
Blog Post: https://developers.openai.com/blog/responses-api/
Starter App: https://github.com/openai/openai-responses-starter-app

Skill Resources: templates/, references/responses-vs-chat-completions.md, references/mcp-integration-guide.md, references/built-in-tools-guide.md, references/migration-guide.md, references/top-errors.md

Last verified : 2026-01-21 | Skill version : 2.1.0 | Changes : Added 3 TIER 1 issues (Zod v4, background web search, streaming output_text), 2 TIER 2 findings (MCP approval, reasoning privacy), Data Retention & ZDR section, Assistants API sunset timeline, background mode TTFT note, web search TypeScript limitation. Updated SDK version to 6.16.0.

Weekly Installs

346

Repository

jezweb/claude-skills

GitHub Stars

650

First Seen

Jan 20, 2026

Security Audits

Gen Agent Trust HubFail SocketPass SnykWarn

Installed on

claude-code291

gemini-cli238

opencode234

codex212

cursor208

antigravity205

AI 代码实施计划编写技能 | 自动化开发任务分解与 TDD 流程规划工具

43,400 周安装

OpenAI Responses API 指南：有状态对话、内置工具与 Chat Completions 对比

🇨🇳中文介绍

OpenAI Responses API

什么是 Responses API？

相关 Skills

快速开始

何时使用 Responses 与 Chat Completions

有状态对话

内置工具（服务器端）

Web Search TypeScript 说明

MCP 服务器集成

用户批准要求

基本 MCP 用法

推理保留

重要：推理痕迹隐私

后台模式

性能考虑

数据保留和隐私

多态输出

从 Chat Completions 迁移

从 Assistants API 迁移

已知问题预防

关键模式

参考资料

🇺🇸English

OpenAI Responses API

What Is the Responses API?

Quick Start

When to Use Responses vs Chat Completions

Stateful Conversations

Built-in Tools (Server-Side)

Web Search TypeScript Note

MCP Server Integration

User Approval Requirement

Basic MCP Usage

Reasoning Preservation

Important: Reasoning Traces Privacy

Background Mode

Performance Considerations

Data Retention and Privacy

Polymorphic Outputs

Migration from Chat Completions

Migration from Assistants API

Known Issues Prevention

Critical Patterns

References

最新 Skills