openai-api by jezweb/claude-skills
npx skills add https://github.com/jezweb/claude-skills --skill openai-api版本 : 生产就绪 ✅ 包 : openai@6.16.0 最后更新 : 2026-01-20
✅ 生产就绪 :
npm install openai@6.16.0
广告位招租
在这里展示您的产品或服务
触达数万 AI 开发者,精准高效
export OPENAI_API_KEY="sk-..."
或创建 .env 文件:
OPENAI_API_KEY=sk-...
import OpenAI from 'openai';
const openai = new OpenAI({
apiKey: process.env.OPENAI_API_KEY,
});
const completion = await openai.chat.completions.create({
model: 'gpt-5',
messages: [
{ role: 'user', content: '机器人三定律是什么?' }
],
});
console.log(completion.choices[0].message.content);
const response = await fetch('https://api.openai.com/v1/chat/completions', {
method: 'POST',
headers: {
'Authorization': `Bearer ${env.OPENAI_API_KEY}`,
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'gpt-5',
messages: [
{ role: 'user', content: '机器人三定律是什么?' }
],
}),
});
const data = await response.json();
console.log(data.choices[0].message.content);
端点 : POST /v1/chat/completions
聊天补全 API 是与 OpenAI 语言模型交互的核心接口。它支持对话式 AI、文本生成、函数调用、结构化输出和视觉功能。
{
model: string, // 要使用的模型 (例如,"gpt-5")
messages: Message[], // 对话历史
reasoning_effort?: string, // 仅限 GPT-5:"minimal" | "low" | "medium" | "high"
verbosity?: string, // 仅限 GPT-5:"low" | "medium" | "high"
temperature?: number, // GPT-5 不支持
max_tokens?: number, // 要生成的最大令牌数
stream?: boolean, // 启用流式传输
tools?: Tool[], // 函数调用工具
}
{
id: string, // 唯一的补全 ID
object: "chat.completion",
created: number, // Unix 时间戳
model: string, // 使用的模型
choices: [{
index: number,
message: {
role: "assistant",
content: string, // 生成的文本
tool_calls?: ToolCall[] // 如果进行函数调用
},
finish_reason: string // "stop" | "length" | "tool_calls"
}],
usage: {
prompt_tokens: number,
completion_tokens: number,
total_tokens: number
}
}
三种角色:system (行为设定)、user (输入)、assistant (模型响应)。
重要 : API 是无状态的 - 每次请求都需要发送完整的对话历史。对于有状态对话,请使用 openai-responses 技能。
GPT-5 模型 (2025年8月发布) 引入了推理和详细程度控制。
最新的旗舰模型 :
gpt-5.2 : 400k 上下文窗口,128k 输出令牌
xhigh reasoning_effort : 超越 "high" 的新级别,用于复杂问题
压缩 : 为长工作流扩展上下文 (通过 API 端点)
定价 : 每百万令牌 $1.75/$14 (GPT-5.1 的 1.4 倍)
// 具有最大推理能力的 GPT-5.2 const completion = await openai.chat.completions.create({ model: 'gpt-5.2', messages: [{ role: 'user', content: '解决这个极其复杂的问题...' }], reasoning_effort: 'xhigh', // 新功能:超越 "high" });
更温暖、更智能的模型 :
重大变更 : GPT-5.1/5.2 默认使用 reasoning_effort: 'none' (而 GPT-5 默认使用 'medium')。
专用的推理模型 (与 GPT-5 分开):
| 模型 | 发布日期 | 用途 |
|---|---|---|
| o3 | 2025年4月16日 | o1 的继任者,高级推理 |
| o3-pro | 2025年6月10日 | o3 的扩展计算版本 |
| o3-mini | 2025年1月31日 | 更小、更快的 o3 变体 |
| o4-mini | 2025年4月16日 | 快速、高性价比的推理 |
// O 系列模型
const completion = await openai.chat.completions.create({
model: 'o3', // 或 'o3-mini', 'o4-mini'
messages: [{ role: 'user', content: '复杂的推理任务...' }],
});
注意 : O 系列可能会被弃用,转而支持带有 reasoning_effort 参数的 GPT-5。
控制思考深度 (GPT-5/5.1/5.2):
控制输出详细程度 (GPT-5 系列):
不支持 :
temperature、top_p、logprobs 参数替代方案 : 使用 GPT-4o 处理 temperature/top_p,或使用 openai-responses 技能进行有状态推理
使用 stream: true 启用逐令牌交付。
const stream = await openai.chat.completions.create({
model: 'gpt-5.1',
messages: [{ role: 'user', content: '写一首诗' }],
stream: true,
});
for await (const chunk of stream) {
const content = chunk.choices[0]?.delta?.content || '';
process.stdout.write(content);
}
const response = await fetch('https://api.openai.com/v1/chat/completions', {
method: 'POST',
headers: {
'Authorization': `Bearer ${env.OPENAI_API_KEY}`,
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'gpt-5.1',
messages: [{ role: 'user', content: '写一首诗' }],
stream: true,
}),
});
const reader = response.body?.getReader();
const decoder = new TextDecoder();
while (true) {
const { done, value } = await reader!.read();
if (done) break;
const chunk = decoder.decode(value);
const lines = chunk.split('\n').filter(line => line.trim() !== '');
for (const line of lines) {
if (line.startsWith('data: ')) {
const data = line.slice(6);
if (data === '[DONE]') break;
try {
const json = JSON.parse(data);
const content = json.choices[0]?.delta?.content || '';
console.log(content);
} catch (e) {
// 跳过无效的 JSON
}
}
}
}
服务器发送事件 (SSE) 格式 :
data: {"id":"chatcmpl-xyz","choices":[{"delta":{"content":"Hello"}}]}
data: [DONE]
关键点 : 优雅地处理不完整的块、[DONE] 信号和无效的 JSON。
使用 JSON 模式定义工具,模型根据上下文调用它们。
const tools = [{
type: 'function',
function: {
name: 'get_weather',
description: '获取指定位置的当前天气',
parameters: {
type: 'object',
properties: {
location: { type: 'string', description: '城市名称' },
unit: { type: 'string', enum: ['celsius', 'fahrenheit'] }
},
required: ['location']
}
}
}];
const completion = await openai.chat.completions.create({
model: 'gpt-5.1',
messages: [{ role: 'user', content: '旧金山的天气怎么样?' }],
tools: tools,
});
const message = completion.choices[0].message;
if (message.tool_calls) {
for (const toolCall of message.tool_calls) {
const args = JSON.parse(toolCall.function.arguments);
const result = await executeFunction(toolCall.function.name, args);
// 将结果发送回模型
await openai.chat.completions.create({
model: 'gpt-5.1',
messages: [
...messages,
message,
{
role: 'tool',
tool_call_id: toolCall.id,
content: JSON.stringify(result)
}
],
tools: tools,
});
}
}
循环模式 : 持续调用 API,直到响应中没有 tool_calls。
结构化输出允许您对模型响应强制执行 JSON 模式验证。
const completion = await openai.chat.completions.create({
model: 'gpt-4o', // 注意:结构化输出在 GPT-4o 上支持最好
messages: [
{ role: 'user', content: '生成一个人物简介' }
],
response_format: {
type: 'json_schema',
json_schema: {
name: 'person_profile',
strict: true,
schema: {
type: 'object',
properties: {
name: { type: 'string' },
age: { type: 'number' },
skills: {
type: 'array',
items: { type: 'string' }
}
},
required: ['name', 'age', 'skills'],
additionalProperties: false
}
}
}
});
const person = JSON.parse(completion.choices[0].message.content);
// { name: "Alice", age: 28, skills: ["TypeScript", "React"] }
对于不需要严格模式验证的简单用例:
const completion = await openai.chat.completions.create({
model: 'gpt-5',
messages: [
{ role: 'user', content: '以 JSON 格式列出 3 种编程语言' }
],
response_format: { type: 'json_object' }
});
const data = JSON.parse(completion.choices[0].message.content);
重要 : 使用 response_format 时,请在提示中包含 "JSON" 来引导模型。
GPT-4o 支持图像理解以及文本。
const completion = await openai.chat.completions.create({
model: 'gpt-4o',
messages: [
{
role: 'user',
content: [
{ type: 'text', text: '这张图片里有什么?' },
{
type: 'image_url',
image_url: {
url: 'https://example.com/image.jpg'
}
}
]
}
]
});
import fs from 'fs';
const imageBuffer = fs.readFileSync('./image.jpg');
const base64Image = imageBuffer.toString('base64');
const completion = await openai.chat.completions.create({
model: 'gpt-4o',
messages: [
{
role: 'user',
content: [
{ type: 'text', text: '详细描述这张图片' },
{
type: 'image_url',
image_url: {
url: `data:image/jpeg;base64,${base64Image}`
}
}
]
}
]
});
const completion = await openai.chat.completions.create({
model: 'gpt-4o',
messages: [
{
role: 'user',
content: [
{ type: 'text', text: '比较这两张图片' },
{ type: 'image_url', image_url: { url: 'https://example.com/image1.jpg' } },
{ type: 'image_url', image_url: { url: 'https://example.com/image2.jpg' } }
]
}
]
});
端点 : POST /v1/embeddings
将文本转换为向量,用于语义搜索和 RAG。
const embedding = await openai.embeddings.create({
model: 'text-embedding-3-small',
input: '食物很美味。',
});
// 返回: { data: [{ embedding: [0.002, -0.009, ...] }] }
const embedding = await openai.embeddings.create({
model: 'text-embedding-3-small',
input: '示例文本',
dimensions: 256, // 从默认的 1536 减少
});
优势 : 存储减少 4-12 倍,搜索更快,质量损失最小。
const embeddings = await openai.embeddings.create({
model: 'text-embedding-3-small',
input: ['第一个文档', '第二个文档', '第三个文档'],
});
限制 : 每个输入 8192 个令牌,批次总计 300k 个令牌,最大数组大小 2048。
关键点 : 使用自定义维度以提高效率,批量处理最多 2048 个文档,缓存嵌入 (确定性)。
端点 : POST /v1/images/generations
const image = await openai.images.generate({
model: 'dall-e-3',
prompt: '一只蓝眼睛引人注目的白色暹罗猫',
size: '1024x1024', // 也支持: 1024x1536, 1536x1024, 1024x1792, 1792x1024
quality: 'standard', // 或 'hd'
style: 'vivid', // 或 'natural'
});
console.log(image.data[0].url);
console.log(image.data[0].revised_prompt); // DALL-E 3 可能会出于安全/质量原因修改提示
DALL-E 3 特性 :
n: 1 (每个请求一张图像)revised_prompt)response_format: 'b64_json' 以持久化)端点 : POST /v1/images/edits
重要 : 使用 multipart/form-data,而不是 JSON。
import FormData from 'form-data';
const formData = new FormData();
formData.append('model', 'gpt-image-1');
formData.append('image', fs.createReadStream('./woman.jpg'));
formData.append('image_2', fs.createReadStream('./logo.png')); // 可选的合成图像
formData.append('prompt', '将徽标添加到织物上。');
formData.append('input_fidelity', 'high'); // low|medium|high
formData.append('format', 'png'); // 支持透明度
formData.append('background', 'transparent'); // transparent|white|black
const response = await fetch('https://api.openai.com/v1/images/edits', {
method: 'POST',
headers: {
'Authorization': `Bearer ${process.env.OPENAI_API_KEY}`,
...formData.getHeaders(),
},
body: formData,
});
GPT-Image-1 功能 : 支持透明度 (PNG/WebP)、与 image_2 合成、输出压缩控制。
端点 : POST /v1/audio/transcriptions
const transcription = await openai.audio.transcriptions.create({
file: fs.createReadStream('./audio.mp3'),
model: 'whisper-1',
});
// 返回: { text: "转录的文本..." }
格式 : mp3, mp4, mpeg, mpga, m4a, wav, webm
端点 : POST /v1/audio/speech
模型 :
11 种语音 : alloy, ash, ballad, coral, echo, fable, onyx, nova, sage, shimmer, verse
const mp3 = await openai.audio.speech.create({
model: 'tts-1',
voice: 'alloy',
input: '要朗读的文本 (最多 4096 个字符)',
speed: 1.0, // 0.25-4.0
response_format: 'mp3', // mp3|opus|aac|flac|wav|pcm
});
const speech = await openai.audio.speech.create({
model: 'gpt-4o-mini-tts',
voice: 'nova',
input: '欢迎使用支持服务。',
instructions: '以平静、专业的语调说话。', // 自定义语音控制
});
const response = await fetch('https://api.openai.com/v1/audio/speech', {
method: 'POST',
headers: {
'Authorization': `Bearer ${process.env.OPENAI_API_KEY}`,
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'gpt-4o-mini-tts',
voice: 'nova',
input: '长文本...',
stream_format: 'sse', // 服务器发送事件
}),
});
注意 : instructions 和 stream_format: "sse" 仅适用于 gpt-4o-mini-tts。
端点 : POST /v1/moderations
检查 11 个安全类别的内容。
const moderation = await openai.moderations.create({
model: 'omni-moderation-latest',
input: '要审核的文本',
});
console.log(moderation.results[0].flagged);
console.log(moderation.results[0].categories);
console.log(moderation.results[0].category_scores); // 0.0-1.0
分数 : 0.0 (低置信度) 到 1.0 (高置信度)
const moderation = await openai.moderations.create({
model: 'omni-moderation-latest',
input: ['文本 1', '文本 2', '文本 3'],
});
最佳实践 : 对严重类别使用较低的阈值 (sexual/minors: 0.1, self-harm/intent: 0.2),批量请求,出错时采用故障关闭策略。
通过 WebSocket/WebRTC 进行低延迟语音和音频交互。于 2025年8月28日发布正式版。
更新 (2025年2月) : 并发会话限制已取消 - 现在支持无限同时连接。
const ws = new WebSocket('wss://api.openai.com/v1/realtime', {
headers: {
Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
'OpenAI-Beta': 'realtime=v1',
},
});
ws.onopen = () => {
ws.send(JSON.stringify({
type: 'session.update',
session: {
voice: 'alloy', // 或: echo, fable, onyx, nova, shimmer, marin, cedar
instructions: '你是一个乐于助人的助手',
input_audio_transcription: { model: 'whisper-1' },
},
}));
};
ws.onmessage = (event) => {
const data = JSON.parse(event.data);
switch (data.type) {
case 'response.audio.delta':
// 处理音频块 (base64 编码)
playAudioChunk(data.delta);
break;
case 'response.text.delta':
// 处理文本转录
console.log(data.delta);
break;
}
};
// 发送用户音频
ws.send(JSON.stringify({
type: 'input_audio_buffer.append',
audio: base64AudioData,
}));
以 50% 的更低成本处理大量数据,最长周转时间为 24 小时。
注意 : 虽然补全窗口最长为 24 小时,但作业通常完成得更快 (报告显示,对于估计需要 10 小时以上的任务,实际完成时间不到 1 小时)。
// 1. 创建包含请求的 JSONL 文件
const requests = [
{ custom_id: 'req-1', method: 'POST', url: '/v1/chat/completions',
body: { model: 'gpt-5.1', messages: [{ role: 'user', content: 'Hello 1' }] } },
{ custom_id: 'req-2', method: 'POST', url: '/v1/chat/completions',
body: { model: 'gpt-5.1', messages: [{ role: 'user', content: 'Hello 2' }] } },
];
// 2. 上传文件
const file = await openai.files.create({
file: new File([requests.map(r => JSON.stringify(r)).join('\n')], 'batch.jsonl'),
purpose: 'batch',
});
// 3. 创建批量任务
const batch = await openai.batches.create({
input_file_id: file.id,
endpoint: '/v1/chat/completions',
completion_window: '24h',
});
console.log(batch.id); // batch_abc123
const batch = await openai.batches.retrieve('batch_abc123');
console.log(batch.status); // validating, in_progress, completed, failed
console.log(batch.request_counts); // { total, completed, failed }
if (batch.status === 'completed') {
const results = await openai.files.content(batch.output_file_id);
// 解析 JSONL 结果
}
| 使用场景 | 使用批量 API? |
|---|---|
| 大规模内容审核 | ✅ |
| 文档处理 (嵌入) | ✅ |
| 批量摘要 | ✅ |
| 实时聊天 | ❌ 使用聊天 API |
| 流式响应 | ❌ 使用聊天 API |
401 : 无效的 API 密钥
429 : 超出速率限制 (实施指数退避)
500/503 : 服务器错误 (使用退避重试)
async function completionWithRetry(params, maxRetries = 3) { for (let i = 0; i < maxRetries; i++) { try { return await openai.chat.completions.create(params); } catch (error) { if (error.status === 429 && i < maxRetries - 1) { await new Promise(resolve => setTimeout(resolve, Math.pow(2, i) * 1000)); continue; } throw error; } } }
response.headers.get('x-ratelimit-limit-requests');
response.headers.get('x-ratelimit-remaining-requests');
response.headers.get('x-ratelimit-reset-requests');
限制 : 基于 RPM (每分钟请求数)、TPM (每分钟令牌数)、IPM (每分钟图像数)。根据层级和模型而异。
错误 : 400 The requested model 'gpt-5.1-mini' does not exist 来源 : GitHub Issue #1706
错误 :
model: 'gpt-5.1-mini' // 不存在
正确 :
model: 'gpt-5-mini' // 正确 (没有 .1 后缀)
可用的 GPT-5 系列模型:
gpt-5, gpt-5-mini, gpt-5-nanogpt-5.1, gpt-5.2gpt-5.1-mini 或 gpt-5.2-mini - mini 变体没有 .1/.2 版本错误 : ValueError: shapes (0,256) and (1536,) not aligned
确保向量数据库维度与嵌入 API 的 dimensions 参数匹配:
// ❌ 错误 - 缺少 dimensions 参数,返回默认的 1536
const embedding = await openai.embeddings.create({
model: 'text-embedding-3-small',
input: 'text',
});
// ✅ 正确 - 指定维度以匹配数据库
const embedding = await openai.embeddings.create({
model: 'text-embedding-3-small',
input: 'text',
dimensions: 256, // 匹配你的向量数据库配置
});
问题 : GPT-5.1 和 GPT-5.2 默认使用 reasoning_effort: 'none' (与 GPT-5 相比是重大变更)
// GPT-5 (默认 'medium')
model: 'gpt-5' // 自动推理
// GPT-5.1 (默认 'none')
model: 'gpt-5.1' // 除非指定,否则没有推理!
reasoning_effort: 'medium' // 必须显式添加
问题 : GitHub Issue #1402
使用 strictNullChecks: true 时,usage 字段可能导致类型错误:
// ❌ 使用 strictNullChecks 时的 TypeScript 错误
const tokens = completion.usage.total_tokens;
// ✅ 使用可选链或空值检查
const tokens = completion.usage?.total_tokens ?? 0;
// 或显式检查
if (completion.usage) {
const tokens = completion.usage.total_tokens;
}
问题 : GitHub Issue #1718
多模态请求包含 text_tokens 和 image_tokens 字段,但这些字段不在 TypeScript 类型中:
// 这些字段存在但未类型化
const usage = completion.usage as any;
console.log(usage.text_tokens);
console.log(usage.image_tokens);
问题 : GitHub Issue #1709
在 Zod 4.1.13+ 中使用 zodResponseFormat() 会破坏联合类型转换:
// ❌ 在 Zod 4.1.13+ 中损坏
const schema = z.object({
status: z.union([z.literal('success'), z.literal('error')]),
});
// ✅ 解决方法:改用 enum
const schema = z.object({
status: z.enum(['success', 'error']),
});
替代方案 :
安全 : 切勿在客户端暴露 API 密钥,使用服务器端代理,将密钥存储在环境变量中。
性能 : 流式传输超过 100 个令牌的响应,适当设置 max_tokens,缓存确定性响应。
成本 : 对于简单任务使用 gpt-5.1 并设置 reasoning_effort: 'none',对于复杂推理使用 gpt-5.1 并设置 'high'。
传统/无状态 API,适用于:
特点 :
有状态/智能体 API,适用于:
特点 :
Version : Production Ready ✅ Package : openai@6.16.0 Last Updated : 2026-01-20
✅ Production Ready :
npm install openai@6.16.0
export OPENAI_API_KEY="sk-..."
Or create .env file:
OPENAI_API_KEY=sk-...
import OpenAI from 'openai';
const openai = new OpenAI({
apiKey: process.env.OPENAI_API_KEY,
});
const completion = await openai.chat.completions.create({
model: 'gpt-5',
messages: [
{ role: 'user', content: 'What are the three laws of robotics?' }
],
});
console.log(completion.choices[0].message.content);
const response = await fetch('https://api.openai.com/v1/chat/completions', {
method: 'POST',
headers: {
'Authorization': `Bearer ${env.OPENAI_API_KEY}`,
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'gpt-5',
messages: [
{ role: 'user', content: 'What are the three laws of robotics?' }
],
}),
});
const data = await response.json();
console.log(data.choices[0].message.content);
Endpoint : POST /v1/chat/completions
The Chat Completions API is the core interface for interacting with OpenAI's language models. It supports conversational AI, text generation, function calling, structured outputs, and vision capabilities.
{
model: string, // Model to use (e.g., "gpt-5")
messages: Message[], // Conversation history
reasoning_effort?: string, // GPT-5 only: "minimal" | "low" | "medium" | "high"
verbosity?: string, // GPT-5 only: "low" | "medium" | "high"
temperature?: number, // NOT supported by GPT-5
max_tokens?: number, // Max tokens to generate
stream?: boolean, // Enable streaming
tools?: Tool[], // Function calling tools
}
{
id: string, // Unique completion ID
object: "chat.completion",
created: number, // Unix timestamp
model: string, // Model used
choices: [{
index: number,
message: {
role: "assistant",
content: string, // Generated text
tool_calls?: ToolCall[] // If function calling
},
finish_reason: string // "stop" | "length" | "tool_calls"
}],
usage: {
prompt_tokens: number,
completion_tokens: number,
total_tokens: number
}
}
Three roles: system (behavior), user (input), assistant (model responses).
Important : API is stateless - send full conversation history each request. For stateful conversations, use openai-responses skill.
GPT-5 models (released August 2025) introduce reasoning and verbosity controls.
Latest flagship model :
gpt-5.2 : 400k context window, 128k output tokens
xhigh reasoning_effort : New level beyond "high" for complex problems
Compaction : Extends context for long workflows (via API endpoint)
Pricing : $1.75/$14 per million tokens (1.4x of GPT-5.1)
// GPT-5.2 with maximum reasoning const completion = await openai.chat.completions.create({ model: 'gpt-5.2', messages: [{ role: 'user', content: 'Solve this extremely complex problem...' }], reasoning_effort: 'xhigh', // NEW: Beyond "high" });
Warmer, more intelligent model :
BREAKING CHANGE : GPT-5.1/5.2 default to reasoning_effort: 'none' (vs GPT-5 defaulting to 'medium').
Dedicated reasoning models (separate from GPT-5):
| Model | Released | Purpose |
|---|---|---|
| o3 | Apr 16, 2025 | Successor to o1, advanced reasoning |
| o3-pro | Jun 10, 2025 | Extended compute version of o3 |
| o3-mini | Jan 31, 2025 | Smaller, faster o3 variant |
| o4-mini | Apr 16, 2025 | Fast, cost-efficient reasoning |
// O-series models
const completion = await openai.chat.completions.create({
model: 'o3', // or 'o3-mini', 'o4-mini'
messages: [{ role: 'user', content: 'Complex reasoning task...' }],
});
Note : O-series may be deprecated in favor of GPT-5 with reasoning_effort parameter.
Controls thinking depth (GPT-5/5.1/5.2):
Controls output detail (GPT-5 series):
NOT Supported :
temperature, top_p, logprobs parametersAlternatives : Use GPT-4o for temperature/top_p, or openai-responses skill for stateful reasoning
Enable with stream: true for token-by-token delivery.
const stream = await openai.chat.completions.create({
model: 'gpt-5.1',
messages: [{ role: 'user', content: 'Write a poem' }],
stream: true,
});
for await (const chunk of stream) {
const content = chunk.choices[0]?.delta?.content || '';
process.stdout.write(content);
}
const response = await fetch('https://api.openai.com/v1/chat/completions', {
method: 'POST',
headers: {
'Authorization': `Bearer ${env.OPENAI_API_KEY}`,
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'gpt-5.1',
messages: [{ role: 'user', content: 'Write a poem' }],
stream: true,
}),
});
const reader = response.body?.getReader();
const decoder = new TextDecoder();
while (true) {
const { done, value } = await reader!.read();
if (done) break;
const chunk = decoder.decode(value);
const lines = chunk.split('\n').filter(line => line.trim() !== '');
for (const line of lines) {
if (line.startsWith('data: ')) {
const data = line.slice(6);
if (data === '[DONE]') break;
try {
const json = JSON.parse(data);
const content = json.choices[0]?.delta?.content || '';
console.log(content);
} catch (e) {
// Skip invalid JSON
}
}
}
}
Server-Sent Events (SSE) format :
data: {"id":"chatcmpl-xyz","choices":[{"delta":{"content":"Hello"}}]}
data: [DONE]
Key Points : Handle incomplete chunks, [DONE] signal, and invalid JSON gracefully.
Define tools with JSON schema, model invokes them based on context.
const tools = [{
type: 'function',
function: {
name: 'get_weather',
description: 'Get current weather for a location',
parameters: {
type: 'object',
properties: {
location: { type: 'string', description: 'City name' },
unit: { type: 'string', enum: ['celsius', 'fahrenheit'] }
},
required: ['location']
}
}
}];
const completion = await openai.chat.completions.create({
model: 'gpt-5.1',
messages: [{ role: 'user', content: 'What is the weather in SF?' }],
tools: tools,
});
const message = completion.choices[0].message;
if (message.tool_calls) {
for (const toolCall of message.tool_calls) {
const args = JSON.parse(toolCall.function.arguments);
const result = await executeFunction(toolCall.function.name, args);
// Send result back to model
await openai.chat.completions.create({
model: 'gpt-5.1',
messages: [
...messages,
message,
{
role: 'tool',
tool_call_id: toolCall.id,
content: JSON.stringify(result)
}
],
tools: tools,
});
}
}
Loop pattern : Continue calling API until no tool_calls in response.
Structured outputs allow you to enforce JSON schema validation on model responses.
const completion = await openai.chat.completions.create({
model: 'gpt-4o', // Note: Structured outputs best supported on GPT-4o
messages: [
{ role: 'user', content: 'Generate a person profile' }
],
response_format: {
type: 'json_schema',
json_schema: {
name: 'person_profile',
strict: true,
schema: {
type: 'object',
properties: {
name: { type: 'string' },
age: { type: 'number' },
skills: {
type: 'array',
items: { type: 'string' }
}
},
required: ['name', 'age', 'skills'],
additionalProperties: false
}
}
}
});
const person = JSON.parse(completion.choices[0].message.content);
// { name: "Alice", age: 28, skills: ["TypeScript", "React"] }
For simpler use cases without strict schema validation:
const completion = await openai.chat.completions.create({
model: 'gpt-5',
messages: [
{ role: 'user', content: 'List 3 programming languages as JSON' }
],
response_format: { type: 'json_object' }
});
const data = JSON.parse(completion.choices[0].message.content);
Important : When using response_format, include "JSON" in your prompt to guide the model.
GPT-4o supports image understanding alongside text.
const completion = await openai.chat.completions.create({
model: 'gpt-4o',
messages: [
{
role: 'user',
content: [
{ type: 'text', text: 'What is in this image?' },
{
type: 'image_url',
image_url: {
url: 'https://example.com/image.jpg'
}
}
]
}
]
});
import fs from 'fs';
const imageBuffer = fs.readFileSync('./image.jpg');
const base64Image = imageBuffer.toString('base64');
const completion = await openai.chat.completions.create({
model: 'gpt-4o',
messages: [
{
role: 'user',
content: [
{ type: 'text', text: 'Describe this image in detail' },
{
type: 'image_url',
image_url: {
url: `data:image/jpeg;base64,${base64Image}`
}
}
]
}
]
});
const completion = await openai.chat.completions.create({
model: 'gpt-4o',
messages: [
{
role: 'user',
content: [
{ type: 'text', text: 'Compare these two images' },
{ type: 'image_url', image_url: { url: 'https://example.com/image1.jpg' } },
{ type: 'image_url', image_url: { url: 'https://example.com/image2.jpg' } }
]
}
]
});
Endpoint : POST /v1/embeddings
Convert text to vectors for semantic search and RAG.
const embedding = await openai.embeddings.create({
model: 'text-embedding-3-small',
input: 'The food was delicious.',
});
// Returns: { data: [{ embedding: [0.002, -0.009, ...] }] }
const embedding = await openai.embeddings.create({
model: 'text-embedding-3-small',
input: 'Sample text',
dimensions: 256, // Reduced from 1536 default
});
Benefits : 4x-12x storage reduction, faster search, minimal quality loss.
const embeddings = await openai.embeddings.create({
model: 'text-embedding-3-small',
input: ['First doc', 'Second doc', 'Third doc'],
});
Limits : 8192 tokens/input, 300k tokens total across batch, 2048 max array size.
Key Points : Use custom dimensions for efficiency, batch up to 2048 docs, cache embeddings (deterministic).
Endpoint : POST /v1/images/generations
const image = await openai.images.generate({
model: 'dall-e-3',
prompt: 'A white siamese cat with striking blue eyes',
size: '1024x1024', // Also: 1024x1536, 1536x1024, 1024x1792, 1792x1024
quality: 'standard', // or 'hd'
style: 'vivid', // or 'natural'
});
console.log(image.data[0].url);
console.log(image.data[0].revised_prompt); // DALL-E 3 may revise for safety
DALL-E 3 Specifics :
n: 1 (one image per request)revised_prompt)response_format: 'b64_json' for persistence)Endpoint : POST /v1/images/edits
Important : Uses multipart/form-data, not JSON.
import FormData from 'form-data';
const formData = new FormData();
formData.append('model', 'gpt-image-1');
formData.append('image', fs.createReadStream('./woman.jpg'));
formData.append('image_2', fs.createReadStream('./logo.png')); // Optional composite
formData.append('prompt', 'Add the logo to the fabric.');
formData.append('input_fidelity', 'high'); // low|medium|high
formData.append('format', 'png'); // Supports transparency
formData.append('background', 'transparent'); // transparent|white|black
const response = await fetch('https://api.openai.com/v1/images/edits', {
method: 'POST',
headers: {
'Authorization': `Bearer ${process.env.OPENAI_API_KEY}`,
...formData.getHeaders(),
},
body: formData,
});
GPT-Image-1 Features : Supports transparency (PNG/WebP), compositing with image_2, output compression control.
Endpoint : POST /v1/audio/transcriptions
const transcription = await openai.audio.transcriptions.create({
file: fs.createReadStream('./audio.mp3'),
model: 'whisper-1',
});
// Returns: { text: "Transcribed text..." }
Formats : mp3, mp4, mpeg, mpga, m4a, wav, webm
Endpoint : POST /v1/audio/speech
Models :
11 Voices : alloy, ash, ballad, coral, echo, fable, onyx, nova, sage, shimmer, verse
const mp3 = await openai.audio.speech.create({
model: 'tts-1',
voice: 'alloy',
input: 'Text to speak (max 4096 chars)',
speed: 1.0, // 0.25-4.0
response_format: 'mp3', // mp3|opus|aac|flac|wav|pcm
});
const speech = await openai.audio.speech.create({
model: 'gpt-4o-mini-tts',
voice: 'nova',
input: 'Welcome to support.',
instructions: 'Speak in a calm, professional tone.', // Custom voice control
});
const response = await fetch('https://api.openai.com/v1/audio/speech', {
method: 'POST',
headers: {
'Authorization': `Bearer ${process.env.OPENAI_API_KEY}`,
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'gpt-4o-mini-tts',
voice: 'nova',
input: 'Long text...',
stream_format: 'sse', // Server-Sent Events
}),
});
Note : instructions and stream_format: "sse" only work with gpt-4o-mini-tts.
Endpoint : POST /v1/moderations
Check content across 11 safety categories.
const moderation = await openai.moderations.create({
model: 'omni-moderation-latest',
input: 'Text to moderate',
});
console.log(moderation.results[0].flagged);
console.log(moderation.results[0].categories);
console.log(moderation.results[0].category_scores); // 0.0-1.0
Scores : 0.0 (low confidence) to 1.0 (high confidence)
const moderation = await openai.moderations.create({
model: 'omni-moderation-latest',
input: ['Text 1', 'Text 2', 'Text 3'],
});
Best Practices : Use lower thresholds for severe categories (sexual/minors: 0.1, self-harm/intent: 0.2), batch requests, fail closed on errors.
Low-latency voice and audio interactions via WebSocket/WebRTC. GA August 28, 2025.
Update (Feb 2025) : Concurrent session limit removed - unlimited simultaneous connections now supported.
const ws = new WebSocket('wss://api.openai.com/v1/realtime', {
headers: {
Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
'OpenAI-Beta': 'realtime=v1',
},
});
ws.onopen = () => {
ws.send(JSON.stringify({
type: 'session.update',
session: {
voice: 'alloy', // or: echo, fable, onyx, nova, shimmer, marin, cedar
instructions: 'You are a helpful assistant',
input_audio_transcription: { model: 'whisper-1' },
},
}));
};
ws.onmessage = (event) => {
const data = JSON.parse(event.data);
switch (data.type) {
case 'response.audio.delta':
// Handle audio chunk (base64 encoded)
playAudioChunk(data.delta);
break;
case 'response.text.delta':
// Handle text transcript
console.log(data.delta);
break;
}
};
// Send user audio
ws.send(JSON.stringify({
type: 'input_audio_buffer.append',
audio: base64AudioData,
}));
Process large volumes with 24-hour maximum turnaround at 50% lower cost.
Note : While the completion window is 24 hours maximum, jobs often complete much faster (reports show completion in under 1 hour for tasks estimated at 10+ hours).
// 1. Create JSONL file with requests
const requests = [
{ custom_id: 'req-1', method: 'POST', url: '/v1/chat/completions',
body: { model: 'gpt-5.1', messages: [{ role: 'user', content: 'Hello 1' }] } },
{ custom_id: 'req-2', method: 'POST', url: '/v1/chat/completions',
body: { model: 'gpt-5.1', messages: [{ role: 'user', content: 'Hello 2' }] } },
];
// 2. Upload file
const file = await openai.files.create({
file: new File([requests.map(r => JSON.stringify(r)).join('\n')], 'batch.jsonl'),
purpose: 'batch',
});
// 3. Create batch
const batch = await openai.batches.create({
input_file_id: file.id,
endpoint: '/v1/chat/completions',
completion_window: '24h',
});
console.log(batch.id); // batch_abc123
const batch = await openai.batches.retrieve('batch_abc123');
console.log(batch.status); // validating, in_progress, completed, failed
console.log(batch.request_counts); // { total, completed, failed }
if (batch.status === 'completed') {
const results = await openai.files.content(batch.output_file_id);
// Parse JSONL results
}
| Use Case | Batch API? |
|---|---|
| Content moderation at scale | ✅ |
| Document processing (embeddings) | ✅ |
| Bulk summarization | ✅ |
| Real-time chat | ❌ Use Chat API |
| Streaming responses | ❌ Use Chat API |
401 : Invalid API key
429 : Rate limit exceeded (implement exponential backoff)
500/503 : Server errors (retry with backoff)
async function completionWithRetry(params, maxRetries = 3) { for (let i = 0; i < maxRetries; i++) { try { return await openai.chat.completions.create(params); } catch (error) { if (error.status === 429 && i < maxRetries - 1) { await new Promise(resolve => setTimeout(resolve, Math.pow(2, i) * 1000)); continue; } throw error; } } }
response.headers.get('x-ratelimit-limit-requests');
response.headers.get('x-ratelimit-remaining-requests');
response.headers.get('x-ratelimit-reset-requests');
Limits : Based on RPM (Requests/Min), TPM (Tokens/Min), IPM (Images/Min). Varies by tier and model.
Error : 400 The requested model 'gpt-5.1-mini' does not exist Source : GitHub Issue #1706
Wrong :
model: 'gpt-5.1-mini' // Does not exist
Correct :
model: 'gpt-5-mini' // Correct (no .1 suffix)
Available GPT-5 series models:
gpt-5, gpt-5-mini, gpt-5-nanogpt-5.1, gpt-5.2gpt-5.1-mini or gpt-5.2-mini - mini variant doesn't have .1/.2 versionsError : ValueError: shapes (0,256) and (1536,) not aligned
Ensure vector database dimensions match embeddings API dimensions parameter:
// ❌ Wrong - missing dimensions, returns 1536 default
const embedding = await openai.embeddings.create({
model: 'text-embedding-3-small',
input: 'text',
});
// ✅ Correct - specify dimensions to match database
const embedding = await openai.embeddings.create({
model: 'text-embedding-3-small',
input: 'text',
dimensions: 256, // Match your vector database config
});
Issue : GPT-5.1 and GPT-5.2 default to reasoning_effort: 'none' (breaking change from GPT-5)
// GPT-5 (defaults to 'medium')
model: 'gpt-5' // Automatic reasoning
// GPT-5.1 (defaults to 'none')
model: 'gpt-5.1' // NO reasoning unless specified!
reasoning_effort: 'medium' // Must add explicitly
Issue : GitHub Issue #1402
With strictNullChecks: true, the usage field may cause type errors:
// ❌ TypeScript error with strictNullChecks
const tokens = completion.usage.total_tokens;
// ✅ Use optional chaining or null check
const tokens = completion.usage?.total_tokens ?? 0;
// Or explicit check
if (completion.usage) {
const tokens = completion.usage.total_tokens;
}
Issue : GitHub Issue #1718
Multimodal requests include text_tokens and image_tokens fields not in TypeScript types:
// These fields exist but aren't typed
const usage = completion.usage as any;
console.log(usage.text_tokens);
console.log(usage.image_tokens);
Issue : GitHub Issue #1709
Using zodResponseFormat() with Zod 4.1.13+ breaks union type conversion:
// ❌ Broken with Zod 4.1.13+
const schema = z.object({
status: z.union([z.literal('success'), z.literal('error')]),
});
// ✅ Workaround: Use enum instead
const schema = z.object({
status: z.enum(['success', 'error']),
});
Alternatives :
Security : Never expose API keys client-side, use server-side proxy, store keys in environment variables.
Performance : Stream responses >100 tokens, set max_tokens appropriately, cache deterministic responses.
Cost : Use gpt-5.1 with reasoning_effort: 'none' for simple tasks, gpt-5.1 with 'high' for complex reasoning.
Traditional/stateless API for:
Characteristics :
Stateful/agentic API for:
Characteristics :
| Use Case | Use openai-api | Use openai-responses |
|---|---|---|
| Simple chat | ✅ | ❌ |
| RAG/embeddings | ✅ | ❌ |
| Image generation | ✅ | ✅ |
| Audio processing | ✅ | ❌ |
| Agentic workflows | ❌ | ✅ |
| Multi-turn reasoning | ❌ | ✅ |
| Background tasks | ❌ | ✅ |
| Custom tools only | ✅ | ❌ |
| Built-in + custom tools | ❌ | ✅ |
Use both : Many apps use openai-api for embeddings/images/audio and openai-responses for conversational agents.
npm install openai@6.16.0
Environment : OPENAI_API_KEY=sk-...
TypeScript : Fully typed with included definitions.
✅ Skill Complete - Production Ready
All API sections documented:
Remaining Tasks :
See /planning/research-logs/openai-api.md for complete research notes.
Token Savings : ~60% (12,500 tokens saved vs manual implementation) Errors Prevented : 16 documented common issues (6 new from Jan 2026 research) Production Tested : Ready for immediate use Last Verified : 2026-01-20 | Skill Version : 2.1.0 | Changes : Added TypeScript gotchas, common mistakes, and TIER 1-2 findings from community research
Weekly Installs
465
Repository
GitHub Stars
650
First Seen
Jan 20, 2026
Security Audits
Gen Agent Trust HubFailSocketPassSnykWarn
Installed on
claude-code365
opencode324
gemini-cli320
codex286
cursor272
antigravity261
AI 代码实施计划编写技能 | 自动化开发任务分解与 TDD 流程规划工具
41,400 周安装
App Store Connect 元数据自动化本地化工具 - asc-localize-metadata 使用指南
1,000 周安装
Python后端开发指南:FastAPI、SQLAlchemy异步与Upstash缓存最佳实践
959 周安装
NotebookLM Python库:自动化访问Google NotebookLM,实现AI内容创作与文档处理
1,100 周安装
Base44 CLI 工具 - 创建和管理 Base44 应用项目 | 命令行开发工具
975 周安装
Ant Design 最佳实践指南:React 组件库使用决策、主题配置与性能优化
998 周安装
SwiftUI 开发模式指南:状态管理、视图组合与导航最佳实践
1,100 周安装