ai-models by alinaqi/claude-bootstrap
npx skills add https://github.com/alinaqi/claude-bootstrap --skill ai-models
Load with: base.md + llm-patterns.md
Last Updated: December 2025
Use the right model for the job. Bigger isn't always better - match model capabilities to task requirements. Consider cost, latency, and accuracy tradeoffs.
| Task | Recommended | Why |
|---|---|---|
| Complex reasoning | Claude Opus 4.5, o3, Gemini 3 Pro | Highest accuracy |
| Fast chat/completion | Claude Haiku, GPT-4.1 mini, Gemini Flash | Low latency, cheap |
| Code generation | Claude Sonnet 4.5, Codestral, GPT-4.1 | Strong coding |
| Vision/images | Claude Sonnet, GPT-4o, Gemini 3 Pro | Multimodal |
| Embeddings | text-embedding-3-small, Voyage | Cost-effective |
| Voice synthesis | Eleven Labs v3, OpenAI TTS | Natural sounding |
| Image generation | FLUX.2, DALL-E 3, SD 3.5 | Different styles |
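The task table above can be sketched as a small router. This is a hypothetical helper (not part of the skill itself), picking one reasonable default per task from the model IDs documented below:

```typescript
// Hypothetical task-to-model router; defaults taken from the tables in this doc.
type Task = 'reasoning' | 'chat' | 'code' | 'vision' | 'embeddings';

const DEFAULT_MODEL: Record<Task, string> = {
  reasoning: 'claude-opus-4-5-20251101',
  chat: 'claude-haiku-3-5-20241022',
  code: 'claude-sonnet-4-5-20250929',
  vision: 'gpt-4.1',
  embeddings: 'text-embedding-3-small',
};

export function pickModel(task: Task): string {
  return DEFAULT_MODEL[task];
}
```

Swap the defaults for your own preferences; the point is to centralize model choice in one place so upgrades touch a single map.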
```typescript
const CLAUDE_MODELS = {
  // Flagship - highest capability
  opus: 'claude-opus-4-5-20251101',
  // Balanced - best for most tasks
  sonnet: 'claude-sonnet-4-5-20250929',
  // Previous generation (still excellent)
  opus4: 'claude-opus-4-20250514',
  sonnet4: 'claude-sonnet-4-20250514',
  // Fast & cheap - high volume tasks
  haiku: 'claude-haiku-3-5-20241022',
} as const;
```

```typescript
import Anthropic from '@anthropic-ai/sdk';

const anthropic = new Anthropic({
  apiKey: process.env.ANTHROPIC_API_KEY,
});

const response = await anthropic.messages.create({
  model: 'claude-sonnet-4-5-20250929',
  max_tokens: 1024,
  messages: [
    { role: 'user', content: 'Hello, Claude!' }
  ],
});
```
```
claude-opus-4-5-20251101 (Opus 4.5)
├── Best for: Complex analysis, research, nuanced writing
├── Context: 200K tokens
├── Cost: $5/$25 per 1M tokens (input/output)
└── Use when: Accuracy matters most

claude-sonnet-4-5-20250929 (Sonnet 4.5)
├── Best for: Code, general tasks, balanced performance
├── Context: 200K tokens
├── Cost: $3/$15 per 1M tokens
└── Use when: Default choice for most applications

claude-haiku-3-5-20241022 (Haiku 3.5)
├── Best for: Classification, extraction, high-volume
├── Context: 200K tokens
├── Cost: $0.25/$1.25 per 1M tokens
└── Use when: Speed and cost matter most
```
```typescript
const OPENAI_MODELS = {
  // GPT-5 series (latest)
  gpt5: 'gpt-5.2',
  gpt5Mini: 'gpt-5-mini',
  // GPT-4.1 series (recommended for most)
  gpt41: 'gpt-4.1',
  gpt41Mini: 'gpt-4.1-mini',
  gpt41Nano: 'gpt-4.1-nano',
  // Reasoning models (o-series)
  o3: 'o3',
  o3Pro: 'o3-pro',
  o4Mini: 'o4-mini',
  // Legacy but still useful
  gpt4o: 'gpt-4o', // Still has audio support
  gpt4oMini: 'gpt-4o-mini',
  // Embeddings
  embeddingSmall: 'text-embedding-3-small',
  embeddingLarge: 'text-embedding-3-large',
  // Image generation
  dalle3: 'dall-e-3',
  gptImage: 'gpt-image-1',
  // Audio
  tts: 'tts-1',
  ttsHd: 'tts-1-hd',
  whisper: 'whisper-1',
} as const;
```

```typescript
import OpenAI from 'openai';

const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
});

// Chat completion
const response = await openai.chat.completions.create({
  model: 'gpt-4.1',
  messages: [
    { role: 'user', content: 'Hello!' }
  ],
});

// With vision
const visionResponse = await openai.chat.completions.create({
  model: 'gpt-4.1',
  messages: [
    {
      role: 'user',
      content: [
        { type: 'text', text: 'What is in this image?' },
        { type: 'image_url', image_url: { url: 'https://...' } },
      ],
    },
  ],
});

// Embeddings
const embedding = await openai.embeddings.create({
  model: 'text-embedding-3-small',
  input: 'Your text here',
});
```
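Once you have embedding vectors, a common next step is comparing them. A minimal sketch, pure math with no API calls (the vectors would come from the embeddings response's `data[i].embedding` arrays):

```typescript
// Cosine similarity between two embedding vectors: 1 = identical direction,
// 0 = orthogonal (unrelated), -1 = opposite.
export function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error('dimension mismatch');
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```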
```
o3 / o3-pro
├── Best for: Math, coding, complex multi-step reasoning
├── Context: 200K tokens
├── Cost: Premium pricing
└── Use when: Hardest problems, need chain-of-thought

gpt-4.1
├── Best for: General tasks, coding, instruction following
├── Context: 1M tokens (!)
├── Cost: Lower than GPT-4o
└── Use when: Default choice, replaces GPT-4o

gpt-4.1-mini / gpt-4.1-nano
├── Best for: High-volume, cost-sensitive
├── Context: 1M tokens
├── Cost: Very low
└── Use when: Simple tasks at scale

o4-mini
├── Best for: Fast reasoning at low cost
├── Context: 200K tokens
├── Cost: Budget reasoning
└── Use when: Need reasoning but cost-conscious
```
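The context limits above are worth checking before a call. A rough pre-flight sketch using the common (but inexact) ~4 characters per token heuristic; the limit map is hand-copied from the tiers above, not fetched from any API:

```typescript
// Context windows as listed in this doc (hand-maintained, may drift).
const CONTEXT_LIMITS: Record<string, number> = {
  'gpt-4.1': 1_000_000,
  'gpt-4.1-mini': 1_000_000,
  'o3': 200_000,
  'o4-mini': 200_000,
};

// Crude estimate: ~4 characters per token. Use a real tokenizer for accuracy.
export function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Does the prompt fit, leaving room for the response?
export function fitsContext(model: string, text: string, reservedOutput = 4096): boolean {
  const limit = CONTEXT_LIMITS[model];
  if (limit === undefined) throw new Error(`unknown model: ${model}`);
  return estimateTokens(text) + reservedOutput <= limit;
}
```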
```typescript
const GEMINI_MODELS = {
  // Gemini 3 (Latest)
  gemini3Pro: 'gemini-3-pro-preview',
  gemini3ProImage: 'gemini-3-pro-image-preview',
  gemini3Flash: 'gemini-3-flash-preview',
  // Gemini 2.5 (Stable)
  gemini25Pro: 'gemini-2.5-pro',
  gemini25Flash: 'gemini-2.5-flash',
  gemini25FlashLite: 'gemini-2.5-flash-lite',
  // Specialized
  gemini25FlashTTS: 'gemini-2.5-flash-preview-tts',
  gemini25FlashAudio: 'gemini-2.5-flash-native-audio-preview-12-2025',
  // Previous generation
  gemini2Flash: 'gemini-2.0-flash',
} as const;
```

```typescript
import { GoogleGenerativeAI } from '@google/generative-ai';

const genAI = new GoogleGenerativeAI(process.env.GOOGLE_API_KEY);
const model = genAI.getGenerativeModel({ model: 'gemini-2.5-flash' });

const result = await model.generateContent('Hello!');
const response = result.response.text();

// With vision
const visionModel = genAI.getGenerativeModel({ model: 'gemini-2.5-pro' });
const imagePart = {
  inlineData: {
    data: base64Image,
    mimeType: 'image/jpeg',
  },
};
const visionResult = await visionModel.generateContent(['Describe this:', imagePart]);
```
```
gemini-3-pro-preview
├── Best for: Top multimodal quality (Google bills it as the "best model in the world for multimodal")
├── Context: 2M tokens
├── Cost: Premium
└── Use when: Need absolute best quality

gemini-2.5-pro
├── Best for: State-of-the-art thinking, complex tasks
├── Context: 2M tokens
├── Cost: $1.25/$5 per 1M tokens
└── Use when: Long context, complex reasoning

gemini-2.5-flash
├── Best for: Fast, balanced performance
├── Context: 1M tokens
├── Cost: $0.075/$0.30 per 1M tokens
└── Use when: Speed and cost matter

gemini-2.5-flash-lite
├── Best for: Ultra-fast, lowest cost
├── Context: 1M tokens
├── Cost: $0.04/$0.15 per 1M tokens
└── Use when: High volume, simple tasks
```
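Even with 1M-2M token windows, very large corpora still need splitting, and smaller chunks let you parallelize calls. A minimal chunking sketch using the same ~4 chars/token heuristic (naive character slicing; a production version would split on paragraph or sentence boundaries):

```typescript
// Split text into chunks that each fit under a token budget.
// Assumes ~4 characters per token, which is a rough heuristic.
export function chunkByTokens(text: string, maxTokens: number): string[] {
  const maxChars = maxTokens * 4;
  const chunks: string[] = [];
  for (let i = 0; i < text.length; i += maxChars) {
    chunks.push(text.slice(i, i + maxChars));
  }
  return chunks;
}
```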
```typescript
const ELEVENLABS_MODELS = {
  // Latest - highest quality (alpha)
  v3: 'eleven_v3',
  // Production ready
  multilingualV2: 'eleven_multilingual_v2',
  turboV2_5: 'eleven_turbo_v2_5',
  // Ultra-low latency
  flashV2_5: 'eleven_flash_v2_5',
  flashV2: 'eleven_flash_v2', // English only
} as const;
```

```typescript
import { ElevenLabsClient } from 'elevenlabs';

const elevenlabs = new ElevenLabsClient({
  apiKey: process.env.ELEVENLABS_API_KEY,
});

// Text to speech
const audio = await elevenlabs.textToSpeech.convert('voice-id', {
  text: 'Hello, world!',
  model_id: 'eleven_turbo_v2_5',
  voice_settings: {
    stability: 0.5,
    similarity_boost: 0.75,
  },
});

// Stream audio (for real-time)
const audioStream = await elevenlabs.textToSpeech.convertAsStream('voice-id', {
  text: 'Streaming audio...',
  model_id: 'eleven_flash_v2_5',
});
```
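The three ElevenLabs tiers trade latency against quality, so model choice can be driven by your latency budget. A hypothetical helper (the thresholds encode the tiers documented here, not any official API):

```typescript
// Pick an ElevenLabs model from a latency budget in milliseconds,
// following the tier descriptions in this doc.
export function pickTTSModel(latencyBudgetMs: number): string {
  if (latencyBudgetMs < 100) return 'eleven_flash_v2_5'; // <75ms, real-time agents
  if (latencyBudgetMs < 500) return 'eleven_turbo_v2_5'; // ~250-300ms, balanced
  return 'eleven_v3';                                    // ~1s+, quality-first
}
```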
```
eleven_v3 (Alpha)
├── Best for: Highest quality, emotional range
├── Latency: ~1s+ (not for real-time)
├── Languages: 74
└── Use when: Quality over speed, pre-rendered

eleven_turbo_v2_5
├── Best for: Balanced quality and speed
├── Latency: ~250-300ms
├── Languages: 32
└── Use when: Good quality with reasonable latency

eleven_flash_v2_5
├── Best for: Real-time, conversational AI
├── Latency: <75ms
├── Languages: 32
└── Use when: Live voice agents, chatbots
```
```typescript
const REPLICATE_MODELS = {
  // FLUX.2 (Latest - November 2025)
  flux2Pro: 'black-forest-labs/flux-2-pro',
  flux2Flex: 'black-forest-labs/flux-2-flex',
  flux2Dev: 'black-forest-labs/flux-2-dev',
  // FLUX.1 (Still excellent)
  flux11Pro: 'black-forest-labs/flux-1.1-pro',
  fluxKontext: 'black-forest-labs/flux-kontext', // Image editing
  fluxSchnell: 'black-forest-labs/flux-schnell',
  // Video
  stableVideo4D: 'stability-ai/sv4d-2.0',
  // Audio
  musicgen: 'meta/musicgen',
  // LLMs (if needed outside main providers)
  llama: 'meta/llama-3.2-90b-vision',
} as const;
```

```typescript
import Replicate from 'replicate';

const replicate = new Replicate({
  auth: process.env.REPLICATE_API_TOKEN,
});

// Image generation with FLUX.2
const output = await replicate.run('black-forest-labs/flux-2-pro', {
  input: {
    prompt: 'A serene mountain landscape at sunset',
    aspect_ratio: '16:9',
    output_format: 'webp',
  },
});

// Image editing with Kontext
const edited = await replicate.run('black-forest-labs/flux-kontext', {
  input: {
    image: 'https://...',
    prompt: 'Change the sky to sunset colors',
  },
});
```
```
flux-2-pro
├── Best for: Highest quality, up to 4MP
├── Speed: ~6s
├── Cost: $0.015 + per megapixel
└── Use when: Professional quality needed

flux-2-flex
├── Best for: Fine details, typography
├── Speed: ~22s
├── Cost: $0.06 per megapixel
└── Use when: Need precise control

flux-2-dev (Open source)
├── Best for: Fast generation
├── Speed: ~2.5s
├── Cost: $0.012 per megapixel
└── Use when: Speed over quality

flux-kontext
├── Best for: Image editing with text
├── Speed: Variable
├── Cost: Per run
└── Use when: Edit existing images
```
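Since FLUX pricing is per megapixel, output resolution drives cost. A sketch that estimates cost from dimensions; the rates are copied from the tiers above and may drift, so verify against Replicate's pricing page before budgeting:

```typescript
// Per-megapixel rates as listed in this doc (hand-maintained assumptions).
const FLUX_RATE_PER_MP: Record<string, number> = {
  'flux-2-flex': 0.06,
  'flux-2-dev': 0.012,
};

// Estimated USD cost for one generation at the given output size.
export function estimateFluxCost(model: string, width: number, height: number): number {
  const rate = FLUX_RATE_PER_MP[model];
  if (rate === undefined) throw new Error(`no per-MP rate for: ${model}`);
  const megapixels = (width * height) / 1_000_000;
  return rate * megapixels;
}
```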
```typescript
const STABILITY_MODELS = {
  // Image generation
  sd35Large: 'sd3.5-large',
  sd35LargeTurbo: 'sd3.5-large-turbo',
  sd3Medium: 'sd3-medium',
  // Video
  sv4d: 'sv4d-2.0', // Stable Video 4D 2.0
  // Upscaling
  upscale: 'esrgan-v1-x2plus',
} as const;
```

```typescript
const response = await fetch(
  'https://api.stability.ai/v2beta/stable-image/generate/sd3',
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      Authorization: `Bearer ${process.env.STABILITY_API_KEY}`,
    },
    body: JSON.stringify({
      prompt: 'A futuristic city at night',
      output_format: 'webp',
      aspect_ratio: '16:9',
      model: 'sd3.5-large',
    }),
  }
);
```
```typescript
const MISTRAL_MODELS = {
  // Flagship
  large: 'mistral-large-latest', // Points to 2411
  // Medium tier
  medium: 'mistral-medium-2505', // Medium 3
  // Small/Fast
  small: 'mistral-small-2506', // Small 3.2
  // Code specialized
  codestral: 'codestral-2508',
  devstral: 'devstral-medium-2507',
  // Reasoning (Magistral)
  magistralMedium: 'magistral-medium-2507',
  magistralSmall: 'magistral-small-2507',
  // Audio
  voxtral: 'voxtral-small-2507',
  // OCR
  ocr: 'mistral-ocr-2505',
} as const;
```

```typescript
import MistralClient from '@mistralai/mistralai';

const client = new MistralClient(process.env.MISTRAL_API_KEY);

const response = await client.chat({
  model: 'mistral-large-latest',
  messages: [{ role: 'user', content: 'Hello!' }],
});

// Code completion with Codestral
const codeResponse = await client.chat({
  model: 'codestral-2508',
  messages: [{ role: 'user', content: 'Write a Python function to...' }],
});
```
```
mistral-large-latest (123B params)
├── Best for: Complex reasoning, knowledge tasks
├── Context: 128K tokens
└── Use when: Need high capability

codestral-2508
├── Best for: Code generation, 80+ languages
├── Speed: 2.5x faster than predecessor
└── Use when: Code-focused tasks

magistral-medium-2507
├── Best for: Multi-step reasoning
├── Specialty: Transparent chain-of-thought
└── Use when: Need reasoning traces
```
```typescript
const VOYAGE_MODELS = {
  // General purpose
  large2: 'voyage-large-2',
  large2Instruct: 'voyage-large-2-instruct',
  // Code specialized
  code2: 'voyage-code-2',
  code3: 'voyage-code-3',
  // Multilingual
  multilingual2: 'voyage-multilingual-2',
  // Domain specific
  law2: 'voyage-law-2',
  finance2: 'voyage-finance-2',
} as const;
```

```typescript
const response = await fetch('https://api.voyageai.com/v1/embeddings', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    Authorization: `Bearer ${process.env.VOYAGE_API_KEY}`,
  },
  body: JSON.stringify({
    model: 'voyage-code-3',
    input: ['Your code to embed'],
  }),
});

const { data } = await response.json();
const embedding = data[0].embedding;
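Embedding APIs accept a list of inputs per request but cap the batch size, so large corpora need batching before being sent. A generic sketch; the default of 128 is an assumption for illustration, not Voyage's documented limit, so check their API reference for the real per-request cap:

```typescript
// Split a list of texts (or anything else) into fixed-size batches.
// batchSize default of 128 is an assumed per-request limit; verify it.
export function toBatches<T>(items: T[], batchSize = 128): T[][] {
  if (batchSize < 1) throw new Error('batchSize must be >= 1');
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += batchSize) {
    batches.push(items.slice(i, i + batchSize));
  }
  return batches;
}
```

Each batch then becomes one `input` array in the embeddings request above.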
Input price per 1M tokens:

| Provider | Cheap | Mid | Premium |
|---|---|---|---|
| Anthropic | $0.25 (Haiku) | $3 (Sonnet 4.5) | $5 (Opus 4.5) |
| OpenAI | $0.15 (4.1-nano) | $2 (4.1) | $15+ (o3) |
| Google | $0.04 (Flash-lite) | $0.08 (Flash) | $1.25 (Pro) |
| Mistral | $0.25 (Small) | $2.70 (Medium) | $8 (Large) |
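The table translates directly into a per-request cost estimate. A sketch using input prices copied from this doc (output prices differ per model; the per-model trees list both, and all figures here are hand-maintained assumptions that drift as providers reprice):

```typescript
// Input price in USD per 1M tokens, from the comparison table in this doc.
const INPUT_PRICE_PER_1M: Record<string, number> = {
  'claude-haiku-3-5-20241022': 0.25,
  'claude-sonnet-4-5-20250929': 3,
  'claude-opus-4-5-20251101': 5,
  'gemini-2.5-flash-lite': 0.04,
};

// Estimated input-side cost of one request.
export function inputCostUSD(model: string, inputTokens: number): number {
  const price = INPUT_PRICE_PER_1M[model];
  if (price === undefined) throw new Error(`no price data for: ${model}`);
  return (inputTokens / 1_000_000) * price;
}
```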
```
Reasoning/Analysis  → Claude Opus 4.5, o3, Gemini 3 Pro
Code Generation     → Claude Sonnet 4.5, Codestral 2508, GPT-4.1
Fast Responses      → Claude Haiku, GPT-4.1-mini, Gemini Flash
Long Context        → Gemini 2.5 Pro (2M), GPT-4.1 (1M), Claude (200K)
Vision              → GPT-4.1, Claude Sonnet, Gemini 3 Pro
Embeddings          → Voyage code-3, text-embedding-3-small
Voice Synthesis     → Eleven Labs v3/flash, OpenAI TTS
Image Generation    → FLUX.2 Pro, DALL-E 3, SD 3.5
Video Generation    → Stable Video 4D 2.0, Runway
Image Editing       → FLUX Kontext, gpt-image-1
```
```bash
# .env.example (NEVER commit actual keys)

# LLMs
ANTHROPIC_API_KEY=sk-ant-...
OPENAI_API_KEY=sk-...
GOOGLE_API_KEY=AI...
MISTRAL_API_KEY=...

# Media
ELEVENLABS_API_KEY=...
REPLICATE_API_TOKEN=r8_...
STABILITY_API_KEY=sk-...

# Embeddings
VOYAGE_API_KEY=pa-...
```
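A missing key usually surfaces as an opaque 401 deep inside a request, so it can help to fail fast at startup instead. A minimal sketch (the function name is illustrative, not from any library); in production you would pass `process.env` and the key names from `.env.example` above:

```typescript
// Throw at startup if any required API key is missing or empty.
export function requireKeys(
  env: Record<string, string | undefined>,
  keys: string[],
): void {
  const missing = keys.filter((k) => !env[k]);
  if (missing.length > 0) {
    throw new Error(`Missing required API keys: ${missing.join(', ')}`);
  }
}
```

Usage: `requireKeys(process.env, ['ANTHROPIC_API_KEY', 'OPENAI_API_KEY']);`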
When models update:

```
□ Check official changelog/blog
□ Update model ID strings
□ Test with existing prompts
□ Compare output quality
□ Check pricing changes
□ Update context limits if changed
```
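The "update model ID strings" step can be partly automated by diffing your pinned IDs against a provider's live model list (for OpenAI, the IDs returned by `GET /v1/models`; other providers have similar listings). A pure comparison sketch; fetching the live list is left to the caller:

```typescript
// Return pinned model IDs that no longer appear in the provider's live list.
export function findStaleModels(pinned: string[], available: string[]): string[] {
  const live = new Set(available);
  return pinned.filter((id) => !live.has(id));
}
```

Run it in CI against each provider so a deprecated ID fails a check instead of a production request.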
Weekly Installs: 106
GitHub Stars: 538
First Seen: Jan 20, 2026
Security Audits: Gen Agent Trust Hub: Pass, Socket: Pass, Snyk: Pass
Installed on: gemini-cli (82), opencode (82), cursor (79), codex (77), claude-code (76), github-copilot (70)