ai-models by alinaqi/claude-bootstrap
npx skills add https://github.com/alinaqi/claude-bootstrap --skill ai-models
Load with: base.md + llm-patterns.md
Last Updated: December 2025
Use the right model for the job. Bigger isn't always better - match model capabilities to task requirements. Consider cost, latency, and accuracy tradeoffs.
| Task | Recommended | Why |
|---|---|---|
| Complex reasoning | Claude Opus 4.5, o3, Gemini 3 Pro | Highest accuracy |
| Fast chat/completion | Claude Haiku, GPT-4.1 mini, Gemini Flash | Low latency, cheap |
| Code generation | Claude Sonnet 4.5, Codestral, GPT-4.1 | Strong coding |
| Vision/images | Claude Sonnet, GPT-4o, Gemini 3 Pro | Multimodal |
| Embeddings | text-embedding-3-small, Voyage | Cost-effective |
| Voice synthesis | Eleven Labs v3, OpenAI TTS | Natural sounding |
| Image generation | FLUX.2, DALL-E 3, SD 3.5 | Different styles |
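The task table above can be sketched as a small router. This is a hypothetical helper (not part of the skill itself), picking one reasonable default per task from the model IDs documented below:

```typescript
// Hypothetical task-to-model router; defaults taken from the tables in this doc.
type Task = 'reasoning' | 'chat' | 'code' | 'vision' | 'embeddings';

const DEFAULT_MODEL: Record<Task, string> = {
  reasoning: 'claude-opus-4-5-20251101',
  chat: 'claude-haiku-3-5-20241022',
  code: 'claude-sonnet-4-5-20250929',
  vision: 'gpt-4.1',
  embeddings: 'text-embedding-3-small',
};

export function pickModel(task: Task): string {
  return DEFAULT_MODEL[task];
}
```

Swap the defaults for your own preferences; the point is to centralize model choice in one place so upgrades touch a single map.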
```typescript
const CLAUDE_MODELS = {
  // Flagship - highest capability
  opus: 'claude-opus-4-5-20251101',
  // Balanced - best for most tasks
  sonnet: 'claude-sonnet-4-5-20250929',
  // Previous generation (still excellent)
  opus4: 'claude-opus-4-20250514',
  sonnet4: 'claude-sonnet-4-20250514',
  // Fast & cheap - high volume tasks
  haiku: 'claude-haiku-3-5-20241022',
} as const;
```

```typescript
import Anthropic from '@anthropic-ai/sdk';

const anthropic = new Anthropic({
  apiKey: process.env.ANTHROPIC_API_KEY,
});

const response = await anthropic.messages.create({
  model: 'claude-sonnet-4-5-20250929',
  max_tokens: 1024,
  messages: [
    { role: 'user', content: 'Hello, Claude!' }
  ],
});
```
```
claude-opus-4-5-20251101 (Opus 4.5)
├── Best for: Complex analysis, research, nuanced writing
├── Context: 200K tokens
├── Cost: $5/$25 per 1M tokens (input/output)
└── Use when: Accuracy matters most

claude-sonnet-4-5-20250929 (Sonnet 4.5)
├── Best for: Code, general tasks, balanced performance
├── Context: 200K tokens
├── Cost: $3/$15 per 1M tokens
└── Use when: Default choice for most applications

claude-haiku-3-5-20241022 (Haiku 3.5)
├── Best for: Classification, extraction, high-volume
├── Context: 200K tokens
├── Cost: $0.25/$1.25 per 1M tokens
└── Use when: Speed and cost matter most
```
```typescript
const OPENAI_MODELS = {
  // GPT-5 series (latest)
  gpt5: 'gpt-5.2',
  gpt5Mini: 'gpt-5-mini',
  // GPT-4.1 series (recommended for most)
  gpt41: 'gpt-4.1',
  gpt41Mini: 'gpt-4.1-mini',
  gpt41Nano: 'gpt-4.1-nano',
  // Reasoning models (o-series)
  o3: 'o3',
  o3Pro: 'o3-pro',
  o4Mini: 'o4-mini',
  // Legacy but still useful
  gpt4o: 'gpt-4o', // Still has audio support
  gpt4oMini: 'gpt-4o-mini',
  // Embeddings
  embeddingSmall: 'text-embedding-3-small',
  embeddingLarge: 'text-embedding-3-large',
  // Image generation
  dalle3: 'dall-e-3',
  gptImage: 'gpt-image-1',
  // Audio
  tts: 'tts-1',
  ttsHd: 'tts-1-hd',
  whisper: 'whisper-1',
} as const;
```

```typescript
import OpenAI from 'openai';

const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
});

// Chat completion
const response = await openai.chat.completions.create({
  model: 'gpt-4.1',
  messages: [
    { role: 'user', content: 'Hello!' }
  ],
});

// With vision
const visionResponse = await openai.chat.completions.create({
  model: 'gpt-4.1',
  messages: [
    {
      role: 'user',
      content: [
        { type: 'text', text: 'What is in this image?' },
        { type: 'image_url', image_url: { url: 'https://...' } },
      ],
    },
  ],
});

// Embeddings
const embedding = await openai.embeddings.create({
  model: 'text-embedding-3-small',
  input: 'Your text here',
});
```
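Once you have embedding vectors, a common next step is comparing them. A minimal sketch, pure math with no API calls (the vectors would come from the embeddings response's `data[i].embedding` arrays):

```typescript
// Cosine similarity between two embedding vectors: 1 = identical direction,
// 0 = orthogonal (unrelated), -1 = opposite.
export function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error('dimension mismatch');
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```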
```
o3 / o3-pro
├── Best for: Math, coding, complex multi-step reasoning
├── Context: 200K tokens
├── Cost: Premium pricing
└── Use when: Hardest problems, need chain-of-thought

gpt-4.1
├── Best for: General tasks, coding, instruction following
├── Context: 1M tokens (!)
├── Cost: Lower than GPT-4o
└── Use when: Default choice, replaces GPT-4o

gpt-4.1-mini / gpt-4.1-nano
├── Best for: High-volume, cost-sensitive
├── Context: 1M tokens
├── Cost: Very low
└── Use when: Simple tasks at scale

o4-mini
├── Best for: Fast reasoning at low cost
├── Context: 200K tokens
├── Cost: Budget reasoning
└── Use when: Need reasoning but cost-conscious
```
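The context limits above are worth checking before a call. A rough pre-flight sketch using the common (but inexact) ~4 characters per token heuristic; the limit map is hand-copied from the tiers above, not fetched from any API:

```typescript
// Context windows as listed in this doc (hand-maintained, may drift).
const CONTEXT_LIMITS: Record<string, number> = {
  'gpt-4.1': 1_000_000,
  'gpt-4.1-mini': 1_000_000,
  'o3': 200_000,
  'o4-mini': 200_000,
};

// Crude estimate: ~4 characters per token. Use a real tokenizer for accuracy.
export function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Does the prompt fit, leaving room for the response?
export function fitsContext(model: string, text: string, reservedOutput = 4096): boolean {
  const limit = CONTEXT_LIMITS[model];
  if (limit === undefined) throw new Error(`unknown model: ${model}`);
  return estimateTokens(text) + reservedOutput <= limit;
}
```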
```typescript
const GEMINI_MODELS = {
  // Gemini 3 (Latest)
  gemini3Pro: 'gemini-3-pro-preview',
  gemini3ProImage: 'gemini-3-pro-image-preview',
  gemini3Flash: 'gemini-3-flash-preview',
  // Gemini 2.5 (Stable)
  gemini25Pro: 'gemini-2.5-pro',
  gemini25Flash: 'gemini-2.5-flash',
  gemini25FlashLite: 'gemini-2.5-flash-lite',
  // Specialized
  gemini25FlashTTS: 'gemini-2.5-flash-preview-tts',
  gemini25FlashAudio: 'gemini-2.5-flash-native-audio-preview-12-2025',
  // Previous generation
  gemini2Flash: 'gemini-2.0-flash',
} as const;
```

```typescript
import { GoogleGenerativeAI } from '@google/generative-ai';

const genAI = new GoogleGenerativeAI(process.env.GOOGLE_API_KEY);
const model = genAI.getGenerativeModel({ model: 'gemini-2.5-flash' });

const result = await model.generateContent('Hello!');
const response = result.response.text();

// With vision
const visionModel = genAI.getGenerativeModel({ model: 'gemini-2.5-pro' });
const imagePart = {
  inlineData: {
    data: base64Image,
    mimeType: 'image/jpeg',
  },
};
const visionResult = await visionModel.generateContent(['Describe this:', imagePart]);
```
```
gemini-3-pro-preview
├── Best for: Top multimodal quality (Google bills it as the "best model in the world for multimodal")
├── Context: 2M tokens
├── Cost: Premium
└── Use when: Need absolute best quality

gemini-2.5-pro
├── Best for: State-of-the-art thinking, complex tasks
├── Context: 2M tokens
├── Cost: $1.25/$5 per 1M tokens
└── Use when: Long context, complex reasoning

gemini-2.5-flash
├── Best for: Fast, balanced performance
├── Context: 1M tokens
├── Cost: $0.075/$0.30 per 1M tokens
└── Use when: Speed and cost matter

gemini-2.5-flash-lite
├── Best for: Ultra-fast, lowest cost
├── Context: 1M tokens
├── Cost: $0.04/$0.15 per 1M tokens
└── Use when: High volume, simple tasks
```
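Even with 1M-2M token windows, very large corpora still need splitting, and smaller chunks let you parallelize calls. A minimal chunking sketch using the same ~4 chars/token heuristic (naive character slicing; a production version would split on paragraph or sentence boundaries):

```typescript
// Split text into chunks that each fit under a token budget.
// Assumes ~4 characters per token, which is a rough heuristic.
export function chunkByTokens(text: string, maxTokens: number): string[] {
  const maxChars = maxTokens * 4;
  const chunks: string[] = [];
  for (let i = 0; i < text.length; i += maxChars) {
    chunks.push(text.slice(i, i + maxChars));
  }
  return chunks;
}
```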
```typescript
const ELEVENLABS_MODELS = {
  // Latest - highest quality (alpha)
  v3: 'eleven_v3',
  // Production ready
  multilingualV2: 'eleven_multilingual_v2',
  turboV2_5: 'eleven_turbo_v2_5',
  // Ultra-low latency
  flashV2_5: 'eleven_flash_v2_5',
  flashV2: 'eleven_flash_v2', // English only
} as const;
```

```typescript
import { ElevenLabsClient } from 'elevenlabs';

const elevenlabs = new ElevenLabsClient({
  apiKey: process.env.ELEVENLABS_API_KEY,
});

// Text to speech
const audio = await elevenlabs.textToSpeech.convert('voice-id', {
  text: 'Hello, world!',
  model_id: 'eleven_turbo_v2_5',
  voice_settings: {
    stability: 0.5,
    similarity_boost: 0.75,
  },
});

// Stream audio (for real-time)
const audioStream = await elevenlabs.textToSpeech.convertAsStream('voice-id', {
  text: 'Streaming audio...',
  model_id: 'eleven_flash_v2_5',
});
```
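The three ElevenLabs tiers trade latency against quality, so model choice can be driven by your latency budget. A hypothetical helper (the thresholds encode the tiers documented here, not any official API):

```typescript
// Pick an ElevenLabs model from a latency budget in milliseconds,
// following the tier descriptions in this doc.
export function pickTTSModel(latencyBudgetMs: number): string {
  if (latencyBudgetMs < 100) return 'eleven_flash_v2_5'; // <75ms, real-time agents
  if (latencyBudgetMs < 500) return 'eleven_turbo_v2_5'; // ~250-300ms, balanced
  return 'eleven_v3';                                    // ~1s+, quality-first
}
```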
```
eleven_v3 (Alpha)
├── Best for: Highest quality, emotional range
├── Latency: ~1s+ (not for real-time)
├── Languages: 74
└── Use when: Quality over speed, pre-rendered

eleven_turbo_v2_5
├── Best for: Balanced quality and speed
├── Latency: ~250-300ms
├── Languages: 32
└── Use when: Good quality with reasonable latency

eleven_flash_v2_5
├── Best for: Real-time, conversational AI
├── Latency: <75ms
├── Languages: 32
└── Use when: Live voice agents, chatbots
```
```typescript
const REPLICATE_MODELS = {
  // FLUX.2 (Latest - November 2025)
  flux2Pro: 'black-forest-labs/flux-2-pro',
  flux2Flex: 'black-forest-labs/flux-2-flex',
  flux2Dev: 'black-forest-labs/flux-2-dev',
  // FLUX.1 (Still excellent)
  flux11Pro: 'black-forest-labs/flux-1.1-pro',
  fluxKontext: 'black-forest-labs/flux-kontext', // Image editing
  fluxSchnell: 'black-forest-labs/flux-schnell',
  // Video
  stableVideo4D: 'stability-ai/sv4d-2.0',
  // Audio
  musicgen: 'meta/musicgen',
  // LLMs (if needed outside main providers)
  llama: 'meta/llama-3.2-90b-vision',
} as const;
```

```typescript
import Replicate from 'replicate';

const replicate = new Replicate({
  auth: process.env.REPLICATE_API_TOKEN,
});

// Image generation with FLUX.2
const output = await replicate.run('black-forest-labs/flux-2-pro', {
  input: {
    prompt: 'A serene mountain landscape at sunset',
    aspect_ratio: '16:9',
    output_format: 'webp',
  },
});

// Image editing with Kontext
const edited = await replicate.run('black-forest-labs/flux-kontext', {
  input: {
    image: 'https://...',
    prompt: 'Change the sky to sunset colors',
  },
});
```
```
flux-2-pro
├── Best for: Highest quality, up to 4MP
├── Speed: ~6s
├── Cost: $0.015 + per megapixel
└── Use when: Professional quality needed

flux-2-flex
├── Best for: Fine details, typography
├── Speed: ~22s
├── Cost: $0.06 per megapixel
└── Use when: Need precise control

flux-2-dev (Open source)
├── Best for: Fast generation
├── Speed: ~2.5s
├── Cost: $0.012 per megapixel
└── Use when: Speed over quality

flux-kontext
├── Best for: Image editing with text
├── Speed: Variable
├── Cost: Per run
└── Use when: Edit existing images
```
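Since FLUX pricing is per megapixel, output resolution drives cost. A sketch that estimates cost from dimensions; the rates are copied from the tiers above and may drift, so verify against Replicate's pricing page before budgeting:

```typescript
// Per-megapixel rates as listed in this doc (hand-maintained assumptions).
const FLUX_RATE_PER_MP: Record<string, number> = {
  'flux-2-flex': 0.06,
  'flux-2-dev': 0.012,
};

// Estimated USD cost for one generation at the given output size.
export function estimateFluxCost(model: string, width: number, height: number): number {
  const rate = FLUX_RATE_PER_MP[model];
  if (rate === undefined) throw new Error(`no per-MP rate for: ${model}`);
  const megapixels = (width * height) / 1_000_000;
  return rate * megapixels;
}
```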
```typescript
const STABILITY_MODELS = {
  // Image generation
  sd35Large: 'sd3.5-large',
  sd35LargeTurbo: 'sd3.5-large-turbo',
  sd3Medium: 'sd3-medium',
  // Video
  sv4d: 'sv4d-2.0', // Stable Video 4D 2.0
  // Upscaling
  upscale: 'esrgan-v1-x2plus',
} as const;
```

```typescript
const response = await fetch(
  'https://api.stability.ai/v2beta/stable-image/generate/sd3',
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      Authorization: `Bearer ${process.env.STABILITY_API_KEY}`,
    },
    body: JSON.stringify({
      prompt: 'A futuristic city at night',
      output_format: 'webp',
      aspect_ratio: '16:9',
      model: 'sd3.5-large',
    }),
  }
);
```
```typescript
const MISTRAL_MODELS = {
  // Flagship
  large: 'mistral-large-latest', // Points to 2411
  // Medium tier
  medium: 'mistral-medium-2505', // Medium 3
  // Small/Fast
  small: 'mistral-small-2506', // Small 3.2
  // Code specialized
  codestral: 'codestral-2508',
  devstral: 'devstral-medium-2507',
  // Reasoning (Magistral)
  magistralMedium: 'magistral-medium-2507',
  magistralSmall: 'magistral-small-2507',
  // Audio
  voxtral: 'voxtral-small-2507',
  // OCR
  ocr: 'mistral-ocr-2505',
} as const;
```

```typescript
import MistralClient from '@mistralai/mistralai';

const client = new MistralClient(process.env.MISTRAL_API_KEY);

const response = await client.chat({
  model: 'mistral-large-latest',
  messages: [{ role: 'user', content: 'Hello!' }],
});

// Code completion with Codestral
const codeResponse = await client.chat({
  model: 'codestral-2508',
  messages: [{ role: 'user', content: 'Write a Python function to...' }],
});
```
```
mistral-large-latest (123B params)
├── Best for: Complex reasoning, knowledge tasks
├── Context: 128K tokens
└── Use when: Need high capability

codestral-2508
├── Best for: Code generation, 80+ languages
├── Speed: 2.5x faster than predecessor
└── Use when: Code-focused tasks

magistral-medium-2507
├── Best for: Multi-step reasoning
├── Specialty: Transparent chain-of-thought
└── Use when: Need reasoning traces
```
```typescript
const VOYAGE_MODELS = {
  // General purpose
  large2: 'voyage-large-2',
  large2Instruct: 'voyage-large-2-instruct',
  // Code specialized
  code2: 'voyage-code-2',
  code3: 'voyage-code-3',
  // Multilingual
  multilingual2: 'voyage-multilingual-2',
  // Domain specific
  law2: 'voyage-law-2',
  finance2: 'voyage-finance-2',
} as const;
```

```typescript
const response = await fetch('https://api.voyageai.com/v1/embeddings', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    Authorization: `Bearer ${process.env.VOYAGE_API_KEY}`,
  },
  body: JSON.stringify({
    model: 'voyage-code-3',
    input: ['Your code to embed'],
  }),
});

const { data } = await response.json();
const embedding = data[0].embedding;
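Embedding APIs accept a list of inputs per request but cap the batch size, so large corpora need batching before being sent. A generic sketch; the default of 128 is an assumption for illustration, not Voyage's documented limit, so check their API reference for the real per-request cap:

```typescript
// Split a list of texts (or anything else) into fixed-size batches.
// batchSize default of 128 is an assumed per-request limit; verify it.
export function toBatches<T>(items: T[], batchSize = 128): T[][] {
  if (batchSize < 1) throw new Error('batchSize must be >= 1');
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += batchSize) {
    batches.push(items.slice(i, i + batchSize));
  }
  return batches;
}
```

Each batch then becomes one `input` array in the embeddings request above.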
Input price per 1M tokens:

| Provider | Cheap | Mid | Premium |
|---|---|---|---|
| Anthropic | $0.25 (Haiku) | $3 (Sonnet 4.5) | $5 (Opus 4.5) |
| OpenAI | $0.15 (4.1-nano) | $2 (4.1) | $15+ (o3) |
| Google | $0.04 (Flash-lite) | $0.08 (Flash) | $1.25 (Pro) |
| Mistral | $0.25 (Small) | $2.70 (Medium) | $8 (Large) |
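The table translates directly into a per-request cost estimate. A sketch using input prices copied from this doc (output prices differ per model; the per-model trees list both, and all figures here are hand-maintained assumptions that drift as providers reprice):

```typescript
// Input price in USD per 1M tokens, from the comparison table in this doc.
const INPUT_PRICE_PER_1M: Record<string, number> = {
  'claude-haiku-3-5-20241022': 0.25,
  'claude-sonnet-4-5-20250929': 3,
  'claude-opus-4-5-20251101': 5,
  'gemini-2.5-flash-lite': 0.04,
};

// Estimated input-side cost of one request.
export function inputCostUSD(model: string, inputTokens: number): number {
  const price = INPUT_PRICE_PER_1M[model];
  if (price === undefined) throw new Error(`no price data for: ${model}`);
  return (inputTokens / 1_000_000) * price;
}
```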
```
Reasoning/Analysis  → Claude Opus 4.5, o3, Gemini 3 Pro
Code Generation     → Claude Sonnet 4.5, Codestral 2508, GPT-4.1
Fast Responses      → Claude Haiku, GPT-4.1-mini, Gemini Flash
Long Context        → Gemini 2.5 Pro (2M), GPT-4.1 (1M), Claude (200K)
Vision              → GPT-4.1, Claude Sonnet, Gemini 3 Pro
Embeddings          → Voyage code-3, text-embedding-3-small
Voice Synthesis     → Eleven Labs v3/flash, OpenAI TTS
Image Generation    → FLUX.2 Pro, DALL-E 3, SD 3.5
Video Generation    → Stable Video 4D 2.0, Runway
Image Editing       → FLUX Kontext, gpt-image-1
```
```bash
# .env.example (NEVER commit actual keys)

# LLMs
ANTHROPIC_API_KEY=sk-ant-...
OPENAI_API_KEY=sk-...
GOOGLE_API_KEY=AI...
MISTRAL_API_KEY=...

# Media
ELEVENLABS_API_KEY=...
REPLICATE_API_TOKEN=r8_...
STABILITY_API_KEY=sk-...

# Embeddings
VOYAGE_API_KEY=pa-...
```
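A missing key usually surfaces as an opaque 401 deep inside a request, so it can help to fail fast at startup instead. A minimal sketch (the function name is illustrative, not from any library); in production you would pass `process.env` and the key names from `.env.example` above:

```typescript
// Throw at startup if any required API key is missing or empty.
export function requireKeys(
  env: Record<string, string | undefined>,
  keys: string[],
): void {
  const missing = keys.filter((k) => !env[k]);
  if (missing.length > 0) {
    throw new Error(`Missing required API keys: ${missing.join(', ')}`);
  }
}
```

Usage: `requireKeys(process.env, ['ANTHROPIC_API_KEY', 'OPENAI_API_KEY']);`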
When models update:

```
□ Check official changelog/blog
□ Update model ID strings
□ Test with existing prompts
□ Compare output quality
□ Check pricing changes
□ Update context limits if changed
```
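The "update model ID strings" step can be partly automated by diffing your pinned IDs against a provider's live model list (for OpenAI, the IDs returned by `GET /v1/models`; other providers have similar listings). A pure comparison sketch; fetching the live list is left to the caller:

```typescript
// Return pinned model IDs that no longer appear in the provider's live list.
export function findStaleModels(pinned: string[], available: string[]): string[] {
  const live = new Set(available);
  return pinned.filter((id) => !live.has(id));
}
```

Run it in CI against each provider so a deprecated ID fails a check instead of a production request.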
Weekly Installs: 106
GitHub Stars: 538
First Seen: Jan 20, 2026
Security Audits: Gen Agent Trust Hub: Pass, Socket: Pass, Snyk: Pass
Installed on: gemini-cli (82), opencode (82), cursor (79), codex (77), claude-code (76), github-copilot (70)