Google Gemini 文件搜索设置教程 - 完全托管RAG系统，支持100+格式，集成最佳实践

google-gemini-file-search by jezweb/claude-skills

328 周安装量

652 GitHub Stars

GitHub

安装命令

npx skills add https://github.com/jezweb/claude-skills --skill google-gemini-file-search

AI/机器学习 Node.js 云服务

🇨🇳中文介绍

Google Gemini 文件搜索设置

概述

Google Gemini 文件搜索是一个完全托管的 RAG 系统。上传文档（支持 100 多种格式：PDF、Word、Excel、代码等）并使用自然语言进行查询——自动分块、嵌入、语义搜索和引用。

此技能提供的内容：

完整的 @google/genai 文件搜索 API 设置
8 个有预防策略的文档化错误
优化检索效果的分块最佳实践
成本优化（索引 $0.15/100 万令牌，3 倍存储乘数）
Cloudflare Workers + Next.js 集成模板

先决条件

1. Google AI API 密钥

在 https://aistudio.google.com/apikey 创建 API 密钥

免费层限制：

1 GB 存储空间（所有文件搜索存储的总和）
每天 1,500 次请求
每分钟 100 万令牌

付费层定价：

索引：每 100 万输入令牌 $0.15（一次性）
存储：免费（第 1 层：10 GB，第 2 层：100 GB，第 3 层：1 TB）
查询时嵌入：免费（检索到的上下文计入输入令牌）

2. Node.js 环境

最低版本： Node.js 18+（推荐 v20+）

node --version  # 应 >=18.0.0

3. 安装 @google/genai SDK

npm install @google/genai
# 或
pnpm add @google/genai
# 或
yarn add @google/genai

广告位招租

在这里展示您的产品或服务

触达数万 AI 开发者，精准高效

联系我们

相关 Skills

find-skills 技能搜索工具 - Vercel Labs 开源智能体技能包管理器

762,600 周安装

React 组合模式指南：Vercel 组件架构最佳实践，提升代码可维护性

105,000 周安装

Azure RBAC 权限管理工具：查找最小角色、创建自定义角色与自动化分配

101,200 周安装

Azure Data Explorer (Kusto) 查询技能：KQL数据分析、日志遥测与时间序列处理

100,500 周安装

// ❌ 错误：尝试更新文档（无此 API）
await ai.fileSearchStores.documents.update({
  name: documentName,
  customMetadata: { version: '2.0' }
})

// ✅ 正确：先删除再重新上传
const docs = await ai.fileSearchStores.documents.list({
  parent: fileStore.name
})

const oldDoc = docs.documents.find(d => d.displayName === 'manual.pdf')
if (oldDoc) {
  await ai.fileSearchStores.documents.delete({
    name: oldDoc.name,
    force: true
  })
}

await ai.fileSearchStores.uploadToFileSearchStore({
  name: fileStore.name,
  file: fs.createReadStream('manual-v2.pdf'),
  config: { displayName: 'manual.pdf' }
})

// ❌ 错误：未经测试就使用默认值
await ai.fileSearchStores.uploadToFileSearchStore({
  name: fileStore.name,
  file: fs.createReadStream('docs.pdf')
  // 默认分块可能太大或太小
})

// ✅ 正确：为精确度配置分块
await ai.fileSearchStores.uploadToFileSearchStore({
  name: fileStore.name,
  file: fs.createReadStream('docs.pdf'),
  config: {
    chunkingConfig: {
      whiteSpaceConfig: {
        maxTokensPerChunk: 500,  // 更小的块 = 更精确的检索
        maxOverlapTokens: 50     // 10% 重叠防止上下文丢失
      }
    }
  }
})

// ❌ 错误：元数据字段太多
await ai.fileSearchStores.uploadToFileSearchStore({
  name: fileStore.name,
  file: fs.createReadStream('doc.pdf'),
  config: {
    customMetadata: {
      doc_type: 'manual',
      version: '1.0',
      author: 'John Doe',
      department: 'Engineering',
      created_date: '2025-01-01',
      // ... 还有 18 个字段 → 错误！
    }
  }
})

// ✅ 正确：使用分层键或 JSON 字符串
await ai.fileSearchStores.uploadToFileSearchStore({
  name: fileStore.name,
  file: fs.createReadStream('doc.pdf'),
  config: {
    customMetadata: {
      doc_type: 'manual',
      version: '1.0',
      author_dept: 'John Doe|Engineering',  // 合并相关字段
      dates: JSON.stringify({                // 或对复杂数据使用 JSON
        created: '2025-01-01',
        updated: '2025-01-15'
      })
    }
  }
})

// ❌ 错误：没有成本估算
await uploadAllDocuments(fileStore.name, './data') // 上传了 10 GB → $375 意外

// ✅ 正确：提前计算成本
const totalSize = getTotalDirectorySize('./data') // 10 GB
const estimatedTokens = (totalSize / 4) // 粗略估算：1 令牌 ≈ 4 字节
const indexingCost = (estimatedTokens / 1e6) * 0.15

console.log(`预计索引成本：$${indexingCost.toFixed(2)}`)
console.log(`预计存储空间：${(totalSize * 3) / 1e9} GB`)

// 继续前确认
const proceed = await confirm(`继续索引？成本：$${indexingCost.toFixed(2)}`)
if (proceed) {
  await uploadAllDocuments(fileStore.name, './data')
}

// ❌ 错误：假设上传是即时的
const operation = await ai.fileSearchStores.uploadToFileSearchStore({
  name: fileStore.name,
  file: fs.createReadStream('large.pdf')
})
// 立即查询 → 无结果！

// ✅ 正确：轮询直到索引完成，带超时
const operation = await ai.fileSearchStores.uploadToFileSearchStore({
  name: fileStore.name,
  file: fs.createReadStream('large.pdf')
})

// 带超时和回退的轮询
const MAX_POLL_TIME = 60000 // 60 秒
const POLL_INTERVAL = 1000
let elapsed = 0

while (!operation.done && elapsed < MAX_POLL_TIME) {
  await new Promise(resolve => setTimeout(resolve, POLL_INTERVAL))
  elapsed += POLL_INTERVAL

  try {
    operation = await ai.operations.get({ name: operation.name })
    console.log(`索引进度：${operation.metadata?.progress || 'processing...'}`)
  } catch (error) {
    console.warn('轮询失败，假设完成：', error)
    break
  }
}

if (operation.error) {
  throw new Error(`索引失败：${operation.error.message}`)
}

// ⚠️ 警告：对于大文件，operations.get() 可能不可靠
// 如果达到超时，手动验证文档是否存在
if (elapsed >= MAX_POLL_TIME) {
  console.warn('轮询超时 - 手动验证文档')
  const docs = await ai.fileSearchStores.documents.list({ parent: fileStore.name })
  const uploaded = docs.documents?.find(d => d.displayName === 'large.pdf')
  if (uploaded) {
    console.log('✅ 尽管轮询超时，但找到了文档')
  } else {
    throw new Error('上传失败 - 未找到文档')
  }
}

console.log('✅ 索引完成：', operation.response?.displayName)

// ❌ 错误：尝试删除包含文档的存储
await ai.fileSearchStores.delete({
  name: fileStore.name
})
// 错误：无法删除包含文档的存储

// ✅ 正确：使用强制删除
await ai.fileSearchStores.delete({
  name: fileStore.name,
  force: true  // 删除存储和所有文档
})

// 替代方案：先删除文档
const docs = await ai.fileSearchStores.documents.list({ parent: fileStore.name })
for (const doc of docs.documents || []) {
  await ai.fileSearchStores.documents.delete({
    name: doc.name,
    force: true
  })
}
await ai.fileSearchStores.delete({ name: fileStore.name })

// ❌ 错误：使用 Gemini 1.5 模型
const response = await ai.models.generateContent({
  model: 'gemini-1.5-pro',  // 不支持！
  contents: 'What is the installation procedure?',
  config: {
    tools: [{
      fileSearch: { fileSearchStoreNames: [fileStore.name] }
    }]
  }
})

// ✅ 正确：使用 Gemini 3 模型
const response = await ai.models.generateContent({
  model: 'gemini-3-flash',  // ✅ 支持（快速，成本效益高）
  // 或
  // model: 'gemini-3-pro',   // ✅ 支持（更高质量）
  contents: 'What is the installation procedure?',
  config: {
    tools: [{
      fileSearch: { fileSearchStoreNames: [fileStore.name] }
    }]
  }
})

// ✅ 正确：升级到 v1.34.0+ 以自动修复
npm install @google/genai@latest  // v1.34.0+

await ai.fileSearchStores.uploadToFileSearchStore({
  name: storeName,
  file: new Blob([arrayBuffer], { type: 'application/pdf' }),
  config: {
    displayName: 'Safety Manual.pdf',  // ✅ 现在已保留
    customMetadata: { version: '1.0' }  // ✅ 现在已保留
  }
})

// ⚠️ v1.33.0 及更早版本的解决方法：使用可恢复上传
const uploadUrl = `https://generativelanguage.googleapis.com/upload/v1beta/${storeName}:uploadToFileSearchStore?key=${API_KEY}`

// 步骤 1：在请求体中初始化并包含 displayName
const initResponse = await fetch(uploadUrl, {
  method: 'POST',
  headers: {
    'X-Goog-Upload-Protocol': 'resumable',
    'X-Goog-Upload-Command': 'start',
    'X-Goog-Upload-Header-Content-Length': numBytes.toString(),
    'X-Goog-Upload-Header-Content-Type': 'application/pdf',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    displayName: 'Safety Manual.pdf'  // ✅ 适用于可恢复上传
  })
})

// 步骤 2：上传文件字节
const uploadUrl2 = initResponse.headers.get('X-Goog-Upload-URL')
await fetch(uploadUrl2, {
  method: 'PUT',
  headers: {
    'Content-Length': numBytes.toString(),
    'X-Goog-Upload-Offset': '0',
    'X-Goog-Upload-Command': 'upload, finalize',
    'Content-Type': 'application/pdf'
  },
  body: fileBytes
})

// ❌ 错误：结构化输出覆盖了引用
const response = await ai.models.generateContent({
  model: 'gemini-3-flash',
  contents: 'Summarize guidelines',
  config: {
    responseMimeType: 'application/json',  // 丢失引用
    tools: [{ fileSearch: { fileSearchStoreNames: [storeName] } }]
  }
})

// ✅ 正确：两步法
// 步骤 1：获取有引用的文本响应
const textResponse = await ai.models.generateContent({
  model: 'gemini-3-flash',
  contents: 'Summarize guidelines',
  config: {
    tools: [{ fileSearch: { fileSearchStoreNames: [storeName] } }]
  }
})

const grounding = textResponse.candidates[0].groundingMetadata

// 步骤 2：在提示词中转换为结构化格式
const jsonResponse = await ai.models.generateContent({
  model: 'gemini-3-flash',
  contents: `Convert to JSON: ${textResponse.text}

Format:
{
  "summary": "...",
  "key_points": ["..."]
}`,
  config: {
    responseMimeType: 'application/json',
    responseSchema: {
      type: 'object',
      properties: {
        summary: { type: 'string' },
        key_points: { type: 'array', items: { type: 'string' } }
      }
    }
  }
})

// 合并结果
const result = {
  data: JSON.parse(jsonResponse.text),
  sources: grounding.groundingChunks
}

// ❌ 错误：组合搜索工具
const response = await ai.models.generateContent({
  model: 'gemini-3-flash',
  contents: 'What are the latest industry guidelines?',
  config: {
    tools: [
      { googleSearch: {} },
      { fileSearch: { fileSearchStoreNames: [storeName] } }
    ]
  }
})

// ✅ 正确：使用独立的专业代理
async function searchWeb(query: string) {
  return ai.models.generateContent({
    model: 'gemini-3-flash',
    contents: query,
    config: { tools: [{ googleSearch: {} }] }
  })
}

async function searchDocuments(query: string) {
  return ai.models.generateContent({
    model: 'gemini-3-flash',
    contents: query,
    config: { tools: [{ fileSearch: { fileSearchStoreNames: [storeName] } }] }
  })
}

// 根据查询类型编排
const needsWeb = query.includes('latest') || query.includes('current')
const response = needsWeb
  ? await searchWeb(query)
  : await searchDocuments(query)

// ❌ 错误：期望响应中包含元数据
const batchRequest = {
  metadata: { key: 'my-request-id' },
  contents: [{ parts: [{ text: 'Question?' }], role: 'user' }],
  config: {
    tools: [{ fileSearch: { fileSearchStoreNames: [storeName] } }]
  }
}

const batchResponse = await ai.batch.create({ requests: [batchRequest] })
console.log(batchResponse.responses[0].metadata)  // ❌ undefined

// ✅ 正确：使用数组索引进行关联
const requests = [
  { metadata: { id: 'req-1' }, contents: [...] },
  { metadata: { id: 'req-2' }, contents: [...] }
]

const responses = await ai.batch.create({ requests })

// 按索引映射（不理想但有效）
responses.responses.forEach((response, i) => {
  const requestMetadata = requests[i].metadata
  console.log(`Response for ${requestMetadata.id}:`, response)
})

// 列出所有存储（分页）
const stores = await ai.fileSearchStores.list({
  pageSize: 20  // 每页最多 20 个
})

// 按显示名称查找
let targetStore = null
let pageToken = null

do {
  const page = await ai.fileSearchStores.list({ pageToken })
  targetStore = page.fileSearchStores.find(
    s => s.displayName === 'my-knowledge-base'
  )
  pageToken = page.nextPageToken
} while (!targetStore && pageToken)

if (targetStore) {
  console.log('找到现有存储：', targetStore.name)
} else {
  console.log('未找到存储，正在创建新的...')
}

const operation = await ai.fileSearchStores.uploadToFileSearchStore({
  name: fileStore.name,
  file: fs.createReadStream('./docs/manual.pdf'),
  config: {
    displayName: 'Installation Manual',
    customMetadata: {
      doc_type: 'manual',
      version: '1.0',
      language: 'en'
    },
    chunkingConfig: {
      whiteSpaceConfig: {
        maxTokensPerChunk: 500,
        maxOverlapTokens: 50
      }
    }
  }
})

// 轮询直到索引完成
while (!operation.done) {
  await new Promise(resolve => setTimeout(resolve, 1000))
  operation = await ai.operations.get({ name: operation.name })
}

console.log('✅ 已索引：', operation.response.displayName)

const filePaths = [
  './docs/manual.pdf',
  './docs/faq.md',
  './docs/troubleshooting.docx'
]

// 并发上传所有文件
const uploadPromises = filePaths.map(filePath =>
  ai.fileSearchStores.uploadToFileSearchStore({
    name: fileStore.name,
    file: fs.createReadStream(filePath),
    config: {
      displayName: filePath.split('/').pop(),
      customMetadata: {
        doc_type: 'support',
        source_path: filePath
      },
      chunkingConfig: {
        whiteSpaceConfig: {
          maxTokensPerChunk: 500,
          maxOverlapTokens: 50
        }
      }
    }
  })
)

const operations = await Promise.all(uploadPromises)

// 轮询所有操作
for (const operation of operations) {
  let op = operation
  while (!op.done) {
    await new Promise(resolve => setTimeout(resolve, 1000))
    op = await ai.operations.get({ name: op.name })
  }
  console.log('✅ 已索引：', op.response.displayName)
}

const response = await ai.models.generateContent({
  model: 'gemini-3-flash',
  contents: 'What are the safety precautions for installation?',
  config: {
    tools: [{
      fileSearch: {
        fileSearchStoreNames: [fileStore.name]
      }
    }]
  }
})

console.log('答案：', response.text)

// 访问引用
const grounding = response.candidates[0].groundingMetadata
if (grounding?.groundingChunks) {
  console.log('\n来源：')
  grounding.groundingChunks.forEach((chunk, i) => {
    console.log(`${i + 1}. ${chunk.retrievedContext?.title || 'Unknown'}`)
    console.log(`   URI: ${chunk.retrievedContext?.uri || 'N/A'}`)
  })
}

// 列出存储中的所有文档
const docs = await ai.fileSearchStores.documents.list({
  parent: fileStore.name,
  pageSize: 20
})

console.log(`总文档数：${docs.documents?.length || 0}`)

docs.documents?.forEach(doc => {
  console.log(`- ${doc.displayName} (${doc.name})`)
  console.log(`  元数据：`, doc.customMetadata)
})

// 获取特定文档详情
const docDetails = await ai.fileSearchStores.documents.get({
  name: docs.documents[0].name
})

console.log('文档详情：', docDetails)

// 删除文档
await ai.fileSearchStores.documents.delete({
  name: docs.documents[0].name,
  force: true
})

// ❌ 昂贵：搜索所有 10GB 文档
const response = await ai.models.generateContent({
  model: 'gemini-3-flash',
  contents: 'Reset procedure?',
  config: {
    tools: [{ fileSearch: { fileSearchStoreNames: [fileStore.name] } }]
  }
})

// ✅ 更便宜：仅过滤到故障排除文档（子集）
const response = await ai.models.generateContent({
  model: 'gemini-3-flash',
  contents: 'Reset procedure?',
  config: {
    tools: [{
      fileSearch: {
        fileSearchStoreNames: [fileStore.name],
        metadataFilter: 'doc_type="troubleshooting"'  // 减少搜索范围
      }
    }]
  }
})

🇺🇸English

Google Gemini File Search Setup

Overview

Google Gemini File Search is a fully managed RAG system. Upload documents (100+ formats: PDF, Word, Excel, code) and query with natural language—automatic chunking, embeddings, semantic search, and citations.

What This Skill Provides:

Complete @google/genai File Search API setup
8 documented errors with prevention strategies
Chunking best practices for optimal retrieval
Cost optimization ($0.15/1M tokens indexing, 3x storage multiplier)
Cloudflare Workers + Next.js integration templates

Prerequisites

1. Google AI API Key

Create an API key at https://aistudio.google.com/apikey

Free Tier Limits:

1 GB storage (total across all file search stores)
1,500 requests per day
1 million tokens per minute

Paid Tier Pricing:

Indexing: $0.15 per 1M input tokens (one-time)
Storage: Free (Tier 1: 10 GB, Tier 2: 100 GB, Tier 3: 1 TB)
Query-time embeddings: Free (retrieved context counts as input tokens)

2. Node.js Environment

Minimum Version: Node.js 18+ (v20+ recommended)

node --version  # Should be >=18.0.0

3. Install @google/genai SDK

npm install @google/genai
# or
pnpm add @google/genai
# or
yarn add @google/genai

Current Stable Version: 1.30.0+ (verify with npm view @google/genai version)

⚠️ Important: File Search API requires @google/genai v1.29.0 or later. Earlier versions do not support File Search. The API was added in v1.29.0 (November 5, 2025).

4. TypeScript Configuration (Optional but Recommended)

{
  "compilerOptions": {
    "target": "ES2020",
    "module": "ESNext",
    "moduleResolution": "node",
    "esModuleInterop": true,
    "strict": true,
    "skipLibCheck": true
  }
}

Common Errors Prevented

This skill prevents 12 common errors encountered when implementing File Search:

Error 1: Document Immutability

Symptom:

Error: Documents cannot be modified after indexing

Cause: Documents are immutable once indexed. There is no PATCH or UPDATE operation.

Prevention: Use the delete+re-upload pattern for updates:

// ❌ WRONG: Trying to update document (no such API)
await ai.fileSearchStores.documents.update({
  name: documentName,
  customMetadata: { version: '2.0' }
})

// ✅ CORRECT: Delete then re-upload
const docs = await ai.fileSearchStores.documents.list({
  parent: fileStore.name
})

const oldDoc = docs.documents.find(d => d.displayName === 'manual.pdf')
if (oldDoc) {
  await ai.fileSearchStores.documents.delete({
    name: oldDoc.name,
    force: true
  })
}

await ai.fileSearchStores.uploadToFileSearchStore({
  name: fileStore.name,
  file: fs.createReadStream('manual-v2.pdf'),
  config: { displayName: 'manual.pdf' }
})

Source: https://ai.google.dev/api/file-search/documents

Error 2: Storage Quota Exceeded

Symptom:

Error: Quota exceeded. Expected 1GB limit, but 3.2GB used.

Cause: Storage calculation includes input files + embeddings + metadata. Total storage ≈ 3x input size.

Prevention: Calculate storage before upload:

// ❌ WRONG: Assuming storage = file size
const fileSize = fs.statSync('data.pdf').size // 500 MB
// Expect 500 MB usage → WRONG

// ✅ CORRECT: Account for 3x multiplier
const fileSize = fs.statSync('data.pdf').size // 500 MB
const estimatedStorage = fileSize * 3 // 1.5 GB (embeddings + metadata)
console.log(`Estimated storage: ${estimatedStorage / 1e9} GB`)

// Check if within quota before upload
if (estimatedStorage > 1e9) {
  console.warn('⚠️ File may exceed free tier 1 GB limit')
}

Source: https://blog.google/technology/developers/file-search-gemini-api/

Error 3: Incorrect Chunking Configuration

Symptom: Poor retrieval quality, irrelevant results, or context cutoff mid-sentence.

Cause: Default chunking may not be optimal for your content type.

Prevention: Use recommended chunking strategy:

// ❌ WRONG: Using defaults without testing
await ai.fileSearchStores.uploadToFileSearchStore({
  name: fileStore.name,
  file: fs.createReadStream('docs.pdf')
  // Default chunking may be too large or too small
})

// ✅ CORRECT: Configure chunking for precision
await ai.fileSearchStores.uploadToFileSearchStore({
  name: fileStore.name,
  file: fs.createReadStream('docs.pdf'),
  config: {
    chunkingConfig: {
      whiteSpaceConfig: {
        maxTokensPerChunk: 500,  // Smaller chunks = more precise retrieval
        maxOverlapTokens: 50     // 10% overlap prevents context loss
      }
    }
  }
})

Chunking Guidelines:

Technical docs/code: 500 tokens/chunk, 50 overlap
Prose/articles: 800 tokens/chunk, 80 overlap
Legal/contracts: 300 tokens/chunk, 30 overlap (high precision)

Source: https://www.philschmid.de/gemini-file-search-javascript

Error 4: Metadata Limits Exceeded

Symptom:

Error: Maximum 20 custom metadata key-value pairs allowed

Cause: Each document can have at most 20 metadata fields.

Prevention: Design compact metadata schema:

// ❌ WRONG: Too many metadata fields
await ai.fileSearchStores.uploadToFileSearchStore({
  name: fileStore.name,
  file: fs.createReadStream('doc.pdf'),
  config: {
    customMetadata: {
      doc_type: 'manual',
      version: '1.0',
      author: 'John Doe',
      department: 'Engineering',
      created_date: '2025-01-01',
      // ... 18 more fields → Error!
    }
  }
})

// ✅ CORRECT: Use hierarchical keys or JSON strings
await ai.fileSearchStores.uploadToFileSearchStore({
  name: fileStore.name,
  file: fs.createReadStream('doc.pdf'),
  config: {
    customMetadata: {
      doc_type: 'manual',
      version: '1.0',
      author_dept: 'John Doe|Engineering',  // Combine related fields
      dates: JSON.stringify({                // Or use JSON for complex data
        created: '2025-01-01',
        updated: '2025-01-15'
      })
    }
  }
})

Source: https://ai.google.dev/api/file-search/documents

Error 5: Indexing Cost Surprises

Symptom: Unexpected bill for $375 after uploading 10 GB of documents.

Cause: Indexing costs are one-time but calculated per input token ($0.15/1M tokens).

Prevention: Estimate costs before indexing:

// ❌ WRONG: No cost estimation
await uploadAllDocuments(fileStore.name, './data') // 10 GB uploaded → $375 surprise

// ✅ CORRECT: Calculate costs upfront
const totalSize = getTotalDirectorySize('./data') // 10 GB
const estimatedTokens = (totalSize / 4) // Rough estimate: 1 token ≈ 4 bytes
const indexingCost = (estimatedTokens / 1e6) * 0.15

console.log(`Estimated indexing cost: $${indexingCost.toFixed(2)}`)
console.log(`Estimated storage: ${(totalSize * 3) / 1e9} GB`)

// Confirm before proceeding
const proceed = await confirm(`Proceed with indexing? Cost: $${indexingCost.toFixed(2)}`)
if (proceed) {
  await uploadAllDocuments(fileStore.name, './data')
}

Cost Examples:

1 GB text ≈ 250M tokens = $37.50 indexing
100 MB PDF ≈ 25M tokens = $3.75 indexing
10 MB code ≈ 2.5M tokens = $0.38 indexing

Source: https://ai.google.dev/pricing

Error 6: Not Polling Operation Status

Symptom: Query returns no results immediately after upload, or incomplete indexing.

Cause: File uploads are processed asynchronously. Must poll operation until done: true.

Prevention: Always poll operation status with timeout and fallback:

// ❌ WRONG: Assuming upload is instant
const operation = await ai.fileSearchStores.uploadToFileSearchStore({
  name: fileStore.name,
  file: fs.createReadStream('large.pdf')
})
// Immediately query → No results!

// ✅ CORRECT: Poll until indexing complete with timeout
const operation = await ai.fileSearchStores.uploadToFileSearchStore({
  name: fileStore.name,
  file: fs.createReadStream('large.pdf')
})

// Poll with timeout and fallback
const MAX_POLL_TIME = 60000 // 60 seconds
const POLL_INTERVAL = 1000
let elapsed = 0

while (!operation.done && elapsed < MAX_POLL_TIME) {
  await new Promise(resolve => setTimeout(resolve, POLL_INTERVAL))
  elapsed += POLL_INTERVAL

  try {
    operation = await ai.operations.get({ name: operation.name })
    console.log(`Indexing progress: ${operation.metadata?.progress || 'processing...'}`)
  } catch (error) {
    console.warn('Polling failed, assuming complete:', error)
    break
  }
}

if (operation.error) {
  throw new Error(`Indexing failed: ${operation.error.message}`)
}

// ⚠️ Warning: operations.get() can be unreliable for large files
// If timeout reached, verify document exists manually
if (elapsed >= MAX_POLL_TIME) {
  console.warn('Polling timeout - verifying document manually')
  const docs = await ai.fileSearchStores.documents.list({ parent: fileStore.name })
  const uploaded = docs.documents?.find(d => d.displayName === 'large.pdf')
  if (uploaded) {
    console.log('✅ Document found despite polling timeout')
  } else {
    throw new Error('Upload failed - document not found')
  }
}

console.log('✅ Indexing complete:', operation.response?.displayName)

Source: https://ai.google.dev/api/file-search/file-search-stores#uploadtofilesearchstore, GitHub Issue #1211

Error 7: Forgetting Force Delete

Symptom:

Error: Cannot delete store with documents. Set force=true.

Cause: Stores with documents require force: true to delete (prevents accidental deletion).

Prevention: Always use force: true when deleting non-empty stores:

// ❌ WRONG: Trying to delete store with documents
await ai.fileSearchStores.delete({
  name: fileStore.name
})
// Error: Cannot delete store with documents

// ✅ CORRECT: Use force delete
await ai.fileSearchStores.delete({
  name: fileStore.name,
  force: true  // Deletes store AND all documents
})

// Alternative: Delete documents first
const docs = await ai.fileSearchStores.documents.list({ parent: fileStore.name })
for (const doc of docs.documents || []) {
  await ai.fileSearchStores.documents.delete({
    name: doc.name,
    force: true
  })
}
await ai.fileSearchStores.delete({ name: fileStore.name })

Source: https://ai.google.dev/api/file-search/file-search-stores#delete

Error 8: Using Unsupported Models

Symptom:

Error: File Search is only supported for Gemini 3 Pro and Flash models

Cause: File Search requires Gemini 3 Pro or Gemini 3 Flash. Gemini 2.x and 1.5 models are not supported.

Prevention: Always use Gemini 3 models:

// ❌ WRONG: Using Gemini 1.5 model
const response = await ai.models.generateContent({
  model: 'gemini-1.5-pro',  // Not supported!
  contents: 'What is the installation procedure?',
  config: {
    tools: [{
      fileSearch: { fileSearchStoreNames: [fileStore.name] }
    }]
  }
})

// ✅ CORRECT: Use Gemini 3 models
const response = await ai.models.generateContent({
  model: 'gemini-3-flash',  // ✅ Supported (fast, cost-effective)
  // OR
  // model: 'gemini-3-pro',   // ✅ Supported (higher quality)
  contents: 'What is the installation procedure?',
  config: {
    tools: [{
      fileSearch: { fileSearchStoreNames: [fileStore.name] }
    }]
  }
})

Source: https://ai.google.dev/gemini-api/docs/file-search

Error 9: displayName Not Preserved for Blob Sources (Fixed v1.34.0+)

Symptom:

groundingChunks[0].title === null  // No document source shown

Cause: In @google/genai versions prior to v1.34.0, when uploading files as Blob objects (not file paths), the SDK dropped the displayName and customMetadata configuration fields.

Prevention:

// ✅ CORRECT: Upgrade to v1.34.0+ for automatic fix
npm install @google/genai@latest  // v1.34.0+

await ai.fileSearchStores.uploadToFileSearchStore({
  name: storeName,
  file: new Blob([arrayBuffer], { type: 'application/pdf' }),
  config: {
    displayName: 'Safety Manual.pdf',  // ✅ Now preserved
    customMetadata: { version: '1.0' }  // ✅ Now preserved
  }
})

// ⚠️ WORKAROUND for v1.33.0 and earlier: Use resumable upload
const uploadUrl = `https://generativelanguage.googleapis.com/upload/v1beta/${storeName}:uploadToFileSearchStore?key=${API_KEY}`

// Step 1: Initiate with displayName in body
const initResponse = await fetch(uploadUrl, {
  method: 'POST',
  headers: {
    'X-Goog-Upload-Protocol': 'resumable',
    'X-Goog-Upload-Command': 'start',
    'X-Goog-Upload-Header-Content-Length': numBytes.toString(),
    'X-Goog-Upload-Header-Content-Type': 'application/pdf',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    displayName: 'Safety Manual.pdf'  // ✅ Works with resumable upload
  })
})

// Step 2: Upload file bytes
const uploadUrl2 = initResponse.headers.get('X-Goog-Upload-URL')
await fetch(uploadUrl2, {
  method: 'PUT',
  headers: {
    'Content-Length': numBytes.toString(),
    'X-Goog-Upload-Offset': '0',
    'X-Goog-Upload-Command': 'upload, finalize',
    'Content-Type': 'application/pdf'
  },
  body: fileBytes
})

Source: GitHub Issue #1078

Error 10: Grounding Metadata Ignored with JSON Response Mode

Symptom:

response.candidates[0].groundingMetadata === undefined
// Even though fileSearch tool is configured

Cause: When using responseMimeType: 'application/json' for structured output, the API ignores the fileSearch tool and returns no grounding metadata, even with Gemini 3 models.

Prevention:

// ❌ WRONG: Structured output overrides grounding
const response = await ai.models.generateContent({
  model: 'gemini-3-flash',
  contents: 'Summarize guidelines',
  config: {
    responseMimeType: 'application/json',  // Loses grounding
    tools: [{ fileSearch: { fileSearchStoreNames: [storeName] } }]
  }
})

// ✅ CORRECT: Two-step approach
// Step 1: Get grounded text response
const textResponse = await ai.models.generateContent({
  model: 'gemini-3-flash',
  contents: 'Summarize guidelines',
  config: {
    tools: [{ fileSearch: { fileSearchStoreNames: [storeName] } }]
  }
})

const grounding = textResponse.candidates[0].groundingMetadata

// Step 2: Convert to structured format in prompt
const jsonResponse = await ai.models.generateContent({
  model: 'gemini-3-flash',
  contents: `Convert to JSON: ${textResponse.text}

Format:
{
  "summary": "...",
  "key_points": ["..."]
}`,
  config: {
    responseMimeType: 'application/json',
    responseSchema: {
      type: 'object',
      properties: {
        summary: { type: 'string' },
        key_points: { type: 'array', items: { type: 'string' } }
      }
    }
  }
})

// Combine results
const result = {
  data: JSON.parse(jsonResponse.text),
  sources: grounding.groundingChunks
}

Source: GitHub Issue #829

Error 11: Google Search and File Search Tools Are Mutually Exclusive

Symptom:

Error: "Search as a tool and file search tool are not supported together"
Status: INVALID_ARGUMENT

Cause: The Gemini API does not allow using googleSearch and fileSearch tools in the same request.

Prevention:

// ❌ WRONG: Combining search tools
const response = await ai.models.generateContent({
  model: 'gemini-3-flash',
  contents: 'What are the latest industry guidelines?',
  config: {
    tools: [
      { googleSearch: {} },
      { fileSearch: { fileSearchStoreNames: [storeName] } }
    ]
  }
})

// ✅ CORRECT: Use separate specialist agents
async function searchWeb(query: string) {
  return ai.models.generateContent({
    model: 'gemini-3-flash',
    contents: query,
    config: { tools: [{ googleSearch: {} }] }
  })
}

async function searchDocuments(query: string) {
  return ai.models.generateContent({
    model: 'gemini-3-flash',
    contents: query,
    config: { tools: [{ fileSearch: { fileSearchStoreNames: [storeName] } }] }
  })
}

// Orchestrate based on query type
const needsWeb = query.includes('latest') || query.includes('current')
const response = needsWeb
  ? await searchWeb(query)
  : await searchDocuments(query)

Source: GitHub Issue #435, Google Codelabs

Error 12: Batch API Missing Response Metadata (Community-sourced)

Symptom: Cannot correlate batch responses with requests when using metadata field.

Cause: When using Batch API with InlinedRequest that includes a metadata field, the corresponding InlinedResponse does not return the metadata.

Prevention:

// ❌ WRONG: Expecting metadata in response
const batchRequest = {
  metadata: { key: 'my-request-id' },
  contents: [{ parts: [{ text: 'Question?' }], role: 'user' }],
  config: {
    tools: [{ fileSearch: { fileSearchStoreNames: [storeName] } }]
  }
}

const batchResponse = await ai.batch.create({ requests: [batchRequest] })
console.log(batchResponse.responses[0].metadata)  // ❌ undefined

// ✅ CORRECT: Use array index to correlate
const requests = [
  { metadata: { id: 'req-1' }, contents: [...] },
  { metadata: { id: 'req-2' }, contents: [...] }
]

const responses = await ai.batch.create({ requests })

// Map by index (not ideal but works)
responses.responses.forEach((response, i) => {
  const requestMetadata = requests[i].metadata
  console.log(`Response for ${requestMetadata.id}:`, response)
})

Community Verification: Maintainer confirmed, internal bug filed.

Source: GitHub Issue #1191

Setup Instructions

Step 1: Initialize Client

import { GoogleGenAI } from '@google/genai'
import fs from 'fs'

// Initialize client with API key
const ai = new GoogleGenAI({
  apiKey: process.env.GOOGLE_API_KEY
})

// Verify API key is set
if (!process.env.GOOGLE_API_KEY) {
  throw new Error('GOOGLE_API_KEY environment variable is required')
}

Step 2: Create File Search Store

// Create a store (container for documents)
const fileStore = await ai.fileSearchStores.create({
  config: {
    displayName: 'my-knowledge-base',  // Human-readable name
    // Optional: Add store-level metadata
    customMetadata: {
      project: 'customer-support',
      environment: 'production'
    }
  }
})

console.log('Created store:', fileStore.name)
// Output: fileSearchStores/abc123xyz...

Finding Existing Stores:

// List all stores (paginated)
const stores = await ai.fileSearchStores.list({
  pageSize: 20  // Max 20 per page
})

// Find by display name
let targetStore = null
let pageToken = null

do {
  const page = await ai.fileSearchStores.list({ pageToken })
  targetStore = page.fileSearchStores.find(
    s => s.displayName === 'my-knowledge-base'
  )
  pageToken = page.nextPageToken
} while (!targetStore && pageToken)

if (targetStore) {
  console.log('Found existing store:', targetStore.name)
} else {
  console.log('Store not found, creating new one...')
}

Step 3: Upload Documents

Single File Upload:

const operation = await ai.fileSearchStores.uploadToFileSearchStore({
  name: fileStore.name,
  file: fs.createReadStream('./docs/manual.pdf'),
  config: {
    displayName: 'Installation Manual',
    customMetadata: {
      doc_type: 'manual',
      version: '1.0',
      language: 'en'
    },
    chunkingConfig: {
      whiteSpaceConfig: {
        maxTokensPerChunk: 500,
        maxOverlapTokens: 50
      }
    }
  }
})

// Poll until indexing complete
while (!operation.done) {
  await new Promise(resolve => setTimeout(resolve, 1000))
  operation = await ai.operations.get({ name: operation.name })
}

console.log('✅ Indexed:', operation.response.displayName)

Batch Upload (Concurrent):

const filePaths = [
  './docs/manual.pdf',
  './docs/faq.md',
  './docs/troubleshooting.docx'
]

// Upload all files concurrently
const uploadPromises = filePaths.map(filePath =>
  ai.fileSearchStores.uploadToFileSearchStore({
    name: fileStore.name,
    file: fs.createReadStream(filePath),
    config: {
      displayName: filePath.split('/').pop(),
      customMetadata: {
        doc_type: 'support',
        source_path: filePath
      },
      chunkingConfig: {
        whiteSpaceConfig: {
          maxTokensPerChunk: 500,
          maxOverlapTokens: 50
        }
      }
    }
  })
)

const operations = await Promise.all(uploadPromises)

// Poll all operations
for (const operation of operations) {
  let op = operation
  while (!op.done) {
    await new Promise(resolve => setTimeout(resolve, 1000))
    op = await ai.operations.get({ name: op.name })
  }
  console.log('✅ Indexed:', op.response.displayName)
}

Step 4: Query with File Search

Basic Query:

const response = await ai.models.generateContent({
  model: 'gemini-3-flash',
  contents: 'What are the safety precautions for installation?',
  config: {
    tools: [{
      fileSearch: {
        fileSearchStoreNames: [fileStore.name]
      }
    }]
  }
})

console.log('Answer:', response.text)

// Access citations
const grounding = response.candidates[0].groundingMetadata
if (grounding?.groundingChunks) {
  console.log('\nSources:')
  grounding.groundingChunks.forEach((chunk, i) => {
    console.log(`${i + 1}. ${chunk.retrievedContext?.title || 'Unknown'}`)
    console.log(`   URI: ${chunk.retrievedContext?.uri || 'N/A'}`)
  })
}

Query with Metadata Filtering:

const response = await ai.models.generateContent({
  model: 'gemini-3-flash',
  contents: 'How do I reset the device?',
  config: {
    tools: [{
      fileSearch: {
        fileSearchStoreNames: [fileStore.name],
        // Filter to only search troubleshooting docs in English, version 1.0
        metadataFilter: 'doc_type="troubleshooting" AND language="en" AND version="1.0"'
      }
    }]
  }
})

console.log('Answer:', response.text)

Metadata Filter Syntax:

AND: key1="value1" AND key2="value2"
OR: key1="value1" OR key1="value2"
Parentheses: (key1="a" OR key1="b") AND key2="c"

Step 5: List and Manage Documents

// List all documents in store
const docs = await ai.fileSearchStores.documents.list({
  parent: fileStore.name,
  pageSize: 20
})

console.log(`Total documents: ${docs.documents?.length || 0}`)

docs.documents?.forEach(doc => {
  console.log(`- ${doc.displayName} (${doc.name})`)
  console.log(`  Metadata:`, doc.customMetadata)
})

// Get specific document details
const docDetails = await ai.fileSearchStores.documents.get({
  name: docs.documents[0].name
})

console.log('Document details:', docDetails)

// Delete document
await ai.fileSearchStores.documents.delete({
  name: docs.documents[0].name,
  force: true
})

Step 6: Cleanup

// Delete entire store (force deletes all documents)
await ai.fileSearchStores.delete({
  name: fileStore.name,
  force: true
})

console.log('✅ Store deleted')

Recommended Chunking Strategies

Chunking configuration significantly impacts retrieval quality. Adjust based on content type:

Technical Documentation

chunkingConfig: {
  whiteSpaceConfig: {
    maxTokensPerChunk: 500,   // Smaller chunks for precise code/API lookup
    maxOverlapTokens: 50      // 10% overlap
  }
}

Best for: API docs, SDK references, code examples, configuration guides

Prose and Articles

chunkingConfig: {
  whiteSpaceConfig: {
    maxTokensPerChunk: 800,   // Larger chunks preserve narrative flow
    maxOverlapTokens: 80      // 10% overlap
  }
}

Best for: Blog posts, news articles, product descriptions, marketing materials

Legal and Contracts

chunkingConfig: {
  whiteSpaceConfig: {
    maxTokensPerChunk: 300,   // Very small chunks for high precision
    maxOverlapTokens: 30      // 10% overlap
  }
}

Best for: Legal documents, contracts, regulations, compliance docs

FAQ and Support

chunkingConfig: {
  whiteSpaceConfig: {
    maxTokensPerChunk: 400,   // Medium chunks (1-2 Q&A pairs)
    maxOverlapTokens: 40      // 10% overlap
  }
}

Best for: FAQs, troubleshooting guides, how-to articles

General Rule: Maintain 10% overlap (overlap = chunk size / 10) to prevent context loss at chunk boundaries.

Metadata Best Practices

Design metadata schema for filtering and organization:

Example: Customer Support Knowledge Base

customMetadata: {
  doc_type: 'faq' | 'manual' | 'troubleshooting' | 'guide',
  product: 'widget-pro' | 'widget-lite',
  version: '1.0' | '2.0',
  language: 'en' | 'es' | 'fr',
  category: 'installation' | 'configuration' | 'maintenance',
  priority: 'critical' | 'normal' | 'low',
  last_updated: '2025-01-15',
  author: 'support-team'
}

Query Example:

metadataFilter: 'product="widget-pro" AND (doc_type="troubleshooting" OR doc_type="faq") AND language="en"'

Example: Legal Document Repository

customMetadata: {
  doc_type: 'contract' | 'regulation' | 'case-law' | 'policy',
  jurisdiction: 'US' | 'EU' | 'UK',
  practice_area: 'employment' | 'corporate' | 'ip' | 'tax',
  effective_date: '2025-01-01',
  status: 'active' | 'archived',
  confidentiality: 'public' | 'internal' | 'privileged'
}

Example: Code Documentation

customMetadata: {
  doc_type: 'api-reference' | 'tutorial' | 'example' | 'changelog',
  language: 'javascript' | 'python' | 'java' | 'go',
  framework: 'react' | 'nextjs' | 'express' | 'fastapi',
  version: '1.2.0',
  difficulty: 'beginner' | 'intermediate' | 'advanced'
}

Tips:

Use consistent key naming (snake_case or camelCase)
Limit to most important filterable fields (20 max)
Use enums/constants for values (easier filtering)
Include version and date fields for time-based filtering

Cost Optimization

1. Deduplicate Before Upload

// Track uploaded file hashes to avoid duplicates
const uploadedHashes = new Set<string>()

async function uploadWithDeduplication(filePath: string) {
  const fileHash = await getFileHash(filePath)

  if (uploadedHashes.has(fileHash)) {
    console.log(`Skipping duplicate: ${filePath}`)
    return
  }

  await ai.fileSearchStores.uploadToFileSearchStore({
    name: fileStore.name,
    file: fs.createReadStream(filePath)
  })

  uploadedHashes.add(fileHash)
}

2. Compress Large Files

// Convert images to text before indexing (OCR)
// Compress PDFs (remove images, use text-only)
// Use markdown instead of Word docs (smaller size)

3. Use Metadata Filtering to Reduce Query Scope

// ❌ EXPENSIVE: Search all 10GB of documents
const response = await ai.models.generateContent({
  model: 'gemini-3-flash',
  contents: 'Reset procedure?',
  config: {
    tools: [{ fileSearch: { fileSearchStoreNames: [fileStore.name] } }]
  }
})

// ✅ CHEAPER: Filter to only troubleshooting docs (subset)
const response = await ai.models.generateContent({
  model: 'gemini-3-flash',
  contents: 'Reset procedure?',
  config: {
    tools: [{
      fileSearch: {
        fileSearchStoreNames: [fileStore.name],
        metadataFilter: 'doc_type="troubleshooting"'  // Reduces search scope
      }
    }]
  }
})

4. Choose Flash Over Pro for Cost Savings

// Gemini 3 Flash is 10x cheaper than Pro for queries
// Use Flash unless you need Pro's advanced reasoning

// Development/testing: Use Flash
model: 'gemini-3-flash'

// Production (high-stakes answers): Use Pro
model: 'gemini-3-pro'

5. Monitor Storage Usage

// List stores and estimate storage
const stores = await ai.fileSearchStores.list()

for (const store of stores.fileSearchStores || []) {
  const docs = await ai.fileSearchStores.documents.list({
    parent: store.name
  })

  console.log(`Store: ${store.displayName}`)
  console.log(`Documents: ${docs.documents?.length || 0}`)
  // Estimate storage (3x input size)
  console.log(`Estimated storage: ~${(docs.documents?.length || 0) * 10} MB`)
}

Testing & Verification

Verify Store Creation

const store = await ai.fileSearchStores.get({
  name: fileStore.name
})

console.assert(store.displayName === 'my-knowledge-base', 'Store name mismatch')
console.log('✅ Store created successfully')

Verify Document Indexing

const docs = await ai.fileSearchStores.documents.list({
  parent: fileStore.name
})

console.assert(docs.documents?.length > 0, 'No documents indexed')
console.log(`✅ ${docs.documents?.length} documents indexed`)

Verify Query Functionality

const response = await ai.models.generateContent({
  model: 'gemini-3-flash',
  contents: 'What is this knowledge base about?',
  config: {
    tools: [{ fileSearch: { fileSearchStoreNames: [fileStore.name] } }]
  }
})

console.assert(response.text.length > 0, 'Empty response')
console.log('✅ Query successful:', response.text.substring(0, 100) + '...')

Verify Citations

const response = await ai.models.generateContent({
  model: 'gemini-3-flash',
  contents: 'Provide a specific answer with citations.',
  config: {
    tools: [{ fileSearch: { fileSearchStoreNames: [fileStore.name] } }]
  }
})

const grounding = response.candidates[0].groundingMetadata
console.assert(
  grounding?.groundingChunks?.length > 0,
  'No grounding/citations returned'
)
console.log(`✅ ${grounding?.groundingChunks?.length} citations returned`)

Integration Examples

Streaming Support

File Search supports streaming responses with generateContentStream():

// ✅ Streaming works with File Search (v1.34.0+)
const stream = await ai.models.generateContentStream({
  model: 'gemini-3-flash',
  contents: 'Summarize the document',
  config: {
    tools: [{ fileSearch: { fileSearchStoreNames: [storeName] } }]
  }
})

for await (const chunk of stream) {
  process.stdout.write(chunk.text)
}

// Access grounding after stream completes
const grounding = stream.candidates[0].groundingMetadata

Note: Early SDK versions (pre-v1.34.0) may have had streaming issues. Use v1.34.0+ for reliable streaming support.

Source: GitHub Issue #1221

Working Templates

This skill includes 3 working templates in the templates/ directory:

Template 1: basic-node-rag

Minimal Node.js/TypeScript example demonstrating:

Create file search store
Upload multiple documents
Query with natural language
Display citations

Use when: Learning File Search, prototyping, simple CLI tools

Run:

cd templates/basic-node-rag
npm install
npm run dev

Template 2: cloudflare-worker-rag

Cloudflare Workers integration showing:

Edge API for document upload
Edge API for semantic search
Integration with R2 for document storage
Hybrid architecture (Gemini File Search + Cloudflare edge)

Use when: Building global edge applications, integrating with Cloudflare stack

Deploy:

cd templates/cloudflare-worker-rag
npm install
npx wrangler deploy

Template 3: nextjs-docs-search

Full-stack Next.js application featuring:

Document upload UI with drag-and-drop
Real-time search interface
Citation rendering with source links
Metadata filtering UI

Use when: Building production documentation sites, knowledge bases

Run:

cd templates/nextjs-docs-search
npm install
npm run dev

References

Official Documentation:

File Search Overview: https://ai.google.dev/gemini-api/docs/file-search
API Reference (Stores): https://ai.google.dev/api/file-search/file-search-stores
API Reference (Documents): https://ai.google.dev/api/file-search/documents
Blog Announcement: https://blog.google/technology/developers/file-search-gemini-api/
Pricing: https://ai.google.dev/pricing

Tutorials:

JavaScript/TypeScript Guide: https://www.philschmid.de/gemini-file-search-javascript
SDK Repository: https://github.com/googleapis/js-genai

Bundled Resources in This Skill:

references/api-reference.md - Complete API documentation
references/chunking-best-practices.md - Detailed chunking strategies
references/pricing-calculator.md - Cost estimation guide
references/migration-from-openai.md - Migration guide from OpenAI Files API
scripts/create-store.ts - CLI tool to create stores
scripts/upload-batch.ts - Batch upload script
scripts/query-store.ts - Interactive query tool
scripts/cleanup.ts - Cleanup script

Working Templates:

templates/basic-node-rag/ - Minimal Node.js example
templates/cloudflare-worker-rag/ - Edge deployment example
templates/nextjs-docs-search/ - Full-stack Next.js app

Skill Version: 1.1.0 Last Verified: 2026-01-21 Package Version: @google/genai ^1.38.0 (minimum 1.29.0 required) Token Savings: ~67% Errors Prevented: 12 Changes: Added 4 new errors from community research (displayName Blob issue, grounding with JSON mode, tool conflicts, batch API metadata), enhanced polling timeout pattern with fallback verification, added streaming support note

Weekly Installs

328

Repository

jezweb/claude-skills

GitHub Stars

652

First Seen

Jan 20, 2026

Security Audits

Gen Agent Trust HubPass SocketPass SnykPass

Installed on

claude-code270

gemini-cli220

opencode215

antigravity203

cursor202

codex186

Google Gemini 文件搜索设置教程 - 完全托管RAG系统，支持100+格式，集成最佳实践

🇨🇳中文介绍

Google Gemini 文件搜索设置

概述

先决条件

1. Google AI API 密钥

2. Node.js 环境

3. 安装 @google/genai SDK

相关 Skills

4. TypeScript 配置（可选但推荐）

预防的常见错误

错误 1：文档不可变性

错误 2：存储配额超出

错误 3：分块配置不正确

错误 4：超出元数据限制

错误 5：索引成本意外

错误 6：未轮询操作状态

错误 7：忘记强制删除

错误 8：使用不支持的模型

错误 9：Blob 源的 displayName 未保留（v1.34.0+ 已修复）

错误 10：JSON 响应模式下忽略引用元数据

错误 11：Google 搜索和文件搜索工具互斥

错误 12：批处理 API 缺少响应元数据（社区提供）

设置说明

步骤 1：初始化客户端

步骤 2：创建文件搜索存储

步骤 3：上传文档

步骤 4：使用文件搜索进行查询

步骤 5：列出和管理文档

步骤 6：清理

推荐的分块策略

技术文档

散文和文章

法律和合同

常见问题解答和支持

元数据最佳实践

示例：客户支持知识库

示例：法律文档库

示例：代码文档

成本优化

1. 上传前去重

2. 压缩大文件

3. 使用元数据过滤减少查询范围

4. 为节省成本选择 Flash 而非 Pro

5. 监控存储使用情况

测试与验证

验证存储创建

验证文档索引

验证查询功能

🇺🇸English

Google Gemini File Search Setup

Overview

Prerequisites

1. Google AI API Key

2. Node.js Environment

3. Install @google/genai SDK

4. TypeScript Configuration (Optional but Recommended)

Common Errors Prevented

Error 1: Document Immutability

Error 2: Storage Quota Exceeded

Error 3: Incorrect Chunking Configuration

Error 4: Metadata Limits Exceeded

Error 5: Indexing Cost Surprises

Error 6: Not Polling Operation Status

Error 7: Forgetting Force Delete

Error 8: Using Unsupported Models

Error 9: displayName Not Preserved for Blob Sources (Fixed v1.34.0+)

Error 10: Grounding Metadata Ignored with JSON Response Mode

Error 11: Google Search and File Search Tools Are Mutually Exclusive

Error 12: Batch API Missing Response Metadata (Community-sourced)

Setup Instructions

Step 1: Initialize Client

Step 2: Create File Search Store

Step 3: Upload Documents

Step 4: Query with File Search

Step 5: List and Manage Documents

Step 6: Cleanup

Recommended Chunking Strategies

Technical Documentation

Prose and Articles