mistral-ocr by parlamento-ai/parlamento-ai
```shell
npx skills add https://github.com/parlamento-ai/parlamento-ai --skill mistral-ocr
```
Extract text from images and PDFs using Mistral's dedicated OCR API. No external dependencies required.
This skill requires a Mistral API key. If you don't have one, follow the guide in reference/getting-started.md.
The user must provide their Mistral API key. Ask for it if not available.
Option 1 (Recommended for AI agents): User provides key directly in message:
"Use this Mistral key: aBc123XyZ..."
"Convert this PDF to markdown, my API key is aBc123XyZ..."
Option 2: Environment variable $MISTRAL_API_KEY
Option 3: Claude Code settings (~/.claude/settings.json)
If no key is available, guide the user to get one at console.mistral.ai.
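For the environment-variable option, a small guard can stop early when no key is present. This is a minimal sketch: `require_mistral_key` is a hypothetical helper name, and it only checks `$MISTRAL_API_KEY`, not the message or settings-file options.

```shell
# Hypothetical helper: succeed only when MISTRAL_API_KEY is set and non-empty.
require_mistral_key() {
  if [ -z "${MISTRAL_API_KEY:-}" ]; then
    # Point the user at the console, as the text above suggests.
    echo "MISTRAL_API_KEY is not set. Get a key at console.mistral.ai, then run:" >&2
    echo "  export MISTRAL_API_KEY=<your key>" >&2
    return 1
  fi
}
```

Calling `require_mistral_key || exit 1` before the first request avoids sending a call that would only come back as 401 Unauthorized.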
Use the dedicated OCR endpoint for all document processing:
POST https://api.mistral.ai/v1/ocr
Model: mistral-ocr-latest
```shell
curl -s "https://api.mistral.ai/v1/ocr" \
  -H "Authorization: Bearer $MISTRAL_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "mistral-ocr-latest",
    "document": {
      "type": "document_url",
      "document_url": "https://example.com/document.pdf"
    }
  }'
```
Works with JPG, PNG, WEBP, GIF:
```shell
curl -s "https://api.mistral.ai/v1/ocr" \
  -H "Authorization: Bearer $MISTRAL_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "mistral-ocr-latest",
    "document": {
      "type": "image_url",
      "image_url": "https://example.com/image.jpg"
    }
  }'
```
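The two request bodies above differ only in the `type` field and its matching key. As a sketch, a hypothetical helper `build_ocr_request` could pick between them from the URL's file extension. It assumes the URL contains no double quotes or backslashes, since it does no JSON escaping, and it treats anything that is not a listed image extension as a document.

```shell
# Hypothetical helper: print the JSON body for a remote document or image URL.
# Assumption: the URL needs no JSON escaping (no double quotes or backslashes).
build_ocr_request() {
  local url path lower kind
  url=$1
  # Drop any query string or fragment before looking at the extension.
  path=${url%%[?#]*}
  # Lowercase so ".PNG" and ".png" are treated alike.
  lower=$(printf '%s' "$path" | tr '[:upper:]' '[:lower:]')
  case "$lower" in
    *.jpg|*.jpeg|*.png|*.webp|*.gif) kind=image_url ;;
    *) kind=document_url ;;
  esac
  printf '{"model":"mistral-ocr-latest","document":{"type":"%s","%s":"%s"}}\n' \
    "$kind" "$kind" "$url"
}
```

The output can be piped straight into curl with `-d @-`, which reads the request body from standard input.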
For local PDFs or images, encode as base64 and use a data URL.
ALWAYS use curl (works on all platforms including Windows via Git Bash):
```shell
# For local PDF
# Piping through tr strips line wraps on both GNU and BSD/macOS base64 (-w0 is GNU-only)
BASE64=$(base64 < document.pdf | tr -d '\n')
curl -s "https://api.mistral.ai/v1/ocr" \
  -H "Authorization: Bearer $MISTRAL_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "mistral-ocr-latest",
    "document": {
      "type": "document_url",
      "document_url": "data:application/pdf;base64,'"$BASE64"'"
    }
  }'

# For local images (PNG, JPG, etc.)
BASE64=$(base64 < image.png | tr -d '\n')
curl -s "https://api.mistral.ai/v1/ocr" \
  -H "Authorization: Bearer $MISTRAL_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "mistral-ocr-latest",
    "document": {
      "type": "image_url",
      "image_url": "data:image/png;base64,'"$BASE64"'"
    }
  }'
```
MIME types:
- `data:application/pdf;base64,...`
- `data:image/png;base64,...`
- `data:image/jpeg;base64,...`
- `data:image/webp;base64,...`

For invoices, forms, and tables: ask for JSON in a follow-up or use Document AI annotations.
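The data URL prefixes listed above can be derived from the file extension. The sketch below uses two hypothetical helpers, `mime_for_file` and `data_url_for_file`. The `image/gif` mapping is an assumption: GIF is listed as a supported format, but its MIME type does not appear in the list above.

```shell
# Hypothetical helper: map a file name to the MIME type used in its data URL.
mime_for_file() {
  local lower
  lower=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$lower" in
    *.pdf)        echo "application/pdf" ;;
    *.png)        echo "image/png" ;;
    *.jpg|*.jpeg) echo "image/jpeg" ;;
    *.webp)       echo "image/webp" ;;
    *.gif)        echo "image/gif" ;;  # assumption: not shown in the MIME list above
    *)            echo "unsupported file type: $1" >&2; return 1 ;;
  esac
}

# Hypothetical helper: print "data:<mime>;base64,<payload>" for a local file.
data_url_for_file() {
  local mime
  mime=$(mime_for_file "$1") || return 1
  printf 'data:%s;base64,' "$mime"
  # tr removes line wraps so the payload is a single line on GNU and BSD/macOS.
  base64 < "$1" | tr -d '\n'
  printf '\n'
}
```

For large files the result is better written to a file than held in a shell variable, since very long command-line arguments can hit operating-system limits.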
The API returns markdown directly:
```json
{
  "pages": [
    {
      "index": 0,
      "markdown": "# Document Title\n\nExtracted content here...",
      "images": [],
      "tables": [],
      "dimensions": {"dpi": 200, "height": 842, "width": 595}
    }
  ],
  "model": "mistral-ocr-latest",
  "usage_info": {"pages_processed": 1, "doc_size_bytes": 12345}
}
```
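For a quick sanity check of a response like the one above, the page count can be pulled out without jq. This is a grep-based sketch, not a JSON parser: the hypothetical helper `pages_processed` assumes the `pages_processed` key appears once, on one line, with an integer value.

```shell
# Hypothetical helper: read an OCR response on stdin and print usage_info.pages_processed.
# Assumption: the key occurs once and its value is a plain integer on the same line.
pages_processed() {
  grep -o '"pages_processed"[[:space:]]*:[[:space:]]*[0-9][0-9]*' \
    | grep -o '[0-9][0-9]*$'
}
```

It reads from standard input, so a saved response can be checked with `pages_processed < response.json`.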
```shell
# Cross-platform temp directory
TMPDIR="${TMPDIR:-${TEMP:-/tmp}}"

# Step 1: Encode file to base64 (tr strips line wraps; -w0 is GNU-only)
base64 < "document.pdf" | tr -d '\n' > "$TMPDIR/b64.txt"

# Step 2: Create JSON request file
printf '{"model":"mistral-ocr-latest","document":{"type":"document_url","document_url":"data:application/pdf;base64,%s"}}' "$(cat "$TMPDIR/b64.txt")" > "$TMPDIR/request.json"

# Step 3: Call API with -d @file (use actual key, not variable)
curl -s "https://api.mistral.ai/v1/ocr" \
  -H "Authorization: Bearer YOUR_API_KEY_HERE" \
  -H "Content-Type: application/json" \
  -d @"$TMPDIR/request.json" > "$TMPDIR/response.json"

# Step 4: Extract markdown with node (NOT jq - not available on all systems)
# The path is passed as an argument rather than spliced into the script text
node -e "const fs=require('fs'); const r=JSON.parse(fs.readFileSync(process.argv[1],'utf8')); console.log(r.pages.map(p=>p.markdown).join('\n\n---\n\n'))" "$TMPDIR/response.json"
```
4. Save to .md file using Write tool
5. Confirm file location to user
- `-d @file` for request body (handles large files)
- `${TMPDIR:-${TEMP:-/tmp}}` for temp files (works on all systems)

When the user says:
| User Request | Action |
|---|---|
| "Convert this PDF to markdown" | OCR the PDF, save as .md file |
| "Extract text from this image" | OCR the image, return text |
| "Give me a .md of this document" | OCR and save as .md file |
| "What does this PDF say?" | OCR and summarize content |
| "OCR this receipt" | Extract text, optionally structure as JSON |
| Error | Cause | Solution |
|---|---|---|
| 401 Unauthorized | Invalid API key | Verify key, guide to getting-started.md |
| 400 Bad Request | Invalid document | Check format and URL accessibility |
| 3310 File fetch error | URL not accessible | Use base64 for local files |
| Rate limit | Too many requests | Wait and retry |
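The "wait and retry" advice for rate limits can be sketched as a generic retry loop with exponential backoff. `retry_with_backoff` and the `RETRY_BASE_DELAY` variable are hypothetical names introduced here, not part of the Mistral API or curl. The helper retries any command that exits non-zero, so it does not distinguish rate limiting from other failures.

```shell
# Hypothetical helper: run a command up to MAX_ATTEMPTS times,
# doubling the wait after each failure.
# RETRY_BASE_DELAY (seconds, default 2) is an assumed knob for this sketch.
retry_with_backoff() {
  local max_attempts=$1
  shift
  local delay=${RETRY_BASE_DELAY:-2}
  local attempt=1
  while :; do
    # Stop as soon as the command succeeds.
    "$@" && return 0
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "giving up after $attempt attempts" >&2
      return 1
    fi
    sleep "$delay"
    delay=$((delay * 2))
    attempt=$((attempt + 1))
  done
}
```

With curl, adding `--fail` makes HTTP error responses produce a non-zero exit status, so a call such as `retry_with_backoff 4 curl -s --fail ...` would retry them. Because that also retries errors like 401 that will not resolve on their own, a low attempt count is the safer choice.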
| Format | Support |
|---|---|
| PDF | ✅ Direct (no conversion) |
| PNG | ✅ Direct |
| JPG/JPEG | ✅ Direct |
| WEBP | ✅ Direct |
| GIF | ✅ Direct |
No external dependencies required! Unlike other OCR solutions, Mistral OCR handles PDFs directly without needing pdftoppm, ImageMagick, or any other tools.
Mistral OCR pricing (as of 2025): check current rates at mistral.ai/pricing.
Skill by Parlamento AI
- Weekly Installs: 122
- Repository: parlamento-ai/parlamento-ai
- GitHub Stars: 10
- First Seen: Jan 29, 2026
- Security Audits: Gen Agent Trust Hub (Pass), Socket (Pass), Snyk (Fail)
- Installed on: opencode (103), gemini-cli (95), codex (95), cursor (89), github-copilot (89), kimi-cli (75)