paddleocr-text-recognition by aidenwu0209/paddleocr-skills
npx skills add https://github.com/aidenwu0209/paddleocr-skills --skill paddleocr-text-recognition
Invoke this skill in the following situations:
Do not use this skill in the following situations:
MANDATORY RESTRICTIONS - DO NOT VIOLATE
python scripts/ocr_caller.py
If the script execution fails (API not configured, network error, etc.):
1. Identify the input source:
* For a URL, use the `--file-url` parameter
* For a local file, use the `--file-path` parameter
* `--file-path`
2. Execute OCR:
python scripts/ocr_caller.py --file-url "URL provided by user" --pretty
Or for local files:
python scripts/ocr_caller.py --file-path "file path" --pretty
Save result to file (recommended):
python scripts/ocr_caller.py --file-url "URL" --output result.json --pretty
3. Parse JSON response:
* Check the `ok` field: `true` means success, `false` means error
* Extract text: `text` field contains all recognized text
* Handle errors: If `ok` is false, display `error.message`
4. Present results to user:
* Display extracted text in a readable format
* If the text is empty, the image may contain no text
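Steps 3 and 4 can be sketched in Python as follows. This is a minimal illustration, not part of the skill's scripts: the helper name `extract_text` is hypothetical, and it assumes `error` is an object with a `message` key, as the `error.message` reference above suggests.

```python
import json


def extract_text(raw_output: str) -> str:
    """Return the full recognized text, or raise with the reported error message."""
    payload = json.loads(raw_output)
    if payload.get("ok"):
        # Return the whole `text` field untouched: no truncation, no summary.
        return payload.get("text", "")
    # Assumption: on failure, `error` is an object carrying a `message` key.
    error = payload.get("error") or {}
    raise RuntimeError(error.get("message", "unknown OCR error"))


# Hypothetical sample shaped like the documented response.
sample = '{"ok": true, "text": "Invoice #42\\nTotal: 10.00", "result": {}, "error": null}'
print(extract_text(sample))
```

An empty string returned here corresponds to the "image may contain no text" case rather than to an error.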
CRITICAL: Always display the COMPLETE recognized text to the user. Do NOT truncate or summarize the OCR results.
* The `text` field contains the complete text content
* Display the entire `text` content to the user, no matter how long it is
Correct approach:
I've extracted the text from the image. Here's the complete content:
[Display the entire text here]
Incorrect approach:
I found some text in the image. Here's a preview:
"The quick brown fox..." (truncated)
URL OCR:
python scripts/ocr_caller.py --file-url "https://example.com/invoice.jpg" --pretty
Local File OCR:
python scripts/ocr_caller.py --file-path "./document.pdf" --pretty
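The two invocations above differ only in the flag chosen for the input source. A small sketch of that choice, assuming the script is launched from the skill's root directory; `build_ocr_command` is a hypothetical helper, and combining `--output` with `--file-path` is an assumption, since the source only shows `--output` together with `--file-url`.

```python
import sys
from typing import List, Optional


def build_ocr_command(source: str, output: Optional[str] = None) -> List[str]:
    """Build the argv list for scripts/ocr_caller.py from a URL or a local path."""
    # Anything starting with http:// or https:// is treated as a URL;
    # everything else is treated as a local file path.
    is_url = source.startswith(("http://", "https://"))
    flag = "--file-url" if is_url else "--file-path"
    command = [sys.executable, "scripts/ocr_caller.py", flag, source]
    if output is not None:
        # Assumption: --output is accepted with either input flag.
        command += ["--output", output]
    command.append("--pretty")
    return command


print(build_ocr_command("https://example.com/invoice.jpg")[1:])
print(build_ocr_command("./document.pdf", output="result.json")[1:])
```

The resulting list can be passed to `subprocess.run(command, capture_output=True, text=True)`, which avoids shell quoting problems with paths that contain spaces.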
The script outputs a JSON structure as follows:
{
"ok": true,
"text": "All recognized text here...",
"result": { ... },
"error": null
}
Key fields:
* `ok`: `true` for success, `false` for error
* `text`: Complete recognized text
* `result`: Raw API response (for debugging)
* `error`: Error details if `ok` is false
When API is not configured:
The error will show:
CONFIG_ERROR: PADDLEOCR_OCR_API_URL not configured. Get your API at: https://paddleocr.com
Configuration workflow:
Show the exact error message to the user (including the URL)
Tell the user to provide credentials:
Please visit the URL above to get your PADDLEOCR_OCR_API_URL and PADDLEOCR_ACCESS_TOKEN.
Once you have them, send them to me and I'll configure it automatically.
When the user provides credentials (accept any format):
PADDLEOCR_OCR_API_URL=https://xxx.paddleocr.com/ocr, PADDLEOCR_ACCESS_TOKEN=abc123...
Here's my API: https://xxx and token: abc123
Parse credentials from user's message:
Configure automatically:
python scripts/configure.py --api-url "PARSED_URL" --token "PARSED_TOKEN"
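One way to handle the "accept any format" requirement is a pair of loose regular expressions. The sketch below is illustrative only: `parse_credentials` and `build_configure_command` are hypothetical names, and the patterns assume the URL starts with `http://` or `https://` and the token follows the word "token" plus `=` or `:`, which covers both example messages above but not every possible phrasing.

```python
import re
import sys
from typing import List, Optional, Tuple

# First http(s) URL in the message, stopping at whitespace or a comma.
URL_PATTERN = re.compile(r"https?://[^\s,]+")
# Value after "token" (any case, e.g. PADDLEOCR_ACCESS_TOKEN) and "=" or ":".
TOKEN_PATTERN = re.compile(r"token\s*[=:]\s*([^\s,]+)", re.IGNORECASE)


def parse_credentials(message: str) -> Tuple[Optional[str], Optional[str]]:
    """Return (api_url, token); either is None when it cannot be found."""
    url_match = URL_PATTERN.search(message)
    token_match = TOKEN_PATTERN.search(message)
    url = url_match.group(0) if url_match else None
    token = token_match.group(1) if token_match else None
    return url, token


def build_configure_command(url: str, token: str) -> List[str]:
    """Build the argv list for scripts/configure.py with the parsed values."""
    return [sys.executable, "scripts/configure.py", "--api-url", url, "--token", token]


print(parse_credentials("Here's my API: https://xxx and token: abc123"))
```

If either value comes back as `None`, asking the user to resend the missing piece is safer than guessing.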
Authentication failed:
API_ERROR: Authentication failed (403). Check your token.
Quota exceeded:
API_ERROR: API rate limit exceeded (429)
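The error strings shown so far share a prefix and, for API errors, a status code in parentheses. A hedged sketch of how a caller might route them; the function name and the returned category labels are hypothetical and not defined by the skill.

```python
import re


def classify_error(message: str) -> str:
    """Map an error message to a coarse category based on its prefix and status code."""
    if message.startswith("CONFIG_ERROR"):
        return "needs_configuration"
    status = re.search(r"\((\d{3})\)", message)
    if message.startswith("API_ERROR") and status:
        code = int(status.group(1))
        if code == 403:
            return "check_token"
        if code == 429:
            return "rate_limited"
    # Anything else (network failures, unknown codes) falls through.
    return "other"


print(classify_error("API_ERROR: Authentication failed (403). Check your token."))
print(classify_error("API_ERROR: API rate limit exceeded (429)"))
```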
No text detected:
text field is empty
If recognition quality is poor, suggest:
For in-depth understanding of the OCR system, refer to:
* references/output_schema.md - Output format specification
* references/provider_api.md - Provider API contract
Note: Model version and capabilities are determined by your API endpoint (PADDLEOCR_OCR_API_URL).
To verify the skill is working properly:
python scripts/smoke_test.py
This tests configuration and API connectivity.
Weekly Installs: 707
Repository: aidenwu0209/paddleocr-skills
GitHub Stars: 5
First Seen: Feb 9, 2026
Security Audits: Gen Agent Trust Hub (Pass), Socket (Fail), Snyk (Fail)
Installed on: opencode (696), codex (694), gemini-cli (694), kimi-cli (693), amp (693), github-copilot (693)
If configuration succeeds:
If configuration fails: