PDF-to-Markdown tool: automatic detection of native-text vs. scanned documents (OCR conversion planned)

pdf-to-markdown by andreadellacorte/groove

129 weekly installs

4 GitHub Stars

Install command

npx skills add https://github.com/andreadellacorte/groove --skill pdf-to-markdown

Node.js, automation, file management

[IMPORTANT] Use TaskCreate to break ALL work into small tasks BEFORE starting — including tasks for each file read. This prevents context loss from long files. For simple tasks, AI MUST ask user whether to skip.

Quick Summary

Goal: Convert PDF files to well-formatted Markdown with auto-detection of native text vs scanned documents. Only native-text conversion is implemented; OCR is planned.

Workflow:

Auto-Detect — Determine if PDF has native text or needs OCR
Convert — Run scripts/convert.cjs with input path and optional mode/output flags
Output — Returns JSON with success status, page count, and output path

Key Rules:

Use --mode auto (default) to let the tool decide native vs OCR
OCR for scanned PDFs requires additional tesseract.js setup
Complex multi-column layouts may not preserve structure perfectly

pdf-to-markdown

Convert PDF files to Markdown format with automatic detection of native text vs scanned documents.

No npm install required

The skill runs with Node.js only; there is no node_modules directory in the repo. On first use it runs @opendocsg/pdf2md through npx, which caches the package. Optional: run npm install in the skill directory for faster runs without npx.

Requirements: Node.js ≥18, npx (included with npm).

Quick Start

# Basic conversion (auto-detect native vs scanned)
node .agents/skills/pdf-to-markdown/scripts/convert.cjs --input ./document.pdf

# Specify output path
node .agents/skills/pdf-to-markdown/scripts/convert.cjs -i ./doc.pdf -o ./output.md

# Force native mode (skip OCR detection)
node .agents/skills/pdf-to-markdown/scripts/convert.cjs -i ./doc.pdf --mode native
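
If the converter is driven from another Node.js script rather than typed by hand, the commands above could be wrapped roughly as follows. This is only a sketch: `buildConvertArgs` and `convertPdf` are hypothetical helper names, and it assumes the JSON result is written to stdout, which the skill description does not state explicitly.

```javascript
// Hypothetical wrapper around the documented CLI; not part of the skill itself.
const { execFileSync } = require('node:child_process');

const SCRIPT = '.agents/skills/pdf-to-markdown/scripts/convert.cjs';

// Build the argument list from the documented flags (--input, --output, --mode).
function buildConvertArgs({ input, output, mode }) {
  if (!input) throw new Error('input is required');
  const args = [SCRIPT, '--input', input];
  if (output) args.push('--output', output);
  if (mode) args.push('--mode', mode);
  return args;
}

// Run the converter and parse its result.
// Assumption: the JSON result is printed to stdout.
function convertPdf(options) {
  const stdout = execFileSync('node', buildConvertArgs(options), { encoding: 'utf8' });
  return JSON.parse(stdout);
}
```

Passing the arguments as an array to `execFileSync` avoids shell quoting problems with paths that contain spaces.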

CLI Options

Option	Short	Description	Default
`--input`	`-i`	Input PDF file path	(required)
`--output`	`-o`	Output markdown file path	`{input}.md`
`--mode`	`-m`	Conversion mode: `auto`, `native`, `ocr`	`auto`
`--help`	`-h`	Show help message	
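
To make the flag semantics concrete, here is a minimal sketch of how the documented flags (`--input`/`-i`, `--output`/`-o`, `--mode`/`-m`, `--help`/`-h`) could be parsed with Node's built-in `util.parseArgs`. It is not taken from `convert.cjs`; `parseCliOptions` is a hypothetical name, and the sketch deliberately leaves the default output path unresolved because the exact meaning of `{input}.md` is not spelled out.

```javascript
// Illustrative only: one way to parse the documented flags, not the skill's actual code.
const { parseArgs } = require('node:util');

const MODES = ['auto', 'native', 'ocr'];

function parseCliOptions(argv) {
  const { values } = parseArgs({
    args: argv,
    options: {
      input: { type: 'string', short: 'i' },
      output: { type: 'string', short: 'o' },
      mode: { type: 'string', short: 'm' },
      help: { type: 'boolean', short: 'h' },
    },
  });
  // --mode defaults to "auto" when omitted.
  const mode = values.mode ?? 'auto';
  if (!MODES.includes(mode)) throw new Error(`Unknown mode: ${mode}`);
  // --input is required unless the user only asked for help.
  if (!values.help && !values.input) throw new Error('--input is required');
  return { input: values.input, output: values.output, mode, help: Boolean(values.help) };
}
```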

Features

Auto-Detection: Automatically determines if PDF has native text or requires OCR
Native PDFs: Fast extraction using @opendocsg/pdf2md
Tables: Basic table structure preservation
Cross-Platform: Works on Windows, macOS, Linux
No System Dependencies: Pure JavaScript implementation

Conversion Modes

Auto (Default)

Checks whether the first page of the PDF has extractable text. If text is found, native extraction is used; otherwise the skill warns that the PDF appears to need OCR, which is not yet implemented.
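
The decision described here can be sketched as a small pure function. The name `resolveMode`, the warning text, and the "any non-whitespace character counts as text" threshold are all assumptions for illustration; the real heuristic inside `convert.cjs` may differ.

```javascript
// Sketch of the auto-detection decision; the real check in convert.cjs may differ.
// firstPageText is whatever text could be extracted from page 1 of the PDF.
function resolveMode(requestedMode, firstPageText) {
  // An explicit --mode native or --mode ocr skips detection entirely.
  if (requestedMode !== 'auto') {
    return { mode: requestedMode, warning: null };
  }
  // Assumption: any non-whitespace text on page 1 means the PDF has native text.
  const hasText = firstPageText.trim().length > 0;
  return hasText
    ? { mode: 'native', warning: null }
    : { mode: 'ocr', warning: 'PDF appears to be scanned; OCR is not implemented yet.' };
}
```

Because only the first page is inspected, a PDF with a scanned cover page followed by native-text pages would be classified as scanned under this scheme; forcing `--mode native` is the workaround in that case.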

Native

Fast direct text extraction. Best for PDFs with selectable text (not scanned images).

OCR (Scanned PDFs) - Coming Soon

For scanned documents. Currently not implemented - the skill will notify you if a PDF appears to be scanned.

Output

Returns JSON on success:

{
    "success": true,
    "input": "/path/to/input.pdf",
    "output": "/path/to/output.md",
    "stats": {
        "pages": 5,
        "mode": "native"
    }
}
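
A caller would typically check `success` before trusting the other fields. The sketch below reads only the fields shown in the sample above; `summarizeResult` is a hypothetical helper, and the shape of a failure response is not documented, so anything other than `success: true` is simply treated as an error.

```javascript
// Turn the converter's JSON result into a one-line summary.
// Only the fields shown in the documented success example are used.
function summarizeResult(jsonText) {
  const result = JSON.parse(jsonText);
  if (result.success !== true) {
    // Assumption: the failure shape is undocumented, so no further fields are read.
    throw new Error('Conversion did not report success');
  }
  const { pages, mode } = result.stats;
  return `${result.input} -> ${result.output} (${pages} pages, ${mode} mode)`;
}
```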

Limitations

Complex multi-column layouts may not preserve structure
Scanned PDF OCR accuracy depends on image quality
Mathematical formulas may not convert perfectly
First-run OCR downloads language data (~15MB)

OCR Setup (Optional)

OCR mode is wired into the skill but not yet implemented. If you want to prepare your environment or extend it yourself, install the OCR dependencies so Node can resolve them:

cd .agents/skills/pdf-to-markdown
npm install tesseract.js pdfjs-dist canvas

Note: The canvas package may require build tools on some systems.
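
After installing, a quick way to confirm that Node can resolve the three packages is a small check like the one below. It is a rough sketch, not part of the skill: `findMissingPackages` is a hypothetical helper, and `require.resolve` resolves relative to the file that calls it, so the script should be saved inside the skill directory. A package that restricts its entry points for CommonJS may be reported as missing even when it is installed.

```javascript
// Return the names of packages that require.resolve cannot find
// from the location of this script.
function findMissingPackages(names) {
  return names.filter((name) => {
    try {
      require.resolve(name);
      return false; // resolved, so not missing
    } catch {
      return true; // resolution failed
    }
  });
}

const missing = findMissingPackages(['tesseract.js', 'pdfjs-dist', 'canvas']);
console.log(missing.length ? `Missing: ${missing.join(', ')}` : 'All OCR packages resolve');
```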

IMPORTANT Task Planning Notes (MUST FOLLOW)

Always plan and break work into many small todo tasks
Always add a final review todo task to verify work quality and identify fixes/enhancements

Repository: andreadellacorte/groove

First Seen: 7 days ago

Security Audits: Gen Agent Trust Hub (Pass), Socket (Warn), Snyk (Warn)

Installed on: gemini-cli (40), github-copilot (40), codex (40), kimi-cli (40), cursor (40), amp (40)
