document-converter-suite by dkyazzentwatwa/chatgpt-skills
npx skills add https://github.com/dkyazzentwatwa/chatgpt-skills --skill document-converter-suite
Provide a best-effort conversion workflow between 8 document formats:
Office Formats: PDF, Word (DOCX), PowerPoint (PPTX), Excel (XLSX)
Text Formats: Plain Text (TXT), CSV, Markdown (MD), HTML
Uses pypdf, python-docx, python-pptx, openpyxl, reportlab, mistune, beautifulsoup4, and Pillow.
Prefer reliable extraction + rebuild (text, headings, bullets, basic tables) over pixel-perfect layout.
Use when the request involves converting a document between any of the 8 supported formats.
Supported conversion paths: 64 total (8×8 matrix) - see references/conversion_matrix.md
Avoid promising visual fidelity. Emphasize that output is clean and structured, not identical.
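The count of 64 follows from pairing each of the 8 formats with every format, which means same-format pairs (such as md to md) are counted too. A minimal sketch of that arithmetic, assuming the format identifiers are the ones accepted by `--to` in the commands in this document; `conversion_paths` is a hypothetical helper, not part of the skill:

```python
from itertools import product

# The eight identifiers accepted by --to in this document's commands.
FORMATS = ["pdf", "docx", "pptx", "xlsx", "txt", "csv", "md", "html"]


def conversion_paths(formats=FORMATS):
    """Return every (source, target) pair, including same-format pairs."""
    return list(product(formats, repeat=2))


paths = conversion_paths()
print(len(paths))  # 64
```

What each pair actually does is described in references/conversion_matrix.md; this sketch only enumerates the pairs.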
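Conversion scripts: scripts/convert.py for a single file, scripts/batch_convert.py for a directory.
Optional limit flags: --max-pages, --max-chars, --max-rows, --max-cols.
For a single file, run: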
python scripts/convert.py <input-file> --to <pdf|docx|pptx|xlsx|txt|csv|md|html>
Examples:
# Office format conversions
python scripts/convert.py report.pdf --to docx
python scripts/convert.py deck.pptx --to pdf --out deck_export.pdf
python scripts/convert.py data.xlsx --to pptx --max-rows 40 --max-cols 12
# Text format conversions
python scripts/convert.py documentation.md --to docx
python scripts/convert.py data.csv --to xlsx
python scripts/convert.py report.docx --to html
python scripts/convert.py notes.txt --to md
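When the converter is driven from another Python program, the same command lines can be assembled as an argument list. The sketch below is one way to do that, using only the flags that appear in the examples above (`--to`, `--out`, `--max-rows`, `--max-cols`); `build_convert_command` is a hypothetical helper and not part of the skill.

```python
import sys

# Target identifiers accepted by --to, as listed in the usage line above.
TARGETS = {"pdf", "docx", "pptx", "xlsx", "txt", "csv", "md", "html"}


def build_convert_command(input_file, to, out=None, max_rows=None, max_cols=None):
    """Build an argv list for scripts/convert.py (hypothetical helper)."""
    if to not in TARGETS:
        raise ValueError(f"unsupported target format: {to!r}")
    cmd = [sys.executable, "scripts/convert.py", str(input_file), "--to", to]
    if out is not None:
        cmd += ["--out", str(out)]
    if max_rows is not None:
        cmd += ["--max-rows", str(max_rows)]
    if max_cols is not None:
        cmd += ["--max-cols", str(max_cols)]
    return cmd


cmd = build_convert_command("data.xlsx", "pptx", max_rows=40, max_cols=12)
print(" ".join(cmd[1:]))
# scripts/convert.py data.xlsx --to pptx --max-rows 40 --max-cols 12
```

Passing the list to `subprocess.run(cmd, check=True)` from the skill's root directory would execute it; building a list rather than a shell string avoids quoting problems with file names that contain spaces.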
For a directory, run:
python scripts/batch_convert.py <input-dir> --to <pdf|docx|pptx|xlsx|txt|csv|md|html>
Examples:
python scripts/batch_convert.py ./inbox --to docx --recursive
python scripts/batch_convert.py ./inbox --to pdf --outdir ./out --recursive --overwrite
python scripts/batch_convert.py ./markdown-docs --to html --pattern "*.md"
python scripts/batch_convert.py ./data --to xlsx --pattern "*.csv"
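The source of batch_convert.py is not shown here, so the following is only a sketch of how `--pattern`, `--recursive`, and `--outdir` could plausibly select inputs and name outputs. The function names and the exact semantics (glob-style pattern, output named after the source file's stem) are assumptions, not the script's documented behavior.

```python
from pathlib import Path


def select_inputs(input_dir, pattern="*", recursive=False):
    """Return files under input_dir matching pattern, sorted for stable order."""
    root = Path(input_dir)
    # Assumption: --recursive descends into subdirectories, like rglob.
    matches = root.rglob(pattern) if recursive else root.glob(pattern)
    return sorted(p for p in matches if p.is_file())


def output_path(src, to, outdir=None):
    """Name the output after the source file, swapping in the target extension."""
    src = Path(src)
    # Assumption: without --outdir, the output sits next to the source file.
    target_dir = Path(outdir) if outdir is not None else src.parent
    return target_dir / f"{src.stem}.{to}"


# Mirrors: batch_convert.py ./markdown-docs --to html --pattern "*.md"
for src in select_inputs("./markdown-docs", pattern="*.md"):
    print(src, "->", output_path(src, "html"))
```

Sorting the matches keeps the processing order stable across runs, which makes batch logs easier to compare.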
Follow these defaults (and say them out loud if the user might be expecting magic):
pypdf extracts PDF text; no OCR; each page becomes a section/slide block.
mistune parses Markdown; headings, lists, tables, and code blocks are preserved.
beautifulsoup4 parses HTML; semantic structure is extracted.
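The "each page becomes a section/slide block" default for PDFs can be sketched as below. `read_pdf_pages` uses pypdf's `PdfReader` and `extract_text()`; `pages_to_sections` is a hypothetical helper, and the way it applies `max_pages` and `max_chars` (a page cap and a per-page character cap) is an assumption about what the `--max-pages` and `--max-chars` flags mean, not the skill's confirmed behavior.

```python
def read_pdf_pages(path):
    """Extract one text string per page with pypdf (text layer only, no OCR)."""
    from pypdf import PdfReader  # lazy import: the rest runs without pypdf

    reader = PdfReader(path)
    # Scanned or image-only pages have no text layer and yield an empty string.
    return [page.extract_text() or "" for page in reader.pages]


def pages_to_sections(page_texts, max_pages=None, max_chars=None):
    """Turn per-page text into titled sections, applying optional limits."""
    sections = []
    # Slicing with None keeps every page (assumed meaning of no --max-pages).
    for number, text in enumerate(page_texts[:max_pages], start=1):
        body = text.strip()
        if max_chars is not None:
            body = body[:max_chars]  # assumed: the cap applies per page
        sections.append({"title": f"Page {number}", "body": body})
    return sections


sections = pages_to_sections(["First page text", "", "Third page"], max_pages=2)
print([s["title"] for s in sections])  # ['Page 1', 'Page 2']
```

The empty second page in the usage line shows why the no-OCR caveat is worth stating up front: a scanned PDF would produce sections with titles but no body text.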
image_handler.py provides hash-based deduplication for future image support.
Load extra detail from:
references/conversion_matrix.md - Full 8×8 conversion matrix
references/limitations.md - Format-specific limitations and edge cases
scripts/convert.py: single-file CLI converter
scripts/batch_convert.py: batch converter for directories
scripts/lib/*: internal readers/writers and conversion orchestration
Weekly Installs: 201
Repository: dkyazzentwatwa/chatgpt-skills
GitHub Stars: 36
First Seen: Jan 24, 2026
Security Audits: Gen Agent Trust Hub (Pass), Socket (Pass), Snyk (Pass)
Installed on: opencode (174), cursor (166), gemini-cli (163), codex (162), github-copilot (152), amp (139)