translate-book-parallel: a Claude Code skill that translates entire books (PDF/DOCX/EPUB) using parallel subagents

translate-book-parallel by aradotso/trending-skills

180 weekly installs

10 GitHub Stars

Install command

npx skills add https://github.com/aradotso/trending-skills --skill translate-book-parallel

Content creation · Automation · Natural language processing

Translate Book (Parallel Subagents)

Skill by ara.so — Daily 2026 Skills collection.

A Claude Code skill that translates entire books (PDF/DOCX/EPUB) into any language using parallel subagents. Each chunk gets an isolated context window — preventing truncation and context accumulation that plague single-session translation.

Pipeline Overview

Input (PDF/DOCX/EPUB)
  │
  ▼
Calibre ebook-convert → HTMLZ → HTML → Markdown
  │
  ▼
Split into chunks (~6000 chars each)
  │  manifest.json tracks SHA-256 hashes
  ▼
Parallel subagents (8 concurrent by default)
  │  each: read chunk → translate → write output_chunk*.md
  ▼
Validate (manifest hash check, 1:1 source↔output match)
  │
  ▼
Merge → Pandoc → HTML (with TOC) → Calibre → DOCX / EPUB / PDF

Prerequisites

# 1. Calibre (provides ebook-convert)
# macOS
brew install --cask calibre
# Linux
sudo apt-get install calibre
# Or download from https://calibre-ebook.com/

# 2. Pandoc
brew install pandoc        # macOS
sudo apt-get install pandoc # Linux

# 3. Python dependencies
pip install pypandoc beautifulsoup4

Verify all tools are available:

ebook-convert --version
pandoc --version
python3 -c "import pypandoc; print('pypandoc ok')"
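
The same checks can be scripted. The following is a minimal sketch, not part of the skill itself: it assumes only the tool and package names listed above, and the function name `missing_prerequisites` is made up for illustration.

```python
import importlib.util
import shutil

# Names taken from the prerequisites above; beautifulsoup4 is imported as "bs4".
REQUIRED_TOOLS = ("ebook-convert", "pandoc")
REQUIRED_MODULES = ("pypandoc", "bs4")


def missing_prerequisites(tools=REQUIRED_TOOLS, modules=REQUIRED_MODULES):
    """Return tools that are not on PATH and modules that cannot be located."""
    missing = [t for t in tools if shutil.which(t) is None]
    missing += [m for m in modules if importlib.util.find_spec(m) is None]
    return missing


if __name__ == "__main__":
    problems = missing_prerequisites()
    print("all prerequisites found" if not problems else f"missing: {problems}")
```

An empty list means every command-line tool and Python package was found; it does not confirm that the installed versions are compatible with the skill.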

Installation

Option A: npx (recommended)

npx skills add deusyu/translate-book -a claude-code -g

Option B: ClawHub

clawhub install translate-book

Option C: Git clone

git clone https://github.com/deusyu/translate-book.git ~/.claude/skills/translate-book

Usage in Claude Code

Once the skill is installed, use natural language inside Claude Code:

translate /path/to/book.pdf to Chinese



translate ~/Downloads/mybook.epub to Japanese



/translate-book translate /path/to/book.docx to French

The skill orchestrates the full pipeline automatically.

Supported Languages

Code	Language
`zh`	Chinese
`en`	English
`ja`	Japanese
`ko`	Korean
`fr`	French
`de`	German
`es`	Spanish

Language codes are extensible — add new ones in the skill definition.

Running Pipeline Steps Manually

Step 1: Convert to Markdown Chunks

python3 scripts/convert.py /path/to/book.pdf --olang zh

This produces inside {book_name}_temp/:

chunk0001.md, chunk0002.md, ... (source chunks, ~6000 chars each)
manifest.json (SHA-256 hashes for validation)
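
To make the chunk-and-manifest idea concrete, here is a rough sketch of how Markdown could be split into chunks of about 6000 characters and hashed. It is an illustration under assumptions, not the code in `convert.py`: splitting on blank lines, the helper names `split_markdown` and `write_chunks`, and the `sha256:` prefix (borrowed from the conceptual manifest example in this document) may all differ from the real script.

```python
import hashlib
import json
from pathlib import Path


def split_markdown(text, target=6000):
    """Greedily pack paragraphs into chunks of roughly `target` characters.

    A single paragraph longer than `target` is kept whole (assumption).
    """
    chunks, current = [], ""
    for para in text.split("\n\n"):
        candidate = f"{current}\n\n{para}" if current else para
        if current and len(candidate) > target:
            chunks.append(current)
            current = para
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def write_chunks(text, temp_dir, target=6000):
    """Write chunkNNNN.md files plus a manifest.json of their SHA-256 hashes."""
    temp_dir = Path(temp_dir)
    temp_dir.mkdir(parents=True, exist_ok=True)
    manifest = {}
    for i, chunk in enumerate(split_markdown(text, target), start=1):
        name = f"chunk{i:04d}.md"
        data = chunk.encode("utf-8")
        (temp_dir / name).write_bytes(data)
        manifest[name] = "sha256:" + hashlib.sha256(data).hexdigest()
    (temp_dir / "manifest.json").write_text(
        json.dumps(manifest, indent=2), encoding="utf-8"
    )
    return manifest
```

Splitting at paragraph boundaries keeps sentences intact, which matters because each chunk is later translated without seeing its neighbours.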

For EPUB input

python3 scripts/convert.py /path/to/book.epub --olang ja

For DOCX input

python3 scripts/convert.py /path/to/book.docx --olang fr

Step 2: Translate (Parallel Subagents)

The skill handles this step — it launches 8 concurrent subagents per batch, each translating one chunk independently:

# Each subagent receives exactly this task:
Read chunk0042.md → translate to target language → write output_chunk0042.md

Resumable: Already-translated chunks (valid output_chunk*.md files) are skipped on re-run.
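
The resume-and-batch behaviour described above can be sketched as follows. This is a hypothetical illustration, not the orchestration logic in SKILL.md: it treats any non-empty `output_chunk*.md` file as valid, and the helper names `pending_chunks` and `batches` are invented here.

```python
from pathlib import Path


def pending_chunks(temp_dir):
    """Return source chunk names whose output_chunk file is missing or empty."""
    pending = []
    for src in sorted(Path(temp_dir).glob("chunk*.md")):
        out = src.with_name(f"output_{src.name}")
        if not out.is_file() or out.stat().st_size == 0:
            pending.append(src.name)
    return pending


def batches(items, size=8):
    """Group items into consecutive batches of at most `size`."""
    return [items[i:i + size] for i in range(0, len(items), size)]
```

Calling `batches(pending_chunks("book_name_temp"))` would then yield the groups of chunks to hand to subagents, eight at a time by default.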

Step 3: Merge and Build All Formats

python3 scripts/merge_and_build.py \
  --temp-dir book_name_temp \
  --title "《Book Title in Target Language》"

Before merging, the script checks that:

Every source chunk has a matching output file (1:1)
Source chunk hashes match manifest.json (no stale outputs)
No output files are empty

Outputs produced:

File	Description
`output.md`	Merged translated Markdown
`book.html`	Web version with floating TOC
`book.docx`	Word document
`book.epub`	E-book format
`book.pdf`	Print-ready PDF

Project Structure

translate-book/
├── SKILL.md                    # Claude Code skill definition (orchestrator)
├── scripts/
│   ├── convert.py              # PDF/DOCX/EPUB → Markdown chunks via Calibre HTMLZ
│   ├── manifest.py             # SHA-256 chunk tracking and merge validation
│   ├── merge_and_build.py      # Merge chunks → HTML → DOCX/EPUB/PDF
│   ├── calibre_html_publish.py # Calibre wrapper for format conversion
│   ├── template.html           # Web HTML template with floating TOC
│   └── template_ebook.html     # Ebook HTML template
└── README.md

How Manifest Validation Works

# scripts/manifest.py (conceptual usage)

# During convert.py — records source hashes
manifest = {
    "chunk0001.md": "sha256:abc123...",
    "chunk0002.md": "sha256:def456...",
    # ...
}

# During merge_and_build.py — validates before merging
# 1. Check every chunk has a corresponding output_chunk
# 2. Re-hash source chunks and compare against manifest
# 3. Reject if any hash mismatches (stale/corrupt output)
# 4. Reject if any output file is empty

If an existing output.md no longer passes validation, the script deletes the stale output.md and re-merges from the valid chunk outputs.
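
The four checks listed in the comments above could look roughly like this in Python. It is a sketch under assumptions rather than the contents of `scripts/manifest.py`: the flat `{"chunkNNNN.md": "sha256:<hex>"}` manifest layout is taken from the conceptual example, and the function name `validate` and its message strings are invented for illustration.

```python
import hashlib
import json
from pathlib import Path


def validate(temp_dir):
    """Return a list of problems; an empty list means the merge can proceed."""
    temp_dir = Path(temp_dir)
    manifest = json.loads((temp_dir / "manifest.json").read_text(encoding="utf-8"))
    problems = []
    for name, recorded in manifest.items():
        src = temp_dir / name
        out = temp_dir / f"output_{name}"
        if not src.is_file():
            problems.append(f"missing source chunk: {name}")
            continue
        # Re-hash the source chunk and compare it with the recorded value.
        actual = "sha256:" + hashlib.sha256(src.read_bytes()).hexdigest()
        if actual != recorded:
            problems.append(f"hash mismatch: {name}")
        # Every source chunk needs a non-empty translated counterpart.
        if not out.is_file():
            problems.append(f"missing output: {out.name}")
        elif out.stat().st_size == 0:
            problems.append(f"empty output: {out.name}")
    return problems
```

Returning all problems at once, instead of stopping at the first, makes it easier to see which chunks need to be re-translated in a single pass.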

Real-World Example: Translate a Technical Book

# 1. Install the skill
npx skills add deusyu/translate-book -a claude-code -g

# 2. Open Claude Code in your working directory
cd ~/books

# 3. Say in Claude Code:
# "translate clean-code.pdf to Chinese"

# Claude Code will:
# - Run convert.py to split into chunks
# - Launch 8 parallel subagents per batch
# - Each subagent translates one chunk
# - Validate all outputs via manifest
# - Merge and build all formats

# 4. Outputs appear in:
ls clean-code_temp/
# chunk0001.md  chunk0002.md  ...  (source)
# output_chunk0001.md  ...         (translated)
# manifest.json
# output.md
# book.html
# book.docx
# book.epub
# book.pdf

Resuming an Interrupted Translation

# If translation is interrupted, just re-run the same command:
# "translate clean-code.pdf to Chinese"

# The skill detects existing output_chunk*.md files
# and skips already-translated chunks automatically.
# Only missing or failed chunks are retried.

Changing Output Metadata After Translation

If you need to update the title, author, template, or image assets without re-translating:

# Delete only the final artifacts (keeps translated chunks)
cd book_name_temp/
rm -f output.md book*.html book.docx book.epub book.pdf

# Re-run merge step
python3 ../scripts/merge_and_build.py \
  --temp-dir . \
  --title "《New Title》"

Do NOT delete chunk files — those are your translated content. Only delete final artifacts when changing metadata.

Troubleshooting

Problem	Solution
`Calibre ebook-convert not found`	Install Calibre; ensure `ebook-convert` is in `$PATH`
`Manifest validation failed`	Source chunks changed — re-run `convert.py`
`Missing source chunk`	Source file deleted — re-run `convert.py` to regenerate
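Incomplete translation	Re-run the skill; it resumes from the last valid chunk
Changed title/template but output is unchanged	Delete `output.md`, `book*.html`, `book.docx`, `book.epub`, `book.pdf`, then re-run `merge_and_build.py`
`output.md exists but manifest invalid`	The script auto-deletes the stale output and re-merges
PDF generation fails	Verify that Calibre supports PDF output; try `ebook-convert --help`
Empty output chunks	Retry the failed chunks; check API rate limits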

Diagnosing Chunk Issues

# Check which chunks are missing translation
ls book_temp/chunk*.md | wc -l          # total source chunks
ls book_temp/output_chunk*.md | wc -l   # translated chunks so far

# Find missing output chunks
for f in book_temp/chunk*.md; do
  base=$(basename "$f" .md)
  out="book_temp/output_${base}.md"
  if [ ! -f "$out" ] || [ ! -s "$out" ]; then
    echo "Missing: $out"
  fi
done

# Check manifest
python3 -m json.tool book_temp/manifest.json | head -30

Configuration Tips

Chunk size: ~6000 chars per chunk is the default. Smaller chunks = more parallelism but more API calls.
Concurrency: Default is 8 parallel subagents per batch. Adjust in SKILL.md if hitting rate limits.
Languages: Add new language codes to the skill triggers and translation prompt in SKILL.md.
Templates: Customize scripts/template.html and scripts/template_ebook.html for different HTML/ebook styling.
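
The chunk-size trade-off is easy to quantify. The sketch below assumes one subagent call per chunk (as described in Step 2) and uses a hypothetical 600,000-character book; the function name `estimate` is made up, and real chunk counts will vary because chunks are only approximately 6000 characters.

```python
import math


def estimate(total_chars, chunk_size=6000, concurrency=8):
    """Rough chunk and batch counts for a book of `total_chars` characters."""
    chunks = math.ceil(total_chars / chunk_size)
    return chunks, math.ceil(chunks / concurrency)


print(estimate(600_000))        # (100, 13): 100 chunks in 13 batches
print(estimate(600_000, 3000))  # (200, 25): halving the chunk size doubles the calls
```

At a fixed concurrency of 8, smaller chunks add batches instead of shortening the run, so they only pay off if the concurrency is raised as well.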

Key Design Principles

Isolated context per chunk — each subagent starts fresh, preventing context overflow on long books
Hash-based integrity — SHA-256 tracking catches stale or corrupt translated chunks before merging
Resumable at chunk granularity — never re-translate what's already done
Format-agnostic input — Calibre handles PDF/DOCX/EPUB normalization before the pipeline begins
Multiple output formats — single pipeline produces HTML, DOCX, EPUB, and PDF simultaneously

