pdf by skillcreatorai/ai-agent-skills
npx skills add https://github.com/skillcreatorai/ai-agent-skills --skill pdffrom pypdf import PdfReader, PdfWriter
# 读取 PDF
reader = PdfReader("document.pdf")
print(f"Pages: {len(reader.pages)}")
# 提取文本
text = ""
for page in reader.pages:
text += page.extract_text()
from pypdf import PdfWriter, PdfReader
writer = PdfWriter()
for pdf_file in ["doc1.pdf", "doc2.pdf", "doc3.pdf"]:
reader = PdfReader(pdf_file)
for page in reader.pages:
writer.add_page(page)
with open("merged.pdf", "wb") as output:
writer.write(output)
reader = PdfReader("input.pdf")
for i, page in enumerate(reader.pages):
writer = PdfWriter()
writer.add_page(page)
with open(f"page_{i+1}.pdf", "wb") as output:
writer.write(output)
import pdfplumber
import pandas as pd
with pdfplumber.open("document.pdf") as pdf:
all_tables = []
for page in pdf.pages:
tables = page.extract_tables()
for table in tables:
if table:
df = pd.DataFrame(table[1:], columns=table[0])
all_tables.append(df)
广告位招租
在这里展示您的产品或服务
触达数万 AI 开发者,精准高效
from reportlab.lib.pagesizes import letter
from reportlab.pdfgen import canvas
c = canvas.Canvas("hello.pdf", pagesize=letter)
width, height = letter
c.drawString(100, height - 100, "Hello World!")
c.save()
# 提取文本 (poppler-utils)
pdftotext input.pdf output.txt
# 合并 PDF (qpdf)
qpdf --empty --pages file1.pdf file2.pdf -- merged.pdf
# 拆分页面
qpdf input.pdf --pages . 1-5 -- pages1-5.pdf
| 任务 | 最佳工具 | 命令/代码 |
|---|---|---|
| 合并 PDF | pypdf | writer.add_page(page) |
| 拆分 PDF | pypdf | 每页一个文件 |
| 提取文本 | pdfplumber | page.extract_text() |
| 提取表格 | pdfplumber | page.extract_tables() |
| 创建 PDF | reportlab | Canvas 或 Platypus |
| 扫描 PDF 的 OCR | pytesseract | 先转换为图像 |
每周安装量
124
代码仓库
GitHub 星标数
953
首次出现
2026年1月20日
安全审计
安装于
opencode99
gemini-cli97
codex92
claude-code82
github-copilot82
cursor81
from pypdf import PdfReader, PdfWriter
# Read a PDF
reader = PdfReader("document.pdf")
print(f"Pages: {len(reader.pages)}")
# Extract text
text = ""
for page in reader.pages:
text += page.extract_text()
from pypdf import PdfWriter, PdfReader
writer = PdfWriter()
for pdf_file in ["doc1.pdf", "doc2.pdf", "doc3.pdf"]:
reader = PdfReader(pdf_file)
for page in reader.pages:
writer.add_page(page)
with open("merged.pdf", "wb") as output:
writer.write(output)
reader = PdfReader("input.pdf")
for i, page in enumerate(reader.pages):
writer = PdfWriter()
writer.add_page(page)
with open(f"page_{i+1}.pdf", "wb") as output:
writer.write(output)
import pdfplumber
import pandas as pd
with pdfplumber.open("document.pdf") as pdf:
all_tables = []
for page in pdf.pages:
tables = page.extract_tables()
for table in tables:
if table:
df = pd.DataFrame(table[1:], columns=table[0])
all_tables.append(df)
from reportlab.lib.pagesizes import letter
from reportlab.pdfgen import canvas
c = canvas.Canvas("hello.pdf", pagesize=letter)
width, height = letter
c.drawString(100, height - 100, "Hello World!")
c.save()
# Extract text (poppler-utils)
pdftotext input.pdf output.txt
# Merge PDFs (qpdf)
qpdf --empty --pages file1.pdf file2.pdf -- merged.pdf
# Split pages
qpdf input.pdf --pages . 1-5 -- pages1-5.pdf
| Task | Best Tool | Command/Code |
|---|---|---|
| Merge PDFs | pypdf | writer.add_page(page) |
| Split PDFs | pypdf | One page per file |
| Extract text | pdfplumber | page.extract_text() |
| Extract tables | pdfplumber | page.extract_tables() |
| Create PDFs | reportlab | Canvas or Platypus |
| OCR scanned PDFs | pytesseract | Convert to image first |
Weekly Installs
124
Repository
GitHub Stars
953
First Seen
Jan 20, 2026
Security Audits
Gen Agent Trust HubFailSocketPassSnykPass
Installed on
opencode99
gemini-cli97
codex92
claude-code82
github-copilot82
cursor81
Skills CLI 使用指南:AI Agent 技能包管理器安装与管理教程
36,300 周安装