powerpoint-automation by aktsmm/agent-skills
npx skills add https://github.com/aktsmm/agent-skills --skill powerpoint-automation
AI-powered PPTX generation using Orchestrator-Workers pattern.
From Web Article:
Create a 15-slide presentation from: https://zenn.dev/example/article
From Existing PPTX:
Translate this presentation to Japanese: input/presentation.pptx
TRIAGE → PLAN → PREPARE_TEMPLATE → EXTRACT → TRANSLATE → BUILD → REVIEW → DONE
| Phase | Script/Agent | Description |
|---|---|---|
| EXTRACT | extract_images.py | Content → content.json |
| BUILD | create_from_template.py | Generate PPTX |
| REVIEW | PPTX Reviewer | Quality check |
→ references/SCRIPTS.md for complete reference
| Script | Purpose |
|---|---|
| create_from_template.py | Generate PPTX from content.json (main) |
| reconstruct_analyzer.py | Convert PPTX → content.json |
| extract_images.py | Extract images from PPTX/web |
| validate_content.py | Validate content.json schema |
| validate_pptx.py | Detect text overflow |
All agents communicate via this intermediate format:
{
  "slides": [
    { "type": "title", "title": "Title", "subtitle": "Sub" },
    { "type": "content", "title": "Topic", "items": ["Point 1"] }
  ]
}
→ references/schemas/content.schema.json
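A minimal sketch of a structural check for this intermediate format, using only the standard library. It is not the logic of validate_content.py or content.schema.json (neither is shown here); the function name `check_content` and the rules it enforces are assumptions inferred from the examples in this document.

```python
import json


def check_content(doc):
    """Return a list of problems found in a content.json-like dict (empty = OK)."""
    slides = doc.get("slides") if isinstance(doc, dict) else None
    if not isinstance(slides, list) or not slides:
        return ["'slides' must be a non-empty list"]
    problems = []
    for i, slide in enumerate(slides):
        if not isinstance(slide, dict) or not isinstance(slide.get("type"), str):
            problems.append(f"slide {i}: needs a string 'type'")
            continue
        # 'items' is the flat form: a list of plain strings.
        items = slide.get("items", [])
        if not (isinstance(items, list) and all(isinstance(x, str) for x in items)):
            problems.append(f"slide {i}: 'items' must be a list of strings")
        # 'bullets' is the hierarchical form: objects with text and level.
        bullets = slide.get("bullets", [])
        if not isinstance(bullets, list):
            problems.append(f"slide {i}: 'bullets' must be a list")
            continue
        for j, bullet in enumerate(bullets):
            ok = (isinstance(bullet, dict)
                  and isinstance(bullet.get("text"), str)
                  and isinstance(bullet.get("level", 0), int))
            if not ok:
                problems.append(f"slide {i}: bullet {j} needs 'text' (str) and 'level' (int)")
    return problems


sample = json.loads("""
{
  "slides": [
    { "type": "title", "title": "Title", "subtitle": "Sub" },
    { "type": "content", "title": "Topic", "items": ["Point 1"] }
  ]
}
""")
print(check_content(sample))
```

A check like this is cheap to run before the BUILD phase; the real schema may require more fields than this sketch assumes.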
| Template | Purpose | Layouts |
|---|---|---|
| assets/template.pptx | Default (Japanese, 16:9) | 4 layouts |
| Index | Name | Category | Purpose |
|---|---|---|---|
| 0 | タイトル スライド | title | Presentation opening |
| 1 | タイトルとコンテンツ | content | Standard content |
| 2 | 1_タイトルとコンテンツ | content | Standard content (alternate) |
| 3 | セクション見出し | section | Section divider |
Usage example:
python scripts/create_from_template.py assets/template.pptx content.json output.pptx --config assets/template_layouts.json
If the template PPTX contains multiple slide masters, the output can be unstable.
To check:
python scripts/create_from_template.py assets/template.pptx --list-layouts
To fix, regenerate template_layouts.json:
python scripts/analyze_template.py assets/template.pptx
For bullet lists that need hierarchy (indentation), use the bullets format instead of items (items render flat):
{"type": "content", "bullets": [
  {"text": "Item 1", "level": 0},
  {"text": "Detail 1", "level": 1},
  {"text": "Item 2", "level": 0}
]}
→ references/agents/ for definitions
| Agent | Purpose |
|---|---|
| Orchestrator | Pipeline coordination |
| Localizer | Translation (EN ↔ JA) |
| PPTX Reviewer | Final quality check |
Reference URLs must use "Title - URL" format for APPENDIX slides:
VPN Gateway の新機能 - https://learn.microsoft.com/ja-jp/azure/vpn-gateway/whats-new
→ references/content-guidelines.md for details
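As an illustration of the "Title - URL" rule, the following sketch parses a reference line and rejects bare URLs. The regular expression and the function name `parse_reference` are assumptions made for this example; the skill's own checker, if any, may differ.

```python
import re

# A non-empty title, then " - ", then an http(s) URL running to the end of the line.
REF_LINE = re.compile(r'^(?P<title>\S.*?)\s+-\s+(?P<url>https?://\S+)$')


def parse_reference(line):
    """Return (title, url) when the line follows "Title - URL", else None."""
    match = REF_LINE.match(line.strip())
    if match is None:
        return None
    return match.group('title'), match.group('url')


line = 'VPN Gateway の新機能 - https://learn.microsoft.com/ja-jp/azure/vpn-gateway/whats-new'
print(parse_reference(line))
print(parse_reference('https://learn.microsoft.com/ja-jp/azure/vpn-gateway/whats-new'))
```

The second call returns None because the line carries no title, which is the case the APPENDIX rule is meant to catch.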
| File | Content |
|---|---|
| SCRIPTS.md | Script documentation |
| USE_CASES.md | Workflow examples |
| content-guidelines.md | URL format, bullets |
| agents/ | Agent definitions |
| schemas/ | JSON schemas |
When adding Azure/Microsoft technical content to slides, follow the same verification workflow as QA:
[Content Request] → [Researcher] → [Reviewer] → [PPTX Update]
                         ↓              ↓
                 Docs MCP search   Content verification
Use microsoft_docs_search / microsoft_docs_fetch to gather official information.
Before generating PPTX, check if the file is locked:
# Check if file is locked
$path = "path/to/file.pptx"
try { [IO.File]::OpenWrite($path).Close(); "File is writable" }
catch { "File is LOCKED - close PowerPoint first" }
When creating network/architecture diagrams, use PowerPoint shapes instead of ASCII art text boxes. ASCII art is unreadable in presentation mode.
from pptx.enum.dml import MSO_LINE
from pptx.enum.shapes import MSO_CONNECTOR, MSO_SHAPE
from pptx.dml.color import RGBColor
from pptx.util import Cm, Pt
# Color scheme
AZURE_BLUE = RGBColor(0, 120, 212)
LIGHT_BLUE = RGBColor(232, 243, 255)
ONPREM_GREEN = RGBColor(16, 124, 65)
LIGHT_GREEN = RGBColor(232, 248, 237)
# Outer frame (Azure VNet); slide, left, top, w, h come from the caller
box = slide.shapes.add_shape(MSO_SHAPE.ROUNDED_RECTANGLE, left, top, w, h)
box.fill.solid()
box.fill.fore_color.rgb = LIGHT_BLUE
box.line.color.rgb = AZURE_BLUE
# Dashed connector (tunnel); named enum members replace the raw integers
conn = slide.shapes.add_connector(MSO_CONNECTOR.STRAIGHT, x1, y1, x2, y2)
conn.line.color.rgb = AZURE_BLUE
conn.line.dash_style = MSO_LINE.DASH
Use Cm() for positioning (not Inches()): it is easier to reason about on metric-based slides.
❌ ASCII art in textboxes (unreadable in presentation mode)
❌ Overlapping shapes due to insufficient spacing
❌ Placing labels outside their parent containers
❌ Using absolute EMU values without helper functions
Batch-add hyperlinks and page titles to all URLs in a presentation:
import re
url_pattern = re.compile(r'(https?://[^\s\))]+)')
# 1. Build URL→Title map (use MCP docs_search or fetch_webpage)
URL_TITLES = {
    'https://learn.microsoft.com/.../whats-new': 'Azure VPN Gateway の新機能',
    # ...
}
# 2. Iterate all runs and add hyperlinks
for slide in prs.slides:
    for shape in slide.shapes:
        if not shape.has_text_frame:
            continue
        for para in shape.text_frame.paragraphs:
            for run in para.runs:
                urls = url_pattern.findall(run.text)
                for url in urls:
                    if not (run.hyperlink and run.hyperlink.address):
                        run.hyperlink.address = url.rstrip('/')
                    # Prepend title if missing
                    title = URL_TITLES.get(url.rstrip('/'))
                    if title and title not in run.text:
                        run.text = f'{title}\n{url}'
# 3. Count the runs that now carry a hyperlink
hlink_count = sum(
    1 for slide in prs.slides
    for shape in slide.shapes if shape.has_text_frame
    for para in shape.text_frame.paragraphs
    for run in para.runs
    if run.hyperlink and run.hyperlink.address
)
print(f'Hyperlinks: {hlink_count}')
If run.hyperlink.address does not work (for example, after changing the layout of an existing PPTX), inserting the a:hlinkClick XML element directly is more reliable.
import re
from lxml import etree
from pptx.oxml.ns import qn
from pptx.dml.color import RGBColor
HLINK_RELTYPE = 'http://schemas.openxmlformats.org/officeDocument/2006/relationships/hyperlink'
url_pattern = re.compile(r'(https?://[^\s\))」、。]+)')
for slide in prs.slides:
    for shape in slide.shapes:
        if not shape.has_text_frame:
            continue
        for para in shape.text_frame.paragraphs:
            for run in para.runs:
                # hlinkClick lives under a:rPr, so search descendants of a:r
                if run._r.find('.//' + qn('a:hlinkClick')) is not None:
                    continue  # Already has link
                match = url_pattern.search(run.text)
                if match is None:
                    continue
                # A run carries a single hlinkClick, so link the first URL only
                url_clean = match.group(1).rstrip('.,;:')
                # Add external relationship (relate_to returns the rId)
                rId = slide.part.relate_to(url_clean, HLINK_RELTYPE, is_external=True)
                # Get or create rPr as the first child of a:r
                rPr = run._r.get_or_add_rPr()
                # Add hlinkClick
                hlinkClick = etree.SubElement(rPr, qn('a:hlinkClick'))
                hlinkClick.set(qn('r:id'), rId)
                # Visual styling
                run.font.underline = True
                run.font.color.rgb = RGBColor(0x00, 0x78, 0xD4)
python-pptx sometimes leaves theme tokens (+mn-ea, +mj-lt) unresolved, causing font fallback. Fix via ZIP-level string replacement:
import os, re, shutil, zipfile
FONT_JA = 'BIZ UDPゴシック'
FONT_LATIN = 'BIZ UDPGothic'
# out is the path of the PPTX already written by prs.save()
tmp = out + '.tmp'
shutil.copy2(out, tmp)
with zipfile.ZipFile(tmp, 'r') as zin:
    with zipfile.ZipFile(out, 'w', zipfile.ZIP_DEFLATED) as zout:
        for item in zin.infolist():
            data = zin.read(item.filename)
            if item.filename.endswith('.xml'):
                content = data.decode('utf-8')
                content = content.replace('+mn-ea', FONT_JA)
                content = content.replace('+mj-ea', FONT_JA)
                content = content.replace('+mn-lt', FONT_LATIN)
                content = content.replace('+mj-lt', FONT_LATIN)
                # Force every East Asian typeface to the Japanese font
                content = re.sub(
                    r'(<a:ea typeface=")[^"]*(")',
                    f'\\g<1>{FONT_JA}\\2', content
                )
                data = content.encode('utf-8')
            zout.writestr(item, data)
os.remove(tmp)
⚠️ Always do this after prs.save(), not before.
PowerPoint sections are stored as an extension in ppt/presentation.xml. python-pptx has no native section API.
import re, uuid, zipfile
SECTION_URI = '{521415D9-36F7-43E2-AB2F-B90AF26B5E84}'
P14_NS = 'http://schemas.microsoft.com/office/powerpoint/2010/main'
# Read presentation.xml from ZIP
with zipfile.ZipFile(pptx_path) as z:
    pres_xml = z.read('ppt/presentation.xml').decode('utf-8')
# Ensure p14 namespace is declared
if f'xmlns:p14="{P14_NS}"' not in pres_xml:
    pres_xml = pres_xml.replace('<p:presentation',
                                f'<p:presentation xmlns:p14="{P14_NS}"', 1)
# Extract slide IDs
slide_ids = re.findall(r'<p:sldId id="(\d+)"', pres_xml)
# Define sections: (name, start_slide_0based)
sections = [("Cover", 0), ("Main", 2), ("Appendix", 15)]
# Build section XML
section_parts = []
for idx, (name, start) in enumerate(sections):
    end = sections[idx+1][1] if idx+1 < len(sections) else len(slide_ids)
    refs = ''.join(f'<p14:sldId id="{slide_ids[i]}"/>'
                   for i in range(start, min(end, len(slide_ids))))
    sec_id = '{' + str(uuid.uuid4()).upper() + '}'
    section_parts.append(
        f'<p14:section name="{name}" id="{sec_id}">'
        f'<p14:sldIdLst>{refs}</p14:sldIdLst></p14:section>'
    )
# Insert into extLst
new_ext = (f'<p:ext uri="{SECTION_URI}">'
           f'<p14:sectionLst xmlns:p14="{P14_NS}">'
           + ''.join(section_parts)
           + '</p14:sectionLst></p:ext>')
# Write back to ZIP
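The snippet above stops at the write-back step. One possible way to finish it is sketched below: place the extension inside the presentation-level extLst (creating it when absent) and copy the archive with the patched entry. The helper names `insert_ext` and `replace_entry` are hypothetical, the demo XML is a namespace-free stand-in for a real presentation.xml, and the sketch does not remove a section list that already exists.

```python
import io
import re
import zipfile


def insert_ext(pres_xml, new_ext):
    """Return pres_xml with new_ext placed in the presentation-level extLst."""
    # Drop empty self-closing lists so at most one extLst ends up in the result.
    pres_xml = re.sub(r'<p:extLst\s*/>', '', pres_xml)
    # Case 1: an extLst already closes right before </p:presentation>.
    closing = re.search(r'</p:extLst>\s*</p:presentation>\s*$', pres_xml)
    if closing:
        pos = closing.start()
        return pres_xml[:pos] + new_ext + pres_xml[pos:]
    # Case 2: no extLst yet, so create one as the last child.
    end = pres_xml.rindex('</p:presentation>')
    return pres_xml[:end] + f'<p:extLst>{new_ext}</p:extLst>' + pres_xml[end:]


def replace_entry(src, dst, name, payload):
    """Copy archive src to dst, swapping the bytes of the entry called name."""
    with zipfile.ZipFile(src) as zin:
        with zipfile.ZipFile(dst, 'w', zipfile.ZIP_DEFLATED) as zout:
            for item in zin.infolist():
                data = payload if item.filename == name else zin.read(item)
                zout.writestr(item, data)


# Demo on an in-memory archive standing in for a real .pptx file.
src = io.BytesIO()
with zipfile.ZipFile(src, 'w') as z:
    z.writestr('ppt/presentation.xml',
               '<p:presentation><p:sldIdLst/></p:presentation>')
    z.writestr('ppt/slides/slide1.xml', '<p:sld/>')
with zipfile.ZipFile(src) as z:
    demo_xml = z.read('ppt/presentation.xml').decode('utf-8')
patched = insert_ext(demo_xml, '<p:ext uri="{EXAMPLE}"/>')
dst = io.BytesIO()
replace_entry(src, dst, 'ppt/presentation.xml', patched.encode('utf-8'))
```

With real files, `src` and `dst` would be two different paths, in line with the advice elsewhere in this document to save under a new name.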
Note: the URI {521415D9-36F7-43E2-AB2F-B90AF26B5E84} is specific to the presenter's PowerPoint version; some versions use different URIs.
python-pptx does NOT safely support direct layout swapping. Use the add-move-hide-cleanup pattern:
1. add_slide(target_layout): add the new slide at the end.
2. Copy the content into the new slide (the title is the placeholder with placeholder_format.idx == 0).
3. Move the new slide to the old slide's position via sldIdLst XML manipulation (reverse order).
4. Hide the old slide (show='0', remove shapes).
# Step 3: Move new slide (last) before old slide
from pptx.oxml.ns import qn
sldIdLst = prs.part._element.find(qn('p:sldIdLst'))
slides_list = list(sldIdLst)
new_el = slides_list[-1]
old_el = slides_list[target_idx]
sldIdLst.remove(new_el)
sldIdLst.insert(list(sldIdLst).index(old_el), new_el)
# Step 4: Hide old slide (now at target_idx + 1)
old_slide._element.set('show', '0')
for shape in list(old_slide.shapes):
    shape._element.getparent().remove(shape._element)
| Pattern | Problem | Result |
|---|---|---|
| rel._target = new_layout.part without ZIP dedup | Duplicate ZIP entries corrupt layout | PowerPoint repair dialog |
| prs.part.drop_rel(rId) for slide deletion | Orphan XML in ZIP | Duplicate name warning → corruption |
| Setting show='0' while indices shift | Wrong slides hidden | Content silently disappears |
| Changing layout but keeping empty placeholders | Ghost text ("テキストを入力") visible | Unprofessional appearance |
Layout change via rel._target (Safe Pattern with ZIP Dedup)
L12: The rel._target approach works safely when combined with ZIP dedup (LAST wins). python-pptx's save() produces duplicate entries, but post-processing resolves them.
from collections import Counter
import zipfile
# 1. Change layout relationship (layout_parts: layout name → layout part, built by the caller)
blank_part = layout_parts['Blank']
for rel in slide.part.rels.values():
    if 'slideLayout' in rel.reltype:
        rel._target = blank_part
        break
# 2. Save (will have duplicate ZIP entries)
prs.save(raw_path)
# 3. Dedup ZIP: keep LAST entry for duplicates (has updated rels)
with zipfile.ZipFile(raw_path, 'r') as zin:
    items = zin.infolist()
    counts = Counter(i.filename for i in items)
    dups = {n for n, c in counts.items() if c > 1}
    last_idx = {}
    for idx, item in enumerate(items):
        if item.filename in dups:
            last_idx[item.filename] = idx
    seen = set()
    with zipfile.ZipFile(final_path, 'w', zipfile.ZIP_DEFLATED) as zout:
        for idx, item in enumerate(items):
            if item.filename in dups:
                if idx == last_idx[item.filename]:
                    zout.writestr(item, zin.read(item))
            elif item.filename not in seen:
                seen.add(item.filename)
                zout.writestr(item, zin.read(item))
⚠️ With FIRST-wins, the pre-change rels XML remains and the layout change does not take effect. Always use LAST-wins.
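The LAST-wins rule can be tried without PowerPoint or python-pptx. The sketch below builds an in-memory archive with a duplicated entry and deduplicates it; the function name `dedup_zip_last_wins` and the entry contents are made up for the demonstration.

```python
import io
import warnings
import zipfile


def dedup_zip_last_wins(src, dst):
    """Copy src to dst, keeping only the LAST entry for each duplicated name."""
    with zipfile.ZipFile(src) as zin:
        infos = zin.infolist()
        # Later entries overwrite earlier ones, so each name maps to its last index.
        last_index = {info.filename: i for i, info in enumerate(infos)}
        with zipfile.ZipFile(dst, 'w', zipfile.ZIP_DEFLATED) as zout:
            for i, info in enumerate(infos):
                if last_index[info.filename] == i:
                    # Passing the ZipInfo (not the name) reads this exact entry.
                    zout.writestr(info, zin.read(info))


# Demo: the .rels entry is written twice, as after a rel._target swap.
raw = io.BytesIO()
with warnings.catch_warnings():
    warnings.simplefilter('ignore', UserWarning)  # zipfile warns on duplicate names
    with zipfile.ZipFile(raw, 'w') as z:
        z.writestr('ppt/slides/_rels/slide1.xml.rels', 'old layout')
        z.writestr('ppt/slides/slide1.xml', '<slide/>')
        z.writestr('ppt/slides/_rels/slide1.xml.rels', 'new layout')

clean = io.BytesIO()
dedup_zip_last_wins(raw, clean)
with zipfile.ZipFile(clean) as z:
    names = z.namelist()
    rels = z.read('ppt/slides/_rels/slide1.xml.rels')
print(names, rels)
```

The surviving .rels entry is the one written last, which is the copy that would hold the updated layout relationship.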
L13: When adding slides to an existing PPTX, using the Title and Content or Section Title layouts leaves empty placeholders ("テキストを入力", "タイトルを入力") showing as ghost text.
Solution: use the Blank layout for new slides, and either fill the existing title placeholder or place the title manually.
# Strategy: Fill placeholder with actual title, remove empty ones
from pptx.util import Pt
ns_p = '{http://schemas.openxmlformats.org/presentationml/2006/main}'
for shape in list(slide.shapes):  # iterate over a copy: shapes are removed below
    ph_elem = shape._element.find(f'.//{ns_p}ph')
    if ph_elem is None or not shape.has_text_frame:
        continue
    ph_type = ph_elem.get('type', 'body')
    if ph_type == 'title' and not shape.text_frame.text.strip():
        # Fill with actual title text
        shape.text_frame.text = slide_title
        for run in shape.text_frame.paragraphs[0].runs:
            run.font.size = Pt(28)
            run.font.bold = True
    elif not shape.text_frame.text.strip():
        # Remove empty placeholder
        shape._element.getparent().remove(shape._element)
L14: To align the title position of newly added slides with existing slides, measure the position on a reference slide and apply it to all new slides.
# 1. Measure reference slide (e.g., slide 4)
ref_slide = prs.slides[3]
for shape in ref_slide.shapes:
    ph = shape._element.find(f'.//{ns_p}ph')
    if ph is not None and ph.get('type') == 'title':
        REF_LEFT = shape.left      # e.g. 588263 (EMU)
        REF_TOP = shape.top        # e.g. 457200
        REF_WIDTH = shape.width    # e.g. 11018520
        REF_HEIGHT = shape.height  # e.g. 553998
        break
# 2. Apply to all new slides
for slide in new_slides:
    title_ph = slide.shapes.title  # None when the slide has no title placeholder
    if title_ph is None:
        continue
    title_ph.left = REF_LEFT
    title_ph.top = REF_TOP
    title_ph.width = REF_WIDTH
    title_ph.height = REF_HEIGHT
L15: When restructuring an existing PPTX, build a mapping keyed by title text to preserve the layouts of the original slides.
# Build original layout map
from pptx import Presentation
prs_orig = Presentation('original.pptx')
orig_layouts = {}
for slide in prs_orig.slides:
    for shape in slide.shapes:
        if shape.has_text_frame and shape.text_frame.text.strip():
            title = shape.text_frame.text.replace('\n', ' ')[:50]
            orig_layouts[title] = slide.slide_layout.name
            break
# Apply: ORIG slides keep original layout, NEW slides use Blank
# (get_slide_title, restore_layout and set_layout are project helpers not shown here)
for slide in prs_edit.slides:
    title = get_slide_title(slide)
    if title in orig_layouts:
        restore_layout(slide, orig_layouts[title])
    else:
        set_layout(slide, 'Blank')
Delete hidden slides in a separate script/pass after saving, in reverse index order:
# Cleanup pass (separate from insertion)
prs = Presentation(saved_file)
sldIdLst = prs.part._element.find(qn('p:sldIdLst'))
empty_hidden_indices = []
for i, slide in enumerate(prs.slides):
    if slide._element.get('show') == '0':
        # Verify truly empty before deleting
        has_content = any(
            para.text.strip()
            for shape in slide.shapes if shape.has_text_frame
            for para in shape.text_frame.paragraphs
        )
        if has_content:
            del slide._element.attrib['show']  # Restore, not delete
        else:
            empty_hidden_indices.append(i)
# Delete empty hidden slides (reverse order)
for idx in reversed(empty_hidden_indices):
    el = list(sldIdLst)[idx]
    rId = el.get(qn('r:id'))
    sldIdLst.remove(el)
    prs.part.drop_rel(rId)
prs.save(output_new_name)  # Always save to NEW filename
⚠️ create_from_template.py does not process footer_url. Post-processing required.
| Item | Processing |
|---|---|
| footer_url | Add linked textbox at slide bottom |
| URLs in bullets | Convert to hyperlinks |
| Reference URLs | Linkify URLs in Appendix |
PowerPoint locks open files. Saving under the same name raises PermissionError, so always save under a different name:
prs.save('file_withURL.pptx')
| Processing | Suffix |
|---|---|
| URL added | _withURL |
| Final version | _final |
| Fixed version | _fixed |
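The suffix convention in the table can be applied with a small path helper. This is only a sketch; the function name `with_stage_suffix` is hypothetical and not part of the skill's scripts.

```python
from pathlib import Path


def with_stage_suffix(path, suffix):
    """Insert a stage suffix before the extension: file.pptx → file_withURL.pptx."""
    p = Path(path)
    return p.with_name(p.stem + suffix + p.suffix)


out_path = with_stage_suffix('deck/file.pptx', '_withURL')
print(out_path.name)
```

Passing the result to prs.save() keeps the source file untouched, so a copy that is still open in PowerPoint is never overwritten.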
L9: The default placeholders of Presentation() are based on 4:3 (25.4 cm wide). Changing to 16:9 with slide_width = Cm(33.867) leaves the placeholder positions at 4:3, so every slide appears shifted to the left.
from pptx import Presentation
from pptx.enum.text import PP_ALIGN
from pptx.util import Cm
prs = Presentation()
prs.slide_width = Cm(33.867)  # 16:9
prs.slide_height = Cm(19.05)
SW = prs.slide_width
# Use the Blank layout (no placeholders)
slide = prs.slides.add_slide(prs.slide_layouts[6])
# Center relative to SW
margin = Cm(3)
tb = slide.shapes.add_textbox(margin, Cm(5), SW - margin * 2, Cm(3))
p = tb.text_frame.paragraphs[0]
p.text = "Centered Title"
p.alignment = PP_ALIGN.CENTER
❌ Using Layout 0-5 on 16:9 slides (placeholders sit to the left, based on 25.4 cm)
❌ Using placeholders without adjusting their positions after changing slide_width
✅ Blank layout + add_textbox() with symmetric margins based on SW
✅ Layout 0-5 are fine if the template PPTX itself was created in 16:9
L10: If *.pptx binary is added to .gitattributes after git add, CRLF/encoding conversion corrupts the binary (UTF-8 replacement characters EF BF BD get mixed in).
with open('template.pptx', 'rb') as f:
    data = f.read()
count = data.count(b'\xef\xbf\xbd')
print(f'UTF-8 replacement chars: {count}')  # non-zero means corrupted
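Counting replacement bytes can be complemented by ZIP-level checks, since a PPTX is a ZIP archive and compressed data could in principle contain the byte sequence EF BF BD by chance. The sketch below is an assumption-laden illustration: `inspect_pptx_bytes` is a hypothetical helper, and the "damage" is simulated by a UTF-8 decode/encode round trip rather than by git itself.

```python
import io
import zipfile

REPLACEMENT = b'\xef\xbf\xbd'  # UTF-8 encoding of U+FFFD


def inspect_pptx_bytes(data):
    """Return corruption indicators for the raw bytes of a PPTX file."""
    report = {
        'zip_signature': data[:4] == b'PK\x03\x04',
        'replacement_chars': data.count(REPLACEMENT),
        'readable': False,
    }
    try:
        with zipfile.ZipFile(io.BytesIO(data)) as z:
            # testzip() returns the first entry whose CRC check fails, else None.
            report['readable'] = z.testzip() is None
    except Exception:  # any parse failure counts as unreadable
        pass
    return report


# Demo: a tiny archive with binary content, then a simulated text-mode round trip.
buf = io.BytesIO()
with zipfile.ZipFile(buf, 'w', zipfile.ZIP_STORED) as z:
    z.writestr('ppt/media/image1.bin', b'\xff\xfe\x80 raw bytes')
good = buf.getvalue()
damaged = good.decode('utf-8', errors='replace').encode('utf-8')

good_report = inspect_pptx_bytes(good)
damaged_report = inspect_pptx_bytes(damaged)
print(good_report['readable'], damaged_report['readable'])
```

Reading the file from disk with open(path, 'rb') and passing the bytes to the helper gives the same report for a real template.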
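# Regenerate an empty template with python-pptx
from pptx import Presentation
from pptx.util import Cm
prs = Presentation()
prs.slide_width = Cm(33.867)  # 16:9
prs.slide_height = Cm(19.05)
prs.save('template_new.pptx')
# → 11 layouts are generated automatically (beware the 4:3 placeholders)
Set .gitattributes before the first commit.
Check that .gitignore exclusions are consistent with binary file handling.
L11: python-pptx does not officially support MP4 embedding. However, a PPTX is a ZIP, so embedding is possible by manipulating it directly with lxml + zipfile.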
1. Inject a:videoFile + p14:media into p:pic.
2. Add <Default Extension="mp4" ContentType="video/mp4"/> to [Content_Types].xml.
3. Store the MP4 file and the poster image in ppt/media/.
<p:pic>
  <p:nvPicPr>
    <p:cNvPr id="100" name="Video 1">
      <a:hlinkClick r:id="" action="ppaction://media"/>
    </p:cNvPr>
    <p:cNvPicPr><a:picLocks noChangeAspect="1"/></p:cNvPicPr>
    <p:nvPr>
      <a:videoFile r:link="rId10"/>
      <p:extLst>
        <p:ext uri="{DAA4B4D4-6D71-4841-9C94-3DE7FCFB9230}">
          <p14:media r:embed="rId11"/>
        </p:ext>
      </p:extLst>
    </p:nvPr>
  </p:nvPicPr>
  <p:blipFill>
    <a:blip r:embed="rId12"/> <!-- poster image -->
    <a:stretch><a:fillRect/></a:stretch>
  </p:blipFill>
  <p:spPr>
    <a:xfrm>
      <a:off x="720000" y="1260000"/>
      <a:ext cx="10752120" cy="5058000"/>
    </a:xfrm>
    <a:prstGeom prst="rect"><a:avLst/></a:prstGeom>
  </p:spPr>
</p:pic>
<Relationship Id="rId10" Type=".../relationships/video"
              Target="../media/video.mp4" TargetMode="Internal"/>
<Relationship Id="rId11" Type=".../2007/relationships/media"
              Target="../media/video.mp4"/>
<Relationship Id="rId12" Type=".../relationships/image"
              Target="../media/poster.png"/>
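L17: python-pptx's save() re-serializes all XML. In the process, " → ' (attribute quotes) and \r\n → \n (line endings) are converted. This alone can break rendering of the layout background (image blipFill) and leave slides completely white.
Affected case: <p:bg> references the background image via <a:blipFill r:embed="rId2">.
After editing slide content with python-pptx, restore the layout/master/theme/media files byte-for-byte from the original PPTX.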
import zipfile
# Collect the original layout/master/theme/media entries
orig_files = {}
with zipfile.ZipFile('original.pptx', 'r') as z:
    for item in z.infolist():
        fn = item.filename
        if any(p in fn for p in ['slideLayout', 'slideMaster', 'theme', '/media/']):
            orig_files[fn] = z.read(fn)
with zipfile.ZipFile('edited.pptx', 'r') as zin:
    with zipfile.ZipFile('output.pptx', 'w', zipfile.ZIP_DEFLATED) as zout:
        seen = set()
        for item in zin.infolist():
            fn = item.filename
            if fn in seen:
                continue
            seen.add(fn)
            if fn in orig_files:
                zout.writestr(item, orig_files[fn])  # byte-for-byte original
            else:
                zout.writestr(item, zin.read(fn))
        # Add original entries that are missing from the edited file
        for fn, data in orig_files.items():
            if fn not in seen:
                zout.writestr(fn, data)
                seen.add(fn)
python-pptx sometimes drops the SVG entry from [Content_Types].xml. If the theme/master references an SVG, the display breaks.
# z is the open zipfile.ZipFile of the saved PPTX
ct_data = z.read('[Content_Types].xml').decode('utf-8')
if 'image/svg+xml' not in ct_data:
    ct_data = ct_data.replace(
        '</Types>',
        '<Default Extension="svg" ContentType="image/svg+xml"/></Types>'
    )
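The same patch can be written as a reusable string function, which also covers the mp4 entry needed for video embedding. This is a sketch: `ensure_default` is a hypothetical name, and unlike the snippet above it tests for the extension rather than for the content type.

```python
import re


def ensure_default(ct_xml, extension, content_type):
    """Add a <Default> entry for extension unless one is already declared."""
    declared = re.search(rf'Extension="{re.escape(extension)}"', ct_xml, re.IGNORECASE)
    if declared:
        return ct_xml
    entry = f'<Default Extension="{extension}" ContentType="{content_type}"/>'
    return ct_xml.replace('</Types>', entry + '</Types>', 1)


ct = ('<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">'
      '<Default Extension="xml" ContentType="application/xml"/></Types>')
patched_ct = ensure_default(ct, 'svg', 'image/svg+xml')
patched_ct = ensure_default(patched_ct, 'mp4', 'video/mp4')
print(patched_ct)
```

Calling the function twice with the same extension leaves the XML unchanged, so it is safe to run on every rebuild.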
text_frame.clear() can leave python-pptx's internal paragraph list inconsistent. Manipulate the <a:p> elements of txBody directly instead.
from lxml import etree
from pptx.oxml.ns import qn
def set_textbox_content(shape, lines):
    """Safe textbox rewrite via XML manipulation.
    lines: list of (text, bold, size_pt) tuples.
    """
    txBody = shape._element.find(qn('p:txBody'))
    if txBody is None:
        txBody = shape._element.find(qn('a:txBody'))
    # Remove existing paragraphs
    for old_p in txBody.findall(qn('a:p')):
        txBody.remove(old_p)
    # Add new paragraphs
    for text, bold, size in lines:
        p = etree.SubElement(txBody, qn('a:p'))
        r = etree.SubElement(p, qn('a:r'))
        rPr = etree.SubElement(r, qn('a:rPr'))
        rPr.set('lang', 'ja-JP')
        rPr.set('sz', str(int(size * 100)))  # sz is in hundredths of a point
        if bold:
            rPr.set('b', '1')
        solidFill = etree.SubElement(rPr, qn('a:solidFill'))
        srgbClr = etree.SubElement(solidFill, qn('a:srgbClr'))
        srgbClr.set('val', '333333')
        t = etree.SubElement(r, qn('a:t'))
        t.text = text
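1. Edit slide content (sp, txBody) with python-pptx
2. prs.save('raw.pptx')
3. ZIP rebuild: restore layout/master/theme/media from the original
4. Fill in entries missing from [Content_Types].xml, such as SVG
5. ZIP dedup (LAST wins) to remove duplicate entries
6. Save output.pptx under a different name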
content.json generated and validated