comfyui-video-pipeline by mckruz/comfyui-expert
npx skills add https://github.com/mckruz/comfyui-expert --skill comfyui-video-pipeline
Orchestrates video generation across three engines, selecting the best one based on requirements and available resources.
VIDEO REQUEST
|
|-- Need film-level quality?
| |-- Yes + 24GB+ VRAM → Wan 2.2 MoE 14B
| |-- Yes + 8GB VRAM → Wan 2.2 1.3B
|
|-- Need long video (>10 seconds)?
| |-- Yes → FramePack (60 seconds on 6GB)
|
|-- Need fast iteration?
| |-- Yes → AnimateDiff Lightning (4-8 steps)
|
|-- Need camera/motion control?
| |-- Yes → AnimateDiff V3 + Motion LoRAs
|
|-- Need first+last frame control?
| |-- Yes → Wan 2.2 MoE (exclusive feature)
|
|-- Default → Wan 2.2 (best general quality)
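The decision tree above can be expressed as a small helper. This is a hypothetical sketch (the function name and flags are ours, not part of the skill), following the branches in the order they appear:

```python
def pick_engine(vram_gb, duration_s=5.0, need_quality=False, fast_iter=False,
                camera_control=False, first_last_frame=False):
    # Mirrors the VIDEO REQUEST decision tree, checked top to bottom.
    if need_quality:
        return "Wan 2.2 MoE 14B" if vram_gb >= 24 else "Wan 2.2 1.3B"
    if duration_s > 10:
        return "FramePack"
    if fast_iter:
        return "AnimateDiff Lightning"
    if camera_control:
        return "AnimateDiff V3 + Motion LoRAs"
    if first_last_frame:
        return "Wan 2.2 MoE"
    return "Wan 2.2"
```

For example, `pick_engine(6, duration_s=60)` selects FramePack, while `pick_engine(12)` falls through to the Wan 2.2 default.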
Prerequisites:
- wan2.1_i2v_720p_14b_bf16.safetensors in models/diffusion_models/
- umt5_xxl_fp8_e4m3fn_scaled.safetensors in models/clip/
- open_clip_vit_h_14.safetensors in models/clip_vision/
- wan_2.1_vae.safetensors in models/vae/

Settings:
| Parameter | Value | Notes |
|---|---|---|
| Resolution | 1280x720 (landscape) or 720x1280 (portrait) | Native training resolution |
| Frames | 81 (~5 seconds at 16fps) | Must be 4n + 1 |
| Steps | 30-50 | Higher = better quality |
| CFG | 5-7 | |
| Sampler | uni_pc | Recommended for Wan |
| Scheduler | normal | |
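The prerequisite files listed earlier can be verified with a short check before running. This is an illustrative sketch assuming a standard ComfyUI directory layout; adjust the root path to your install:

```python
from pathlib import Path

# Required Wan model files and their expected directories (from the
# prerequisites list above).
REQUIRED = {
    "models/diffusion_models": "wan2.1_i2v_720p_14b_bf16.safetensors",
    "models/clip": "umt5_xxl_fp8_e4m3fn_scaled.safetensors",
    "models/clip_vision": "open_clip_vit_h_14.safetensors",
    "models/vae": "wan_2.1_vae.safetensors",
}

def missing_models(comfy_root="."):
    # Return the prerequisite files that are not yet in place.
    return [f"{d}/{name}" for d, name in REQUIRED.items()
            if not (Path(comfy_root) / d / name).is_file()]
```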
Frame count guide:
| Duration | Frames (16fps) |
|---|---|
| 1 second | 17 |
| 3 seconds | 49 |
| 5 seconds | 81 |
| 10 seconds | 161 |
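The frame counts in this guide all follow the 4n + 1 rule: multiply the duration by the frame rate, round down to a multiple of 4, and add 1. A minimal sketch (helper name is ours):

```python
def wan_frame_count(seconds, fps=16):
    # Wan expects frame counts of the form 4n + 1.
    raw = int(round(seconds * fps))
    return (raw // 4) * 4 + 1
```

For example, 5 seconds at 16fps gives 80 raw frames, which becomes 81.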
VRAM optimization:
Text-to-Video (T2V): Same as I2V, but uses wan2.1_t2v_14b_bf16.safetensors and EmptySD3LatentImage instead of image conditioning.
First/last frame control: Wan 2.2 MoE allows specifying both the first and last frame, enabling precise video planning.
FramePack: VRAM usage is invariant to video length; it generates 60-second videos at 30fps on just 6GB of VRAM.
How it works:
| Parameter | Value | Notes |
|---|---|---|
| Resolution | 640x384 to 1280x720 | Depends on VRAM |
| Duration | Up to 60 seconds | VRAM-invariant |
| Quality | High (comparable to Wan) | Uses same base models |
AnimateDiff settings:
| Parameter | Value (Standard) | Value (Lightning) |
|---|---|---|
| Motion Module | v3_sd15_mm.ckpt | animatediff_lightning_4step.safetensors |
| Steps | 20-25 | 4-8 |
| CFG | 7-8 | 1.5-2.0 |
| Sampler | euler_ancestral | lcm |
| Resolution | 512x512 | 512x512 |
| Context Length | 16 | 16 |
| Context Overlap | 4 | 4 |
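The context length and overlap settings above imply a sliding-window schedule over the frame sequence: each window is 16 frames, and consecutive windows share 4 frames. A simplified sketch of this scheduling (the actual AnimateDiff scheduler may differ in details):

```python
def context_windows(total_frames, length=16, overlap=4):
    # Sliding windows over the frame sequence; stride = length - overlap.
    stride = length - overlap
    starts = list(range(0, max(total_frames - length, 0) + 1, stride))
    if starts and starts[-1] + length < total_frames:
        # Add a final window flush with the end so no frames are dropped.
        starts.append(total_frames - length)
    return [(s, s + length) for s in starts]
```

With the defaults, a 32-frame clip is processed in three overlapping 16-frame windows.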
Motion LoRAs:
| LoRA | Effect |
|---|---|
| v2_lora_ZoomIn | Camera zooms in |
| v2_lora_ZoomOut | Camera zooms out |
| v2_lora_PanLeft | Camera pans left |
| v2_lora_PanRight | Camera pans right |
| v2_lora_TiltUp | Camera tilts up |
| v2_lora_TiltDown | Camera tilts down |
| v2_lora_RollingClockwise | Camera rolls clockwise |
After any video generation:
Frame interpolation (RIFE) doubles or quadruples the frame count for smoother motion:
Input (16fps) → RIFE 2x → Output (32fps)
Input (16fps) → RIFE 4x → Output (64fps)
Use the rife47 or rife49 model.
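Interpolation at factor k inserts k − 1 synthetic frames between each adjacent pair, so N input frames become (N − 1) × k + 1 output frames at k times the frame rate. A small sketch of the arithmetic (function name is ours):

```python
def rife_output(frames, fps, factor=2):
    # RIFE inserts (factor - 1) frames between each adjacent pair,
    # so N input frames become (N - 1) * factor + 1 output frames.
    return (frames - 1) * factor + 1, fps * factor
```

For example, an 81-frame, 16fps Wan clip becomes 161 frames at 32fps after RIFE 2x.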
Apply FaceDetailer to each frame:
Reduces temporal inconsistencies between frames.
Maintain consistent color grading across frames.
Final output via VHS Video Combine:
frame_rate: 16 (native) or 24/30 (after interpolation)
format: "video/h264-mp4"
crf: 19 (high quality) to 23 (smaller file)
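The same encode settings can be reproduced outside ComfyUI with ffmpeg. This sketch builds the equivalent command line, assuming frames were exported as numbered PNGs (the pattern and output name are placeholders):

```python
def ffmpeg_encode_args(fps=16, crf=19, pattern="frame_%05d.png", out="output.mp4"):
    # Mirrors the VHS Video Combine settings above: H.264 MP4 at the
    # given frame rate and CRF, with a widely compatible pixel format.
    return ["ffmpeg", "-framerate", str(fps), "-i", pattern,
            "-c:v", "libx264", "-crf", str(crf),
            "-pix_fmt", "yuv420p", out]
```

Pass the list to `subprocess.run()` to perform the encode.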
Complete pipeline for character dialogue:
1. Generate audio → comfyui-voice-pipeline
2. Generate base video → This skill (Wan I2V or AnimateDiff)
- Prompt: "{character}, talking naturally, slight head movement"
- Duration: match audio length
3. Apply lip-sync → Wav2Lip or LatentSync
4. Enhance faces → FaceDetailer + CodeFormer
5. Final output → video-assembly
Before marking video as complete:
- references/workflows.md - Workflow templates for Wan and AnimateDiff
- references/models.md - Video model download links
- references/research-log.md - Latest video generation advances
- state/inventory.json - Available video models

Weekly Installs: 107
GitHub Stars: 25
First Seen: Feb 24, 2026
Security Audits: Gen Agent Trust Hub (Pass), Socket (Pass), Snyk (Pass)
Installed on: gemini-cli (104), codex (104), kimi-cli (104), cursor (104), opencode (104), github-copilot (104)