stable-diffusion-image-generation by davila7/claude-code-templates
npx skills add https://github.com/davila7/claude-code-templates --skill stable-diffusion-image-generation
Comprehensive guide to generating images with Stable Diffusion using the HuggingFace Diffusers library.
Use Stable Diffusion when:
Key features:
Use alternatives instead:
pip install diffusers transformers accelerate torch
pip install xformers # Optional: memory-efficient attention
from diffusers import DiffusionPipeline
import torch
# Load pipeline (auto-detects model type)
pipe = DiffusionPipeline.from_pretrained(
"stable-diffusion-v1-5/stable-diffusion-v1-5",
torch_dtype=torch.float16
)
pipe.to("cuda")
# Generate image
image = pipe(
"A serene mountain landscape at sunset, highly detailed",
num_inference_steps=50,
guidance_scale=7.5
).images[0]
image.save("output.png")
from diffusers import AutoPipelineForText2Image
import torch
pipe = AutoPipelineForText2Image.from_pretrained(
"stabilityai/stable-diffusion-xl-base-1.0",
torch_dtype=torch.float16,
variant="fp16"
)
pipe.to("cuda")
# Enable memory optimization
pipe.enable_model_cpu_offload()
image = pipe(
prompt="A futuristic city with flying cars, cinematic lighting",
height=1024,
width=1024,
num_inference_steps=30
).images[0]
Diffusers is built around three core components:
Pipeline (orchestration)
├── Model (neural networks)
│   ├── UNet / Transformer (noise prediction)
│   ├── VAE (latent encoding/decoding)
│   └── Text Encoder (CLIP/T5)
└── Scheduler (denoising algorithm)
Text Prompt → Text Encoder → Text Embeddings
                                 ↓
Random Noise → [Denoising Loop] ← Scheduler
                     ↓
              Predicted Noise
                     ↓
            VAE Decoder → Final Image
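The same flow can be driven by hand, which makes the division of labor concrete. Below is a minimal sketch of the denoising loop, assuming an SD 1.5-style pipeline is already loaded as `pipe` on CUDA (as in the quickstart above). Classifier-free guidance is omitted for brevity, so calling `pipe(...)` will produce better images than this loop.
import torch
from PIL import Image

tokenizer, text_encoder = pipe.tokenizer, pipe.text_encoder
unet, vae, scheduler = pipe.unet, pipe.vae, pipe.scheduler

with torch.no_grad():
    # Text prompt -> text encoder -> text embeddings
    ids = tokenizer("A serene mountain landscape", padding="max_length",
                    max_length=tokenizer.model_max_length,
                    truncation=True, return_tensors="pt").input_ids.to("cuda")
    text_emb = text_encoder(ids)[0]

    # Random noise in latent space (64x64 latents -> 512x512 pixels)
    latents = torch.randn(1, unet.config.in_channels, 64, 64,
                          device="cuda", dtype=text_emb.dtype)
    latents = latents * scheduler.init_noise_sigma

    # Denoising loop: the UNet predicts noise, the scheduler steps backwards
    scheduler.set_timesteps(30)
    for t in scheduler.timesteps:
        latent_in = scheduler.scale_model_input(latents, t)
        noise_pred = unet(latent_in, t, encoder_hidden_states=text_emb).sample
        latents = scheduler.step(noise_pred, t, latents).prev_sample

    # VAE decoder -> final image
    decoded = vae.decode(latents / vae.config.scaling_factor).sample

img = (decoded[0] / 2 + 0.5).clamp(0, 1)  # [-1, 1] -> [0, 1]
arr = (img.permute(1, 2, 0).float().cpu().numpy() * 255).astype("uint8")
Image.fromarray(arr).save("manual.png")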
Pipelines orchestrate complete workflows:
| Pipeline | Purpose |
|---|---|
| StableDiffusionPipeline | Text-to-image (SD 1.x/2.x) |
| StableDiffusionXLPipeline | Text-to-image (SDXL) |
| StableDiffusion3Pipeline | Text-to-image (SD 3.0) |
| FluxPipeline | Text-to-image (Flux models) |
| StableDiffusionImg2ImgPipeline | Image-to-image |
| StableDiffusionInpaintPipeline | Inpainting |
Schedulers control the denoising process:
| Scheduler | Steps | Quality | Use Case |
|---|---|---|---|
| EulerDiscreteScheduler | 20-50 | Good | Default choice |
| EulerAncestralDiscreteScheduler | 20-50 | Good | More variation |
| DPMSolverMultistepScheduler | 15-25 | Excellent | Fast, high quality |
| DDIMScheduler | 50-100 | Good | Deterministic |
| LCMScheduler | 4-8 | Good | Very fast |
| UniPCMultistepScheduler | 15-25 | Excellent | Fast convergence |
from diffusers import DPMSolverMultistepScheduler
# Swap for faster generation
pipe.scheduler = DPMSolverMultistepScheduler.from_config(
pipe.scheduler.config
)
# Now generate with fewer steps
image = pipe(prompt, num_inference_steps=20).images[0]
| Parameter | Default | Description |
|---|---|---|
| prompt | Required | Text description of desired image |
| negative_prompt | None | What to avoid in the image |
| num_inference_steps | 50 | Denoising steps (more = better quality) |
| guidance_scale | 7.5 | Prompt adherence (7-12 typical) |
| height, width | 512/1024 | Output dimensions (multiples of 8) |
| generator | None | Torch generator for reproducibility |
| num_images_per_prompt | 1 | Batch size |
import torch
generator = torch.Generator(device="cuda").manual_seed(42)
image = pipe(
prompt="A cat wearing a top hat",
generator=generator,
num_inference_steps=50
).images[0]
image = pipe(
prompt="Professional photo of a dog in a garden",
negative_prompt="blurry, low quality, distorted, ugly, bad anatomy",
guidance_scale=7.5
).images[0]
Transform existing images with text guidance:
from diffusers import AutoPipelineForImage2Image
from PIL import Image
pipe = AutoPipelineForImage2Image.from_pretrained(
"stable-diffusion-v1-5/stable-diffusion-v1-5",
torch_dtype=torch.float16
).to("cuda")
init_image = Image.open("input.jpg").resize((512, 512))
image = pipe(
prompt="A watercolor painting of the scene",
image=init_image,
strength=0.75, # How much to transform (0-1)
num_inference_steps=50
).images[0]
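Because strength sets how far the output may drift from the source image, a quick sweep over a few values (a sketch reusing the img2img pipeline above) is the easiest way to pick one:
for s in (0.3, 0.5, 0.75, 0.9):
    # Higher strength = more transformation, less of the original preserved
    pipe(
        prompt="A watercolor painting of the scene",
        image=init_image,
        strength=s,
    ).images[0].save(f"strength_{s}.png")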
Fill masked regions:
from diffusers import AutoPipelineForInpainting
from PIL import Image
pipe = AutoPipelineForInpainting.from_pretrained(
"runwayml/stable-diffusion-inpainting",
torch_dtype=torch.float16
).to("cuda")
image = Image.open("photo.jpg")
mask = Image.open("mask.png") # White = inpaint region
result = pipe(
prompt="A red car parked on the street",
image=image,
mask_image=mask,
num_inference_steps=50
).images[0]
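Masks do not have to come from a file. A minimal sketch that builds a rectangular mask in code (the coordinates are placeholder assumptions):
from PIL import Image, ImageDraw

mask = Image.new("L", image.size, 0)  # black everywhere = keep as-is
# White rectangle = region to repaint (coordinates are illustrative)
ImageDraw.Draw(mask).rectangle((100, 200, 400, 450), fill=255)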
Add spatial conditioning for precise control:
from diffusers import StableDiffusionControlNetPipeline, ControlNetModel
import torch
# Load ControlNet for edge conditioning
controlnet = ControlNetModel.from_pretrained(
"lllyasviel/control_v11p_sd15_canny",
torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
"stable-diffusion-v1-5/stable-diffusion-v1-5",
controlnet=controlnet,
torch_dtype=torch.float16
).to("cuda")
# Use Canny edge image as control
control_image = get_canny_image(input_image)
image = pipe(
prompt="A beautiful house in the style of Van Gogh",
image=control_image,
num_inference_steps=30
).images[0]
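Note that get_canny_image is not defined above. Here is a sketch of such a helper using the standard OpenCV preprocessing from the ControlNet examples (assumes opencv-python and numpy are installed):
import cv2
import numpy as np
from PIL import Image

def get_canny_image(image: Image.Image, low: int = 100, high: int = 200) -> Image.Image:
    edges = cv2.Canny(np.array(image), low, high)  # single-channel edge map
    edges = np.stack([edges] * 3, axis=-1)         # replicate to 3 channels for ControlNet
    return Image.fromarray(edges)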
| ControlNet | Input Type | Use Case |
|---|---|---|
| canny | Edge maps | Preserve structure |
| openpose | Pose skeletons | Human poses |
| depth | Depth maps | 3D-aware generation |
| normal | Normal maps | Surface details |
| mlsd | Line segments | Architectural lines |
| scribble | Rough sketches | Sketch-to-image |
Load fine-tuned style adapters:
from diffusers import DiffusionPipeline
pipe = DiffusionPipeline.from_pretrained(
"stable-diffusion-v1-5/stable-diffusion-v1-5",
torch_dtype=torch.float16
).to("cuda")
# Load LoRA weights
pipe.load_lora_weights("path/to/lora", weight_name="style.safetensors")
# Generate with LoRA style
image = pipe("A portrait in the trained style").images[0]
# Adjust LoRA strength
pipe.fuse_lora(lora_scale=0.8)
# Unload LoRA
pipe.unload_lora_weights()
# Load multiple LoRAs
pipe.load_lora_weights("lora1", adapter_name="style")
pipe.load_lora_weights("lora2", adapter_name="character")
# Set weights for each
pipe.set_adapters(["style", "character"], adapter_weights=[0.7, 0.5])
image = pipe("A portrait").images[0]
# Model CPU offload - moves models to CPU when not in use
pipe.enable_model_cpu_offload()
# Sequential CPU offload - more aggressive, slower
pipe.enable_sequential_cpu_offload()
# Reduce memory by computing attention in chunks
pipe.enable_attention_slicing()
# Or specific chunk size
pipe.enable_attention_slicing("max")
# Requires xformers package
pipe.enable_xformers_memory_efficient_attention()
# Decode latents in tiles for large images
pipe.enable_vae_slicing()
pipe.enable_vae_tiling()
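To see what each switch actually buys you, measure peak VRAM around a short test run. A sketch using PyTorch's built-in CUDA memory statistics:
import torch

torch.cuda.reset_peak_memory_stats()
_ = pipe("test prompt", num_inference_steps=10)
print(f"Peak VRAM: {torch.cuda.max_memory_allocated() / 1e9:.2f} GB")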
# FP16 (recommended for GPU)
pipe = DiffusionPipeline.from_pretrained(
"model-id",
torch_dtype=torch.float16,
variant="fp16"
)
# BF16 (better precision, requires Ampere+ GPU)
pipe = DiffusionPipeline.from_pretrained(
"model-id",
torch_dtype=torch.bfloat16
)
from diffusers import UNet2DConditionModel, AutoencoderKL
# Load custom VAE
vae = AutoencoderKL.from_pretrained("stabilityai/sd-vae-ft-mse")
# Use with pipeline
pipe = DiffusionPipeline.from_pretrained(
"stable-diffusion-v1-5/stable-diffusion-v1-5",
vae=vae,
torch_dtype=torch.float16
)
Generate multiple images efficiently:
# Multiple prompts
prompts = [
"A cat playing piano",
"A dog reading a book",
"A bird painting a picture"
]
images = pipe(prompts, num_inference_steps=30).images
# Multiple images per prompt
images = pipe(
"A beautiful sunset",
num_images_per_prompt=4,
num_inference_steps=30
).images
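When reviewing a batch, a small contact-sheet helper is convenient. This make_grid function is a hypothetical utility, not part of Diffusers:
from PIL import Image

def make_grid(images, cols=2):
    # Tile a list of equally sized PIL images into one grid image
    w, h = images[0].size
    rows = (len(images) + cols - 1) // cols
    grid = Image.new("RGB", (cols * w, rows * h))
    for i, img in enumerate(images):
        grid.paste(img, ((i % cols) * w, (i // cols) * h))
    return grid

make_grid(images, cols=2).save("grid.png")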
from diffusers import StableDiffusionXLPipeline, DPMSolverMultistepScheduler
import torch
# 1. Load SDXL with optimizations
pipe = StableDiffusionXLPipeline.from_pretrained(
"stabilityai/stable-diffusion-xl-base-1.0",
torch_dtype=torch.float16,
variant="fp16"
)
pipe.to("cuda")
pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)
pipe.enable_model_cpu_offload()
# 2. Generate with quality settings
image = pipe(
prompt="A majestic lion in the savanna, golden hour lighting, 8k, detailed fur",
negative_prompt="blurry, low quality, cartoon, anime, sketch",
num_inference_steps=30,
guidance_scale=7.5,
height=1024,
width=1024
).images[0]
from diffusers import AutoPipelineForText2Image, LCMScheduler
import torch
# Use LCM for 4-8 step generation
pipe = AutoPipelineForText2Image.from_pretrained(
"stabilityai/stable-diffusion-xl-base-1.0",
torch_dtype=torch.float16
).to("cuda")
# Load LCM LoRA for fast generation
pipe.load_lora_weights("latent-consistency/lcm-lora-sdxl")
pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config)
pipe.fuse_lora()
# Generate in ~1 second
image = pipe(
"A beautiful landscape",
num_inference_steps=4,
guidance_scale=1.0
).images[0]
CUDA out of memory:
# Enable memory optimizations
pipe.enable_model_cpu_offload()
pipe.enable_attention_slicing()
pipe.enable_vae_slicing()
# Or use lower precision
pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
Black/noise images:
# Check VAE configuration
# Use safety checker bypass if needed
pipe.safety_checker = None
# Ensure proper dtype consistency
pipe = pipe.to(dtype=torch.float16)
Slow generation:
# Use faster scheduler
from diffusers import DPMSolverMultistepScheduler
pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)
# Reduce steps
image = pipe(prompt, num_inference_steps=20).images[0]
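On PyTorch 2.x, compiling the UNet is another common speedup from the Diffusers performance docs. The first call pays a one-time compilation cost; subsequent calls are faster:
import torch

pipe.unet = torch.compile(pipe.unet, mode="reduce-overhead", fullgraph=True)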