Stable Diffusion in Practice: From Installation to Image Generation


张建站

2026/5/22 0:27:14

10 min read

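Preface

Stable Diffusion is one of the most popular open-source image generation models today. It generates high-quality images from text descriptions and is widely used in creative design, game development, and related fields.

I have used Stable Diffusion in several projects, from simple image generation to style transfer. This post is a complete hands-on guide.

Environment setup

```bash
# Create a virtual environment
conda create -n sd python=3.10
conda activate sd

# Install dependencies (PyTorch wheels built for CUDA 12.1)
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
pip install diffusers transformers accelerate safetensors
pip install gradio  # for the web UI
```

Basic usage

Text-to-image

```python
from diffusers import StableDiffusionPipeline
import torch

# Load the model in half precision and move it to the GPU.
# If this repository ID does not resolve on the Hugging Face Hub, the
# stable-diffusion-v1-5/stable-diffusion-v1-5 mirror hosts the same checkpoint.
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16
).to("cuda")

# Generate an image
prompt = "a beautiful sunset over the ocean, golden hour, photorealistic"
image = pipe(prompt).images[0]

# Save the image
image.save("sunset.png")
```

Controlling generation parameters

```python
from typing import Optional

from PIL import Image


def generate_image(
    prompt: str,
    negative_prompt: Optional[str] = None,
    num_inference_steps: int = 50,
    guidance_scale: float = 7.5,
    seed: Optional[int] = None
) -> Image.Image:
    """Generate an image with the pipeline loaded above."""
    # Compare against None so that seed=0 is also honored
    generator = torch.Generator("cuda").manual_seed(seed) if seed is not None else None

    image = pipe(
        prompt=prompt,
        negative_prompt=negative_prompt,
        num_inference_steps=num_inference_steps,
        guidance_scale=guidance_scale,
        generator=generator
    ).images[0]

    return image


# Usage example
image = generate_image(
    prompt="a cute cat playing with a ball",
    negative_prompt="ugly, blurry, low quality",
    num_inference_steps=30,
    guidance_scale=7.5,
    seed=42
)
```

Advanced techniques

Image-to-image

```python
from diffusers import StableDiffusionImg2ImgPipeline
from PIL import Image
import torch

# Load the image-to-image pipeline
img2img_pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16
).to("cuda")

# Load the input image
init_image = Image.open("input.jpg").convert("RGB")
init_image = init_image.resize((512, 512))

# Generate; a higher strength moves the result further from the input image
prompt = "turn this photo into a painting in the style of Van Gogh"
image = img2img_pipe(
    prompt=prompt,
    image=init_image,
    strength=0.75
).images[0]

image.save("output.png")
```

Depth guidance

```python
from diffusers import StableDiffusionDepth2ImgPipeline
import torch

# Load the depth-conditioned model
depth_pipe = StableDiffusionDepth2ImgPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-depth",
    torch_dtype=torch.float16
).to("cuda")

# Guide generation with a depth map (init_image comes from the previous example)
prompt = "a futuristic city skyline"
image = depth_pipe(
    prompt=prompt,
    image=init_image,
    depth_map=None  # let the pipeline estimate depth from the image
).images[0]
```

Fine-tuning

Preparing the dataset

```python
from datasets import load_dataset

# Load the dataset
dataset = load_dataset("lambdalabs/pokemon-blip-captions")


# Preprocess: convert to RGB and resize to the 512x512 training resolution
def preprocess(examples):
    images = [image.convert("RGB").resize((512, 512)) for image in examples["image"]]
    return {"images": images, "captions": examples["text"]}


dataset = dataset.map(preprocess, batched=True)
```

Training script

```python
from diffusers import StableDiffusionPipeline
from diffusers.training_utils import set_seed

# Set the seed for reproducibility
set_seed(42)

# Load the model
model_id = "runwayml/stable-diffusion-v1-5"
pipe = StableDiffusionPipeline.from_pretrained(model_id)

# Training configuration
training_args = {
    "output_dir": "./pokemon-model",
    "per_device_train_batch_size": 4,
    "gradient_accumulation_steps": 4,
    "learning_rate": 1e-5,
    "num_train_epochs": 10,
    "logging_steps": 10,
    "save_steps": 100
}

# Start training (simplified example: no trainer is defined here).
# diffusers does not ship a built-in Trainer class; the training loop is usually
# taken from its examples/text_to_image/train_text_to_image.py script.
# trainer.train()
```

Web UI deployment

```python
import gradio as gr


def generate(prompt, negative_prompt, steps, scale):
    """Generate an image with the pipeline loaded earlier."""
    image = pipe(
        prompt=prompt,
        negative_prompt=negative_prompt,
        num_inference_steps=int(steps),  # the slider may return a float
        guidance_scale=scale
    ).images[0]
    return image


# Build the interface
with gr.Blocks() as demo:
    gr.Markdown("# Stable Diffusion Demo")

    with gr.Row():
        with gr.Column():
            prompt = gr.Textbox(label="Prompt")
            negative_prompt = gr.Textbox(label="Negative Prompt")
            steps = gr.Slider(minimum=10, maximum=100, value=50, step=1, label="Steps")
            scale = gr.Slider(minimum=1, maximum=20, value=7.5, label="Guidance Scale")
            generate_btn = gr.Button("Generate")

        with gr.Column():
            output = gr.Image(label="Output")

    generate_btn.click(
        generate,
        inputs=[prompt, negative_prompt, steps, scale],
        outputs=output
    )

demo.launch()
```

FAQ

Out of GPU memory

```python
# Option 1: attention slicing, which computes attention in chunks to lower peak memory
pipe.enable_attention_slicing()

# Option 2: CPU offload (needs accelerate; use it on a pipeline that was not moved with .to("cuda"))
pipe.enable_model_cpu_offload()

# Option 3: keep the batch small by generating one image per call
image = pipe(prompt, num_images_per_prompt=1).images[0]
```

Poor generation quality

Tips for improving quality:

1. Use more inference steps.
2. Adjust guidance_scale.
3. Add a detailed negative prompt.
4. Use a stronger model such as SDXL.

Summary

Stable Diffusion is a powerful image generation tool:

- Basic usage: simple text-to-image generation
- Advanced techniques: image-to-image and depth guidance
- Fine-tuning: adapting the model to a specific style or subject
- Deployment: building a web app

Key takeaways:

- Prompt quality directly affects the result.
- Negative prompts matter.
- Tuning parameters takes experience.
- A GPU with more VRAM makes generation noticeably faster.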
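Two of the parameters used above reduce to simple arithmetic, which can take some of the guesswork out of tuning. The sketch below is illustrative only. The helper names `img2img_effective_steps` and `effective_batch_size` are hypothetical and not part of diffusers, and the step formula assumes the img2img pipeline skips the first `1 - strength` fraction of the schedule, which is how the diffusers implementation behaves as far as I can tell.

```python
def img2img_effective_steps(num_inference_steps: int, strength: float) -> int:
    """Approximate number of denoising steps an img2img call actually runs.

    Assumption: the pipeline only runs the last `strength` fraction of the
    schedule, truncated to an integer number of steps.
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be between 0 and 1")
    return min(int(num_inference_steps * strength), num_inference_steps)


def effective_batch_size(
    per_device_batch_size: int,
    gradient_accumulation_steps: int,
    num_devices: int = 1,
) -> int:
    """Number of samples that contribute to each optimizer update."""
    return per_device_batch_size * gradient_accumulation_steps * num_devices


# Values taken from the img2img example and the training configuration above
print(img2img_effective_steps(num_inference_steps=50, strength=0.75))  # 37
print(effective_batch_size(4, 4))  # 16
```

With the default 50 steps, `strength=0.75` runs about 37 denoising steps, so lowering `strength` both preserves more of the input image and shortens generation. The training configuration above (batch size 4, gradient accumulation 4) behaves like a batch of 16 on a single GPU, which is worth keeping in mind when choosing the learning rate.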
