Using the Stable Diffusion Base Model Locally


2025. 5. 6. 19:46 · Research, Knowledge

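Motivation

There are many image generation AI services, but most are paid and give only a limited number of tokens.
I already have the hardware needed to run an image generation model.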

Introducing Stable Diffusion

Developer : Stability AI
License : CreativeML Open RAIL++-M License
Hugging Face link : https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0
Source code : https://github.com/Stability-AI/generative-models


License

Commercial use is permitted, but it comes with various obligations and restrictions, so please refer to the license link below.
License link : https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/blob/main/LICENSE.md


Usage

Environment : Colab T4 GPU & HuggingFace
Install the required packages

!pip install diffusers --upgrade
!pip install invisible_watermark transformers accelerate safetensors

Add a Hugging Face token : Access Tokens -> Create new token -> Token type = Read

# May not be required, but I added it anyway
from huggingface_hub import login

login("Type your Hugging Face token, e.g. hf_...")
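Pasting the token directly into a notebook cell makes it easy to leak when the notebook is shared. As a minimal sketch, assuming the token is stored in an environment variable named `HF_TOKEN`, the login step could read it from there instead. The helper names `read_hf_token` and `login_from_env` are hypothetical and only for illustration.

```python
import os


def read_hf_token(env=os.environ, var_name="HF_TOKEN"):
    """Return the token stored in `var_name`, or None if it is missing or empty."""
    token = env.get(var_name, "").strip()
    return token or None


def login_from_env():
    """Log in to Hugging Face only when a token is available."""
    token = read_hf_token()
    if token is None:
        print("HF_TOKEN is not set; skipping login.")
        return False
    # Imported lazily so read_hf_token works without huggingface_hub installed.
    from huggingface_hub import login

    login(token=token)
    return True
```

Calling `login_from_env()` in place of the hard-coded `login(...)` keeps the token out of the notebook itself.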

Load the model : float32 runs out of memory (OOM) on the T4 GPU

from diffusers import DiffusionPipeline
import torch

pipe = DiffusionPipeline.from_pretrained("stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16, use_safetensors=True, variant="fp16")
pipe.to("cuda")
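The float32 OOM noted above can be sanity-checked with rough arithmetic. The sketch below assumes about 3.5 billion parameters for the whole SDXL base pipeline, which is an approximate, commonly quoted figure and not something measured in this post. It counts only the weights; activations and the CUDA context need additional memory.

```python
BYTES_PER_PARAM = {"float32": 4, "float16": 2}

# Assumption: roughly 3.5 billion parameters for the whole base pipeline
# (UNet + text encoders + VAE). Treat this as an approximation.
APPROX_PARAMS = 3.5e9


def weight_memory_gib(num_params, dtype):
    """Memory needed just to hold the weights, in GiB."""
    return num_params * BYTES_PER_PARAM[dtype] / 1024**3


for dtype in ("float32", "float16"):
    print(f"{dtype}: {weight_memory_gib(APPROX_PARAMS, dtype):.1f} GiB")
```

Under that assumption the weights alone take about 13.0 GiB in float32 and about 6.5 GiB in float16. On a T4 with 16 GB of VRAM, float32 leaves very little room for activations, which is consistent with the OOM.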

Generate an image

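prompt = "Type your image prompt here"

# .images is a list of PIL images; take the first one
image = pipe(prompt=prompt).images[0]
# Displayed inline when this is the last line of a notebook cell
image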
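The call above uses the pipeline defaults. As a sketch of how common options might be set, the helper below collects a few arguments the SDXL pipeline call accepts (`num_inference_steps`, `guidance_scale`, `width`, `height`, `negative_prompt`) and fixes the seed through a `torch.Generator`. The function names and the values 30 and 7.0 are illustrative assumptions, not recommendations from the model card.

```python
def build_generation_kwargs(prompt, negative_prompt=None, steps=30,
                            guidance_scale=7.0, width=1024, height=1024):
    """Collect common pipeline arguments in one dict (hypothetical helper)."""
    kwargs = {
        "prompt": prompt,
        "num_inference_steps": steps,       # more steps: slower generation
        "guidance_scale": guidance_scale,   # higher: sticks closer to the prompt
        "width": width,
        "height": height,
    }
    if negative_prompt:
        kwargs["negative_prompt"] = negative_prompt
    return kwargs


def generate_and_save(pipe, path, seed=0, **kwargs):
    """Run the pipeline with a fixed seed and save the first image to `path`."""
    import torch  # imported lazily; only needed when actually generating

    generator = torch.Generator(device="cuda").manual_seed(seed)
    image = pipe(generator=generator, **kwargs).images[0]
    image.save(path)  # the result is a PIL image
    return image
```

For example, `generate_and_save(pipe, "result.png", seed=42, **build_generation_kwargs(prompt))` should give the same image on repeated runs with the same seed and settings on the same machine.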

Results

Prompt

prompt = "An intensely colored image of a man clipping a math graph with scissors"

Result

Image generation time : 45 seconds
Resource usage
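The generation time and resource usage above can be measured in several ways. One possible approach, sketched here with hypothetical helper names, is to time the call with `time.perf_counter` and read the peak memory PyTorch allocated on the GPU. Note that `torch.cuda.max_memory_allocated` only counts PyTorch tensors, so it will usually be lower than what the Colab resource panel shows.

```python
import time


def timed(fn, *args, **kwargs):
    """Call fn and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


def timed_generation(pipe, prompt):
    """Time one generation and report peak GPU memory allocated by PyTorch."""
    import torch  # imported lazily; only needed on a CUDA machine

    torch.cuda.reset_peak_memory_stats()
    # The pipeline returns PIL images, so the GPU work has finished
    # by the time this call returns.
    image, seconds = timed(lambda: pipe(prompt=prompt).images[0])
    peak_gib = torch.cuda.max_memory_allocated() / 1024**3
    return image, seconds, peak_gib
```

A call such as `image, seconds, peak_gib = timed_generation(pipe, prompt)` then gives numbers that can be compared across prompts or settings.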

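Conclusion

I tested several prompts, but hands are still rendered poorly, a chronic problem from the early days of image generation models.
Once a prompt gets even slightly complex, the model fails to render it properly, so keep prompts simple.
Resource usage is relatively low, so on a GPU with 10 GB or more of VRAM the model can run locally instead of on Colab.
Many parameters can be tuned; see the Hugging Face link for details.
It also runs fast (about 45 seconds per image on the Colab T4).
From now on, I choose you.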
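Before moving from Colab to a local machine, it may help to check whether the local GPU meets the 10 GB figure mentioned above. This is a minimal sketch with hypothetical helper names; the 10 GiB default simply mirrors this post's estimate and is not an official requirement.

```python
def meets_vram_requirement(total_bytes, required_gib=10):
    """True if the given amount of VRAM (in bytes) reaches the threshold."""
    return total_bytes / 1024**3 >= required_gib


def local_gpu_is_sufficient(required_gib=10):
    """Check the first CUDA device against the threshold."""
    import torch  # imported lazily; only needed for the actual device query

    if not torch.cuda.is_available():
        return False
    total = torch.cuda.get_device_properties(0).total_memory
    return meets_vram_requirement(total, required_gib)
```

A card sold as "10 GB" can report slightly less than 10 GiB, so a somewhat lower `required_gib` may be more practical for borderline hardware.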

Additional prompt tests

Introduce yourself with vivid pop art style

Pig armys fight with elephant warriors, Realistic photo

Stable diffusion expressed in abstract and intense colors


Attribution-NonCommercial-ShareAlike
