Stable Diffusion CUDA Out of Memory — How to Fix It

The 'CUDA out of memory' error in Stable Diffusion WebUI occurs when your GPU does not have enough VRAM to process the image generation request. It is most commonly triggered when running high-resolution outputs, large batch sizes, or memory-heavy models on consumer-grade GPUs. Users with 4GB to 8GB VRAM cards encounter this error most frequently.

Why does this error happen?

Stable Diffusion loads the full model weights, attention maps, and intermediate latent tensors directly into GPU VRAM during inference. At higher resolutions, the attention mechanism in the U-Net scales quadratically, meaning a 768x768 image requires significantly more memory than a 512x512 image. When the cumulative memory demand of the model, VAE, and active tensors exceeds the physical VRAM capacity of your GPU, PyTorch throws a CUDA OutOfMemoryError and halts the process. This is compounded when running multiple images in a batch or using full-precision (float32) weights instead of half-precision (float16).
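To make the quadratic scaling concrete, here is a small sketch (pure Python; it assumes the standard SD 1.5 setup of an 8x VAE downsampling factor and a tokens-by-tokens self-attention map, and ignores constant factors, channel dimensions, and multiple heads):

```python
# Rough illustration of why attention memory grows quadratically with
# resolution. Assumption: latent size is image size / 8 (SD 1.5 VAE),
# and one attention map is tokens x tokens.

def attention_tokens(width: int, height: int, vae_factor: int = 8) -> int:
    """Number of latent tokens the first U-Net attention block sees."""
    return (width // vae_factor) * (height // vae_factor)

def attention_map_ratio(res_a: int, res_b: int) -> float:
    """How much larger the tokens x tokens attention map is at res_b vs res_a."""
    a = attention_tokens(res_a, res_a)
    b = attention_tokens(res_b, res_b)
    return (b * b) / (a * a)

print(attention_tokens(512, 512))     # 4096 tokens
print(attention_tokens(768, 768))     # 9216 tokens
print(attention_map_ratio(512, 768))  # ~5.06x larger attention map
```

So a 768x768 generation needs roughly five times the attention-map memory of a 512x512 one, even though the pixel count only grows by 2.25x.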

How to fix it

1. Reduce Image Resolution to 512x512

Start by setting your output resolution to 512x512 pixels, the native training resolution for most SD 1.5 models. This dramatically reduces the memory required for attention computations in the U-Net. Once generation is stable, you can use the hi-res fix option to reach higher final resolutions: it generates at the low resolution first and then upscales with a second, shorter denoising pass.

2. Enable xformers in Launch Settings

xformers is a memory-efficient attention library that replaces the default PyTorch attention mechanism with a highly optimized version. Enable it by adding the --xformers flag to your launch command or toggling it in the WebUI settings under 'Optimizations'. This alone can reduce VRAM usage by 30-50% and also speeds up generation on most NVIDIA GPUs.

3. Add --medvram or --lowvram Launch Flag

The --medvram flag instructs Stable Diffusion to keep only the active model component in VRAM at a time, offloading others to system RAM. If you have 4GB or less VRAM, use --lowvram instead, which applies even more aggressive memory splitting at the cost of slower generation speed. Add the appropriate flag to your webui-user.bat or webui-user.sh file in the COMMANDLINE_ARGS variable.
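As a sketch of what the edit looks like (flags per the standard AUTOMATIC1111 WebUI install; combine the memory flag with xformers from step 2):

```shell
# webui-user.sh (Linux/macOS): add the memory flag to COMMANDLINE_ARGS
export COMMANDLINE_ARGS="--medvram --xformers"
```

On Windows, the equivalent line in webui-user.bat is `set COMMANDLINE_ARGS=--medvram --xformers`. Swap in `--lowvram` for `--medvram` on cards with 4GB or less.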

4. Reduce Batch Size to 1

Generating multiple images simultaneously multiplies VRAM consumption almost linearly per image in the batch. Set your batch size to 1 in the WebUI to ensure only a single image is processed at a time. If you need multiple outputs, use the batch count setting instead, which generates images sequentially and reuses the same VRAM allocation.
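A toy model of the difference (the per-image and base costs below are made-up illustrative numbers, not measurements from any specific GPU or model):

```python
# Illustrative only: peak VRAM grows with batch *size*, while batch *count*
# reuses the same allocation by generating images one after another.
# per_image_gb and base_gb are invented numbers for the sketch.

def peak_vram_gb(batch_size: int, per_image_gb: float = 2.0, base_gb: float = 2.5) -> float:
    """Rough peak VRAM: fixed model cost plus a per-image cost per batch."""
    return base_gb + batch_size * per_image_gb

# 4 images at once vs 4 images sequentially (batch size 1, batch count 4):
print(peak_vram_gb(4))  # 10.5 GB peak - likely OOM on an 8GB card
print(peak_vram_gb(1))  # 4.5 GB peak - same 4 images, generated one at a time
```

The total work is identical either way; only the peak allocation changes, which is exactly what the OOM error cares about.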

Code example

# Launch with low VRAM mode
python launch.py --lowvram --xformers --no-half-vae

Pro tip

Add --no-half-vae to your launch flags alongside --lowvram to prevent the VAE decoder from producing black or corrupted images, which is a common secondary issue when running in low VRAM mode.

Frequently asked questions

Does --lowvram significantly slow down image generation?
Yes, --lowvram increases generation time because model components are constantly swapped between VRAM and system RAM. Using --medvram is a better balance of speed and memory savings if your GPU has at least 5-6GB VRAM.
Can I run Stable Diffusion SDXL on a 6GB VRAM GPU without this error?
SDXL requires significantly more VRAM than SD 1.5, typically 8GB minimum for standard use. On a 6GB card, you will need --medvram, xformers, and should avoid resolutions above 1024x1024 to prevent out of memory crashes.
Why does the error only happen sometimes and not every generation?
VRAM fragmentation and other GPU processes running in the background can cause inconsistent available memory between runs. Restarting the WebUI clears the VRAM cache, and closing other GPU-accelerated applications like browsers or games before generating can help stabilize memory availability.
Will upgrading to more system RAM fix the CUDA out of memory error?
Adding system RAM does not directly fix CUDA OOM errors because GPU VRAM is a separate memory pool. However, more system RAM helps when using --lowvram or --medvram flags, as those modes offload model parts to system RAM during generation.
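Relatedly, for the fragmentation issue described above, PyTorch's caching allocator can be tuned via an environment variable set before launching the WebUI (a sketch; the 512 value is a commonly suggested starting point, not a universal setting):

```shell
# Limit how large allocator blocks can be before splitting, which can
# reduce VRAM fragmentation between generations
export PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:512
```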
