Stable Diffusion CUDA Out of Memory — How to Fix It
The 'CUDA out of memory' error in Stable Diffusion WebUI occurs when your GPU does not have enough VRAM to process the image generation request. It is most commonly triggered when running high-resolution outputs, large batch sizes, or memory-heavy models on consumer-grade GPUs. Users with 4GB to 8GB VRAM cards encounter this error most frequently.
Why does this error happen?
How to fix it
Reduce Image Resolution to 512x512
Start by setting your output resolution to 512x512 pixels, which is the native training resolution for most SD 1.5 models. This dramatically reduces the memory required for attention computations in the U-Net, because the number of attention tokens grows with the pixel count. Once generation is stable, you can use a hi-res fix pass to upscale the image. Keep in mind that the second pass runs at the target resolution, so use a modest upscale factor on low-VRAM cards.
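To see why resolution matters so much, here is a rough back-of-the-envelope sketch of the largest self-attention score matrix in an SD 1.5 U-Net. The figures rest on assumptions that are not from this guide: 8x downscaling into latent space, 8 attention heads, fp16 values, and a naive attention implementation that stores the full matrix. Real usage will differ, especially with optimized attention enabled.

```python
# Rough estimate only. Assumptions: 8x VAE downscaling, 8 attention heads,
# fp16 scores (2 bytes each), and a naive implementation that materializes
# the full tokens-by-tokens score matrix for one image.

def attention_matrix_bytes(width, height, heads=8, bytes_per_value=2):
    """Bytes needed for one full attention score matrix, per image."""
    tokens = (width // 8) * (height // 8)  # one token per latent pixel
    return tokens * tokens * heads * bytes_per_value


def to_mib(n_bytes):
    """Convert bytes to mebibytes."""
    return n_bytes / 2**20


if __name__ == "__main__":
    for side in (512, 768, 1024):
        mib = to_mib(attention_matrix_bytes(side, side))
        print(f"{side}x{side}: about {mib:.0f} MiB")
```

Under these assumptions, doubling each side from 512 to 1024 multiplies this one buffer by 16, because the matrix grows with the square of the token count.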
Enable xformers in Launch Settings
xformers is a memory-efficient attention library that replaces the default PyTorch attention mechanism with a highly optimized version. Enable it by adding the --xformers flag to your launch command or toggling it in the WebUI settings under 'Optimizations'. This alone can reduce VRAM usage by roughly 30-50%, depending on resolution and GPU, and it usually speeds up generation on NVIDIA GPUs as well.
Add --medvram or --lowvram Launch Flag
The --medvram flag instructs Stable Diffusion to keep only the active model component in VRAM at a time, offloading others to system RAM. If you have 4GB or less VRAM, use --lowvram instead, which applies even more aggressive memory splitting at the cost of slower generation speed. Add the appropriate flag to your webui-user.bat or webui-user.sh file in the COMMANDLINE_ARGS variable.
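As a minimal sketch of the choice described above, the helper below picks a memory flag from the card's VRAM size and formats the COMMANDLINE_ARGS line for either launcher file. The function names are hypothetical. The 4 GB cutoff for --lowvram follows this guide, while the 8 GB cutoff for --medvram is an assumption you may want to adjust.

```python
# Hypothetical helper, not part of the WebUI. The 4 GB threshold comes from
# the guide; the 8 GB threshold for --medvram is an assumption.

def suggest_flags(vram_gb, use_xformers=False):
    """Return a list of launch flags for a GPU with vram_gb of VRAM."""
    flags = []
    if use_xformers:
        flags.append("--xformers")
    if vram_gb <= 4:
        flags.append("--lowvram")
    elif vram_gb <= 8:
        flags.append("--medvram")
    return flags


def commandline_args_line(flags, windows=False):
    """Format the line to place in webui-user.bat or webui-user.sh."""
    joined = " ".join(flags)
    if windows:
        return f"set COMMANDLINE_ARGS={joined}"
    return f'export COMMANDLINE_ARGS="{joined}"'


if __name__ == "__main__":
    print(commandline_args_line(suggest_flags(6, use_xformers=True)))
    print(commandline_args_line(suggest_flags(4), windows=True))
```

Copy the printed line into the matching launcher file, replacing any existing COMMANDLINE_ARGS line, then restart the WebUI.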
Reduce Batch Size to 1
Generating multiple images simultaneously increases VRAM consumption roughly linearly with the number of images in the batch, on top of the fixed cost of the model weights. Set your batch size to 1 in the WebUI to ensure only a single image is processed at a time. If you need multiple outputs, use the batch count setting instead, which generates images sequentially and reuses the same VRAM allocation.
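The difference between batch size and batch count can be illustrated with a simple linear model. This is a sketch with made-up numbers: the 2.0 GB fixed cost and 1.5 GB per-image cost are hypothetical placeholders, not measurements.

```python
# Simplified linear model. The example figures (2.0 GB fixed, 1.5 GB per
# image) are hypothetical and only illustrate the shape of the trade-off.

def peak_vram_gb(fixed_gb, per_image_gb, batch_size):
    """Peak VRAM when batch_size images are processed at the same time."""
    return fixed_gb + per_image_gb * batch_size


def total_images(batch_size, batch_count):
    """Images produced: batch_count sequential runs of batch_size each."""
    return batch_size * batch_count


if __name__ == "__main__":
    # Four images at once versus four images one after another.
    for size, count in ((4, 1), (1, 4)):
        peak = peak_vram_gb(2.0, 1.5, size)
        print(f"size={size}, count={count}: "
              f"{total_images(size, count)} images, peak {peak} GB")
```

Both settings yield four images, but in this model only the batch size raises the peak. Batch count trades that memory for a longer total run time.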
💡 Pro Tip
Add --no-half-vae to your launch flags alongside --lowvram to prevent the VAE decoder from producing black or corrupted images, which is a common secondary issue when running in low VRAM mode.
Frequently Asked Questions
Does --lowvram significantly slow down image generation?
Can I run Stable Diffusion SDXL on a 6GB VRAM GPU without this error?
Why does the error only happen sometimes and not every generation?
Will upgrading to more system RAM fix the CUDA out of memory error?
Quick diagnostic checklist
Before diving into the full fix, run through these quick checks — they resolve the issue in most cases without additional steps:
Common root causes
Understanding why this error occurs helps you prevent it in the future. The most frequent causes are:
- Insufficient GPU VRAM for the selected model
- Corrupted model checkpoint file
- Outdated GPU drivers
- Python dependency conflicts in the installation
- Incompatible CUDA version for the installed PyTorch
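Some of these causes can be checked from Python before changing any settings. The sketch below assumes you run it inside the same Python environment the WebUI uses. It reports which CUDA version the installed PyTorch build targets and how much VRAM is currently free. The function name gpu_report is hypothetical.

```python
# Diagnostic sketch. Run it in the WebUI's own Python environment so that it
# sees the same PyTorch build. gpu_report is a hypothetical helper name.

def gpu_report():
    """Collect basic PyTorch, CUDA and VRAM facts into a dictionary."""
    report = {"torch_installed": False}
    try:
        import torch
    except ImportError:
        return report  # PyTorch is missing from this environment

    report["torch_installed"] = True
    report["torch_version"] = torch.__version__
    report["built_for_cuda"] = torch.version.cuda  # None on non-CUDA builds
    report["cuda_available"] = torch.cuda.is_available()

    if report["cuda_available"]:
        free_bytes, total_bytes = torch.cuda.mem_get_info()
        report["gpu_name"] = torch.cuda.get_device_name(0)
        report["free_vram_gib"] = round(free_bytes / 2**30, 2)
        report["total_vram_gib"] = round(total_bytes / 2**30, 2)
    return report


if __name__ == "__main__":
    for key, value in gpu_report().items():
        print(f"{key}: {value}")
```

If cuda_available is False on a machine with an NVIDIA card, a driver or CUDA/PyTorch mismatch is a likely suspect. If free VRAM is much lower than total before you generate anything, another application is probably holding GPU memory.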
Still not working?
If none of the steps above resolved the issue, the next step is to ask for help through the support channel of the interface or service you are using, since Stable Diffusion is an open-source model rather than a single hosted product. When reaching out, include:
- The exact error message or code you see
- The steps you already tried from this guide
- Your account plan and the approximate time the error started, if you are using a hosted service
- Your browser/OS version if it is a web interface issue
About Stable Diffusion
Stable Diffusion is an open-source AI image generation model originally developed by the CompVis group at LMU Munich together with Runway and Stability AI. Unlike cloud-only tools, it can be run locally on consumer GPUs. It is accessible via Automatic1111 WebUI, ComfyUI, and cloud platforms like DreamStudio. Local installations require a compatible NVIDIA or AMD GPU with at least 4GB VRAM.