Out of Memory Error #1

Closed

chrischen opened this issue Mar 1, 2025 · 5 comments
@chrischen

I ran it with a 512x512 style and content image and always get an out-of-memory error on the GPU, regardless of resolution.

torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 16.00 MiB. GPU 0 has a total capacity of 23.68 GiB of which 4.94 MiB is free. Including non-PyTorch memory, this process has 23.67 GiB memory in use. Of the allocated memory 22.71 GiB is allocated by PyTorch, and 662.65 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)

It seems to use about 18 GB before the StableDiffusionXLImg2ImgPipeline part.

I have an RTX 3090 with 24 GB of memory and am able to run things like InstantStyle just fine.
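
For reference, the allocator hint mentioned in the traceback has to be set before CUDA is initialized; a minimal sketch, assuming it is set before torch touches the GPU:

```python
import os

# Must be set before CUDA is initialized, so do it before importing torch.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "expandable_segments:True"

import torch  # noqa: E402
```

This only helps when the failure is due to fragmentation (large "reserved but unallocated" numbers), not when the model genuinely needs more memory than the card has.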

@XWJByte
Collaborator

XWJByte commented Mar 3, 2025

@chrischen Hi, thank you for your interest in our work.
Compared to InstantStyle, StyleSSP needs more GPU memory because it also runs the IP-Adapter-Instruct model to extract the style and content embeddings.
We suggest using a card with more GPU memory for inference; for instance, we ran inference successfully on A100 cards with 40 GB and 80 GB.
We haven't tried multi-GPU parallel inference. If you manage to run inference across multiple 3090 cards, you are welcome to share your experience here.

@chrischen
Author

Do you know the minimum memory requirement? Would a 32 GB 5090 work? I am testing on consumer-level setups, so an A100 is difficult to access.

@XWJByte
Collaborator

XWJByte commented Mar 3, 2025

On my machine, style transfer at a resolution of 1024 requires at least 30.67 GB of GPU memory.
You can try it on a 5090; if it doesn't work, lower the image resolution to reduce some of the GPU memory consumed.
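
A minimal sketch of downscaling both inputs before they reach the pipeline (file names here are placeholders; an img2img pipeline runs at the resolution of the images it receives, so smaller inputs cut activation memory):

```python
from PIL import Image

# Placeholder file names; SDXL img2img works at the resolution of the
# images you feed it, so downscaling both inputs reduces peak memory.
style = Image.open("style.png").convert("RGB").resize((768, 768))
content = Image.open("content.png").convert("RGB").resize((768, 768))
```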

@chrischen
Author

I set load_in_4bit=True for BLIP-2, lowered the resolution to 768, and avoided multiple ControlNets (such as "combined" or "tile_canny"); with that combination it fits in 24 GB of VRAM.

Lowering the resolution alone didn't help much; even at 64 px it still went over 24 GB.
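
For reference, the 4-bit BLIP-2 load looked roughly like this (the checkpoint name is a placeholder, and newer transformers releases may expect a BitsAndBytesConfig instead of the bare flag):

```python
import torch
from transformers import Blip2ForConditionalGeneration, Blip2Processor

# Placeholder checkpoint; 4-bit loading requires the bitsandbytes package.
model_id = "Salesforce/blip2-opt-2.7b"
processor = Blip2Processor.from_pretrained(model_id)
model = Blip2ForConditionalGeneration.from_pretrained(
    model_id,
    load_in_4bit=True,          # quantize weights to 4 bit at load time
    torch_dtype=torch.float16,  # keep non-quantized modules in fp16
)
```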

@XWJByte
Collaborator

XWJByte commented Mar 3, 2025

@chrischen Well done, and thank you for sharing your experience with saving GPU memory.
I'm closing this issue; feel free to reopen it if you encounter a GPU OOM issue again.

@XWJByte XWJByte closed this as completed Mar 3, 2025