Torch is not able to use GPU; add --skip-torch-cuda-test to COMMANDLINE_ARGS variable to disable this check #7

Open
FreshDumbledore opened this issue Oct 6, 2023 · 15 comments

@FreshDumbledore

Following your tutorial leads to the error below at the very end on my FreeBSD 13.2 system, whether I skip the PyTorch steps or do them:

Torch is not able to use GPU; add --skip-torch-cuda-test to COMMANDLINE_ARGS variable to disable this check

I could add this argument, but as I understand it I would then be running Stable Diffusion without GPU support.

To resolve it, I tried copying the line from your PyTorch instructions into the Stable Diffusion instructions:

pip install torch==1.12.1+cu113 --extra-index-url https://download.pytorch.org/whl/cu113

But that leads to:

Python 3.10.13 (main, Sep 11 2023, 13:44:35) [GCC 11.2.0]
Version: v1.6.0
Commit hash: 5ef669de080814067961f28357256e8fe27544f4
Launching Web UI with arguments:
/home/xxxx/stablediff/conda/envs/automatic/lib/python3.10/site-packages/torchvision/io/image.py:13: UserWarning: Failed to load image Python extension: '/usr/xxxx/resu/stablediff/conda/envs/automatic/lib/python3.10/site-packages/torchvision/image.so: undefined symbol: _ZNK3c107SymBool10guard_boolEPKcl'If you don't plan on using image functionality from torchvision.io, you can ignore this warning. Otherwise, there might be something wrong with your environment. Did you have libjpeg or libpng installed before building torchvision from source?
warn(
no module 'xformers'. Processing without...
No SDP backend available, likely because you are running in pytorch versions < 2.0. In fact, you are using PyTorch 1.12.1+cu113. You might want to consider upgrading.
no module 'xformers'. Processing without...
No module 'xformers'. Proceeding without it.
Bus error (core dumped)

And dmesg reads:
linux: jid 0 pid 51026 (python3.10): linux: jid 0 pid 51026 (python3.10): syscall mbind not implementedlinux: jid 0 pid 51026 (python3.10):
syscall mbind not implemented
syscall mbind not implemented
pid 51026 (python3.10), jid 0, uid 1001: exited on signal 10 (core dumped)
linux: jid 0 pid 98077 (python3.10): linux: jid 0 pid 98077 (python3.10): syscall mbind not implementedsyscall mbind not implemented

pid 98077 (python3.10), jid 0, uid 1001: exited on signal 10 (core dumped)

All the tests worked fine, even before trying to add torch:

(pytorch) # LD_PRELOAD="${BASE_PATH}/dummy-uvm.so" python3 -c 'import torch; print(torch.cuda.is_available())'
True

(pytorch) # LD_PRELOAD="${BASE_PATH}/dummy-uvm.so" python3 -c 'import torch; print(torch.cuda.get_device_name(0))'
NVIDIA GeForce GTX 1050
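
The two one-liners above can be folded into a single check that also reports the PyTorch version and the CUDA version it was built against. This is only a minimal sketch and not part of the tutorial: the function name `cuda_report` is made up for illustration, and on FreeBSD it would presumably still need the same `LD_PRELOAD` setting as the commands above.

```python
import importlib


def cuda_report():
    """Collect basic facts about the installed PyTorch and its CUDA support."""
    try:
        torch = importlib.import_module("torch")
    except ImportError:
        # torch is not installed in this environment
        return {"torch_installed": False}

    report = {
        "torch_installed": True,
        "torch_version": torch.__version__,
        # CUDA version the wheel was built against (None for CPU-only builds)
        "built_for_cuda": torch.version.cuda,
        "cuda_available": torch.cuda.is_available(),
    }
    if report["cuda_available"]:
        report["device_name"] = torch.cuda.get_device_name(0)
    return report


if __name__ == "__main__":
    for key, value in cuda_report().items():
        print(f"{key}: {value}")
```

Running it inside both the `pytorch` and the `automatic` environments would show whether the two environments actually see the same PyTorch build.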

@FreshDumbledore
Author

I downgraded Python in the miniconda environment to 3.9, since one of the output lines in your tutorial shows you using that version, but no luck:

Python 3.9.18 (main, Sep 11 2023, 13:41:44)
[GCC 11.2.0]
Version: v1.6.0
Commit hash: 5ef669de080814067961f28357256e8fe27544f4
Installing clip
Installing open_clip
Installing requirements for CodeFormer
Installing requirements
Launching Web UI with arguments:
/home/xxxx/stablediff/conda/envs/automatic/lib/python3.9/site-packages/torchvision/io/image.py:13: UserWarning: Failed to load image Python extension: '/lib64/libstdc++.so.6: version `GLIBCXX_3.4.21' not found (required by /usr/home/xxxx/stablediff/conda/envs/automatic/lib/python3.9/site-packages/torchvision/image.so)'If you don't plan on using image functionality from `torchvision.io`, you can ignore this warning. Otherwise, there might be something wrong with your environment. Did you have `libjpeg` or `libpng` installed before building `torchvision` from source?
warn(
no module 'xformers'. Processing without...
No SDP backend available, likely because you are running in pytorch versions < 2.0. In fact, you are using PyTorch 1.12.1+cu113. You might want to consider upgrading.
no module 'xformers'. Processing without...
No module 'xformers'. Proceeding without it.
Bus error (core dumped)

@verm
Owner

verm commented Oct 6, 2023

I'll try to update the tutorial this weekend, as it's a holiday here, but there shouldn't be bus errors regardless. I'll try installing it tonight if I can, to see whether it's a quick fix.

@FreshDumbledore
Author

I am trying to resolve the GLIBCXX_3.4.21 "not found" error on my end; not sure if it's the core issue though (:

@verm
Owner

verm commented Oct 6, 2023

GLIBCXX_3.4.19 is the latest version available on FreeBSD at the moment. The fix will be to install an older version of torchvision that relies on GLIBCXX_3.4.19 or lower.

@FreshDumbledore
Author

I identified all the libstdc++.so.6 libraries in my conda environment and confirmed that they all include GLIBCXX_3.4.21.
The one on my base system does not, though, and that seems to be the one being referenced: /lib64/libstdc++.so.6

That would be why it throws the error UserWarning: Failed to load image Python extension: '/lib64/libstdc++.so.6: version GLIBCXX_3.4.21' not found

I'm looking into what you said about installing an older torchvision version.
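
For anyone repeating this check, here is a rough sketch of how the GLIBCXX versions mentioned in a shared object can be listed from Python. It simply scans the raw bytes for `GLIBCXX_x.y.z` strings, much like `strings file | grep GLIBCXX` would, so it cannot tell versions a library provides from versions it requires. The helper names `glibcxx_versions` and `newest_glibcxx` are invented for this example.

```python
import re
import sys

# Matches symbol-version strings such as b"GLIBCXX_3.4.21" inside a binary.
_GLIBCXX = re.compile(rb"GLIBCXX_(\d+(?:\.\d+)+)")


def glibcxx_versions(data):
    """Return the sorted, de-duplicated GLIBCXX versions found in raw bytes."""
    found = {
        tuple(int(part) for part in match.group(1).decode().split("."))
        for match in _GLIBCXX.finditer(data)
    }
    return sorted(found)


def newest_glibcxx(path):
    """Return the highest GLIBCXX version mentioned in the file, or None."""
    with open(path, "rb") as handle:
        versions = glibcxx_versions(handle.read())
    return versions[-1] if versions else None


if __name__ == "__main__":
    # Paths are passed on the command line, e.g. /lib64/libstdc++.so.6
    for path in sys.argv[1:]:
        print(path, newest_glibcxx(path))
```

Pointed at a `libstdc++.so.6`, the highest version is what that library offers; pointed at an extension such as torchvision's `image.so`, it is the newest version the extension asks for. Comparing the two numbers shows whether the extension can load against that library.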

@verm
Owner

verm commented Oct 6, 2023

If you install it before doing anything else, it won't be installed again during startup. You should also be able to just delete the package and roll back the version.

I'll see about getting a fully updated tutorial going with the latest PyTorch, but that will be more involved, as the outdated Linux base will cause some issues, as you've noticed...

@FreshDumbledore
Author

I did not find a way to confirm which version uses GLIBCXX_3.4.19, so I just tried 1.11.0+cu113.

(automatic) bash-4.2$ pip install torch==1.11.0+cu113 --extra-index-url https://download.pytorch.org/whl/cu113
Looking in indexes: https://pypi.org/simple, https://download.pytorch.org/whl/cu113
Collecting torch==1.11.0+cu113
Downloading https://download.pytorch.org/whl/cu113/torch-1.11.0%2Bcu113-cp39-cp39-linux_x86_64.whl (1637.0 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.6/1.6 GB ? eta 0:00:00
Requirement already satisfied: typing-extensions in ./conda/envs/automatic/lib/python3.9/site-packages (from torch==1.11.0+cu113) (4.7.1)
Installing collected packages: torch
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
tomesd 0.1.3 requires torch>=1.12.1, but you have torch 1.11.0+cu113 which is incompatible.
Successfully installed torch-1.11.0+cu113
(automatic) bash-4.2$ LD_PRELOAD="/home/xxxx/stablediff/dummy-uvm.so" python3 launch.py
Python 3.9.18 (main, Sep 11 2023, 13:41:44)
[GCC 11.2.0]
Version: v1.6.0
Commit hash: 5ef669de080814067961f28357256e8fe27544f4
Launching Web UI with arguments:
/home/xxxx/stablediff/conda/envs/automatic/lib/python3.9/site-packages/torchvision/io/image.py:13: UserWarning: Failed to load image Python extension: '/lib64/libstdc++.so.6: version `GLIBCXX_3.4.21' not found (required by /usr/home/xxxx/stablediff/conda/envs/automatic/lib/python3.9/site-packages/torchvision/image.so)'If you don't plan on using image functionality from `torchvision.io`, you can ignore this warning. Otherwise, there might be something wrong with your environment. Did you have `libjpeg` or `libpng` installed before building `torchvision` from source?
warn(
no module 'xformers'. Processing without...
No SDP backend available, likely because you are running in pytorch versions < 2.0. In fact, you are using PyTorch 1.11.0+cu113. You might want to consider upgrading.
no module 'xformers'. Processing without...
No module 'xformers'. Proceeding without it.
Bus error (core dumped)

I guess I would need to try something older?

@verm
Owner

verm commented Oct 6, 2023

Torchvision is a separate package; downgrading PyTorch won't help.
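
Since the two packages are versioned separately, it can help to look at both installed versions side by side. The sketch below is an illustration only: the helper names are invented, and the table lists just the torch/torchvision release pairings relevant to this thread (1.11.0 with 0.12.0, 1.12.1 with 0.13.1, 2.0.1 with 0.15.2). It also assumes the packages expose standard Python package metadata, which may not hold for every conda build.

```python
from importlib import metadata

# Release pairings published by the PyTorch project (torch -> torchvision).
# Only the versions that come up in this thread are listed.
MATCHING_TORCHVISION = {
    "1.11.0": "0.12.0",
    "1.12.1": "0.13.1",
    "2.0.1": "0.15.2",
}


def base_version(version):
    """Strip a local suffix such as '+cu113' from a version string."""
    return version.split("+", 1)[0]


def installed_version(distribution):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


def check_pair(torch_version, torchvision_version):
    """Return True/False for a known pairing, None if the pairing is unknown."""
    expected = MATCHING_TORCHVISION.get(base_version(torch_version))
    if expected is None:
        return None
    return base_version(torchvision_version) == expected


if __name__ == "__main__":
    torch_v = installed_version("torch")
    vision_v = installed_version("torchvision")
    print("torch:", torch_v, "torchvision:", vision_v)
    if torch_v and vision_v:
        print("matching pair:", check_pair(torch_v, vision_v))
```

For the combination seen in this thread, PyTorch 1.12.1+cu113 together with torchvision 0.15.2, `check_pair` returns `False`.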

@FreshDumbledore
Author

(automatic) bash-4.2$ conda list | grep torchvision
torchvision 0.15.2 cuda118py39h196c800_0

(automatic) bash-4.2$ conda search torchvision
Loading channels: done

# Name Version Build Channel

torchvision 0.2.0 py27_0 pkgs/main
torchvision 0.2.0 py35_0 pkgs/main
torchvision 0.2.0 py36_0 pkgs/main
torchvision 0.2.1 py27_0 pkgs/main
torchvision 0.2.1 py35_0 pkgs/main
torchvision 0.2.1 py36_0 pkgs/main
torchvision 0.2.1 py37_0 pkgs/main
torchvision 0.3.0 cuda100py27h72fc40a_0 pkgs/main
torchvision 0.3.0 cuda100py36h72fc40a_0 pkgs/main
torchvision 0.3.0 cuda100py37h72fc40a_0 pkgs/main
torchvision 0.3.0 cuda90py27h6edc907_0 pkgs/main
torchvision 0.3.0 cuda90py36h6edc907_0 pkgs/main
torchvision 0.3.0 cuda90py37h6edc907_0 pkgs/main
torchvision 0.3.0 cuda92py27h0601742_0 pkgs/main
torchvision 0.3.0 cuda92py36h0601742_0 pkgs/main
torchvision 0.3.0 cuda92py37h0601742_0 pkgs/main
torchvision 0.4.0 cpu_py27h1e143f5_0 pkgs/main
torchvision 0.4.0 cpu_py36h1e143f5_0 pkgs/main
torchvision 0.4.0 cpu_py37h1e143f5_0 pkgs/main
torchvision 0.4.0 cuda100py27hecfc37a_0 pkgs/main
torchvision 0.4.0 cuda100py36hecfc37a_0 pkgs/main
torchvision 0.4.0 cuda100py37hecfc37a_0 pkgs/main
torchvision 0.4.0 cuda92py27h1667eeb_0 pkgs/main
torchvision 0.4.0 cuda92py36h1667eeb_0 pkgs/main
torchvision 0.4.0 cuda92py37h1667eeb_0 pkgs/main
torchvision 0.4.2 cpu_py27h9ec355b_0 pkgs/main
torchvision 0.4.2 cpu_py36h9ec355b_0 pkgs/main
torchvision 0.4.2 cpu_py37h9ec355b_0 pkgs/main
torchvision 0.4.2 cuda100py27hecfc37a_0 pkgs/main
torchvision 0.4.2 cuda100py36hecfc37a_0 pkgs/main
torchvision 0.4.2 cuda100py37hecfc37a_0 pkgs/main
torchvision 0.4.2 cuda92py27h1667eeb_0 pkgs/main
torchvision 0.4.2 cuda92py36h1667eeb_0 pkgs/main
torchvision 0.4.2 cuda92py37h1667eeb_0 pkgs/main
torchvision 0.8.2 cpu_py37ha229d99_0 pkgs/main
torchvision 0.8.2 cpu_py38ha229d99_0 pkgs/main
torchvision 0.8.2 cpu_py39ha229d99_0 pkgs/main
torchvision 0.11.3 cpu_py310h164cc8f_1 pkgs/main
torchvision 0.11.3 cpu_py310h164cc8f_2 pkgs/main
torchvision 0.11.3 cpu_py37h164cc8f_0 pkgs/main
torchvision 0.11.3 cpu_py37h164cc8f_1 pkgs/main
torchvision 0.11.3 cpu_py37h164cc8f_2 pkgs/main
torchvision 0.11.3 cpu_py38h164cc8f_0 pkgs/main
torchvision 0.11.3 cpu_py38h164cc8f_1 pkgs/main
torchvision 0.11.3 cpu_py38h164cc8f_2 pkgs/main
torchvision 0.11.3 cpu_py39h164cc8f_0 pkgs/main
torchvision 0.11.3 cpu_py39h164cc8f_1 pkgs/main
torchvision 0.11.3 cpu_py39h164cc8f_2 pkgs/main
torchvision 0.13.1 cpu_py310h164cc8f_0 pkgs/main
torchvision 0.13.1 cpu_py37h164cc8f_0 pkgs/main
torchvision 0.13.1 cpu_py38h164cc8f_0 pkgs/main
torchvision 0.13.1 cpu_py39h164cc8f_0 pkgs/main
torchvision 0.15.2 cpu_py310h83e0c9b_0 pkgs/main
torchvision 0.15.2 cpu_py311h6e929fa_0 pkgs/main
torchvision 0.15.2 cpu_py38h83e0c9b_0 pkgs/main
torchvision 0.15.2 cpu_py39h83e0c9b_0 pkgs/main
torchvision 0.15.2 cuda118py310h196c800_0 pkgs/main
torchvision 0.15.2 cuda118py311h4cc2eb7_0 pkgs/main
torchvision 0.15.2 cuda118py38h196c800_0 pkgs/main
torchvision 0.15.2 cuda118py39h196c800_0 pkgs/main

So I got 0.15.2 for cuda118.
The other recent versions are all labeled cpu. Would I have to go down to 0.4.2 to try an older version with CUDA support?

Either way, I am not in a hurry. If you want to revisit it and adjust the tutorial, I can wait for that :)
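
A long `conda search` listing like the one above is easier to read once it is reduced to the versions that have CUDA builds. The sketch below assumes the four-column row layout shown in the listing (name, version, build, channel) and build strings that start with `cuda` followed by `py`; the function name `cuda_builds` is made up for this example.

```python
from collections import defaultdict


def cuda_builds(conda_search_output):
    """Map each package version to the CUDA tags found in its build strings."""
    by_version = defaultdict(set)
    for line in conda_search_output.splitlines():
        fields = line.split()
        # Package rows have exactly four columns: name, version, build, channel.
        if len(fields) != 4:
            continue
        _name, version, build, _channel = fields
        if build.startswith("cuda"):
            # e.g. "cuda118py39h196c800_0" -> "cuda118"
            by_version[version].add(build.split("py", 1)[0])
    return {version: sorted(tags) for version, tags in by_version.items()}


if __name__ == "__main__":
    import sys

    # Usage: conda search torchvision | python3 this_script.py
    for version, tags in cuda_builds(sys.stdin.read()).items():
        print(version, ", ".join(tags))
```

Applied to the listing above, only 0.3.0, 0.4.0, 0.4.2 and 0.15.2 have CUDA builds. That listing only covers the `pkgs/main` channel, so other channels may carry different builds.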

@verm
Owner

verm commented Oct 6, 2023

Sure, try 0.4.2; that's what I'd end up doing. It's probably the version that was being used before.

@verm
Owner

verm commented Oct 6, 2023

Actually, it looks like it was using 0.13.1 for Python 3.10 and CUDA 11.3: torchvision-0.13.1-py310_cu113. I wonder where that package went.

@verm
Owner

verm commented Oct 6, 2023

It's missing from https://anaconda.org/anaconda/torchvision/files. I wonder why they got rid of it. Hmm, I might have to host it somewhere.

@FreshDumbledore
Author

Do you want me to test anything on my end at this point?

@FreshDumbledore
Author

There is a tarball, linux-64/torchvision-0.13.1-py310_cu113.tar.bz2, at https://anaconda.org/pytorch/torchvision/0.13.1/download/linux-64/torchvision-0.13.1-py310_cu113.tar.bz2, if that helps.
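
Before installing a tarball like this, it is worth checking that its build string matches the environment, since this one targets Python 3.10 while the environment above was moved to Python 3.9. The sketch below splits a conda package file name of the form `<name>-<version>-<build>` plus suffix. The `py…_cu…` pattern is an assumption based on this one file name, and both helper names are invented for illustration.

```python
import re

_SUFFIXES = (".tar.bz2", ".conda")
# Assumed build-string pattern, taken from "py310_cu113" in this thread.
_BUILD_TAGS = re.compile(r"py(?P<python>\d+)_cu(?P<cuda>\d+)")


def split_conda_filename(filename):
    """Split '<name>-<version>-<build><suffix>' into its three parts."""
    for suffix in _SUFFIXES:
        if filename.endswith(suffix):
            stem = filename[: -len(suffix)]
            break
    else:
        raise ValueError(f"not a conda package file name: {filename!r}")
    # The name itself may contain dashes, so split from the right.
    name, version, build = stem.rsplit("-", 2)
    return name, version, build


def build_tags(build):
    """Extract Python and CUDA tags from a build string like 'py310_cu113'."""
    match = _BUILD_TAGS.search(build)
    return match.groupdict() if match else None


if __name__ == "__main__":
    name, version, build = split_conda_filename(
        "torchvision-0.13.1-py310_cu113.tar.bz2"
    )
    print(name, version, build_tags(build))
```

Build strings that do not follow this pattern, such as the `cpu_...` ones in the search listing, make `build_tags` return `None`.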

@FreshDumbledore
Author

I sort of forgot about this whole topic. Do you want to carry on with it?
