llama-server <chat> exited with status code 1, and no response from localhost:8080 (docker installation) #3744
Hi @cchciose, I saw that there is only 2 GB of memory on your GPU, which is probably not enough to run 3B models. You could try StarCoder 1B or QwenCoder 0.5B instead.
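A minimal sketch of how one might pick a model family that fits the free VRAM reported by nvidia-smi. The thresholds and model names below are rough assumptions for illustration, not official Tabby requirements.

```shell
#!/bin/sh
# pick_model: choose the largest model family likely to fit in the given
# free VRAM (MiB). The thresholds are rough guesses, not official figures.
pick_model() {
  free_mib=$1
  if [ "$free_mib" -ge 4000 ]; then
    echo "StarCoder-3B"
  elif [ "$free_mib" -ge 2000 ]; then
    echo "StarCoder-1B"
  else
    echo "Qwen2-0.5B-Instruct"
  fi
}

pick_model 655    # with ~655 MiB free, even the smallest model may be tight
```

Under these assumed thresholds, a GPU with under 2000 MiB free lands on the smallest option.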
Following your last message, I have tried with StarCoder 1B and QwenCoder 0.5B, and migrated to a Linux installation to see whether Docker was the problem.
Is there a way to debug what happens?
Hi @cchciose, according to the GPU status you posted previously (`| N/A 76C P0 N/A / ERR! | 1393MiB / 2048MiB | 3% Default |`), there is only about 650 MiB of free memory on your GPU, which is not enough to run the model. The 1B and 0.5B are the smallest models, and you have already tried them without success. You will need a device with more memory to run the models.
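The free-memory figure can be computed directly from the nvidia-smi memory column. A small sketch, assuming the `usedMiB / totalMiB` format shown above:

```shell
#!/bin/sh
# free_vram: given a "usedMiB / totalMiB" string as printed by nvidia-smi,
# print the free VRAM in MiB (total minus used).
free_vram() {
  used=$(printf '%s\n' "$1" | grep -o '[0-9]\+' | sed -n 1p)
  total=$(printf '%s\n' "$1" | grep -o '[0-9]\+' | sed -n 2p)
  echo $(( total - used ))
}

free_vram "1393MiB / 2048MiB"   # prints 655
```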
I have switched to CPU (Core i7), and the program works fine, though surely less efficiently than on the GPU. By closing a lot of open programs, I have freed memory on the GPU to have about 1600 MiB available, but the program still exits with an error. Does it mean my NVIDIA GeForce GTX 1050 won't be usable at all? (In your documentation you say: "For 1B to 3B models, it's advisable to have at least NVIDIA T4, 10 Series, or 20 Series GPUs, or Apple Silicon like the M1.") My GPU is from the 10 Series, so I hoped to use it with 1B or 3B models. If I upgrade my machine, what are the minimum memory requirements for the GPU?
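As a rough back-of-the-envelope check (my own approximation, not an official formula): model weights take roughly parameter-count × bytes-per-parameter, plus a fixed overhead for the CUDA context and KV cache. The 500 MiB overhead below is an assumption.

```shell
#!/bin/sh
# est_vram_mib: very rough VRAM estimate in MiB.
#   $1 = parameter count in millions
#   $2 = bytes per parameter (8-bit quantization ~= 1)
# The flat 500 MiB overhead for CUDA context / KV cache is an assumption.
est_vram_mib() {
  echo $(( $1 * $2 + 500 ))
}

est_vram_mib 1000 1   # ~1500 MiB for a 1B model at 8-bit
```

Under this estimate, a 1B model at 8-bit needs around 1500 MiB, which would make ~1600 MiB of free VRAM borderline at best.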
Describe the bug
I have installed the Docker version of Tabby, launched with docker-compose.yml. The container loops on an error with llama-server, and I have no access to http://localhost:8080.
Information about your version
Information about your GPU
Additional context
My distro is Ubuntu 20.04 with cuda-toolkit 12.6.77-1, and my CPU supports AVX2.