Lambda Tensorbook - Unable to recognize GPU with PyTorch

JayUrbain · February 24, 2022, 10:02am

PyTorch is unable to recognize the GPU after installation. Initially, rebooting made the GPU visible again, but that no longer works.

I have followed the instructions in this post to create a Conda environment and install PyTorch with GPU support:

$ conda create -n pytorch-gpu python=3.8
$ conda activate pytorch-gpu
$ conda install pytorch torchvision torchaudio cudatoolkit=11.3 -c pytorch

$ python
Python 3.8.12 (default, Oct 12 2021, 13:49:34)
[GCC 7.5.0] :: Anaconda, Inc. on linux

>>> import torch
>>> torch.cuda.is_available()
False
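
A bare `False` from `torch.cuda.is_available()` does not say whether the installed PyTorch build lacks CUDA support or whether the driver is the problem. The sketch below is an illustration only, not part of the original post: `describe_torch` is a hypothetical helper name, and it relies only on `torch.__version__`, `torch.version.cuda`, `torch.cuda.is_available()` and `torch.cuda.device_count()`.

```python
def describe_torch():
    """Collect basic facts about the installed PyTorch build, if any."""
    try:
        import torch
    except ImportError:
        # PyTorch is not importable in the active environment.
        return {"torch_installed": False}

    return {
        "torch_installed": True,
        "torch_version": torch.__version__,
        # None here means a CPU-only build was installed.
        "built_with_cuda": torch.version.cuda,
        "cuda_available": torch.cuda.is_available(),
        "device_count": torch.cuda.device_count(),
    }


if __name__ == "__main__":
    for key, value in describe_torch().items():
        print(f"{key}: {value}")
```

If `built_with_cuda` shows a CUDA version but `cuda_available` is still `False`, the PyTorch build itself has CUDA support and the driver side is the more likely place to look.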

I have also tried rebooting and re-creating a new environment, as suggested in the thread "Pytorch sometimes fails to recognize GPU", and that does not work.

Any direction would be greatly appreciated.

Thanks,
Jay

JayUrbain · February 24, 2022, 10:16am

Note: I have also upgraded the system:
$ sudo apt-get update && sudo apt-get upgrade -y

And still no GPU.

matteo-pennisi · February 24, 2022, 11:49am

Hi,
are you able to run nvidia-smi?
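
For anyone who wants to script this check, here is a minimal sketch, assuming Python 3.7 or later. `nvidia_smi_status` is a hypothetical helper name; it distinguishes "the tool is not installed" from "the tool is installed but cannot talk to the driver", using `nvidia-smi -L` to list the GPUs.

```python
import shutil
import subprocess


def nvidia_smi_status():
    """Return a short description of whether nvidia-smi can be run."""
    path = shutil.which("nvidia-smi")
    if path is None:
        return "not installed (nvidia-smi is not on PATH)"
    try:
        # -L lists the GPUs the driver can see, one per line.
        result = subprocess.run(
            [path, "-L"], capture_output=True, text=True, timeout=30
        )
    except (OSError, subprocess.TimeoutExpired) as exc:
        return f"installed, but failed: {exc}"
    if result.returncode != 0:
        message = (result.stderr or result.stdout).strip()
        return f"installed, but failed: {message}"
    return "ok:\n" + result.stdout.strip()


if __name__ == "__main__":
    print(nvidia_smi_status())
```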

JayUrbain · February 24, 2022, 12:10pm

No. I get the following. I have not tried changing the installation.

Command 'nvidia-smi' not found, but can be installed with:

sudo apt install nvidia-utils-435 # version 435.21-0ubuntu7, or
sudo apt install nvidia-utils-440 # version 440.82+really.440.64-0ubuntu6
sudo apt install nvidia-340 # version 340.108-0ubuntu5.20.04.2
sudo apt install nvidia-utils-390 # version 390.144-0ubuntu0.20.04.1
sudo apt install nvidia-utils-450-server # version 450.172.01-0ubuntu0.20.04.1
sudo apt install nvidia-utils-470 # version 470.103.01-0ubuntu0.20.04.1
sudo apt install nvidia-utils-470-server # version 470.103.01-0ubuntu0.20.04.1
sudo apt install nvidia-utils-510 # version 510.47.03-0ubuntu0.20.04.1
sudo apt install nvidia-utils-418-server # version 418.226.00-0ubuntu0.20.04.2

Thanks

matteo-pennisi · February 24, 2022, 1:28pm

OK, this is something that happened to me after an Ubuntu server/desktop upgrade. For some reason, the NVIDIA drivers installed by Lambda Stack get overridden. A solution that is not optimal, but that works for me, is to reinstall Lambda Stack (server version) and reboot.

LAMBDA_REPO=$(mktemp) && \
wget -O${LAMBDA_REPO} https://lambdalabs.com/static/misc/lambda-stack-repo.deb && \
sudo dpkg -i ${LAMBDA_REPO} && rm -f ${LAMBDA_REPO} && \
sudo apt-get update && \
sudo apt-get --yes upgrade && \
sudo apt-get install --yes --no-install-recommends lambda-server && \
sudo apt-get install --yes --no-install-recommends nvidia-headless-470-server && \
sudo apt-get install --yes --no-install-recommends nvidia-fabricmanager-470 && \
sudo apt-get install --yes --no-install-recommends lambda-stack-cuda

and then reboot

sudo reboot now

Tell me if this works.

JayUrbain · February 24, 2022, 3:57pm

Reinstalling the Lambda stack did not help. But thanks, I really appreciate your help.

Should I reinstall the NVIDIA drivers?

Seems like something is out of sync.

Thanks.

matteo-pennisi · February 25, 2022, 7:19pm

Just to understand, is this Ubuntu desktop or server?

JayUrbain · February 26, 2022, 6:03pm

Ubuntu desktop (Tensorbook)

matteo-pennisi · March 1, 2022, 4:18pm

Are you able to do a clean reinstall? I've found that the best way to install Lambda Stack is to install Ubuntu without letting it search for drivers (untick that option during the installation) and then install Lambda Stack.
This is surely a driver problem, since you are not able to run nvidia-smi. If you can't reinstall, try purging all NVIDIA drivers and reinstalling Lambda Stack (the desktop version, not the server one).

markd · March 4, 2022, 9:55am

It looks like you are missing basic packages such as nvidia-utils, so that needs to be resolved first. Also make sure the driver is installed.

Also, from the output above, it looks like you do not have Lambda Stack installed: the suggested packages point only to Ubuntu.

Is this a Tensorbook, a desktop or a server?

Check to see if the kernel driver is loaded:
$ lsmod | grep nvidia

Check that the NVIDIA packages are installed, and from which repository (normally around 30-40 packages; you may also have old stale packages):
$ dpkg --list | grep nvidia

You can clean up what is there and reinstall Lambda Stack (see "Lambda Stack: an AI software stack that's always up-to-date").
* The packages to install differ depending on whether this is a desktop or a server. (You do not need headless or fabric manager on a desktop.)

a. To remove Lambda Stack:
$ sudo rm -f /etc/apt/sources.list.d/{graphics,nvidia,cuda}*;
COLUMNS=200 dpkg -l | awk '/cuda|lib(accinj64|cu(blas|dart|dnn|fft|inj|pti|rand|solver|sparse)|magma|nccl|npp|nv[^p])|nv(idia|ml)|tensor(flow|board)|torch/ { print $2 }' |
sudo xargs -or apt -y remove --purge

b. To reinstall Lambda Stack on a desktop:
LAMBDA_REPO=$(mktemp) &&
wget -O${LAMBDA_REPO} https://lambdalabs.com/static/misc/lambda-stack-repo.deb &&
sudo dpkg -i ${LAMBDA_REPO} && rm -f ${LAMBDA_REPO} &&
sudo apt-get -y update && sudo apt-get -y install lambda-stack-cuda
c. To reinstall Lambda Stack on a server:
LAMBDA_REPO=$(mktemp) &&
wget -O${LAMBDA_REPO} https://lambdalabs.com/static/misc/lambda-stack-repo.deb &&
sudo dpkg -i ${LAMBDA_REPO} && rm -f ${LAMBDA_REPO} &&
sudo apt-get update &&
sudo apt-get --yes upgrade &&
sudo apt-get install --yes --no-install-recommends lambda-server &&
sudo apt-get install --yes --no-install-recommends nvidia-headless-470-server &&
sudo apt-get install --yes --no-install-recommends nvidia-fabricmanager-470 &&
sudo apt-get install --yes --no-install-recommends lambda-stack-cuda
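
The two checks at the top of this post (kernel module loaded, NVIDIA packages installed) can also be collected in one place. The following is only a sketch under stated assumptions, not something from Lambda: the function names are hypothetical, it reads `/proc/modules` instead of calling `lsmod`, and it asks `dpkg-query` for each package's status so that stale, removed-but-not-purged packages are visible. It matches on the module or package name only, which is narrower than `grep nvidia` on the whole line.

```python
import shutil
import subprocess
from pathlib import Path


def nvidia_modules(proc_modules_text):
    """Return module names containing 'nvidia' from /proc/modules-style text."""
    names = [
        line.split()[0]
        for line in proc_modules_text.splitlines()
        if line.strip()
    ]
    return [name for name in names if "nvidia" in name]


def loaded_nvidia_modules():
    """Read /proc/modules (Linux only) and list loaded NVIDIA modules."""
    path = Path("/proc/modules")
    return nvidia_modules(path.read_text()) if path.exists() else []


def nvidia_packages():
    """Return (name, version, status) for dpkg packages with 'nvidia' in the name."""
    if shutil.which("dpkg-query") is None:
        return []  # Not a Debian/Ubuntu system, or dpkg is unavailable.
    result = subprocess.run(
        ["dpkg-query", "-W", "-f=${Package}\t${Version}\t${Status}\n"],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return []
    packages = []
    for line in result.stdout.splitlines():
        fields = line.split("\t")
        if len(fields) == 3 and "nvidia" in fields[0]:
            packages.append(tuple(fields))
    return packages


if __name__ == "__main__":
    print("Loaded modules:", loaded_nvidia_modules() or "none")
    for name, version, status in nvidia_packages():
        print(f"{name}\t{version}\t{status}")
```

An empty module list together with a non-empty package list would suggest the driver packages are present but the kernel module is not loaded.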
