For versioning: CUDA, PyTorch, TensorFlow, etc. need to be in sync. You would use Docker, a Python venv, or Anaconda to manage the versions. This gives you the most flexibility to change the CUDA version and the software that depends on it.
For the virtual envs: see the instructions for the virtual envs and containers, as mentioned by Cody.
For the driver, you can install it with Ubuntu, Lambda Stack, NVIDIA's packages, or NVIDIA's runtime script, so use whichever you prefer. I can mostly help you with Lambda Stack.
A driver with support for CUDA 11.1 will work with 3090s, A100s, and older GPUs. H100s require a driver with support for CUDA 11.8.
But having the latest released driver is normally best.
You can use a CUDA version as old as the minimum required for your GPU.
Lambda Stack will install the latest tested NVIDIA driver and a trusted/tested CUDA (currently 12.2), along with other packages such as TensorFlow and PyTorch built for that CUDA version.
A Python venv or an Anaconda environment will ignore the system Python and its modules (except things installed in ~/.local and /usr/local, which are a common source of conflicts). So it allows you to change the versions of CUDA, PyTorch, and TensorFlow.
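As a rough sketch of the venv route (not official Lambda instructions): the helper names below are made up for illustration, and the index URL is my assumption of PyTorch's CUDA 11.8 wheel index, so check the PyTorch install page before relying on it. The wheels from such an index are built against that CUDA version, so the system CUDA 12.2 stays untouched.

```python
import subprocess
import sys
from pathlib import Path

# Assumption: PyTorch's wheel index for CUDA 11.8 builds.
# Verify the current URL on the PyTorch "Get Started" page.
CU118_INDEX = "https://download.pytorch.org/whl/cu118"


def venv_python(env_dir: str) -> Path:
    """Path of the interpreter inside a venv (Linux layout)."""
    return Path(env_dir) / "bin" / "python"


def setup_commands(env_dir: str, index_url: str = CU118_INDEX) -> list[list[str]]:
    """Build the commands that create a venv and install a CUDA-specific PyTorch."""
    py = str(venv_python(env_dir))
    return [
        # 1. Create an isolated environment (system site-packages are excluded by default).
        [sys.executable, "-m", "venv", env_dir],
        # 2. Install PyTorch from the CUDA-specific wheel index, using the venv's own pip.
        [py, "-m", "pip", "install", "torch", "--index-url", index_url],
    ]


def create_env(env_dir: str) -> None:
    """Run the commands in order, stopping at the first failure."""
    for cmd in setup_commands(env_dir):
        subprocess.run(cmd, check=True)
```

Calling `create_env("cuda118-env")` would create the environment in the current directory; after that, running `cuda118-env/bin/python` gives you the isolated interpreter.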
To be more clear: what I need is an environment with CUDA 11.8. The Lambda Cloud instance has CUDA 12.2 preinstalled.
For the virtual envs: see the instructions for the virtual envs and containers, as mentioned by Cody.
I don’t see how using a container would help. Docker containers rely on the host CUDA installation. There is no such thing as a CUDA installation in the container layer.
For the driver, you can install it with Ubuntu, Lambda Stack, NVIDIA's packages, or NVIDIA's runtime script.
Removing the installed version and installing with the Ubuntu package manager or the NVIDIA scripts will raise an endless list of conflicts with the preinstalled Lambda Stack packages. I guess there should be a simpler way. That’s why I was asking directly for a command.
How to install Lambda stack is here
It’s already installed. The docs only specify how to upgrade Lambda Stack to the latest version. I don’t see any instructions on how to go back to an older version with CUDA 11.8. I don’t even see a list of available versions.
Thank you! It is important to always list where you are running and your requirements.
You have not listed which GPUs you are running on, etc.
It is fairly simple: just follow the basic instructions Cody listed. The driver is NOT CUDA.
The driver shows the maximum CUDA level it supports.
The virtual environments (Docker, Python venv, Anaconda) allow you to switch the CUDA version while using the currently running driver. The driver supports older versions of CUDA.
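To make the "maximum CUDA level" point concrete, here is a small sketch that reads the `CUDA Version:` field from the `nvidia-smi` header and checks whether a wanted toolkit version fits under it. The function names are my own, and the sample header line is only illustrative of the usual format, not output from your instance.

```python
import re
import subprocess

# Illustrative header line in the usual nvidia-smi layout (not captured from a real machine).
SAMPLE_HEADER = "| NVIDIA-SMI 535.129.03   Driver Version: 535.129.03   CUDA Version: 12.2     |"


def parse_version(text: str) -> tuple[int, int]:
    """Turn '11.8' (or '11.8.0') into (11, 8) so versions compare numerically."""
    major, minor = text.strip().split(".")[:2]
    return int(major), int(minor)


def driver_max_cuda(smi_output: str) -> tuple[int, int]:
    """Extract the highest CUDA version the driver supports from nvidia-smi text."""
    match = re.search(r"CUDA Version:\s*(\d+\.\d+)", smi_output)
    if match is None:
        raise ValueError("no 'CUDA Version' field found in nvidia-smi output")
    return parse_version(match.group(1))


def driver_supports(smi_output: str, wanted: str) -> bool:
    """True if the wanted toolkit version is at or below the driver's maximum."""
    return parse_version(wanted) <= driver_max_cuda(smi_output)


def read_nvidia_smi() -> str:
    """Run nvidia-smi on the host and return its text output."""
    result = subprocess.run(["nvidia-smi"], capture_output=True, text=True, check=True)
    return result.stdout
```

With the sample header, `driver_supports(SAMPLE_HEADER, "11.8")` is true, which is the situation here: a driver that reports 12.2 can run a CUDA 11.8 environment. On a real machine you would pass `read_nvidia_smi()` instead of the sample.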
It is bad practice to change the driver or move it to an older version, and it is normally not needed. Instead, use the documented steps for virtual environments, in the flavor you need.
With the new information that this is Cloud, and that the driver is already installed and up to date, you only need to use a virtual environment (Docker, Python venv, or Anaconda). Docker is the safest, due to people installing ‘misc pip junk’ in ~/.local. And as you mentioned, the NVIDIA driver is already installed, so Docker would be easy to use and would let you pin the CUDA version.
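A minimal sketch of the Docker route, assuming the NVIDIA container runtime is set up on the host so that `--gpus all` works. The image tag is my assumption of an NVIDIA CUDA 11.8 image on Docker Hub, so check the available tags first; the function names are only for illustration.

```python
import shlex
import subprocess

# Assumption: a CUDA 11.8 base image tag published under nvidia/cuda on Docker Hub.
CUDA_IMAGE = "nvidia/cuda:11.8.0-base-ubuntu22.04"


def docker_gpu_command(image: str = CUDA_IMAGE, inner: str = "nvidia-smi") -> list[str]:
    """Build a 'docker run' command that exposes all host GPUs to the container."""
    return ["docker", "run", "--rm", "--gpus", "all", image, *shlex.split(inner)]


def run_in_container(image: str = CUDA_IMAGE, inner: str = "nvidia-smi") -> None:
    """Run the inner command in the container, raising if it fails."""
    subprocess.run(docker_gpu_command(image, inner), check=True)
```

The CUDA libraries come from the image and only the driver comes from the host. Running `nvidia-smi` inside the container should still report the host driver's maximum CUDA version, because that field describes the driver and not the toolkit in the image.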
But I think your confusion was thinking the driver version is the CUDA version, which is common, and I wish nvidia-smi made that more clear.
If you tell me which virtual env you are using (Docker, Python venv, or Anaconda), or if you have a GitHub repo, I could take a look and write up the exact instructions.