Downgrading CUDA 12.4 to 12.1

I am developing models that require CUDA 12.1, but Lambda Stack seems to pin the CUDA version to 12.4. I have no issues downgrading CUDA from 12.4 to 12.1 on an AWS EC2 instance. With Lambda Stack, however, some of the drivers seem to be pinned to a Lambda URL that chooses the versions automatically. Is there any way to use Lambda with a specific version of CUDA?

I clear all NVIDIA drivers with:

sudo apt-get --purge remove "*cublas*" "cuda*" "nsight*" 
sudo apt-get --purge remove "*nvidia*"
sudo apt-get --purge remove "libcuda*"
sudo apt-get remove --purge
sudo apt-get autoremove
sudo apt-get autoclean
sudo rm -rf /usr/local/cuda*

I purge the packages manually because I cannot uninstall CUDA with the normal installer, since sudo is password protected.
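To confirm what the purge actually left behind, something like the following could be run afterwards. It is only a sketch: `filter_gpu_packages` is a hypothetical helper name, and the name patterns simply mirror the ones used in the purge commands above.

```shell
# Keep only installed (ii) or config-remaining (rc) packages whose name
# mentions one of the patterns purged above. Helper name is hypothetical.
filter_gpu_packages() {
    awk '$1 ~ /^(ii|rc)$/ && $2 ~ /(nvidia|cuda|cublas|nsight)/ { print $1, $2, $3 }'
}

# Only query dpkg when it is available on this machine.
if command -v dpkg >/dev/null 2>&1; then
    dpkg -l | filter_gpu_packages
fi
```

An empty result would suggest the purge was complete; any remaining `550` packages would point at what the later dependency errors are tripping over.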

I try to install 12.1 with:

wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-ubuntu2204.pin
sudo mv cuda-ubuntu2204.pin /etc/apt/preferences.d/cuda-repository-pin-600
wget https://developer.download.nvidia.com/compute/cuda/12.1.1/local_installers/cuda-repo-ubuntu2204-12-1-local_12.1.1-530.30.02-1_amd64.deb
sudo dpkg -i cuda-repo-ubuntu2204-12-1-local_12.1.1-530.30.02-1_amd64.deb
sudo cp /var/cuda-repo-ubuntu2204-12-1-local/cuda-*-keyring.gpg /usr/share/keyrings/
sudo apt-get update
sudo apt-get -y install cuda

but then I get:

cuda-drivers-530 : Depends: nvidia-dkms-530 (>= 530.30.02)
                    Depends: nvidia-kernel-common-530 (>= 530.30.02) but it is not installable
                    Depends: nvidia-kernel-source-530 (>= 530.30.02) but it is not installable or
                             nvidia-kernel-open-530 (>= 530.30.02) but it is not installable
                    Depends: nvidia-utils-530 (>= 530.30.02) but it is not installable
                    Depends: xserver-xorg-video-nvidia-530 (>= 530.30.02) but it is not installable
 libnvidia-gl-550 : Depends: libnvidia-compute-550 (= 550.120-0lambda0.22.04.1) but it is not installable
 nvidia-driver-550 : Depends: libnvidia-compute-550 (= 550.120-0lambda0.22.04.1) but it is not installable
                     Depends: nvidia-compute-utils-550 (= 550.120-0lambda0.22.04.1) but it is not installable
                     Depends: libnvidia-decode-550 (= 550.120-0lambda0.22.04.1) but it is not installable
                     Depends: libnvidia-encode-550 (= 550.120-0lambda0.22.04.1) but it is not installable
                     Depends: libnvidia-fbc1-550 (= 550.120-0lambda0.22.04.1) but it is not installable
                     Recommends: libnvidia-compute-550:i386 (= 550.120-0lambda0.22.04.1) but it is not installable
                     Recommends: libnvidia-decode-550:i386 (= 550.120-0lambda0.22.04.1) but it is not installable
                     Recommends: libnvidia-encode-550:i386 (= 550.120-0lambda0.22.04.1) but it is not installable
                     Recommends: libnvidia-extra-550:i386 (= 550.120-0lambda0.22.04.1) but it is not installable
                     Recommends: libnvidia-fbc1-550:i386 (= 550.120-0lambda0.22.04.1) but it is not installable
                     Recommends: libnvidia-gl-550:i386 (= 550.120-0lambda0.22.04.1) but it is not installable

I recognize that the instance comes with driver 550.120, and that doesn’t seem to work with the CUDA 12.1 packages, so I try to install driver 530:

sudo apt install nvidia-driver-530

That works. I then try to remove 550 with sudo apt autoremove and reboot with sudo reboot now, but driver 550 is still installed. Trying to change the driver from 550 to 535 or 550 seems to just install 550.
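Since driver 550 stays installed, it may be worth checking whether it is really too old for CUDA 12.1. NVIDIA documents its drivers as backward compatible with older CUDA toolkits, so the conflict here may be a packaging one rather than a runtime one. The sketch below compares the running driver against 530.30.02, the version in the 12.1.1 installer file name above. `version_ge` is a hypothetical helper, and using that number as the threshold is an assumption.

```shell
# Return 0 when version $1 is >= version $2 (GNU sort -V does the ordering).
# Helper name is hypothetical.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Assumed threshold: the driver version in the 12.1.1 installer file name.
MIN_DRIVER="530.30.02"

if command -v nvidia-smi >/dev/null 2>&1; then
    installed="$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)"
    if version_ge "$installed" "$MIN_DRIVER"; then
        echo "driver $installed >= $MIN_DRIVER"
    else
        echo "driver $installed < $MIN_DRIVER"
    fi
fi
```

If the installed driver is already at or above that version, leaving the 550 driver in place and installing only the toolkit may be an option worth exploring.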

Alternatively, trying to install CUDA 12.1 directly without removing driver 550 gives an error saying that nvidia-dkms-530 could not be configured. Trying to configure it with sudo dpkg --configure nvidia-dkms-530 also fails. So I try to remove and rebuild the DKMS module with:

sudo dkms remove nvidia/530 --all
sudo dkms build nvidia/530
sudo dkms install nvidia/530

But that doesn’t work either.

I looked at the build log with cat /var/lib/dkms/nvidia/530.30.02/build/make.log, but it only shows:

var/lib/dkms/nvidia/530.30.02/build/nvidia/nv-pat.o] Error 1
make[3]: *** [scripts/Makefile.build:243: /var/lib/dkms/nvidia/530.30.02/build/nvidia/nv-vtophys.o] Error 1
make[3]: *** [scripts/Makefile.build:243: /var/lib/dkms/nvidia/530.30.02/build/nvidia/nv-pci.o] Error 1
make[3]: *** [scripts/Makefile.build:243: /var/lib/dkms/nvidia/530.30.02/build/nvidia/nv-usermap.o] Error 1
make[3]: *** [scripts/Makefile.build:243: /var/lib/dkms/nvidia/530.30.02/build/nvidia/nv-vm.o] Error 1
make[3]: *** [scripts/Makefile.build:243: /var/lib/dkms/nvidia/530.30.02/build/nvidia/nv-mmap.o] Error 1
make[3]: *** [scripts/Makefile.build:243: /var/lib/dkms/nvidia/530.30.02/build/nvidia/nv-p2p.o] Error 1
make[3]: *** [scripts/Makefile.build:243: /var/lib/dkms/nvidia/530.30.02/build/nvidia/nv-i2c.o] Error 1
make[3]: *** [scripts/Makefile.build:243: /var/lib/dkms/nvidia/530.30.02/build/nvidia/nv-procfs.o] Error 1
make[3]: *** [scripts/Makefile.build:243: /var/lib/dkms/nvidia/530.30.02/build/nvidia/os-interface.o] Error 1
make[3]: *** [scripts/Makefile.build:243: /var/lib/dkms/nvidia/530.30.02/build/nvidia/nv.o] Error 1
make[2]: *** [/usr/src/linux-headers-6.8.0-47-generic/Makefile:1925: /var/lib/dkms/nvidia/530.30.02/build] Error 2
make[1]: *** [Makefile:240: __sub-make] Error 2
make[1]: Leaving directory '/usr/src/linux-headers-6.8.0-47-generic'
make: *** [Makefile:82: modules] Error 2
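The lines above are only the tail of the failure cascade; the compiler message that triggered it normally appears earlier in the same log. A rough way to find it is sketched below. `first_build_error` is a hypothetical helper, and the `error:` pattern is a guess at typical gcc output. One possible cause, not confirmed by the log shown, is that the 530.30.02 module source predates the 6.8 kernel it is being built against.

```shell
# Print the first line that looks like a compiler error, with three lines
# of context on either side. Helper name and pattern are assumptions.
first_build_error() {
    grep -n -m 1 -B 3 -A 3 -E 'error:' "$1"
}

# Path taken from the cat command above.
LOG=/var/lib/dkms/nvidia/530.30.02/build/make.log
if [ -r "$LOG" ]; then
    first_build_error "$LOG"
fi
```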

Lambda usually recommends that customers use conda virtual environments to run CUDA versions other than the preinstalled one.

See also: Cuda | Anaconda.org
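A minimal sketch of the conda route, assuming conda is already installed and that the `cuda-toolkit` package from the `nvidia/label/cuda-12.1.1` channel label is the one wanted. The environment name `cuda121` is arbitrary. The script only prints the commands unless `DRY_RUN=0` is set, so it can be reviewed before anything is downloaded.

```shell
ENV_NAME="cuda121"                      # hypothetical environment name
CUDA_LABEL="nvidia/label/cuda-12.1.1"   # assumed channel label for 12.1.1
DRY_RUN="${DRY_RUN:-1}"                 # 1 = print commands, 0 = run them

# Echo the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Create an environment holding the CUDA 12.1 toolkit, independent of the
# system-wide 12.4 install and the 550 driver.
create_cuda_env() {
    run conda create -y -n "$ENV_NAME" -c "$CUDA_LABEL" cuda-toolkit
}

# Check which nvcc the environment provides.
check_cuda_env() {
    run conda run -n "$ENV_NAME" nvcc --version
}

create_cuda_env
check_cuda_env
```

This leaves the system driver and Lambda Stack packages untouched, which avoids the apt dependency conflicts described in the question.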