/usr/bin/nvidia-container-runtime missing (NVIDIA Device Plugin)

Hello, I’m following the instructions here to install the NVIDIA Container Toolkit:

https://docs.nvidia.com/datacenter/cloud-native/kubernetes/install-k8s.html#install-nvidia-container-toolkit-nvidia-docker2

In particular, I’m trying to use containerd, which requires the following changes to /etc/containerd/config.toml:

--- config.toml 2020-12-17 19:13:03.242630735 +0000
+++ /etc/containerd/config.toml 2020-12-17 19:27:02.019027793 +0000
@@ -70,7 +70,7 @@
   ignore_image_defined_volumes = false
   [plugins."io.containerd.grpc.v1.cri".containerd]
      snapshotter = "overlayfs"
-      default_runtime_name = "runc"
+      default_runtime_name = "nvidia"
      no_pivot = false
      disable_snapshot_annotations = true
      discard_unpacked_layers = false
@@ -94,6 +94,15 @@
         privileged_without_host_devices = false
         base_runtime_spec = ""
         [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
+            SystemdCgroup = true
+       [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.nvidia]
+          privileged_without_host_devices = false
+          runtime_engine = ""
+          runtime_root = ""
+          runtime_type = "io.containerd.runc.v1"
+          [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.nvidia.options]
+            BinaryName = "/usr/bin/nvidia-container-runtime"
+            SystemdCgroup = true
   [plugins."io.containerd.grpc.v1.cri".cni]
      bin_dir = "/opt/cni/bin"
      conf_dir = "/etc/cni/net.d"

Note that the binary name they give is /usr/bin/nvidia-container-runtime, and as far as I can tell that file is not shipped in nvidia-container-toolkit_1.7.0+dfsg-0lambda0.20.04.1_amd64.deb. Here is the package listing:

drwxr-xr-x root/root         0 2022-01-25 23:50 ./
drwxr-xr-x root/root         0 2022-01-25 23:50 ./etc/
drwxr-xr-x root/root         0 2022-01-25 23:50 ./etc/nvidia-container-runtime/
-rw-r--r-- root/root       542 2022-01-25 23:50 ./etc/nvidia-container-runtime/config.toml
drwxr-xr-x root/root         0 2022-01-25 23:50 ./usr/
drwxr-xr-x root/root         0 2022-01-25 23:50 ./usr/bin/
-rwxr-xr-x root/root   2335480 2022-01-25 23:50 ./usr/bin/nvidia-container-toolkit
drwxr-xr-x root/root         0 2022-01-25 23:50 ./usr/share/
drwxr-xr-x root/root         0 2022-01-25 23:50 ./usr/share/doc/
drwxr-xr-x root/root         0 2022-01-25 23:50 ./usr/share/doc/nvidia-container-toolkit/
-rw-r--r-- root/root      1064 2022-01-25 23:50 ./usr/share/doc/nvidia-container-toolkit/changelog.Debian.gz
-rw-r--r-- root/root      2094 2022-01-25 23:49 ./usr/share/doc/nvidia-container-toolkit/copyright
drwxr-xr-x root/root         0 2022-01-25 23:50 ./usr/share/lintian/
drwxr-xr-x root/root         0 2022-01-25 23:50 ./usr/share/lintian/overrides/
-rw-r--r-- root/root       274 2020-04-03 02:37 ./usr/share/lintian/overrides/nvidia-container-toolkit
lrwxrwxrwx root/root         0 2022-01-25 23:50 ./usr/bin/nvidia-container-runtime-hook -> nvidia-container-toolkit

What’s the equivalent binary?
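One way to double-check a listing like the one above is to scan it for the exact path instead of reading it by eye. The sketch below is only an illustration: `deb_listing_paths` and `provides` are hypothetical helper names, and it assumes the text has the `dpkg -c` column layout shown above (path in the sixth column, symlinks written as `link -> target`).

```python
from __future__ import annotations


def deb_listing_paths(listing: str) -> set[str]:
    """Return the paths named in `dpkg -c`-style output (assumed layout)."""
    paths = set()
    for line in listing.splitlines():
        # Columns: mode, owner/group, size, date, time, path (rest of line).
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # skip blank or malformed lines
        # For symlinks keep only the link name, not the target.
        path = fields[5].split(" -> ")[0].strip()
        if path.startswith("./"):
            path = path[1:]  # "./usr/bin/x" -> "/usr/bin/x"
        paths.add(path)
    return paths


def provides(listing: str, wanted: str) -> bool:
    """True if the listing contains exactly the path `wanted`."""
    return wanted in deb_listing_paths(listing)


# Two lines copied from the listing in the post above.
SAMPLE = """\
-rwxr-xr-x root/root   2335480 2022-01-25 23:50 ./usr/bin/nvidia-container-toolkit
lrwxrwxrwx root/root         0 2022-01-25 23:50 ./usr/bin/nvidia-container-runtime-hook -> nvidia-container-toolkit
"""

print(provides(SAMPLE, "/usr/bin/nvidia-container-runtime"))       # False
print(provides(SAMPLE, "/usr/bin/nvidia-container-runtime-hook"))  # True
```

On a machine where the packages are already installed, `dpkg -S /usr/bin/nvidia-container-runtime` answers the same question directly by reporting which installed package, if any, owns that file.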

FYI, this is also an issue if you use Docker. The instructions state that you must configure:

{
   "default-runtime": "nvidia",
   "runtimes": {
      "nvidia": {
            "path": "/usr/bin/nvidia-container-runtime",
            "runtimeArgs": []
      }
   }
}
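A small check along these lines can catch the problem before Docker is restarted. This is a rough sketch, not anything from the NVIDIA docs: `missing_runtime_binaries` is a hypothetical helper, and the `exists` parameter is only there so the check can be tried without touching the real filesystem. On a real host you would feed it the contents of the daemon config, usually /etc/docker/daemon.json.

```python
import json
import os


def missing_runtime_binaries(daemon_json, exists=os.path.exists):
    """Map runtime name -> configured path for each runtime whose binary is absent."""
    config = json.loads(daemon_json)
    missing = {}
    for name, runtime in config.get("runtimes", {}).items():
        path = runtime.get("path", "")
        if not exists(path):
            missing[name] = path
    return missing


# The configuration quoted in the post above.
CONFIG = """
{
   "default-runtime": "nvidia",
   "runtimes": {
      "nvidia": {
            "path": "/usr/bin/nvidia-container-runtime",
            "runtimeArgs": []
      }
   }
}
"""

if __name__ == "__main__":
    # Reports every configured runtime whose binary is not on this machine.
    for name, path in missing_runtime_binaries(CONFIG).items():
        print(f"runtime {name!r}: {path} not found")
```

With only the package from the first post installed, this would be expected to flag the "nvidia" runtime, since that package ships /usr/bin/nvidia-container-toolkit but not /usr/bin/nvidia-container-runtime.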

If you just want to use Docker and do not need the old nvidia-docker2, you can follow these steps from the tutorial:

sudo apt-get install -y docker.io nvidia-container-toolkit
sudo systemctl daemon-reload
sudo systemctl restart docker
sudo docker run --gpus all -it --rm nvcr.io/nvidia/pytorch:21.06-py3 nvidia-smi

I will see if I can get nvidia-docker2 working. I was able to install it and end up with the file /usr/bin/nvidia-container-runtime in place while Docker still worked as in the tutorial. However, I only did that tonight, and I still have to test whether nvidia-docker2 itself is functional.

Mark