TensorFlow in virtual environment on GPU Workstation

We are having trouble getting TensorFlow to work in a virtual environment on a Lambda Labs GPU Workstation running Ubuntu 20.04.

We test with the following minimal example, tftest.py:

from tensorflow.python.client import device_lib
print([x.name for x in device_lib.list_local_devices()])

Running it outside a virtual environment, we get GPU:0 and GPU:1:

asj@zlambda:~$ python3 tftest.py
... lots of tensorflow output ...
['/device:CPU:0', '/device:XLA_CPU:0', '/device:XLA_GPU:0', '/device:XLA_GPU:1', '/device:GPU:0', '/device:GPU:1']

However, if we run it in a virtual environment (created as described in the official guide "Lambda Stack: an AI software stack that's always up-to-date", section "Using Lambda Stack with Python virtual environments", i.e. pip install tensorflow-gpu), we instead get the following output:

(venv) asj@zlambda:~$ python3 tftest.py
... lots of tensorflow output ...
2020-11-30 15:11:44.852707: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcudart.so.10.1'; dlerror: libcudart.so.10.1: cannot open shared object file: No such file or directory
... lots of tensorflow output ...
['/device:CPU:0', '/device:XLA_CPU:0', '/device:XLA_GPU:0', '/device:XLA_GPU:1']
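The dlerror above means the dynamic linker cannot find the CUDA runtime library that the pip tensorflow-gpu wheel was built against. As a quick stdlib-only check (the library name is copied from the error message; the helper name is our own), something like this confirms whether the library is visible from inside the venv:

```python
import ctypes

def can_load(libname):
    """Return True if the dynamic linker can open the given shared library."""
    try:
        ctypes.CDLL(libname)
        return True
    except OSError:
        return False

# Inside the venv this stays False until the CUDA 10.1 runtime is installed.
print(can_load("libcudart.so.10.1"))
```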

What should we do? How can we create a virtual environment with tensorflow-gpu on the Lambda machine when Lambda Stack is installed?

We are aware that we can use --system-site-packages, but we would also like to build Docker containers that can use the GPU. While we have that working with PyTorch, it doesn't work with TensorFlow. Since we can't get TensorFlow to work in a virtual environment either, we assume the two problems might be related.
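For completeness, the --system-site-packages workaround we refer to looks like this (the venv path is just an example); the venv then inherits the system-wide Lambda Stack TensorFlow together with its CUDA libraries, instead of installing a separate tensorflow-gpu wheel:

```shell
# Create a venv that can see the system site-packages from Lambda Stack.
python3 -m venv --system-site-packages ~/venv
source ~/venv/bin/activate
# The system TensorFlow should now be importable inside the venv.
python3 -c "import tensorflow; print(tensorflow.__version__)"
```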

Hey Andreas,

Assuming you’re on Ubuntu 20.04, run the following:

sudo apt -y install libcudart10.1