We are having trouble getting TensorFlow to work in a virtual environment on the Lambda Labs GPU Workstation with Ubuntu 20.04.
We test with the following minimal example:
from tensorflow.python.client import device_lib
print([x.name for x in device_lib.list_local_devices()])
Running it directly, we get GPU:0 and GPU:1 as expected:
asj@zlambda:~$ python3 tftest.py
... lots of tensorflow output ...
['/device:CPU:0', '/device:XLA_CPU:0', '/device:XLA_GPU:0', '/device:XLA_GPU:1', '/device:GPU:0', '/device:GPU:1']
However, if we run it in a virtual environment (set up as described on the official page "Lambda Stack: an AI software stack that's always up-to-date", section "Using Lambda Stack with python virtual environments", followed by pip install tensorflow-gpu), we instead get the following output:
(venv) asj@zlambda:~$ python3 tftest.py
... lots of tensorflow output ...
2020-11-30 15:11:44.852707: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcudart.so.10.1'; dlerror: libcudart.so.10.1: cannot open shared object file: No such file or directory
... lots of tensorflow output ...
['/device:CPU:0', '/device:XLA_CPU:0', '/device:XLA_GPU:0', '/device:XLA_GPU:1']
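One way we tried to narrow this down (a diagnostic sketch of our own, not part of the original test script; the helper name can_load is ours) is to check from inside the venv whether the dynamic loader can open the CUDA runtime library that TensorFlow complains about:

```python
import ctypes

def can_load(libname: str) -> bool:
    """Return True if the dynamic loader can open the shared library."""
    try:
        ctypes.CDLL(libname)
        return True
    except OSError:
        return False

# The library named in the dlerror message above:
print("libcudart.so.10.1 loadable:", can_load("libcudart.so.10.1"))
```

If this prints False inside the venv but the system TensorFlow works, the pip-installed tensorflow-gpu is presumably looking for a CUDA 10.1 runtime that Lambda Stack does not ship under that name.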
What should we do? How can we make a virtual environment with tensorflow-gpu on the Lambda machine when Lambda Stack is installed?
We are aware that we can use --system-site-packages, but we would also like to be able to build Docker containers that can use the GPU. We have that working with PyTorch, but not with TensorFlow, and since we can't get TensorFlow working in a virtual environment either, we assume the two problems might be related.
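For completeness, the --system-site-packages workaround we mention looks like this (a sketch; the venv path ~/tf-venv is just an example name we chose):

```shell
# Recreate the venv so it inherits Lambda Stack's system-wide packages
# (the directory ~/tf-venv is an example, not a required path)
python3 -m venv --system-site-packages ~/tf-venv
. ~/tf-venv/bin/activate
# Note: we do NOT `pip install tensorflow-gpu` in this venv; doing so
# shadows the Lambda Stack TensorFlow build, which is what triggers
# the missing-libcudart error shown above.
```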