Lambda Stack - CUDNN 8 Upgrade Query

Hi,

I use a Tensorbook and need TensorFlow's GPU support for CUDA 11. Although the latest Lambda Stack upgrade moved me from CUDA 10.2 to 11.1, the cuDNN version is still 7.6.

Is there a timeline for when Lambda Stack will upgrade from cuDNN 7.6 to cuDNN 8.x?

Alternatively, is there a recommended way to upgrade it manually without breaking Lambda Stack or causing compatibility issues with future Lambda Stack upgrades?

Many thanks in advance!

Regards.

Lambda Stack with cuDNN 8 is coming shortly.

In the meantime, you should be able to link against cuDNN 8 by adding the directory that contains the libcudnn.so.8 files to your LD_LIBRARY_PATH.
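
To find which directory to add, something like the following might help. It is only a sketch: the helper name dirs_with_library and the two extra candidate directories are assumptions for illustration, not paths that Lambda Stack is known to use.

```python
import os
from pathlib import Path


def dirs_with_library(search_dirs, pattern="libcudnn.so.8*"):
    """Return the directories in search_dirs that contain a file matching pattern."""
    found = []
    for d in search_dirs:
        p = Path(d)
        if p.is_dir() and any(p.glob(pattern)):
            found.append(str(p))
    return found


def current_ld_dirs():
    """Split LD_LIBRARY_PATH into its non-empty entries."""
    return [d for d in os.environ.get("LD_LIBRARY_PATH", "").split(":") if d]


if __name__ == "__main__":
    # Assumed candidate locations; replace with wherever cuDNN 8 was unpacked.
    candidates = current_ld_dirs() + ["/usr/local/cuda/lib64", "/usr/lib/x86_64-linux-gnu"]
    for directory in dirs_with_library(candidates):
        print(directory)
```

Any directory it prints can then be appended to LD_LIBRARY_PATH in the shell before starting Python, since the dynamic loader reads that variable when the process starts.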

Many thanks @sabalaba for your kind response!

I downloaded cuDNN 8 for Ubuntu 20.04 and added its directory to LD_LIBRARY_PATH as you suggested.

However, while checking whether TensorFlow could see the GPU, I received this error: "Could not load dynamic library 'libcusolver.so.10'; dlerror: libcusolver.so.10: cannot open shared object file: No such file or directory".

This might be resolved by completely removing all installed CUDA files and doing a fresh install of CUDA, but I fear this would leave the existing Lambda Stack in an inconsistent state.

Note: I also tried TensorFlow 2.4.0-rc1, whose pip packages are now built with CUDA 11 and cuDNN 8.0.2.

Would you kindly be able to suggest a solution here?
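
For reference, a quick way to see which of these shared libraries the dynamic loader can actually find is to try opening them with ctypes. This is a minimal diagnostic sketch; the function names are made up for illustration, and the list of library names is an assumption based on the error above (libcusolver.so.11 is included only as a guess at what a CUDA 11.1 install might provide instead).

```python
import ctypes


def can_load(library_name):
    """Return True if the dynamic loader can open library_name, else False."""
    try:
        ctypes.CDLL(library_name)
        return True
    except OSError:
        return False


def report(library_names):
    """Map each library name to whether it could be loaded."""
    return {name: can_load(name) for name in library_names}


if __name__ == "__main__":
    # Assumed names of interest; adjust to match the error message you see.
    names = ["libcudnn.so.8", "libcusolver.so.10", "libcusolver.so.11"]
    for name, ok in report(names).items():
        print(f"{name}: {'found' if ok else 'NOT found'}")
```

If libcusolver.so.10 is reported missing while another version is found, that would suggest a mismatch between the CUDA version TensorFlow was built against and the one installed, rather than a cuDNN problem.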

Hello to all,

Just to add more details, @sabalaba (I ran into the same problem): there does seem to be a problem with the current Lambda Stack version. On a fresh install, any PyTorch call that uses cuDNN throws an error:
"Could not load library libcudnn_cnn_train.so.8. Error: libcudnn_ops_train.so.8: cannot open shared object file: No such file or directory
Please make sure libcudnn_cnn_train.so.8 is in your library path!"

The only way to get things working is to set LD_LIBRARY_PATH accordingly:
export LD_LIBRARY_PATH=/usr/lib/python3/dist-packages/torch/lib/

But that does not seem to be the Lambda Stack way.

I understand that cuDNN 8 is not yet supported, but in that case why does a fresh install try to use cuDNN 8 by default?

Is there a reason that you're using a pip-installed version of PyTorch instead of the default Lambda Stack PyTorch?

The default Lambda Stack PyTorch has cuDNN built in and doesn't throw that error.

>>> import torch
>>> torch.__path__
['/usr/lib/python3/dist-packages/torch']
>>> torch.__version__
'1.6.0'

Hello,
No, I'm not using a pip-installed version of PyTorch. Just Ubuntu 20.04 and a fresh install of Lambda Stack (nothing else added apart from JupyterHub). In that configuration I must export the torch location in LD_LIBRARY_PATH. Otherwise I get the reported error about libcudnn_cnn_train.so.8 not being found :-(.
PyTorch itself works fine; only the cuDNN-related part fails. The reported torch version, however, is 1.7.0.
>>> import torch
>>> torch.__path__
['/usr/lib/python3/dist-packages/torch']
>>> torch.__version__
'1.7.0'

Good information thanks for sharing

I solved this problem by running this in Python:

>>> import torch
>>> torch.__path__
['/some/path/to/torch']

Then in a terminal:

export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/some/path/to/torch
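
The two steps can also be combined into one small script that prints the export line. This is a rough sketch, not an official Lambda Stack procedure: the function name is hypothetical, and it appends the lib subdirectory of the package, following the earlier post in this thread that pointed LD_LIBRARY_PATH at torch/lib/.

```python
import importlib.util
import os


def extended_ld_library_path(package_dir, current=""):
    """Append <package_dir>/lib to a colon-separated path list, skipping duplicates."""
    lib_dir = package_dir.rstrip("/") + "/lib"
    parts = [p for p in current.split(":") if p]
    if lib_dir not in parts:
        parts.append(lib_dir)
    return ":".join(parts)


if __name__ == "__main__":
    # find_spec locates the package without importing it.
    spec = importlib.util.find_spec("torch")
    if spec is None or not spec.submodule_search_locations:
        print("torch does not appear to be installed")
    else:
        package_dir = list(spec.submodule_search_locations)[0]
        value = extended_ld_library_path(package_dir, os.environ.get("LD_LIBRARY_PATH", ""))
        print(f"export LD_LIBRARY_PATH={value}")
```

The printed line has to be run in the shell (or added to a shell profile) before Python starts; changing the variable from inside an already running Python process generally does not affect how the loader resolves libraries.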

Any updates on when it will be supported by default?