I have been using Keras to train some models for the past few months, and everything has been running smoothly. A few days ago, we updated the workstation, including a TensorFlow update that took us from 2.2 to 2.3. Ever since then, I can no longer train my models. Execution reaches the line where I add a Conv1D layer to my Sequential model and then fails with the following error message:
AttributeError: module 'tensorflow.python.framework.ops' has no attribute '_TensorLike'
I have tried rebooting and changing the Keras import statements from keras.models and keras.layers to tensorflow.keras.models and tensorflow.keras.layers, but this just gives me a different error:
TypeError: Parameter to MergeFrom() must be instance of same class: expected tensorflow.TensorShapeProto got tensorflow.TensorShapeProto.
I'm using a Lambda Quad AI workstation running Ubuntu 20.04 with the Lambda software stack. I run everything remotely from a Jupyter notebook inside a virtual environment. Any help understanding what caused Keras and TensorFlow to break after the update is greatly appreciated.
Same issue, but on a TensorBook (with Ubuntu 20.04). A Jupyter notebook using TF 2.2 was working as expected. The same notebook using TF 2.3 after the Lambda Stack update gives the same error message as above:
TypeError: Parameter to MergeFrom() must be instance of same class: expected tensorflow.TensorShapeProto got tensorflow.TensorShapeProto.
As @ebgoldst suggested, installing TF 2.2 is the fix here.
Since the TF 2.2 binary distributed by pip is linked against libcudart10.1, you may want to install that as well:
pip3 install tensorflow==2.2.0 && sudo apt -y install libcudart10.1
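Because the notebook runs inside a virtual environment, it may be worth confirming which package versions that environment actually resolves after the downgrade, since a standalone keras package that does not match the installed TensorFlow can be one source of this kind of error. Below is a minimal sketch, assuming Python 3.8+ (the Ubuntu 20.04 default). The function name installed_versions and the list of package names are illustrative choices, not something from the Lambda Stack.

```python
from importlib import metadata


def installed_versions(packages=("tensorflow", "tensorflow-gpu", "keras")):
    """Return {package: version string or None} for the current environment."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is not installed in this interpreter's environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

Run it from a notebook cell rather than a separate terminal, so it reports on the same interpreter the Jupyter kernel is using.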
Thanks @jeremy, downgrading fixed the TypeError. However, the TF model still fails to run. The Jupyter notebook error is:
UnknownError: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
The terminal message is:
tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
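One thing commonly suggested for CUDNN_STATUS_INTERNAL_ERROR is to let TensorFlow allocate GPU memory on demand instead of reserving it all up front. Whether that is the cause here is an assumption, and a CUDA/cuDNN version mismatch is another possibility. A minimal sketch using the TF 2.x tf.config API follows; the helper name enable_gpu_memory_growth is hypothetical, and it has to run before any model or tensor is created.

```python
def enable_gpu_memory_growth():
    """Ask TensorFlow to grow GPU memory on demand.

    Returns the names of the GPUs that were configured, or None if
    TensorFlow cannot be imported in this environment.
    """
    try:
        import tensorflow as tf
    except ImportError:
        return None

    configured = []
    for gpu in tf.config.list_physical_devices("GPU"):
        try:
            tf.config.experimental.set_memory_growth(gpu, True)
            configured.append(gpu.name)
        except RuntimeError as err:
            # Raised when the GPU has already been initialized; restart the
            # kernel and call this before any other TensorFlow code.
            print(f"Could not set memory growth for {gpu.name}: {err}")
    return configured


if __name__ == "__main__":
    print(enable_gpu_memory_growth())
```

Calling it in the first notebook cell, right after restarting the kernel, is the usual way to try it. If the cuDNN error persists, the installed CUDA and cuDNN libraries are the next thing to check.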
We had a similar problem after we updated our Lambda workstation. TensorFlow was updated to version 2.5 and we could no longer run any TensorFlow models. The command