TensorFlow no longer training with GPU

I have a Tensorbook that I purchased about six months ago. Ubuntu encountered an error and did a partial update. Something was removed during this update; unfortunately I didn’t catch what, and now TensorFlow is no longer training on the GPU.

I tried updating Lambda Stack but get this message on the command line:

nvidia-smi still works, but GPU utilization appears to stay at 0% during training. I would appreciate any help or suggestions!

Oh, and I do have the GPU version of TensorFlow installed.

So the good news is that it is showing it found the GPU (GPU:0).
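
If you want to confirm this from Python yourself, here is a minimal sketch. It assumes TensorFlow 2.1 or later (where tf.config.list_physical_devices exists); the helper name visible_gpus is just for illustration.

```python
def visible_gpus():
    """Return the names of GPUs TensorFlow can see, or None if TensorFlow cannot be imported."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    # Assumes TensorFlow 2.1+; older releases do not have this call.
    return [device.name for device in tf.config.list_physical_devices("GPU")]


gpus = visible_gpus()
if gpus is None:
    print("TensorFlow is not installed in this environment")
elif gpus:
    print("TensorFlow sees:", gpus)
else:
    print("TensorFlow is installed but sees no GPU")
```

An empty list here would point to a driver or CUDA library problem rather than to your training code.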
The other message after the upgrade:
‘The following packages were automatically installed and are no longer required:’

This just means you are free to remove those packages; apt has left them in place for now.
A previously installed package depended on them, but after the upgrade they are no longer needed.

To clear this you can do the following:
sudo apt autoremove

nvidia-smi is unlikely to show much usage for a short piece of code. If you run something a little longer, you should see the utilization.
Here are other ways to view utilization:

  • nvidia-smi --query-gpu=index,gpu_bus_id,utilization.gpu,utilization.memory,power.draw --format=csv -l
  • nvidia-smi dmon
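
If you would rather log these numbers from a script while training runs, the CSV output of the first command can be parsed with the standard library. This is only a sketch: the function names and the reduced field list are my own choices, and the sample text in the usage note is illustrative rather than output from your machine.

```python
import csv
import io
import subprocess

# A subset of the fields used in the command above.
QUERY_FIELDS = ["index", "utilization.gpu", "utilization.memory", "power.draw"]


def parse_gpu_csv(text):
    """Parse `nvidia-smi --query-gpu=... --format=csv` output into a list of dicts."""
    reader = csv.reader(io.StringIO(text), skipinitialspace=True)
    rows = [row for row in reader if row]
    if not rows:
        return []
    header, data = rows[0], rows[1:]
    return [dict(zip(header, (cell.strip() for cell in row))) for row in data]


def query_gpus():
    """Run nvidia-smi once and return the parsed rows (requires a working NVIDIA driver)."""
    cmd = ["nvidia-smi", "--query-gpu=" + ",".join(QUERY_FIELDS), "--format=csv"]
    out = subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
    return parse_gpu_csv(out)
```

Calling query_gpus() in a loop during training gives one dict per GPU. On illustrative text such as "index, utilization.gpu [%]" followed by "0, 37 %", parse_gpu_csv returns a single dict keyed by those header names.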

Mark

The system started by upgrading your NVIDIA device drivers. Most likely you successfully upgraded the NVIDIA drivers, and these packages were all built against the previous 470 version.

You can check the current driver version by running nvidia-smi in the shell.
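
The same check can be scripted. A rough sketch, assuming nvidia-smi is on the PATH; the function names are hypothetical and the version strings in the comments are only examples:

```python
import shutil
import subprocess


def driver_major(version):
    """Return the major branch of a driver version string, e.g. '470.57.02' -> 470."""
    return int(version.strip().split(".")[0])


def installed_driver_version():
    """Ask nvidia-smi for the driver version; return None if it is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return None
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return None
    lines = result.stdout.strip().splitlines()
    # With several GPUs there is one line per GPU; they share one driver.
    return lines[0].strip() if lines else None
```

If driver_major(installed_driver_version()) is no longer 470, the driver upgrade went through and the leftover packages belong to the old branch.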

  1. After that, it is highly recommended that you autoremove all the older installed files, as suggested.
  2. Then run the update command again. When you get the partial upgrade option, go ahead and accept it. It will take quite a while to download several gigabytes and install the new stack. Then reboot.
  3. I’d run the update again, incrementally, until no further updates are needed.
  4. On the command line, run sudo apt update to update, followed by sudo apt -f upgrade.
  5. Clean up: run sudo apt autoremove followed by sudo apt autoclean.
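
For reference, the command-line part of steps 4 and 5 could be wrapped in a small script like the one below. It is a sketch only: run_maintenance and its dry_run flag are my own naming, and by default it just prints the commands so nothing is changed until you opt in.

```python
import subprocess

# Commands taken from steps 4 and 5 above, in order.
MAINTENANCE_COMMANDS = [
    ["sudo", "apt", "update"],
    ["sudo", "apt", "-f", "upgrade"],
    ["sudo", "apt", "autoremove"],
    ["sudo", "apt", "autoclean"],
]


def run_maintenance(dry_run=True):
    """Print each command; execute it only when dry_run is False.

    Returns the list of command lines that were printed.
    """
    shown = []
    for cmd in MAINTENANCE_COMMANDS:
        line = " ".join(cmd)
        print(line)
        shown.append(line)
        if not dry_run:
            # check=True stops at the first command that fails.
            subprocess.run(cmd, check=True)
    return shown
```

Call run_maintenance() to review the sequence, and run_maintenance(dry_run=False) only once you are happy with it. apt will still prompt for confirmation where it normally would.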

Good luck!