Best practices for other package management without breaking my lambda stack?

jcc · August 2, 2023, 12:30pm

First of all, thank you so, so much for making the lambda-stack installer available.

I installed it a while ago on a fresh ubuntu 22.04 using your instructions:

wget -nv -O- https://lambdalabs.com/install-lambda-stack.sh | sh -
sudo reboot

It worked beautifully and TensorFlow could talk to my GPU without any issues. Then I installed some additional libraries via pip: tensorflow_addons and tensorflow_probability. This was fine; pip freeze | grep tensor showed:

tensorboard-plugin-profile==2.11.1
tensorflow-addons==0.19.0
tensorflow-estimator==2.11.0
tensorflow-gpu==2.11.0
tensorflow-probability==0.19.0

However, when I installed tensorflow_graphics, it pulled in the pip version of tensorflow (2.13.0) alongside the existing tensorflow-gpu 2.11.0, and now TensorFlow cannot see my GPU.

pip freeze | grep tensor

tensorboard==2.13.0
tensorboard-data-server==0.7.1
tensorboard-plugin-profile==2.11.1
tensorflow==2.13.0
tensorflow-addons==0.19.0
tensorflow-estimator==2.13.0
tensorflow-gpu==2.11.0
tensorflow-graphics==1.0.0
tensorflow-io-gcs-filesystem==0.32.0
tensorflow-probability==0.19.0

For now I guess I could just reinstall Lambda Stack with the same instructions, even though my computer is no longer a fresh Ubuntu install.

My question is, what is the recommended way to install pip packages so that they don’t touch any software from the lambda stack?

Thank you!

cody_b · August 5, 2023, 4:23pm

The recommended way is to use a Python virtual environment (venv) or a conda virtual environment.

Hope this helps!

markd · August 8, 2023, 9:53pm

Yes, it is best practice not to use pip directly in your main account, but to use an isolated environment (as Cody mentioned).
pip -v list | egrep -v "/usr/lib/python3/dist-packages"
* This will show all packages in your current environment that did not come from Lambda.
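The same check can be sketched in Python with only the standard library. This is an illustration, not part of Lambda Stack: the function names are made up for this example, and it assumes the Lambda packages live in /usr/lib/python3/dist-packages, as in the filter above.

```python
from importlib import metadata
from pathlib import Path

# Assumption: Lambda Stack's apt-installed Python packages live here,
# matching the path used in the egrep filter above.
SYSTEM_DIST_PACKAGES = Path("/usr/lib/python3/dist-packages")


def is_outside(location, system_dir=SYSTEM_DIST_PACKAGES):
    """Return True if `location` is neither `system_dir` nor inside it."""
    location = Path(location)
    system_dir = Path(system_dir)
    return location != system_dir and system_dir not in location.parents


def non_system_packages(system_dir=SYSTEM_DIST_PACKAGES):
    """List (name, version, location) for packages installed elsewhere."""
    found = []
    for dist in metadata.distributions():
        # locate_file("") gives the directory that holds the package.
        location = Path(str(dist.locate_file("")))
        if is_outside(location, system_dir):
            name = dist.metadata.get("Name") or "unknown"
            found.append((name, str(dist.version), str(location)))
    return sorted(found, key=lambda item: item[0].lower())


if __name__ == "__main__":
    for name, version, location in non_system_packages():
        print(f"{name}=={version}  ({location})")
```

Anything listed under ~/.local or /usr/local is a candidate for shadowing the Lambda Stack copy of the same package.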

It is best to use:

Docker
Python venv
Anaconda/Miniconda

This will allow you to have a separate environment for each project, so it does not conflict with the others.

Docker - images are self-contained, and a container resets to the image defaults each time you relaunch it.
NVIDIA NGC Tutorial: Run a PyTorch Docker Container using nvidia-container-toolkit on Ubuntu
Python venv - lets you work with or without the system-installed packages
- This will set up an environment that can see the system packages, so you just add pip packages and check whether they are compatible. Of course you may want to make the environment names much shorter, but for clarity I made them longer (environments created this way are still affected by pip installs in ~/.local or /usr/local):
  $ python -m venv --system-site-packages myenv-with-site-packages
  $ source ./myenv-with-site-packages/bin/activate
- This will set up an independent environment, needed at times when libraries conflict:
  $ python -m venv myenv-independent
  $ source ./myenv-independent/bin/activate
Anaconda/Miniconda - always replaces the system-installed packages, except for software in /usr/local or ~/.local
- To set up conda you need to download and run their install script. Then, for example, with CUDA 11.8:
  $ conda create --name torch_gpu pytorch torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia
  $ conda activate torch_gpu
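For the Python venv option, the two kinds of environment can also be created from Python with the standard-library venv module. This is a rough sketch only: create_env and sees_system_packages are names invented for this example, and the pyvenv.cfg check is just one way to confirm which kind of environment you ended up with.

```python
import tempfile
import venv
from pathlib import Path


def create_env(env_dir, use_system_packages, with_pip=True):
    """Create a venv at env_dir and return the path to its pyvenv.cfg."""
    builder = venv.EnvBuilder(
        system_site_packages=use_system_packages,  # like --system-site-packages
        with_pip=with_pip,  # set False to skip bootstrapping pip
    )
    builder.create(env_dir)
    return Path(env_dir) / "pyvenv.cfg"


def sees_system_packages(cfg_path):
    """Read pyvenv.cfg and report whether system packages are visible."""
    for line in Path(cfg_path).read_text().splitlines():
        key, _, value = line.partition("=")
        if key.strip() == "include-system-site-packages":
            return value.strip().lower() == "true"
    return False


if __name__ == "__main__":
    # Throwaway directory so the sketch leaves nothing behind.
    with tempfile.TemporaryDirectory() as tmp:
        shared = create_env(Path(tmp) / "with-site", True, with_pip=False)
        alone = create_env(Path(tmp) / "independent", False, with_pip=False)
        print("with-site sees system packages:", sees_system_packages(shared))
        print("independent sees system packages:", sees_system_packages(alone))
```

With use_system_packages=True the environment can import the Lambda Stack TensorFlow or PyTorch, and anything you pip install inside it stays in the environment directory.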

PyTorch has a nice matrix showing how to install (limited versions, but helpful):
Start Locally | PyTorch
pip is limited in what it can install, and at times you need to change LD_LIBRARY_PATH for pip packages (in conda or in a Python venv). Always make sure your ~/.local and /usr/local do not contain conflicting packages.

Also check which python you are using. Example:
$ which python
/usr/bin/python
Versus
$ which python
/home/username/miniconda3/bin/python
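The same question can be asked from inside the interpreter, which helps when a script or notebook is picking up a different Python than your shell. A minimal sketch: interpreter_info is a name made up for this example, and the conda-meta directory check is a common heuristic for conda environments rather than an official API.

```python
import os
import sys


def interpreter_info():
    """Describe which Python is running and what kind of environment it is in."""
    return {
        "executable": sys.executable,
        "prefix": sys.prefix,
        "base_prefix": sys.base_prefix,
        # In a venv, sys.prefix points at the venv and differs from base_prefix.
        "in_venv": sys.prefix != sys.base_prefix,
        # Heuristic: conda installs and environments contain a conda-meta directory.
        "in_conda": os.path.isdir(os.path.join(sys.prefix, "conda-meta")),
    }


if __name__ == "__main__":
    for key, value in interpreter_info().items():
        print(f"{key}: {value}")
```

If both in_venv and in_conda are False, you are most likely on the system Python, and pip installs there can collide with Lambda Stack.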

I have additional examples, though they can break between versions and as packages change.
https://github.com/markwdalton/lambdalabs/tree/main/documentation/software

Another useful tip: to find alternate versions without any work, use "?" in place of the version, and pip will show you the valid versions that are available.
$ pip install tensorflow-gpu==?
ERROR: Could not find a version that satisfies the requirement tensorflow-gpu==? (from versions: 2.8.0rc0, 2.8.0rc1, 2.8.0, 2.8.1, 2.8.2, 2.8.3, 2.8.4, 2.9.0rc0, 2.9.0rc1, 2.9.0rc2, 2.9.0, 2.9.1, 2.9.2, 2.9.3, 2.10.0rc0, 2.10.0rc1, 2.10.0rc2, 2.10.0rc3, 2.10.0, 2.10.1, 2.11.0rc0, 2.11.0rc1, 2.11.0rc2, 2.11.0, 2.12.0)
ERROR: No matching distribution found for tensorflow-gpu==?
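If you want to pick from that list in a script, the error text can be parsed. A small sketch: the function names are invented here, and it assumes pip keeps the "(from versions: ...)" wording shown above, which is not a stable interface.

```python
import re


def parse_available_versions(pip_output):
    """Extract the version list from pip's 'from versions:' error text."""
    match = re.search(r"\(from versions: ([^)]*)\)", pip_output)
    if not match:
        return []
    listed = match.group(1).strip()
    if listed == "none":  # pip prints "none" when nothing is available
        return []
    return [v.strip() for v in listed.split(",") if v.strip()]


def stable_only(versions):
    """Keep plain dotted releases and drop pre-releases such as 2.9.0rc1."""
    return [v for v in versions if re.fullmatch(r"\d+(\.\d+)*", v)]


if __name__ == "__main__":
    sample = (
        "ERROR: Could not find a version that satisfies the requirement "
        "tensorflow-gpu==? (from versions: 2.10.0rc3, 2.10.0, 2.10.1, "
        "2.11.0rc0, 2.11.0, 2.12.0)"
    )
    print(stable_only(parse_available_versions(sample)))
```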
