There are a number of possible reasons you’d want to do this, so I’ll try to cover an example of how to tackle each of them:
Scenario 1: You just want to install common packages on the Lambda GPU Instance, similar to your local machine:
This one is very straightforward. On your local machine, be sure to have your conda env activated and…
# export env requirements
conda env export > environment.yml
# upload your requirements
scp -i <YOUR_KEY>.pem environment.yml ubuntu@<INSTANCE_IP>:/home/ubuntu/environment.yml
# set up your env
ssh -i <YOUR_KEY>.pem ubuntu@<INSTANCE_IP> "/opt/miniconda/bin/conda init && /opt/miniconda/bin/conda env create -f environment.yml"
Lambda GPU Instances seem to ship with a Miniconda install at /opt/miniconda, so save yourself some time and typing and run:
alias conda=/opt/miniconda/bin/conda
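If you want that alias to stick around between shell sessions, you could append it to your ~/.bashrc:
# persist the alias across sessions (optional)
echo 'alias conda=/opt/miniconda/bin/conda' >> ~/.bashrc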
If this were me and I needed to do this frequently (akin to the concept of User Data on something like AWS), I’d set up a little setup.sh bash script with the above to save me some time, like the sketch below.
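Here’s a minimal sketch of what that local script could look like (the key, IP, and env details are placeholders for your own values):
#!/usr/bin/env bash
# setup_env.sh -- minimal sketch; <YOUR_KEY> and <INSTANCE_IP> are placeholders
set -euo pipefail
KEY=<YOUR_KEY>.pem
HOST=ubuntu@<INSTANCE_IP>
# export the currently active Conda env and ship it over
conda env export > environment.yml
scp -i "$KEY" environment.yml "$HOST":/home/ubuntu/environment.yml
# create the env remotely using the instance's Miniconda install
ssh -i "$KEY" "$HOST" "/opt/miniconda/bin/conda init && /opt/miniconda/bin/conda env create -f environment.yml"
Run it once per fresh instance and your env is waiting for you when you SSH in.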
Scenario 2: You have some binaries or local modules that you can’t get off of the open web from a Lambda GPU Instance and just want to drop your EXACT Conda Env onto a Lambda GPU Instance:
I run into this scenario quite often personally. A benefit of Conda over, say, virtualenv, is that the binary files your Python modules depend on are bundled into the environment itself. This means that the whole thing is pseudo-portable, and if you have a similar-enough system (x86 CPU and ideally Ubuntu) then this might work for you:
In this example I’ve made a Conda Env with two modules:
numpy → a commonly-used module that requires multiple binaries to operate
requests → I’ve modified its __init__.py to call a little “hey, it works!” Go binary, representing a piece of the environment that perhaps your team/company provides, which Lambda GPU Cloud would obviously not have access to in the environment.yml method above
With your local Conda environment activated:
# install conda-pack (from conda-forge) and pack your whole Conda Environment into a tarball
conda install -n base -c conda-forge conda-pack
conda pack  # produces <YOUR_ENV_NAME>.tar.gz in the current directory
# send it to your Lambda GPU Instance
scp -i <YOUR_KEY>.pem <YOUR_ENV_NAME>.tar.gz ubuntu@<INSTANCE_IP>:/home/ubuntu/<YOUR_ENV_NAME>.tar.gz
then hop onto your GPU instance and run:
# extract it to your new install location
mkdir <YOUR_ENV_NAME> && cd <YOUR_ENV_NAME>
tar -xzvf ../<YOUR_ENV_NAME>.tar.gz
# activate your new env (note: we're already inside the env directory)
source bin/activate
# optional but recommended: conda-pack bundles a script that rewrites
# hardcoded path prefixes left over from your local machine
conda-unpack
# set your PYTHONPATH
export PYTHONPATH=$(pwd)/lib/python3.8/site-packages/
…test out our semi-portable environment with custom binaries:
Python 3.8.16 (default, Jun 12 2023, 18:09:05)
[GCC 11.2.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import requests
Hey, the binary works!
>>> # works
>>> import numpy
>>> # (works)
Once again, the above can be made into a very quick setup.sh bash script that lives on your local machine to simulate a quick-setup User Data type feature with on-demand instances; a sketch follows. I’ll disclaim that there may be a few pathing issues with this method (conda-unpack should catch most of them), but test it out and see if it works for your needs.
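A rough sketch of that local script (again, the key, IP, and env name are placeholders for your own values):
#!/usr/bin/env bash
# pack_and_ship.sh -- rough sketch; <YOUR_KEY>, <INSTANCE_IP>, and
# <YOUR_ENV_NAME> are placeholders for your own values
set -euo pipefail
KEY=<YOUR_KEY>.pem
HOST=ubuntu@<INSTANCE_IP>
ENV=<YOUR_ENV_NAME>
# pack the active Conda env into a tarball and copy it over
conda pack
scp -i "$KEY" "$ENV.tar.gz" "$HOST":/home/ubuntu/"$ENV.tar.gz"
# extract, activate, and fix up path prefixes on the instance
ssh -i "$KEY" "$HOST" "mkdir -p $ENV && tar -xzf $ENV.tar.gz -C $ENV && source $ENV/bin/activate && conda-unpack"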
Scenario 3: You need a combination of the above and quite a bit of additional setup:
For anything more complicated than this, I’d recommend looking into The Lambda Stack Docker Image. The instructions are incredibly easy, and if you set up a simple Dockerfile for your project you can use the instance’s GPU(s) within the container, which will be set up for you ready-to-go every time.
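As a rough sketch of that workflow once you have a Dockerfile (the image tag is a placeholder; --gpus all assumes the NVIDIA container toolkit is set up, which these instances appear to ship with):
# build your project image from a Lambda Stack base (see their instructions)
docker build -t <YOUR_IMAGE> .
# --gpus all exposes the instance's GPU(s) inside the container
docker run --gpus all -it <YOUR_IMAGE>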
Hope somewhere in this comment I covered what you were looking for!! Usually when renting a GPU Instance that requires a specific environment, I’ll always keep myself a handy little setup.sh script to get it ready quickly each time. Let me know if something like this works for you and your team’s workflow!