There are a number of possible reasons you’d want to do this, so I’ll try to cover an example of how to tackle each of them:
Scenario 1: You just want to install common packages on the Lambda GPU Instance, similar to your local machine:
This one is very straightforward. On your local machine, be sure to have your conda env activated and…
# export env requirements
conda env export > environment.yml
# upload your requirements
scp -i <YOUR_KEY>.pem environment.yml ubuntu@<INSTANCE_IP>:/home/ubuntu/environment.yml
# set up your env
ssh -i <YOUR_KEY>.pem ubuntu@<INSTANCE_IP> "/opt/miniconda/bin/conda init && /opt/miniconda/bin/conda env create -f environment.yml"
Lambda GPU Instances seem to ship with a Miniconda install at /opt/miniconda, so save yourself some time and typing and run:
alias conda=/opt/miniconda/bin/conda
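If you want that alias to stick around between shell sessions, you could append it to your ~/.bashrc:
# persist the alias across sessions (optional)
echo 'alias conda=/opt/miniconda/bin/conda' >> ~/.bashrc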
If this were me and I needed to do this frequently (akin to the concept of User Data on something like AWS), I’d set up a little setup.sh bash script with the above to save me some time, like the sketch below.
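Here’s a minimal sketch of what that local script could look like (the key, IP, and env details are placeholders for your own values):
#!/usr/bin/env bash
# setup_env.sh -- minimal sketch; <YOUR_KEY> and <INSTANCE_IP> are placeholders
set -euo pipefail
KEY=<YOUR_KEY>.pem
HOST=ubuntu@<INSTANCE_IP>
# export the currently active Conda env and ship it over
conda env export > environment.yml
scp -i "$KEY" environment.yml "$HOST":/home/ubuntu/environment.yml
# create the env remotely using the instance's Miniconda install
ssh -i "$KEY" "$HOST" "/opt/miniconda/bin/conda init && /opt/miniconda/bin/conda env create -f environment.yml"
Run it once per fresh instance and your env is waiting for you when you SSH in.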
Scenario 2: You have some binaries or local modules that you can’t get off of the open web from a Lambda GPU Instance and just want to drop your EXACT Conda Env onto a Lambda GPU Instance:
I run into this scenario quite often personally. A benefit of Conda over, say, virtualenv, is that the binary files your Python modules depend on are bundled into the environment itself. This means that the whole thing is pseudo-portable, and if you have a similar-enough system (x86 CPU and ideally Ubuntu) then this might work for you:
In this example I’ve made a Conda Env with two modules:
numpy → a commonly-used module that requires multiple binaries to operate
requests → I’ve modified its __init__.py to call a little “hey, it works!” Go binary, representing a piece of the environment that perhaps your team/company provides, which Lambda GPU Cloud would obviously not have access to in the environment.yml method above
With your local Conda environment activated:
# install conda-pack (from conda-forge) and pack your whole Conda Environment into a tarball
conda install -n base -c conda-forge conda-pack
conda pack  # produces <YOUR_ENV_NAME>.tar.gz in the current directory
# send it to your Lambda GPU Instance
scp -i <YOUR_KEY>.pem <YOUR_ENV_NAME>.tar.gz ubuntu@<INSTANCE_IP>:/home/ubuntu/<YOUR_ENV_NAME>.tar.gz
then hop onto your GPU instance and run:
# extract it to your new install location
mkdir <YOUR_ENV_NAME> && cd <YOUR_ENV_NAME>
tar -xzvf ../<YOUR_ENV_NAME>.tar.gz
# activate your new env (note: we're already inside the env directory)
source bin/activate
# optional but recommended: conda-pack bundles a script that rewrites
# hardcoded path prefixes left over from your local machine
conda-unpack
# set your PYTHONPATH
export PYTHONPATH=$(pwd)/lib/python3.8/site-packages/
…test out our semi-portable environment with custom binaries:
Python 3.8.16 (default, Jun 12 2023, 18:09:05)
[GCC 11.2.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import requests
Hey, the binary works!
>>> # works
>>> import numpy
>>> # (works)
Once again, the above can be made into a very quick setup.sh bash script that lives on your local machine to simulate a quick-setup User Data type feature with on-demand instances; a sketch follows. I’ll disclaim that there may be a few pathing issues with this method (conda-unpack should catch most of them), but test it out and see if it works for your needs.
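A rough sketch of that local script (again, the key, IP, and env name are placeholders for your own values):
#!/usr/bin/env bash
# pack_and_ship.sh -- rough sketch; <YOUR_KEY>, <INSTANCE_IP>, and
# <YOUR_ENV_NAME> are placeholders for your own values
set -euo pipefail
KEY=<YOUR_KEY>.pem
HOST=ubuntu@<INSTANCE_IP>
ENV=<YOUR_ENV_NAME>
# pack the active Conda env into a tarball and copy it over
conda pack
scp -i "$KEY" "$ENV.tar.gz" "$HOST":/home/ubuntu/"$ENV.tar.gz"
# extract, activate, and fix up path prefixes on the instance
ssh -i "$KEY" "$HOST" "mkdir -p $ENV && tar -xzf $ENV.tar.gz -C $ENV && source $ENV/bin/activate && conda-unpack"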
Scenario 3: You need a combination of the above and quite a bit of additional setup:
For anything more complicated than this, I’d recommend looking into The Lambda Stack Docker Image. The instructions are incredibly easy, and if you set up a simple Dockerfile for your project you can use the instance’s GPU(s) within the container, which will be set up for you ready-to-go every time.
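As a rough sketch of that workflow once you have a Dockerfile (the image tag is a placeholder; --gpus all assumes the NVIDIA container toolkit is set up, which these instances appear to ship with):
# build your project image from a Lambda Stack base (see their instructions)
docker build -t <YOUR_IMAGE> .
# --gpus all exposes the instance's GPU(s) inside the container
docker run --gpus all -it <YOUR_IMAGE>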
Hope somewhere in this comment I covered what you were looking for!! Usually when renting a GPU Instance that requires a specific environment, I’ll always keep myself a handy little setup.sh script to get it ready quickly each time. Let me know if something like this works for you and your team’s workflow!