Best Practices(?): common tactics for server setups & data storage

Hey guys,
I would like to know what type of setup would be recommended for developing code with the use of on-demand servers that delete all files once they are terminated.

My current mission: develop some computer vision code/ model for an app.

My Machine: MacBook Pro M1

What I need: Develop and test my code (e.g. with notebooks) and test other cv repos & frameworks (mostly shell).

What is my problem: because we are a small startup and we have yet to apply for AWS Activate, we don't have S3 storage or a similar solution. The dataset is on my local machine, but for lack of an NVIDIA card I can't properly install dependencies or frameworks that require CUDA, so developing and prototyping on my MacBook is a pain.
So in theory I would need to set up the whole server from scratch every time I start it up, and also fetch the code and dataset each time.

My question is:
what solutions, types of scripts, usual approaches would help me out here?

Right now I am thinking of a shell script that, once the instance is running, performs the necessary installs (e.g. the PyCharm SSH plugin, conda) and clones my project from GitHub (for the current task the dataset is small enough to be pushed to GitHub). Once I am done developing for the day, I update the dependency and server setup files (pip or conda freezes) and push the code.
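To make that concrete, a per-boot script like the following could do it. This is only a sketch: the repo URL, the `~/project` path, the environment name and the `environment.yml` file are all placeholders you would swap for your own.

```shell
#!/bin/bash
# provision.sh -- sketch of a per-boot setup / end-of-day teardown script.
# REPO_URL, paths and the env name below are placeholders.
set -euo pipefail

REPO_URL="https://github.com/your_user/your_cv_project.git"  # placeholder

setup() {
    # Clone the project (dataset is small enough to live in the repo for now)
    git clone "$REPO_URL" "$HOME/project"
    # Install Miniconda non-interactively if it is not already present
    if [ ! -d "$HOME/miniconda3" ]; then
        curl -fsSL https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -o /tmp/miniconda.sh
        bash /tmp/miniconda.sh -b -p "$HOME/miniconda3"
    fi
    # Recreate the environment frozen at the end of the last session
    "$HOME/miniconda3/bin/conda" env create -f "$HOME/project/environment.yml"
}

teardown() {
    # End of day: freeze the dependencies and push everything back to GitHub
    "$HOME/miniconda3/bin/conda" env export -n your_env > "$HOME/project/environment.yml"
    git -C "$HOME/project" add -A
    git -C "$HOME/project" commit -m "End-of-day sync"
    git -C "$HOME/project" push
}

case "${1:-}" in
    setup)    setup ;;
    teardown) teardown ;;
    *)        echo "usage: provision.sh {setup|teardown}" ;;
esac
```

Run `bash provision.sh setup` right after the instance boots and `bash provision.sh teardown` before terminating it.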

Is this the best / a good setup for now? Are there any popular approaches that I am missing? What is this process called?
Is Docker perfect for this use case or the wrong tool? How do you guys do it? (Except with AWS, Azure or GCP, because deep diving into their ecosystems is not feasible right now. I am very happy about the ease of use of Lambda instances.)

Happy to hear from you guys!


I would think this very much depends on the circumstances. In my case, I spent some time on a setup script, only to realize I would have to downgrade CUDA (I need 10.0). I have a Docker image that works out of the box, so I switched to that, accepting the small overhead (I've read somewhere that it is around 1%).

For prototyping, however, I would think a local GPU has many advantages: one-off setup, no data transfer and a one-time cost. I would definitely look into external graphics cards, though I don't know if that is possible on a Mac. And Docker as well; there are plenty of prebuilt images with the major frameworks.
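For example, using such a prebuilt image looks roughly like this (the PyTorch tag here is only an illustration; pick whichever image matches the CUDA version you need):

```shell
# pull a prebuilt image that ships the framework with a matching CUDA
$ docker pull pytorch/pytorch:1.13.1-cuda11.6-cudnn8-runtime

# run it with GPU access and your project directory mounted
$ docker run --gpus all -it --rm \
    -v "$HOME/project:/workspace/project" \
    pytorch/pytorch:1.13.1-cuda11.6-cudnn8-runtime bash
```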

Hello @JimVincent and @comodoro!

Here is a simple way to deploy your software stack on the Lambda Cloud using Ansible.
e.g. create "ready_to_work.yml":

# you can run this as follows:
# ansible-playbook --ssh-common-args='-o StrictHostKeyChecking=no' ready_to_work.yml -i lambda_cloud_instance_ip,
- hosts: all
  remote_user: ubuntu
  vars:
    ansible_ssh_private_key_file: "~/.ssh/your-private-ssh-key-for-lambda-instances"
    github_username: put_username_here
    github_token: put_token_here
    github_repo: put_repo_name_here
  tasks:
    # Here, we use the variables set above
    # We clone the repo to the home directory of the user
    - name: Checkout your code from GitHub
      ansible.builtin.git:
        repo: "https://{{ github_username }}:{{ github_token }}@github.com/{{ github_username }}/{{ github_repo }}.git"
        dest: "~/{{ github_repo }}"

    - name: Install apt packages
      become: yes
      ansible.builtin.apt:
        name: "{{ item }}"
        state: present
        update_cache: yes
      loop:
        - package1
        - package2

    - name: Install pip packages
      ansible.builtin.pip:
        name: "{{ item }}"
      loop:
        - package1
        - package2

    # Or start a Docker container
    - name: Start Docker container
      community.docker.docker_container:
        name: mycontainer
        state: started
        image: ubuntu:22.04
        command: sleep infinity

You can also create a bash script that waits until the Lambda instance has booted (by checking whether port 22 is open) and then runs the Ansible playbook on it.


#!/bin/bash
# wait until SSH (port 22) answers, then provision the instance
until nc -vzw 2 "$1" 22; do sleep 2; done
ansible-playbook --ssh-common-args='-o StrictHostKeyChecking=no' ready_to_work.yml -i "$1",

Run it like this (assuming you saved it as e.g. "ready_to_work.sh"):
$ bash ready_to_work.sh <instance_ip>