vLLM cluster using the vLLM production stack

morrow_maki · August 18, 2025, 7:15am

Hello community,

I’m planning to rent multiple Lambda Labs machines (with H100 GPUs) and deploy them as a cluster using the vLLM production stack.

My goal is to be able to manually purchase more machines from Lambda Labs and connect them to the cluster as my infrastructure gets more load (→ more GPUs and more vLLM instances).

Has anyone achieved something similar and would be willing to give me a few hints? I followed the vLLM production stack guide, but I only managed to make it work with minikube (a single worker). I’d like to make it work with multiple workers.

Cheers!
