Definition of on-demand

Hi there,

I’d like to confirm some details about on-demand instances. I want to deploy a model so that I’m only charged when I actually use it for inference or fine-tuning. I only plan to run inference intermittently and want to avoid charges while it’s idle, but I’d still like it to be responsive whenever I do use it.

Are on-demand instances a good fit for this?

Thanks!

You’re charged for all the time an instance is running, regardless of whether or not the GPUs are being utilized.
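
To put rough numbers on that, here is a minimal sketch. The hourly rate and the usage hours are made up for illustration (they are not Lambda’s actual pricing), and `billed_cost` is just a hypothetical helper:

```python
def billed_cost(hours_running: float, hourly_rate: float) -> float:
    """Cost when billing follows the time the instance is running,
    not the time the GPUs are busy."""
    return hours_running * hourly_rate


# Hypothetical numbers, for illustration only.
HOURLY_RATE = 1.00        # assumed price in USD per instance-hour
hours_running = 24 * 30   # instance left running for a 30-day month
hours_in_use = 10         # hours actually spent on inference

always_on = billed_cost(hours_running, HOURLY_RATE)
only_while_used = billed_cost(hours_in_use, HOURLY_RATE)

print(f"Left running all month: ${always_on:.2f}")
print(f"Running only while used: ${only_while_used:.2f}")
```

The second figure only applies if the instance is not running between sessions, and in that case you would presumably have to wait for a new instance to launch each time, so it won’t be instantly responsive.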

See Lambda’s docs on billing for more info.