The Lambda Inference API lets you use large language models (LLMs) without needing to set up a server, and no rate limits are placed on requests. It can be used as a drop-in replacement for applications currently using the OpenAI API. See, for example, our guide on integrating the Lambda Inference API into VS Code.
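For concreteness, here is a minimal sketch of what "drop-in replacement" means in practice, using the official OpenAI Python client pointed at Lambda's endpoint. The base URL (`https://api.lambdalabs.com/v1`) and the model name (`llama3.1-8b-instruct`) are assumptions for illustration; check the Inference API docs for the current endpoint and model list.

```python
# Minimal sketch: calling the Lambda Inference API through the official
# OpenAI Python client. The base_url and model name are assumptions --
# consult the Lambda Inference API docs for current values.
from openai import OpenAI

client = OpenAI(
    api_key="<YOUR_LAMBDA_API_KEY>",           # generated in the Lambda Cloud dashboard
    base_url="https://api.lambdalabs.com/v1",  # assumed OpenAI-compatible endpoint
)

# Because the API is OpenAI-compatible, the standard chat-completions
# call works unchanged; only api_key and base_url differ from a stock
# OpenAI setup.
response = client.chat.completions.create(
    model="llama3.1-8b-instruct",  # hypothetical model name for illustration
    messages=[
        {"role": "user", "content": "Name three uses of serverless LLM inference."}
    ],
)

print(response.choices[0].message.content)
```

Existing application code built on the OpenAI SDK should work as-is; swapping in the Lambda API key and base URL is the only change.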
If you have suggestions on how this could be made more visible, please let me know; I'm happy to consider them.