Model output and context length limits

Hello, may I get confirmation of whether there are limits on Lambda Chat, such as a maximum output length or a context length limit?

I couldn’t seem to find any documentation of this.

Thank you in advance.

@Masira

There are no limits:

The Lambda Inference API enables you to use large language models (LLMs) without the need to set up a server. No limits are placed on the rate of requests. The Lambda Inference API can be used as a drop-in replacement for applications currently using the OpenAI API. See, for example, our guide on integrating the Lambda Inference API into VS Code.
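As a quick illustration of what "drop-in replacement" means in practice, you can point the OpenAI Python client at the Lambda endpoint. This is a minimal sketch; the base URL and model name below are placeholders, so please check the Inference API docs for the current values.

```python
# Minimal sketch: using the OpenAI Python client against the
# Lambda Inference API. Base URL and model name are assumptions;
# verify both in the Inference API documentation.
from openai import OpenAI

client = OpenAI(
    api_key="<LAMBDA_API_KEY>",               # key generated in the Lambda Cloud dashboard
    base_url="https://api.lambdalabs.com/v1",  # assumed endpoint; confirm in the docs
)

response = client.chat.completions.create(
    model="llama3.1-8b-instruct",             # hypothetical model name
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```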

If you have suggestions on how this can be made more visible, please let me know. I’m happy to consider your suggestions.

Good day.

Thank you so much for getting back to me; I truly appreciate your time and help!

I just wanted to clarify that my earlier question was about the Lambda chat interface itself, not the inference API.

I apologize if there was any confusion in my wording.

May I ask what the default maximum output length for Lambda Chat is, and whether its context length matches the underlying model's supported context length?

I’d love to better understand its limits.