Model output and context length limits

Hello, may I get confirmation of whether there are limits on Lambda Chat, such as a maximum output length or a context length limit?

I couldn’t seem to find any documentation of this.

Thank you in advance.

@Masira

There are no limits:

The Lambda Inference API enables you to use large language models (LLMs) without the need to set up a server. No limits are placed on the rate of requests. The Lambda Inference API can be used as a drop-in replacement for applications currently using the OpenAI API. See, for example, our guide on integrating the Lambda Inference API into VS Code.
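As a quick illustration of what "drop-in replacement" means in practice, you can point the OpenAI Python client at the Lambda endpoint. This is a minimal sketch; the base URL and model name below are placeholders, so please check the Inference API docs for the current values.

```python
# Minimal sketch: using the OpenAI Python client against the
# Lambda Inference API. Base URL and model name are assumptions;
# verify both in the Inference API documentation.
from openai import OpenAI

client = OpenAI(
    api_key="<LAMBDA_API_KEY>",               # key generated in the Lambda Cloud dashboard
    base_url="https://api.lambdalabs.com/v1",  # assumed endpoint; confirm in the docs
)

response = client.chat.completions.create(
    model="llama3.1-8b-instruct",             # hypothetical model name
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```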

If you have suggestions on how this can be made more visible, please let me know. I’m happy to consider your suggestions.

Good day.

Thank you so much for getting back to me; I truly appreciate your time and help!

I just wanted to clarify that my earlier question was about the Lambda chat interface itself, not the inference API.

I apologize if there was any confusion in my wording.

May I ask what the default maximum output length for Lambda Chat is, and whether its context length matches the underlying model's supported context length?

I’d love to better understand its limits.