Inference API Limits?

Does anyone know the specific API limits, e.g. tokens per minute (TPM) or requests per minute (RPM), that apply to the Inference API for each model? I could not find any documentation on the website. Apologies if I missed anything.

@junruilee

There are no rate limits.

I've just added this to the docs: "No limits are placed on the rate of requests."
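
For anyone finding this thread later: even with no documented rate limits, it can still be worth retrying transient 429/503 responses client-side, since a 503 is also returned while a model is loading. Below is a minimal sketch, assuming the standard api-inference.huggingface.co endpoint and the `requests` library; the model ID and token are placeholders, not values from this thread:

```python
import time

import requests

# Placeholder model and token for illustration; substitute your own.
API_URL = "https://api-inference.huggingface.co/models/gpt2"
HEADERS = {"Authorization": "Bearer hf_xxx"}

def query(payload, max_retries=5):
    """POST to the Inference API, retrying on transient 429/503 responses.

    No rate limits are documented, but retrying defensively costs little
    and also covers the 503 returned while a model is still loading.
    """
    delay = 1.0
    for _ in range(max_retries):
        response = requests.post(API_URL, headers=HEADERS, json=payload)
        if response.status_code not in (429, 503):
            response.raise_for_status()  # surface any other HTTP error
            return response.json()
        time.sleep(delay)  # back off before the next attempt
        delay *= 2         # exponential backoff
    raise RuntimeError("Inference API request kept failing after retries")

print(query({"inputs": "Hello, world"}))
```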

Thank you for confirming.