Lambda <> OpenRouter Woes

Lambda offers a great service through OpenRouter, with incredible models at incredible prices.

However, it’s not so great when I’m trying to build a platform on top of Lambda’s inference endpoints and I’m being rate limited every other second. My users (only ~100 in total) get constant 429s.

I’m using Hermes 405B and Llama 3.3 70B, both of which are heavily rate limited.

Is there a way to raise my key’s limits and I’m just being dumb, or is this how it’s meant to be?
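For now I’m papering over it client-side with retries. Here’s a minimal sketch of that, assuming the standard OpenRouter chat completions endpoint, an `OPENROUTER_API_KEY` env var, and a Hermes 405B model slug that you should verify against OpenRouter’s model list:

```python
import os
import time

import requests

OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"
API_KEY = os.environ["OPENROUTER_API_KEY"]  # assumed env var name
MODEL = "nousresearch/hermes-3-llama-3.1-405b"  # slug is a guess; verify on OpenRouter


def chat(messages, max_retries=5):
    for attempt in range(max_retries):
        resp = requests.post(
            OPENROUTER_URL,
            headers={"Authorization": f"Bearer {API_KEY}"},
            json={"model": MODEL, "messages": messages},
            timeout=120,
        )
        if resp.status_code != 429:
            resp.raise_for_status()
            return resp.json()
        # Back off exponentially, honoring a numeric Retry-After if present.
        retry_after = resp.headers.get("Retry-After", "")
        delay = float(retry_after) if retry_after.isdigit() else 2 ** attempt
        time.sleep(min(delay, 30))
    raise RuntimeError("still rate limited after retries")
```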

@Jason Lambda’s engineers are looking into the issue. I’ll post an update in this thread as soon as I have one.


Thank you!! I’m excited to use the service again without issues 🙂

@Jason Quick update: Lambda’s engineers currently suspect the issue is with the DDoS protection implementation. If that’s the case, the rate limit will be bumped up for all customers.

If you happen to have Ray IDs, those would be helpful for diagnosing the issue.

@Jason Work was done last night to address the issue. Let me know in this thread if the issue appears to be fixed (or not).

Hey! I’m still seeing 429s right now. I didn’t see them earlier, but I’m seeing them now.

The errors come back through OpenRouter, so I don’t see a Ray ID; I just see the 429 being returned, with nothing else logged.

If it’s a rate limit, can I get my account’s limits increased? I’m happy to use my own key and route through OpenRouter if that’s possible.
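For reference, here’s a sketch of how I’m capturing what little diagnostic detail comes back on a 429 (same assumed endpoint, env var, and model slug as above); since OpenRouter doesn’t expose a Cloudflare Ray ID, the headers and body are all I can offer:

```python
import json
import os

import requests

API_KEY = os.environ["OPENROUTER_API_KEY"]  # assumed env var name
MODEL = "nousresearch/hermes-3-llama-3.1-405b"  # slug is a guess; verify

resp = requests.post(
    "https://openrouter.ai/api/v1/chat/completions",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={"model": MODEL, "messages": [{"role": "user", "content": "ping"}]},
    timeout=120,
)
if resp.status_code == 429:
    # No Ray ID is visible through OpenRouter, so dump status, headers,
    # and body verbatim as the only diagnostics available.
    print(resp.status_code)
    print(json.dumps(dict(resp.headers), indent=2))
    print(resp.text)
```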

Hi, I’m also hitting the same issue - huge numbers of 429s. It seems to have become abruptly worse within the last day.

Yeah, it’s getting bad. I’m all for preventing DDoS, but I’d like some way to raise the limit on my inference requests. I’ve been using Lambda for a while now and I really, really want to keep using you guys, but the 429s are unacceptable at this point.

@cody_b @Jason @friendly_fox

Here’s the latest update I received at 11:38 AM PST:

[Rate limiting has been implemented as a] temporary measure as we recover from an outage. The team is on it and will remove rate limits again when everything is stable.

I’ll continue to provide updates in this thread.

Eagerly awaiting updates! My app normally routes well over 1B tokens/week to Hermes 405B via OpenRouter. I’ve had to temporarily switch it all to DeepInfra, which is generally slower and less reliable, so I would love to start pointing to Lambda again!
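In case it helps anyone making the same swap: I’m pinning traffic to DeepInfra with OpenRouter’s provider routing preferences (the `provider` object in the request body). The exact provider name strings are my assumption, so check OpenRouter’s provider list before relying on this:

```python
# Request payload sketch: prefer one upstream provider via OpenRouter's
# provider routing. "DeepInfra" is assumed to be the provider's exact name.
payload = {
    "model": "nousresearch/hermes-3-llama-3.1-405b",  # slug is a guess; verify
    "messages": [{"role": "user", "content": "ping"}],
    "provider": {
        "order": ["DeepInfra"],   # try DeepInfra first for now
        "allow_fallbacks": True,  # let OpenRouter fall back if it also 429s
    },
}
```

Flipping `order` back to `["Lambda"]` should be a one-line change once things stabilize.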

@Jason @friendly_fox Small update, but it’s all I have at the moment: Lambda’s engineers are slowly easing the rate limiting on the Inference product. I’m trying to get an ETA for when it should stop affecting your projects.