Have been using the inference APIs a fair bit and seem to be getting 524 status code responses more and more often. A quick Google search suggests this might be a Cloudflare response (the gateway timing out waiting on the origin).
Is this just me, or is it more common?
Edit: Should have added that I'm mostly using the llama3.3-70b-instruct-fp8
endpoint. If I try to run multiple (approx. 5) concurrent queries, that seems to trigger it, or at least increases the chances of it occurring. Haven't had a chance to check whether different models have the same issue.
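Not a fix, but in case it helps others hitting this: assuming the 524s really are gateway timeouts under concurrent load, a workaround is to cap in-flight requests below the level that triggers them and retry with backoff. This is just a sketch; `send_query` is a stand-in for whatever HTTP call you make to the endpoint, and the concurrency cap of 2 is a guess given that ~5 concurrent queries seemed to be the trigger.

```python
import concurrent.futures
import time

MAX_CONCURRENCY = 2  # guess: stay well below the ~5 that seemed to trigger 524s


def send_query(prompt):
    # Stand-in for the real HTTP call to the inference endpoint;
    # simulates a short round trip so the sketch is runnable as-is.
    time.sleep(0.01)
    return f"response to: {prompt}"


def run_with_retries(prompt, retries=3, backoff=1.0):
    """Retry with exponential backoff; TimeoutError stands in for a 524."""
    for attempt in range(retries):
        try:
            return send_query(prompt)
        except TimeoutError:
            time.sleep(backoff * 2 ** attempt)
    raise RuntimeError("giving up after repeated 524s")


prompts = [f"query {i}" for i in range(5)]
# The thread pool caps how many requests are in flight at once.
with concurrent.futures.ThreadPoolExecutor(max_workers=MAX_CONCURRENCY) as pool:
    results = list(pool.map(run_with_retries, prompts))
```

With a real client you'd raise `TimeoutError` (or check the status code) when the response comes back 524, but the queueing and backoff structure stays the same.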