I am downloading an LLM on an H100 instance, but the download speed is pretty slow, around 70MB/s, which is much slower than what I had on an A100 previously. What determines the download speed? Is it the time of day? Thanks :)
Can you please share the link to the LLM you are downloading so that I can check from my end as well?
Thanks.
Ryan E.
Hi @ryane, sure, thanks. I am running git lfs clone on tiiuae/falcon-40b · Hugging Face
Hi @artemis, I have raised this concern with our engineering team. In my own testing we see the same behavior. I am testing this on H100 and A10 instances and will get back to you once I have an update.
Thanks.
Hi @artemis, we got this feedback from the Engineering team:
When 3 clients in CA send data to the H100 instance, each gets about 670Mb/s, so the 3 streams together consistently gave me ~2Gb/s.
When we run the same test in reverse (H100 server sending data to clients), we get even better results: about 1.3Gb/s per client and 4Gb/s total.
Please note that all of this depends on many factors, but one of the main ones is how far the server and the client are from each other. I tested with instances in different locations and the results were all over the place, from ~200Mb/s to 800Mb/s per client. Additionally, timing also has a big impact, since all our tenants share the same internet circuits.
My goal was simply to make sure there’s no limitation of 50MB/s (400Mb/s) anywhere along the way.
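To keep the units in the numbers above straight (MB/s is megabytes per second, Mb/s is megabits per second, and 1 byte = 8 bits), here is a small Python sketch of the arithmetic. The helper names are made up for illustration, and the model size passed to `download_seconds` is a placeholder, not the actual size of the repository.

```python
def mbytes_to_mbits(mbytes_per_s: float) -> float:
    """Convert megabytes per second (MB/s) to megabits per second (Mb/s)."""
    return mbytes_per_s * 8


def mbits_to_mbytes(mbits_per_s: float) -> float:
    """Convert megabits per second (Mb/s) to megabytes per second (MB/s)."""
    return mbits_per_s / 8


def download_seconds(size_gb: float, rate_mbytes_per_s: float) -> float:
    """Seconds needed to move size_gb gigabytes (decimal, 1 GB = 1000 MB)
    at a steady rate given in MB/s."""
    return size_gb * 1000 / rate_mbytes_per_s


if __name__ == "__main__":
    print(mbytes_to_mbits(50))    # 400   (the 50MB/s figure mentioned above)
    print(mbytes_to_mbits(70))    # 560   (the speed reported in the question)
    print(mbits_to_mbytes(670))   # 83.75 (one of the three client streams)
    print(3 * 670)                # 2010  (three streams, roughly 2Gb/s)
    # Hypothetical 100 GB download at the reported 70MB/s, in minutes:
    print(round(download_seconds(100, 70) / 60, 1))
```

Read this way, the 70MB/s from the original question is about 560Mb/s, which sits inside the 200Mb/s to 800Mb/s per-client range the engineering team measured.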
Do note that when downloading from Hugging Face on our A10 instance, which is in the California region, huggingface.co resolves to a server in the same region.
$ ping huggingface.co
PING huggingface.co (18.155.181.29) 56(84) bytes of data.
64 bytes from server-18-155-181-29.sfo53.r.cloudfront.net (18.155.181.29): icmp_seq=1 ttl=252 time=1.04 ms
64 bytes from server-18-155-181-29.sfo53.r.cloudfront.net (18.155.181.29): icmp_seq=2 ttl=252 time=0.973 ms
64 bytes from server-18-155-181-29.sfo53.r.cloudfront.net (18.155.181.29): icmp_seq=3 ttl=252 time=1.02 ms
For the H100 instance, which is in Utah, it resolves to a Hugging Face server in Denver instead.
$ ping huggingface.co
PING huggingface.co (108.156.201.52) 56(84) bytes of data.
64 bytes from server-108-156-201-52.den52.r.cloudfront.net (108.156.201.52): icmp_seq=1 ttl=244 time=12.3 ms
64 bytes from server-108-156-201-52.den52.r.cloudfront.net (108.156.201.52): icmp_seq=2 ttl=244 time=12.0 ms
64 bytes from server-108-156-201-52.den52.r.cloudfront.net (108.156.201.52): icmp_seq=3 ttl=244 time=11.9 ms
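If you want to run the same comparison from your own instance, the following Python sketch pulls the edge label and the average round-trip time out of `ping` output like the above. It assumes the Linux `ping` reply format shown in this thread and that the label before `.r.cloudfront.net` (for example `sfo53` or `den52`) is an airport-style location code. The function name and the sample constants are illustrative only.

```python
import re
from statistics import mean

# Matches reply lines such as:
# 64 bytes from server-...sfo53.r.cloudfront.net (18.155.181.29): icmp_seq=1 ttl=252 time=1.04 ms
REPLY = re.compile(r"from (?P<host>\S+) \((?P<ip>[\d.]+)\): .*time=(?P<ms>[\d.]+) ms")
# Captures the label right before ".r.cloudfront.net", e.g. "sfo53".
EDGE = re.compile(r"\.([a-z]{3}\d+)\.r\.cloudfront\.net$")


def summarize_ping(output: str):
    """Return (sorted edge labels, mean RTT in ms or None) for ping output."""
    edges, times = set(), []
    for line in output.splitlines():
        reply = REPLY.search(line)
        if not reply:
            continue  # skip the PING header and any non-reply lines
        times.append(float(reply.group("ms")))
        edge = EDGE.search(reply.group("host"))
        if edge:
            edges.add(edge.group(1))
    return sorted(edges), (mean(times) if times else None)


# Sample replies copied from the two ping runs in this thread.
CALIFORNIA_SAMPLE = """\
PING huggingface.co (18.155.181.29) 56(84) bytes of data.
64 bytes from server-18-155-181-29.sfo53.r.cloudfront.net (18.155.181.29): icmp_seq=1 ttl=252 time=1.04 ms
64 bytes from server-18-155-181-29.sfo53.r.cloudfront.net (18.155.181.29): icmp_seq=2 ttl=252 time=0.973 ms
64 bytes from server-18-155-181-29.sfo53.r.cloudfront.net (18.155.181.29): icmp_seq=3 ttl=252 time=1.02 ms
"""
UTAH_SAMPLE = """\
PING huggingface.co (108.156.201.52) 56(84) bytes of data.
64 bytes from server-108-156-201-52.den52.r.cloudfront.net (108.156.201.52): icmp_seq=1 ttl=244 time=12.3 ms
64 bytes from server-108-156-201-52.den52.r.cloudfront.net (108.156.201.52): icmp_seq=2 ttl=244 time=12.0 ms
64 bytes from server-108-156-201-52.den52.r.cloudfront.net (108.156.201.52): icmp_seq=3 ttl=244 time=11.9 ms
"""

if __name__ == "__main__":
    for name, sample in (("California", CALIFORNIA_SAMPLE), ("Utah", UTAH_SAMPLE)):
        edges, avg_ms = summarize_ping(sample)
        print(name, edges, avg_ms)
```

On the samples above this gives an average of roughly 1 ms for the California instance and roughly 12 ms for the Utah one. Keep in mind that ping latency only hints at distance to the edge server; it does not measure throughput by itself.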