I want to host a model on an H100 GPU (currently accessible only on localhost). Can I turn the server into a public endpoint (similar to how api.lambdlabs.com works) so I can make API requests to it from any authenticated machine?
So, what is the best way to expose the model API to the public while allowing only HTTP traffic through the firewall?
Also, do you have a private tunnel between AWS and Lambda on-demand cloud GPUs?
Public-facing IP address OR port forward. If you have a dedicated public IP for your machine, that is easiest; this is most common at universities, which have big blocks of IPs. Alternatively, you need access to some machine on the network (like the router) which has a public IP; then you can run a reverse proxy on that machine or forward ports 80 and 443 to your machine.
You will need to obtain a domain name. Whoever you rent the domain from will have a portal where you can “create DNS records” (assign an IP address to the domain). For example, if you have lambda.com and you point it at your machine's public IP, say 203.0.113.10 (a placeholder; addresses such as 10.0.0.1 are private and not reachable from the internet), then requests to lambda.com will go to 203.0.113.10, on port 80 for HTTP and port 443 for HTTPS.
(Optional, Recommended) Set up HTTPS. These days this is pretty straightforward. If you don’t do this, anyone who visits the site in a browser will get a warning that the “connection is not secure”, and API requests (including any credentials) will travel unencrypted.
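A DNS record only helps if the address it points to is reachable from the internet, so it can be worth checking whether the IP you plan to use is publicly routable. A minimal sketch using Python's standard ipaddress module; the helper name is_publicly_routable and the sample addresses are just illustrations, not anything specific to your setup:

```python
import ipaddress


def is_publicly_routable(address: str) -> bool:
    """Return True if the address is globally routable on the internet."""
    return ipaddress.ip_address(address).is_global


# Sample addresses (illustrative only): a typical home LAN address,
# the loopback address, and a well-known public resolver.
for candidate in ["192.168.1.1", "127.0.0.1", "8.8.8.8"]:
    print(candidate, is_publicly_routable(candidate))
```

If the address you were about to put in the DNS record comes back False, you are most likely behind NAT and need the port-forward or reverse-proxy route instead.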
Recommended tools
I highly recommend using Caddy as a reverse proxy. It can do a lot more, like being a full web server, but I love it because it has great docs and, in my view, it's super simple to configure. With Caddy you would do two things:
Set up a reverse proxy.
Automatically set up HTTPS.
Caddy uses a config file called a Caddyfile, which for your use case could be as follows.
example.com
reverse_proxy /api/* 127.0.0.1:8080
In the above example any HTTPS request to example.com/api/* will be forwarded to 127.0.0.1:8080 (localhost port 8080).
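Once the proxy is running, a client on any machine can call the model over HTTPS. The sketch below uses only the Python standard library. The /api/generate path, the JSON payload shape, and the bearer-token header are all assumptions for illustration; the proxy config above does not authenticate anything by itself, so this presumes your model server (or an extra auth layer) checks the token.

```python
import json
import urllib.request


def build_request(base_url: str, token: str, payload: dict) -> urllib.request.Request:
    """Build an authenticated POST request for a hypothetical /api/generate route."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/api/generate",  # hypothetical path under /api/*
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Assumed scheme: the backend validates this bearer token.
            "Authorization": f"Bearer {token}",
        },
    )


request = build_request("https://example.com", "YOUR_API_TOKEN", {"prompt": "Hello"})
print(request.get_method(), request.full_url)

# Uncomment to actually send the request once your endpoint is live:
# with urllib.request.urlopen(request, timeout=30) as response:
#     print(json.load(response))
```

Sending the token over HTTPS rather than plain HTTP matters here, since otherwise anyone on the path could read it.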
What is a reverse-proxy?
Usually HTTP requests are made on port 80 and HTTPS requests on port 443. A reverse proxy looks at the hostname and path of each request and forwards it to a specific address and port. For example, you could map http://api.lambda.com to 192.168.1.1:8080, http://docs.lambda.com to 192.168.1.1:8081, and http://lambda.com/contact to 192.168.1.2:8080. There are a lot of things you can do, but in general, you are intercepting HTTP/HTTPS requests and remapping/forwarding them.
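To make the idea concrete, here is a toy sketch of the lookup a reverse proxy performs, using the example mappings above. It is not how Caddy is implemented; the ROUTES table and the pick_upstream function are made up for illustration, and the prefix matching is deliberately simplistic.

```python
from typing import Optional

# (hostname, path prefix, upstream address) taken from the example mappings.
ROUTES = [
    ("api.lambda.com", "/", "192.168.1.1:8080"),
    ("docs.lambda.com", "/", "192.168.1.1:8081"),
    ("lambda.com", "/contact", "192.168.1.2:8080"),
]


def pick_upstream(host: str, path: str) -> Optional[str]:
    """Return the upstream for the first route matching host and path prefix."""
    for route_host, prefix, upstream in ROUTES:
        if host == route_host and path.startswith(prefix):
            return upstream
    return None  # no route matched; a real proxy would return an error page


print(pick_upstream("api.lambda.com", "/v1/chat"))
print(pick_upstream("lambda.com", "/contact"))
print(pick_upstream("lambda.com", "/about"))
```

A real proxy does the same kind of match on every incoming request, then opens a connection to the chosen upstream and relays the request and response.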