As Cody mentioned, the main considerations you need to make are:
1. Power - how much you need depends on how many GPUs you run and their TDP.
* 4x 2080's: about 1000 Watts total (roughly 250 Watts each)
* 4x 3090's: about 1400 Watts total (350 Watts each), plus transient spikes above that
2. Cooling/Thermals - this is probably fine, but make sure you have good airflow through the case.
3. The motherboard and the PCIe slots you are using. Some slots are physically x16 but electrically wired for only x8 or x4, so check which slots actually run at full x16.
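The power numbers above can be turned into a rough PSU estimate. A minimal sketch; the 300 W rest-of-system figure and the 25% headroom are assumptions, adjust them for your parts:

```shell
# Rough PSU sizing: GPU TDP x count, plus the rest of the system,
# plus headroom for transient spikes. Rest-of-system wattage and
# headroom percentage are assumptions - adjust for your build.
GPU_TDP=350         # Watts per GPU (e.g. a 3090)
NUM_GPUS=4
REST_OF_SYSTEM=300  # Watts for CPU, motherboard, RAM, drives (assumed)
HEADROOM_PCT=25     # margin for power spikes (assumed)

LOAD=$(( GPU_TDP * NUM_GPUS + REST_OF_SYSTEM ))
PSU=$(( LOAD * (100 + HEADROOM_PCT) / 100 ))
echo "Estimated load: ${LOAD} W, suggested PSU: ${PSU} W or larger"
```

For 4x 3090s this lands well above what a single consumer PSU comfortably supplies, which is why people often split across two PSUs or step up to server-grade power.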
Also make sure you have (or get) the correct power cables for the GPUs from your power supply.
Connectors vary between GPUs (e.g., 6-pin, 8-pin, or the 12-pin on some 3090s).
Also decide whether you want/need NVLink.
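Once the cards are installed, you can confirm both the PCIe link each card actually negotiated and whether NVLink is active. A sketch, assuming the NVIDIA driver and `nvidia-smi` are present:

```shell
# Per-GPU PCIe link generation and lane width as currently negotiated
nvidia-smi --query-gpu=name,pcie.link.gen.current,pcie.link.width.current --format=csv

# Interconnect matrix between GPU pairs (NV# entries mean NVLink;
# PHB/PXB/SYS indicate PCIe paths)
nvidia-smi topo -m

# Per-link NVLink status, if bridges are installed
nvidia-smi nvlink --status
```

Note the negotiated link can drop to a lower generation at idle; check it under load for the real number.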
Finally, you generally want system RAM sized to your total GPU memory.
How much varies somewhat with the applications you are using.
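A common rule of thumb is roughly 2x the total GPU memory in system RAM. A minimal sketch of that arithmetic; the 2x multiplier and the 24 GB-per-card figure (a 3090) are assumptions:

```shell
# Rule-of-thumb RAM sizing: ~2x total VRAM. The multiplier and the
# per-card VRAM figure are assumptions - adjust per workload.
VRAM_PER_GPU=24   # GB per card (e.g. a 3090)
NUM_GPUS=4
MULTIPLIER=2

TOTAL_VRAM=$(( VRAM_PER_GPU * NUM_GPUS ))
MIN_RAM=$(( TOTAL_VRAM * MULTIPLIER ))
echo "Total VRAM: ${TOTAL_VRAM} GB, suggested system RAM: ${MIN_RAM} GB or more"
```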
Another big speed-up is using NVMe M.2 or U.2 drives; RAIDing them can boost performance further. How many drives fit in your chassis depends on your motherboard or add-on RAID cards. NVMe prices have come down and capacities have increased, but they are still more limited than SATA SSDs.
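To see which of your existing drives are NVMe vs SATA, `lsblk` shows the transport per device (Linux only):

```shell
# List block devices with transport (nvme vs sata) and
# rotational flag (ROTA=0 means SSD)
lsblk -d -o NAME,TRAN,ROTA,SIZE,MODEL
```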
I recommend running the same model (or at least generation) of cards on the same machine.
Other things to be aware of…
* If you are moving from older 1080s or 2080s to a 3090, for example:
the Ampere line (3070, 3080, 3090, A5000, A6000, etc.) requires CUDA 11.1 or higher
(you can download new drivers and CUDA).
Just be aware of this for your code.
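To check what you currently have before upgrading, a sketch (assumes the NVIDIA driver is installed, and `nvcc` if the CUDA toolkit is present):

```shell
# Installed driver version (the nvidia-smi banner also shows the highest
# CUDA version that driver supports)
nvidia-smi --query-gpu=driver_version --format=csv,noheader

# CUDA toolkit version, if nvcc is installed
nvcc --version
```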
If you have dependencies on older TensorFlow, there are Docker images that pair it with newer CUDA.
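For example, NVIDIA publishes TensorFlow containers on NGC that bundle a matched CUDA. A sketch; the image tag is an assumption, so pick one from the NGC catalog that matches the TensorFlow version you need, and it assumes the NVIDIA Container Toolkit is installed on the host:

```shell
# Run an NVIDIA-built TensorFlow container with GPU access.
# The tag below is an example/assumption - check the NGC catalog.
docker run --gpus all -it --rm nvcr.io/nvidia/tensorflow:23.03-tf2-py3
```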
Wherever you buy them from:
* make sure you know what the warranty is (ideally written down in the quote so it is clear to all parties).
* I would buy from a reputable place (it can be a risk buying from ebay or similar).
* And ask, with your order, for power cables that match both your new GPUs and your power supply
(make sure you know exactly which power supply model you have).
If you have the motherboard version (‘sudo dmidecode -t 1’) I can pull up the site/manual for that motherboard and show you the PCI slots and their limits.
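For reference, the dmidecode queries look like this (they need root; `-t 1` prints the system/product record, `-t 2` the baseboard/motherboard record):

```shell
# System manufacturer and product name
sudo dmidecode -t 1

# Baseboard (the motherboard itself): manufacturer, model, version
sudo dmidecode -t 2
```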