PyTorch DDP with NCCL hangs on H100 server

I wasn’t able to replicate the issue.

Could you provide step-by-step instructions to reproduce it?