Open MPI warning - no preset parameters were found

Hi and thanks in advance for any help and advice!

I am trying to fine-tune a transformer model from the Hugging Face library and do some pre-processing using spaCy. When I run the script, I get the following warning, which seems to be related to the Open MPI configuration. The code runs despite the warning, but I'm wondering whether it impacts performance. Does anyone have an idea how to solve this issue?

--------------------------------------------------------------------------
WARNING: No preset parameters were found for the device that Open MPI
detected:

Local host:            (I omit this info here)
Device name:           mlx5_0
Device vendor ID:      (I omit this info here)
Device vendor part ID: (I omit this info here)

Default device parameters will be used, which may result in lower
performance.  You can edit any of the files specified by the
btl_openib_device_param_files MCA parameter to set values for your
device.


NOTE: You can turn off this warning by setting the MCA parameter
  btl_openib_warn_no_device_params_found to 0.

--------------------------------------------------------------------------

--------------------------------------------------------------------------

No OpenFabrics connection schemes reported that they were able to be
used on a specific port.  As such, the openib BTL (OpenFabrics
support) will be disabled for this port.

Local host:           (I omit this info here)
Local device:         mlx5_0
Local port:           1
CPCs attempted:       udcm

--------------------------------------------------------------------------

2023-05-12 14:59:25.306866: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero

2023-05-12 14:59:25.307798: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero

2023-05-12 14:59:25.307949: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero

Hello Den,

That warning means that the Mellanox card (mlx5_0) isn't known to the Open MPI library, so Open MPI has no preset parameters for it.
As the message says, it falls back to default device parameters, which may lower performance for traffic that goes through that card.
If you're not doing multi-node training, it shouldn't matter that Open MPI can't optimize for your Ethernet NIC.

Can you tell me more about your architecture?
How many nodes do you have and how are they connected?
Also, could you please send me the exact command you run, with all of its parameters?
Do you use Lambda Stack?

Best,
Yanos

Hello Yanos and thank you for your response,

I use a Lambda Cloud instance with a single A100 GPU, which comes with the Lambda Stack pre-installed if I’m not mistaken.
What I'm trying to do is some text pre-processing using spaCy's language processing pipeline (using nlp.pipe()) and then fine-tune a PyTorch model.
If I understand you correctly, I can ignore the warning message since I'm not doing multi-node training?
Sorry if this is too vague. Would it help if I provided more hardware info, for example the output of sudo lshw?

Hi Den,

In your case, you can definitely ignore this.
The Mellanox card doesn’t take any part in your workflow.
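
If the message is just noise in your logs, the warning itself names the MCA parameter that turns it off. Open MPI reads MCA parameters from environment variables prefixed with OMPI_MCA_, so a minimal sketch could look like the following. The helper name silence_openib_warning and its disable_openib option are hypothetical, made up for illustration, and I'm assuming the variables are set before any library in the script initializes MPI; I haven't verified this on your instance.

```python
import os


def silence_openib_warning(env=None, disable_openib=False):
    """Set Open MPI MCA parameters via OMPI_MCA_* environment variables.

    Hypothetical helper for illustration; not part of any library.
    """
    env = os.environ if env is None else env
    # Parameter named in the warning text: 0 turns the
    # "no preset parameters were found" message off.
    env["OMPI_MCA_btl_openib_warn_no_device_params_found"] = "0"
    if disable_openib:
        # Optional: exclude the openib BTL entirely ("^" means "everything except").
        # Assumption: you do not need OpenFabrics transport (single-node job).
        # This overwrites any existing OMPI_MCA_btl value.
        env["OMPI_MCA_btl"] = "^openib"
    return env


# Call this at the very top of the script, before importing anything
# that might initialize MPI.
silence_openib_warning()
```

This only hides the message; it doesn't change how Open MPI treats the card.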

Best,
Yanos

Thanks a lot for your help!
