Tensorflow binary not compiled to support AVX2 FMA

oden · April 14, 2018, 3:22pm

I'm getting the messages below when running Python code. Apparently, TensorFlow is not compiled to support AVX2 and FMA. I also get the messages below for CUDA. Currently my code only runs 20 seconds faster than on my HP.

2018-04-14 08:01:56.859478: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2018-04-14 08:01:56.960850: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:898] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2018-04-14 08:01:56.961120: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1344] Found device 0 with properties:
name: GeForce GTX 1070 with Max-Q Design major: 6 minor: 1 memoryClockRate(GHz): 1.2655
pciBusID: 0000:01:00.0
totalMemory: 7.92GiB freeMemory: 206.62MiB
2018-04-14 08:01:56.961151: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1423] Adding visible gpu devices: 0
2018-04-14 08:01:57.170502: I tensorflow/core/common_runtime/gpu/gpu_device.cc:911] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-04-14 08:01:57.170549: I tensorflow/core/common_runtime/gpu/gpu_device.cc:917] 0
2018-04-14 08:01:57.170557: I tensorflow/core/common_runtime/gpu/gpu_device.cc:930] 0: N
2018-04-14 08:01:57.170725: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1041] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 152 MB memory) → physical GPU (device: 0, name: GeForce GTX 1070 with Max-Q Design, pci bus id: 0000:01:00.0, compute capability: 6.1)
2018-04-14 08:01:57.171981: E tensorflow/stream_executor/cuda/cuda_driver.cc:936] failed to allocate 152.62M (160038912 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-04-14 08:01:57.172873: E tensorflow/stream_executor/cuda/cuda_driver.cc:936] failed to allocate 137.36M (144035072 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY

sabalaba · April 21, 2018, 6:35am

One thing I notice is that you are running out of memory. You can see that from the CUDA_ERROR_OUT_OF_MEMORY errors that you're getting:

2018-04-14 08:01:57.171981: E tensorflow/stream_executor/cuda/cuda_driver.cc:936] failed to allocate 152.62M (160038912 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-04-14 08:01:57.172873: E tensorflow/stream_executor/cuda/cuda_driver.cc:936] failed to allocate 137.36M (144035072 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY

Can you try to re-run your job with a smaller batch size?
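
It may also be worth looking at the device properties line in your log: it reports totalMemory: 7.92GiB but freeMemory: 206.62MiB, so very little of the card was free when TensorFlow started. One possible explanation (an assumption, the log does not confirm it) is that another process was already holding GPU memory. Here is a minimal sketch that parses that line and works out the free fraction. The helper name `parse_gpu_memory` is hypothetical, not part of TensorFlow:

```python
import re

# Line copied from the log in the original post.
LOG_LINE = "totalMemory: 7.92GiB freeMemory: 206.62MiB"

# Multipliers for the binary unit suffixes that appear in the log.
UNIT_BYTES = {"KiB": 2**10, "MiB": 2**20, "GiB": 2**30}


def parse_gpu_memory(line):
    """Return (total_bytes, free_bytes) parsed from a device properties line.

    Hypothetical helper for illustration only; it assumes both fields
    are present on the same line, as they are in the log above.
    """
    sizes = {}
    pattern = r"(totalMemory|freeMemory): ([\d.]+)([KMG]iB)"
    for key, value, unit in re.findall(pattern, line):
        sizes[key] = float(value) * UNIT_BYTES[unit]
    return sizes["totalMemory"], sizes["freeMemory"]


total, free = parse_gpu_memory(LOG_LINE)
free_fraction = free / total
print(f"free: {free / 2**20:.2f} MiB of {total / 2**30:.2f} GiB ({free_fraction:.1%})")
```

With the numbers from your log that comes out to roughly 2.5% free, which would explain why even the 152 MB allocation failed. If that is the cause, a smaller batch size alone may not be enough until the memory is freed up.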
