CUDA no longer works after installing the new Ubuntu update

Hi, I recently updated to the latest Ubuntu version. Since then, CUDA no longer works.

When I compile with nvcc, I get this message:

nvcc warning : The ‘compute_20’, ‘sm_20’, and ‘sm_21’ architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
In file included from /usr/local/cuda/bin/…//include/cuda_runtime.h:78:0,
from :0:
/usr/local/cuda/bin/…//include/host_config.h:119:2: error: #error -- unsupported GNU version! gcc versions later than 5 are not supported!
#error -- unsupported GNU version! gcc versions later than 5 are not supported!
^~~~~

Could you please help me reinstall CUDA?

Thank you,
–Binh Pham

Hey Binh,

You can try Lambda Stack, an AI software stack that's always up to date.

It will install TensorFlow with GPU support, CUDA, drivers, etc. Let us know if this one-liner works:

LAMBDA_REPO=$(mktemp) && \
wget -O${LAMBDA_REPO} https://lambdalabs.com/static/misc/lambda-stack-repo.deb && \
sudo dpkg -i ${LAMBDA_REPO} && rm -f ${LAMBDA_REPO} && \
sudo apt-get update && sudo apt-get install -y lambda-stack-cuda

Hi,
The command worked after I added --allow-downgrades. However, when I run a CUDA test program, it gives the wrong result. This code worked before the update. Below are my test code and its output.

CODE:
#include <stdio.h>

// Kernel: each thread cubes one element of the input array
__global__ void cube(float * d_out, float * d_in){
    // Todo: Fill in this function
    int idx = threadIdx.x;
    float f = d_in[idx];
    d_out[idx] = f * f * f;
}

int main(int argc, char ** argv) {
    const int ARRAY_SIZE = 96;
    const int ARRAY_BYTES = ARRAY_SIZE * sizeof(float);

    // generate the input array on the host
    float h_in[ARRAY_SIZE];
    for (int i = 0; i < ARRAY_SIZE; i++) {
        h_in[i] = float(i);
    }
    float h_out[ARRAY_SIZE];

    // declare GPU memory pointers
    float * d_in;
    float * d_out;

    // allocate GPU memory
    cudaMalloc((void**) &d_in, ARRAY_BYTES);
    cudaMalloc((void**) &d_out, ARRAY_BYTES);

    // transfer the array to the GPU
    cudaMemcpy(d_in, h_in, ARRAY_BYTES, cudaMemcpyHostToDevice);

    // launch the kernel: one block of ARRAY_SIZE threads
    cube<<<1, ARRAY_SIZE>>>(d_out, d_in);

    // copy back the result array to the CPU
    cudaMemcpy(h_out, d_out, ARRAY_BYTES, cudaMemcpyDeviceToHost);

    // print out the resulting array, four values per line
    for (int i = 0; i < ARRAY_SIZE; i++) {
        printf("%f", h_out[i]);
        printf(((i % 4) != 3) ? "\t" : "\n");
    }

    cudaFree(d_in);
    cudaFree(d_out);

    return 0;
}

OUTPUT:
64856.226562 -3442308913561600.000000 -0.040856 -8858266431913984.000000
0.000000 0.000000 0.000000 0.000000
0.000000 0.000000 -0.000000 0.000000
0.000000 0.000000 -nan -nan
0.000000 0.000000 0.000000 0.000000
0.000000 0.000000 0.000000 0.000000
0.000000 0.000000 0.000000 0.000000
0.000000 0.000000 -49057123615670591891177472.000000 0.000005
-0.000000 0.000000 -344859.468750 0.000000
0.000000 0.000000 0.000000 0.000000
0.000000 0.000000 -49057123615670591891177472.000000 0.000005
-0.000000 0.000000 -702221.000000 0.000000
-6630776053649813224172895258552565760.000000 0.000000 -702323.312500 0.000000
-1446412.000000 0.000000 0.000000 0.000000
-0.000000 0.000000 -702231.437500 0.000000
-0.000000 0.000000 -0.000000 0.000000
-1446958.000000 0.000000 -0.000000 0.000000
-0.000000 0.000000 -427953.250000 0.000000
-0.000000 0.000000 -307492.281250 0.000000
0.000000 0.000000 0.000000 0.000000
-0.000000 0.000000 -0.000000 0.000000
0.000000 0.000000 -0.000000 0.000000
-1172276.000000 0.000000 0.000000 0.000000
-0.000000 0.000000 -0.000000 0.000000

Please help,

Thanks Sabalaba,
–Binh Pham

What is the expected output? What did you get on the old version? It seems odd that it would be different.