CUDA build does not correctly find cuda location
tl;dr we need to change how the build script detects cuda's location OR tell people to ensure that CUDATOOLKIT_HOME is set correctly (which doesn't seem to be a default env var for cuda)
Several builds on the campus cluster were failing when building with the cuda option. I had loaded cuda using:
module load cuda
but the build was still failing. I noticed the following line at the top of the build's output:
[mprobson@taubh2 charm]$ ./build charm++ netlrts-linux-x86_64 cuda smp
checking for CUDA toolkit directory
CUDA_DIR=/usr/local/cuda/
With some grep handiwork:
[mprobson@taubh2 charm]$ grep -rn "checking for CUDA toolkit directory" *
grep: VERSION: No such file or directory
build:451: echo "checking for CUDA toolkit directory"
grep: include: No such file or directory
And on line 451 of build:
451 echo "checking for CUDA toolkit directory"
452 CUDA_CANDIDATE_DIRS="$CUDATOOLKIT_HOME /usr/local/cuda /usr/lib/nvidia-cuda-toolkit"
Each of those directories is checked for existence; if one exists, CUDA_DIR is set to it. The problem on the campus cluster is that each version of cuda has its own subdirectory inside /usr/local/cuda/, e.g. /usr/local/cuda/6.5. So /usr/local/cuda exists and gets picked, even though it is not itself a toolkit root. This causes the build script to misidentify cuda's location and the build to break. The current workaround is to set the non-standard CUDATOOLKIT_HOME env var and then build. I'm ultimately not sure whether we need to change that line in build or should just make sure users/vendors set that variable.
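One possible direction for the "change the build script" option, as a rough sketch only: accept a candidate only if it actually contains bin/nvcc, and also consider the directory of any nvcc already on PATH. The function name find_cuda_dir is hypothetical and is not what build currently does. The sketch assumes a standard toolkit layout (<root>/bin/nvcc) and that `module load cuda` puts nvcc on PATH, which may not hold on every system.

```shell
#!/bin/sh
# Hypothetical helper (not the code currently in build): print a CUDA root
# that really contains bin/nvcc, instead of any directory that merely exists.
find_cuda_dir() {
    # Assumption: if nvcc is on PATH, it lives at <root>/bin/nvcc.
    nvcc_path=$(command -v nvcc 2>/dev/null)
    nvcc_root=""
    if [ -n "$nvcc_path" ]; then
        # Strip the trailing /bin/nvcc to get the toolkit root.
        nvcc_root=$(dirname "$(dirname "$nvcc_path")")
    fi

    # Explicit override first, then the nvcc-derived root, then the defaults
    # listed in CUDA_CANDIDATE_DIRS.
    for dir in "$CUDATOOLKIT_HOME" "$nvcc_root" \
               /usr/local/cuda /usr/lib/nvidia-cuda-toolkit; do
        if [ -n "$dir" ] && [ -x "$dir/bin/nvcc" ]; then
            echo "$dir"
            return 0
        fi
    done
    return 1   # nothing usable found
}
```

With a layout like the campus cluster's, /usr/local/cuda would be skipped because it has no bin/nvcc directly underneath it, and the versioned subdirectory would be found through PATH or through CUDATOOLKIT_HOME.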
#10 Updated by Michael Robson over 1 year ago
Sam White wrote:
What does that documentation patch have to do with this issue?
From the Description:
tl;dr we need to change how the build script detects cuda's location OR tell people to ensure that CUDATOOLKIT_HOME is set correctly
So this is my first-cut fix, i.e. documenting the problem in the manual. I plan to do the other half of that "or" as well. One suggestion I'm currently working on is making the build fail more obviously when it doesn't find cuda.
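A minimal sketch of what "fail more obviously" could look like, assuming the check is simply whether nvcc exists under the detected directory. The function name check_cuda_dir and the message wording are hypothetical, not taken from build.

```shell
#!/bin/sh
# Hypothetical check (not the code currently in build): report a clear error
# when the detected CUDA directory does not contain an nvcc executable.
check_cuda_dir() {
    cuda_dir=$1
    if [ ! -x "$cuda_dir/bin/nvcc" ]; then
        # Send diagnostics to stderr so they stand out from normal build output.
        echo "Error: no nvcc found under CUDA_DIR=$cuda_dir" >&2
        echo "Set CUDATOOLKIT_HOME to your CUDA install directory and rebuild." >&2
        return 1
    fi
    return 0
}
```

The build script could then call something like `check_cuda_dir "$CUDA_DIR" || exit 1` right after detection, so the failure shows up at the top of the output instead of later during compilation.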