
Can't compile with USE_CUDA=1 and ENABLE_DEEPKS=1 simultaneously #5910

Open
xuan112358 opened this issue Feb 19, 2025 · 1 comment
Labels: DeePKS (issues related to DeePKS), GPU & DCU & HPC (GPU, DCU, and HPC related issues)

Comments

@xuan112358
Collaborator

Describe the bug

I can compile ABACUS with either USE_CUDA=1 or ENABLE_DEEPKS=1 on its own,
but not with both enabled simultaneously.
The CMake error is:

CMake Error at cmake/FindMKL.cmake:87 (add_library):
add_library cannot create ALIAS target "IntelMKL::MKL" because another
target with the same name already exists.
Call Stack (most recent call first):
/home/xuan/03_library/libtorch-2.3.1/share/cmake/Caffe2/public/mkl.cmake:1 (find_package)
/home/xuan/03_library/libtorch-2.3.1/share/cmake/Caffe2/Caffe2Config.cmake:113 (include)
/home/xuan/03_library/libtorch-2.3.1/share/cmake/Torch/TorchConfig.cmake:68 (find_package)
CMakeLists.txt:499 (find_package)

-- Found Torch: /home/xuan/03_library/libtorch-2.3.1/lib/libtorch.so
-- Checking for one of the modules 'libxc'
-- Found Libxc: /home/xuan/03_library/libxc/libxc-5.2.3/lib/libxc.a
-- Found Libxc: version 5.2.3
-- Configuring incomplete, errors occurred!
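
For context, the clash comes from two places defining the same MKL alias target: ABACUS's own cmake/FindMKL.cmake and the find_package(MKL) call made inside libtorch's Caffe2 mkl.cmake (see the call stack above). A minimal sketch of the kind of guard that avoids a duplicate ALIAS target in a find module; MKL::MKL here is a placeholder for whatever real target the module builds, not the actual contents of FindMKL.cmake:

    # Sketch: only create the alias if an earlier find_package(MKL) has not done so already.
    if(NOT TARGET IntelMKL::MKL)
      add_library(IntelMKL::MKL ALIAS MKL::MKL)   # MKL::MKL is a placeholder for the real MKL target
    endif()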

I used the 2022 oneAPI toolkit. If I switch to the 2024 version, there seems to be a mismatch between the compiler and the CUDA toolkit, and the error becomes "Could not find librt library, needed by CUDA::cudart_static".
@dyzheng @caic99 @dzzz2001 Can you help me?
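
As a side note on the librt message: CMake's FindCUDAToolkit needs a system librt when it sets up CUDA::cudart_static, and the oneAPI 2024 compiler environment may not expose the directory containing it in its implicit search paths. A hedged sketch of how one might check for the library and hand its directory to CMake; the Ubuntu path below is an assumption, adjust it to your system:

    # Locate librt, then add its directory to CMake's library search path (assumed Ubuntu layout).
    find /usr/lib -name 'librt.so*' 2>/dev/null
    cmake -B build -DUSE_CUDA=1 -DENABLE_DEEPKS=1 \
          -DCMAKE_LIBRARY_PATH=/usr/lib/x86_64-linux-gnu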

Expected behavior

No response

To Reproduce

No response

Environment

No response

Additional Context

No response

Task list for Issue attackers (only for developers)

  • Verify the issue is not a duplicate.
  • Describe the bug.
  • Steps to reproduce.
  • Expected behavior.
  • Error message.
  • Environment details.
  • Additional context.
  • Assign a priority level (low, medium, high, urgent).
  • Assign the issue to a team member.
  • Label the issue with relevant tags.
  • Identify possible related issues.
  • Create a unit test or automated test to reproduce the bug (if applicable).
  • Fix the bug.
  • Test the fix.
  • Update documentation (if necessary).
  • Close the issue and inform the reporter (if applicable).
@mohanchen added the DeePKS and GPU & DCU & HPC labels on Feb 20, 2025
@AsTonyshment
Collaborator

Actually, compiling with GCC works fine. It seems that Intel oneAPI does not support CUDA builds very well out of the box:

~/abacus-develop (develop) $ gcc -v       
......
gcc version 12.3.0 (Ubuntu 12.3.0-1ubuntu1~22.04) 

~/abacus-develop (develop) $ cmake -B build -DELPA_INCLUDE_DIR=~/Softwares/elpa-2024.05.001/elpa -DELPA_LIBRARIES=~/Softwares/elpa-2024.05.001/lib/libelpa_openmp.so -DCMAKE_PREFIX_PATH=~/Softwares/elpa-2024.05.001/lib -DENABLE_LIBXC=1 -DUSE_CUDA=1 -DENABLE_DEEPKS=1 -DTorch_DIR=~/Softwares/libtorch/share/cmake/Torch/ -Dlibnpy_INCLUDE_DIR=~/Softwares/libnpy/include
-- The CXX compiler identification is GNU 12.3.0
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Found Git: /usr/bin/git (found version "2.34.1") 
-- Found git: attempting to get commit info...
-- Current commit hash: e7b5c1257
-- Last commit date: Wed Feb 19 17:34:48 2025 +0800
-- Found Cereal: /usr/include  
-- Found PkgConfig: /usr/bin/pkg-config (found version "0.29.2") 
-- Found ELPA: ~/Softwares/elpa-2024.05.001/lib/libelpa_openmp.so  
-- Performing Test ELPA_VERSION_SATISFIES
-- Performing Test ELPA_VERSION_SATISFIES - Success
-- Found MPI_CXX: /usr/lib/x86_64-linux-gnu/openmpi/lib/libmpi_cxx.so (found version "3.1") 
-- Found MPI: TRUE (found version "3.1")  
-- Looking for C++ include pthread.h
-- Looking for C++ include pthread.h - found
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE  
-- Found OpenMP_CXX: -fopenmp (found version "4.5") 
-- Found OpenMP: TRUE (found version "4.5")  
-- Looking for a CUDA compiler
-- Looking for a CUDA compiler - /usr/local/cuda-12.1/bin/nvcc
-- Found CUDAToolkit: /usr/local/cuda-12.1/include (found version "12.1.66") 
-- The CUDA compiler identification is NVIDIA 12.1.66
-- Detecting CUDA compiler ABI info
-- Detecting CUDA compiler ABI info - done
-- Check for working CUDA compiler: /usr/local/cuda-12.1/bin/nvcc - skipped
-- Detecting CUDA compile features
-- Detecting CUDA compile features - done
-- Found FFTW3: /usr/lib/x86_64-linux-gnu/libfftw3_omp.so  
-- Looking for sgemm_
-- Looking for sgemm_ - not found
-- Looking for sgemm_
-- Looking for sgemm_ - found
-- Found BLAS: /usr/lib/x86_64-linux-gnu/libopenblas.so  
-- Looking for cheev_
-- Looking for cheev_ - found
-- Found LAPACK: /usr/lib/x86_64-linux-gnu/libopenblas.so;-lm;-ldl  
-- Found ScaLAPACK: /usr/lib/x86_64-linux-gnu/libscalapack-openmpi.so  
-- Could NOT find MKL (missing: MKL_DIR)
-- Found MKL_SCALAPACK: MKL_SCALAPACK-NOTFOUND
-- Found Torch: ~/Softwares/libtorch/lib/libtorch.so  
-- Checking for one of the modules 'libxc'
-- Found Libxc: ~/Softwares/libxc-7.0.0-install/lib/libxc.a  
-- Found Libxc: version 7.0.0
-- Configuring done
-- Generating done
-- Build files have been written to: ~/abacus-develop/build

~/abacus-develop (develop) $ cmake --build build -j32
......
[100%] Building CXX object CMakeFiles/abacus.dir/source/main.cpp.o
[100%] Linking CXX executable abacus
[100%] Built target abacus
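
If the oneAPI environment is still sourced in the same shell, one way to make sure CMake picks up the GNU toolchain instead of icpx is to name the compilers explicitly; a sketch (keep the other -D options from the configure command above):

    CC=gcc CXX=g++ cmake -B build -DUSE_CUDA=1 -DENABLE_DEEPKS=1 \
        -DCMAKE_CUDA_HOST_COMPILER=g++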
