CUTLASS Compilation
Follow the official tutorial to build and compile CUTLASS.
Building
Run the following commands from the root of the CUTLASS repository to build it:
$ export CUDACXX=${CUDA_INSTALL_PATH}/bin/nvcc
$ mkdir build && cd build
$ cmake .. -DCUTLASS_NVCC_ARCHS=${gencode} # compiles for the architecture in ${gencode}, e.g. 80 for NVIDIA Ampere or 90a for NVIDIA Hopper
How to get $CUDA_INSTALL_PATH?
CUDA is usually installed at /usr/local/cuda or at a versioned path such as /usr/local/cuda-11.7/, depending on your CUDA version. If it is not there, run echo $LD_LIBRARY_PATH: the output normally contains an entry such as /usr/local/cuda/lib64, and that entry without the trailing /lib64 is the CUDA install location.
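If nvcc is already on your PATH, the install location can also be derived from the compiler's own path. The following is a minimal sketch; the helper name cuda_prefix_from_nvcc is hypothetical, and it assumes the usual layout in which nvcc lives under <prefix>/bin/nvcc.

```shell
#!/usr/bin/env bash
# Hypothetical helper: strip the trailing /bin/nvcc from a compiler path.
# Assumes the conventional layout <prefix>/bin/nvcc.
cuda_prefix_from_nvcc() {
  nvcc_path="$1"
  dirname "$(dirname "$nvcc_path")"
}

# Only query the system when nvcc is actually on PATH.
if command -v nvcc >/dev/null 2>&1; then
  CUDA_INSTALL_PATH="$(cuda_prefix_from_nvcc "$(command -v nvcc)")"
  export CUDA_INSTALL_PATH
  echo "CUDA_INSTALL_PATH=${CUDA_INSTALL_PATH}"
fi
```

For example, cuda_prefix_from_nvcc /usr/local/cuda-11.7/bin/nvcc prints /usr/local/cuda-11.7.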
How to get $gencode?
$gencode is determined by your GPU architecture. Match it to your GPU according to this map. Other references:
- How to get the GPU architecture?
- GPU Architecture Compatibility Guide
- How to find out which NVIDIA GPU I have
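As a complement to the references above, the value can often be read directly from the driver. This is a sketch under two assumptions: that your nvidia-smi is recent enough to support the compute_cap query field (older drivers may not), and that the first GPU listed is the one you are building for. The helper name cc_to_arch is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: turn a compute capability such as "8.0"
# into the architecture number "80" by dropping the dot and spaces.
cc_to_arch() {
  printf '%s\n' "$1" | tr -d '. '
}

# Only query the driver when nvidia-smi is available.
if command -v nvidia-smi >/dev/null 2>&1; then
  # Take the first GPU reported by the driver.
  cc="$(nvidia-smi --query-gpu=compute_cap --format=csv,noheader | head -n 1)"
  gencode="$(cc_to_arch "$cc")"
  echo "gencode=${gencode}"
fi
```

On an Ampere A100, whose compute capability is 8.0, this yields gencode=80, which matches the compute_80 used in the nvcc commands in this post.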
Using CUTLASS in your CUDA program
Applications should list the CUTLASS include/ directory within their include paths, and they must be compiled as C++17 or later. As an example, suppose we want to use CUTLASS in the following test.cu program:
#include <iostream>
#include <cutlass/cutlass.h>
#include <cutlass/numeric_types.h>
#include <cutlass/core_io.h>
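int main() {
    // _hf is the CUTLASS user-defined literal for half-precision values
    cutlass::half_t x = 2.25_hf;
    // core_io.h provides the operator<< overload used here
    std::cout << x << std::endl;
    return 0;
}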
Compile the program with the CUTLASS include/ directory on the include path. For example, if your project layout is as follows:
~/cutlass
~/test/test.cu
Then the compilation command at ~/test/ is:
nvcc -I../cutlass/include -gencode=arch=compute_80,code=compute_80 -std=c++17 test.cu -o main
Note that you have to specify the gencode. Some library features are only available on GPU architectures from the last few years, so the target architecture needs to be stated explicitly. If it is omitted, nvcc falls back to its default target, which depends on the CUDA toolkit version and may be older than your GPU. In that case nvcc may not recognize half-precision support.
If you want to use cutlass utilities, then make sure tools/util/include is listed as an include path:
nvcc -I../cutlass/include -I../cutlass/tools/util/include -gencode=arch=compute_80,code=compute_80 -std=c++17 test.cu -o main
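To avoid retyping the include paths and the gencode for every program, the command can be assembled from a few parameters. The sketch below only prints the command instead of running it. The helper name cutlass_nvcc_cmd is hypothetical, and it assumes the standard repository layout in which both include/ and tools/util/include sit at the top level of the CUTLASS checkout.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the nvcc command line for a CUTLASS program.
#   $1: path to the CUTLASS checkout
#   $2: architecture number (e.g. 80)
#   $3: source file
#   $4: output binary
cutlass_nvcc_cmd() {
  cutlass_dir="$1"
  arch="$2"
  src="$3"
  out="$4"
  echo "nvcc -I${cutlass_dir}/include -I${cutlass_dir}/tools/util/include" \
       "-gencode=arch=compute_${arch},code=compute_${arch}" \
       "-std=c++17 ${src} -o ${out}"
}

# Print the command for the layout used in this post.
cutlass_nvcc_cmd ../cutlass 80 test.cu main
```

Run from ~/test/, the printed line can be reviewed and then executed, for example with eval "$(cutlass_nvcc_cmd ../cutlass 80 test.cu main)".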
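This post explains how to build and compile the CUTLASS library in a CUDA project, including how to set CUDA_INSTALL_PATH and gencode, and how to include and compile CUTLASS correctly in a C++ program. It places particular emphasis on specifying the GPU architecture.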




