CubeCL Project Tutorial

Latest recommended article published on 2025-05-05 14:44:56

苏承根

License: CC 4.0 BY-SA

Article link: https://blog.youkuaiyun.com/gitblog_00708/article/details/146905979

cubecl: Multi-platform high-performance compute language extension for Rust. Project URL: https://gitcode.com/gh_mirrors/cu/cubecl

1. Project Overview

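CubeCL is a multi-platform, high-performance compute language extension for Rust. It lets you write GPU code in Rust and relies on zero-cost abstractions to produce compute kernels that are maintainable, flexible, and efficient. CubeCL currently supports functions, generics, and structs, with partial support for traits, methods, and type inference. As the project evolves, it is expected to support more Rust language primitives while maintaining optimal performance.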

2. Quick Start

First, make sure you have installed Rust and the required dependencies. The following steps get a CubeCL project running quickly:

# Create a new Rust project with cargo
cargo new cubecl_project

# Enter the project directory
cd cubecl_project

# Add cubecl as a dependency (in Cargo.toml)
[dependencies]
cubecl = "0.1.0" # use the latest published version instead

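// Write your GPU compute functions
use cubecl::prelude::*;

// Launchable kernel: each unit handles one Line (a small vector of elements).
// The exact API varies between CubeCL versions; check the docs for yours.
#[cube(launch_unchecked)]
fn gelu_array<F: Float>(input: &Array<Line<F>>, output: &mut Array<Line<F>>) {
    // Guard against units that fall outside the array.
    if ABSOLUTE_POS < input.len() {
        output[ABSOLUTE_POS] = gelu_scalar(input[ABSOLUTE_POS]);
    }
}

// Helper called from the kernel; it is not launched directly.
#[cube]
fn gelu_scalar<F: Float>(x: Line<F>) -> Line<F> {
    // Evaluate the sqrt function at compile time.
    let sqrt2 = F::new(comptime!(2.0f32.sqrt()));
    let tmp = x / Line::new(sqrt2);
    x * (Line::erf(tmp) + 1.0) / 2.0
}

// Launch your compute kernel
pub fn launch<R: Runtime>(device: &R::Device) {
    let client = R::client(device);
    // A flat slice of four f32 values.
    let input = &[-1.0, 0.0, 1.0, 5.0];
    // Pack four scalars into each Line.
    let vectorization = 4;
    let output_handle = client.empty(input.len() * core::mem::size_of::<f32>());
    let input_handle = client.create(f32::as_bytes(input));

    unsafe {
        gelu_array::launch_unchecked::<f32, R>(
            &client,
            CubeCount::Static(1, 1, 1),
            // One unit per Line: 4 elements / 4 per Line = 1 unit.
            CubeDim::new(input.len() as u32 / vectorization, 1, 1),
            ArrayArg::from_raw_parts(&input_handle, input.len(), vectorization as u8),
            ArrayArg::from_raw_parts(&output_handle, input.len(), vectorization as u8),
        );
    }

    // Copy the result back to the host and reinterpret the bytes as f32.
    let bytes = client.read_one(output_handle.binding());
    let output = f32::from_bytes(&bytes);
    println!("Executed gelu with runtime {:?} => {:?}", R::name(), output);
}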
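
The launch above divides the input length by the vectorization factor to get the number of units, which only works cleanly when the length is a multiple of that factor. The following is a minimal sketch in plain Rust (no CubeCL dependency) of how one might choose a factor that divides the length. The helper names `pick_line_size` and `units_for`, and the candidate sizes 4, 2, and 1, are assumptions made for illustration and are not part of the CubeCL API.

```rust
/// Candidate line sizes, widest first. These values are an assumption for
/// illustration; what a given runtime actually supports may differ.
const CANDIDATE_LINE_SIZES: [usize; 3] = [4, 2, 1];

/// Pick the widest candidate line size that evenly divides `len`.
/// Hypothetical helper, not part of the CubeCL API.
fn pick_line_size(len: usize) -> usize {
    CANDIDATE_LINE_SIZES
        .iter()
        .copied()
        .find(|&v| len % v == 0)
        .unwrap_or(1)
}

/// Number of units to launch when each unit processes one line.
fn units_for(len: usize, line_size: usize) -> usize {
    len / line_size
}

fn main() {
    for &len in &[4usize, 6, 7] {
        let line_size = pick_line_size(len);
        println!(
            "len = {}, line size = {}, units = {}",
            len,
            line_size,
            units_for(len, line_size)
        );
    }
}
```

For the tutorial's four-element input this picks a line size of 4 and a single unit, matching the `CubeDim` passed to the launch.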

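# Build and run with the command below.
# Note: choose the runtime feature that matches your GPU hardware, e.g. cuda or wgpu.
# The --example gelu target assumes an example named gelu exists, as it does in the CubeCL repository.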
cargo run --example gelu --features cuda
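
To sanity-check the values the kernel prints, it can help to compute GELU on the CPU with the same formula, x * (erf(x / sqrt(2)) + 1) / 2. The sketch below is plain Rust with no CubeCL dependency. Because the Rust standard library has no `erf`, it uses the Abramowitz and Stegun polynomial approximation (formula 7.1.26), so results agree with the exact function only to about six or seven decimal places. The function names `erf_approx` and `gelu_reference` are hypothetical and chosen for this example.

```rust
/// Approximate erf(x) with Abramowitz and Stegun formula 7.1.26.
/// The maximum absolute error is roughly 1.5e-7.
fn erf_approx(x: f64) -> f64 {
    // erf is odd, so evaluate on |x| and restore the sign afterwards.
    let sign = if x < 0.0 { -1.0 } else { 1.0 };
    let x = x.abs();
    let t = 1.0 / (1.0 + 0.3275911 * x);
    // Horner evaluation of the degree-5 polynomial in t.
    let poly = ((((1.061405429 * t - 1.453152027) * t + 1.421413741) * t
        - 0.284496736)
        * t
        + 0.254829592)
        * t;
    sign * (1.0 - poly * (-x * x).exp())
}

/// CPU reference for the erf-based GELU used by the kernel.
fn gelu_reference(x: f64) -> f64 {
    x * (erf_approx(x / std::f64::consts::SQRT_2) + 1.0) / 2.0
}

fn main() {
    // Same inputs as the tutorial's launch function.
    for &x in &[-1.0f64, 0.0, 1.0, 5.0] {
        println!("gelu({:.1}) = {:.4}", x, gelu_reference(x));
    }
}
```

Rounded to four decimals, the reference values for the tutorial's inputs are -0.1587, 0.0000, 0.8413, and 5.0000, so GPU output close to these suggests the kernel ran correctly.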