OneFlow CHANGELOG V0.3.2

Original article · Published 2020-12-18 16:43:04 · License: CC 4.0 BY-SA · Tag: deep learning

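OneFlow has released version 0.3.2, bringing performance optimizations and new features such as sub-linear memory optimization, CUDA 11.1 support, and dynamic loss scale scheduling. Several ops were fixed, new ops such as polyval and broadcast_like were added, and system components were optimized, including NCCL All2All support and Collective Boxing. Bugs in Eager mode were fixed as well.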

Changelog

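OneFlow has released version 0.3.2. Both this release and the earlier 0.3.1 are minor releases of the major version 0.3.0, so they are covered together here.
This release introduces a large number of performance optimizations, adds many new features, and is an early adopter of CUDA 11.1.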

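Main new features at a glance

Sub-linear memory optimization
Enable it by wrapping the relevant ops in flow.experimental.scope.config(checkpointing=self.checkpoint_activations), as in the snippet that follows; this substantially reduces memory usage. For example: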

def transformer_layer(self, name, x, *, past):
    # ...
    with flow.scope.namespace(name):
        x = flow.identity(x)
        with flow.experimental.scope.config(
            checkpointing=self.checkpoint_activations
        ):
            norm1 = norm(x, name="layernorm_1")
            # ...
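
To make the idea behind the checkpointing flag more concrete, here is a minimal pure-Python sketch of activation checkpointing on a chain of scalar functions. It is not OneFlow's implementation: the names forward_with_checkpoints, backward_with_recompute, LAYERS and the segment parameter are hypothetical, chosen only for illustration. The forward pass keeps the input of every segment-th layer, and the backward pass recomputes the missing activations one segment at a time.

```python
import math

# Each layer is a pair (f, df): the function and its derivative.
LAYERS = [
    (math.sin, math.cos),
    (lambda v: v * v, lambda v: 2.0 * v),
    (math.tanh, lambda v: 1.0 - math.tanh(v) ** 2),
    (lambda v: 3.0 * v, lambda v: 3.0),
] * 2  # 8 layers in total


def forward_with_checkpoints(layers, x, segment):
    """Run the chain, storing only the input of every `segment`-th layer."""
    saved = {}
    for i, (f, _) in enumerate(layers):
        if i % segment == 0:
            saved[i] = x  # checkpoint: the input to layer i
        x = f(x)
    return x, saved


def backward_with_recompute(layers, saved, segment, grad_out):
    """Backpropagate, recomputing each segment's inputs from its checkpoint."""
    grad = grad_out
    n = len(layers)
    for start in reversed(range(0, n, segment)):
        end = min(start + segment, n)
        # Recompute the inputs of layers start .. end-1 from the checkpoint.
        inputs = [saved[start]]
        for i in range(start, end - 1):
            inputs.append(layers[i][0](inputs[-1]))
        # Apply the chain rule through the segment, last layer first.
        for i in reversed(range(start, end)):
            grad *= layers[i][1](inputs[i - start])
    return grad


y, saved = forward_with_checkpoints(LAYERS, 0.5, segment=3)
dy_dx = backward_with_recompute(LAYERS, saved, 3, 1.0)
print(len(saved), "checkpoints kept for", len(LAYERS), "layers; dy/dx =", dy_dx)
```

With n layers and a segment length close to the square root of n, this scheme stores on the order of sqrt(n) activations instead of n, at the cost of roughly one extra forward pass. Whether OneFlow picks segments this way is not stated in the changelog.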

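New checkpoint
The new checkpoint is considerably more flexible. It supports partial loading and saving, reading the values of weights (for printing and similar uses), and assigning values to weights from numpy arrays.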

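import tempfile
import oneflow as flow

# refresh_session, get_checkpoint_ready_model, model_getter and dtype are
# helpers and fixtures that this excerpt does not show.
with tempfile.TemporaryDirectory() as save_dir:
    refresh_session()
    large1 = get_checkpoint_ready_model(model_getter, dtype)
    flow.checkpoint.save(save_dir)  # save the current variables to save_dir
    res1 = large1()
    refresh_session()
    large2 = get_checkpoint_ready_model(model_getter, dtype)
    vars_in_file = flow.checkpoint.get(save_dir)  # read the saved variables
    flow.load_variables(vars_in_file)  # load them into the new session
    res2 = large2()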

import numpy as np

refresh_session()
model = get_checkpoint_ready_model(get_add_and_reduce_mean_model, dtype)
var_x = flow.get_all_variables()["x"]
# Read the value of a weight as a numpy array
var_y_value_before_loading = flow.get_all_variables()["y"].numpy()
new_val_np = np.random.random(var_x.shape).astype(np.float32)
# Assign a numpy array to a single weight (partial load)
flow.load_variables({"x": new_val_np})
var_y_value_after_loading = flow.get_all_variables()["y"].numpy()
flow_res = model()
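
The partial save and load semantics described above can be illustrated without OneFlow. The following is a minimal sketch under stated assumptions: variables are plain Python lists held in a dict, each variable is written to its own JSON file, and the function names save_partial, read_saved and assign_partial are hypothetical, not part of the OneFlow API. It only shows why per-variable storage makes it easy to save or restore a subset.

```python
import json
import os
import tempfile


def save_partial(variables, save_dir, names=None):
    """Write each selected variable to its own file; default is all of them."""
    os.makedirs(save_dir, exist_ok=True)
    for name in (names if names is not None else list(variables)):
        with open(os.path.join(save_dir, name + ".json"), "w") as f:
            json.dump(variables[name], f)


def read_saved(save_dir):
    """Return a dict with every variable found in save_dir."""
    found = {}
    for fname in sorted(os.listdir(save_dir)):
        if fname.endswith(".json"):
            with open(os.path.join(save_dir, fname)) as f:
                found[fname[: -len(".json")]] = json.load(f)
    return found


def assign_partial(variables, values):
    """Overwrite only the variables named in `values`; leave the rest alone."""
    for name, value in values.items():
        if name not in variables:
            raise KeyError("unknown variable: " + name)
        variables[name] = value


trained = {"x": [1.0, 2.0], "y": [3.0]}
fresh = {"x": [0.0, 0.0], "y": [9.0]}
with tempfile.TemporaryDirectory() as save_dir:
    save_partial(trained, save_dir, names=["x"])  # save only "x"
    assign_partial(fresh, read_saved(save_dir))   # restore only what was saved
print(fresh)
```

After the partial restore, "x" in the fresh model takes the saved value while "y" keeps its own, which mirrors the single-variable assignment in the snippet above.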

Dynamic loss scale schedule
To enable it:

loss_scale_policy = flow.optimizer.loss_scale.dynamic_loss_scale(increment_period=2000)
optimizer = flow.optimizer.AdamW(..., loss_scale_policy=loss_scale_policy)
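
For readers unfamiliar with dynamic loss scaling, the sketch below shows the commonly used update rule in plain Python. It is an illustration, not OneFlow's code: the class DynamicLossScale, the initial_scale and multiplier defaults, and the lower bound of 1.0 are assumptions made for this example; only the increment_period name is taken from the snippet above. The scale shrinks whenever a gradient overflows, and grows after increment_period consecutive overflow-free steps.

```python
import math


class DynamicLossScale:
    """Hypothetical helper illustrating a dynamic loss scale schedule."""

    def __init__(self, initial_scale=2.0 ** 16, increment_period=2000,
                 multiplier=2.0):
        self.scale = initial_scale
        self.increment_period = increment_period
        self.multiplier = multiplier
        self.good_steps = 0  # consecutive steps without overflow

    def update(self, grads):
        """Adjust the scale; return True if the optimizer step should run."""
        if any(not math.isfinite(g) for g in grads):
            # Overflow: shrink the scale and skip this step.
            self.scale = max(self.scale / self.multiplier, 1.0)
            self.good_steps = 0
            return False
        self.good_steps += 1
        if self.good_steps >= self.increment_period:
            # A full period without overflow: try a larger scale.
            self.scale *= self.multiplier
            self.good_steps = 0
        return True


scaler = DynamicLossScale(initial_scale=8.0, increment_period=2)
for grads in ([1.0], [0.5], [float("inf")], [0.25]):
    applied = scaler.update(grads)
    print(grads, "applied" if applied else "skipped", "scale =", scaler.scale)
```

A larger increment_period makes the schedule more conservative, since the scale is raised less often.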

Support for the latest CUDA 11.1

It can be installed with the following command:

python3 -m pip install --find-links https://release.oneflow.info oneflow_cu111 --user
