pytorch_lightning notes

License: CC 4.0 BY-SA

Debug

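1. Run all of the code once, quickly (fast_dev_run)

A run can train for a long time and only then crash during training or validation. To catch such errors early, use fast_dev_run to push just a batch or a few batches of training, validation, test, and prediction data through the Trainer:

from pytorch_lightning import Trainer
trainer = Trainer(fast_dev_run=True)  # True: a single batch of each, per the Trainer API reference
trainer = Trainer(fast_dev_run=7)  # any int: that many batches of each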

2. Shorten the epoch length (limit_xxx_batches)

Sometimes it helps to use only part of the training, validation, ... data. On large datasets such as ImageNet this is much faster than waiting for a complete epoch.

trainer = Trainer(limit_train_batches=0.1, limit_val_batches=0.01)  # float: 10% and 1% of the batches
trainer = Trainer(limit_train_batches=10, limit_val_batches=5)  # int: 10 batches and 5 batches
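The float/int distinction is easy to trip over (1.0 means "all batches", 1 means "one batch"), so the arithmetic can be sketched in plain Python. `effective_num_batches` is a hypothetical helper written for this note, not part of Lightning, and rounding the fraction down is an assumption of the sketch.

```python
def effective_num_batches(total_batches: int, limit) -> int:
    """Hypothetical helper: batches kept per epoch for a limit_*_batches value."""
    if isinstance(limit, bool):
        raise TypeError("limit must be an int or a float, not a bool")
    if isinstance(limit, float):
        # float: interpreted as a fraction of the available batches
        if not 0.0 <= limit <= 1.0:
            raise ValueError("a float limit must lie in [0.0, 1.0]")
        return int(total_batches * limit)  # assumption: round down
    if isinstance(limit, int):
        # int: interpreted as an absolute number of batches
        if limit < 0:
            raise ValueError("an int limit must be non-negative")
        return min(total_batches, limit)
    raise TypeError("limit must be an int or a float")


# A loader with 1000 batches per epoch:
print(effective_num_batches(1000, 0.1))   # fraction of the epoch
print(effective_num_batches(1000, 10))    # fixed number of batches
print(effective_num_batches(1000, 1.0))   # the whole epoch
print(effective_num_batches(1000, 1))     # a single batch
```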

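3. Print the input and output sizes of each layer (example_input_array)

import torch
from pytorch_lightning import LightningModule

class LitModel(LightningModule):
    def __init__(self, *args, **kwargs):
        super().__init__()  # required before assigning attributes on a module
        self.example_input_array = torch.rand(32, 1, 28, 28)  # sample batch used for the summary

The summary table will then also show the input and output dimensions of each layer: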

  | Name  | Type        | Params | Mode  | In sizes  | Out sizes
----------------------------------------------------------------------
0 | net   | Sequential  | 132 K  | train