[Medical Image Segmentation] nnU-Net: preparing a custom dataset + parallel training + problems encountered during training

version-1

import os
import shutil
import json

# nnU-Net v1 style dataset.json: "modality" maps channel index -> modality name,
# and "labels" maps label value -> name (version-2 below renames "modality" to
# "channel_names" and inverts the "labels" mapping)
dataset_json = {
    "name" : "HNC Segmentation",
    "description" : "HNC pre-radiological treatment Segmentation",
    "tensorImageSize" : "3D",
    "modality": {
        "0": "T2"
    },
    "labels" : {
        "0": "background",
        "1": "GTVp",
        "2": "GTVn"
    },
    "file_ending": ".nii.gz",
    "numTraining" : 0,
    "numTest" : 0,
    "training" : [],
    "validation": [],
    "test" : []
}



results_path = "/mnt/home/pc/Chenq_team/HNC_dataset/nnunetv1_dataset/nnUNet_raw_data/Task018_HNC/"
dir_path = "/mnt/home/pc/Chenq_team/HNC_dataset/HNTSMRG24_train"

os.makedirs(results_path, exist_ok=True)
os.makedirs(os.path.join(results_path, "imagesTr"), exist_ok=True)
os.makedirs(os.path.join(results_path, "labelsTr"), exist_ok=True)
os.makedirs(os.path.join(results_path, "imagesTs"), exist_ok=True)


for case_dir in os.listdir(dir_path):  # one folder per patient
    pre_rt_dir_path = os.path.join(dir_path, case_dir, "preRT")
    pid = int(case_dir)
    # initialize all three so the assert below cannot pass on values
    # left over from a previous iteration
    img_new_path, img_new_path_json, mask_new_path = None, None, None
    for file in os.listdir(pre_rt_dir_path):
        if "T2" in file:
            img_old_path = os.path.join(pre_rt_dir_path, file)
            # v1 names image files CASE_XXXX.nii.gz (XXXX = modality index) ...
            img_new_path = os.path.join(results_path, "imagesTr", f"preRT_{pid:03d}_0000.nii.gz")
            # ... but the dataset.json entry omits the modality suffix
            img_new_path_json = os.path.join(results_path, "imagesTr", f"preRT_{pid:03d}.nii.gz")
            shutil.copy(img_old_path, img_new_path)
        if "mask" in file:
            mask_old_path = os.path.join(pre_rt_dir_path, file)
            mask_new_path = os.path.join(results_path, "labelsTr", f"preRT_{pid:03d}.nii.gz")
            shutil.copy(mask_old_path, mask_new_path)
    assert img_new_path is not None and mask_new_path is not None, \
        f"{pre_rt_dir_path} does not contain a T2 or mask file"
    dataset_json["training"].append(
        {
            "image": img_new_path_json,
            "label": mask_new_path
        },
    )
    dataset_json["numTraining"] += 1

with open(os.path.join(results_path, "dataset.json"), "w") as f:
    json.dump(dataset_json, f, indent=4)

Common commands

nnUNet_plan_and_preprocess -t 18 --verify_dataset_integrity  # preprocess the dataset (18 = the ID of Task018_HNC created above)
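
Training itself uses nnUNet_train with configuration, trainer class, task and fold; a typical single-GPU call for the task above (fold 0 as an example) should look like:

nnUNet_train 3d_fullres nnUNetTrainerV2 Task018_HNC 0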

Using your own data split

Replace the splits_final.pkl file that nnU-Net generates in the preprocessed task folder, as sketched below.
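
A minimal sketch, assuming the standard v1 format for splits_final.pkl (a pickled list with one {'train': [...], 'val': [...]} dict per fold, holding case identifiers without the _0000 suffix) and using scikit-learn's KFold; the path and case IDs are illustrative:

import pickle
from collections import OrderedDict
from sklearn.model_selection import KFold

# hypothetical case identifiers: the imagesTr file names without _0000.nii.gz
all_cases = [f"preRT_{i:03d}" for i in range(1, 151)]

splits = []
for train_idx, val_idx in KFold(n_splits=5, shuffle=True, random_state=12345).split(all_cases):
    splits.append(OrderedDict(
        train=[all_cases[i] for i in train_idx],
        val=[all_cases[i] for i in val_idx],
    ))

# overwrite the file nnU-Net generated in nnUNet_preprocessed/Task018_HNC/
with open("/path/to/nnUNet_preprocessed/Task018_HNC/splits_final.pkl", "wb") as f:
    pickle.dump(splits, f)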

Parallel training with version-1

I did not find a dedicated nnU-Net command for this, but nnU-Net v1 ships a run_training_DDP.py script; the training command is below. MedNeXt does not provide a DDP trainer, so copy nnU-Net's DDP trainer and adapt it. Remember to replace BatchNorm with sync_batchnorm (see the sketch after the command block), and mind whether you are training in half precision or fp32.

export nnUNet_raw_data_base="/home/Guanjq/HNC_SEG_Data/nnunetv1_dataset/"  # replace with your own paths
export nnUNet_preprocessed="/home/Guanjq/HNC_SEG_Data/nnunetv1_dataset/nnUNet_preprocessed/"
export RESULTS_FOLDER="/home/Guanjq/HNC_SEG_Data/nnunetv1_dataset/nnUNet_results/"

export TORCH_DISTRIBUTED_DEBUG="DETAIL"  # more verbose DDP error messages
export fold=3
export CUDA_VISIBLE_DEVICES=0,1          # two GPUs, matching --nproc_per_node below
export WORLD_SIZE=2
python3 -m torch.distributed.launch \
    --master_port 23331 \
    --nproc_per_node=2 \
    /home/Guanjq/Work/MedNeXt/nnunet_mednext/run/run_training_DDP.py \
    -network 3d_fullres \
    -network_trainer nnUNetTrainerV2_MedNeXt_S_kernel3_DDP \
    -task Task018_HNC_pre \
    -fold ${fold} \
    -p /home/Guanjq/HNC_SEG_Data/nnunetv1_dataset/nnUNet_preprocessed/Task018_HNC_pre/nnUNetPlansv2.1 \
    --fp32   # train in full precision instead of mixed precision
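
The sync_batchnorm replacement mentioned above is plain PyTorch; a generic sketch (not MedNeXt-specific code; `model` is the network built by your copied trainer and `local_rank` is assumed to come from the launcher):

import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel

def make_ddp_model(model: nn.Module, local_rank: int) -> nn.Module:
    # swap every BatchNorm*d for SyncBatchNorm so batch statistics
    # are synchronized across all DDP processes
    model = nn.SyncBatchNorm.convert_sync_batchnorm(model)
    model = model.cuda(local_rank)
    # requires torch.distributed.init_process_group() to have been called
    return DistributedDataParallel(model, device_ids=[local_rank])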

version-2

import os
import shutil
import json

# nnU-Net v2 style dataset.json: "channel_names" replaces v1's "modality",
# and "labels" now maps name -> integer value (the inverse of v1)
dataset_json = {
    "name" : "HNC Segmentation",
    "description" : "HNC pre-radiological treatment Segmentation",
    "tensorImageSize" : "3D",
    "channel_names": {
        "0": "T2"
    },
    "labels" : {
        "background" : 0,
        "GTVp" : 1,
        "GTVn" : 2
    },
    "file_ending": ".nii.gz",
    "numTraining" : 0,
    "numTest" : 0,
    "training" : [],
    "validation": [],
    "test" : []
}



results_path = "/mnt/home/pc/Chenq_team/HNC_dataset/nnunet_dataset/raw/Dataset018_HNC/"
dir_path = "/mnt/home/pc/Chenq_team/HNC_dataset/HNTSMRG24_train"

os.makedirs(results_path, exist_ok=True)
os.makedirs(os.path.join(results_path, "imagesTr"), exist_ok=True)
os.makedirs(os.path.join(results_path, "labelsTr"), exist_ok=True)

for case_dir in os.listdir(dir_path):  # one folder per patient
    pre_rt_dir_path = os.path.join(dir_path, case_dir, "preRT")
    pid = int(case_dir)
    img_new_path, mask_new_path = None, None
    for file in os.listdir(pre_rt_dir_path):
        if "T2" in file:
            img_old_path = os.path.join(pre_rt_dir_path, file)
            # v2 keeps the _0000 channel suffix on the files in imagesTr
            img_new_path = os.path.join(results_path, "imagesTr", f"preRT_{pid:03d}_0000.nii.gz")
            shutil.copy(img_old_path, img_new_path)
        if "mask" in file:
            mask_old_path = os.path.join(pre_rt_dir_path, file)
            mask_new_path = os.path.join(results_path, "labelsTr", f"preRT_{pid:03d}.nii.gz")
            shutil.copy(mask_old_path, mask_new_path)
    assert img_new_path is not None and mask_new_path is not None, \
        f"{pre_rt_dir_path} does not contain a T2 or mask file"
    # v2 discovers cases by scanning imagesTr/, so the explicit "training"
    # list is not required; it is kept here only for bookkeeping
    dataset_json["training"].append(
        {
            "image": img_new_path,
            "label": mask_new_path
        },
    )
    dataset_json["numTraining"] += 1

with open(os.path.join(results_path, "dataset.json"), "w") as f:
    json.dump(dataset_json, f, indent=4)
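
For version-2 the commands change accordingly: the environment variables are renamed to nnUNet_raw, nnUNet_preprocessed and nnUNet_results, and nnUNetv2_train accepts a -num_gpus flag, so no separate DDP launcher is needed. A sketch with illustrative paths (dataset ID 18 matches Dataset018_HNC above; check the -h output of your installed version):

export nnUNet_raw="/mnt/home/pc/Chenq_team/HNC_dataset/nnunet_dataset/raw"
export nnUNet_preprocessed="/mnt/home/pc/Chenq_team/HNC_dataset/nnunet_dataset/preprocessed"
export nnUNet_results="/mnt/home/pc/Chenq_team/HNC_dataset/nnunet_dataset/results"

nnUNetv2_plan_and_preprocess -d 18 --verify_dataset_integrity
nnUNetv2_train 18 3d_fullres 0 -num_gpus 2   # fold 0 on two GPUs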

Problems encountered

  1. RuntimeError: One or more background workers are no longer alive. Exiting. Please check the print statements above for the actual error message

This came up while training with version-1, and even export nnUNet_n_proc_DA="1" did not fix it. Note that this RuntimeError only reports that a background data-augmentation worker died; the actual cause (commonly exhausted RAM or shared memory, or an exception inside an augmentation worker) is printed further up in the log.

Training a custom dataset with UNETR

Data preparation and preprocessing

To make a custom dataset usable with UNETR, the data needs some standard preprocessing. These steps keep the inputs consistent and compatible, which helps model performance.

- Standardization and normalization: scale all images so that intensity values fall into a fixed range (usually 0 to 1); this speeds up convergence and stabilizes training.
- Cropping and resampling: scans of varying size or resolution should be resampled to a uniform shape with isotropic voxel spacing, which reduces compute while keeping spatial features undistorted.
- Augmentation: random rotations, flips and similar geometric transforms increase sample diversity without invalidating the annotations, improving generalization.

```python
import numpy as np
from monai.data import CacheDataset, DataLoader
from monai.transforms import (
    Compose, LoadImaged, EnsureChannelFirstd, ScaleIntensityRanged,
    RandCropByPosNegLabeld, RandFlipd, RandRotated, ToTensord)

def prepare_data(image_files, segmentation_files):
    train_transforms = Compose([
        LoadImaged(keys=["image", "label"]),
        EnsureChannelFirstd(keys=["image", "label"]),
        ScaleIntensityRanged(
            keys="image", a_min=-175, a_max=250,
            b_min=0.0, b_max=1.0, clip=True),
        RandCropByPosNegLabeld(
            keys=["image", "label"], label_key="label",
            spatial_size=(96, 96, 96), pos=1, neg=1, num_samples=4),
        RandFlipd(keys=["image", "label"], prob=0.5, spatial_axis=[0]),
        RandRotated(keys=["image", "label"], range_x=np.pi/12, prob=0.5, keep_size=True),
        ToTensord(keys=["image", "label"])
    ])
    data_dicts = [{"image": img_path, "label": seg_path}
                  for img_path, seg_path in zip(image_files, segmentation_files)]
    # cache the deterministic transforms in RAM to speed up later epochs
    dataset = CacheDataset(data=data_dicts, transform=train_transforms,
                           cache_rate=1.0, num_workers=8)
    return DataLoader(dataset, batch_size=2, shuffle=True, num_workers=8)
```

Building the UNETR architecture

UNETR combines the representational power of a Transformer encoder with the spatial localization of a CNN decoder, which suits 3D medical image segmentation:

- the Transformer encoder captures global context and long-range dependencies;
- the decoder path consists of deconvolution layers that progressively restore the original resolution, with skip connections feeding high-level semantics into the low-level reconstruction.

```python
import torch
from monai.networks.nets import UNETR

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
num_classes = 3  # e.g. background + GTVp + GTVn for the dataset above

model = UNETR(
    in_channels=1,
    out_channels=num_classes,
    img_size=(96, 96, 96),   # must match the crop size above
    feature_size=16,
    hidden_size=768,
    mlp_dim=3072,
    num_heads=12,
    pos_embed='perceptron',  # renamed to proj_type in recent MONAI releases
    norm_name='instance',
    conv_block=True,
    res_block=True,
    dropout_rate=0.0
).to(device)
```

Training loop configuration

Set up the optimizer and loss function for the supervised stage. Given the nature of segmentation, Dice loss (or a similar overlap measure) is the usual choice for scoring predictions against ground-truth labels, and the best weights should be saved periodically for final evaluation and deployment.

```python
import os
import torch
from torch.optim import Adam
from torch.optim.lr_scheduler import ReduceLROnPlateau
from monai.losses import DiceLoss
from tqdm import trange

optimizer = Adam(model.parameters(), lr=1e-4, weight_decay=1e-5)
scheduler = ReduceLROnPlateau(optimizer, 'min', patience=2)
loss_function = DiceLoss(to_onehot_y=True, softmax=True)

best_metric = -1
for epoch in trange(num_epochs):  # num_epochs, train_loader, output_directory defined elsewhere
    model.train()
    epoch_loss = 0
    for step, batch_data in enumerate(train_loader):
        inputs = batch_data['image'].to(device)
        labels = batch_data['label'].to(device)  # integer label map; DiceLoss one-hots it
        optimizer.zero_grad(set_to_none=True)
        outputs = model(inputs)
        loss = loss_function(outputs, labels)
        loss.backward()
        optimizer.step()
        epoch_loss += loss.item()
    scheduler.step(epoch_loss / (step + 1))

    # validation_performance comes from a validation loop (see the sketch below)
    if validation_performance > best_metric:
        best_metric = validation_performance
        checkpoint = {
            "epoch": epoch + 1,
            "state_dict": model.state_dict(),
            "optim": optimizer.state_dict()}
        torch.save(checkpoint, os.path.join(output_directory, 'best_model.pth'))
```
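
The loop above assumes a validation_performance value; a minimal sketch of computing it with MONAI's sliding-window inference and Dice metric (model, device and num_classes from the blocks above; val_loader is assumed to exist) could look like:

```python
import torch
from monai.data import decollate_batch
from monai.inferers import sliding_window_inference
from monai.metrics import DiceMetric
from monai.transforms import AsDiscrete

dice_metric = DiceMetric(include_background=False, reduction="mean")
post_pred = AsDiscrete(argmax=True, to_onehot=num_classes)
post_label = AsDiscrete(to_onehot=num_classes)

model.eval()
with torch.no_grad():
    for val_data in val_loader:
        val_inputs = val_data["image"].to(device)
        val_labels = val_data["label"].to(device)
        # patch-wise inference at the training crop size
        val_outputs = sliding_window_inference(val_inputs, (96, 96, 96), 4, model)
        dice_metric(y_pred=[post_pred(o) for o in decollate_batch(val_outputs)],
                    y=[post_label(l) for l in decollate_batch(val_labels)])
validation_performance = dice_metric.aggregate().item()
dice_metric.reset()
```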