IU X-Ray Dataset

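This post introduces the IU chest X-ray dataset, which contains 7,470 image-report pairs; each report includes sections such as impression and findings. It also explains how to obtain the dataset: through the paper and the links listed in the post, with the de-identified collection available for searching and downloading from the National Library of Medicine. It closes with the URL of a paper that uses the dataset.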


1) Dataset source: https://iuhealth.org/find-medical-services/x-rays (IU, Indiana University Health; this site does not provide the dataset directly)

Dataset overview:

The Indiana University Chest X-Ray Collection (IU X-Ray) is a set of chest X-ray images paired with their corresponding diagnostic reports. The dataset contains 7,470 pairs of images and reports (split 6,470:500:500, presumably training, validation, and test).
Each report consists of the following sections: impression, findings, tags, comparison, and indication. On average, each image is associated with 2.2 tags and 5.7 sentences, and each sentence contains 6.5 words.
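To make the report structure concrete, here is a minimal sketch that reads the sections of one report and counts sentences and words per sentence. The XML layout used here (`AbstractText` elements with a `Label` attribute, plus `tag` elements) and the sample text are assumptions for illustration only; check the element names against the files you actually download.

```python
import xml.etree.ElementTree as ET

# Hypothetical report fragment; element names and text are illustrative assumptions.
SAMPLE_REPORT = """<report>
  <AbstractText Label="COMPARISON">None.</AbstractText>
  <AbstractText Label="INDICATION">Chest pain.</AbstractText>
  <AbstractText Label="FINDINGS">The heart is normal in size. The lungs are clear.</AbstractText>
  <AbstractText Label="IMPRESSION">No acute disease.</AbstractText>
  <tag>normal</tag>
</report>"""


def parse_report(xml_text):
    """Return a dict mapping lower-cased section names to text, plus a 'tags' list."""
    root = ET.fromstring(xml_text)
    sections = {}
    for node in root.iter("AbstractText"):
        label = (node.get("Label") or "").lower()
        sections[label] = (node.text or "").strip()
    sections["tags"] = [t.text.strip() for t in root.iter("tag") if t.text]
    return sections


def sentence_stats(text):
    """Return (number of sentences, average words per sentence) using a naive '.' split."""
    sentences = [s.strip() for s in text.split(".") if s.strip()]
    words = sum(len(s.split()) for s in sentences)
    return len(sentences), (words / len(sentences) if sentences else 0.0)


report = parse_report(SAMPLE_REPORT)
n_sent, avg_words = sentence_stats(report["findings"])
print(report["impression"], n_sent, avg_words)
```

Averaging these per-report counts over the whole collection is one way to check figures such as the 5.7 sentences and 6.5 words per sentence quoted above, although a naive split on periods will miscount abbreviations and decimals.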

Besides, we find that the top 1,000 words cover 99.0% of word occurrences in the dataset; therefore, we include only the top 1,000 words in the dictionary.
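The coverage figure can be checked with a simple frequency count. The sketch below builds a top-k vocabulary and reports the fraction of word occurrences it covers; the whitespace tokenization, the lower-casing, and the toy sentences are assumptions, not the procedure used in the quoted passage.

```python
from collections import Counter


def vocab_coverage(sentences, top_k):
    """Return (top_k most frequent words, fraction of all word occurrences they cover)."""
    counts = Counter(word for s in sentences for word in s.lower().split())
    total = sum(counts.values())
    top = counts.most_common(top_k)
    covered = sum(count for _, count in top)
    vocab = [word for word, _ in top]
    return vocab, (covered / total if total else 0.0)


# Toy corpus standing in for the report sentences (illustrative only).
toy = ["the lungs are clear", "the heart is normal", "the lungs are normal"]
vocab, coverage = vocab_coverage(toy, top_k=3)
print(vocab, round(coverage, 3))
```

On the real reports, calling `vocab_coverage(all_sentences, top_k=1000)` would be the analogous check; words outside the returned vocabulary are typically mapped to a single unknown token.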

 

2) Obtaining the dataset:

Paper: Preparing a collection of radiology examinations for distribution and retrieval

Link: https://scholarworks.iupui.edu/bitstream/handle/1805/13649/ocv080.pdf?sequence=1&isAllowed=y

This paper presents an approach to developing a collection of radiology examinations, including both the images and radiologist narrative reports, and making them publicly available in a searchable database.
The authors collected 3996 radiology reports from the Indiana Network for Patient Care and 8121 associated images from the hospitals’ picture archiving systems.

The de-identified Indiana chest X-ray collection is available for searching and downloading from the National Library of Medicine (http://openi.nlm.nih.gov/).

Dataset: https://openi.nlm.nih.gov/gridquery.php?q=Indiana%20chest%20X-ray%20collection&it=xg
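Once the image-report pairs are downloaded, a 6470:500:500 partition like the one mentioned in the overview can be reproduced in size with a seeded shuffle. This is only a sketch: it assumes the three numbers are the training, validation, and test sizes, the `pair_ids` values are hypothetical placeholders, and the post does not say which pairs belong to which subset, so this will not match any published split.

```python
import random


def split_pairs(pair_ids, n_val=500, n_test=500, seed=0):
    """Shuffle the IDs with a fixed seed and return (train, val, test) lists."""
    ids = list(pair_ids)
    if len(ids) < n_val + n_test:
        raise ValueError("not enough pairs for the requested validation/test sizes")
    random.Random(seed).shuffle(ids)
    val = ids[:n_val]
    test = ids[n_val:n_val + n_test]
    train = ids[n_val + n_test:]
    return train, val, test


# Hypothetical identifiers standing in for the 7,470 image-report pairs.
pair_ids = [f"pair_{i:04d}" for i in range(7470)]
train, val, test = split_pairs(pair_ids)
print(len(train), len(val), len(test))
```

Because several images can belong to the same report, splitting at the report level rather than the image level is one way to avoid the same report text appearing in both the training and test subsets.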

 

 

3) A paper that uses this dataset:

On the Automatic Generation of Medical Imaging Reports

Paper URL: https://arxiv.org/pdf/1711.08195.pdf

 

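On Windows, classifying a chest X-ray dataset with Python and ResNet (a deep convolutional neural network commonly used for image recognition) usually takes the following steps:

1. **Install the required libraries**: Make sure Python and the relevant scientific libraries are installed, such as NumPy, Pandas, and either TensorFlow or PyTorch (both support ResNet). You can use `pip install numpy pandas tensorflow keras` or `pip install torch torchvision`.

2. **Preprocess the data**: Load the chest X-ray images with `torchvision.datasets`, for example `ImageFolder`. Split the data into a training set and a validation set, then use `DataLoader` to load it in batches.

```python
import torch
from torch.utils.data import DataLoader, Subset
from torchvision import datasets, transforms

normalize = transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
data_transforms = {
    'train': transforms.Compose([
        transforms.RandomHorizontalFlip(),
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        normalize,
    ]),
    'val': transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        normalize,
    ]),
}

# ImageFolder expects one subfolder per class under the root directory.
root = './path/to/chest_xray'
train_view = datasets.ImageFolder(root=root, transform=data_transforms['train'])
val_view = datasets.ImageFolder(root=root, transform=data_transforms['val'])

# Hold out 20% for validation (the ratio and batch size are placeholder choices).
indices = torch.randperm(len(train_view), generator=torch.Generator().manual_seed(0)).tolist()
n_val = int(0.2 * len(indices))
dataloader = DataLoader(Subset(train_view, indices[n_val:]), batch_size=32, shuffle=True)
val_loader = DataLoader(Subset(val_view, indices[:n_val]), batch_size=32, shuffle=False)
num_classes = len(train_view.classes)
```

3. **Build the ResNet model**: Use a pretrained ResNet from PyTorch or Keras, typically by importing the model and replacing its final layer to match your classes.

```python
import torch.nn as nn
from torchvision.models import resnet50, ResNet50_Weights

# ImageNet-pretrained weights; the `weights` argument needs torchvision >= 0.13
# and replaces the deprecated pretrained=True.
model = resnet50(weights=ResNet50_Weights.DEFAULT)
num_features = model.fc.in_features
model.fc = nn.Linear(num_features, num_classes)  # num_classes is the total number of classes
```

4. **Train the model**: Define a loss function (such as cross-entropy) and an optimizer (such as Adam), then run the training loop.

```python
import torch
import torch.optim as optim

criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

num_epochs = 10  # placeholder value; tune for your data
for epoch in range(num_epochs):
    model.train()  # enable training behavior for batch norm layers
    for inputs, labels in dataloader:
        inputs, labels = inputs.to(device), labels.to(device)
        optimizer.zero_grad()
        outputs = model(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
```

5. **Evaluate and save the model**: After training, evaluate the model on the validation set and save its weights for later use, ideally keeping the best-performing checkpoint.

```python
# Evaluation
model.eval()  # switch batch norm layers to inference behavior
correct = 0
total = 0
with torch.no_grad():
    for inputs, labels in val_loader:
        inputs, labels = inputs.to(device), labels.to(device)
        outputs = model(inputs)
        _, predicted = torch.max(outputs, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()
accuracy = 100 * correct / total
print(f'Validation accuracy: {accuracy:.2f}%')

# Save the weights (the file name is a placeholder).
torch.save(model.state_dict(), 'resnet50_chest_xray.pt')
```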