【Ultralytics】Evaluation error: fixing RT-DETR's save_json failure and running a pycocotools evaluation successfully

Real-Time Detection Transformer (RT-DETR), developed by Baidu, is a cutting-edge end-to-end object detector that delivers real-time performance while maintaining high accuracy. It builds on the ideas of DETR (the NMS-free framework) while introducing a convolution-based backbone and an efficient hybrid encoder to reach real-time speed.

RT-DETR processes multi-scale features efficiently by decoupling intra-scale interaction from cross-scale fusion. The model is highly adaptable: inference speed can be adjusted flexibly by using different numbers of decoder layers, without retraining. RT-DETR performs especially well on accelerated backends such as CUDA with TensorRT, outperforming many other real-time object detectors.

The Ultralytics Python API provides pretrained PaddlePaddle RT-DETR models at several scales; this post uses Ultralytics to reproduce RT-DETR.

Training completes without any problems. Validation, however, fails as soon as the save_json=True option is enabled.
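For context, the check.py referenced in the traceback below is just a minimal validation script; a sketch of it (the weights path is illustrative) looks like this:

from ultralytics import RTDETR

# Load the trained RT-DETR weights (path is illustrative)
model = RTDETR(r'runs\detect\train4\weights\best.pt')

# Plain validation works; adding save_json=True is what triggers the crash
result = model.val(
    data=r'D:\CodeProject\datasets\VisDrone\ultralytics-main\ultralytics\cfg\datasets\VisDrone.yaml',
    save_json=True,
)

Running it produces the following output and traceback: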

(Pytorch) PS D:\CodeProject\datasets\VisDrone> python .\ultralytics-main\check.py
Ultralytics 8.3.192  Python-3.10.18 torch-2.0.0 CUDA:0 (NVIDIA GeForce RTX 4090, 24564MiB)
rtdetr-l summary: 302 layers, 32,004,290 parameters, 0 gradients, 103.5 GFLOPs
val: Fast image access (ping: 0.0±0.0 ms, read: 2287.3±481.3 MB/s, size: 175.5 KB)
val: Scanning D:\CodeProject\datasets\VisDrone\VisDrone_YOLO\VisDrone2019-DET-val\labels.cache... 548 images, 0 backgrounds, 0 corrupt: 100% ━━━━━━━━━━━━ 548/548  0.0s
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 0% ──────────── 0/137  0.1s<
Traceback (most recent call last):
  File "D:\CodeProject\datasets\VisDrone\ultralytics-main\check.py", line 11, in <module>
    result = model.val(data=r'D:\CodeProject\datasets\VisDrone\ultralytics-main\ultralytics\cfg\datasets\VisDrone.yaml',
  File "D:\CodeProject\datasets\VisDrone\ultralytics-main\ultralytics\engine\model.py", line 635, in val
    validator(model=self.model)
  File "D:\Anaconda3\envs\Pytorch\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "D:\CodeProject\datasets\VisDrone\ultralytics-main\ultralytics\engine\validator.py", line 221, in __call__
    self.update_metrics(preds, batch)
  File "D:\CodeProject\datasets\VisDrone\ultralytics-main\ultralytics\models\yolo\detect\val.py", line 210, in update_metrics
    predn_scaled = self.scale_preds(predn, pbatch)
  File "D:\CodeProject\datasets\VisDrone\ultralytics-main\ultralytics\models\yolo\detect\val.py", line 407, in scale_preds
    "bboxes": ops.scale_boxes(
  File "D:\CodeProject\datasets\VisDrone\ultralytics-main\ultralytics\utils\ops.py", line 128, in scale_boxes
    gain = ratio_pad[0][0]
TypeError: 'float' object is not subscriptable

The error comes from the ops.scale_boxes() function: it assumes ratio_pad is a nested tuple of the form ((gain, gain), (pad_x, pad_y)) and indexes it with ratio_pad[0][0], but what actually arrives from the RT-DETR validation path is a plain float, which is not subscriptable.
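The mismatch is easy to reproduce in isolation against the unpatched function; a minimal sketch (the shapes and numbers are arbitrary, chosen to mimic letterboxing a 1080x1920 image into 640x640):

import torch
from ultralytics.utils import ops

boxes = torch.tensor([[10.0, 20.0, 110.0, 220.0]])  # a single xyxy box

# The structure scale_boxes was written for: ((gain, gain), (pad_x, pad_y))
ops.scale_boxes((640, 640), boxes.clone(), (1080, 1920),
                ratio_pad=((0.3333, 0.3333), (0.0, 140.0)))  # works

# What the RT-DETR validation path passes in: a bare float
ops.scale_boxes((640, 640), boxes.clone(), (1080, 1920),
                ratio_pad=0.3333)  # TypeError: 'float' object is not subscriptable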

The fix is to make scale_boxes() in ultralytics/utils/ops.py accept both forms:

def scale_boxes(img1_shape, boxes, img0_shape, ratio_pad=None, padding: bool = True, xywh: bool = False):
    """
    Rescale bounding boxes from one image shape to another.

    Rescales bounding boxes from img1_shape to img0_shape, accounting for padding and aspect ratio changes.
    Supports both xyxy and xywh box formats.

    Args:
        img1_shape (tuple): Shape of the source image (height, width).
        boxes (torch.Tensor): Bounding boxes to rescale in format (N, 4).
        img0_shape (tuple): Shape of the target image (height, width).
        ratio_pad (tuple | float, optional): Tuple of (ratio, pad) for scaling, or a bare scale factor. If None, calculated from image shapes.
        padding (bool): Whether boxes are based on YOLO-style augmented images with padding.
        xywh (bool): Whether box format is xywh (True) or xyxy (False).

    Returns:
        (torch.Tensor): Rescaled bounding boxes in the same format as input.
    """
    if ratio_pad is None:  # calculate from img0_shape
        gain = min(img1_shape[0] / img0_shape[0], img1_shape[1] / img0_shape[1])  # gain  = old / new
        pad_x = round((img1_shape[1] - img0_shape[1] * gain) / 2 - 0.1)
        pad_y = round((img1_shape[0] - img0_shape[0] * gain) / 2 - 0.1)
    else:
        if isinstance(ratio_pad, (int, float)):
            # Bare scale factor (the RT-DETR validation path): no padding information
            gain = ratio_pad
            pad_x, pad_y = 0, 0
        elif isinstance(ratio_pad, (list, tuple)):
            if len(ratio_pad) == 2:
                # Original YOLO-style structure: ((gain, gain), (pad_x, pad_y))
                if isinstance(ratio_pad[0], (list, tuple)):
                    gain = ratio_pad[0][0]
                else:
                    gain = ratio_pad[0]

                if isinstance(ratio_pad[1], (list, tuple)) and len(ratio_pad[1]) >= 2:
                    pad_x, pad_y = ratio_pad[1][0], ratio_pad[1][1]
                else:
                    pad_x, pad_y = 0, 0
            else:
                # Unexpected length: use the first element as the gain, or no scaling at all
                gain = ratio_pad[0] if len(ratio_pad) > 0 else 1.0
                pad_x, pad_y = 0, 0
        else:
            # Unknown type: apply neither scaling nor padding correction
            gain = 1.0
            pad_x, pad_y = 0, 0

    if padding:
        boxes[..., 0] -= pad_x  # x padding
        boxes[..., 1] -= pad_y  # y padding
        if not xywh:
            boxes[..., 2] -= pad_x  # x padding
            boxes[..., 3] -= pad_y  # y padding
    boxes[..., :4] /= gain
    return clip_boxes(boxes, img0_shape)

With this patch, save_json succeeds. However, when the exported predictions.json is fed into a COCO evaluation, the metrics come out extremely low even though nothing errors out:

python D:\CodeProject\datasets\VisDrone\ultralytics-main\COCO_Evalution.py --annotations D:\CodeProject\datasets\VisDrone\VisDrone_YOLO\instances_val_2017.json --predictions D:\CodeProject\datasets\VisDrone\runs\detect\val9\predictions.json
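COCO_Evalution.py is not reproduced in this post; a minimal sketch of such a script, assuming it simply loads the two files with pycocotools and runs COCOeval (argument names match the command above), would be:

import argparse
from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval

parser = argparse.ArgumentParser()
parser.add_argument('--annotations', required=True, help='COCO-format ground-truth json')
parser.add_argument('--predictions', required=True, help='predictions json exported by save_json')
args = parser.parse_args()

coco_gt = COCO(args.annotations)              # ground truth
coco_dt = coco_gt.loadRes(args.predictions)   # detections

coco_eval = COCOeval(coco_gt, coco_dt, 'bbox')
coco_eval.evaluate()
coco_eval.accumulate()
coco_eval.summarize()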

The problem is that the exported predictions do not line up with the ground-truth annotations: the image_id values Ultralytics writes (derived from the file-name stem) and its category_id mapping generally do not match the ids in the GT json, so almost nothing is scored as a match. To get usable numbers, we instead run inference ourselves, image by image, and build the predictions JSON in the same format as the GT file. The code is as follows:

from ultralytics import RTDETR
import json
from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval
import os


def correct_coco_evaluation(model_path, image_dir, gt_json_path, output_dir='results'):
    os.makedirs(output_dir, exist_ok=True)

    # Load the ground truth and build a file_name -> image_id lookup,
    # so that predictions reuse exactly the same image ids as the GT file
    coco_gt = COCO(gt_json_path)
    gt_images = coco_gt.imgs
    print(f"GT contains {len(gt_images)} images")

    image_name_to_id = {}
    for img_id, img_info in gt_images.items():
        image_name_to_id[img_info['file_name']] = img_id

    # Run inference; conf=0.001 keeps low-score boxes so the full PR curve is available to COCO AP
    model = RTDETR(model_path)
    results = model.predict(source=image_dir, save=False, conf=0.001, workers=0)

    predictions = []
    pred_id = 1

    for result in results:
        image_path = result.path
        image_name = os.path.basename(image_path)

        if image_name in image_name_to_id:
            image_id = image_name_to_id[image_name]
        else:
            print(f"警告: 未找到图像 {image_name} 在 GT 中的对应ID,跳过")
            continue

        if result.boxes is not None:
            for box in result.boxes:
                xyxy = box.xyxy[0].tolist()
                x1, y1, x2, y2 = xyxy
                width = x2 - x1
                height = y2 - y1

                predictions.append({
                    'id': pred_id,
                    'image_id': image_id,
                    'category_id': int(box.cls.item()) + 1,  # model classes are 0-based; GT categories here start at 1
                    'bbox': [x1, y1, width, height],  # COCO bbox format: [x, y, w, h]
                    'score': box.conf.item(),
                    'area': width * height
                })
                pred_id += 1
        else:
            print(f"图像 {image_name} 未检测到任何目标")

    print(f"生成 {len(predictions)} 个预测")

    # Dump the predictions in COCO results format, then evaluate with pycocotools
    pred_file = os.path.join(output_dir, 'predictions.json')
    with open(pred_file, 'w') as f:
        json.dump(predictions, f, indent=2)

    coco_dt = coco_gt.loadRes(pred_file)

    coco_eval = COCOeval(coco_gt, coco_dt, 'bbox')
    coco_eval.evaluate()
    coco_eval.accumulate()
    coco_eval.summarize()

    return coco_eval.stats


if __name__ == '__main__':
    stats = correct_coco_evaluation(
        model_path=r'D:\CodeProject\datasets\VisDrone\ultralytics-main\runs\detect\train4\weights\best.pt',
        image_dir=r'D:\CodeProject\datasets\VisDrone\VisDrone_YOLO\VisDrone2019-DET-test\images',
        gt_json_path=r'D:\CodeProject\datasets\VisDrone\ultralytics-main\instances_test_2017.json'
    )

    print(f"\nCOCO 评估结果:")
    print(f"AP@[0.5:0.95]: {stats[0]:.3f}")
    print(f"AP@0.5: {stats[1]:.3f}")
    print(f"AP@0.75: {stats[2]:.3f}")
    print(f"AP_small: {stats[3]:.3f}")
    print(f"AP_medium: {stats[4]:.3f}")
    print(f"AP_large: {stats[5]:.3f}")

With that, the evaluation runs successfully and the COCO metrics come out at sensible values.
