Training YOLOv8 on Your Own OBB Dataset


戊辰happy



1. Annotate the images with roLabelImg to produce the corresponding XML label files

2. Download the code and install the required dependencies

git clone https://github.com/ultralytics/ultralytics.git

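Below are the packages installed in my environment; torch can be installed with the command given on the official PyTorch site. The list is a dump of my whole environment, so it also contains ROS packages (rospy, rviz, rqt_* and so on) that have nothing to do with YOLOv8.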

absl-py==2.1.0
actionlib==1.12.1
angles==1.9.12
bondpy==1.8.5
cachetools==5.3.3
camera_calibration_parsers==1.11.13
catkin==0.7.29
certifi==2024.2.2
charset-normalizer==3.3.2
coloredlogs==15.0.1
contourpy==1.1.1
controller_manager==0.18.4
controller_manager_msgs==0.18.4
cycler==0.12.1
diagnostic_analysis==1.9.7
diagnostic_common_diagnostics==1.9.7
diagnostic_updater==1.9.7
dynamic_reconfigure==1.6.5
filelock==3.13.1
flatbuffers==23.5.26
fonttools==4.50.0
fsspec==2024.3.1
gazebo_ros==2.8.7
gencpp==0.6.6
geneus==2.2.6
genlisp==0.4.16
genmsg==0.5.17
gennodejs==2.0.1
genpy==0.6.16
google-auth==2.28.2
google-auth-oauthlib==0.4.6
grpcio==1.62.1
humanfriendly==10.0
idna==3.6
importlib_metadata==7.0.2
importlib_resources==6.3.1
interactive-markers==1.11.5
Jinja2==3.1.3
joint_state_publisher==1.12.15
kdl-parser-py==1.13.3
kiwisolver==1.4.5
lanelet2-python==1.0.1
laser_geometry==1.6.7
Markdown==3.6
markdown-it-py==3.0.0
MarkupSafe==2.1.5
matplotlib==3.7.5
mdurl==0.1.2
message_filters==1.14.13
#mmsegmentation== 0.11.0               /data/SegFormer
mpmath==1.3.0
networkx==3.1
nmea_navsat_driver==0.5.2
numpy==1.24.4
nvidia-cublas-cu12==12.1.3.1
nvidia-cuda-cupti-cu12==12.1.105
nvidia-cuda-nvrtc-cu12==12.1.105
nvidia-cuda-runtime-cu12==12.1.105
nvidia-cudnn-cu12==8.9.2.26
nvidia-cufft-cu12==11.0.2.54
nvidia-curand-cu12==10.3.2.106
nvidia-cusolver-cu12==11.4.5.107
nvidia-cusparse-cu12==12.1.0.106
nvidia-nccl-cu12==2.19.3
nvidia-nvjitlink-cu12==12.4.99
nvidia-nvtx-cu12==12.1.105
oauthlib==3.2.2
onnx==1.15.0
onnxruntime==1.15.1
onnxsim==0.4.36
opencv-python==4.9.0.80
packaging==24.0
pandas==2.0.3
pillow==10.2.0
pip==23.3.1
protobuf==5.26.0
psutil==5.9.8
py-cpuinfo==9.0.0
pyasn1==0.5.1
pyasn1-modules==0.3.0
Pygments==2.17.2
pyparsing==3.1.2
python-dateutil==2.9.0.post0
python_qt_binding==0.4.4
pytz==2024.1
PyYAML==6.0.1
qt-dotgraph==0.4.2
qt-gui==0.4.2
qt-gui-cpp==0.4.2
qt-gui-py-common==0.4.2
requests==2.31.0
requests-oauthlib==1.4.0
resource_retriever==1.12.7
rich==13.7.1
rosapi==0.11.16
rosbag==1.14.13
rosboost-cfg==1.14.9
rosbridge-library==0.11.16
rosbridge-server==0.11.16
rosclean==1.14.9
roscreate==1.14.9
rosgraph==1.14.13
roslaunch==1.14.13
roslib==1.14.9
roslint==0.11.2
roslz4==1.14.13
rosmake==1.14.9
rosmaster==1.14.13
rosmsg==1.14.13
rosnode==1.14.13
rosparam==1.14.13
rospy==1.14.13
rosservice==1.14.13
rostest==1.14.13
rostopic==1.14.13
rosunit==1.14.9
roswtf==1.14.13
rqt_action==0.4.9
rqt_bag==0.5.1
rqt_bag_plugins==0.5.1
rqt_console==0.4.9
rqt_dep==0.4.9
rqt_graph==0.4.11
rqt_gui==0.5.3
rqt_gui_py==0.5.3
rqt_launch==0.4.8
rqt_logger_level==0.4.8
rqt-moveit==0.5.10
rqt_msg==0.4.8
rqt_nav_view==0.5.7
rqt_plot==0.4.13
rqt_pose_view==0.5.8
rqt_publisher==0.4.8
rqt_py_common==0.5.3
rqt_py_console==0.4.8
rqt-reconfigure==0.5.4
rqt_robot_dashboard==0.5.7
rqt-robot-monitor==0.5.14
rqt_robot_steering==0.5.10
rqt_runtime_monitor==0.5.7
rqt-rviz==0.7.0
rqt_service_caller==0.4.8
rqt_shell==0.4.9
rqt_srv==0.4.8
rqt_tf_tree==0.6.0
rqt_top==0.4.8
rqt_topic==0.4.11
rqt_web==0.4.8
rsa==4.9
rviz==1.13.29
scipy==1.10.1
seaborn==0.13.2
sensor-msgs==1.12.8
setuptools==68.2.2
six==1.16.0
smach==2.0.1
smach_ros==2.0.1
smclib==1.8.5
sound-play==0.3.15
sympy==1.12
tensorboard==2.12.0
tensorboard-data-server==0.7.2
tensorboard-plugin-wit==1.8.1
tensorboardX==2.6.2.2
terminaltables==3.1.10
tf==1.12.1
tf_conversions==1.12.1
tf2_geometry_msgs==0.6.5
tf2_kdl==0.6.5
tf2_py==0.6.5
tf2_ros==0.6.5
thop==0.1.1.post2209072238
topic_tools==1.14.13
#torch==1.10.1+cu111
#torchaudio==0.10.1+cu111
#torchvision==0.11.2+cu111
tqdm==4.66.2
triton==2.2.0
typing_extensions==4.10.0
tzdata==2024.1
urdfdom-py==0.4.6
urllib3==2.2.1
Werkzeug==3.0.1
wheel==0.41.2
xacro==1.13.18
zipp==3.18.1

3. Convert the XML files into DOTA-format label files; for the conversion code, see my previous blog post
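
For readers who do not have the previous post at hand, here is a minimal sketch of what such a conversion could look like. It is not the code from that post. It assumes the roLabelImg XML stores each rotated box as a `robndbox` element with `cx`, `cy`, `w`, `h` and an `angle` in radians, and it writes the DOTA line as eight corner coordinates, the class name and a difficulty flag fixed to 0. The function names `robndbox_to_corners` and `xml_to_dota_lines` are made up for this illustration.

```python
import math
import xml.etree.ElementTree as ET


def robndbox_to_corners(cx, cy, w, h, angle):
    """Return the four corners of a w x h box rotated by `angle` radians around (cx, cy)."""
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    corners = []
    # Corner offsets before rotation: top-left, top-right, bottom-right, bottom-left
    for dx, dy in ((-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)):
        x = cx + dx * cos_a - dy * sin_a
        y = cy + dx * sin_a + dy * cos_a
        corners.append((x, y))
    return corners


def xml_to_dota_lines(xml_text):
    """Turn the rotated boxes of one annotation file into DOTA-style text lines."""
    root = ET.fromstring(xml_text)
    lines = []
    for obj in root.iter("object"):
        box = obj.find("robndbox")
        if box is None:  # skip objects that are not rotated boxes
            continue
        values = [float(box.find(tag).text) for tag in ("cx", "cy", "w", "h", "angle")]
        corners = robndbox_to_corners(*values)
        coords = " ".join(f"{v:.1f}" for corner in corners for v in corner)
        # Assumption: the difficulty flag is always written as 0
        lines.append(f"{coords} {obj.find('name').text} 0")
    return lines
```

Each returned line can be written to `labels/train_original/<image name>.txt` (or `val_original`), which is the layout the next step expects.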

4. Convert the DOTA-format label files into the YOLO format required for YOLOv8 training

from pathlib import Path

import cv2
from ultralytics.utils import TQDM


def convert_dota_to_yolo_obb(dota_root_path: str):
    """
    Converts DOTA dataset annotations to YOLO OBB (Oriented Bounding Box) format.

    The function processes images in the 'train' and 'val' folders of the DOTA dataset. For each image, it reads the
    associated label from the original labels directory and writes new labels in YOLO OBB format to a new directory.

    Args:
        dota_root_path (str): The root directory path of the DOTA dataset.

    Example:
        ```python
        from ultralytics.data.converter import convert_dota_to_yolo_obb

        convert_dota_to_yolo_obb('path/to/DOTA')
        ```

    Notes:
        The directory structure assumed for the DOTA dataset:

            - DOTA
                ├─ images
                │   ├─ train
                │   └─ val
                └─ labels
                    ├─ train_original
                    └─ val_original

        After execution, the function will organize the labels into:

            - DOTA
                └─ labels
                    ├─ train
                    └─ val
    """
    dota_root_path = Path(dota_root_path)

    # Class names to indices mapping (replaced with the single class of my own dataset)
    class_mapping = {
        "gfb": 0,
    }

    def convert_label(image_name, image_width, image_height, orig_label_dir, save_dir):
        """Converts a single image's DOTA annotation to YOLO OBB format and saves it to a specified directory."""
        orig_label_path = orig_label_dir / f"{image_name}.txt"
        save_path = save_dir / f"{image_name}.txt"

        with orig_label_path.open("r") as f, save_path.open("w") as g:
            lines = f.readlines()
            for line in lines:
                parts = line.strip().split()
                if len(parts) < 9:
                    continue
                class_name = parts[8]
                class_idx = class_mapping[class_name]
                coords = [float(p) for p in parts[:8]]
                normalized_coords = [
                    coords[i] / image_width if i % 2 == 0 else coords[i] / image_height for i in range(8)
                ]
                formatted_coords = ["{:.6g}".format(coord) for coord in normalized_coords]
                g.write(f"{class_idx} {' '.join(formatted_coords)}\n")

    for phase in ["train", "val"]:
        image_dir = dota_root_path / "images" / phase
        orig_label_dir = dota_root_path / "labels" / f"{phase}_original"
        save_dir = dota_root_path / "labels" / phase

        save_dir.mkdir(parents=True, exist_ok=True)

        image_paths = list(image_dir.iterdir())
        for image_path in TQDM(image_paths, desc=f"Processing {phase} images"):
            if image_path.suffix != ".png":
                continue
            image_name_without_ext = image_path.stem
            img = cv2.imread(str(image_path))
            if img is None:  # skip files that OpenCV cannot decode
                continue
            h, w = img.shape[:2]
            convert_label(image_name_without_ext, w, h, orig_label_dir, save_dir)

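# This import only picks up the custom class_mapping if you edited
# ultralytics/data/converter.py in the cloned repo and that copy is the one being imported.
# If you pasted the function above into your own script instead, drop the import and call it directly.
from ultralytics.data.converter import convert_dota_to_yolo_obb

convert_dota_to_yolo_obb("/data/data/Ship_dota_v1.5_1024/DOTA")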
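
To make the conversion concrete, here is a small standalone sketch of what happens to a single label line. It mirrors the per-line logic of `convert_label` above (x coordinates divided by the image width, y coordinates by the image height, class name replaced by its index), but the function name `dota_line_to_yolo_obb` and the sample line are made up for illustration.

```python
def dota_line_to_yolo_obb(line, image_width, image_height, class_mapping):
    """Convert one DOTA label line to a YOLO OBB line, or return None if it is too short."""
    parts = line.strip().split()
    if len(parts) < 9:  # header lines or malformed rows are skipped
        return None
    class_idx = class_mapping[parts[8]]
    coords = [float(p) for p in parts[:8]]
    # Even positions are x values, odd positions are y values
    normalized = [c / image_width if i % 2 == 0 else c / image_height for i, c in enumerate(coords)]
    return f"{class_idx} " + " ".join(f"{c:.6g}" for c in normalized)


# Hypothetical box on a 1024 x 1024 image
print(dota_line_to_yolo_obb("256 128 512 128 512 384 256 384 gfb 0", 1024, 1024, {"gfb": 0}))
# prints: 0 0.25 0.125 0.5 0.125 0.5 0.375 0.25 0.375
```

Note that a class name missing from the mapping raises a `KeyError`, both here and in the function above, so the mapping has to cover every name used during annotation.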

5. Update the classes in yolov8-obb.yaml (the nc field, 1 class in my case)

6. Update the dataset path and classes in dota8-obb.yaml

path: /data/data/Ship_dota_v1.5_1024/DOTA # dataset root dir
train: images/train # train images (relative to 'path')
val: images/val # val images (relative to 'path')

# Classes (custom dataset, must match the class_mapping used in step 4)
names:
  0: gfb

# The dota8 download entry from the original file is not needed for a custom dataset
# download: https://github.com/ultralytics/yolov5/releases/download/v1.0/dota8.zip
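
Before starting a long training run, it can be worth checking that the converted labels actually line up with the images. The sketch below is one possible check, assuming the `images/<phase>` and `labels/<phase>` layout from step 4 and `.png` images; the helper name `find_images_without_labels` is hypothetical and not part of Ultralytics.

```python
from pathlib import Path


def find_images_without_labels(dataset_root, phase, image_suffix=".png"):
    """Return the stems of images under images/<phase> that have no labels/<phase>/<stem>.txt."""
    root = Path(dataset_root)
    image_dir = root / "images" / phase
    label_dir = root / "labels" / phase
    missing = []
    for image_path in sorted(image_dir.iterdir()):
        if image_path.suffix != image_suffix:
            continue
        if not (label_dir / f"{image_path.stem}.txt").exists():
            missing.append(image_path.stem)
    return missing


# Example call with the dataset root used in this post:
# print(find_images_without_labels("/data/data/Ship_dota_v1.5_1024/DOTA", "train"))
```

An image without a label file is not necessarily an error, since it may simply contain no objects, so the returned list is something to review rather than a hard failure.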

7. Training

from ultralytics import YOLO

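# Load a model: the three lines below are alternatives, and each assignment overrides the previous one.
# Keep only the one you need; as written, the last line decides which model is trained.
model = YOLO('yolov8n-obb.yaml')  # build a new model from YAML
model = YOLO('yolov8n-obb.pt')  # load a pretrained model (recommended for training)
model = YOLO('yolov8n-obb.yaml').load('yolov8n.pt')  # build from YAML and transfer weights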

# Train the model
results = model.train(data='dota8-obb.yaml', epochs=1000, imgsz=640)

8. Testing

from ultralytics import YOLO

# Load a model
model = YOLO('yolov8n-obb.pt')  # load an official model
# model = YOLO('path/to/best.pt')  # load a custom model (use this line to test your own trained weights)

# Predict with the model
# results = model('https://ultralytics.com/images/bus.jpg')  # predict on a sample image
results = model('./P0006.png')  # predict on a local image
# print("results:", results)
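
Printing `results` directly is fairly noisy. As a sketch of how the rotated boxes could be read out instead, the helper below relies on the `obb` attribute of an Ultralytics result (`obb.cls`, `obb.conf`, `obb.xyxyxyxy`) and on `result.names`. The helper name `summarize_obb` and the rounding to three decimals are my own choices for illustration.

```python
def summarize_obb(result):
    """Return (class name, confidence, four corner points) for every rotated box in one result."""
    rows = []
    obb = result.obb
    if obb is None:  # the model was not an OBB model
        return rows
    for cls_id, conf, polygon in zip(obb.cls.tolist(), obb.conf.tolist(), obb.xyxyxyxy.tolist()):
        rows.append((result.names[int(cls_id)], round(conf, 3), polygon))
    return rows


# Usage with the prediction above (results is a list with one entry per image):
# for name, conf, polygon in summarize_obb(results[0]):
#     print(name, conf, polygon)
```

The corner points are in pixel coordinates of the input image, so they can be drawn with OpenCV to check the predictions by eye.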