Getting Started with YOLOv5: Slider CAPTCHA Gap Localization in Practice

Print_lin

Last modified: 2024-04-18 18:55:35

License: CC 4.0 BY-SA

Tags: YOLO, CAPTCHA recognition, slider CAPTCHA, object detection, artificial intelligence

First published: 2024-04-18 18:43:29

Article link: https://blog.youkuaiyun.com/Print_lin/article/details/137928982

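This article shows how to use the stable YOLOv5 release to locate the gap in slider CAPTCHAs with a small dataset. It walks through the practical workflow, including data preparation, annotation, model selection, and training, and shows how the algorithm performs in a real scenario.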

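Preface

YOLO is a well-known object detection algorithm. Its core idea is in its name, You Only Look Once: the result comes from a single pass over the image, which gives a large performance gain over the traditional sliding-window approach. This article experiments with v5, the stable version of YOLO, and uses it to locate the gap in slider CAPTCHAs, with fairly good results on a small dataset. It does not explain the theory behind the algorithm and focuses only on practical use. Readers interested in the internals can refer to the article 【你一定从未看过如此通俗易懂的YOLO系列(从v1到v5)模型解读】 (an accessible walkthrough of the YOLO models from v1 to v5).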

Environment

Python 3.9.13
Windows 10
CPU: R7-7840HS
GPU: none (integrated graphics)

YOLOV5

Source code

git clone https://github.com/ultralytics/yolov5.git

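Model

We use the yolov5s model here. It is moderate in size and performs well. More models can be downloaded from the official repository. After downloading, place the model file in the YOLOv5 root directory; it is referenced later when configuring train.py.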

https://github.com/ultralytics/yolov5/releases/download/v7.0/yolov5s.pt

Install dependencies

pip install -r requirements.txt

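Data preparation

We manually downloaded 30 slider CAPTCHA images from the NetEase Yidun (网易易盾) behavioral CAPTCHA. If you want more accurate results, download more images and add them to the training set. Readers who like to automate things can write a download script; I took the lazy route here.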

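Data annotation

Annotation is done with the open-source tool Labelme. Official repository: https://github.com/labelmeai/labelme. Software download: Labelme.exe. Tutorials are easy to find elsewhere, though the tool is simple enough that most people can use it right away. We use Labelme to draw a box around the gap in every image and label it target. Labelme then generates one JSON file per image, containing the annotation box and the original image data.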
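To make the annotation output more concrete, here is a minimal sketch that lists the shapes stored in a Labelme annotation. It relies only on the fields the conversion script in this article reads (`shapes`, `label`, `shape_type`, `points`, `imageWidth`, `imageHeight`). The function name `summarize_annotation`, the image size, and the coordinates in the sample are made up for illustration.

```python
import json


def summarize_annotation(annotation: dict) -> list:
    """Return (label, shape_type, points) for every shape in a Labelme annotation."""
    return [
        (shape["label"], shape["shape_type"], shape["points"])
        for shape in annotation["shapes"]
    ]


# Hand-written sample with made-up coordinates. Real files also carry the
# base64-encoded image in the "imageData" field, omitted here for brevity.
sample = json.loads("""
{
  "shapes": [
    {"label": "target", "shape_type": "rectangle",
     "points": [[200.0, 40.0], [260.0, 100.0]]}
  ],
  "imageWidth": 320,
  "imageHeight": 160
}
""")

for label, shape_type, points in summarize_annotation(sample):
    print(label, shape_type, points)
```

Running a check like this over every JSON file before conversion is a cheap way to catch images that were saved without a rectangle or with a misspelled label.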

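Annotation conversion

The JSON files produced by Labelme have to be converted into the annotation format YOLO accepts, and we use a conversion script for that. Script source: the article 基于树莓派4B的YOLOv5-Lite目标检测的移植与部署（含训练教程） (porting and deploying YOLOv5-Lite object detection on a Raspberry Pi 4B, with a training tutorial).

The script below looks for all JSON files in the imgs folder under the current directory. After conversion it creates train and valid folders in the current directory, which contain the data YOLO can train on. The split between train (training) and valid (validation) is controlled by the ratio value in dic_labels; by default each image has an 80% chance of going to the training set, so the actual split is random and only approximately 80/20.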

import os
import json
import random
import base64
from glob import glob

# Class name -> class id, plus the train/valid split ratio
dic_labels = {'target': 0,
              'ratio': 0.8}


def generate_labels(dic_labs):
    ratio = dic_labs['ratio']
    for index, labelme_annotation_path in enumerate(glob('imgs/*.json')):

        # File name without the extension. splitext is used because
        # rstrip('.json') strips characters, not a suffix, and can eat part of the name.
        image_id = os.path.splitext(os.path.basename(labelme_annotation_path))[0]

        # Randomly assign this sample to train or valid
        train_or_valid = 'train' if random.random() < ratio else 'valid'

        # Read the Labelme JSON file
        with open(labelme_annotation_path, 'r', encoding='utf-8') as labelme_annotation_file:
            labelme_annotation = json.load(labelme_annotation_file)

        # Save the image embedded in the JSON (base64) next to the labels
        yolo_image = base64.decodebytes(labelme_annotation['imageData'].encode())
        yolo_image_path = os.path.join(train_or_valid, 'images', image_id + '.jpg')
        with open(yolo_image_path, 'wb') as yolo_image_file:
            yolo_image_file.write(yolo_image)

        # Write the YOLO-format labels
        yolo_annotation_path = os.path.join(train_or_valid, 'labels', image_id + '.txt')
        with open(yolo_annotation_path, 'w') as yolo_annotation_file:
            for shape in labelme_annotation['shapes']:
                if shape['shape_type'] != 'rectangle':
                    print(
                        f'Invalid type `{shape["shape_type"]}` in annotation `{labelme_annotation_path}`')
                    continue

                # Convert the two corner points to normalized center x/y, width, height
                points = shape['points']
                scale_width = 1.0 / labelme_annotation['imageWidth']
                scale_height = 1.0 / labelme_annotation['imageHeight']
                # abs() keeps the size positive whichever corner was drawn first
                width = abs(points[1][0] - points[0][0]) * scale_width
                height = abs(points[1][1] - points[0][1]) * scale_height
                x = ((points[1][0] + points[0][0]) / 2) * scale_width
                y = ((points[1][1] + points[0][1]) / 2) * scale_height
                object_class = dic_labs[shape['label']]
                yolo_annotation_file.write(f'{object_class} {x} {y} {width} {height}\n')
        print("create label %d : %s" % (index, image_id))


if __name__ == "__main__":
    os.makedirs(os.path.join("train", 'images'), exist_ok=True)
    os.makedirs(os.path.join("train", 'labels'), exist_ok=True)
    os.makedirs(os.path.join("valid", 'images'), exist_ok=True)
    os.makedirs(os.path.join("valid", 'labels'), exist_ok=True)
    generate_labels(dic_labels)
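The coordinate arithmetic in the script is easier to check on a single box. The sketch below isolates that step in a hypothetical helper, `rect_to_yolo`, and applies it to a made-up rectangle in a 320x160 image (the size is an assumption for illustration, not taken from the dataset).

```python
def rect_to_yolo(points, image_width, image_height):
    """Convert two opposite rectangle corners (in pixels) to the YOLO tuple
    (x_center, y_center, width, height), each normalized to the 0..1 range."""
    (x1, y1), (x2, y2) = points
    x_center = (x1 + x2) / 2 / image_width
    y_center = (y1 + y2) / 2 / image_height
    width = abs(x2 - x1) / image_width
    height = abs(y2 - y1) / image_height
    return x_center, y_center, width, height


# Made-up gap: corners (200, 40) and (260, 100) in a 320x160 image
box = rect_to_yolo([[200, 40], [260, 100]], image_width=320, image_height=160)
print("0 " + " ".join(f"{v:.6f}" for v in box))
# prints: 0 0.718750 0.437500 0.187500 0.375000
```

Each line of a YOLO label file has this shape: the class id followed by the four normalized values, so the same label stays valid if the image is resized.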

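Training configuration

The YOLOv5 directory contains train.py, and we need to change the defaults of a few of its arguments. The line numbers below come from the author's copy and may differ in other versions of the repository.

Line 516, --weights: path to the downloaded model. Value: yolov5s.pt. See the Model section above for where it comes from.
Line 518, --data: dataset configuration file. Value: data/imgs.yaml. The content of imgs.yaml is given below.
Line 522, --imgsz: image size. Value: 320.
Line 537, --device: device to run on. Value: cpu. If you have a discrete GPU, set the corresponding device number instead.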
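Because these four settings are command-line options of train.py, they can also be passed at launch instead of being edited into the file. The sketch below builds such a command from Python. The helper names `build_train_command` and `run_training` are hypothetical, and `yolov5_dir` is assumed to point at the cloned repository.

```python
import subprocess
import sys


def build_train_command(weights="yolov5s.pt", data="data/imgs.yaml",
                        imgsz=320, device="cpu"):
    """Build the argument list for train.py with the four options listed above."""
    return [
        sys.executable, "train.py",
        "--weights", weights,
        "--data", data,
        "--imgsz", str(imgsz),
        "--device", device,
    ]


def run_training(yolov5_dir="."):
    """Run train.py from inside the cloned repository (assumed location)."""
    subprocess.run(build_train_command(), cwd=yolov5_dir, check=True)


# Show the command without starting a training run
print(" ".join(build_train_command()[1:]))
# prints: train.py --weights yolov5s.pt --data data/imgs.yaml --imgsz 320 --device cpu
```

Passing the values this way leaves train.py unmodified, which makes it easier to update the repository later.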

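The imgs.yaml file describes the dataset paths, the number of classes, and related settings. The configuration is shown below; be sure to change the paths to match your own files.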

train: ../train/
val: ../valid/
test:   

# number of classes
nc: 1

# class names
names: [ 'target' ]
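If you would rather not hand-edit the paths, a file like the one above can be generated with absolute paths. This is only a sketch: the function `write_dataset_yaml` is hypothetical, it omits the empty test entry, and it assumes the paths contain no characters that need YAML quoting.

```python
from pathlib import Path


def write_dataset_yaml(dataset_root, yaml_path, names=("target",)):
    """Write a dataset YAML whose train/val entries are absolute paths to the
    train and valid folders found under dataset_root."""
    root = Path(dataset_root).resolve()
    lines = [
        f"train: {(root / 'train').as_posix()}",
        f"val: {(root / 'valid').as_posix()}",
        "",
        "# number of classes",
        f"nc: {len(names)}",
        "",
        "# class names",
        "names: [ " + ", ".join(f"'{name}'" for name in names) + " ]",
    ]
    Path(yaml_path).write_text("\n".join(lines) + "\n", encoding="utf-8")
    return yaml_path
```

For example, `write_dataset_yaml("..", "data/imgs.yaml")` run from the YOLOv5 root would point at train and valid folders one level up, matching the layout used in this article.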

Start training

python train.py

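If everything goes well, a runs directory appears in the root directory, and the exp* folders under runs/train hold the training results. The weights directory inside the most recent exp* folder (exp20 in the author's case) contains the final weight files: best.pt is the best result, and last.pt is the result of the last epoch.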
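Finding the most recent exp* folder by hand gets tedious after a few runs. The sketch below picks the highest-numbered run that contains a best.pt. It assumes the exp, exp2, exp3, ... naming that YOLOv5 uses for successive runs; the function name `latest_best_weights` is hypothetical.

```python
import re
from pathlib import Path


def latest_best_weights(runs_dir="runs/train"):
    """Return the best.pt of the highest-numbered exp* run, or None if none exists."""
    candidates = []
    for run in Path(runs_dir).glob("exp*"):
        match = re.fullmatch(r"exp(\d*)", run.name)
        weights = run / "weights" / "best.pt"
        if match and weights.is_file():
            # A bare "exp" has no numeric suffix; treat it as run number 1
            candidates.append((int(match.group(1) or 1), weights))
    if not candidates:
        return None
    return max(candidates, key=lambda item: item[0])[1]
```

The numeric comparison matters here: a plain string sort would put exp9 after exp20.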

Verify the results

python detect.py --weights best.pt --source ../imgs --data data/imgs.yaml --imgsz 160 320 --conf-thres 0.15

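Parameters:

--weights the weight file produced by training
--source path to the images to run detection on
--data optional; the dataset configuration used for training
--imgsz inference image size, given as height and width
--conf-thres confidence threshold; only detections above this value are marked

Running detect.py likewise creates a detect folder under the runs directory, and the exp* folders inside it contain the detection results.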
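To turn a detection into a pixel position, detect.py can also write its boxes to text files when run with `--save-txt --save-conf`; each line then holds the class id, the normalized center x/y, width, height, and the confidence. The sketch below converts one such line into the left edge of the box in pixels. The function name `gap_left_edge` and the sample line are made up, and using the left edge as the slider offset is an assumption that depends on the CAPTCHA in question.

```python
def gap_left_edge(label_line, image_width):
    """Return (left_edge_px, confidence) for one line of a YOLO label file.

    The line is expected to be "class x_center y_center width height [conf]"
    with normalized coordinates; confidence is None when the column is absent.
    """
    fields = label_line.split()
    x_center, width = float(fields[1]), float(fields[3])
    confidence = float(fields[5]) if len(fields) > 5 else None
    left_edge = (x_center - width / 2) * image_width
    return left_edge, confidence


# Made-up detection in a 320-pixel-wide image
print(gap_left_edge("0 0.71875 0.4375 0.1875 0.375 0.91", image_width=320))
# prints: (200.0, 0.91)
```

If a file contains several lines, keeping the one with the highest confidence is a reasonable first choice.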

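Final results

The experiment gave good results. Even with a very small dataset (30 images), training produced fairly accurate detection, at about 15 ms per image. YOLO is a very capable object detection algorithm.

Complete code and supporting files: https://pan.baidu.com/s/16CuewqUFydVwWt5LRUx1wA?pwd=icc7

Shared freely, from one internet person to another. Likes, follows, and comments are appreciated.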