OpenCV与AI深度学习 | Recognizing License Plates in Video with YOLO and EasyOCR

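This article comes from the WeChat official account "OpenCV与AI深度学习". It is shared for academic purposes only and will be removed on request if it infringes any rights.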

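In this article, we look at how to detect license plates in a video file with Python, using YOLO (You Only Look Once) for detection and EasyOCR (Optical Character Recognition) for reading the plate text. The approach uses deep learning to detect and recognize plates frame by frame; how close it gets to real time depends on your hardware, since the OCR reader in this example runs on the CPU (gpu=False).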

Prerequisites

Before you start, make sure the following Python packages are installed:

pip install opencv-python ultralytics easyocr Pillow numpy

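Implementation Steps

Step 1: Initialize the libraries

We start by importing the required libraries: OpenCV for video processing, YOLO for object detection, and EasyOCR for reading the text on detected plates. The same block also creates the OCR reader, loads the model, opens the input video, and sets up the output writer.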

import cv2
from ultralytics import YOLO
import easyocr
from PIL import Image
import numpy as np

# Initialize EasyOCR reader
reader = easyocr.Reader(['en'], gpu=False)

# Load your YOLO model (replace with your model's path)
model = YOLO('best_float32.tflite', task='detect')

# Open the video file (replace with your video file path)
video_path = 'sample4.mp4'
cap = cv2.VideoCapture(video_path)

# Create a VideoWriter object (optional, if you want to save the output)
output_path = 'output_video.mp4'
fourcc = cv2.VideoWriter_fourcc(*'mp4v')
out = cv2.VideoWriter(output_path, fourcc, 30.0, (640, 480))  # Adjust frame size if necessary

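Step 2: Process the video frames

We read the video frame by frame, run detection to locate license plates, and then apply OCR to read the text on each plate. To improve performance, only every third frame is processed; the frames in between are skipped.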

# Frame skipping factor (adjust as needed for performance)
frame_skip = 3  # Process every 3rd frame and skip the two in between
frame_count = 0

while cap.isOpened():
    ret, frame = cap.read()  # Read a frame from the video
    if not ret:
        break  # Exit loop if there are no frames left

    # Skip frames
    if frame_count % frame_skip != 0:
        frame_count += 1
        continue  # Skip processing this frame

    # Resize the frame (optional, adjust size as needed)
    frame = cv2.resize(frame, (640, 480))  # Resize to 640x480

    # Make predictions on the current frame
    results = model.predict(source=frame)

    # Iterate over results and draw predictions
    for result in results:
        boxes = result.boxes  # Get the boxes predicted by the model
        for box in boxes:
            class_id = int(box.cls)  # Get the class ID
            confidence = box.conf.item()  # Get confidence score
            coordinates = box.xyxy[0]  # Get box coordinates as a tensor

            # Extract and convert box coordinates to integers
            x1, y1, x2, y2 = map(int, coordinates.tolist())  # Convert tensor to list and then to int

            # Draw the box on the frame
            cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)  # Draw rectangle

            # Try to apply OCR on detected region
            try:
                # Ensure coordinates are within frame bounds
                r0 = max(0, x1)
                r1 = max(0, y1)
                r2 = min(frame.shape[1], x2)
                r3 = min(frame.shape[0], y2)

                # Crop license plate region
                plate_region = frame[r1:r3, r0:r2]
                if plate_region.size == 0:
                    continue  # Degenerate box; nothing to read

                # Convert the crop from BGR (OpenCV order) to RGB before OCR
                plate_array = cv2.cvtColor(plate_region, cv2.COLOR_BGR2RGB)

                # Use EasyOCR to read text from plate
                plate_number = reader.readtext(plate_array)
                if plate_number:  # Label the box only when OCR found text
                    concat_number = ' '.join([number[1] for number in plate_number])
                    number_conf = np.mean([number[2] for number in plate_number])

                    # Draw the detected text on the frame
                    cv2.putText(
                        img=frame,
                        text=f"Plate: {concat_number} ({number_conf:.2f})",
                        org=(r0, r1 - 10),
                        fontFace=cv2.FONT_HERSHEY_SIMPLEX,
                        fontScale=0.7,
                        color=(0, 0, 255),
                        thickness=2
                    )

            except Exception as e:
                print(f"OCR Error: {e}")

    # Show the frame with detections
    cv2.imshow('Detections', frame)

    # Write the frame to the output video (optional)
    out.write(frame)

    if cv2.waitKey(1) & 0xFF == ord('q'):
        break  # Exit loop if 'q' is pressed

    frame_count += 1  # Increment frame count

# Release resources
cap.release()
out.release()  # Release the VideoWriter object if used
cv2.destroyAllWindows()
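
The frame-skipping logic above can be checked in isolation. With frame_skip = 3, the loop runs detection on frames 0, 3, 6, and so on, and only those frames reach out.write. If the source video runs at 30 fps, a writer opened at 30.0 fps would therefore play the result roughly three times too fast. The sketch below is a minimal illustration of that arithmetic; the helper names should_process and output_fps are hypothetical and not part of the original script.

```python
def should_process(frame_index: int, frame_skip: int) -> bool:
    """Return True for the frames the loop runs detection on."""
    return frame_index % frame_skip == 0


def output_fps(source_fps: float, frame_skip: int) -> float:
    """Frame rate that keeps playback speed unchanged when only
    every `frame_skip`-th frame is written to the output file."""
    if frame_skip < 1:
        raise ValueError("frame_skip must be at least 1")
    return source_fps / frame_skip


# Which of the first ten frames are processed with frame_skip = 3?
processed = [i for i in range(10) if should_process(i, 3)]
print(processed)            # [0, 3, 6, 9]

# Writer frame rate for a 30 fps source (assumed value for illustration)
print(output_fps(30.0, 3))  # 10.0
```

In the script, the source frame rate could be read with cap.get(cv2.CAP_PROP_FPS) and the adjusted value passed to cv2.VideoWriter in place of the hard-coded 30.0.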

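Code walkthrough:

Initializing EasyOCR: an EasyOCR reader is created for English text recognition.

Loading the YOLO model: the YOLO model is loaded from the given path. Make sure to replace this path with the path to your own model.

Reading video frames: the video file is opened with OpenCV, and a VideoWriter is initialized if you want to save the output.

Frame processing: each processed frame is read and resized to 640x480, and the model predicts the license plate locations.

Drawing predictions: the detected bounding boxes are drawn on the frame, and the region containing the plate is cropped for OCR.

Applying OCR: EasyOCR reads the text from the cropped plate image. The recognized text and its confidence score are drawn on the frame.

Output video: the processed frames are shown in a window and can optionally be saved to an output video file.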
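
The cropping and OCR steps described above can be pulled out into two small helpers that are easy to test without a video or a model. This is a minimal sketch, not the original author's code: the names clamp_box and summarize_ocr are hypothetical, and the sample data assumes EasyOCR's default readtext output, a list of (bounding box, text, confidence) entries.

```python
from statistics import mean
from typing import Optional, Sequence, Tuple


def clamp_box(x1: int, y1: int, x2: int, y2: int,
              frame_width: int, frame_height: int) -> Tuple[int, int, int, int]:
    """Clip a detection box so it lies inside the frame."""
    left = max(0, x1)
    top = max(0, y1)
    right = min(frame_width, x2)
    bottom = min(frame_height, y2)
    return left, top, right, bottom


def summarize_ocr(detections: Sequence[tuple]) -> Optional[Tuple[str, float]]:
    """Join the recognized text pieces and average their confidences.

    Returns None when OCR found nothing, so the caller can skip
    drawing a label instead of averaging an empty list.
    """
    if not detections:
        return None
    text = ' '.join(d[1] for d in detections)
    confidence = mean(d[2] for d in detections)
    return text, confidence


# A box that spills over the left and right edges of a 640x480 frame
print(clamp_box(-5, 10, 700, 470, 640, 480))  # (0, 10, 640, 470)

# Made-up OCR output in the assumed (bbox, text, confidence) layout
sample = [
    ([[0, 0], [40, 0], [40, 20], [0, 20]], "ABC", 0.5),
    ([[45, 0], [90, 0], [90, 20], [45, 20]], "123", 1.0),
]
print(summarize_ocr(sample))  # ('ABC 123', 0.75)
print(summarize_ocr([]))      # None
```

Returning None for an empty result is one possible design choice; it keeps frames with a detected box but no readable text from being labeled with an undefined confidence.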
