OpenCV Case Study: Digit Recognition on Industrial Printed Materials

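This article walks through the complete workflow for recognizing digits in images of printed materials with Python and OpenCV: environment setup, text region detection, non-maximum suppression, morphological processing, character segmentation and extraction, dataset generation, and SVM training, ending with accurate recognition of the characters.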


I. Environment Setup

  • Python
  • OpenCV-python development package
  • OpenCV DNN module
  • OpenCV ML module
  • PyCharm 2019
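
Before running anything, it can help to confirm that the packages listed above are importable. The following is a minimal sketch, not part of the original project: `missing_modules` is a hypothetical helper name, and it assumes the OpenCV bindings are installed under the usual module name `cv2` (the DNN and ML modules are exposed as `cv2.dnn` and `cv2.ml`).

```python
import importlib.util


def missing_modules(names):
    """Return the names from `names` that cannot be found on the import path."""
    return [n for n in names if importlib.util.find_spec(n) is None]


# cv2 is the module name of the OpenCV Python bindings; numpy is used for arrays.
missing = missing_modules(["cv2", "numpy"])
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages were found.")
```

If nothing is reported missing, `import cv2 as cv` followed by `print(cv.__version__)` shows which OpenCV version is in use.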

Project repository: https://github.com/zxinyang38/opencv-

II. Result Preview

Recognize the digits in a given image of printed material.


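III. Experiment Steps

1. EAST text detection model (text region detection with the EAST network)

  • EAST network architecture
    [Figure: EAST network architecture]
  • Load the network and list its layers
    The east_text weight file is available in the author's GitHub repository: https://github.com/zxinyang38/opencv-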
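import cv2 as cv
import numpy as np

# Load the frozen EAST text detection model
net = cv.dnn.readNet('C:/Program Files (x86)/pycharm/pycharm/PycharmProjects/ML/ocr_demo/frozen_east_text_detection.pb')
# Print the name of every layer so the two output layers can be located
names = net.getLayerNames()
for name in names:
    print(name)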

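The code above prints the name of every layer. feature_fusion/Conv_7/Sigmoid corresponds to the score map of the EAST network,
and feature_fusion/concat_3 corresponds to the RBOX geometry part on the far right of the EAST architecture.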
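
To make the two outputs more concrete, here is a small sketch of the blob shapes one would expect from them. It relies on two points visible in the code later in this article: the feature maps are 4x smaller than the input, and the geometry output is indexed with five channels (four distances plus an angle). The helper name `east_output_shapes` is hypothetical, and the multiple-of-32 check reflects the usual input requirement for this model rather than anything stated here.

```python
def east_output_shapes(width, height, stride=4, geometry_channels=5):
    """Expected (scores, geometry) blob shapes for a single input image."""
    # Assumption: EAST input sides are normally chosen as multiples of 32.
    if width % 32 or height % 32:
        raise ValueError("width and height should be multiples of 32")
    rows, cols = height // stride, width // stride
    # Score map: one confidence per cell. Geometry: 4 distances + 1 angle per cell.
    return (1, 1, rows, cols), (1, geometry_channels, rows, cols)


scores_shape, geometry_shape = east_output_shapes(320, 320)
print(scores_shape)    # (1, 1, 80, 80)
print(geometry_shape)  # (1, 5, 80, 80)
```

With the 320x320 input used in this article, each output therefore covers an 80x80 grid of cells.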

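  • Use the network
   def detect(self,image):
       (H,W) = image.shape[:2]
       # ratios between the original image size and the 320x320 network input
       rH = H / float(320)
       rW = W / float(320)
       # build the input blob: resize to 320x320, subtract the per-channel mean, swap the R and B channels
       blob = cv.dnn.blobFromImage(image,1.0,(320,320),(123.68, 116.78, 103.94),swapRB=True,crop=False)
       self.net.setInput(blob)
       # one forward pass returns the score map and the geometry map
       (scores, geometry) = self.net.forward(self.layerNames)
       print(scores)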

For details, see text_area_detect.py.
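
The ratios rH and rW computed in detect are the factors needed to map a box found in the 320x320 network input back onto the original image. A minimal sketch of that mapping follows; `scale_box` is a hypothetical helper, the image size is made up for illustration, and this may differ from how text_area_detect.py applies the ratios.

```python
def scale_box(box, rW, rH):
    """Map (startX, startY, endX, endY) from network input space to original image pixels."""
    startX, startY, endX, endY = box
    return (int(startX * rW), int(startY * rH), int(endX * rW), int(endY * rH))


# Hypothetical original image of 640x480 pixels (W x H).
H, W = 480, 640
rH, rW = H / float(320), W / float(320)  # 1.5 and 2.0

print(scale_box((38, 16, 46, 28), rW, rH))  # (76, 24, 92, 42)
```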

2. Non-maximum suppression (NMS)

The raw detection result may look like the figure below:
[Figure: detection result before NMS]
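
To show what suppression does to a set of overlapping boxes, here is a pure-Python sketch of the greedy procedure. It illustrates the idea only and is not OpenCV's implementation: `iou` and `greedy_nms` are hypothetical names, boxes are given as (x1, y1, x2, y2), and the 0.4 threshold is an arbitrary choice. OpenCV's own cv.dnn.NMSBoxes takes boxes as (x, y, w, h) together with a score threshold and an NMS threshold.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, iou_threshold=0.4):
    """Return indices of kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with any box already kept.
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


# Two heavily overlapping boxes and one separate box (made-up values).
boxes = [(10, 10, 110, 60), (12, 12, 112, 62), (200, 10, 300, 60)]
scores = [0.9, 0.8, 0.95]
print(greedy_nms(boxes, scores))  # [2, 0]
```

The lower-scoring box of the overlapping pair is dropped, and the isolated box is kept.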

The NMSBoxes API is therefore introduced; non-maximum suppression removes the poorer regions:

import cv2 as cv
import numpy as np

class TextAreaDetector:
   def __init__(self,model_path):
       self.net = cv.dnn.readNet(model_path)
       names = self.net.getLayerNames()
       for name in names:
           print(name)
       self.threshold = 0.5
       self.layerNames = ["feature_fusion/Conv_7/Sigmoid","feature_fusion/concat_3"]


   def detect(self,image):
       (H,W) = image.shape[:2]
       rH = H / float(320)
       rW = W / float(320)
       blob = cv.dnn.blobFromImage(image,1.0,(320,320),(123.68, 116.78, 103.94),swapRB=True,crop=False)
       self.net.setInput(blob)
       (scores, geometry) = self.net.forward(self.layerNames)
       print(scores)

       (numRows, numCols) = scores.shape[2:4]
       rects = []
       confidences = []

       # start to decode the output
       for y in range(0, numRows):
           scoresData = scores[0, 0, y]
           xData0 = geometry[0, 0, y]
           xData1 = geometry[0, 1, y]
           xData2 = geometry[0, 2, y]
           xData3 = geometry[0, 3, y]
           anglesData = geometry[0, 4, y]

           # loop over the number of columns
           for x in range(0, numCols):
               # if our score does not have sufficient probability, ignore it
               if scoresData[x] < self.threshold:
                   continue

               # compute the offset factor as our resulting feature maps will
               # be 4x smaller than the input image
               (offsetX, offsetY) = (x * 4.0, y * 4.0)

               # extract the rotation angle for the prediction and then
               # compute the sin and cosine
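
The listing breaks off at the rotation-angle step. As a rough guide to what typically follows, here is a sketch of decoding one grid cell into an axis-aligned box. It follows the decoding recipe commonly used with this EAST model and is not the author's own code: `decode_cell` and its parameter names are hypothetical, with d_top, d_right, d_bottom and d_left standing for xData0[x] through xData3[x] and angle for anglesData[x].

```python
import math


def decode_cell(x, y, d_top, d_right, d_bottom, d_left, angle, stride=4.0):
    """Turn one score-map cell and its geometry values into (startX, startY, endX, endY)."""
    # Feature maps are 4x smaller than the network input, so scale the cell index back up.
    offsetX, offsetY = x * stride, y * stride
    cos, sin = math.cos(angle), math.sin(angle)
    # Box height and width are sums of the opposite edge distances.
    h = d_top + d_bottom
    w = d_right + d_left
    # Bottom-right corner, taking the predicted rotation into account.
    endX = int(offsetX + cos * d_right + sin * d_bottom)
    endY = int(offsetY - sin * d_right + cos * d_bottom)
    # Top-left corner follows from the box size.
    startX = int(endX - w)
    startY = int(endY - h)
    return startX, startY, endX, endY


# Made-up values for a single unrotated cell at column 10, row 5.
print(decode_cell(10, 5, 4, 6, 8, 2, 0.0))  # (38, 16, 46, 28)
```

In a full detector, each decoded box and its score would be appended to rects and confidences, and the collected boxes would then be filtered with NMS.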
      