Sudoku Recognition and Auto-Fill Based on the KNN ML Algorithm and OpenCV___By 何子辰

Latest recommended article published on 2024-09-16 14:02:00

weixin_30784501

License: CC 4.0 BY-SA

Tags: artificial intelligence, data structures and algorithms, python

Original link: http://www.cnblogs.com/IrivingLoveLuna/p/10135706.html

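I recently came across an interesting project on GitHub: a Sudoku recognition and auto-fill algorithm built on OpenCV and KNN. I wanted to write it out and run it myself. Here is the original repository first:

[Reposted from] https://github.com/LiuXiaolong19920720/opencv-soduko

Result screenshot: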

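Processing:
- 1: Extracting the digits from the 9x9 grid---Extract_nums.py
- **Method**: cv2.findContours(image, mode, method[, contours[, hierarchy[, offset] ] ]) → contours, hierarchy
~ In Python, findContours() takes the parameters below and returns contours and hierarchy. Note that OpenCV 3.x returns three values (image, contours, hierarchy), while OpenCV 2.x and 4.x return only contours and hierarchy.
~~1. image: the source image, normally an 8-bit single-channel image and, more specifically, a binary image. Other cases are not covered here.
~~2. mode: the contour retrieval mode. A few common ones:
(1) cv2.RETR_EXTERNAL retrieves only the outermost contours. It sets hierarchy[i][2]=hierarchy[i][3]=-1 for every contour.
(2) cv2.RETR_LIST retrieves all contours and puts them in a list, without establishing any hierarchy between them.
(3) cv2.RETR_TREE retrieves all contours and reconstructs the full tree of nested contours.
~~3. method: the contour approximation method, which decides whether every pixel on the contour is stored or only a few key points. For a straight segment, for example, cv2.CHAIN_APPROX_NONE stores every point, while cv2.CHAIN_APPROX_SIMPLE stores only the two endpoints.
~~4. contours: the detected contours, each one a set of points.
~~5. hierarchy
(1) Meaning: what is a hierarchy? When contours are detected, one contour may enclose another (think of concentric circles). The outer circle is the parent contour and the inner circle is the child contour.
Contours on the same level are also linked to the previous and the next contour on that level. In short, hierarchy describes the relationships between contours.
(2) Each contour has one entry: [Next, Previous, First_Child, Parent]. Each field is a contour index, or -1 if there is no such contour.
(3) With cv2.RETR_EXTERNAL, the First_Child and Parent fields from (2) are -1 for every contour, because only the outermost contours are kept.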

# -*- coding: UTF-8 -*-
import cv2

# In the Sudoku grid, the outermost border that encloses everything is contour 0,
# the 81 small cells are child contours of contour 0,
# and each given digit is a child contour of its cell.

# Each contour has a hierarchy entry: [Next, Previous, First_Child, Parent]

# Read the picture
img = cv2.imread('images/001.jpg')
# Convert BGR to grayscale
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Threshold segmentation. THRESH_BINARY_INV (the raw value 1 in the original call)
# turns the dark lines and digits white on a black background.
ret, thresh = cv2.threshold(gray, 200, 255, cv2.THRESH_BINARY_INV)

# Dilate the binary image: this thickens the white grid lines so that small gaps close up
kernel = cv2.getStructuringElement(cv2.MORPH_CROSS, (5, 5))
dilated = cv2.dilate(thresh, kernel)

# Extract contours; cv2.RETR_TREE builds the hierarchy.
# Taking the last two return values works on OpenCV 3.x (three values) and 4.x (two values).
contours, hierarchy = cv2.findContours(dilated, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)[-2:]
print(hierarchy.shape)

# Extract the small cells: their parent is contour 0
boxes = []
for i in range(len(hierarchy[0])):
    if hierarchy[0][i][3] == 0:
        boxes.append(hierarchy[0][i])

# Extract the digits: a cell that has a child contour contains a digit
number_boxes = []
for j in range(len(boxes)):
    if boxes[j][2] != -1:
        x, y, w, h = cv2.boundingRect(contours[boxes[j][2]])
        number_boxes.append([x, y, w, h])
        img = cv2.rectangle(img, (x-1, y-1), (x+w+1, y+h+1), (0, 0, 255), 2)

cv2.namedWindow("img", cv2.WINDOW_NORMAL)
cv2.imshow("img", img)
cv2.waitKey(0)
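
The two filtering loops in this script can be tried without any image. The sketch below uses a hand-written hierarchy list with hypothetical values, chosen to imitate a tiny grid with two cells, one of which holds a digit. The helper names `cells_of` and `digits_of` are likewise made up for illustration.

```python
# Each row is [Next, Previous, First_Child, Parent], the layout cv2.findContours uses.
# Hypothetical hierarchy for a grid with two cells, the second one holding a digit:
#   0: outer border of the grid
#   1: first cell (empty)
#   2: second cell
#   3: digit inside the second cell
toy_hierarchy = [
    [-1, -1,  1, -1],   # 0: no siblings, first child is 1, no parent
    [ 2, -1, -1,  0],   # 1: next sibling is 2, no child, parent is 0
    [-1,  1,  3,  0],   # 2: previous sibling is 1, first child is 3, parent is 0
    [-1, -1, -1,  2],   # 3: no siblings, no child, parent is 2
]

def cells_of(hierarchy, grid_index=0):
    """Indices of all contours whose Parent is the grid border."""
    return [i for i, row in enumerate(hierarchy) if row[3] == grid_index]

def digits_of(hierarchy, cells):
    """Map each non-empty cell index to the index of its first child contour."""
    return {c: hierarchy[c][2] for c in cells if hierarchy[c][2] != -1}

cells = cells_of(toy_hierarchy)
digits = digits_of(toy_hierarchy, cells)
print(cells)   # [1, 2]
print(digits)  # {2: 3}
```

Only First_Child is inspected, as in the script, so a cell containing extra noise contours would still be treated as holding a single digit.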

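- 2: Digit recognition
- a. Data collection and processing---build_datasets.py
- b. kNN digit recognition---knn_recogntion.py
-(1) This uses the kNN implementation that ships with OpenCV. I also tried OpenCV's built-in neural network and SVM, and kNN still gave the best results. Try them yourself if you are interested; it is also possible that I simply did not tune their parameters well.
-(2) -1. Load the sample and label data saved by build_datasets.py;
-2. Use 80 samples as training data and 20 as test data;
-3. Train a model with OpenCV's built-in kNN;
-4. Use the trained model to recognize the digits in the test data;
-5. Output the predicted values and the actual labels.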

# -*- coding: UTF-8 -*-
import os
import glob as gb

import numpy as np
import cv2

# Collect the raw digit pictures in the folder
img_path = gb.glob("numbers/*")

k = 0
labels = []
samples = []

# 1. Loop over the raw digit pictures in the folder
# 2. Extract contours from each picture, outer contours only
# 3. Compute each contour's bounding rectangle and use its size to pick out the digit contours
# 4. Sort the digit boxes by position: the first row is 1 2 3 4 5, the second row is 6 7 8 9 0
# 5. Cut out the rectangle around each digit as an ROI

for path in img_path:
    # img has shape (height, width, 3), with channels in B, G, R order
    img = cv2.imread(path)
    # Convert to grayscale
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    # Gaussian blur
    blur = cv2.GaussianBlur(gray, (5, 5), 0)

    # Adaptive threshold: turn the grayscale image into a binary image
    # (the original call passed the raw values 1, 1 for the two named constants)
    thresh = cv2.adaptiveThreshold(blur, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                                   cv2.THRESH_BINARY_INV, 11, 2)

    # Extract contours; [-2:] works on OpenCV 3.x and 4.x
    contours, hierarchy = cv2.findContours(thresh, cv2.RETR_EXTERNAL,
                                           cv2.CHAIN_APPROX_SIMPLE)[-2:]
    print("hierarchy's shape:{}".format(hierarchy.shape))

    # Bounding rectangles, filtered by size to keep only the digit contours
    height, width = img.shape[:2]
    list1 = []  # digits in the top row
    list2 = []  # digits in the bottom row
    for cnt in contours:
        # Smallest upright rectangle that encloses the contour
        x, y, w, h = cv2.boundingRect(cnt)
        if w > 30 and h > (height/4):
            if y < (height/2):
                list1.append([x, y, w, h])
            else:
                list2.append([x, y, w, h])

    # key=lambda t: t[n] sorts by field n: [0] is the first field (x), [1] the second (y), and so on
    list1_sorted = sorted(list1, key=lambda t: t[0])
    list2_sorted = sorted(list2, key=lambda t: t[0])

    # Cut out each digit's rectangle as an ROI
    for i in range(5):
        x1, y1, w1, h1 = list1_sorted[i]
        x2, y2, w2, h2 = list2_sorted[i]
        number_roi1 = gray[y1:y1+h1, x1:x1+w1]
        number_roi2 = gray[y2:y2+h2, x2:x2+w2]

        # Preprocessing
        # 1. Resize every ROI to 40 rows x 20 columns. Note that cv2.resize takes (width, height).
        resized_roi1 = cv2.resize(number_roi1, (20, 40))
        resized_roi2 = cv2.resize(number_roi2, (20, 40))

        # 2. Threshold: grayscale -> binary image with values 0 or 255
        thresh1 = cv2.adaptiveThreshold(resized_roi1, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                                        cv2.THRESH_BINARY_INV, 11, 2)
        thresh2 = cv2.adaptiveThreshold(resized_roi2, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                                        cv2.THRESH_BINARY_INV, 11, 2)

        # 3. Build the output paths and create the folders if they do not exist yet
        # (cv2.imwrite does not create missing folders).
        # The top row holds digits 1..5; the bottom row holds 6, 7, 8, 9, 0.
        j = i + 6
        if j == 10:
            j = 0
        number_path1 = os.path.join("datasets_hzc", str(i+1), "%d.jpg" % k)
        number_path2 = os.path.join("datasets_hzc", str(j), "%d.jpg" % k)
        os.makedirs(os.path.dirname(number_path1), exist_ok=True)
        os.makedirs(os.path.dirname(number_path2), exist_ok=True)
        k += 1

        # 4. Normalize pixel values to 0/1
        normalized_roi1 = thresh1/255
        normalized_roi2 = thresh2/255

        # 5. Flatten each binary image into one row of 800 values for kNN
        #    and store the matching digit label in the same order
        sample_1 = normalized_roi1.reshape((1, 800))
        samples.append(sample_1[0])
        labels.append(float(i+1))

        sample_2 = normalized_roi2.reshape((1, 800))
        samples.append(sample_2[0])
        labels.append(float(j))

        # Write the binary digit images to disk. Comment these two lines out
        # after the first run, since the files only need to be written once.
        cv2.imwrite(number_path1, thresh1)
        cv2.imwrite(number_path2, thresh2)
        cv2.imshow("number", normalized_roi1)
        cv2.waitKey(5)

# Save the data for the kNN (k-nearest neighbors) step
print(np.array(labels).shape)

print("\n"*3)
samples = np.array(samples, np.float32)
# (100, 800): 100 samples with 800 features each
print("train_data's dimension:{}".format(samples.shape))
labels = np.array(labels, np.float32)
labels = labels.reshape((labels.size, 1))
print("labels's dimension:{}".format(labels.shape))
# The original wrote 'samples,npy' (comma instead of dot), which saves under the wrong file name
np.save('samples.npy', samples)
np.save('label.npy', labels)
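
The row splitting, left-to-right sorting and label assignment used in this script can be checked on their own. The sketch below is a pure-Python illustration under assumed inputs: the bounding boxes and the 200-pixel sheet height are hypothetical, and `label_boxes` is a helper name invented here, not a function from the project.

```python
def label_boxes(boxes, image_height):
    """Split [x, y, w, h] boxes into top and bottom rows, sort each row
    left to right, and pair every box with its digit label."""
    top = sorted((b for b in boxes if b[1] < image_height / 2), key=lambda b: b[0])
    bottom = sorted((b for b in boxes if b[1] >= image_height / 2), key=lambda b: b[0])
    labelled = []
    for i, box in enumerate(top):
        labelled.append((i + 1, box))            # top row: 1, 2, 3, 4, 5
    for i, box in enumerate(bottom):
        labelled.append(((i + 6) % 10, box))     # bottom row: 6, 7, 8, 9, 0
    return labelled

# Hypothetical boxes for a 200-pixel-high sheet, deliberately out of order
boxes = [[130, 10, 40, 80], [10, 110, 40, 80], [70, 10, 40, 80],
         [10, 10, 40, 80], [250, 110, 40, 80], [190, 10, 40, 80],
         [70, 110, 40, 80], [250, 10, 40, 80], [130, 110, 40, 80],
         [190, 110, 40, 80]]
result = label_boxes(boxes, image_height=200)
for digit, box in result:
    print(digit, box)
```

Using `(i + 6) % 10` gives the same labels as the script's `j = i + 6` followed by the reset to 0 when `j` reaches 10.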

- 3. Sudoku generation and solving

# -*- coding: UTF-8 -*-
import numpy as np
import cv2

# Digit recognition with the k-nearest neighbors algorithm
# 1. Load the sample and label data saved earlier;
# 2. Use 80 samples as training data and 20 as test data;
# 3. Train a model with OpenCV's built-in kNN;
# 4. Use the trained model to recognize the digits in the test data;
# 5. Output the predicted values and the actual labels.

## Sudoku solver using backtracking. Taken from the link below, with minor changes.
## http://stackoverflow.com/questions/1697334/algorithm-for-solving-sudoku
def findNextCellToFill(grid, i, j):
    for x in range(i, 9):
        for y in range(j, 9):
            if grid[x][y] == 0:
                return x, y
    for x in range(0, 9):
        for y in range(0, 9):
            if grid[x][y] == 0:
                return x, y
    return -1, -1

def isValid(grid, i, j, e):
    rowOk = all([e != grid[i][x] for x in range(9)])
    if rowOk:
        columnOk = all([e != grid[x][j] for x in range(9)])
        if columnOk:
            # Top-left coordinates of the 3x3 section containing cell (i, j)
            secTopX, secTopY = 3 * (i // 3), 3 * (j // 3)
            for x in range(secTopX, secTopX+3):
                for y in range(secTopY, secTopY+3):
                    if grid[x][y] == e:
                        return False
            # Return True only after all nine cells of the section were checked
            # (the original returned after the first row of the section)
            return True
    return False

def solveSudoku(grid, i=0, j=0):
    i, j = findNextCellToFill(grid, i, j)
    if i == -1:
        return True
    for e in range(1, 10):
        if isValid(grid, i, j, e):
            grid[i][j] = e
            if solveSudoku(grid, i, j):
                return True
            # Undo the current cell for backtracking
            grid[i][j] = 0
    return False

# Adjust these paths to wherever samples.npy and label.npy were saved
samples = np.load('F:/opencv_learning/opencv-soduko-master/samples.npy')
labels = np.load('F:/opencv_learning/opencv-soduko-master/label.npy')

k = 80
train_label = labels[:k]
test_label = labels[k:]
train_input = samples[:k]
test_input = samples[k:]
print("train_label's shape:{}".format(train_label.shape))
print("train_input's shape:{}".format(train_input.shape))
print("test_label's shape:{}".format(test_label.shape))
print("test_input's shape:{}".format(test_input.shape))

# Create and train the model
model = cv2.ml.KNearest_create()
model.train(train_input, cv2.ml.ROW_SAMPLE, train_label)

# Optional check on the 20 held-out samples (steps 4 and 5 above):
# retval, results, neigh_resp, dists = model.findNearest(test_input, 1)
# predicted = results.ravel()
# print("The original label:{}".format(test_label.ravel()))
# print("The prediction:{}".format(predicted))
# accuracy = np.mean(predicted == test_label.ravel())
# print("The test accuracy:{}".format(accuracy))

img = cv2.imread('./images/001.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Threshold segmentation (THRESH_BINARY_INV is the raw value 1 used originally)
ret, thresh = cv2.threshold(gray, 200, 255, cv2.THRESH_BINARY_INV)

# Structuring element for the morphological operation
kernel = cv2.getStructuringElement(cv2.MORPH_CROSS, (5, 5))
# Dilate the image with that structuring element
dilated = cv2.dilate(thresh, kernel)

# Extract contours; [-2:] works on OpenCV 3.x and 4.x
contours, hierarchy = cv2.findContours(dilated, cv2.RETR_TREE,
                                       cv2.CHAIN_APPROX_SIMPLE)[-2:]

# Extract the 81 small cells
# hierarchy entry: [Next, Previous, First_Child, Parent]
boxes = []
for i in range(len(hierarchy[0])):
    if hierarchy[0][i][3] == 0:
        boxes.append(hierarchy[0][i])

print("81 cells: dimension:{}".format(np.array(boxes).shape))

height, width = img.shape[:2]

# Divide by 9: the grid has 9 rows and 9 columns
box_h = height/9
box_w = width/9  # the original used height/9 here, which is only right for a square image
number_boxes = []

# Start from an all-zero 9x9 matrix
soduko = np.zeros((9, 9), np.int32)

for j in range(len(boxes)):
    if boxes[j][2] != -1:
        # Smallest upright bounding rectangle of the digit contour
        x, y, w, h = cv2.boundingRect(contours[boxes[j][2]])
        number_boxes.append([x, y, w, h])
        # Preprocess the extracted digit
        number_roi = gray[y:y+h, x:x+w]
        # Bring every digit to the same size
        resized_roi = cv2.resize(number_roi, (20, 40))
        thresh1 = cv2.adaptiveThreshold(resized_roi, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                                        cv2.THRESH_BINARY_INV, 11, 2)
        # Normalize pixel values
        normalized_roi = thresh1/255

        # Flatten into one row for kNN
        sample1 = normalized_roi.reshape((1, 800))
        sample1 = np.array(sample1, np.float32)

        retval, result, neigh_resp, dists = model.findNearest(sample1, 1)
        number = int(result.ravel()[0])

        # Draw the recognized digit (font 3 in the original is FONT_HERSHEY_COMPLEX)
        cv2.putText(img, str(number), (x+w+1, y+h-20), cv2.FONT_HERSHEY_COMPLEX,
                    2., (255, 0, 0), 2, cv2.LINE_AA)

        # Position in the matrix
        soduko[int(y/box_h)][int(x/box_w)] = number

        cv2.namedWindow("img", cv2.WINDOW_NORMAL)
        cv2.imshow("img", img)
        cv2.waitKey(120)

print("\nGenerated Sudoku\n")
print(soduko)
print("\nSolved Sudoku\n")

# Solve the Sudoku
solveSudoku(soduko)
print(soduko)

print("\nCheck: sum of every row and every column\n")
print(soduko.shape)
# np.sum with axis=0 adds down each column; axis=1 (or -1) adds across each row.
# Every sum should be 45 for a solved grid.
col_sum = np.sum(soduko, axis=0)
row_sum = np.sum(soduko, axis=-1, keepdims=True)
print(row_sum.T)
print(col_sum)

# Write each result into the picture at its position
for i in range(9):
    for j in range(9):
        x = int((i+0.25)*box_w)
        y = int((j+0.5)*box_h)
        # Draw the digit as a text string
        cv2.putText(img, str(soduko[j][i]), (x, y), cv2.FONT_HERSHEY_COMPLEX,
                    2.5, (0, 0, 255), 2, cv2.LINE_AA)

cv2.namedWindow("img", cv2.WINDOW_NORMAL)
cv2.imshow("img", img)
cv2.waitKey(10000)

# SVM alternative that was also tried:
# C = 5
# gamma = 0.5
# model_svm = cv2.ml.SVM_create()
# model_svm.setGamma(gamma)
# model_svm.setC(C)

# Linear model
# model_svm.setKernel(cv2.ml.SVM_LINEAR)
# model_svm.setType(cv2.ml.SVM_C_SVC)
# model_svm.train(train_input, cv2.ml.ROW_SAMPLE, train_label)
# predict_label = model_svm.predict(test_input)[1].ravel()
# print(predict_label)
# print(test_label.reshape(1,len(test_label))[0])
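
The backtracking step can be exercised on its own, without OpenCV or a trained model. The sketch below is a compact pure-Python variant of the same idea, not the author's exact functions: the names `is_valid`, `solve` and `is_complete` are introduced here, and the puzzle is a widely used example grid rather than the one from images/001.jpg. It also replaces the row/column sum check with a stricter test, since sums of 45 alone do not prove that every row, column and 3x3 box holds the digits 1 to 9 exactly once.

```python
def is_valid(grid, r, c, v):
    """True if value v can go at (r, c) without breaking its row, column or 3x3 box."""
    if any(grid[r][x] == v for x in range(9)):
        return False
    if any(grid[x][c] == v for x in range(9)):
        return False
    top, left = 3 * (r // 3), 3 * (c // 3)
    for x in range(top, top + 3):
        for y in range(left, left + 3):
            if grid[x][y] == v:
                return False
    return True  # reached only after all nine cells of the box were checked

def solve(grid):
    """Fill the zeros of grid in place by backtracking; return True on success."""
    for r in range(9):
        for c in range(9):
            if grid[r][c] == 0:
                for v in range(1, 10):
                    if is_valid(grid, r, c, v):
                        grid[r][c] = v
                        if solve(grid):
                            return True
                        grid[r][c] = 0  # undo and try the next value
                return False  # no value fits in this cell: backtrack
    return True  # no empty cell left

def is_complete(grid):
    """Every row, column and 3x3 box must contain exactly the digits 1..9."""
    want = set(range(1, 10))
    rows = all(set(row) == want for row in grid)
    cols = all(set(col) == want for col in zip(*grid))
    boxes = all(
        {grid[r + dr][c + dc] for dr in range(3) for dc in range(3)} == want
        for r in (0, 3, 6) for c in (0, 3, 6)
    )
    return rows and cols and boxes

puzzle = [
    [5, 3, 0, 0, 7, 0, 0, 0, 0],
    [6, 0, 0, 1, 9, 5, 0, 0, 0],
    [0, 9, 8, 0, 0, 0, 0, 6, 0],
    [8, 0, 0, 0, 6, 0, 0, 0, 3],
    [4, 0, 0, 8, 0, 3, 0, 0, 1],
    [7, 0, 0, 0, 2, 0, 0, 0, 6],
    [0, 6, 0, 0, 0, 0, 2, 8, 0],
    [0, 0, 0, 4, 1, 9, 0, 0, 5],
    [0, 0, 0, 0, 8, 0, 0, 7, 9],
]
solved = solve(puzzle)
for row in puzzle:
    print(row)
print("solved:", solved, "valid:", is_complete(puzzle))
```

A check like `is_complete` is also a cheap way to notice a misrecognized digit: if kNN reads a given digit wrongly, the solver will usually either fail or produce a grid that no longer matches the picture.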