OpenCV | Handwritten Digit Detection
Today I want to share the basic steps of a handwritten digit detection project. At the end of the article, I include a small project that illustrates the principle behind the KNN algorithm, which should help build a clearer picture of how it works.
Step 1. Import some packages for the project.
import cv2
import numpy as np
import scipy.special
import matplotlib.pyplot as plt
Step 2. Load the image and convert it to grayscale.
img = cv2.imread('digitsbig.png')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
Step 3. Build the training and test array.
In this step, we split the image into 50 rows and 100 columns so that every cell holds a single 20*20 digit. We then build the training array from the left 50 columns and the test array from the right 50 columns. Each array has size 2500*400 (2500 digits, 400 pixels per digit).
cells = [np.hsplit(row, 100) for row in np.vsplit(gray, 50)]
# Divide the image into 50 rows and 100 columns so that every element is a single digit.
x = np.array(cells)
train = x[:, :50].reshape(-1, 400).astype(np.float32)
# Flatten the left 50 columns into a training array of size 2500*400
test = x[:, 50:100].reshape(-1, 400).astype(np.float32)
# Flatten the right 50 columns into a test array of size 2500*400
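To check the shapes without the image file, the split can be reproduced on a synthetic array. This is only a sketch: it assumes the sheet is 1000 x 2000 pixels (50 x 100 cells of 20 x 20), and the name `sheet` is a stand-in for `gray`.

```python
import numpy as np

# Synthetic stand-in for the grayscale sheet (assumption: 1000 x 2000 pixels).
sheet = np.zeros((50 * 20, 100 * 20), dtype=np.uint8)

# Fill every cell row with its own row index so it can be traced after reshaping.
for r in range(50):
    sheet[r * 20:(r + 1) * 20, :] = r

cells = [np.hsplit(row, 100) for row in np.vsplit(sheet, 50)]
x = np.array(cells)                                   # (50, 100, 20, 20)

train = x[:, :50].reshape(-1, 400).astype(np.float32)
test = x[:, 50:100].reshape(-1, 400).astype(np.float32)

print(x.shape, train.shape, test.shape)
# (50, 100, 20, 20) (2500, 400) (2500, 400)
```

Row i of `train` comes from sheet row i // 50, which is why `train[50]` holds the first cell of the second sheet row.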
Step 4. Create training labels and test labels.
k = np.arange(10)  # k = 0, 1, 2, ..., 9
train_labels = np.repeat(k, 250)[:, np.newaxis]
test_labels = train_labels.copy()
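The factor 250 follows from the layout of the sheet. The sketch below assumes each digit occupies 5 consecutive rows, so each half of the sheet contributes 5 * 50 = 250 samples per digit. The names `rows_per_digit` and `cols_in_half` are illustrative and not part of the original code.

```python
import numpy as np

rows_per_digit = 5      # assumption: each digit fills 5 consecutive rows of the sheet
cols_in_half = 50       # the training half uses the left 50 columns
samples_per_digit = rows_per_digit * cols_in_half     # 250

train_labels = np.repeat(np.arange(10), samples_per_digit)[:, np.newaxis]

# Sample i came from sheet row i // 50, and sheet row r holds digit r // 5.
i = 1234
digit_from_position = (i // cols_in_half) // rows_per_digit
print(train_labels.shape, train_labels[i, 0], digit_from_position)
# (2500, 1) 4 4
```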
Step 5. Initialize k-NearestNeighbor and run the test
Next, we create the k-NearestNeighbor model, train it on the training data, and classify the test data with k = 5.
knn = cv2.ml.KNearest_create()
knn.train(train, cv2.ml.ROW_SAMPLE, train_labels)
ret, result, neighbours, dist = knn.findNearest(test, k=5)
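To see what a k-nearest-neighbor classifier does, here is a minimal brute-force version in NumPy. It is a sketch of the idea, not OpenCV's implementation: the function `knn_predict` is hypothetical, and on a tied vote it picks the smallest label, which may differ from how OpenCV resolves ties.

```python
import numpy as np

def knn_predict(train, train_labels, samples, k=5):
    """Brute-force k-NN: majority vote among the k closest training rows."""
    labels = train_labels.ravel().astype(np.int64)
    predictions = []
    for sample in samples:
        # Squared Euclidean distance from this sample to every training row.
        d2 = np.sum((train - sample) ** 2, axis=1)
        nearest = np.argsort(d2)[:k]          # indices of the k closest rows
        votes = np.bincount(labels[nearest])  # count the votes per label
        predictions.append(np.argmax(votes))
    return np.array(predictions)

# Two small clusters: label 0 near the origin, label 1 near (9, 9).
train = np.array([[0, 0], [1, 0], [0, 1], [9, 9], [10, 9], [9, 10]], dtype=np.float32)
train_labels = np.array([[0], [0], [0], [1], [1], [1]])
samples = np.array([[1, 1], [8, 8]], dtype=np.float32)

print(knn_predict(train, train_labels, samples, k=3))   # [0 1]
```

The same function accepts the 2500*400 arrays from this article, only much more slowly than OpenCV.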
Step 6. Calculate the accuracy of the KNN algorithm.
In this step, we compute and print the percentage of test digits that were classified correctly.
matches = result == test_labels
correct = np.count_nonzero(matches)
accuracy = correct * 100.0 / result.size
print('Accuracy:', accuracy)
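The accuracy formula can be checked on a tiny, made-up example. The arrays below are hypothetical values shaped like `result` and `test_labels`. The per-digit breakdown at the end is an addition that is not in the original code.

```python
import numpy as np

# Hypothetical predictions and ground truth, shaped (n, 1) like the arrays above.
result = np.array([[0], [0], [1], [1], [2], [2]], dtype=np.float32)
test_labels = np.array([[0], [0], [1], [2], [2], [2]])

matches = result == test_labels              # element-wise comparison, shape (6, 1)
correct = np.count_nonzero(matches)          # 5 of the 6 predictions agree
accuracy = correct * 100.0 / result.size
print(round(accuracy, 2))                    # 83.33

# Per-digit accuracy: mean of the match mask over the rows of each true label.
for digit in np.unique(test_labels):
    mask = test_labels.ravel() == digit
    print(digit, matches.ravel()[mask].mean())
```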
Step 7. Detect handwritten digits with KNN
The comments in the code below explain the details.
# Predict handwritten digits
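retval = knn.predict(test[2005:2007])
# Docstring: predict(samples[, results[, flags]]) -> retval, results
# In the Python bindings, predict returns a tuple (retval, results); retval[0] picks its first element.
print(retval[0])
# Print the value returned by the KNN prediction.
cv2.imshow('test', test[2005].reshape((20, 20)).astype(np.uint8))
# Show the digit we selected. Cast back to uint8 because imshow treats float images as being in the range [0, 1].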
cv2.waitKey(0)
cv2.destroyAllWindows()
Result: the console prints the predicted label, and a window shows the selected digit.
Find the Point Closest to the Test Point
PS: The comments in the code below explain the details.
Step 1. Import some packages for the project.
import cv2
import numpy as np
import matplotlib.pyplot as plt
Step 2. Create training data and labels.
# Create 25 rows and 2 columns of known training data; each row is an (x, y) point
trainData = np.random.randint(0, 100, (25, 2)).astype(np.float32)
# labels: 0 represents red, 1 represents blue.
labels = np.random.randint(0, 2, (25, 1)).astype(np.float32)
Step 3. Plot the red triangles and blue squares.
# Select the red points and plot them:
red = trainData[labels.ravel() == 0]
# ravel() flattens labels to 1-D so it can be used as a boolean mask over the rows
plt.scatter(red[:, 0], red[:, 1], s=80, c='r', marker='^')
# Plot the red points as triangles
# Select the blue points and plot them:
blue = trainData[labels.ravel() == 1]
# ravel() flattens labels to 1-D so it can be used as a boolean mask over the rows
plt.scatter(blue[:, 0], blue[:, 1], s=80, c='b', marker='s')
# Plot the blue points as squares
Step 4. Create the test point and plot it as a green circle.
test = np.random.randint(0, 100, (1, 2)).astype(np.float32)
plt.scatter(test[:, 0], test[:, 1], s=80, c='g', marker='o')
Step 5. Initialize k-NearestNeighbor and run the test
PS: k=1 is the plain nearest-neighbor algorithm. You can also try other values of k, which return the several points closest to the test circle.
knn = cv2.ml.KNearest_create()
knn.train(trainData, cv2.ml.ROW_SAMPLE, labels)
ret, results, neighbours, dist = knn.findNearest(test, k=1)
Step 6. Print the results.
print("result: ", results, "\n")
print("neighbours: ", neighbours, "\n")
print("distance: ", dist)
plt.show()
Results:
The return values include:
- The category flag that the KNN algorithm assigns to the test data, either 0 or 1 (0 represents a red triangle, 1 represents a blue square).
- The category flags of the k nearest neighbors.
- The distance from each nearest neighbor to the test data.
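The three values can be reproduced by hand for a small fixed data set. This is a sketch with made-up points rather than the random data above. It prints both the squared and the plain Euclidean distance so that either can be compared with the `dist` value OpenCV reports.

```python
import numpy as np

# Hypothetical fixed points instead of random data, so the numbers are repeatable.
train_data = np.array([[10, 10], [50, 50], [90, 20]], dtype=np.float32)
labels = np.array([0, 1, 0])          # 0 = red triangle, 1 = blue square
test_point = np.array([48, 55], dtype=np.float32)

# Squared Euclidean distance from the test point to every training point.
squared = np.sum((train_data - test_point) ** 2, axis=1)
order = np.argsort(squared)           # training indices, closest first

k = 1
nearest = order[:k]
print("result:", labels[nearest[0]])             # label of the closest point (k = 1)
print("neighbours:", labels[nearest])            # labels of the k nearest points
print("squared distance:", squared[nearest])
print("distance:", np.sqrt(squared[nearest]))
```

Here the closest training point is (50, 50), so the test point is classified as 1 (blue square).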
Thank you for reading!
--credit by dora 2020.4.3
resources:
https://www.cnblogs.com/gaowenxingxing/p/11829424.html
https://blog.youkuaiyun.com/github_39611196/article/details/81218563
https://www.bilibili.com/video/BV1g441127XL?
https://pysource.com/2018/08/26/knn-handwritten-digits-recognition-opencv-3-4-with-python-3-tutorial-36/