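A Small Optimization for Handwritten Digit Recognition

 

       While implementing handwritten digit recognition with KNN, it occurred to me that the digits could first be grouped by their shape features in order to improve accuracy. After many days of refinement the algorithm finally works; let's call it the "hole-and-turn method". (This is my first write-up, so there are bound to be shortcomings. Corrections are welcome.)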

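       The digits are grouped as follows:

            ① With holes

                    One hole: 0, 4, 6, 9

                    Two holes: 8

            ② Without holes

                    No turning point: 1

                    One turning point: 7

                    Two turning points: 2

                    Three or more turning points: 3, 5

         (These are the ideal conditions. Because handwriting habits differ, 2, 3, and 5 are often detected as having a hole as well and need special handling; see the code.)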

 

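        General approach to detecting holes and bottoms

Hole detection:
Preparation: treat each run of consecutive 1s in a row as one segment, and record where it starts and ends.
Example: 00001111111100011111100000000
    is recorded as  list = [4,11,15,20]
For 000011111111000111111000111100
there are three segments (usually caused by joined-up strokes; only the first two are recorded so that the result is not affected)
recorded as  list = [4,11,15,20]
For 00001111111111111110000000
there is only one segment, recorded as  list = [4,18,0,0]

If list[2]==0 or list[2]-list[1]<=6 (so that a hole whose top is not closed is not missed), the row is treated as continuous and recorded as [1,list[0],list[1],list[2],list[3]]. If the row is broken, it is recorded as
[0,list[0],list[1],list[2],list[3]]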
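Top detection: if the previous row is continuous and the current row is broken, the current row is a top.
Bottom detection: the previous row is broken and the current row is continuous, most of the rows between the corresponding top and this bottom are broken, and the distance between the top and the bottom is >= 5. (A top must already exist before a bottom can be detected.)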
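The two rules can be illustrated on the per-row continuity flags alone. This is a simplified sketch under stated assumptions: the function name and the parameters `min_height` and `max_continuous` are hypothetical, and the extra segment-width comparisons used in the full listing are left out.

```python
def find_tops_and_bottoms(flags, min_height=5, max_continuous=4):
    """flags[i] is 1 for a continuous row and 0 for a broken row.

    Returns a list of ("top", row) and ("bottom", row) marks.
    """
    marks = []
    open_top = None                      # row of a top still waiting for a bottom
    for i in range(1, len(flags)):
        prev, cur = flags[i - 1], flags[i]
        if prev == 1 and cur == 0 and open_top is None:
            open_top = i                 # continuous -> broken: a top
            marks.append(("top", i))
        elif prev == 0 and cur == 1 and open_top is not None:
            inside = flags[open_top:i]   # rows between the top and this row
            tall_enough = i - open_top >= min_height
            mostly_broken = sum(inside) <= max_continuous
            if tall_enough and mostly_broken:
                marks.append(("bottom", i))   # broken -> continuous: a bottom
                open_top = None
    return marks
```

A 0-like shape (a few continuous rows, a long broken stretch, then continuous rows again) yields one top and one bottom; an 8-like shape yields two of each, which is what the later return codes rely on.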
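The turning-point method in detail:
Record the average position of the 1s in each row.
Example: for 00001111111111111110000000
there is only one segment, list = [4,18,0,0], so x=(list[0]+list[1])/2
For 00001111111100011111100000000
 there are two segments, list = [4,11,15,20]; recording only the second one is enough (it does not affect the result)
x=(list[2]+list[3])/2
Then remove repeated values (values within 0.5 of each other count as repeats) by merging each pair into its average, so that a shaky hand is not mistaken for a turning point.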
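The merging and turning-point steps can be sketched as follows. The names `merge_close` and `turning_points` are hypothetical, and the direction test here is a strict sign change; the listings later in the post use slightly different thresholds (a 0.75 margin in one branch, non-strict comparisons in the other).

```python
def merge_close(xs, tol=0.5):
    """Merge neighbouring values that differ by at most tol into their average."""
    merged = []
    for x in xs:
        if merged and abs(merged[-1] - x) <= tol:
            merged[-1] = (merged[-1] + x) / 2   # two values become one
        else:
            merged.append(x)
    return merged


def turning_points(xs):
    """Indices where the sequence switches between increasing and decreasing."""
    turns = []
    for i in range(1, len(xs) - 1):
        left = xs[i] - xs[i - 1]
        right = xs[i + 1] - xs[i]
        if left * right < 0:                    # direction changes at i
            turns.append(i)
    return turns
```

A stroke that drifts steadily in one direction, such as a slanted 1, has no turning point, while a zigzag such as `[5, 8, 11, 8, 5, 8, 11]` has two, at indices 2 and 4.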

 

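        Because of handwriting habits, the hole in a digit is often left unclosed, so the requirement for a hole is relaxed: a hole is detected through its top and its bottom. The code below is the preparation step for detecting tops and bottoms; it records the positions of the 1s in each row.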

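def holeJudge(ar):  # ar: the image binarized into a 28*28 matrix
    judgei = []  # per-row information
    lista = []  # tops and bottoms
    for i in range(28):  # the image has 28 rows
        startj1 = -1  # start column of the first segment in row i
        startj2 = -1  # end column of the first segment in row i
        endj1 = 0  # start column of the second segment in row i
        endj2 = 0  # end of the second segment in row i
        # Repeated testing shows a third segment can be ignored without affecting the result (too many cases to list)

        for j in range(28):  # look for the break positions in this row
            if ar[i][j] == 1 and startj1 == -1:  # record the first segment
                startj1 = j
                while ar[i][j] == 1 and j < 27:  # (j<27) the last column is handled specially
                    j += 1
                startj2 = j - 1
            if startj2 != -1 and ar[i][j] == 1 and ar[i][j - 1] == 0:  # record the second segment
                if j < 27 and startj2 >= 26:
                    endj1 = 0
                    endj2 = 0
                else:
                    endj1 = j
                    while ar[i][j] == 1 and j < 27:
                        j += 1
                    endj2 = j
                    break
        if endj1 - startj2 <= 6 or endj1 == 0:  # continuous (1) if there is no second segment or the gap is small, otherwise broken (0); 6 worked best in repeated testing
            judgei.append([1, startj1, startj2, endj1, endj2])  # 1: this row is continuous
        else:
            judgei.append([0, startj1, startj2, endj1, endj2])  # 0: this row is broken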

The detection of tops and bottoms follows (continued from above)

    jjj = 0
    for i in range(28):
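        if (judgei[i-1][0]==1 and judgei[i][0]==0 ):  # previous row continuous (possibly with a small gap), this row broken: possible top
            if len(lista)==0:                       # no top recorded yet: accept it (the first top has looser requirements)
                starti = 1
                lista.append([starti, i])
            if len(lista) == 2 and lista[1][0] == 2: # one hole already found (one top and one bottom)
                p = 0
                jj = lista[1][1]                        # row of the first bottom
                while jj < i:                           # stricter check for a second top
                    jj += 1
                    if judgei[jj][3] == 0: # count the single-segment rows between the bottom and this row; too many means it is not a top (keeps a 3 with a top and a bottom, and other odd handwriting, from being read as a hole)
                        p += 1
                if p <= 5:                 # repeated testing: at most 5 continuous rows still counts as a top
                    starti = 1
                    lista.append([starti, i])
        if (judgei[i-1][3]-judgei[i-1][2]>1 and judgei[i][3]==0 and judgei[i][2]-judgei[i][1]>=judgei[i-1][3]-judgei[i-1][2]-4 and len(lista)!=0) or \
                (len(lista)!=0 and judgei[i-1][3]-judgei[i-1][2]>1 and judgei[i][3]!=0 and judgei[i][1]<=judgei[i-1][2]+1 and judgei[i][2]>=judgei[i-1][3]-1):
                # (judgei[i][1]<=18 was meant to keep the middle-right part of a 9 whose stem slants inward from being detected as a bottom)
                # The top-style test is not used here, so that a 0 with a very flat lower part is still recognized; the second condition handles oddly shaped 9s
            ddd=29
            p = 0
            if len(lista) == 1:  # bottom condition met and exactly one top so far
                ddd = jjj = lista[0][1]
            if len(lista) == 3 and lista[2][0] == 1:
                ddd = jjj = lista[2][1]
            while jjj < i:  # stricter bottom check; this separates 2, 3, 5 directly
                jjj += 1
                if judgei[jjj][3] == 0:  # count the single-segment rows between the top and the bottom
                    p += 1
            if p <= 4 and (jjj - ddd) >= 5:  # at most 4 continuous rows and at least 5 rows below the top: a bottom (a 0 with an opening is not affected, because its top is taken at the row where the opening starts; an asymmetric opening leaves a few continuous rows)
                endi = 2
                lista.append([endi, jjj])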

    Return the hole features of the digit being classified (continued from above)

    if len(lista) >= 4:
        return [2, 8, lista, judgei]
    elif len(lista) == 3:
        return [9, 0, lista, judgei]
    elif len(lista) == 2 and lista[1][0] != 0:  # two entries, with a bottom (guards against odd 3s)
        return [1, 0, lista, judgei]
    elif len(lista) == 1 or (len(lista) == 2 and lista[1][0] == 0):  # only one entry, or two without a bottom
        return [3, 0, lista, judgei]  # 2, 3, 5, 7
    elif len(lista) == 0:  # no top and no bottom
        return [4, 0, lista, judgei]  # 1, 2, 3, 5, 7

 Final decision

def answer(ar,t35,t49,t56,t3469,t24689,t2356,listar):  # ar: the 28*28 matrix; t*: training sets as [data, labels]; listar: the same sample flattened to 1*784 for kNN
    listt = holeJudge(ar)  # hole features
    if listt[0]==2:
        return 8
    if listt[0]==1:  # one hole; classify() is the kNN function
        if listt[2][0][1]<=6 and listt[2][1][1]>=23:
            return 0
        elif 7<=listt[2][0][1] <=22 and listt[2][1][1] >= 23:  # tell a 6 from a 5 that appears to have a hole
            print("knn56")
            return classify(listar, t56[0], t56[1], 7)
        elif 7<=listt[2][1][1] <=22 and listt[2][0][1] <= 6:  # 4 or 9
            print("knn49")
            return classify(listar,t49[0],t49[1],7)
        else:  # 1 and 7 cannot have a hole
            return classify(listar,t24689[0],t24689[1],7)
    if listt[0]==9:  # two tops and one bottom (the kNN set could be reduced further: 1, 7)
        return classify(listar,t3469[0],t3469[1],7)
    if listt[0]==3 :  # top only: odd 2, 3, 5, 6 (a 6 with a hook at the top), or 7
        # kNN for odd 2, 3, 5, 6, 7
        xy = []
        abc=[]
        for i in range(28):  # record the x position per row; separates 7 from 2, 3, 4, 5, 6
            if listt[3][i][3] != 0:
                x = (listt[3][i][4] + listt[3][i][3]) / 2
            else:
                x = (listt[3][i][2] + listt[3][i][1]) / 2
            xy.append(x)
            abc.append(i)
        i = 1
        while i < len(xy):  # remove repeated values
            try:
                if 0 <= xy[i - 1] - xy[i] <= 0.5 or 0 <= xy[i] - xy[i - 1] <= 0.5:  # neighbours within 0.5 count as equal and are replaced by their average
                    xy[i] = (xy[i - 1] + xy[i]) / 2
                    del xy[i - 1]
                    del abc[i - 1]
                    i = i - 1
            except:
                break
            i = i + 1
        kkk = []  # turning points
        i = 1
        emmmm=0
        while i < len(xy) - 1:  # record turning points
            if (xy[i - 1] - xy[i] <= -0.75 and xy[i] - xy[i + 1] >= 0.75) or (
                    xy[i - 1] - xy[i] >= 0.75 and xy[i] - xy[i + 1] <= -0.75):
                kkk.append(abc[i])
                if xy[i]-xy[i+1]>0 and i>=23:
                    emmmm=1
            if i == len(xy) - 1 and (xy[-2] - xy[-1] <= -10 or xy[-2] - xy[-1] >= 10):
                # special case for the bottom edge; also guards against a 2 whose x only starts growing in the last row
                # (as written this branch is never reached, because the loop stops at i < len(xy) - 1)
                kkk.append(abc[i])
                if xy[i]-xy[i+1]>0 and i>=23:
                    emmmm=1
            i += 1
        if len(kkk)==0 or (len(kkk)==1 and kkk[0]<=7) or (len(kkk)==2 and kkk[1]-kkk[0]<=2 and kkk[1]<=7):
            return 7
        else:
            print("knn2356")
            return classify(listar,t2356[0],t2356[1],3)
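    if listt[0]==4:  # no top and no bottom: 1, 2, 3, 5, 7
        xy=[]
        width = 0
        for i in range(28):  # record the x position per row
            if listt[3][i][3]!=0:
                x = (listt[3][i][4] + listt[3][i][3]) / 2
            else:
                x = (listt[3][i][2] + listt[3][i][1]) / 2
            xy.append(x)
            width+=(listt[3][i][2]-listt[3][i][1]+1)  # total width of the first segment over all rows (a very thick stroke is treated as 1 below)
        i=1
        while i<len(xy):  # remove repeated values
            try:
                if 0<= xy[i - 1] - xy[i] <= 0.5 or 0<=xy[i ] - xy[i-1] <=0.5:  # neighbours within 0.5 count as equal and are replaced by their average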
                    xy[i] = (xy[i - 1] + xy[i]) / 2
                    del xy[i - 1]
                    i = i - 1
            except:
                break
            i=i+1
        kkk = []
        i = 1
        while i < len(xy) - 1:  # record turning points
            if (xy[i - 1] - xy[i] <= 0 and xy[i] - xy[i + 1] >= 0) or (
                    xy[i - 1] - xy[i] >= 0 and xy[i] - xy[i + 1] <= 0):
                kkk.append(i)
            if i == len(xy) - 1 and (xy[-2] - xy[-1] <= -10 or xy[-2] - xy[-1] >= 10):
                # special case for the bottom edge; also guards against a 2 whose x only starts growing in the last row
                # (as written this branch is never reached, because the loop stops at i < len(xy) - 1)
                kkk.append(i)
            i += 1
        if width>=280:
            return 1
        else:  # turning-point method
            if len(kkk) == 0:
                return 1
            elif len(kkk) == 1 and kkk[0] <= 8:
                return 7
            elif len(kkk) == 2 or (len(kkk) == 3 and kkk[1] - kkk[0] <= 2) or (len(kkk) == 1 and kkk[0] >= 15):
                return 2
            elif len(kkk) >= 3:  # 3 or 5 with no top and no bottom
                print("knn35")
                return classify(listar, t35[0], t35[1], 9)
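
The `classify()` calls above are plain k-nearest-neighbour votes over one of the small training sets. As a rough illustration of what such a vote does, here is a pure-Python sketch; the name `knn_classify` is hypothetical, and the post's own `classify()` in the full listing uses NumPy instead.

```python
import math
from collections import Counter


def knn_classify(sample, dataset, labels, k):
    """Return the majority label among the k training rows nearest to sample."""
    distances = [
        (math.dist(sample, row), label)      # Euclidean distance to each row
        for row, label in zip(dataset, labels)
    ]
    distances.sort(key=lambda pair: pair[0])  # nearest first
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]
```

Because each call only has to separate two to five digits that already share the same hole and turning-point features, a small training set and a small `k` are enough.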

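The complete code follows

Note: the training data are digits I wrote by hand, 50 of each digit in every set, binarized and stored in txt files for direct use (for example, t35 contains 50 3s and 50 5s). There are 6 training sets in total (t35, t49, t56, t3469, t24689, t2356).

       Each set has its own characteristics

       t35: 3s and 5s with no top and no bottom

       t49: no special requirement

       t56: 5s whose lower half looks very much like a hole

       t3469: 3s that look a bit like 8, 4s whose middle bar bulges out, 6s that look printed, 9s that look printed (two tops and one bottom)

       t24689: no special requirement (the tops and bottoms are in different positions from the usual case)

       t2356: 2s with only one top (no bottom), 3s with only one top (no bottom), 5s with only one top (no bottom), 6s with only one top (no bottom) that look printed but have a very small hole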
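
Since every set holds 50 samples per digit in a fixed order, the label lists can be generated from the digit order alone. The sketch below is a compact alternative to the `lableTxt()` function in the listing; `build_labels` and `LABEL_SETS` are hypothetical names, and the digit order per set is taken from that function.

```python
def build_labels(digits, per_digit=50):
    """Labels for a training file that stores per_digit samples of each digit in order."""
    labels = []
    for d in digits:
        labels.extend([d] * per_digit)
    return labels


# Digit order per training set, as used by lableTxt() in the listing.
LABEL_SETS = {
    "t35": build_labels([3, 5]),
    "t49": build_labels([4, 9]),
    "t56": build_labels([5, 6]),
    "t3469": build_labels([3, 4, 6, 9]),
    "t24689": build_labels([2, 4, 6, 8, 9]),
    "t2356": build_labels([2, 3, 5, 6]),
}
```

The order of the samples inside each txt file has to match this digit order, otherwise the kNN vote will return the wrong labels.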

from skimage import io, transform, filters, restoration
import numpy as nu
####################################################
import os
import operator as op
def lableTxt():  # build the label lists for the six training sets
    l35=[]
    l49=[]
    l56=[]
    l3469=[]
    l24689=[]
    l2356=[]
    for i in range(50):
        l35.append(3)
        l49.append(4)
        l56.append(5)
        l3469.append(3)
        l24689.append(2)
        l2356.append(2)
    for i in range(50):
        l35.append(5)
        l49.append(9)
        l56.append(6)
        l3469.append(4)
        l24689.append(4)
        l2356.append(3)
    for i in range(50):
        l3469.append(6)
        l24689.append(6)
        l2356.append(5)
    for i in range(50):
        l3469.append(9)
        l24689.append(8)
        l2356.append(6)
    for i in range(50):
        l24689.append(9)
    return l35,l49,l56,l3469,l24689,l2356
def readTxt(txtname):  # read the training matrix from a txt file (one 28*28 sample per line)
    ar=[]
    for line in open(txtname):
        a=list(line)
        a.pop()
        d=[]
        for b in range(28*28):
            c=int(a[b])
            d.append(c)
        ar.append(d)
    return nu.array(ar)
def arrayChange(ar):  # reshape a flat list of 784 values into a 28*28 matrix
    onear=[]
    for j in range(28):
        onear.append(ar[28*j:28*(j+1)])
    return nu.array(onear)
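def chujianxi(ar):  # fill very small gaps within a row (makes hole detection more reliable)
    for i in range(28):  # the image has 28 rows
        startj1 = -1
        startj2 = -1
        for j in range(28):  # look for the break positions in this row
            if ar[i][j] == 1 and startj1 == -1:  # record the first segment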
                startj1=j
                while ar[i][j] ==1 and j<27:
                    j +=1
                startj2 = j-1
            if startj2 != -1 and ar[i][j]==1 and ar[i][j-1]==0:  # record the second segment
                endj1 = j
                while ar[i][j] ==1 and j<27:
                    j +=1
                endj2 = j
                if endj1-startj2<=4:
                    for a in range(1,endj1-startj2):
                        ar[i][startj2+a]=1
                    startj2=endj2-1
    return ar
def holeJudge(ar):  # detect holes
    ar=chujianxi(ar)
    judgei=[]
    lista=[]
    for i in range(28):  # the image has 28 rows
        startj1 = -1
        startj2 = -1
        endj1=0
        endj2 = 0
        for j in range(28):  # look for the break positions in this row
            if ar[i][j] == 1 and startj1 == -1:  # record the first segment
                startj1=j
                while ar[i][j] ==1 and j<27:
                    j +=1
                startj2 = j-1
            if startj2 != -1 and ar[i][j]==1 and ar[i][j-1]==0:  # record the second segment
                if j<27 and startj2>=26:
                    endj1 = 0
                    endj2 = 0
                else:
                    endj1 = j
                    while ar[i][j] == 1 and j < 27:
                        j += 1
                    endj2 = j
                    break
        if endj1-startj2<=6 or endj1==0:  # continuous (1) if there is no second segment or the gap is small, otherwise broken (0)
            judgei.append([1,startj1,startj2,endj1,endj2])
        else:
            judgei.append([0,startj1,startj2,endj1,endj2])
    jjj = 0
   # print(judgei)
    for i in range(28):
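        if (judgei[i-1][0]==1 and judgei[i][0]==0 ):  # previous row continuous (possibly with a small gap), this row broken: possible top
            if len(lista)==0:                       # no top recorded yet: accept it (the first top has looser requirements)
                starti = 1
                lista.append([starti, i])
            if len(lista) == 2 and lista[1][0] == 2: # one hole already found (one top and one bottom)
                p = 0
                jj = lista[1][1]                        # row of the first bottom
                while jj < i:                           # stricter check for a second top
                    jj += 1
                    if judgei[jj][3] == 0: # count the single-segment rows between the bottom and this row; too many means it is not a top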
                        p += 1
                if p <= 5:
                    starti = 1
                    lista.append([starti, i])

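        if (judgei[i-1][3]-judgei[i-1][2]>1 and judgei[i][3]==0 and judgei[i][2]-judgei[i][1]>=judgei[i-1][3]-judgei[i-1][2]-4 and len(lista)!=0) or \
                (len(lista)!=0 and judgei[i-1][3]-judgei[i-1][2]>1 and judgei[i][3]!=0 and judgei[i][1]<=judgei[i-1][2]+1 and judgei[i][2]>=judgei[i-1][3]-1):
                # (judgei[i][1]<=18 was meant to keep the middle-right part of a 9 whose stem slants inward from being detected as a bottom)
                # The top-style test is not used here, so that a 0 with a very flat lower part is still recognized; the second condition handles oddly shaped 9s
            ddd=29
            p = 0
            if len(lista) == 1:  # bottom condition met and exactly one top so far
                ddd = jjj = lista[0][1]
            if len(lista) == 3 and lista[2][0] == 1:
                ddd = jjj = lista[2][1]
            while jjj < i:  # stricter bottom check; this separates 2, 3, 5 directly
                jjj += 1
                if judgei[jjj][3] == 0:  # count the single-segment rows between the top and the bottom
                    p += 1
            #print(jjj,ddd)
            if p <= 4 and (jjj - ddd) >= 5:  # at most 4 continuous rows and at least 5 rows below the top: a bottom (a 0 with an opening is not affected, because its top is taken at the opening)
                endi = 2
                lista.append([endi, jjj])
    #print(lista)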
    if len(lista) >= 4:
        return [2, 8, lista, judgei]
    elif len(lista) == 3:
        return [9, 0, lista, judgei]
    elif len(lista) == 2 and lista[1][0] != 0:  # two entries, with a bottom (guards against odd 3s)
        return [1, 0, lista, judgei]
    elif len(lista) == 1 or (len(lista) == 2 and lista[1][0] == 0):  # only one entry, or two without a bottom
        return [3, 0, lista, judgei]  # 2, 3, 5, 7
    elif len(lista) == 0:  # no top and no bottom
        return [4, 0, lista, judgei]  # 1, 2, 3, 5, 7
def answer(ar,t35,t49,t56,t3469,t24689,t2356,listar):
    listt = holeJudge(ar)
    if listt[0]==2:
        return 8
    if listt[0]==1:  # one hole
        if listt[2][0][1]<=6 and listt[2][1][1]>=23:
            return 0
        elif 7<=listt[2][0][1] <=22 and listt[2][1][1] >= 23:  # tell a 6 from a 5 that appears to have a hole
            print("knn56")
            return classify(listar, t56[0], t56[1], 7)
        elif 7<=listt[2][1][1] <=22 and listt[2][0][1] <= 6:  # 4 or 9
            print("knn49")
            return classify(listar,t49[0],t49[1],7)
        else:  # 1 and 7 cannot have a hole
            # kNN on the t24689 set
            return classify(listar,t24689[0],t24689[1],7)
    if listt[0]==9:  # two tops and one bottom (the kNN set could be reduced further: 1, 7)
        print("knn3469")
        return classify(listar,t3469[0],t3469[1],7)
    if listt[0]==3 :  # top only: odd 2, 3, 5, 6 (a 6 with a hook at the top), or 7
        # kNN for odd 2, 3, 5, 6, 7
        xy = []
        abc=[]
        for i in range(28):  # record the x position per row; separates 7 from 2, 3, 4, 5, 6
            if listt[3][i][3] != 0:
                x = (listt[3][i][4] + listt[3][i][3]) / 2
            else:
                x = (listt[3][i][2] + listt[3][i][1]) / 2
            xy.append(x)
            abc.append(i)
        i = 1
        while i < len(xy):  # remove repeated values
            try:
                if 0 <= xy[i - 1] - xy[i] <= 0.5 or 0 <= xy[i] - xy[i - 1] <= 0.5:  # neighbours within 0.5 count as equal and are replaced by their average
                    xy[i] = (xy[i - 1] + xy[i]) / 2
                    del xy[i - 1]
                    del abc[i - 1]
                    i = i - 1
            except:
                break
            i = i + 1
        kkk = []  # turning points
        i = 1
        emmmm=0
        while i < len(xy) - 1:  # record turning points
            if (xy[i - 1] - xy[i] <= -0.75 and xy[i] - xy[i + 1] >= 0.75) or (
                    xy[i - 1] - xy[i] >= 0.75 and xy[i] - xy[i + 1] <= -0.75):
                kkk.append(abc[i])
                if xy[i]-xy[i+1]>0 and i>=23:
                    emmmm=1
            if i == len(xy) - 1 and (xy[-2] - xy[-1] <= -10 or xy[-2] - xy[-1] >= 10):
                # special case for the bottom edge; also guards against a 2 whose x only starts growing in the last row
                # (as written this branch is never reached, because the loop stops at i < len(xy) - 1)
                kkk.append(abc[i])
                if xy[i]-xy[i+1]>0 and i>=23:
                    emmmm=1
            i += 1
        if len(kkk)==0 or (len(kkk)==1 and kkk[0]<=7) or (len(kkk)==2 and kkk[1]-kkk[0]<=2 and kkk[1]<=7):
            return 7
        else:
            print("knn2356")
            return classify(listar,t2356[0],t2356[1],3)
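    if listt[0]==4:  # no top and no bottom: 1, 2, 3, 5, 7
        xy=[]
        width = 0
        for i in range(28):  # record the x position per row
            if listt[3][i][3]!=0:
                x = (listt[3][i][4] + listt[3][i][3]) / 2
            else:
                x = (listt[3][i][2] + listt[3][i][1]) / 2
            xy.append(x)
            width+=(listt[3][i][2]-listt[3][i][1]+1)  # total width of the first segment over all rows (a very thick stroke is treated as 1 below)
        i=1
        while i<len(xy):  # remove repeated values
            try:
                if 0<= xy[i - 1] - xy[i] <= 0.5 or 0<=xy[i ] - xy[i-1] <=0.5:  # neighbours within 0.5 count as equal and are replaced by their average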
                    xy[i] = (xy[i - 1] + xy[i]) / 2
                    del xy[i - 1]
                    i = i - 1
            except:
                break
            i=i+1
        kkk = []
        i = 1
        while i < len(xy) - 1:  # record turning points
            if (xy[i - 1] - xy[i] <= 0 and xy[i] - xy[i + 1] >= 0) or (
                    xy[i - 1] - xy[i] >= 0 and xy[i] - xy[i + 1] <= 0):
                kkk.append(i)
            if i == len(xy) - 1 and (xy[-2] - xy[-1] <= -10 or xy[-2] - xy[-1] >= 10):
                # special case for the bottom edge; also guards against a 2 whose x only starts growing in the last row
                # (as written this branch is never reached, because the loop stops at i < len(xy) - 1)
                kkk.append(i)
            i += 1
        if width>=280:
            return 1
        else:  # turning-point method
            if len(kkk) == 0:
                return 1
            elif len(kkk) == 1 and kkk[0] <= 8:
                return 7
            elif len(kkk) == 2 or (len(kkk) == 3 and kkk[1] - kkk[0] <= 2) or (len(kkk) == 1 and kkk[0] >= 15):
                return 2
            elif len(kkk) >= 3:  # 3 or 5 with no top and no bottom
                print("knn35")
                return classify(listar, t35[0], t35[1], 9)
def classify(inX,dataset,labels,k):  # inX: sample to classify, dataset: training data, labels: training labels, k: number of neighbours
    datasetSize = dataset.shape[0]  # number of training samples
    diffMat = nu.tile(inX,(datasetSize,1))-dataset  # tile repeats inX once per training sample
    sqDiffMat = diffMat**2
    sqDistances = sqDiffMat.sum(axis=1)  # sum along each row (axis=1)
    distances = sqDistances ** 0.5
    ### Euclidean distance computed above
    toShortDisdance = distances.argsort()  # indices that sort the distances in ascending order
    classCount = {}  # vote count per label
    for i in range(k):  # the k nearest samples
        voteIlabel = labels[toShortDisdance[i]]
        classCount[voteIlabel] = classCount.get(voteIlabel, 0) + 1  # get() returns 0 when the label has no votes yet
    sortedClassCount = sorted(classCount.items(), key=op.itemgetter(1), reverse=True)  # sort by votes; sortedClassCount[0][0] is the label with the most votes among the k
    return sortedClassCount[0][0]
###################################################
def txttrain():
    t35 = readTxt(r"D:\python\python程序\trainingData\35.txt")
    t49 = readTxt(r"D:\python\python程序\trainingData\49.txt")
    t56 = readTxt(r"D:\python\python程序\trainingData\56.txt")
    t3469 = readTxt(r"D:\python\python程序\trainingData\3469.txt")
    t24689 = readTxt(r"D:\python\python程序\trainingData\24689.txt")
    t2356 = readTxt(r"D:\python\python程序\trainingData\trainingDataNew\new2356.txt")
    l35, l49, l56, l3469, l24689, l2356 = lableTxt()
    t35 = [t35, l35]
    t49 = [t49, l49]
    t56 = [t56, l56]
    t3469 = [t3469, l3469]
    t24689 = [t24689, l24689]
    t2356 = [t2356, l2356]
    return t35, t49, t56, t3469, t24689, t2356
#####################################################
def incise(im,t35,t49,t56,t3469,t24689,t2356):  # split an image that contains several digits
    a = []  # left boundaries
    b = []  # right boundaries
    if any(im[:, 0] == 1):
        a.append(0)
    for i in range(len(im[0]) - 1):  # find the column cut positions
        if all(im[:, i] == 0) and any(im[:, i + 1] == 1):
            a.append(i + 1)
        elif any(im[:, i] == 1) and all(im[:, i + 1] == 0):
            b.append(i)

    if any(im[:, len(im[0]) - 1] == 1):
        b.append(len(im[0]) - 2)  # (why -2?)
    names = {}
    flag = 0

    for i in range(len(a)):
        c = []  # top boundaries
        d = []  # bottom boundaries
        names['na%s' % i] = im[:, a[i]:b[i] + 1]  # take the non-empty columns and store them in names
        if any(names['na%s' % i][0, :] == 1):  # check the first row
            c.append(0)
        if any(names['na%s' % i][len(names['na%s' % i]) - 1, :] == 1):  # check the last row
            d.append(len(names['na%s' % i]) - 2)
        for j in range(len(names['na%s' % i])-1):
            if all(names['na%s' % i][j, :] == 0) and any(names['na%s' % i][j + 1, :] == 1):
                c.append(j + 1)
            elif any(names['na%s' % i][j, :] == 1) and all(names['na%s' % i][j + 1, :] == 0):
                d.append(j)
#*********************************************************
        for x in range(len(c)):
            z = im[c[x]:d[x]+1,a[i]:b[i]+1]
            print(z[0])
            z=transform.resize(z,(28,28))
            num1=z
            color=filters.threshold_otsu(num1)
            print("B",color)
            print(num1[0])
            num1[num1>=0.5]=1
            num1[num1!=1] = 0
            for g in range(len(num1)):
                nuu=[]
                for h in range(len(num1[0])):
                    if num1[g][h]>=color:
                        #num1[g][h] = 1
                        nuu.append(1)
                    else:
                        #num1[g][h] = 0
                        nuu.append(0)
                print(nu.array(nuu))
            num2 = num1.reshape((1, 28 * 28))
            print('(' + (str)(flag + 1) + ') ' +(str)(answer(num1,t35,t49,t56,t3469,t24689,t2356,num2))+'  ')
            flag=flag+1
def GetTrainPicture(files):  # main function

    a=str(files)
    files=os.listdir(files)
    for i in range(len(files)):
        os.chdir(a)
        pic = io.imread(files[i], as_gray=True)  # newer scikit-image releases spell this as_gray (formerly as_grey)

        pic = transform.resize(pic, (700,700))

        color = filters.threshold_otsu(pic)  # Otsu threshold separating the background from the digits
        pic = restoration.denoise_tv_chambolle(pic)  # denoise
        print("A",color)
        num = True  # light background
        if pic[0][0]<color and pic[0][len(pic[0])-1]<color and pic[len(pic)-1][0]<color and pic[len(pic)-1][len(pic[0])-1]<color:
            num=False  # dark background (all four corners are below the threshold)
        
        if num==True:
            pic[pic>=color]=0
            pic[pic!=0]=1
        else:
            pic[pic>=color]=1
            pic[pic!=1]=0        
        #pic = interference_point(pic)  # denoise
        #print("fdsfg" ,pic[pic!=1])
        t35, t49, t56, t3469, t24689, t2356 = txttrain()
        incise(pic, t35,t49,t56,t3469,t24689,t2356)  # split into digits

Here is the drawing-board code as well

import pygame
import os
import math
import besthua


class Brush():  # brush

    def __init__(self, screen, color, size):  # constructor
        self.screen = screen
        self.color = color
        self.size = size
        self.drawing = False
        self.last_pos = None

    def start_draw(self, pos):  # start a stroke
        self.drawing = True
        self.last_pos = pos

    def end_draw(self):  # end the stroke
        self.drawing = False

    def _get_points(self, pos):  # all points between the previous position and the new position pos
        points = [(self.last_pos[0], self.last_pos[1])]  # start from the previous position
        len_x = pos[0] - self.last_pos[0]  # x distance between the two positions
        len_y = pos[1] - self.last_pos[1]  # y distance between the two positions
        length = math.sqrt(len_x ** 2 + len_y ** 2)  # straight-line distance
        if length == 0:  # no movement: avoid dividing by zero
            return points
        step_x = len_x / length
        step_y = len_y / length
        for i in range(int(length)):  # walk toward pos in unit-length steps (linear interpolation)
            points.append((points[-1][0] + step_x, points[-1][1] + step_y))
        points = map(lambda x: (int(0.5 + x[0]), int(0.5 + x[1])), points)
        # map applies the lambda (round to the nearest pixel) to every point and returns an iterator
        return list(set(points))  # set() removes duplicate points

    def draw(self, pos):  # draw the stroke up to pos
        if self.drawing:
            for p in self._get_points(pos):
                    pygame.draw.circle(self.screen,self.color, p, self.size)
            self.last_pos = pos
class Painter():

    def __init__(self, color, size):
        self.screen = pygame.display.set_mode((1200, 700))  # create a window of the given size
        pygame.display.set_caption("Painter")  # window title
        self.clock = pygame.time.Clock()  # Clock object used to limit the frame rate
        self.brush = Brush(self.screen, color, size)


    def run(self):
        self.brush.screen.fill((255, 255, 255))  # fill the window with white

        while True:
            self.clock.tick(30)  # limit the loop to 30 frames per second
            for event in pygame.event.get():  # fetch the pending input events
                if event.type == pygame.QUIT:  # window closed
                    return 0
                elif event.type == pygame.KEYDOWN:  # a key was pressed
                    if event.key ==  pygame.K_SPACE:  # space clears the canvas
                        self.brush.screen.fill((255, 255, 255))
                    if event.key == pygame.K_UP:  # up arrow returns the canvas for recognition
                        return self.brush .screen
                elif event.type == pygame.MOUSEBUTTONDOWN:  # mouse button pressed
                    self.brush.start_draw(event.pos)
                elif event.type == pygame.MOUSEMOTION:  # mouse moved
                    self.brush.draw(event.pos)
                    self.brush.last_pos = event.pos
                elif event.type == pygame.MOUSEBUTTONUP:  # mouse button released
                    self.brush.end_draw()
            pygame.display.update()  # refresh the display

if __name__ == '__main__':
    def one():  # draw one image and save it as A.jpg
        color=(0,255,0)
        size=10
        app = Painter(color, size)
        picture = app.run()
        if picture == 0:
            pass
        else:
            os.chdir(r"D:\python\python程序\testData\PPM")
            pygame.image.save(picture, "A.jpg")
    
    one()
    files = r"D:\python\python程序\testData\ppm"
    besthua.GetTrainPicture(files)
    #many()
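
The `_get_points` method in the Brush class is linear interpolation: it divides the displacement between two mouse positions by its length to get a unit step, then walks from the old position to the new one, rounding each intermediate point to the nearest pixel. Without it, a fast mouse movement would leave separate dots instead of a line. A standalone sketch of the same idea follows; the name `interpolate_points` is hypothetical, and unlike the original it computes each point directly from the start and keeps the points in drawing order.

```python
import math


def interpolate_points(p0, p1):
    """Integer pixel positions on the straight segment from p0 to p1."""
    dx = p1[0] - p0[0]
    dy = p1[1] - p0[1]
    length = math.hypot(dx, dy)          # straight-line distance
    if length == 0:                      # the mouse did not move
        return [p0]
    points = []
    for t in range(int(length) + 1):     # one sample per unit of distance
        x = p0[0] + dx * t / length
        y = p0[1] + dy * t / length
        points.append((int(0.5 + x), int(0.5 + y)))   # round to nearest pixel
    return list(dict.fromkeys(points))   # drop duplicates, keep order
```

Drawing a filled circle of the brush radius at every returned point then produces a continuous stroke.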

 
