Handwriting recognition with the Tencent Cloud API

掉了牙的大黄狗

Last modified 2022-06-05 18:06:59


First published 2022-06-05 18:06:11

Article link: https://blog.youkuaiyun.com/qq_38120778/article/details/125134148


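This post shows how to obtain and use the OCR API on Tencent Cloud: applying for an API key, installing Python and the required packages, and setting up Jupyter. It also gives a Python script that recognizes the text in images and saves the results as TXT files. The walkthrough is aimed at beginners.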


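1. Apply for an API key
Tencent Cloud currently provides 1,000 image-recognition API calls per month.
Enable the OCR API at https://console.cloud.tencent.com/ocr/overview. If you cannot find it, look for General OCR (通用文字识别) under Cloud Products.
Get the API key (SecretId and SecretKey) at https://console.cloud.tencent.com/cam/capi.

(screenshot of the API key page)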
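
The script later in this post passes the key pair to credential.Credential as string literals. As an optional alternative, here is a minimal sketch that reads the pair from environment variables so the secrets stay out of the notebook. The variable names TENCENTCLOUD_SECRET_ID and TENCENTCLOUD_SECRET_KEY and the helper load_credentials are assumptions made for this example, not something the original post uses.

```python
import os

# Assumed variable names; pick whatever fits your own setup.
ID_VAR = "TENCENTCLOUD_SECRET_ID"
KEY_VAR = "TENCENTCLOUD_SECRET_KEY"


def load_credentials(env=None):
    """Return (secret_id, secret_key) read from environment variables."""
    env = os.environ if env is None else env
    # Collect every variable that is unset or empty so the error names them all
    missing = [name for name in (ID_VAR, KEY_VAR) if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variable(s): " + ", ".join(missing))
    return env[ID_VAR], env[KEY_VAR]
```

With both variables set, the two returned values can be passed to credential.Credential in place of the "SecretId" and "SecretKey" literals.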
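2. Install Python
Download the installer from https://www.python.org/downloads/. During installation, tick the option that adds Python to the environment variables (PATH), then keep pressing Enter to accept the defaults.
(screenshot of the installer)
On Windows 10, search the Start menu for "Manage app execution aliases" and turn off the two "App Installer" switches (screenshot omitted).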
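
To confirm that the installation and the alias change took effect, a small check like the following can help. It is only a sketch: the helper name python_report and the minimum version (3, 6) are assumptions for this example.

```python
import shutil
import sys


def python_report(min_version=(3, 6)):
    """Describe the running interpreter and what `python` resolves to on PATH."""
    return {
        "version": ".".join(str(part) for part in sys.version_info[:3]),
        "version_ok": sys.version_info[:2] >= min_version,
        "executable": sys.executable,
        # None means no `python` command was found on PATH
        "python_on_path": shutil.which("python"),
    }


if __name__ == "__main__":
    for key, value in python_report().items():
        print(f"{key}: {value}")
```

If python_on_path points into a WindowsApps folder, the app execution alias is probably still switched on.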

3. Install the required packages
Press the Win key, open cmd, and run the following commands:

python -m pip install --upgrade pip
pip install jupyter -i http://pypi.douban.com/simple/ --trusted-host pypi.douban.com
pip install jupyterlab -i http://pypi.douban.com/simple/ --trusted-host pypi.douban.com
pip install tencentcloud-sdk-python -i http://pypi.douban.com/simple/ --trusted-host pypi.douban.com
jupyter lab --ip='*' --port=8888 --no-browser --allow-root

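Then open 127.0.0.1:8888 in a browser and enter the token. The token is printed in the command-line output of jupyter lab:
(screenshot of the token in the console output)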
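
Before moving on, it may be worth checking that the packages are visible to the interpreter. The sketch below uses only the standard library; the helper names are made up for this example, and the import names jupyterlab and tencentcloud are the ones assumed for the pip packages installed above.

```python
import importlib.util


def is_installed(module_name):
    """True if a top-level module with this name can be found (without importing it)."""
    return importlib.util.find_spec(module_name) is not None


def missing_modules(module_names):
    """Return the names from module_names that cannot be found."""
    return [name for name in module_names if not is_installed(name)]


if __name__ == "__main__":
    missing = missing_modules(["jupyterlab", "tencentcloud"])
    print("missing:", ", ".join(missing) if missing else "none")
```

If a name is listed as missing, rerun the corresponding pip install command.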

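4. Directory structure
(screenshot of the directory layout)
The from directory holds the original images.
The to directory receives the output txt files.
The ipython notebook is the main program.

Create a new ipython notebook and paste in the code below.
Note: replace "SecretId" and "SecretKey" in the line cred = credential.Credential("SecretId", "SecretKey") with your own values: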
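
The two folders can be created by hand. Alternatively, here is a minimal sketch that creates them from Python; the helper name ensure_layout is hypothetical, and the folder names are the ones listed above.

```python
from pathlib import Path


def ensure_layout(base="."):
    """Create the from/ and to/ folders under base if they do not exist yet."""
    base = Path(base)
    folders = []
    for name in ("from", "to"):
        folder = base / name
        # exist_ok=True makes the call safe to repeat
        folder.mkdir(parents=True, exist_ok=True)
        folders.append(folder)
    return folders
```

Calling ensure_layout() in the first notebook cell creates both folders next to the notebook; existing folders and their contents are left alone.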

import base64
import os
import json
from tencentcloud.common import credential
from tencentcloud.common.profile.client_profile import ClientProfile
from tencentcloud.common.profile.http_profile import HttpProfile
from tencentcloud.common.exception.tencent_cloud_sdk_exception import TencentCloudSDKException
from tencentcloud.ocr.v20181119 import ocr_client, models

def translate(image_base64):
    """Send one base64-encoded image to EnglishOCR.

    Returns the response as a dict, or None if the SDK raises an error.
    """
    try:
        # Replace with your own SecretId and SecretKey
        cred = credential.Credential("SecretId", "SecretKey")
        httpProfile = HttpProfile()
        httpProfile.endpoint = "ocr.tencentcloudapi.com"

        clientProfile = ClientProfile()
        clientProfile.httpProfile = httpProfile
        client = ocr_client.OcrClient(cred, "ap-shanghai", clientProfile)

        # Build the request from a plain dict of parameters
        req = models.EnglishOCRRequest()
        params = {
            "ImageBase64": image_base64,
            "Preprocess": True
        }
        req.from_json_string(json.dumps(params))

        resp = client.EnglishOCR(req)
        return json.loads(resp.to_json_string())

    except TencentCloudSDKException as err:
        print(err)
        return None


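image_dir = 'from'  # folder with the original images
txt_dir = 'to'      # folder for the output txt files
os.makedirs(txt_dir, exist_ok=True)

for image in os.listdir(image_dir):
    image_path = os.path.join(image_dir, image)
    # Skip sub-directories (test the full path, not the bare file name)
    if os.path.isdir(image_path):
        continue
    # splitext also copes with file names that contain more than one dot
    name, ext = os.path.splitext(image)
    type1 = ext[1:].lower()
    with open(image_path, 'rb') as f:
        imagefile = f.read()
    image_base64 = "data:image/" + type1 + ";base64," + base64.b64encode(imagefile).decode('utf-8')
    dict1 = translate(image_base64)
    if not dict1:
        # The SDK error, if any, has already been printed; go on to the next image
        continue
    # Write one detected text line per output line
    str1 = ''
    for i in dict1['TextDetections']:
        str1 = str1 + i['DetectedText'] + '\n'
    txt_path = os.path.join(txt_dir, name + ".txt")
    with open(txt_path, "w", encoding='utf-8') as f:
        f.write(str1)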
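
The text-joining step can be tried without spending any API calls. The sketch below is an offline illustration only: the helpers is_image and detections_to_text, the extension list, and the sample dictionary are all made up for this example. The sample only mimics the two keys the script reads (TextDetections and DetectedText) and is not real API output.

```python
import os

# Assumed set of extensions to accept; adjust it to the formats you actually use.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp"}


def is_image(filename):
    """True if the file name ends in one of the accepted image extensions."""
    return os.path.splitext(filename)[1].lower() in IMAGE_EXTENSIONS


def detections_to_text(response):
    """Join the DetectedText of every entry in TextDetections, one per line."""
    if not response:
        return ""
    lines = [item["DetectedText"] for item in response.get("TextDetections", [])]
    return "".join(line + "\n" for line in lines)


# Made-up sample shaped like the dict the script reads; not real API output.
sample = {"TextDetections": [{"DetectedText": "hello"}, {"DetectedText": "world"}]}
print(detections_to_text(sample), end="")
```

A filter such as is_image could be applied to the file names returned by os.listdir so that stray non-image files in the from folder are never sent to the API.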