Versions used:
GPU: A5000
CUDA: 11.8
cudnn=8.9.2 or 8.7
torch=2.1.1+cu118
torchaudio=2.1.2+cu118
torchvision=0.16.2+cu118
numpy=2.0.1
paddleocr=3.1.0
PaddleOCR-release-3.1
paddlepaddle-gpu=3.1.0
paddlex=3.1.3
pandas=2.3.0
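Before training, it is worth confirming that the GPU build of PaddlePaddle is actually in use. A minimal check (not specific to this project; any script that imports paddle will do):

import paddle
import paddleocr

print(paddle.__version__)                     # expect 3.1.0
print(paddleocr.__version__)                  # expect 3.1.0
print(paddle.device.is_compiled_with_cuda())  # True for the GPU build
paddle.utils.run_check()                      # PaddlePaddle's built-in GPU self-test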
Detection config: config/PP-OCRv5/PP-OCRv5_mobile_det.yml
Recognition config: config/PP-OCRv5/PP-OCRv5_mobile_rec.yml
Detection pretrained model: PP-OCRv5_mobile_det_pretrained.pdparams
Recognition pretrained model: PP-OCRv5_mobile_rec_pretrained.pdparams
Two training runs are needed: one for the text detection model (det) and one for the text recognition model (rec); example commands are sketched below.
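For reference, training in this PaddleOCR release is normally launched through tools/train.py; a rough sketch using the configs above (the pretrained-model paths are placeholders for wherever you saved the .pdparams files):

python tools/train.py -c config/PP-OCRv5/PP-OCRv5_mobile_det.yml -o Global.pretrained_model=./pretrain_models/PP-OCRv5_mobile_det_pretrained.pdparams
python tools/train.py -c config/PP-OCRv5/PP-OCRv5_mobile_rec.yml -o Global.pretrained_model=./pretrain_models/PP-OCRv5_mobile_rec_pretrained.pdparams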
Exporting the trained models: the export directory names can be anything you like; here they are infer_det and infer_rec. Each exported model folder contains three files: a .json, a .yml and a .pdiparams. The .yml file hides a big pitfall: Global: model_name has strict naming requirements. model_name must be one of the names listed below (a small check/fix script follows the list), otherwise you get the error: Error: Model name mismatch
[STFPM, PP-DocBee-2B, PP-DocBee-7B, PP-Chart2Table, PP-DocBee2-3B, PP-ShiTuV2_rec,
PP-ShiTuV2_rec_CLIP_vit_base, PP-ShiTuV2_rec_CLIP_vit_large, MobileFaceNet,
ResNet50_face, LaTeX_OCR_rec, UniMERNet, PP-FormulaNet-S, PP-FormulaNet-L,
PP-FormulaNet_plus-S, PP-FormulaNet_plus-M, PP-FormulaNet_plus-L,
CLIP_vit_base_patch16_224, CLIP_vit_large_patch14_224, ConvNeXt_tiny, ConvNeXt_small,
ConvNeXt_base_224, ConvNeXt_base_384, ConvNeXt_large_224, ConvNeXt_large_384,
MobileNetV1_x0_25, MobileNetV1_x0_5, MobileNetV1_x0_75, MobileNetV1_x1_0,
MobileNetV2_x0_25, MobileNetV2_x0_5, MobileNetV2_x1_0, MobileNetV2_x1_5,
MobileNetV2_x2_0, MobileNetV3_large_x0_35, MobileNetV3_large_x0_5,
MobileNetV3_large_x0_75, MobileNetV3_large_x1_0, MobileNetV3_large_x1_25,
MobileNetV3_small_x0_35, MobileNetV3_small_x0_5, MobileNetV3_small_x0_75,
MobileNetV3_small_x1_0, MobileNetV3_small_x1_25, MobileNetV4_conv_small,
MobileNetV4_conv_medium, MobileNetV4_conv_large, MobileNetV4_hybrid_medium,
MobileNetV4_hybrid_large, PP-HGNet_tiny, PP-HGNet_small, PP-HGNet_base,
PP-HGNetV2-B0, PP-HGNetV2-B1, PP-HGNetV2-B2, PP-HGNetV2-B3, PP-HGNetV2-B4,
PP-HGNetV2-B5, PP-HGNetV2-B6, PP-LCNet_x0_25, PP-LCNet_x0_25_textline_ori, PP-LCNet_x0_35,
PP-LCNet_x0_5, PP-LCNet_x0_75, PP-LCNet_x1_0, PP-LCNet_x1_0_doc_ori,
PP-LCNet_x1_0_textline_ori, PP-LCNet_x1_5, PP-LCNet_x2_0, PP-LCNet_x2_5,
PP-LCNetV2_small, PP-LCNetV2_base, PP-LCNetV2_large, ResNet101, ResNet152,
ResNet18, ResNet34, ResNet50, ResNet200_vd, ResNet101_vd, ResNet152_vd,
ResNet18_vd, ResNet34_vd, ResNet50_vd, SwinTransformer_tiny_patch4_window7_224,
SwinTransformer_small_patch4_window7_224, SwinTransformer_base_patch4_window7_224,
SwinTransformer_base_patch4_window12_384, SwinTransformer_large_patch4_window7_224,
SwinTransformer_large_patch4_window12_384, StarNet-S1, StarNet-S2, StarNet-S3,
StarNet-S4, FasterNet-L, FasterNet-M, FasterNet-S, FasterNet-T0, FasterNet-T1,
FasterNet-T2, PP-LCNet_x1_0_table_cls, ResNet50_ML, PP-LCNet_x1_0_ML,
PP-HGNetV2-B0_ML, PP-HGNetV2-B4_ML, PP-HGNetV2-B6_ML, CLIP_vit_bE_plus-S,
PP-YOLOE_plus-X, RT-DETR-H, RT-DETR-L, RT-DETR-R18, RT-DETR-R50, RT-DETR-X,
PicoDet_layout_1x, PicoDet_layout_1x_table, PicoDet-S_layout_3cls,
PicoDet-S_layout_17cls, PicoDet-L_layout_3cls, PicoDet-L_layout_17cls,
RT-DETR-H_layout_3cls, RT-DETR-H_layout_17cls, YOLOv3-DarkNet53, YOLOv3-MobileNetV3,
YOLOv3-ResNet50_vd_DCN, YOLOX-L, YOLOX-M, YOLOX-N, YOLOX-S, YOLOX-T, YOLOX-X,
FasterRCNN-ResNet34-FPN, FasterRCNN-ResNet50, FasterRCNN-ResNet50-FPN,
FasterRCNN-ResNet50-vd-FPN, FasterRCNN-ResNet50-vd-SSLDv2-FPN, FasterRCNN-ResNet101,
FasterRCNN-ResNet101-FPN, FasterRCNN-ResNeXt101-vd-FPN, FasterRCNN-Swin-Tiny-FPN,
Cascade-FasterRCNN-ResNet50-FPN, Cascade-FasterRCNN-ResNet50-vd-SSLDv2-FPN, PicoDet-M,
PicoDet-XS, FCOS-ResNet50, DETR-R50, PP-ShiTuV2_det, PP-YOLOE-L_human,
PP-YOLOE-S_human, PP-YOLOE-L_vehicle, PP-YOLOE-S_vehicle, PP-YOLOE_plus_SOD-L,
PP-YOLOE_plus_SOD-S, PP-YOLOE_plus_SOD-largesize-L, CenterNet-DLA-34,
CenterNet-ResNet50, PicoDet_LCNet_x2_5_face, BlazeFace, BlazeFace-FPN-SSH,
PP-YOLOE_plus-S_face, PP-YOLOE-R-L, Co-Deformable-DETR-R50,
Co-Deformable-DETR-Swin-T, Co-DINO-R50, Co-DINO-Swin-L,
RT-DETR-L_wired_table_cell_det, RT-DETR-L_wireless_table_cell_det,
PP-DocLayout-L, PP-DocLayout-M, PP-DocLayout-S, PP-DocLayout_plus-L,
PP-DocBlockLayout, Mask-RT-DETR-S, Mask-RT-DETR-M, Mask-RT-DETR-X,
Mask-RT-DETR-H, Mask-RT-DETR-L, SOLOv2, MaskRCNN-ResNet50, MaskRCNN-ResNet50-FPN,
MaskRCNN-ResNet50-vd-FPN, MaskRCNN-ResNet101-FPN, MaskRCNN-ResNet101-vd-FPN,
MaskRCNN-ResNeXt101-vd-FPN, MaskRCNN-ResNet50-vd-SSLDv2-FPN,
Cascade-MaskRCNN-ResNet50-FPN, Cascade-MaskRCNN-ResNet50-vd-SSLDv2-FPN,
PP-YOLOE_seg-S, PP-TinyPose_128x96, PP-TinyPose_256x192, BEVFusion,
whisper_large, whisper_medium, whisper_base, whisper_small, whisper_tiny,
GroundingDINO-T, YOLO-Worldv2-L, SAM-H_point, SAM-H_box, Deeplabv3_Plus-R101,
Deeplabv3_Plus-R50, Deeplabv3-R101, Deeplabv3-R50, OCRNet_HRNet-W48,
OCRNet_HRNet-W18, PP-LiteSeg-T, PP-LiteSeg-B, SegFormer-B0, SegFormer-B1,
SegFormer-B2, SegFormer-B3, SegFormer-B4, SegFormer-B5, SeaFormer_base,
SeaFormer_tiny, SeaFormer_small, SeaFormer_large, MaskFormer_tiny, MaskFormer_small,
SLANet, SLANet_plus, SLANeXt_wired, SLANeXt_wireless, PP-OCRv5_mobile_det,
PP-OCRv5_server_det, PP-OCRv4_mobile_det, PP-OCRv4_server_det,
PP-OCRv4_mobile_seal_det, PP-OCRv4_server_seal_det, PP-OCRv3_mobile_det,
PP-OCRv3_server_det, PP-OCRv3_mobile_rec, en_PP-OCRv3_mobile_rec,
korean_PP-OCRv3_mobile_rec, japan_PP-OCRv3_mobile_rec, chinese_cht_PP-OCRv3_mobile_rec,
te_PP-OCRv3_mobile_rec, ka_PP-OCRv3_mobile_rec, ta_PP-OCRv3_mobile_rec,
latin_PP-OCRv3_mobile_rec, arabic_PP-OCRv3_mobile_rec, cyrillic_PP-OCRv3_mobile_rec,
devanagari_PP-OCRv3_mobile_rec, PP-OCRv4_mobile_rec, PP-OCRv4_server_rec,
en_PP-OCRv4_mobile_rec, PP-OCRv4_server_rec_doc, ch_SVTRv2_rec, ch_RepSVTR_rec,
PP-OCRv5_server_rec, PP-OCRv5_mobile_rec, latin_PP-OCRv5_mobile_rec,
eslav_PP-OCRv5_mobile_rec, korean_PP-OCRv5_mobile_rec, AutoEncoder_ad, DLinear_ad,
Nonstationary_ad, PatchTST_ad, TimesNet_ad, TimesNet_cls, DLinear, NLinear,
Nonstationary, PatchTST, RLinear, TiDE, TimesNet, PP-TSM-R50_8frames_uniform,
PP-TSMv2-LCNetV2_8frames_uniform, PP-TSMv2-LCNetV2_16frames_uniform, YOWO]
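A minimal sketch for checking (and, if needed, fixing) the exported .yml, assuming the exported file is named inference.yml, PyYAML is installed, and the key lives under Global -> model_name as described above:

import yaml

# hypothetical path to the exported recognition model's yml file
yml_path = r"output\output_infer_rec\inference.yml"

with open(yml_path, "r", encoding="utf-8") as f:
    cfg = yaml.safe_load(f)

print(cfg["Global"]["model_name"])  # must be one of the names in the list above

# if it is not, overwrite it with the matching official name and write it back
cfg["Global"]["model_name"] = "PP-OCRv5_mobile_rec"
with open(yml_path, "w", encoding="utf-8") as f:
    yaml.safe_dump(cfg, f, allow_unicode=True)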
Running your own model:
from paddleocr import PaddleOCR

ocr = PaddleOCR(
    text_detection_model_name="PP-OCRv5_mobile_det",
    text_recognition_model_name="PP-OCRv5_mobile_rec",
    text_detection_model_dir=r"C:\Users\Administrator\Desktop\Mo_paddleocr\PaddleOCR-release-3.1\output\output_infer_det",
    text_recognition_model_dir=r"C:\Users\Administrator\Desktop\Mo_paddleocr\PaddleOCR-release-3.1\output\output_infer_rec",
    use_doc_orientation_classify=False,
    use_doc_unwarping=False,
    use_textline_orientation=False,
)

result = ocr.predict("train_data\\icdar2015\\text_localization\\det\\test\\28.jpg")
for res in result:
    res.print()
    res.save_to_img("output")
    res.save_to_json("output")
text_detection_model_name="PP-OCRv5_mobile_det",
text_recognition_model_name="PP-OCRv5_mobile_rec",
These two parameters must match the model_name set in each exported .yml file.
Model export
Model export is usually done with the officially provided command line. I find that cumbersome, so I modify tools/export_model.py directly and simply run export_model.py after the changes (the equivalent official command is shown after the script for reference):
# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import os
import sys

__dir__ = os.path.dirname(os.path.abspath(__file__))
sys.path.append(__dir__)
sys.path.insert(0, os.path.abspath(os.path.join(__dir__, "..")))

import argparse

from tools.program import load_config, merge_config, ArgsParser
from ppocr.utils.export_model import export


def main():
    # FLAGS = ArgsParser().parse_args()
    # config = load_config(FLAGS.config)
    # config = merge_config(config, FLAGS.opt)
    # config.yml written by the finished training run
    my_config = "C:\\Users\\Administrator\\Desktop\\Mo_paddleocr\\PaddleOCR-release-3.1\\output\\Msy_test_ppocr_rec\\config.yml"
    config = load_config(my_config)
    my_opt = {
        "Global.pretrained_model": "C:\\Users\\Administrator\\Desktop\\Mo_paddleocr\\PaddleOCR-release-3.1\\output\\Msy_test_ppocr_rec\\best_model\\model.pdparams",
        "Global.save_inference_dir": "C:\\Users\\Administrator\\Desktop\\Mo_paddleocr\\PaddleOCR-release-3.1\\output_infer_rec",
    }
    config = merge_config(config, my_opt)
    # export model
    export(config)


if __name__ == "__main__":
    main()
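For reference, the unmodified script is normally invoked from the command line roughly like this (using the same paths that are hard-coded above):

python tools/export_model.py -c output/Msy_test_ppocr_rec/config.yml -o Global.pretrained_model=output/Msy_test_ppocr_rec/best_model/model.pdparams Global.save_inference_dir=output_infer_rec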