Crnn_chinese_characters: Chinese Character Recognition

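This article describes a CRNN-based method for Chinese character recognition and walks through the complete workflow from environment setup to model testing, including the offline installation of Anaconda, torch, and other dependencies, and the configuration and running of the crnn_chinese_characters_rec project.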

Key source code address:

I. Experimental environment

No network access, no root privileges, 64-bit CentOS.

II. Steps

1. Install Anaconda offline

Anaconda Tsinghua mirror

First download Anaconda3 4.2, which corresponds to Python 3.5: Anaconda3-4.2.0-Linux-x86_64.sh

Related blog post

2. Install torch offline

Then download torch-0.4.0-cp35-cp35m-linux_x86_64.whl
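
On an offline machine it is worth confirming that a downloaded wheel matches the interpreter before copying it over. The sketch below is only an illustration: the helper names `parse_wheel_filename` and `matches_interpreter` are hypothetical, and it checks only the Python tag of the filename (the `cp35` part), not the ABI or platform tags that pip also considers.

```python
import sys


def parse_wheel_filename(filename):
    """Split a wheel filename into its name, version and tag components."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel: " + filename)
    parts = filename[:-len(".whl")].split("-")
    # Layout: name-version[-build]-python-abi-platform
    if len(parts) not in (5, 6):
        raise ValueError("unexpected wheel filename: " + filename)
    return {
        "name": parts[0],
        "version": parts[1],
        "python": parts[-3],
        "abi": parts[-2],
        "platform": parts[-1],
    }


def matches_interpreter(filename, version_info=sys.version_info):
    """Return True if the wheel's Python tag fits the given interpreter version."""
    tags = parse_wheel_filename(filename)["python"].split(".")
    major, minor = version_info[0], version_info[1]
    candidates = {
        "cp{}{}".format(major, minor),  # CPython-specific tag, e.g. cp35
        "py{}{}".format(major, minor),  # version-specific generic tag
        "py{}".format(major),           # generic tag, e.g. py3
    }
    return any(tag in candidates for tag in tags)
```

For example, `matches_interpreter("torch-0.4.0-cp35-cp35m-linux_x86_64.whl", (3, 5))` is true, while the same wheel is rejected for `(3, 8)`.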

3. Configure environment variables

Edit ~/.bashrc and add the following environment variables:

## WARP_CTC
export CUDA_HOME="/usr/local/cuda"
export TENSORFLOW_SRC_PATH="/data/home/douglaswang/anaconda3/lib/python3.5/site-packages:$TENSORFLOW_SRC_PATH"
export WARP_CTC_PATH="/data/home/douglaswang/2019-01/warp-ctc/build:$WARP_CTC_PATH"
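
After reloading ~/.bashrc (and once the warp-ctc build directory exists), a small script can confirm that the variables point at real directories. This is a minimal sketch; the function name `check_build_env` is hypothetical, and it only tests that each colon-separated entry is a directory, not that CUDA or warp-ctc is actually usable.

```python
import os


def check_build_env(env=os.environ):
    """Return a list of human-readable problems with the warp-ctc build variables."""
    problems = []
    for name in ("CUDA_HOME", "WARP_CTC_PATH"):
        value = env.get(name, "")
        if not value:
            problems.append(name + " is not set")
            continue
        # A variable may hold a colon-separated list; check every non-empty entry.
        for entry in value.split(os.pathsep):
            if entry and not os.path.isdir(entry):
                problems.append("{}: {} is not a directory".format(name, entry))
    return problems


if __name__ == "__main__":
    for line in check_build_env() or ["environment looks consistent"]:
        print(line)
```

Empty entries are skipped on purpose: appending `:$WARP_CTC_PATH` to a previously unset variable leaves a trailing colon.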
4. Install warp-ctc
git clone https://github.com/SeanNaren/warp-ctc.git
cd warp-ctc
mkdir build; cd build
cmake ..
make

Then install the bindings:

cd ../pytorch_binding
python setup.py install
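5. Run python test.py in the Crnn_chinese_characters_rec directory to test

Because the environment is offline, the run reports missing packages. Download the corresponding files from PyPI and install them as follows: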

  • torchvision-0.1.8-py2.py3-none-any.whl
    • pip install torchvision-0.1.8-py2.py3-none-any.whl
  • lmdb-0.94.tar.gz (install from source)
    • tar xzvf lmdb-0.94.tar.gz
    • cd lmdb-0.94
    • python setup.py install

The output looks like this:

[douglaswang@Tencent-SNG ~/2019-01/crnn_chinese_characters_rec]$ python test.py
loading pretrained model from trained_models/mixed_second_finetune_acc97p7.pth
results: 男装、女装、童装、婴儿装、内衣、服饰、泳衣、家用饰品、针纺织品、服装面料及辅
elapsed time: 0.0521390438079834
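
If test.py stops with an ImportError instead of printing a result like the one above, the following sketch lists which dependencies are still missing in one pass. The import names in `REQUIRED_MODULES` are assumptions based on the packages installed in this walkthrough, and `missing_modules` is a hypothetical helper. It only checks that a module can be located, so a module that is found can still fail to load if one of its shared libraries is missing.

```python
import importlib.util

# Import names assumed for the packages installed in the steps above.
REQUIRED_MODULES = ("torch", "torchvision", "lmdb", "warpctc_pytorch")


def missing_modules(names=REQUIRED_MODULES):
    """Return the module names that cannot be located by this interpreter."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    print("missing: " + ", ".join(missing) if missing else "all modules found")
```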

III. Problems encountered installing the PyTorch warp-ctc binding

Q: fatal error: torch/extension.h: No such file or directory
src/binding.cpp:6:10: fatal error: torch/extension.h: No such file or directory
 #include <torch/extension.h>

A: This is caused by a version mismatch between the warp-ctc code and the installed torch. Roll the code back to the 0.4 version: `git checkout ac045b6072b9bc3454fb9f9f17674f0d59373789`
Q: THC_API cudaError_t THCudaMalloc(THCState *state, void **ptr, size_t size);
THC_API cudaError_t THCudaMalloc(THCState *state, void **ptr, size_t size);
error: command 'x86_64-linux-gnu-gcc' failed with exit status 1

A: Modify pytorch_binding/src/binding.cpp as follows:
1. At line 92:
    int probs_size = THCudaTensor_size(state, probs, 2);
2. At line 105:
    void* gpu_workspace;
    THCudaMalloc(state, &gpu_workspace, gpu_size_bytes);

Problems encountered installing Caffe

Q: ImportError: libcusolver.so.8.0: cannot open shared object file: No such file or directory
A:  sudo vim ~/.bashrc
    export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/cuda/lib64"
    export CUDA_HOME=/usr/local/cuda
    source ~/.bashrc
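
To confirm that the new LD_LIBRARY_PATH actually contains the library named in the error, the directories can be scanned directly. A minimal sketch, with `find_shared_library` as a hypothetical helper name; it only inspects the directories in the given search path, whereas the dynamic loader also consults other locations such as the ldconfig cache.

```python
import os


def find_shared_library(filename, search_path):
    """Return every directory in a colon-separated search path that contains filename."""
    hits = []
    for directory in search_path.split(os.pathsep):
        if directory and os.path.isfile(os.path.join(directory, filename)):
            hits.append(directory)
    return hits


if __name__ == "__main__":
    path = os.environ.get("LD_LIBRARY_PATH", "")
    print(find_shared_library("libcusolver.so.8.0", path) or "not found on LD_LIBRARY_PATH")
```

An empty result after sourcing ~/.bashrc suggests that /usr/local/cuda/lib64 holds a different CUDA version than the one the binary was built against.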
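Q: ‘kEmptyString’ is not a member of ‘google::protobuf::internal’
A: This happens because the protoc compiler version does not match the protobuf header files. The system may have several versions of protoc installed but only one set of protobuf headers. The fix is to specify which protoc to use in the Makefile: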

#$(Q)protoc --proto_path=$(PROTO_SRC_DIR) --cpp_out=$(PROTO_BUILD_DIR)
$(Q)/usr/bin/protoc --proto_path=$(PROTO_SRC_DIR) --cpp_out=$(PROTO_BUILD_DIR)

In the Makefile, change these two lines:
    $(Q)protoc --proto_path=$(PROTO_SRC_DIR) --cpp_out=$(PROTO_BUILD_DIR) $<
    $(Q)protoc --proto_path=$(PROTO_SRC_DIR) --python_out=$(PY_PROTO_BUILD_DIR) $<
to
    $(Q)/usr/bin/protoc --proto_path=$(PROTO_SRC_DIR) --cpp_out=$(PROTO_BUILD_DIR) $<
    $(Q)/usr/bin/protoc --proto_path=$(PROTO_SRC_DIR) --python_out=$(PY_PROTO_BUILD_DIR) $<

Caffe training problems

Q: F0107 18:58:32.448169 21800 syncedmem.cpp:71] Check failed: error == cudaSuccess (2 vs. 0) out of memory
A: The training batch_size was set too large; reducing it fixed the problem. http://blog.sina.com.cn/s/blog_141f234870102w8is.html

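Some tricks

  • You can specify the protoc path directly in the Makefile
    • The command whereis protoc lists the paths where protoc is installed
    • The command which protoc shows the protoc that is used by default
    • The command protoc --version shows the version of the current protoc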
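
The same checks can be scripted to see every protoc on PATH with its version at a glance. This is a sketch under stated assumptions: `find_executables` and `report_versions` are hypothetical helper names, and the script walks only the directories on PATH (whereis also looks in other standard locations). The first entry it reports is the one which protoc would pick.

```python
import os
import subprocess


def find_executables(name, search_path):
    """Return every executable called name on a PATH-style string, in lookup order."""
    found = []
    for directory in search_path.split(os.pathsep):
        candidate = os.path.join(directory, name)
        if directory and os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            if candidate not in found:
                found.append(candidate)
    return found


def report_versions(name="protoc"):
    """Print each copy of name found on PATH together with its --version output."""
    for path in find_executables(name, os.environ.get("PATH", "")):
        try:
            out = subprocess.check_output([path, "--version"], stderr=subprocess.STDOUT)
            print(path, "->", out.decode().strip())
        except (subprocess.CalledProcessError, OSError) as exc:
            print(path, "-> could not query version:", exc)
```

Calling `report_versions()` prints one line per copy found, which makes it easier to pick the path to hard-code in the Makefile.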
