Installing Tesseract 4.1.1 on CentOS 7

Install Leptonica

First install the third-party image libraries:

yum install libtiff-devel libjpeg-devel libpng-devel -y

wget http://www.leptonica.org/source/leptonica-1.78.0.tar.gz
tar -xzvf leptonica-1.78.0.tar.gz
cd leptonica-1.78.0
./configure
make && make install
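Before moving on, it can help to confirm that the Leptonica build is visible to pkg-config, since Tesseract's configure script looks it up that way. The following is a minimal sketch, assuming Leptonica was installed under the default /usr/local prefix; the variable names LEPT_PC_DIR and LEPT_VERSION are only illustrative.

```shell
# Assumption: Leptonica was installed under the default /usr/local prefix,
# so its pkg-config file (lept.pc) lives in /usr/local/lib/pkgconfig.
LEPT_PC_DIR="/usr/local/lib/pkgconfig"

if command -v pkg-config >/dev/null 2>&1 \
   && PKG_CONFIG_PATH="$LEPT_PC_DIR" pkg-config --exists lept; then
    # Ask pkg-config which Leptonica version it sees.
    LEPT_VERSION=$(PKG_CONFIG_PATH="$LEPT_PC_DIR" pkg-config --modversion lept)
    echo "Leptonica $LEPT_VERSION is visible to pkg-config"
else
    LEPT_VERSION=""
    echo "lept.pc not found under $LEPT_PC_DIR"
fi
```

If the second message appears, the build was probably installed under a different prefix, or make install did not finish.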

Install Tesseract-OCR

wget https://codeload.github.com/tesseract-ocr/tesseract/tar.gz/4.1.1
tar -xvf 4.1.1
cd tesseract-4.1.1/
./autogen.sh
./configure
make && make install
sudo ldconfig
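A quick sanity check after the install is to ask the binary for its version and its list of languages. This is a sketch, not part of the original steps; TESS_BIN is an illustrative variable name, and the language list will be empty or report an error until language packs are installed.

```shell
# Locate the tesseract binary; empty if it is not on PATH.
TESS_BIN=$(command -v tesseract || true)

if [ -n "$TESS_BIN" ]; then
    # Print the Tesseract and Leptonica versions the binary was built with.
    tesseract --version
    # List the language packs Tesseract can currently find.
    tesseract --list-langs
else
    echo "tesseract not found on PATH; check that /usr/local/bin is on PATH"
fi
```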

Possible errors at the ./autogen.sh and ./configure steps, and how to fix them

Problem 1
Error:
[root@iZwz9bpg2u1r39ml9st8qzZ tesseract-master]# ./autogen.sh 
Unable to find a valid copy of libtoolize or glibtoolize in your PATH!
./autogen.sh: line 59: bail_out: command not found
Running aclocal
./autogen.sh: line 88: aclocal: command not found
Something went wrong, bailing out!
Fix: yum install automake -y
Problem 2
Error:
Unable to find a valid copy of libtoolize or glibtoolize in your PATH!
./autogen.sh: line 59: bail_out: command not found
Running aclocal
Running 
./autogen.sh: line 87: -f: command not found
Something went wrong, bailing out!
Fix: yum install libtool -y
Problem 3
Error:
Leptonica 1.74 or higher is required. Try to install libleptonica-dev package
Fix:
Set the environment variables so the Leptonica build can be found
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/lib
export LIBLEPT_HEADERSDIR=/usr/local/include
export PKG_CONFIG_PATH=/usr/local/lib/pkgconfig
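The three problems above can be checked in one pass before running ./autogen.sh and ./configure. The sketch below assumes the default /usr/local prefix for Leptonica; the tool list and the variable name missing are my own choices, not something the Tesseract build scripts define.

```shell
# Assumption: Leptonica's lept.pc was installed to /usr/local/lib/pkgconfig.
export PKG_CONFIG_PATH="/usr/local/lib/pkgconfig${PKG_CONFIG_PATH:+:$PKG_CONFIG_PATH}"

# Collect the build tools that autogen.sh needs but cannot find.
missing=""
for tool in aclocal automake autoconf libtoolize pkg-config; do
    command -v "$tool" >/dev/null 2>&1 || missing="$missing $tool"
done

if [ -n "$missing" ]; then
    # aclocal ships with automake; libtoolize ships with libtool.
    echo "Missing build tools:$missing"
elif pkg-config --atleast-version=1.74 lept; then
    echo "Leptonica >= 1.74 found; the configure check should pass"
else
    echo "Leptonica >= 1.74 is not visible to pkg-config"
fi
```

Prepending to PKG_CONFIG_PATH instead of overwriting it keeps any directories that were already configured.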

Install language packs

# Fetch all language packs
git clone https://github.com/tesseract-ocr/tessdata.git
Download page: https://github.com/tesseract-ocr/tessdata
# Or download a single language pack
wget --no-check-certificate https://github.com/tesseract-ocr/tessdata/raw/master/eng.traineddata

chi_sim.traineddata  Chinese (Simplified)
eng.traineddata      English
enm.traineddata      Middle English (not a digits model)

The default language pack path is usually: /usr/local/share/tessdata

In other words, just move the language packs you need into this directory.
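Copying the files can be wrapped in a small helper. This is a sketch under the assumption that the traineddata files were already downloaded into some local directory; install_langs and TESSDATA_DIR are hypothetical names introduced here, not part of Tesseract.

```shell
# Target directory; defaults to the path mentioned in the text.
TESSDATA_DIR="${TESSDATA_DIR:-/usr/local/share/tessdata}"

# Hypothetical helper: install_langs <source_dir> <lang> [<lang> ...]
# Copies <lang>.traineddata from <source_dir> into $TESSDATA_DIR.
install_langs() {
    src_dir="$1"
    shift
    for lang in "$@"; do
        f="$src_dir/$lang.traineddata"
        if [ -f "$f" ]; then
            cp "$f" "$TESSDATA_DIR/" && echo "installed $lang"
        else
            echo "missing $f" >&2
        fi
    done
}

# Example call (run as root if the target directory requires it):
# install_langs ./tessdata chi_sim eng
```

Afterwards, tesseract --list-langs should show the copied languages.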

Run the command

 tesseract /home/aa.jpg stdout -l chi_sim+eng
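To write the result to a file instead of stdout, pass an output base name; Tesseract appends .txt to it. The sketch below also sets a page segmentation mode with --psm. The output path and the choice of mode 6 (treat the image as a single uniform block of text) are assumptions for illustration and may not suit every image.

```shell
IMG="/home/aa.jpg"   # sample image path from the text
OUT="/tmp/aa_ocr"    # hypothetical output base; Tesseract writes $OUT.txt

if command -v tesseract >/dev/null 2>&1 && [ -f "$IMG" ]; then
    # --psm 6: assume a single uniform block of text.
    tesseract "$IMG" "$OUT" -l chi_sim+eng --psm 6
    cat "$OUT.txt"
else
    echo "tesseract or $IMG not available; skipping"
fi
```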

Summary

Recognition quality was mediocre in my test, probably because the trained model does not handle Chinese very well.

See the following two articles for reference:

Compiling and installing tesseract-ocr 3.05 on CentOS

Installing tesseract on CentOS
