Build Tesseract with OpenCV

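This article describes in detail how to set up the OCR module in OpenCV and implement text recognition. The main steps are: download and build Leptonica and Tesseract, configure the OpenCV solution, and create a test project. It is intended for developers who want to add text recognition to image processing applications.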

Background.

Our AOI (automated optical inspection) software needs OCR to recognize the text printed on chips. Because our vision software is based on OpenCV, the first choice is the text module in opencv_contrib.

Procedures.

  1. The OCR module is not part of the standard OpenCV package. It is in the text module of opencv_contrib, which can be downloaded from the opencv_contrib repository.

  2. The OCR module is built on Tesseract, and Tesseract depends on Leptonica, so Leptonica and Tesseract must be built first.

  3. Get Leptonica from https://github.com/charlesw/tesseract-vs2012. This project builds directly, without changes. The output is liblept171d.dll and liblept171d.lib.

  4. Get Tesseract from https://github.com/tesseract-ocr/tesseract. Copy liblept171d.lib to the .\tesseract\lib folder. Create the .\tesseract\include\leptonica folder and copy into it all the header files from .\tesseract-vs2012\liblept\include (under the root folder from step 3). In the tesseract project properties, change the include folder paths "......\include" and "......\include\leptonica" to "....\include" and "....\include\leptonica". Then build the Tesseract project. The output is libtesseract304d.dll and libtesseract304d.lib.

  5. Use CMake to configure the OpenCV solution. Copy the text module from opencv_contrib to .\OpenCV\sources\modules. Run cmake-gui and set three options: Lept_LIBRARY, Tesseract_INCLUDE_DIR, and Tesseract_LIBRARY. Set Tesseract_INCLUDE_DIR to ...../tesseract/API. Once they are set, run CMake to configure and generate the solution.

  6. Build the OpenCV solution. If the build reports header files that cannot be found, locate them in the tesseract source tree and copy them to the API folder. There may also be a compile error at std::numeric_limits<double>::min(). If so, add the code below before the function that uses it.

#undef max
#undef min
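Why the #undef lines help can be shown in isolation. The sketch below assumes the error comes from function-like min/max macros such as the ones <windows.h> defines when NOMINMAX is not set; the two #define lines are stand-ins for those macros, and the function names are made up for illustration. It shows the #undef fix next to an alternative that leaves the macros in place.

```cpp
#include <limits>

// Stand-ins (assumption) for the function-like min/max macros that
// <windows.h> defines when NOMINMAX is not set.
#define min(a, b) (((a) < (b)) ? (a) : (b))
#define max(a, b) (((a) > (b)) ? (a) : (b))

// Alternative: parenthesize the function name. A function-like macro is
// expanded only when its name is directly followed by '(', so this
// compiles even while the macros are still defined.
inline double smallestPositiveParenthesized()
{
    return (std::numeric_limits<double>::min)();
}

// The fix described above: remove the macros before the call site.
#undef max
#undef min

inline double smallestPositiveAfterUndef()
{
    return std::numeric_limits<double>::min();
}
```

Both functions return the same value. Defining NOMINMAX before <windows.h> is included is another common way to keep the macros from being defined at all.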

  7. Download the language data from https://github.com/tesseract-ocr/tessdata. I use eng.traineddata. Put it in .\tesseract\tessdata.

  8. After OpenCV builds successfully, create the TestOpenCV project with the function below. Before running it, copy liblept171d.dll and libtesseract304d.dll to the output folder (where the exe file is).

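#include <iostream>
#include <string>
#include <opencv2/opencv.hpp>
#include <opencv2/text.hpp>

using OCRTesseract = cv::text::OCRTesseract;

void TestOCR()
{
    // Load the test image; return quietly if it cannot be read.
    cv::Mat mat = cv::imread(".\\data\\OCRTest.png");
    if (mat.empty())
        return;

    std::string output_text;
    // Folder that holds the downloaded eng.traineddata.
    const char *dataPath = "C:/tesseract-build/tesseract/tessdata";
    cv::Ptr<OCRTesseract> ptrOcr = OCRTesseract::create(dataPath);
    ptrOcr->run(mat, output_text);
    std::cout << output_text << std::endl;
}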

Reposted from: https://my.oschina.net/u/1177171/blog/782622
