Python 3 Web Crawler Development in Practice

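This post documents how I worked through the errors from installing tesserocr with pip on Windows, covering environment configuration, version matching, and the eventual discovery that tesserocr had apparently been renamed. Switching to pytesseract, with a few corresponding adjustments, made the installation succeed.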


Running pip install tesserocr from the command line to install tesserocr produced the following error:

Microsoft Windows [Version 6.1.7601]
Copyright © 2009 Microsoft Corporation. All rights reserved.

C:\Users\Administrator>pip install tesserocr
Collecting tesserocr
Using cached tesserocr-2.5.0.tar.gz (54 kB)
Building wheels for collected packages: tesserocr
Building wheel for tesserocr (setup.py) ... error
ERROR: Command errored out with exit status 1:
command: 'c:\users\administrator\appdata\local\programs\python\python38-32\python.exe' -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\Users\Administrator\AppData\Local\Temp\pip-install-kv0214pl\tesserocr\setup.py'"'"'; __file__='"'"'C:\Users\Administrator\AppData\Local\Temp\pip-install-kv0214pl\tesserocr\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' bdist_wheel -d 'C:\Users\Administrator\AppData\Local\Temp\pip-wheel-ky53j_jh'
cwd: C:\Users\Administrator\AppData\Local\Temp\pip-install-kv0214pl\tesserocr
Complete output (12 lines):
C:\Users\Administrator\AppData\Local\Temp\pip-install-kv0214pl\tesserocr\setup.py:72: SyntaxWarning: "is not" with a literal. Did you mean "!="?
if subversion is not None and subversion is not "":
C:\Users\Administrator\AppData\Local\Temp\pip-install-kv0214pl\tesserocr\setup.py:134: DeprecationWarning: The 'warn' method is deprecated, use 'warning' instead
_LOGGER.warn('Failed to extract tesseract version from executable: {}'.format(e))
Failed to extract tesseract version from executable: [WinError 2] The system cannot find the file specified.
Supporting tesseract v3.04.00
Building with configs: {'libraries': ['tesseract', 'lept'], 'cython_compile_time_env': {'TESSERACT_VERSION': 50593792}}
running bdist_wheel
running build
running build_ext
building 'tesserocr' extension
error: Microsoft Visual C++ 14.0 is required. Get it with "Microsoft Visual C++ Build Tools": https://visualstudio.microsoft.com/downloads/

ERROR: Failed building wheel for tesserocr
Running setup.py clean for tesserocr
Failed to build tesserocr
Installing collected packages: tesserocr
Running setup.py install for tesserocr ... error
ERROR: Command errored out with exit status 1:
command: 'c:\users\administrator\appdata\local\programs\python\python38-32\python.exe' -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\Users\Administrator\AppData\Local\Temp\pip-install-kv0214pl\tesserocr\setup.py'"'"'; __file__='"'"'C:\Users\Administrator\AppData\Local\Temp\pip-install-kv0214pl\tesserocr\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record 'C:\Users\Administrator\AppData\Local\Temp\pip-record-8sn912s2\install-record.txt' --single-version-externally-managed --compile --install-headers 'c:\users\administrator\appdata\local\programs\python\python38-32\Include\tesserocr'
cwd: C:\Users\Administrator\AppData\Local\Temp\pip-install-kv0214pl\tesserocr
Complete output (12 lines):
C:\Users\Administrator\AppData\Local\Temp\pip-install-kv0214pl\tesserocr\setup.py:72: SyntaxWarning: "is not" with a literal. Did you mean "!="?
if subversion is not None and subversion is not "":
C:\Users\Administrator\AppData\Local\Temp\pip-install-kv0214pl\tesserocr\setup.py:134: DeprecationWarning: The 'warn' method is deprecated, use 'warning' instead
_LOGGER.warn('Failed to extract tesseract version from executable: {}'.format(e))
Failed to extract tesseract version from executable: [WinError 2] The system cannot find the file specified.
Supporting tesseract v3.04.00
Building with configs: {'libraries': ['tesseract', 'lept'], 'cython_compile_time_env': {'TESSERACT_VERSION': 50593792}}
running install
running build
running build_ext
building 'tesserocr' extension
error: Microsoft Visual C++ 14.0 is required. Get it with "Microsoft Visual C++ Build Tools": https://visualstudio.microsoft.com/downloads/

ERROR: Command errored out with exit status 1: 'c:\users\administrator\appdata\local\programs\python\python38-32\python.exe' -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\Users\Administrator\AppData\Local\Temp\pip-install-kv0214pl\tesserocr\setup.py'"'"'; __file__='"'"'C:\Users\Administrator\AppData\Local\Temp\pip-install-kv0214pl\tesserocr\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record 'C:\Users\Administrator\AppData\Local\Temp\pip-record-8sn912s2\install-record.txt' --single-version-externally-managed --compile --install-headers 'c:\users\administrator\appdata\local\programs\python\python38-32\Include\tesserocr' Check the logs for full command output.
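
Two separate problems show up in the output above: setup.py could not run the tesseract executable ([WinError 2]), and the build then stopped because Visual C++ 14.0 was missing. The following is a minimal sketch for collecting the relevant facts about an environment before retrying. The function name diagnose_build_environment is my own, not part of pip or tesserocr, and it does not check for the C++ compiler.

```python
import platform
import shutil
import sys


def diagnose_build_environment():
    """Collect facts relevant to building tesserocr from source on Windows."""
    return {
        # e.g. "3.8"; compiled packages are built per Python version
        "python_version": "{}.{}".format(sys.version_info[0], sys.version_info[1]),
        # "32bit" or "64bit"; the python38-32 path in the log points to a 32-bit interpreter
        "bits": platform.architecture()[0],
        # None means no tesseract executable was found on PATH,
        # which would be consistent with the [WinError 2] warning
        "tesseract_path": shutil.which("tesseract"),
    }


if __name__ == "__main__":
    for key, value in diagnose_build_environment().items():
        print("{}: {}".format(key, value))
```

If tesseract_path comes back as None, adding the Tesseract install directory to PATH should at least remove the version-detection warning, although the compiler error would remain.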

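I then spent an entire morning looking for a fix, including checking whether my versions matched and configuring environment variables.
While looking into version matching, I found it odd that every guide covered Python 3.7 or earlier and none mentioned 3.8. I assumed I simply had not found the download link for the latest build, so I went looking and downloaded tesserocr-2.5.0.tar myself (P.S. this is the same version that pip downloads; my Tesseract was the latest release, though an older one probably would not have worked either, since my Python version is 3.8). Installing the downloaded tesserocr-2.5.0.tar with pip install followed by the file path (dragged into the command-line window) still failed, this time with an error saying it was not supported on this platform. Only then did I realize that tesserocr itself might be the problem, but I did not know its new name.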
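
A "not supported on this platform" message from pip usually means a prebuilt file's tags do not match the interpreter. The post does not show the exact message, so the following is only a sketch under that assumption. It derives the tags a Windows CPython wheel would need and compares them with a file name. The helper names and the example file names are hypothetical.

```python
import struct
import sys


def expected_wheel_tags(version_info=None, pointer_bits=None):
    """Return the (python_tag, platform_tag) a Windows CPython wheel must carry."""
    if version_info is None:
        version_info = sys.version_info
    if pointer_bits is None:
        # Size of a C pointer in bits: 32 or 64
        pointer_bits = struct.calcsize("P") * 8
    python_tag = "cp{}{}".format(version_info[0], version_info[1])
    platform_tag = "win_amd64" if pointer_bits == 64 else "win32"
    return python_tag, platform_tag


def wheel_matches(filename, version_info=None, pointer_bits=None):
    """Check a wheel file name against the expected tags."""
    if not filename.endswith(".whl"):
        return False
    # Wheel names end in ...-pythontag-abitag-platformtag.whl
    parts = filename[: -len(".whl")].split("-")
    if len(parts) < 5:
        return False
    python_tag, platform_tag = expected_wheel_tags(version_info, pointer_bits)
    return parts[-3] == python_tag and parts[-1] == platform_tag


if __name__ == "__main__":
    # A 32-bit Python 3.8, as in the log, needs cp38 and win32
    print(expected_wheel_tags((3, 8), 32))
    # Hypothetical file name built for 64-bit Python 3.7
    print(wheel_matches("example_pkg-1.0-cp37-cp37m-win_amd64.whl", (3, 8), 32))
```

This only covers the simple case of one Python tag and one platform tag per file; pip's real compatibility check accepts more tag combinations than this.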
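After going in circles, I re-downloaded older versions of tesseract and tesserocr and configured the environment variables as other bloggers described, and it still failed. Only when I found a 2019 article about configuring tesseract that mentioned pytesseract did I conclude that tesserocr had changed its name. (Strictly speaking, pytesseract is a separate package rather than a new name for tesserocr: it drives the tesseract executable instead of compiling against Tesseract's C++ API, so pip can install it without Visual C++ Build Tools.)
After that, the installation went much as the book describes, with tesserocr replaced by pytesseract and installed via pip. Only a few small changes are needed; see https://blog.youkuaiyun.com/Dongzizhu/article/details/100805894?depth_1-utm_source=distribute.pc_relevant.none-task&utm_source=distribute.pc_relevant.none-task.
In addition, image_to_text becomes image_to_string. I have not found any other problems so far.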
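
As a rough illustration of the image_to_string call, here is a minimal sketch. It assumes pytesseract and Pillow are installed and that the tesseract executable is on PATH. The wrapper name ocr_image and the file name in the usage example are placeholders of my own, not taken from the book.

```python
import shutil


def ocr_image(image_path, lang="eng"):
    """Return the text recognized in the image at image_path."""
    # Imported here so this module still loads when the packages are missing
    import pytesseract
    from PIL import Image

    # pytesseract runs the tesseract executable, so it has to be findable
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract executable not found on PATH")

    with Image.open(image_path) as img:
        # Where the tesserocr-based code called image_to_text,
        # pytesseract uses image_to_string
        return pytesseract.image_to_string(img, lang=lang)


if __name__ == "__main__":
    # "captcha.png" is a placeholder file name
    print(ocr_image("captcha.png"))
```

The explicit PATH check is there because a missing executable is otherwise reported only when the OCR call itself runs.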
