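Installing the Scrapy Crawler Framework on Kali

This article walks through setting up a Scrapy crawling environment: installing Python, Twisted, w3lib, and the other dependencies, then verifying that the installation succeeded.

Reference: http://www.linuxidc.com/Linux/2012-07/66236.htm

Preparation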
Requirements
Python 2.5, 2.6, 2.7 (3.x is not yet supported)
Twisted 2.5.0, 8.0 or above (Windows users: you’ll need to install Zope.Interface and maybe pywin32 because of this Twisted bug)
w3lib
lxml or libxml2 (if using libxml2, version 2.6.28 or above is highly recommended)
simplejson (not required if using Python 2.6 or above)
pyopenssl (for HTTPS support. Optional, but highly recommended)
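Before installing anything, it can help to see which of these requirements are already present. The following is a minimal sketch, assuming Python 2.7 or newer; the helper name `missing_modules` and the `REQUIRED_MODULES` mapping are illustrative and not part of Scrapy, and the import names are the usual ones for these packages.

```python
from __future__ import print_function

import importlib
import sys

# Requirement name -> module name to import.
# Assumption: these are the usual import names of the packages listed above.
REQUIRED_MODULES = {
    "Twisted": "twisted",
    "w3lib": "w3lib",
    "lxml": "lxml",
    "pyOpenSSL": "OpenSSL",
}


def missing_modules(modules):
    """Return the sorted requirement names whose module cannot be imported."""
    missing = []
    for requirement, module_name in modules.items():
        try:
            importlib.import_module(module_name)
        except Exception:
            # ImportError is the common case; a broken install may raise
            # something else at import time, which counts as missing too.
            missing.append(requirement)
    return sorted(missing)


if __name__ == "__main__":
    print("Python", sys.version.split()[0])
    for name in missing_modules(REQUIRED_MODULES):
        print("missing:", name)
```

Any requirement reported as missing still has to be installed with the steps that follow.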
---------------------------------------------
Installing Twisted
sudo apt-get install python-twisted python-libxml2 python-simplejson
Once the installation finishes, start python and check that Twisted was installed successfully.
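One possible way to run this check from a script rather than the interactive prompt is sketched below. `twisted_version` is a hypothetical helper name; the sketch only assumes that the `twisted` package exposes `__version__`.

```python
def twisted_version():
    """Return the installed Twisted version string, or None if it is missing."""
    try:
        import twisted
    except ImportError:
        return None
    return twisted.__version__


# Prints the version string if Twisted is importable, otherwise None.
print(twisted_version())
```

A result of None suggests the apt-get step did not install Twisted for the Python interpreter being used.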


pyOpenSSL
wget http://pypi.python.org/packages/source/p/pyOpenSSL/pyOpenSSL-0.13.tar.gz#md5=767bca18a71178ca353dff9e10941929
tar -zxvf pyOpenSSL-0.13.tar.gz
cd pyOpenSSL-0.13
sudo python setup.py install
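The download URL carries an `#md5=` fragment, which wget does not check by itself. As a rough sketch, the tarball could be verified against that digest before running setup.py; the helper names `expected_md5`, `file_md5`, and `matches_checksum` are made up for illustration.

```python
import hashlib


def expected_md5(url):
    """Return the hex digest after '#md5=' in a download URL, or None."""
    _, marker, digest = url.partition("#md5=")
    return digest.lower() if marker else None


def file_md5(path, chunk_size=65536):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    md5 = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            md5.update(chunk)
    return md5.hexdigest()


def matches_checksum(path, url):
    """Return True if the file at path has the MD5 digest named in the URL."""
    digest = expected_md5(url)
    return digest is not None and file_md5(path) == digest
```

For example, `matches_checksum("pyOpenSSL-0.13.tar.gz", url)` with the wget URL above should return True for an intact download. The same check applies to the other tarballs fetched in this article.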


pycrypto
wget http://pypi.python.org/packages/source/p/pycrypto/pycrypto-2.5.tar.gz#md5=783e45d4a1a309e03ab378b00f97b291
tar -zxvf pycrypto-2.5.tar.gz
cd pycrypto-2.5
sudo python setup.py install


Check that the installation succeeded
$ python
>>> import Crypto
>>> import twisted.conch.ssh.transport
>>> print Crypto.PublicKey.RSA
<module 'Crypto.PublicKey.RSA' from '/usr/python/lib/python2.5/site-packages/Crypto/PublicKey/RSA.pyc'>
>>> import OpenSSL 
>>> import twisted.internet.ssl
>>> twisted.internet.ssl
<module 'twisted.internet.ssl' from '/usr/python/lib/python2.5/site-packages/Twisted-10.1.0-py2.5-linux-i686.egg/twisted/internet/ssl.pyc'>
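If you see output similar to the above, the pyOpenSSL module has been installed successfully. Otherwise, recheck the installation steps above (the twisted.conch check requires pycrypto).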


w3lib
sudo easy_install -U w3lib


Scrapy
wget http://pypi.python.org/packages/source/S/Scrapy/Scrapy-0.14.3.tar.gz#md5=59f1225f7692f28fa0f78db3d34b3850
tar -zxvf Scrapy-0.14.3.tar.gz
cd Scrapy-0.14.3
sudo python setup.py install


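Verifying the Scrapy installation
With the installation and configuration steps above complete, Scrapy is installed. We can verify it from the command line as follows: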
$ scrapy
Scrapy 0.14.3 - no active project


Usage:
  scrapy <command> [options] [args]


Available commands:
  fetch         Fetch a URL using the Scrapy downloader
  runspider     Run a self-contained spider (without creating a project)
  settings      Get settings values
  shell         Interactive scraping console
  startproject  Create new project
  version       Print Scrapy version
  view          Open URL in browser, as seen by Scrapy




