First, update Ubuntu
sudo apt-get update
sudo apt-get upgrade
Install the classic GNOME desktop for Ubuntu
sudo apt-get install gnome-session-fallback
Prerequisites
1. Install pip
sudo apt-get install python-pip
2. Install setuptools
wget https://bootstrap.pypa.io/ez_setup.py -O - | sudo python
3. Install lxml
sudo apt-get install python-lxml
sudo apt-get install libxml2-dev libxslt-dev python-dev
To check that it worked, run in Python: import lxml
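4. Install OpenSSL
sudo apt-get install libffi-dev
sudo apt-get install libssl-dev libssl0.9.8 libgtk2.0-dev
To check that it worked, run in Python: import OpenSSL
(this module comes from pyOpenSSL; if it is missing, retry after step 5, since pip pulls it in as a Scrapy dependency)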
5. Install Scrapy
sudo pip install scrapy
To check that it worked, run: scrapy
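The three import checks from steps 3 to 5 can be run in one go. The following is a minimal sketch, not part of the original notes; the helper name check_modules is made up for illustration, and it treats any exception raised during import as a failed install.

```python
from __future__ import print_function

import importlib


def check_modules(names):
    """Return a dict mapping each module name to True if it imports cleanly."""
    results = {}
    for name in names:
        try:
            importlib.import_module(name)
            results[name] = True
        except Exception:
            # A missing or broken install both count as a failure here.
            results[name] = False
    return results


if __name__ == "__main__":
    # Module names taken from the checks in steps 3 to 5.
    report = check_modules(["lxml", "OpenSSL", "scrapy"])
    for name, ok in sorted(report.items()):
        print("%-8s %s" % (name, "OK" if ok else "MISSING"))
```

Any module reported as MISSING points back to the step that installs it.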
Install other software used for crawling with Scrapy:
1. python-webkit, used to execute JavaScript code while Scrapy crawls data
sudo apt-get install python-webkit
Some related packages also need to be installed
https://wiki.python.org/moin/PythonWebKit
sudo apt-get install python-dev python-ply
http://www.gnu.org/software/pythonwebkit/
The latest libwebkitgtk at the time of writing is 3.0
sudo apt-get install libwebkitgtk-3.0-0
2. Install jswebkit. The previous step may not have installed it, in which case install it yourself
sudo apt-get install python-jswebkit
3. Install pyjamas
https://github.com/pyjs/pyjs/wiki/pyjamasubuntuwebkitgtk
Install libwebkit for its dependencies, and then remove it:
sudo apt-get build-dep libwebkit-dev
sudo apt-get install libwebkit-dev
sudo apt-get remove libwebkit-dev
sudo apt-get install pyjamas
https://wiki.python.org/moin/PyjamasDesktop
If the command above fails to install it, find the deb package at https://launchpad.net/ubuntu/+source/pyjamas and install it
4. Install PyWebkitDFB
sudo apt-get install libdirectfb-dev
http://www.gnu.org/software/pythonwebkit/
5. Install libdirectfb-extra. It includes an X11 plugin, which is enabled by editing ~/.directfbrc and adding the following two lines
system=x11
force-windowed
The packages are:
libdirectfb-1.2-9-dbg
libdirectfb-extra-dbg
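The config edit in step 5 can also be scripted so that it is safe to repeat. The following is a sketch under two assumptions: that the per-user DirectFB config file is ~/.directfbrc, and that each option sits on its own line. The helper name ensure_directfb_options is made up for illustration.

```python
from __future__ import print_function

import io
import os

# The two option lines from step 5.
REQUIRED_LINES = ["system=x11", "force-windowed"]


def ensure_directfb_options(path, required=REQUIRED_LINES):
    """Append any missing option lines to the file at `path`.

    Returns the list of lines that were added (empty if nothing changed).
    """
    content = u""
    if os.path.exists(path):
        with io.open(path, "r", encoding="utf-8") as handle:
            content = handle.read()
    existing = [line.strip() for line in content.splitlines()]
    missing = [line for line in required if line not in existing]
    if missing:
        with io.open(path, "a", encoding="utf-8") as handle:
            # Avoid gluing the first new option onto an unterminated last line.
            if content and not content.endswith(u"\n"):
                handle.write(u"\n")
            for line in missing:
                handle.write(u"%s\n" % line)
    return missing
```

Calling ensure_directfb_options(os.path.expanduser("~/.directfbrc")) creates the file if needed and leaves existing settings untouched.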
6. Install libcurl4
sudo apt-get install libcurl4-gnutls-dev
7. Xvfb, for use in environments without an X Window display
sudo apt-get install xvfb
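Once Xvfb is installed, a crawler that needs a display can start a virtual one first. The following is a rough sketch, not from the original notes; the display number :99, the 1280x1024x24 screen size and the helper names xvfb_command and start_xvfb are all assumptions chosen for illustration.

```python
from __future__ import print_function

import os
import subprocess


def xvfb_command(display=":99", width=1280, height=1024, depth=24):
    """Build the argument list for an Xvfb server on the given display."""
    screen = "%dx%dx%d" % (width, height, depth)
    return ["Xvfb", display, "-screen", "0", screen]


def start_xvfb(display=":99"):
    """Start Xvfb in the background and point DISPLAY at it.

    Returns the Popen handle, or None if the Xvfb binary cannot be launched.
    """
    try:
        process = subprocess.Popen(xvfb_command(display))
    except OSError:
        # Raised when the Xvfb executable is not found or not runnable.
        return None
    os.environ["DISPLAY"] = display
    return process
```

Call start_xvfb() before loading any WebKit or GTK code, and call terminate() on the returned handle when the crawl is finished. For one-off runs, the xvfb-run wrapper shipped in the xvfb package does the same job from the shell.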
8. BeautifulSoup, a Python HTML/XML parser
sudo apt-get install python-bs4
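A quick way to confirm that bs4 works is to parse a small HTML string. The following is a minimal sketch; the helper name extract_links and the sample markup are made up for illustration, and it uses Python's built-in "html.parser" backend so it does not depend on lxml.

```python
from __future__ import print_function

try:
    from bs4 import BeautifulSoup
except ImportError:
    # python-bs4 is not installed.
    BeautifulSoup = None


def extract_links(html):
    """Return the href of every <a> tag that has one, in document order."""
    if BeautifulSoup is None:
        raise RuntimeError("bs4 is not installed")
    soup = BeautifulSoup(html, "html.parser")
    return [a["href"] for a in soup.find_all("a", href=True)]


if __name__ == "__main__" and BeautifulSoup is not None:
    sample = '<p><a href="/a">A</a> <a>no link</a> <a href="/b">B</a></p>'
    print(extract_links(sample))
```

Anchors without an href attribute are skipped rather than raising a KeyError.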