Python Programming Quick Start (continuously updated…)
I. Preparation
1. Install the build tools
yum -y groupinstall "Development tools"
yum -y install zlib-devel bzip2-devel openssl-devel ncurses-devel sqlite-devel readline-devel tk-devel gdbm-devel db4-devel libpcap-devel xz-devel
yum install libffi-devel -y
2. Download the Python source package
cd /opt/
wget https://www.python.org/ftp/python/3.8.3/Python-3.8.3.tgz
tar -zxvf Python-3.8.3.tgz -C /opt/
II. Python installation steps
1. Compile and install Python
mkdir /usr/local/python3 # create the installation directory
cd Python-3.8.3
./configure --prefix=/usr/local/python3
make && make install
When the installation finishes, the following two lines indicate success:
Installing collected packages: setuptools, pip
Successfully installed pip-19.2.3 setuptools-41.2.0
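The -devel packages installed in step I.1 decide which optional standard-library modules get built. As a rough check after `make install`, the sketch below tries to import each of them with the new interpreter. The module-to-package mapping and the helper name `check_modules` are illustrative assumptions, not part of the original steps.

```python
import importlib
import sys

# Optional standard-library modules and the -devel package each one needs
# at build time (an illustrative mapping, not an exhaustive list).
OPTIONAL_MODULES = {
    "zlib": "zlib-devel",
    "bz2": "bzip2-devel",
    "ssl": "openssl-devel",
    "curses": "ncurses-devel",
    "sqlite3": "sqlite-devel",
    "readline": "readline-devel",
    "tkinter": "tk-devel",
    "lzma": "xz-devel",
    "ctypes": "libffi-devel",
}


def check_modules(names):
    """Return {module name: True if it imports, False otherwise}."""
    results = {}
    for name in names:
        try:
            importlib.import_module(name)
            results[name] = True
        except ImportError:
            results[name] = False
    return results


if __name__ == "__main__":
    print("Python", sys.version.split()[0])
    for name, ok in check_modules(OPTIONAL_MODULES).items():
        hint = "ok" if ok else "missing (install %s, then rebuild)" % OPTIONAL_MODULES[name]
        print("%-10s %s" % (name, hint))
```

Run it with /usr/local/python3/bin/python3. If a module is reported missing, install the matching package and repeat the configure and make steps.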
2. Create symbolic links
# Show the current python symlinks
ll /usr/bin/ |grep python
lrwxrwxrwx 1 root root 7 Nov 26 2018 python -> python2
lrwxrwxrwx 1 root root 9 Nov 26 2018 python2 -> python2.7
-rwxr-xr-x 1 root root 7216 Jul 13 2018 python2.7
The system ships with Python 2.7 by default.
Remove the existing python symlink
rm -rf /usr/bin/python
Point the python symlink at Python 3
# Add a symlink to python3
ln -s /usr/local/python3/bin/python3 /usr/bin/python
Check the default Python version
python -V
Remove the default pip symlink and add a new one for pip3
rm -rf /usr/bin/pip
# Add a symlink to pip3
ln -s /usr/local/python3/bin/pip3 /usr/bin/pip
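To confirm that the new symlinks point where you expect, a small script can look each command up on PATH and follow the link to its final target. This is only a sketch; the helper name `resolve_command` is made up for illustration.

```python
import os
import shutil


def resolve_command(name, path=None):
    """Return (location on PATH, final target after following symlinks).

    Returns (None, None) when the command is not found.
    `path` is an optional search-path string, mainly useful for testing.
    """
    location = shutil.which(name, path=path)
    if location is None:
        return None, None
    return location, os.path.realpath(location)


if __name__ == "__main__":
    for cmd in ("python", "pip"):
        print(cmd, "->", resolve_command(cmd))
```

If the links above were created as shown, the reported targets should sit under /usr/local/python3/bin.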
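3. Update the yum configuration
yum itself runs on Python 2, so once /usr/bin/python points to Python 3 it stops working unless its scripts are pinned to python2. This step is required no matter which Python 3 version you install.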
vi /usr/bin/yum
Change #! /usr/bin/python to #! /usr/bin/python2
vi /usr/libexec/urlgrabber-ext-down
Change #! /usr/bin/python to #! /usr/bin/python2
vi /usr/bin/yum-config-manager
Change #!/usr/bin/python to #!/usr/bin/python2
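The same shebang change can be scripted instead of done by hand in vi. The sketch below rewrites only the first line of a file, and only when it names /usr/bin/python exactly. The function name `pin_shebang_to_python2` is hypothetical, and the code assumes the scripts are readable as UTF-8 text.

```python
import re

# Matches "#!/usr/bin/python" or "#! /usr/bin/python", but not python2/python3.
SHEBANG = re.compile(r"^#!\s*/usr/bin/python(?=\s|$)")


def pin_shebang_to_python2(path):
    """Rewrite a '#!/usr/bin/python' first line to '#!/usr/bin/python2'.

    Returns True if the file was changed, False otherwise.
    """
    with open(path, "r", encoding="utf-8") as f:
        lines = f.readlines()
    if not lines:
        return False
    new_first = SHEBANG.sub("#!/usr/bin/python2", lines[0], count=1)
    if new_first == lines[0]:
        return False  # already pinned, or not a /usr/bin/python script
    lines[0] = new_first
    with open(path, "w", encoding="utf-8") as f:
        f.writelines(lines)
    return True
```

It would be called once for each of the three files listed above, as root. Running it a second time is harmless because a line that already ends in python2 no longer matches.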
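III. Installing and using virtual environments
1. Why virtual environments are needed
Different projects often need different, mutually incompatible package versions. A virtual environment gives each project its own isolated interpreter and package directory, so they can all run on the same machine.
2. Configure a PyPI mirror in China
# Show the current index URL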
$ pip config list | grep global.index-url
# Set pip to use the Tsinghua mirror
$ pip config set global.index-url https://pypi.tuna.tsinghua.edu.cn/simple
Writing to /root/.config/pip/pip.conf
# Confirm the index URL
$ pip config list | grep global.index-url
global.index-url='https://pypi.tuna.tsinghua.edu.cn/simple'
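The file pip writes is a plain INI file, so the setting can also be read back programmatically. A minimal sketch, assuming the file location printed above; the helper name `read_index_url` is made up for illustration.

```python
import configparser
import os


def read_index_url(conf_path):
    """Return the [global] index-url from a pip config file, or None."""
    parser = configparser.RawConfigParser()
    # read() silently skips files that do not exist
    parser.read(conf_path, encoding="utf-8")
    return parser.get("global", "index-url", fallback=None)


if __name__ == "__main__":
    # Location taken from the "Writing to ..." line above, relative to $HOME
    print(read_index_url(os.path.expanduser("~/.config/pip/pip.conf")))
```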
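3. Install the virtual environment tools
a. Install virtualenv
pip install virtualenv
# Install virtualenvwrapper to manage virtual environments
# Install its system dependencies first
yum install python-setuptools python-devel
pip install virtualenvwrapper
b. Create a directory to hold the virtual environments
mkdir $HOME/.virtualenvs
c. Locate .virtualenvs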
find / -name .virtualenvs
/root/.virtualenvs
d. Locate virtualenvwrapper.sh
find / -name virtualenvwrapper.sh
/usr/local/python3/bin/virtualenvwrapper.sh
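e. Run vim ~/.bashrc and add the following
if [ "${VIRTUALENVWRAPPER_PYTHON:-}" = "" ]
then
    export VIRTUALENVWRAPPER_PYTHON=/usr/local/python3/bin/python3
fi
export WORKON_HOME=$HOME/.virtualenvs   # directory created in step b
source /usr/local/python3/bin/virtualenvwrapper.sh   # path found in step d
f. Load the configuration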
source ~/.bashrc
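4. Create a virtual environment
Option 1: use the default interpreter
mkvirtualenv myvm
Option 2: specify the Python version
mkvirtualenv -p python3 mywork
Option 3: create without downloading packages
mkvirtualenv mywork --no-download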
python@ubuntu:~$ which python3
/usr/bin/python3
python@ubuntu:~$ mkvirtualenv -p /usr/bin/python3 test
Already using interpreter /usr/bin/python3
Using base prefix '/usr'
New python executable in /home/python/.virtualenvs/test/bin/python3
Also creating executable in /home/python/.virtualenvs/test/bin/python
Installing setuptools, pkg_resources, pip, wheel…done.
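After the environment is created successfully, the shell switches into it automatically.
Notes:
1. The virtual machine must have network access.
2. Once created, the new virtual environment becomes the active one.
3. While a virtual environment is active, its name appears in parentheses at the start of the prompt.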
5. Using virtual environments
a. Enter a virtual environment
workon mywork
b. Leave the active virtual environment
deactivate
c. List the virtual environments
workon
d. Delete a virtual environment (you must leave it first)
rmvirtualenv mywork
e. Show the virtual environment's directory (the path of its python)
which python
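The same check can be made from inside Python, which is handy when a script should refuse to run outside a virtual environment. This is a sketch only; `in_virtualenv` is a hypothetical helper name.

```python
import sys


def in_virtualenv():
    """Return True when the running interpreter belongs to a virtual environment."""
    # Older virtualenv releases set sys.real_prefix; venv and newer
    # virtualenv releases make sys.prefix differ from sys.base_prefix.
    base = getattr(sys, "real_prefix", None) or getattr(sys, "base_prefix", sys.prefix)
    return base != sys.prefix


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("prefix:     ", sys.prefix)
    print("virtualenv: ", in_virtualenv())
```

After `workon mywork`, the interpreter path should sit under the `.virtualenvs` directory created earlier.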
6. Installing packages in a virtual environment
Install mrjob with pip
pip install mrjob
7. Running programs in a virtual environment
A. Upload the data and scripts to /tmp
cd /tmp
Data: test.txt
hadoop test
this is a test file
hello word count
hadoop streaming
spark spark sql
Scripts
mapper.py
import sys

# Read lines from standard input (stdin)
for line in sys.stdin:
    line = line.strip()
    words = line.split()
    for word in words:
        # Emit one "word<TAB>1" record per word
        print("%s\t%s" % (word, 1))
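reducer.py
import sys

current_word = None
current_count = 0
word = None

# The input must be sorted by word so that equal words are adjacent
for line in sys.stdin:
    line = line.strip()
    # Each record is "word<TAB>count"
    word, count = line.split()
    try:
        count = int(count)
    except ValueError:
        # Skip records whose count is not a number
        continue
    if current_word == word:
        current_count += count
    else:
        if current_word:
            # The word changed: emit the finished total
            print('%s\t%s' % (current_word, current_count))
        current_count = count
        current_word = word

# Emit the total for the last word (skipped when there was no input)
if current_word:
    print('%s\t%s' % (current_word, current_count))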
B. Run the scripts
a. Enter the virtual environment
workon mywork
b. Pipe the data into the mapper
cat test.txt | python mapper.py
c. Pipe the data through the mapper, then sort
cat test.txt | python mapper.py | sort
d. Pipe the data through the mapper, sort, then count word frequencies
cat test.txt | python mapper.py | sort | python reducer.py
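To see what the pipeline in step d computes without touching the shell, the three stages can be imitated inside one Python process. This is a sketch, not the scripts above: the reducer here uses itertools.groupby in place of the hand-written running counter, Python's sorted stands in for the shell's sort, and the function names are chosen for illustration.

```python
from itertools import groupby

# Contents of test.txt from step A
SAMPLE = """\
hadoop test
this is a test file
hello word count
hadoop streaming
spark spark sql
"""


def mapper(lines):
    """Yield one 'word<TAB>1' record per word, like mapper.py."""
    for line in lines:
        for word in line.split():
            yield "%s\t%s" % (word, 1)


def reducer(sorted_records):
    """Sum the counts of consecutive records that share the same word."""
    pairs = (record.split("\t", 1) for record in sorted_records)
    for word, group in groupby(pairs, key=lambda pair: pair[0]):
        yield "%s\t%d" % (word, sum(int(count) for _, count in group))


def word_count(text):
    """In-process stand-in for: cat | mapper | sort | reducer."""
    return list(reducer(sorted(mapper(text.splitlines()))))


if __name__ == "__main__":
    print("\n".join(word_count(SAMPLE)))
```

For the sample data this reports hadoop, spark, and test with a count of 2 and every other word with a count of 1. The sort step matters in both versions: the reducer only merges records that are adjacent, so unsorted input would produce several partial counts for the same word.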