Setting Up the Crab Recommender System for Python 2.7

i_with_u

License: CC 4.0 BY-SA

Category: Python. Tags: python, 64-bit, 32-bit libraries, exe

Article link: https://blog.youkuaiyun.com/i_with_u/article/details/45460661

Notice:

You are welcome to repost this article, but please credit the author and the source.

Author: 周秀泽

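I have recently been working on a paper about recommender systems and needed a platform to validate the algorithms I am studying. Crab looked promising, so I decided to set up an environment for it.

First, a brief introduction to Crab (http://muricoca.github.io/crab/). Crab is an open-source recommender engine written in Python. It provides several common recommendation algorithms, such as collaborative filtering (CF) and Slope One, and ships with a few sample datasets, which is very convenient. Setting it up on 32-bit Windows is not difficult, but on 64-bit Windows the matching libraries are hard to find and the documentation is thin, so installing and configuring the environment takes more effort. Following the instructions at http://muricoca.github.io/crab/install.html, I tried for several days and finally got it working!

Below I walk through the whole process and list the problems and pitfalls I ran into, in the hope that it helps others who are learning about recommender systems.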

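1.Machine configuration

My machine runs 64-bit Windows with Python 2.7.5.

For each library I give two links. The first is an exe link that fits my machine (and probably works with every Python 2.7 release). The second is the official site, where you can download builds for other Python versions and operating systems. The setup process is much the same for other Python versions.

Before you start, download the build of each library that matches your operating system and your Python version!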
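Before downloading anything, it helps to confirm which interpreter you actually have. The sketch below uses only the standard library; the function name `interpreter_info` is my own, chosen for illustration. Note that binary installers generally need to match the bitness of the Python interpreter itself, which can differ from the bitness of Windows.

```python
import platform
import struct
import sys


def interpreter_info():
    """Return (version string, pointer size in bits) for the running Python."""
    version = "%d.%d.%d" % tuple(sys.version_info[:3])
    bits = struct.calcsize("P") * 8  # size of a C pointer: 32 or 64
    return version, bits


version, bits = interpreter_info()
print("Python %s, %d-bit build on %s" % (version, bits, platform.system()))
```

If this reports a 32-bit build on a 64-bit Windows machine, the 32-bit library installers are probably the ones to pick.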

2.Installing the libraries

Follow the steps below in order, although the order may not actually affect the final result.

1).Setuptools

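ez_setup.py download for 64-bit systems: http://download.youkuaiyun.com/detail/i_with_u/8657343

Official download: https://pypi.python.org/pypi/setuptools

Once Setuptools is installed, you can use the easy_install command to install the remaining libraries and to upgrade them to their latest versions. On 32-bit systems you can download the setuptools installer directly, but on 64-bit systems you must install it with ez_setup.py.

How: after downloading ez_setup.py, run the following in cmd:

python ez_setup.py

This installs setuptools automatically. At the time of writing there is no standalone exe installer.

After installation, you can install any library automatically with:

easy_install PackageName

(Note: PackageName is the name of the library you want to install.) In the rest of this article I use the exe installers for illustration. Other package formats are installed from the command line. For example, unpack a .tar.gz package and run in cmd:

python setup.py install

A whl package can be installed with:

pip install PackageName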
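To keep the three package formats straight, here is a small illustrative helper that maps a downloaded file name to the matching install method described above. The function name `install_hint` and the sample file names are hypothetical; the helper only returns a hint string and does not run anything.

```python
def install_hint(filename):
    """Suggest how to install a downloaded package file (illustrative only)."""
    name = filename.lower()
    if name.endswith(".exe"):
        return "run the installer: " + filename
    if name.endswith(".whl"):
        return "pip install " + filename
    if name.endswith(".tar.gz"):
        return "unpack it, then run: python setup.py install"
    # No local file of a known type: fall back to installing by name.
    return "easy_install PackageName"


# Hypothetical file names, used only to exercise each branch.
for downloaded in ["numpy-installer.exe", "nose.tar.gz", "example.whl"]:
    print("%s -> %s" % (downloaded, install_hint(downloaded)))
```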

2).Numpy

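NumPy is a scientific computing library for Python. It provides matrix operations, so you can work with matrices much as you would in MATLAB, which is very convenient.

exe download for 64-bit systems: http://download.youkuaiyun.com/detail/i_with_u/8657363

Official download: https://pypi.python.org/pypi/numpy

Click Next until the installation finishes. Then type the following in the Python shell:

import numpy

and press Enter. If no red error text appears, the installation succeeded.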
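Beyond a bare import, a few matrix operations make a slightly stronger smoke test and show the MATLAB-like style mentioned above. This is a minimal sketch that assumes NumPy imported successfully; the matrix values are arbitrary.

```python
import numpy as np

# A 2x2 matrix and a vector, written much as one would in MATLAB.
A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
b = np.array([5.0, 6.0])

product = A.dot(b)          # matrix-vector product
transposed = A.T            # transpose
x = np.linalg.solve(A, b)   # solve the linear system A x = b

print(product)
print(transposed)
print(x)
```

If `A.dot(x)` reproduces `b`, the linear algebra routines are working as well as the import.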

3).Scipy

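exe download for 64-bit systems: http://download.youkuaiyun.com/detail/i_with_u/8657395

Official download: https://pypi.python.org/pypi/scipy

SciPy builds on NumPy and adds many routines commonly used in mathematics, science, and engineering, such as linear algebra, numerical solution of ordinary differential equations, signal processing, image processing, and sparse matrices.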

4).Matplotlib

Matplotlib is the best-known plotting library for Python. It offers a full set of command-style APIs similar to MATLAB's and is well suited to interactive plotting. It can also be embedded in GUI applications as a plotting widget.

Matplotlib depends on other libraries, so install those first.

(i)Dateutil

exe download for 64-bit systems: http://download.youkuaiyun.com/detail/i_with_u/8657421

Official download: https://pypi.python.org/pypi/python-dateutil

(ii)Pyparsing

exe download: http://download.youkuaiyun.com/detail/i_with_u/8657427

Official download: https://pypi.python.org/pypi/pyparsing

(iii)Matplotlib

exe download: http://download.youkuaiyun.com/detail/i_with_u/8657439

Official download: https://pypi.python.org/pypi/matplotlib

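5).Check

Before going further, test whether the libraries above were installed successfully, so that the later steps rest on a solid base.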

import numpy as np
import matplotlib.pyplot as plt

X = np.arange(-5.0, 5.0, 0.1)
Y = np.arange(-5.0, 5.0, 0.1)

x, y = np.meshgrid(X, Y)
f = 17 * x ** 2 - 16 * np.abs(x) * y + 17 * y ** 2 - 225

fig = plt.figure()
# Draw only the f = 0 level set, which traces the heart-shaped curve.
cs = plt.contour(x, y, f, [0], colors='r')
plt.show()

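#(Note: this code is adapted from KingsLanding's blog: http://www.cnblogs.com/zhuyp1015/archive/2012/07/17/2596495.html)

If a red heart-shaped curve appears, the libraries above are installed correctly. Otherwise, reinstall the affected library according to the error message.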
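If the plot fails and it is not obvious which library is at fault, a quick import sweep can narrow it down. The sketch below uses only the standard library; `find_missing` is a name I made up for illustration, and the list holds the import names of the libraries installed so far (python-dateutil is imported as `dateutil`).

```python
import importlib


def find_missing(module_names):
    """Return the names in module_names that cannot be imported."""
    missing = []
    for name in module_names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


# Import names of the libraries installed in the steps above.
required = ["numpy", "scipy", "dateutil", "pyparsing", "matplotlib"]
missing = find_missing(required)
if missing:
    print("Still missing: %s" % ", ".join(missing))
else:
    print("All libraries import cleanly.")
```

This only checks that each module can be imported. It does not check that the installed versions work with each other.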

6).Scikits.learn

(i)Installation

exe download for 64-bit systems: http://download.youkuaiyun.com/detail/i_with_u/8657489

Official download: https://github.com/scikit-learn/scikit-learn

(ii)Check

Type:

import sklearn.svm

If no red error appears, the installation succeeded.

7).Nose

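(i)Installation

Package download: http://download.youkuaiyun.com/detail/i_with_u/8657545

Official download: https://pypi.python.org/pypi/nose

Unpack the archive, change to that folder in cmd, and run:

python setup.py install

to install the library.

Alternatively, skip the download and install it directly with:

easy_install nose

(ii)Check

import nose
result = nose.run()
print result

If it returns True or False, the installation is complete. From now on you can use nose for unit testing in your Python development.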
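As a minimal sketch of what a nose test looks like: nose collects functions whose names start with `test`, so plain functions with `assert` statements are enough. The helper `mean_rating` and the file name `test_ratings.py` are hypothetical, made up for this example.

```python
def mean_rating(ratings):
    """Average of a user's ratings; 0.0 for a user with no ratings."""
    if not ratings:
        return 0.0
    return sum(ratings.values()) / float(len(ratings))


def test_mean_rating():
    ratings = {'Snakes on a Plane': 4.5, 'Superman Returns': 4.0}
    assert abs(mean_rating(ratings) - 4.25) < 1e-9


def test_mean_rating_empty():
    # A user with no ratings should not cause a division by zero.
    assert mean_rating({}) == 0.0
```

Saved as, say, test_ratings.py, these tests should be picked up by running the nosetests command in the same folder.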

8).Crab

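This is the last step, and the most important one!

(i)Installation

Package download: http://download.youkuaiyun.com/detail/i_with_u/8657577

Official download: https://github.com/muricoca/crab

Unpack the archive, change to that folder in cmd, and run:

python setup.py install

to install the library. Then enter the command:

easy_install -U crab

to upgrade the package to the latest version.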

(ii)Check 1

Check: enter the following statements in Python

from scikits.crab import datasets
movies = datasets.load_sample_movies()
songs = datasets.load_sample_songs()

If there are no errors, the installation is basically complete!

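However, entering:

from scikits.crab.recommenders.knn import UserBasedRecommender

fails with the error:

ImportError: No module named learn.base

This problem stumped me for a long time. I even began to suspect that the library versions installed earlier were wrong, so I uninstalled everything and started over. The result was the same.

Later it occurred to me that scikits.learn.base is simply part of the Scikits.learn package installed earlier. I looked in the install directory and found that Scikits.learn installs itself under the name sklearn, not scikits as I had assumed! The scikits directory was created by Crab, not by Scikits.learn!

So I opened the base.py under scikits.crab mentioned in the error and replaced "from scikits.learn.base import BaseEstimator" with "from sklearn.base import BaseEstimator". Running it again produced a different error, which showed that the change had taken effect.

Unfortunately, the new error was:

No Attribute named _set_params

For this one I found a fix posted by someone on GitHub: open the file named in the error, scikits\crab\recommenders\knn\classes.py, and on lines 138 and 600 replace "self._set_params(**params)" with "self.set_params(**params)".

Run it again:

from scikits.crab.recommenders.knn import UserBasedRecommender

No more errors!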
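If you would rather not edit the files by hand, the two textual fixes described above can be scripted. This is only a sketch under stated assumptions: `patch_source` and `patch_file` are names I made up, you must supply the paths to the installed Crab files yourself, and the script replaces every occurrence of the two strings rather than only specific lines. It keeps a .bak copy of any file it changes.

```python
# (old text, new text) pairs taken from the two fixes described above.
REPLACEMENTS = [
    ("from scikits.learn.base import BaseEstimator",
     "from sklearn.base import BaseEstimator"),
    ("self._set_params(**params)",
     "self.set_params(**params)"),
]


def patch_source(text):
    """Apply both textual fixes to a source string and return the result."""
    for old, new in REPLACEMENTS:
        text = text.replace(old, new)
    return text


def patch_file(path):
    """Rewrite one file in place, keeping a .bak copy. Returns True if changed."""
    with open(path) as handle:
        original = handle.read()
    patched = patch_source(original)
    if patched == original:
        return False
    with open(path + ".bak", "w") as handle:
        handle.write(original)
    with open(path, "w") as handle:
        handle.write(patched)
    return True


# Demonstration on an in-memory string; no files are touched here.
sample = ("from scikits.learn.base import BaseEstimator\n"
          "self._set_params(**params)\n")
print(patch_source(sample))
```

Calling `patch_file` on the installed base.py and classes.py would then apply the changes. Running it twice is harmless, because the second pass finds nothing left to replace.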

(iii)Check 2

#!/usr/bin/env python
#coding=utf-8

def base_demo():
    # Base data: the bundled sample movie dataset
    from scikits.crab import datasets
    movies = datasets.load_sample_movies()
    #print movies.data
    #print movies.user_ids
    #print movies.item_ids

    # Build the model
    from scikits.crab.models import MatrixPreferenceDataModel
    model = MatrixPreferenceDataModel(movies.data)

    # Build the similarity, using pearson_correlation as the metric
    from scikits.crab.metrics import pearson_correlation
    from scikits.crab.similarities import UserSimilarity
    similarity = UserSimilarity(model, pearson_correlation)

    # User-based recommendation
    from scikits.crab.recommenders.knn import UserBasedRecommender
    recommender = UserBasedRecommender(model, similarity, with_preference=True)
    print recommender.recommend(5) # Print the result: recommend items for user 5 (Toby)

    # Item-based recommendation (same base data, viewed from the item side).
    # Note: this reuses the user similarity built above; itembase_demo() below
    # builds an ItemSimilarity instead.
    from scikits.crab.recommenders.knn import ItemBasedRecommender
    recommender = ItemBasedRecommender(model, similarity, with_preference=True)
    print recommender.recommend(5) # Print the result: recommend items for user 5 (Toby)

def itembase_demo():
    from scikits.crab.models.classes import MatrixPreferenceDataModel
    from scikits.crab.recommenders.knn.classes import ItemBasedRecommender
    from scikits.crab.similarities.basic_similarities import ItemSimilarity
    from scikits.crab.recommenders.knn.item_strategies import ItemsNeighborhoodStrategy
    from scikits.crab.metrics.pairwise import euclidean_distances
    # Ratings keyed by user name, then by movie title
    movies = {
            'Marcel Caraciolo':
                {'Lady in the Water': 2.5, 'Snakes on a Plane': 3.5, 'Just My Luck': 3.0, 'Superman Returns': 3.5, 'You, Me and Dupree': 2.5, 'The Night Listener': 3.0},
            'Paola Pow':
                {'Lady in the Water': 3.0, 'Snakes on a Plane': 3.5, 'Just My Luck': 1.5, 'Superman Returns': 5.0, 'The Night Listener': 3.0, 'You, Me and Dupree': 3.5},
            'Leopoldo Pires':
                {'Lady in the Water': 2.5, 'Snakes on a Plane': 3.0, 'Superman Returns': 3.5, 'The Night Listener': 4.0},
            'Lorena Abreu':
                {'Snakes on a Plane': 3.5, 'Just My Luck': 3.0, 'The Night Listener': 4.5, 'Superman Returns': 4.0, 'You, Me and Dupree': 2.5},
            'Steve Gates':
                {'Lady in the Water': 3.0, 'Snakes on a Plane': 4.0, 'Just My Luck': 2.0, 'Superman Returns': 3.0, 'The Night Listener': 3.0, 'You, Me and Dupree': 2.0},
            'Sheldom':
                {'Lady in the Water': 3.0, 'Snakes on a Plane': 4.0, 'The Night Listener': 3.0, 'Superman Returns': 5.0, 'You, Me and Dupree': 3.5},
            'Penny Frewman':
                {'Snakes on a Plane': 4.5, 'You, Me and Dupree': 1.0, 'Superman Returns': 4.0},
            'Maria Gabriela': {}
            }
    model = MatrixPreferenceDataModel(movies)
    items_strategy = ItemsNeighborhoodStrategy()
    similarity = ItemSimilarity(model, euclidean_distances)
    recsys = ItemBasedRecommender(model, similarity, items_strategy)

    print recsys.most_similar_items('Lady in the Water')
    # Return the recommendations for the given user.
    print recsys.recommend('Leopoldo Pires')
    # Return the 2 explanations for the given recommendation.
    print recsys.recommended_because('Leopoldo Pires', 'Just My Luck', 2)
    # Return the items most similar to the given item.
    print recsys.most_similar_items('Lady in the Water')
    # Estimate the preference (rating) of a user for an item.
    print recsys.estimate_preference('Leopoldo Pires','Lady in the Water')

base_demo()
itembase_demo()

#(Note: this code is from 深蓝苹果's blog: http://my.oschina.net/kakablue/blog/260749)

If this runs without errors, you are all done!
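To get a feel for what a similarity metric such as pearson_correlation measures, here is a pure-Python sketch of Pearson correlation between two users over the movies both have rated. It is not Crab's implementation, which works on arrays and may treat edge cases differently; the function name `pearson` and the choice to return 0.0 when the correlation is undefined are my own assumptions. The ratings are taken from the demo data above.

```python
from math import sqrt


def pearson(ratings_a, ratings_b):
    """Pearson correlation over the items both users have rated."""
    common = [item for item in ratings_a if item in ratings_b]
    n = len(common)
    if n < 2:
        return 0.0  # not enough overlap to measure correlation
    xs = [ratings_a[item] for item in common]
    ys = [ratings_b[item] for item in common]
    mean_x = sum(xs) / float(n)
    mean_y = sum(ys) / float(n)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant rater has no defined correlation
    return cov / sqrt(var_x * var_y)


leopoldo = {'Lady in the Water': 2.5, 'Snakes on a Plane': 3.0,
            'Superman Returns': 3.5, 'The Night Listener': 4.0}
penny = {'Snakes on a Plane': 4.5, 'You, Me and Dupree': 1.0,
         'Superman Returns': 4.0}
print(pearson(leopoldo, penny))
```

Values near 1 mean two users rate their shared movies in the same direction, and values near -1 mean they disagree. A user-based recommender gives more weight to neighbours with high similarity.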

Now go and dig into recommender systems. There is a lot of promise in this field!