instance: $(u, i, r_{ui}, t_{ui})$
describes user $u$ acting on item $i$ at time $t_{ui}$ with score $r_{ui}$.
formula:
$$
S_{ij} = \frac{1}{k_j^\lambda} \sum_u \frac{r_{ui} r_{uj}}{k_u^\rho} \, g(t_{ui}, t_{uj}) \\
g(t_1, t_2) = \exp\left[ - \frac{(t_1 - t_2)^2}{2\tau^2} \right]
$$
parameters: $\lambda, \rho, \tau$
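To make the formula concrete, here is a minimal in-memory sketch in Python. It is an illustration, not the original implementation. The function names `g` and `item_similarity` are hypothetical. It assumes that $k_j$ is the sum of scores on item $j$ and $k_u$ the sum of scores given by user $u$ (as the map/reduce steps in this post suggest), that each user scores an item at most once, and that self-pairs $i = j$ are skipped.

```python
import math
from collections import defaultdict


def g(t1, t2, tau):
    """Gaussian time-decay kernel: exp(-(t1 - t2)^2 / (2 tau^2))."""
    return math.exp(-((t1 - t2) ** 2) / (2 * tau ** 2))


def item_similarity(records, lam, rho, tau):
    """Compute S[i][j] directly from (u, i, r, t) tuples.

    Assumption: k_j = sum of scores on item j, k_u = sum of scores by user u.
    """
    k_item = defaultdict(float)
    k_user = defaultdict(float)
    by_user = defaultdict(list)
    for u, i, r, t in records:
        k_item[i] += r
        k_user[u] += r
        by_user[u].append((i, r, t))

    S = defaultdict(dict)
    for u, rated in by_user.items():
        for i, r_i, t_i in rated:
            for j, r_j, t_j in rated:
                if i == j:  # assumption: self-similarity is not needed
                    continue
                # One user's contribution to the sum over u.
                term = r_i * r_j / k_user[u] ** rho * g(t_i, t_j, tau)
                S[i][j] = S[i].get(j, 0.0) + term / k_item[j] ** lam
    return S
```

Two scores given at the same time get weight $g = 1$. The weight falls off as the time gap grows relative to $\tau$.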
source data:
$(u, p, r, t)$, where $p$ is the item (written $i$ in the formula above)
1. Job 1
   - MAP: $p: (u, r, t)$
   - REDUCE: $p: [(u, r, t), \ldots]$
     calc: $k_p = (\sum r)^\lambda$, emit $u: (p, r, t, k_p)$
2. Job 2
   - REDUCE: $u: [(p, r, t, k_p), \ldots]$
3. Job 3
   - MAP:
     calc: $k_u = (\sum r)^\rho$, emit $p_0: \{(p_i, s_i), \cdots\}$
     with: $s_i = \frac{r_0 r_i}{k_u k_{p_i}} \, g(t_0, t_i)$
   - REDUCE: $p_0: [(p_i, s_i), \ldots]$
     with: $s_i \to \sum s_i$, summed over all entries that share the same $p_i$

Here $k_p$ and $k_u$ already include the exponents, so they play the role of $k_j^\lambda$ and $k_u^\rho$ in the formula.
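The three jobs can be simulated in memory as a rough sketch. This is not Hadoop code and not the original implementation. `run_job`, `similarity_pipeline`, and the `mapN`/`reduceN` functions are hypothetical names introduced for illustration. Job 2 is assumed to use an identity map, and self-pairs are assumed to be skipped.

```python
import math
from collections import defaultdict


def run_job(records, mapper, reducer):
    """Tiny in-memory stand-in for one MapReduce job (hypothetical helper)."""
    groups = defaultdict(list)
    for rec in records:
        for key, value in mapper(rec):
            groups[key].append(value)
    out = []
    for key, values in groups.items():
        out.extend(reducer(key, values))
    return out


def similarity_pipeline(source, lam, rho, tau):
    def g(t1, t2):
        return math.exp(-((t1 - t2) ** 2) / (2 * tau ** 2))

    # Job 1: group by item p, compute k_p, re-key by user u.
    def map1(rec):
        u, p, r, t = rec
        yield p, (u, r, t)

    def reduce1(p, values):
        k_p = sum(r for _, r, _ in values) ** lam
        for u, r, t in values:
            yield u, (p, r, t, k_p)

    # Job 2: identity map (assumption), group by user u.
    def map2(rec):
        yield rec  # rec is already (u, (p, r, t, k_p))

    def reduce2(u, values):
        yield u, values

    # Job 3: per user, emit a partial score s_i for every ordered item pair.
    def map3(rec):
        u, items = rec
        k_u = sum(r for _, r, _, _ in items) ** rho
        for p0, r0, t0, _ in items:
            for pi, ri, ti, k_pi in items:
                if pi != p0:  # assumption: skip self-pairs
                    yield p0, (pi, r0 * ri / (k_u * k_pi) * g(t0, ti))

    def reduce3(p0, values):
        totals = defaultdict(float)
        for pi, s in values:
            totals[pi] += s  # sum partial scores that share the same p_i
        yield p0, dict(totals)

    out1 = run_job(source, map1, reduce1)
    out2 = run_job(out1, map2, reduce2)
    return dict(run_job(out2, map3, reduce3))
```

The result maps each item $p_0$ to a dictionary of similarities `{p_i: S}`. The job 3 map emits one value per ordered item pair per user, so its output grows quadratically with the number of items a user has scored.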
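Note
Map (3) can be merged into Reduce (2), but doing so greatly increases the size of job 2's output file and slightly increases the total running time.
Time comparison:
# before merging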
job1 time: 115s 1.9m
job2 time: 110s 1.83m
job3 time: 4551s 75.8m
total time: 4776s 79.6m
# after merging
job1 time: 110s 1.83m
job2 time: 1217s 20.3m
job3 time: 3656s 60.9m
total time: 4983s 83m
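As a quick check on the figures above, the per-job differences (after merging minus before merging) work out as follows. The small change in job 1 is presumably run-to-run variation, since merging does not touch that job.

$$
\Delta_{\text{job1}} = 110 - 115 = -5\,\text{s}, \quad
\Delta_{\text{job2}} = 1217 - 110 = +1107\,\text{s}, \quad
\Delta_{\text{job3}} = 3656 - 4551 = -895\,\text{s} \\
\Delta_{\text{total}} = 4983 - 4776 = +207\,\text{s} \approx +4.3\%
$$

Merging saves 895 s in job 3 but adds 1107 s to job 2, so the total comes out slightly higher.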
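This post describes an item-based collaborative filtering recommendation algorithm that combines user behavior scores with a time factor. It computes item-to-item similarity and uses a time-decay function to weight each pair of scores by how close together in time they were given, with the aim of improving the accuracy and timeliness of recommendations.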