Tags: style blog http ext color com
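w(ua, ci) is the relevance of user a to cluster Ci; the second factor, I, is the total number of clicks on story Sk made by users in cluster Ci.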
Fetch the clusters that this user belongs to, for each cluster lookup how many times (discounted by age) did members of this cluster click on the story s (normalized by the total number of clicks made by members of this cluster), finally add these numbers to compute the recommendation score. The recommendation scores thus obtained are normalized (by simple scaling) so that they all lie between 0 and 1.
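The quoted procedure can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `cluster_scores` and the two input dictionaries are hypothetical, the click counts are assumed to be age-discounted already, and "simple scaling" is taken to mean dividing by the largest score.

```python
from collections import defaultdict


def cluster_scores(user_clusters, cluster_clicks, candidates):
    """Cluster-based recommendation scores for one user.

    user_clusters:  {cluster_id: weight w(ua, ci)} for this user (assumed input).
    cluster_clicks: {cluster_id: {story_id: age-discounted click count}}.
    candidates:     story ids to score.
    Returns {story_id: score scaled into [0, 1]}.
    """
    raw = defaultdict(float)
    for cluster_id, weight in user_clusters.items():
        clicks = cluster_clicks.get(cluster_id, {})
        total = sum(clicks.values())  # all clicks made by members of this cluster
        if total == 0:
            continue
        for story in candidates:
            # Fraction of the cluster's clicks that went to this story,
            # weighted by how strongly the user belongs to the cluster.
            raw[story] += weight * clicks.get(story, 0.0) / total

    top = max(raw.values(), default=0.0)
    if top == 0:
        return {story: 0.0 for story in candidates}
    # Assumption: "simple scaling" = divide by the maximum score.
    return {story: raw[story] / top for story in candidates}
```

A user who belongs fully to cluster c1 and half to c2 would be scored with `cluster_scores({"c1": 1.0, "c2": 0.5}, cluster_clicks, candidates)`.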
Users are clustered with MinHash and PLSA (PLSI).
minHash: http://my.oschina.net/pathenon/blog/65210
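As a rough illustration of the MinHash idea, the sketch below treats a user as the set of stories they clicked and uses a seeded hash in place of a random permutation of the stories. Two users get the same MinHash value with probability equal to the Jaccard similarity of their click sets. The function names, the use of SHA-1, and the parameters `p` (hash values concatenated per cluster id) and `q` (number of cluster ids per user) are assumptions made for this example, not details taken from the notes above.

```python
import hashlib


def _rank(seed, story_id):
    """Pseudo-random rank of a story under the permutation named by `seed`."""
    digest = hashlib.sha1(f"{seed}:{story_id}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def minhash(clicked, seed):
    """The story in `clicked` (non-empty set) that ranks first under `seed`."""
    return min(clicked, key=lambda story: _rank(seed, story))


def cluster_ids(clicked, p=2, q=3):
    """Return q cluster ids, each a tuple of p MinHash values.

    Concatenating p values makes clusters tighter (higher precision);
    repeating q times gives each user several chances to share a cluster.
    """
    ids = []
    for group in range(q):
        keys = tuple(minhash(clicked, group * p + j) for j in range(p))
        ids.append(keys)
    return ids
```

Users with identical click sets always land in the same clusters, and users with largely overlapping sets are likely to share at least one of the q cluster ids.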
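Treat the stories as the nodes of an undirected graph whose edge weights are the number of times two stories were visited together. When a user visits story Sk, iterate over that user's click history and increment the co-occurrence count between Sk and each story in the history. The detailed recommendation procedure: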
We fetch the user’s recent click history C_ui, limited to past few hours or days. For every item si in the user’s click history, we lookup the entry for the pair (si, s) in the adjacency list for si stored in the Bigtable. To the recommendation score we add the value stored in this entry normalized by the sum of all entries for si. Finally, all the covisitation scores are normalized to a value between 0 and 1 by linear scaling.
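The covisitation update and the scoring step can be sketched together as follows. This is a minimal in-memory illustration, not the Bigtable-backed system: the class `Covisitation` and its method names are hypothetical, age discounting of the counts is left out, and "linear scaling" is assumed to mean dividing by the largest score.

```python
from collections import defaultdict


class Covisitation:
    def __init__(self):
        # adjacency[si][sj] = how often si and sj were clicked by the same user
        self.adjacency = defaultdict(lambda: defaultdict(float))

    def record_click(self, history, story):
        """Update counts when a user with click `history` clicks `story`."""
        for prev in history:
            if prev == story:
                continue
            # Undirected graph: update both adjacency lists.
            self.adjacency[prev][story] += 1.0
            self.adjacency[story][prev] += 1.0

    def scores(self, history, candidates):
        """Covisitation score of each candidate, scaled into [0, 1]."""
        raw = {s: 0.0 for s in candidates}
        for si in history:
            neighbours = self.adjacency.get(si)
            if not neighbours:
                continue
            total = sum(neighbours.values())  # sum of all entries for si
            for s in candidates:
                raw[s] += neighbours.get(s, 0.0) / total
        top = max(raw.values(), default=0.0)
        if top == 0:
            return raw
        # Assumption: linear scaling = divide by the maximum score.
        return {s: v / top for s, v in raw.items()}
```

In use, `record_click` is called once per click with the user's earlier clicks as `history`, and `scores` is called at recommendation time with the recent click history and the candidate stories.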
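Two tables are used: UT stores each user's click counts and the user-to-cluster relations; ST stores the click counts between stories and the story-to-cluster relations.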