Computing Similarity with the Tanimoto Coefficient

Date: 2015-05-27 21:10:24

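// This program computes user similarity with the Tanimoto coefficient, ignoring preference values.
// The Tanimoto coefficient is also known as the Jaccard coefficient. It is the number of items
// for which both users have expressed a preference, divided by the number of items for which at
// least one user has expressed a preference (the size of the intersection divided by the size of the union).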
package byuser;

import java.io.File;
import java.io.IOException;

import org.apache.mahout.cf.taste.common.TasteException;
import org.apache.mahout.cf.taste.eval.RecommenderBuilder;
import org.apache.mahout.cf.taste.eval.RecommenderEvaluator;
import org.apache.mahout.cf.taste.impl.eval.AverageAbsoluteDifferenceRecommenderEvaluator;
import org.apache.mahout.cf.taste.impl.neighborhood.NearestNUserNeighborhood;
import org.apache.mahout.cf.taste.impl.recommender.GenericUserBasedRecommender;
import org.apache.mahout.cf.taste.model.DataModel;
import org.apache.mahout.cf.taste.neighborhood.UserNeighborhood;
import org.apache.mahout.cf.taste.recommender.Recommender;
import org.apache.mahout.cf.taste.similarity.UserSimilarity;
import org.apache.mahout.cf.taste.similarity.precompute.example.GroupLensDataModel;
import org.apache.mahout.cf.taste.impl.similarity.TanimotoCoefficientSimilarity;

public class TanimotoCoefficientSimilarityTest {

	public TanimotoCoefficientSimilarityTest() throws IOException, TasteException {
		// Load the GroupLens ratings file.
		DataModel model = new GroupLensDataModel(new File("E:\\mahout项目\\examples\\ratings.dat"));
		// Scores a recommender by the average absolute difference between estimated and actual preferences.
		RecommenderEvaluator evaluator = new AverageAbsoluteDifferenceRecommenderEvaluator();
		RecommenderBuilder recommenderBuilder = new RecommenderBuilder() {
			@Override
			public Recommender buildRecommender(DataModel model) throws TasteException {
				// Tanimoto similarity only looks at which items were rated, not at the rating values.
				UserSimilarity similarity = new TanimotoCoefficientSimilarity(model);
				UserNeighborhood neighborhood = new NearestNUserNeighborhood(100, similarity, model);
				return new GenericUserBasedRecommender(model, neighborhood, similarity);
			}
		};
		// Train on 95% of each user's preferences; evaluate on 5% of the users.
		double score = evaluator.evaluate(recommenderBuilder, null, model, 0.95, 0.05);
		System.out.println("Evaluation score of the recommender using Tanimoto coefficient similarity (preference values ignored): " + score);
	}

	public static void main(String[] args) throws TasteException, IOException {
		new TanimotoCoefficientSimilarityTest();
	}

}

Result:

[Screenshot of the console output]
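
To make the intersection-over-union definition from the comments at the top of the listing concrete, here is a minimal standalone sketch that computes the coefficient for two users directly from the sets of item IDs they have rated. It does not use Mahout and is not how Mahout implements it internally; the class name `TanimotoSketch`, the method `tanimoto`, and the sample item IDs are all made up for illustration, and returning NaN for two empty sets is a choice made for this sketch.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class TanimotoSketch {

	// Tanimoto (Jaccard) coefficient: |A ∩ B| / |A ∪ B|.
	// Only the presence of an item matters; rating values are ignored.
	public static double tanimoto(Set<Long> itemsA, Set<Long> itemsB) {
		if (itemsA.isEmpty() && itemsB.isEmpty()) {
			// Undefined when neither user has expressed any preference (assumption of this sketch).
			return Double.NaN;
		}
		Set<Long> intersection = new HashSet<>(itemsA);
		intersection.retainAll(itemsB);
		// |A ∪ B| = |A| + |B| - |A ∩ B|
		int unionSize = itemsA.size() + itemsB.size() - intersection.size();
		return (double) intersection.size() / unionSize;
	}

	public static void main(String[] args) {
		// Hypothetical item IDs rated by two users.
		Set<Long> userA = new HashSet<>(Arrays.asList(101L, 102L, 103L));
		Set<Long> userB = new HashSet<>(Arrays.asList(102L, 103L, 104L, 105L));
		// Shared items: 102 and 103 (2); items rated by at least one user: 5.
		System.out.println(tanimoto(userA, userB)); // prints 0.4
	}
}
```

The value ranges from 0 (no items in common) to 1 (both users rated exactly the same items).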
Original article: http://blog.csdn.net/u012965373/article/details/46051321