Computing the Shannon entropy of a data set
from math import log

def calcShannonEnt(dataSet):
    # Total number of samples in the data set
    numEntries = len(dataSet)
    # Count how often each class label (the last column) occurs
    labelCounts = {}
    for featVec in dataSet:
        currentLabel = featVec[-1]
        if currentLabel not in labelCounts:
            labelCounts[currentLabel] = 0
        labelCounts[currentLabel] += 1
    # Accumulate -p * log2(p) over all class labels
    shannonEnt = 0.0
    for key in labelCounts:
        prob = float(labelCounts[key]) / numEntries
        shannonEnt -= prob * log(prob, 2)
    return shannonEnt
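The entropy is $H = -\sum_{i=1}^{n} p(x_i) \log_2 p(x_i)$, where $p(x_i)$ is the proportion of samples whose class label is $x_i$ and $n$ is the number of distinct labels.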
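As a quick check of the formula, here is a minimal sketch that evaluates it on a tiny set of labels. The function name `shannon_entropy` and the sample labels are hypothetical, chosen only for illustration; the function takes a flat list of class labels rather than full feature vectors, and uses `collections.Counter` and `math.log2` from the standard library.

```python
from collections import Counter
from math import log2


def shannon_entropy(labels):
    """Return the Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)  # label -> number of occurrences
    entropy = 0.0
    for count in counts.values():
        prob = count / total  # p(x_i): share of samples with this label
        entropy -= prob * log2(prob)
    return entropy


# Hypothetical sample: two 'yes' labels and three 'no' labels.
sample_labels = ['yes', 'yes', 'no', 'no', 'no']
print(shannon_entropy(sample_labels))  # roughly 0.971 bits

# Four equally likely labels need 2 bits; a single label carries no uncertainty.
print(shannon_entropy(['a', 'b', 'c', 'd']))
print(shannon_entropy(['a', 'a', 'a']))
```

For the sample, the two label proportions are 2/5 and 3/5, so H = -(0.4 log2 0.4 + 0.6 log2 0.6), which is about 0.971. The more evenly the labels are mixed, the higher the entropy.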
Original article: http://www.cnblogs.com/battle-lee/p/4548768.html