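To compute the mean and variance in Python, you can write the code yourself or lean on numpy. But which one is actually faster?
I ran an experiment. First, generate 9 million samples: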
```python
nlist = range(0, 9000000)
# Scale each sample down so the values lie in [0, 9).
nlist = [float(i) / 1000000 for i in nlist]
N = len(nlist)
```
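The second line scales the samples down to keep the running sums small. Plain Python integers are arbitrary-precision and never overflow, but fixed-width integers do: the plain sum up to 9 million (about 4e13) already exceeds a 32-bit integer, and the sum of squares (about 2.4e20) exceeds a 64-bit one, which matters once the data sits in a numpy integer array.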
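As a quick sanity check on the overflow concern, here is a small sketch that compares the exact sums of the unscaled integers against the signed 64-bit limit. The helper names are my own, and the closed-form formulas stand in for actually looping over 9 million values.

```python
# Largest value a signed 64-bit integer can hold.
INT64_MAX = 2**63 - 1

def sum_below(n):
    """Exact value of 0 + 1 + ... + (n - 1)."""
    return (n - 1) * n // 2

def sum_of_squares_below(n):
    """Exact value of 0^2 + 1^2 + ... + (n - 1)^2."""
    return (n - 1) * n * (2 * n - 1) // 6

n = 9_000_000
plain_sum = sum_below(n)
square_sum = sum_of_squares_below(n)

print(plain_sum <= INT64_MAX)    # True: the plain sum still fits
print(square_sum > INT64_MAX)    # True: the sum of squares does not
```

Python itself handles both numbers fine because its integers grow as needed; the limit only bites for fixed-width types such as numpy's integer dtypes.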
The hand-written version walks the list to compute the mean and variance:
```python
sum1 = 0.0  # running sum of x
sum2 = 0.0  # running sum of x squared
for i in range(N):
    sum1 += nlist[i]
    sum2 += nlist[i] ** 2
mean = sum1 / N
# Population variance via E[x^2] - (E[x])^2.
var = sum2 / N - mean ** 2
```
Elapsed time: 5.3 s
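The post does not show how the time was measured, so the following is only a sketch of one way to do it, using `time.perf_counter` from the standard library. The function name `mean_var_loop` is made up for illustration, and the sample count is cut to 900,000 so the sketch runs quickly; raise it to 9000000 to mirror the experiment.

```python
import time

def mean_var_loop(values):
    """Return (mean, population variance) from one explicit pass."""
    total = 0.0
    total_sq = 0.0
    for x in values:
        total += x
        total_sq += x ** 2
    n = len(values)
    mean = total / n
    return mean, total_sq / n - mean ** 2

# Assumption: a smaller sample count than the post's 9 million.
samples = [float(i) / 1000000 for i in range(900000)]

start = time.perf_counter()
mean, var = mean_var_loop(samples)
elapsed = time.perf_counter() - start
print("mean=%.4f var=%.4f elapsed=%.2fs" % (mean, var, elapsed))
```

The absolute numbers will vary with the machine and Python version, so only the ratio between the two approaches is really comparable.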
The numpy version uses vectorized operations:
```python
import numpy

narray = numpy.array(nlist)   # copy the list into an ndarray
sum1 = narray.sum()
narray2 = narray * narray     # element-wise square
sum2 = narray2.sum()
mean = sum1 / N
var = sum2 / N - mean ** 2
```
Elapsed time: 1.0 s
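numpy also ships `mean()` and `var()` directly, so the manual sums are not strictly needed. The sketch below cross-checks the formula from the experiment against those built-ins; it assumes the population variance is wanted, which is what `var()` returns with its default `ddof=0`. The sample count is reduced here to keep it light.

```python
import numpy as np

# Assumption: 900,000 samples instead of the post's 9 million.
samples = np.arange(900000, dtype=np.float64) / 1000000

# Formula used in the experiment: E[x^2] - (E[x])^2.
mean_manual = samples.sum() / samples.size
var_manual = (samples * samples).sum() / samples.size - mean_manual ** 2

# Built-in equivalents; ddof=0 (the default) gives the population variance.
mean_builtin = samples.mean()
var_builtin = samples.var()

print(np.isclose(mean_manual, mean_builtin))
print(np.isclose(var_manual, var_builtin))
```

For the sample variance (dividing by N - 1), pass `ddof=1` to `var()` instead.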
Conclusion: just use numpy. Code that has been optimized for exactly this kind of work really does make a difference.
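One thing the single timing figure cannot show is how the numpy time splits between building the array with `numpy.array(nlist)` and the arithmetic itself. Assuming that split is of interest, a rough sketch for measuring the two stages separately could look like this; `time_numpy_stages` is a hypothetical helper, not something from the experiment.

```python
import time
import numpy as np

def time_numpy_stages(values):
    """Return (conversion_seconds, compute_seconds, mean, var)."""
    t0 = time.perf_counter()
    arr = np.array(values, dtype=np.float64)  # list -> ndarray copy
    t1 = time.perf_counter()
    mean = arr.sum() / arr.size
    var = (arr * arr).sum() / arr.size - mean ** 2
    t2 = time.perf_counter()
    return t1 - t0, t2 - t1, mean, var

# Assumption: a smaller sample count than the post's 9 million.
samples = [float(i) / 1000000 for i in range(900000)]
convert_s, compute_s, mean, var = time_numpy_stages(samples)
print("convert=%.3fs compute=%.3fs" % (convert_s, compute_s))
```

If the data can be created as an array from the start, the conversion stage disappears entirely.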
Original article: http://www.cnblogs.com/plwang1990/p/3774744.html