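Definition: The advantage is that even when the probability density over the sample space varies sharply, stratified sampling keeps the sampling probability accurate for the samples in each (probability-density) stratum.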
If population density varies greatly within a region, stratified sampling will ensure that estimates can be made with equal accuracy in different parts of the region, and that comparisons of sub-regions can be made with equal statistical power.
Randomized stratification can also be used to improve population representativeness in a study.
Stratified sampling is not useful when the population cannot be exhaustively partitioned into disjoint subgroups. It would be a misapplication of the technique to make subgroups' sample sizes proportional to the amount of data available from the subgroups, rather than scaling sample sizes to subgroup sizes (or to their variances, if known to vary significantly).
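To make the point about scaling sample sizes to subgroup sizes concrete, here is a minimal sketch of proportional allocation in plain Python (not Spark). The function name `stratified_sample`, the `key` callback, and the "urban"/"rural" population are all hypothetical, chosen only for illustration. Variance-based allocation is not shown.

```python
import random
from collections import defaultdict

def stratified_sample(records, key, total_n, seed=0):
    """Draw roughly total_n records, allocating to each stratum
    in proportion to that stratum's share of the population."""
    rng = random.Random(seed)

    # Partition the population into disjoint strata by key.
    strata = defaultdict(list)
    for rec in records:
        strata[key(rec)].append(rec)

    population_size = len(records)
    sample = {}
    for name, members in strata.items():
        # Proportional allocation: n_h = total_n * N_h / N,
        # rounded, with at least 1 and at most N_h draws.
        n_h = max(1, round(total_n * len(members) / population_size))
        n_h = min(n_h, len(members))
        # Simple random sample without replacement inside the stratum.
        sample[name] = rng.sample(members, n_h)
    return sample

# Hypothetical population: 900 "urban" records and 100 "rural" records.
population = [("urban", i) for i in range(900)] + [("rural", i) for i in range(100)]
picked = stratified_sample(population, key=lambda rec: rec[0], total_n=50)
sizes = {name: len(rows) for name, rows in picked.items()}
print(sizes)  # {'urban': 45, 'rural': 5}
```

Each stratum's sample size follows its size in the population (90% and 10% here), not how much data happened to be collected from it. In Spark itself, pair RDDs provide `sampleByKey`, which takes a per-key map of sampling fractions. The sketch above only mirrors the idea and is not that API.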
Spark MLlib Concepts 2: Stratified sampling
Original article: http://www.cnblogs.com/zwCHAN/p/4265743.html