Exponential family: 指数分布族

时间：2015-04-03 17:11:42 阅读：233 评论：0 收藏：0 [点我收藏+]

标签：

Exponential family(指数分布族)是一个经常出现的概念，但是对其定义并不是特别的清晰，今天好好看了看WIKI上的内容，有了一个大致的了解，先和大家分享下。本文基本是WIKI上部分内容的翻译。

1. 几个问题

什么是指数分布族？

既然是”族“，那么族内的共同特点是什么？

为何指数分布族被广泛应用？是指数分布族选择了我们，还是我们选择了指数分布族？(这个问题没有回答，需要结合具体实例分析)

2. 参考

Exponential family. (2015, February 26). In Wikipedia, The Free Encyclopedia. Retrieved 05:00, April 3, 2015, from http://en.wikipedia.org/w/index.php?title=Exponential_family&oldid=648989632

3. 指数分布族: 定义

指数分布族指概率分布满足以下形式的分布

其中（$\theta$,$x$也可以是标量）

Exponential family ，也称 Exponential Class，包括了很多常见的分布。譬如

normal, exponential, gamma, chi-squared, beta, Dirichlet, Bernoulli, categorical, Poisson, Wishart, Inverse Wishart.

分布函数中的T (x ) , η (θ ) 和 A(η )并不是任意定义的，每一部分都有其特殊的意义。

T (x )是分布的充分统计量（sufficient statistic ）

η 是自然参数。对于有限的函数而言， η 的集合被称为自然参数空间。

A(η )被称为对数配分函数（partition function ），实际上它是归一化因子的对数形式。它使得概率分布积分为1的条件得到满足。

上式可以看出，通过对 A(η )求导，容易得到充分统计量T (x )的均值，方差和其他性质。（怎么求？）

4. 指数分布族：性质

指数分布族具有很多性质，这些性质使得指数分布族在统计分析具有重要作用。并且在很多情况下，只有指数分布族具有那些性质。其中包括

不太懂，怕弄错，还是给原文好了。

Exponential families have sufficient statistics that can summarize arbitrary amounts of independent identically distributed data using a fixed number of values.

Exponential families have conjugate priors, an important property in Bayesian statistics.

The posterior predictive distribution of an exponential-family random variable with a conjugate prior can always be written in closed form (provided that the normalizing factor of the exponential-family distribution can itself be written in closed form). Note that these distributions are often not themselves exponential families. Common examples of non-exponential families arising from exponential ones are the Student‘s t-distribution, beta-binomial distribution and Dirichlet-multinomial distribution.

In the mean-field approximation in variational Bayes (used for approximating the posterior distribution in large Bayesian networks), the best approximating posterior distribution of an exponential-family node (a node is a random variable in the context of Bayesian networks) with a conjugate prior is in the same family as the node.

具体解释如下（没看到的就不解释了……）：

（1）指数函数的充分统计量的可以从大量的i.i.d.数据中归结为估计的几个值（即T (x )），这点在 sufficient statistics中也有说明

According to the Pitman–Koopman–Darmois theorem, among families of probability distributions whose domain does not vary with the parameter being estimated, only in exponential families is there a sufficient statistic whose dimension remains bounded as sample size increases. Less tersely, suppose are independent identically distributed random variables whose distribution is known to be in some family of probability distributions. Only if that family is an exponential family is there a (possibly vector-valued) sufficient statistic whose number of scalar components does not increase as the sample size n increases.

This theorem shows that sufficiency (or rather, the existence of a scalar or vector-valued of bounded dimension sufficient statistic) sharply restricts the possible forms of the distribution.

（2）指数分布族具有共轭先验特性。可参考本文“术语解释”。