Generalized Bayes estimator

The prior distribution π has thus far been assumed to be a true probability distribution, in that

\int \pi(\theta) d\theta = 1.

However, occasionally this can be a restrictive requirement. For example, there is no distribution having equal probabilities for every real number. Yet, in some sense, such a "distribution" seems like a natural choice for a non-informative prior, i.e., a prior distribution which does not imply a preference for any particular value of the unknown parameter. One can still define a function π(θ) = 1, but this would not be a proper probability distribution since it has infinite mass,

\int{\pi(\theta)d\theta}=\infty.

Such measures π(θ), which are not probability distributions, are referred to as improper priors.
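
Another common example, mentioned here only as an illustration, is the prior π(σ) = 1/σ for a scale parameter σ > 0; its integral over (0, ∞) also diverges, so it is improper as well.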

The use of an improper prior typically results in infinite Bayes risk. As a consequence, it is no longer meaningful to speak of an estimator which minimizes the Bayes risk. Nevertheless, in many cases, one can define the posterior distribution

\pi(\theta|x) = \frac{p(x|\theta) \pi(\theta)}{\int p(x|\theta) \pi(\theta) d\theta}.

This is a definition, and not an application of Bayes' theorem, since Bayes' theorem can only be applied when all distributions are proper. However, it is not uncommon for the resulting "posterior" to be a valid probability distribution. In this case, the posterior expected loss

 \int{L(\theta,a)\pi(\theta|x)d\theta}

is typically well-defined and finite. Recall that, for a proper prior, the Bayes estimator minimizes the posterior expected loss. When the prior is improper, an estimator which minimizes the posterior expected loss is referred to as a generalized Bayes estimator.
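
As a simple illustration, suppose a single observation satisfies x | θ ~ N(θ, 1) and the improper prior π(θ) = 1 is used. The normalizing integral is then finite, and

\pi(\theta|x) = \frac{e^{-(x-\theta)^2/2}}{\int{e^{-(x-\theta)^2/2}d\theta}} = \frac{1}{\sqrt{2\pi}}e^{-(\theta-x)^2/2},

so the posterior is a proper N(x, 1) distribution.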

Example

A typical example concerns the estimation of a location parameter with a loss function of the type L(a − θ). Here θ is a location parameter, i.e., p(x | θ) = f(x − θ).
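
Common choices of loss include the squared error L(a − θ) = (a − θ)² and the absolute error L(a − θ) = |a − θ|; the normal model of the illustration above, with f(y) = e^{−y²/2}/√(2π), is a standard instance of such a location family.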

It is common to use the improper prior π(θ) = 1 in this case, especially when no other more subjective information is available. This yields

π(θ | x) = p(x | θ) = f(x − θ)

so the posterior expected loss equals

E[L(a-\theta)]=\int{L(a-\theta)f(x-\theta)d\theta}=\int{L(a-x+y)f(y)dy}

where the change of variables y = x − θ has been used. Defining C = a − x, we get

E[L(a-\theta)]=\int{L(C+y)f(y)dy}=E[L(y+C)]

therefore the generalized Bayes estimator is x + C, where C is a constant which minimizes E[L(y + C)]; here the expectation is taken with respect to y ~ f, so C does not depend on the observation x.
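
For example, under the absolute error loss L(t) = |t|, the quantity E[|y + C|] is minimized when −C is a median of f, so the generalized Bayes estimator is x minus a median of the noise distribution.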

In the special case of the mean squared error loss, L(t) = t², one has E[(y + C)²] = Var(y) + (E[y] + C)², which is minimized by C = −E[y], where

E[y]=\int{yf(y)dy}.

The generalized Bayes estimator is then δ(x) = x − E[y].

Assuming, for example, Gaussian samples X | θ ~ N(θ, Ip), where X = (x1, ..., xp) and θ = (θ1, ..., θp), we have y = X − θ ~ N(0, Ip). Hence E[y] = 0, and the generalized Bayes estimator of θ is δ(X) = X.
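
As a complement to the closed-form result, the following minimal numerical sketch (not part of the original article) approximates a generalized Bayes estimate for a scalar location family by minimizing the posterior expected loss over a finite grid. The function name, the grid, and the truncation of the real line to an interval are illustrative assumptions only.

import numpy as np

# Minimal sketch, assuming a scalar location family p(x | theta) = f(x - theta)
# and the flat improper prior pi(theta) = 1, so that pi(theta | x) = f(x - theta).
# The estimate is found by a grid search over candidate actions a, minimizing
# the posterior expected loss, i.e. the integral of L(a - theta) f(x - theta)
# over theta. All names and grid parameters below are illustrative.

def generalized_bayes_estimate(x, f, L, grid):
    dtheta = grid[1] - grid[0]
    posterior = f(x - grid)                    # pi(theta | x), up to normalization
    posterior = posterior / (posterior.sum() * dtheta)
    # posterior expected loss for each candidate action a on the same grid
    risks = np.array([np.sum(L(a - grid) * posterior) * dtheta for a in grid])
    return grid[np.argmin(risks)]

grid = np.linspace(-10.0, 10.0, 2001)          # truncates the real line; spacing 0.01
f = lambda y: np.exp(-0.5 * y**2) / np.sqrt(2 * np.pi)   # standard normal noise
L = lambda t: t**2                             # squared error loss
print(generalized_bayes_estimate(1.7, f, L, grid))       # approximately 1.7, i.e. x

Under the squared error loss this grid search returns approximately δ(x) = x, in agreement with the Gaussian example above, since E[y] = 0 for standard normal noise.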