Chebyshev's Inequality


March 30, 2023 - A fundamental result that bounds the probability of a random variable deviating far from its mean. The theorem is named after the Russian mathematician Pafnuty Chebyshev, who published it in 1867.

In this inequality, the variance of a distribution is used to set an upper limit on the probability that a random variable lands at least a certain distance away from its mean. This means Chebyshev's inequality allows us to make statements about the likelihood of extreme events, which is particularly useful in risk analysis and decision-making contexts. In this way, Chebyshev's inequality is an important concept that forms the foundation of many statistical techniques and applications.


Theorem: Chebyshev's Inequality

For a random variable X with a well-defined expectation, meaning: $$E[X] < \infty$$ and any $$\lambda > 0$$ we have: $$ p( | X - E[X] | \geq \lambda) \leq \frac{Var(X)}{\lambda^2}$$ where Var(X) denotes the variance of X.
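A common way to read the bound is to measure the deviation in multiples of the standard deviation. Setting $$\lambda = k \, \sigma(X), \quad \sigma(X) = \sqrt{Var(X)}$$ gives: $$p(|X - E[X]| \geq k \, \sigma(X)) \leq \frac{1}{k^2}$$ So regardless of how X is distributed, at most 1/4 of its probability mass lies two or more standard deviations away from the mean.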


Proof of Chebyshev's Inequality:

The proof below uses Markov's Inequality, which states the following:

For a non-negative random variable X: $$X \geq 0$$ and any a > 0: $$p(X \geq a) \leq \frac{E[X]}{a}$$
Define: $$Y = (X-E[X])^2$$ Since Y is a squared term, it is non-negative, so by Markov's Inequality, for any a > 0: $$p(Y \geq a) \leq \frac{E[Y]}{a}$$

Now rewrite this inequality using two observations: $$1. \quad E[Y] = E[(X-E[X])^2] = Var(X)$$ $$2. \quad Y \geq a \Longleftrightarrow (X - E[X])^2 \geq a \Longleftrightarrow |X - E[X]| \geq \sqrt{a}$$

We get: $$ p(|X-E[X]| \geq \sqrt{a}) \leq \frac{Var(X)}{a}$$

Setting $$ \lambda = \sqrt{a}$$ we arrive at Chebyshev's Inequality as stated above: $$ p( | X - E[X] | \geq \lambda) \leq \frac{Var(X)}{\lambda^2}$$
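To see the two proof steps in action, here is a minimal numerical sanity check in Python; the exponential distribution and the threshold a = 4 are purely illustrative choices and not part of the proof:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative choice: X ~ Exponential(1), so E[X] = 1 and Var(X) = 1.
x = rng.exponential(scale=1.0, size=1_000_000)
mean, var = x.mean(), x.var()

a = 4.0                      # threshold for Markov applied to Y
y = (x - mean) ** 2          # Y = (X - E[X])^2, non-negative by construction

# Markov's inequality applied to Y: p(Y >= a) <= E[Y] / a
print("p(Y >= a)          =", np.mean(y >= a))
print("E[Y] / a           =", y.mean() / a)

# The same statement rewritten as Chebyshev with lambda = sqrt(a)
lam = np.sqrt(a)
print("p(|X-E[X]| >= lam) =", np.mean(np.abs(x - mean) >= lam))
print("Var(X) / lam^2     =", var / lam**2)
```

Both empirical probabilities come out below their respective bounds, and the two bounds agree, since they are the same statement written in two ways.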


What does it actually mean?

Chebyshev's Inequality bounds the probability of observing an "unexpected" outcome, where "unexpected" means being at least a certain distance away from the mean.

It is unlikely that X deviates from its mean by a large amount:

The larger the threshold λ we choose, the larger the denominator in the bound: $$ \frac{Var(X)}{\lambda^2} $$ So the bound on the probability shrinks as λ grows. In other words: it becomes less and less likely to observe values far away from the expected value of X.
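Equivalently, the bound can be read as a concentration statement: $$p(|X - E[X]| < \lambda) \geq 1 - \frac{Var(X)}{\lambda^2}$$ so the probability of staying within λ of the mean approaches 1 as λ grows.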


How can we use it?

One example scenario: suppose we have a Monte Carlo estimator, which in its standard sample-mean form is unbiased.

$$ \hat{\theta} : \text{Monte Carlo estimator}$$ $$ E[\hat{\theta}] = \theta $$ Because the estimator is unbiased, the two events are identical: $$|\hat{\theta} - E[\hat{\theta}]| \geq \lambda \Longleftrightarrow |\hat{\theta} - \theta| \geq \lambda$$ so Chebyshev directly gives an upper bound on the probability of a large estimation error: $$p(|\hat{\theta} - \theta| \geq \lambda) \leq \frac{Var(\hat{\theta})}{\lambda^2}$$
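Here is a minimal sketch of this use case in Python, assuming we estimate $$\theta = E[f(X)]$$ with the illustrative choices f(x) = x² and X ~ Uniform(0, 1), so the true value is 1/3; the variance of the estimator is approximated by the sample variance divided by the number of samples:

```python
import numpy as np

rng = np.random.default_rng(42)

n = 10_000                       # number of Monte Carlo samples
x = rng.uniform(0.0, 1.0, size=n)
f = x ** 2                       # illustrative integrand, E[f(X)] = 1/3

theta_hat = f.mean()             # unbiased sample-mean Monte Carlo estimator
var_hat = f.var(ddof=1) / n      # estimated Var(theta_hat) = Var(f(X)) / n

lam = 0.01                       # error threshold we care about
bound = var_hat / lam**2         # Chebyshev bound on p(|theta_hat - theta| >= lam)

print("estimate        :", theta_hat)
print("true value      :", 1 / 3)
print("Chebyshev bound :", bound)
```

Since $$Var(\hat{\theta}) = \frac{Var(f(X))}{n}$$ the bound tightens as the number of samples grows, which is exactly the guarantee one usually wants from such an error estimate.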
Author: Dustin