Jonathan Peck

Ph.D. student, Ghent University

A simple dispersion inequality

You’ve heard of concentration inequalities, now get ready for dispersion inequalities!
Published: Monday, June 28, 2021
« Back

In the field of statistics, we are very often interested in the concentration properties of random variables. That is, we like to know what values the random variable will tend to take on with high probability. This is usually because we don’t have a nice, simple formula for the distribution of the random variable nor an intuitive understanding of what the variable represents. Perhaps the simplest example of a concentration inequality is Markov’s inequality:

Markov’s inequality. Let $X$ be a non-negative random variable with mean $\mu$ and let $t \geq \mu$. Then

$$\Pr[X \geq t] \leq \frac{\mu}{t}.$$


$$\Pr[X \leq t] \geq 1 - \frac{\mu}{t}.$$

Essentially all concentration inequalities will be of this form: they will give an upper bound on the probability that the random variable exceeds a certain value or, equivalently, a lower bound on the probability that the random value will lie within some specified range. Another example is Chebyshev’s inequality:

Chebyshev’s inequality. Let $X$ be a random variable with mean $\mu$ and variance $\sigma^2$. Then, for any $t \geq \sigma$,

$$\Pr[|X - \mu| \geq t] \leq \frac{\sigma^2}{t^2}.$$


$$\Pr[|X - \mu| \leq t] \geq 1 - \frac{\sigma^2}{t^2}.$$

Again, this concentration inequality provides an upper bound on the probability that the random variable exceeds a certain value. Equivalently, it provides a lower bound on the probability that the random variable will lie in some neighborhood around the mean. However, intuitively, we would expect that for a random variable to have non-zero variance, there must also be a lower bound on this probability. Indeed, the variance is one of the most basic measures of spread, and it indicates that the random variable is dispersed around the mean to some degree. Searching through the literature did not yield any useful results, though, so I will present a very simple result reminiscent of Chebyshev’s inequality that provides such a bound. Since this bound actually gives the opposite of a concentration inequality, I refer to it as a “dispersion inequality:”

Dispersion inequality. Let $X$ be a bounded random variable supported on the interval $[a,b]$ with mean $\mu$ and variance $\sigma^2$. Let $u = \max\{ |a - \mu|, |b - \mu| \}$. Then, for any real number $t$ such that $\sigma^2 - u^2 \leq t^2 \leq \sigma^2$,

$$\Pr[|X - \mu| \geq t] \geq \frac{\sigma^2 - t^2}{u^2}.$$


$$\Pr[|X - \mu| \leq t] \leq 1 - \frac{\sigma^2 - t^2}{u^2}.$$

The proof consists of a few simple calculations and is very similar to the proof that is typically given for Chebyshev’s inequality. First, we recall the definition of the variance:

$$\sigma^2 = \mathbb{E}[(X - \mu)^2].$$

This expectation can be decomposed into two parts based on whether $|X - \mu| \leq t$ or $|X - \mu| > t$. These are mutually exclusive events, and so we can easily condition on them to decompose the above expectation as follows:

$$\mathbb{E}[(X - \mu)^2] = p_1E_1 + p_2E_2,$$


$$\begin{aligned} p_1 &= \Pr[|X - \mu| \geq t], & p_2 &= \Pr[|X - \mu| < t],\\
E_1 &= \mathbb{E}[(X - \mu)^2 \mid |X - \mu| \geq t], & E_2 &= \mathbb{E}[(X - \mu)^2 \mid |X - \mu| < t]. \end{aligned}$$

Note that if $|X - \mu| < t$, then $(X - \mu)^2 < t^2$ and so $E_2 < t^2$. Also, since $p_2$ is a probability, $p_2 \leq 1$. It now remains to bound $E_1$ from above. Since $X$ is bounded within $[a,b]$, it holds that $|X - \mu| \leq \max\{ |a - \mu|, |b - \mu| \} = u$. Hence $E_1 \leq u^2$ and so

$$\sigma^2 \leq p_1u^2 + t^2.$$

Rearranging the terms then yields the stated result. $\blacksquare$

This inequality confirms our intuition about the variance: a higher variance $\sigma^2$ means that for the same value of $t$, the lower bound given by the dispersion inequality will increase, indicating a wider dispersion of the random variable around the mean.

Note that, unfortunately, we cannot simply combine the dispersion inequality with Chebyshev’s inequality to yield a two-sided bound

$$\frac{\sigma^2 - t^2}{u^2} \leq \Pr[|X - \mu| \geq t] \leq \frac{\sigma^2}{t^2}.$$

This is because Chebyshev’s inequality requires $t \geq \sigma$ whereas the dispersion inequality requires $t \leq \sigma$, so they can only be used simultaneously when $t = \sigma$. In that case, however, the bounds become trivial: the lower bound is zero and the upper bound is one.

Example: the uniform distribution. Suppose $X$ is a uniform random variable on the unit interval $[0,1]$. Then $\mu = 1/2$, $\sigma^2 = 1/12$ and $u = 1/2$. For $0 \leq t^2 \leq 1/12$,

$$\Pr[|X - 1/2| \geq t] \geq \frac{1}{3} - 4t^2.$$

This lower bound will therefore vary between $\frac{1}{3} \approx 33\%$ for $t = 0$ and $\frac{11}{36} \approx 30\%$ for $t = 1/\sqrt{12}$. The uniform distribution is simple enough for us to solve this inequality exactly. Specifically, for $t \in [0, 1/2]$,

$$\begin{aligned} \Pr[|X - 1/2| \geq t] &= \Pr[X \leq 1/2 - t \vee X \geq 1/2 + t]\\
&= \Pr[X \leq 1/2 - t] + \Pr[X \geq 1/2 + t]\\
&= \int_0^{1/2 - t}\mathrm{d}x + \int_{1/2 + t}^1\mathrm{d}x\\
&= 1 - 2t. \end{aligned}$$

Therefore, the exact solution is 100% for $t = 0$ and $1-1/\sqrt{3} \approx 42\%$ for $t = 1/\sqrt{12}$. This shows that the dispersion inequality can be very loose, but that is to be expected from these types of inequalities that incorporate only very little distributional information. The uniform distribution is in some sense the worst case for our dispersion inequality, since it is the distribution that has the highest possible spread of all.


I did not choose to study this question out of any particular practical need to solve some real problem; it was merely a matter of mathematical curiosity. As such, the dispersion inequality I presented here is very rudimentary, and it is likely that much more sophisticated inequalities can be proven. This particular result was inspired by Chebyshev’s inequality, but it stands to reason that we might achieve more interesting inequalities if we employ proof techniques based on something like a Chernoff bound or a variant of Hoeffding’s lemma.

« Back