Convergence of Random Variables

Introduction

In the realm of probability theory and statistical inference, it's common to encounter situations where we aim to estimate an unobservable random variable $X$ through a sequence of approximations. Suppose we cannot observe $X$ directly, but we can perform measurements or experiments to obtain estimates $X_1, X_2, X_3, \ldots$. Each subsequent estimate is derived from additional data or refined methodologies, with the hope that as $n$ increases, $X_n$ provides a more accurate approximation of $X$.

This leads us to the concept of convergence: we are interested in understanding whether and how the sequence $\{X_n\}$ approaches $X$ as $n \to \infty$. In probability theory, convergence isn't a singular notion but encompasses various types, each capturing a different aspect of how $X_n$ may become "close" to $X$. These include:

  1. Almost Sure Convergence: $X_n$ converges to $X$ with probability 1.
  2. Convergence in Probability: For any $\epsilon > 0$, the probability that $|X_n - X| > \epsilon$ approaches zero as $n \to \infty$.
  3. Convergence in Distribution: The distribution functions of $X_n$ converge to the distribution function of $X$ at all continuity points.
  4. Mean Square Convergence: The expected value of $|X_n - X|^2$ approaches zero as $n \to \infty$.

These are all different kinds of convergence; a sequence might converge in one sense but not another. Some convergence types are "stronger" than others and some are "weaker." By this we mean the following: if Type A convergence is stronger than Type B convergence, then Type A convergence implies Type B convergence. The figure below summarizes how these types of convergence are related: the stronger types of convergence appear at the top, and convergence becomes weaker as we move toward the bottom. For example, using the figure, we conclude that if a sequence of random variables converges in probability to a random variable $X$, then the sequence also converges in distribution to $X$.

Different types of convergence and their relationship with each other


1. Almost Sure Convergence

Definition

A sequence of random variables $\{X_n\}$ converges almost surely to a random variable $X$ if:

\begin{equation} \mathbb{P}\left( \lim_{n \to \infty} X_n = X \right) = 1 \end{equation}

This means that, with probability 1, the sequence $X_n(\omega)$ approaches $X(\omega)$ as $n \to \infty$. In other words, the convergence happens for "almost every" individual outcome $\omega$ in the sample space.

Example

Consider the sequence:

X_n(\omega) = \omega^{1/n}, \quad \omega \in [0,1]

As $n \to \infty$, $X_n(\omega) \to 1$ for every $\omega \in (0,1]$; the only exception is $\omega = 0$, where $X_n(0) = 0$ for all $n$. Since this single point has probability zero (taking $\omega$ uniform on $[0,1]$), $X_n$ converges almost surely to 1.
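A minimal numerical sketch of this example, assuming NumPy is available; the sampled outcomes and the values of $n$ are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)
omega = rng.uniform(0.0, 1.0, size=5)    # a few fixed outcomes omega in (0, 1)

for n in [1, 10, 100, 1000, 10000]:
    x_n = omega ** (1.0 / n)             # X_n(omega) = omega^(1/n)
    print(n, np.round(x_n, 4))
# For each fixed omega, the printed values approach 1 as n grows,
# illustrating pointwise (and hence almost sure) convergence to 1.
```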


2. Convergence in Probability

Definition

Convergence in probability means that the probability of $X_n$ deviating from $X$ by more than $\epsilon$ becomes negligible as $n$ grows. A sequence $\{X_n\}$ converges in probability to $X$ if, for every $\epsilon > 0$:

\begin{equation} \lim_{n \to \infty} \mathbb{P}(|X_n - X| > \epsilon) = 0 \end{equation}

Example

Define:

X_n = \begin{cases} n, & \text{with probability } \frac{1}{n} \\ 0, & \text{with probability } 1 - \frac{1}{n} \end{cases}

Then $X_n \to 0$ in probability, since:

\mathbb{P}(|X_n - 0| > \epsilon) = \frac{1}{n} \to 0 \quad \text{as } n \to \infty
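A quick Monte Carlo check of this example, as a sketch assuming NumPy; the value of eps and the sample sizes are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(1)
eps, trials = 0.5, 100_000

for n in [10, 100, 1000]:
    # X_n equals n with probability 1/n and 0 otherwise
    x_n = np.where(rng.random(trials) < 1.0 / n, n, 0)
    # Monte Carlo estimate of P(|X_n - 0| > eps); it tracks 1/n and tends to 0
    print(n, np.mean(np.abs(x_n) > eps))
```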

As mentioned previously, convergence in probability is stronger than convergence in distribution. That is, if $X_n \xrightarrow{p} X$, then $X_n \xrightarrow{d} X$. The converse is not necessarily true.

For example, let $X_1, X_2, X_3, \dots$ be a sequence of i.i.d. $\text{Bernoulli}\left(\frac{1}{2}\right)$ random variables. Let also $X \sim \text{Bernoulli}\left(\frac{1}{2}\right)$ be independent of the $X_i$'s. Then $X_n \xrightarrow{d} X$. However, $X_n$ does not converge in probability to $X$, since $|X_n - X|$ is in fact also a $\text{Bernoulli}\left(\frac{1}{2}\right)$ random variable, and

\mathbb{P}(|X_n - X| \geq \epsilon) = \frac{1}{2}, \quad \text{for } 0 < \epsilon < 1.
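This counterexample can also be checked numerically, as a sketch assuming NumPy; every $X_n$ in the sequence behaves the same way, so one draw suffices:

```python
import numpy as np

rng = np.random.default_rng(2)
trials = 100_000

x   = rng.integers(0, 2, size=trials)    # X   ~ Bernoulli(1/2)
x_n = rng.integers(0, 2, size=trials)    # X_n ~ Bernoulli(1/2), independent of X
# X_n has the same distribution as X for every n (so convergence in distribution holds),
# but |X_n - X| is itself Bernoulli(1/2), so the deviation probability never shrinks.
print(np.mean(np.abs(x_n - x) >= 0.5))   # roughly 0.5 for any epsilon in (0, 1)
```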


A special case in which the converse is true is when $X_n \xrightarrow{d} c$, where $c$ is a constant. In this case, convergence in distribution implies convergence in probability. We can state the following theorem:


Theorem

If $X_n \xrightarrow{d} c$, where $c$ is a constant, then $X_n \xrightarrow{p} c$.
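A small illustration of the theorem, as a sketch assuming NumPy; the sequence $X_n \sim \text{Uniform}(c - 1/n,\ c + 1/n)$ is a hypothetical example chosen for illustration, not taken from the text:

```python
import numpy as np

rng = np.random.default_rng(3)
c, eps, trials = 2.0, 0.1, 100_000

for n in [1, 10, 100]:
    # X_n uniform on (c - 1/n, c + 1/n): its CDF converges to that of the constant c
    x_n = rng.uniform(c - 1.0 / n, c + 1.0 / n, size=trials)
    print(n, np.mean(np.abs(x_n - c) > eps))   # tends to 0, so X_n -> c in probability too
```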

An example of convergence in probability is the weak law of large numbers (WLLN).


3. Convergence in Distribution

Definition

Convergence in distribution focuses on the behavior of the distribution functions. It means that the distribution of $X_n$ approaches the distribution of $X$ as $n \to \infty$. Formally, a sequence $\{X_n\}$ converges in distribution to $X$ if, for all points $x$ where the cumulative distribution function (CDF) $F_X$ is continuous:

\begin{equation} \lim_{n \to \infty} F_{X_n}(x) = F_X(x) \end{equation}

Example

Let $X_2, X_3, X_4, \dots$ be a sequence of random variables with the cumulative distribution function:

F_{X_n}(x) = \begin{cases} 1 - \left(1 - \frac{1}{n}\right)^{nx} & x > 0 \\ 0 & \text{otherwise} \end{cases}

Then $X_n$ converges in distribution to $\text{Exponential}(\lambda = 1)$, since $\left(1 - \frac{1}{n}\right)^{nx} \to e^{-x}$ as $n \to \infty$, so $F_{X_n}(x) \to 1 - e^{-x}$ for every $x > 0$.
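A direct numerical check of this CDF limit, as a sketch assuming NumPy; the evaluation points are arbitrary:

```python
import numpy as np

x = np.array([0.5, 1.0, 2.0])                 # evaluation points x > 0
exact = 1.0 - np.exp(-x)                      # CDF of Exponential(lambda = 1)

for n in [2, 10, 100, 1000]:
    f_n = 1.0 - (1.0 - 1.0 / n) ** (n * x)    # F_{X_n}(x) from the example
    print(n, np.round(f_n, 4))
print("limit", np.round(exact, 4))            # F_{X_n}(x) approaches 1 - e^{-x}
```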


4. Mean-Square Convergence

A sequence of random variables $X_1, X_2, \dots, X_n, \dots$ converges to a random variable $X$ in mean square (m.s.) if

\lim_{n\to\infty} \mathbb{E}\big[(X_n - X)^2\big] = 0.

We often write this as $X_n \xrightarrow{m.s.} X$. Note that mean-square convergence is not implied by convergence in probability: the sequence from the example above ($X_n = n$ with probability $1/n$ and $0$ otherwise) converges to $0$ in probability but not in mean square, since $\mathbb{E}[(X_n - 0)^2] = n \to \infty$.
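For a concrete sequence that does converge in mean square, consider the hypothetical example $X_n = X + Z_n/n$ with $Z_n$ standard normal, so that $\mathbb{E}[(X_n - X)^2] = 1/n^2 \to 0$. A minimal sketch, assuming NumPy:

```python
import numpy as np

rng = np.random.default_rng(4)
trials = 100_000
x = rng.normal(size=trials)                   # the target random variable X

for n in [1, 10, 100]:
    x_n = x + rng.normal(size=trials) / n     # X_n = X + Z_n / n, with Z_n standard normal
    print(n, np.mean((x_n - x) ** 2))         # estimates E[(X_n - X)^2] = 1/n^2 -> 0
```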


5. Relationships Between Different Types of Convergence

As discussed previously, the different types of convergence are related to each other.

  • Almost Sure Convergence $\Rightarrow$ Convergence in Probability $\Rightarrow$ Convergence in Distribution
  • Convergence in $L^p$ Norm $\Rightarrow$ Convergence in Probability $\Rightarrow$ Convergence in Distribution
Type of Convergence   Notation                       Implies
Almost Sure           $X_n \xrightarrow{a.s.} X$     Convergence in Probability
In Mean Square        $X_n \xrightarrow{m.s.} X$     Convergence in Probability
In Probability        $X_n \xrightarrow{P} X$        Convergence in Distribution
In Distribution       $X_n \xrightarrow{d} X$        (weakest; implies none of the others)

However, the converses do not generally hold.

6. Weak Law of Large Numbers (WLLN)

Statement

Let $X_1, X_2, \ldots$ be i.i.d. random variables with finite mean $\mu$. Then:

\begin{equation} \bar{X}_n = \frac{1}{n} \sum_{i=1}^n X_i \xrightarrow{P} \mu \end{equation}

Intuition

The sample average $\bar{X}_n$ converges in probability to the expected value $\mu$ as the sample size increases.

Example

If $X_i \sim \text{Bernoulli}(0.5)$, then $\bar{X}_n \to 0.5$ in probability.
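This can be seen in a short simulation, sketched here assuming NumPy; eps and the sample sizes are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(5)
eps, trials = 0.05, 10_000

for n in [100, 1000, 10_000]:
    # sample means of n i.i.d. Bernoulli(0.5) variables, repeated over many independent trials
    means = rng.binomial(n, 0.5, size=trials) / n
    print(n, np.mean(np.abs(means - 0.5) > eps))   # estimate of P(|X_bar_n - 0.5| > eps) -> 0
```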

WLLN plot