Central Limit Theorem
Theory
Central Limit Theorem (CLT)
The Central Limit Theorem (CLT) is one of the foundational results in probability theory and statistics. It explains how, under certain conditions, the distribution of the sample mean (or normalized sum) converges to a normal distribution, even if the original data is not normally distributed.
Statement of CLT
Let $X_1, X_2, \dots$ be a sequence of independent and identically distributed (i.i.d.) random variables with mean $\mu$ and finite variance $\sigma^2 > 0$. Define the normalized sum:
$ S_n = \frac{1}{\sigma \sqrt{n}} \sum_{i=1}^n \left( X_i - \mu \right) $
As $n \to \infty$, the distribution of $S_n$ approaches the standard normal distribution $N(0, 1)$, regardless of the original distribution of the $X_i$ (provided the mean and variance are finite):
$ S_n \xrightarrow{d} N(0, 1) $
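The statement can be checked empirically. The following is a minimal simulation sketch, not a proof: it assumes Exponential(1) summands (a skewed distribution with $\mu = 1$ and $\sigma = 1$), and the helper names `standardized_sum` and `empirical_cdf_at`, the seed, and the sample sizes are illustrative choices. It compares the empirical CDF of $S_n$ with the standard normal CDF at a few points.

```python
import math
import random
from statistics import NormalDist

def standardized_sum(n, rng, mu=1.0, sigma=1.0):
    """Draw n Exponential(1) variables and return sum(X_i - mu) / (sigma * sqrt(n))."""
    total = sum(rng.expovariate(1.0) for _ in range(n))
    return (total - n * mu) / (sigma * math.sqrt(n))

def empirical_cdf_at(x, samples):
    """Fraction of samples that are <= x."""
    return sum(s <= x for s in samples) / len(samples)

rng = random.Random(0)  # fixed seed so the run is repeatable
samples = [standardized_sum(200, rng) for _ in range(5000)]

# Compare the empirical CDF of S_n with the standard normal CDF.
for x in (-1.0, 0.0, 1.0):
    print(x, round(empirical_cdf_at(x, samples), 3), round(NormalDist().cdf(x), 3))
```

The two printed columns should be close but not identical: the remaining gap reflects both sampling noise and the skewness of the exponential distribution at finite $n$.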
Properties of CLT
Mean of Sample Means:
- The mean of the sample means is equal to the population mean $\mu$.
Variance of Sample Means:
- The variance of the sample means is $\sigma^2 / n$, which decreases as the sample size $n$ increases.
Normal Approximation:
- For large $n$, the sample mean is approximately distributed as $N(\mu, \sigma^2 / n)$. The approximation improves with larger sample sizes.
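The first two properties can be illustrated numerically. The sketch below assumes Uniform(0, 1) data (so $\mu = 1/2$ and $\sigma^2 = 1/12$); the function name `sample_means`, the seed, and the trial counts are hypothetical choices made for illustration only.

```python
import random
from statistics import fmean, pvariance

def sample_means(n, trials, rng):
    """Return `trials` sample means, each averaging n Uniform(0, 1) draws."""
    return [fmean([rng.random() for _ in range(n)]) for _ in range(trials)]

MU, SIGMA2 = 0.5, 1 / 12  # mean and variance of Uniform(0, 1)
rng = random.Random(42)

for n in (5, 20, 80):
    means = sample_means(n, 5000, rng)
    # Columns: n, observed mean, observed variance, predicted variance sigma^2 / n
    print(n, fmean(means), pvariance(means), SIGMA2 / n)
```

The observed mean should stay near $\mu$ for every $n$, while the observed variance should track $\sigma^2 / n$ and shrink as $n$ grows.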
Example: Rolling a Die
- Population: Outcomes of a fair six-sided die, $\{1, 2, 3, 4, 5, 6\}$.
- Population Mean $\mu = 3.5$, Variance $\sigma^2 = 35/12 \approx 2.92$.
- If we roll the die $n$ times, compute the sample mean, and repeat this many times, the distribution of these sample means approaches a normal distribution as $n$ increases.
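For the die, the distribution of the sample mean can be computed exactly rather than simulated. The sketch below builds the exact distribution of the total of $n$ dice by repeated convolution and measures how far its CDF is from the normal approximation $N(\mu, \sigma^2/n)$. The function names and the half-step continuity correction are choices made here for illustration, not part of the original example.

```python
from fractions import Fraction
from statistics import NormalDist

FACES = range(1, 7)
MU = Fraction(sum(FACES), 6)                               # 7/2
SIGMA2 = Fraction(sum(f * f for f in FACES), 6) - MU ** 2  # 35/12

def sum_distribution(n):
    """Exact pmf of the total of n fair dice, as {total: probability}."""
    pmf = {0: Fraction(1)}
    for _ in range(n):
        nxt = {}
        for total, p in pmf.items():
            for f in FACES:
                nxt[total + f] = nxt.get(total + f, Fraction(0)) + p / 6
        pmf = nxt
    return pmf

def max_cdf_gap(n):
    """Largest gap between the exact CDF of the sample mean and its normal approximation."""
    approx = NormalDist(mu=float(MU), sigma=float(SIGMA2 / n) ** 0.5)
    pmf = sum_distribution(n)
    cdf, worst = Fraction(0), 0.0
    for total in sorted(pmf):
        cdf += pmf[total]
        # Continuity correction: evaluate the normal CDF half a step above total.
        x = (total + 0.5) / n
        worst = max(worst, abs(float(cdf) - approx.cdf(x)))
    return worst

for n in (1, 2, 5, 10, 30):
    print(n, max_cdf_gap(n))
```

Using `Fraction` keeps the probabilities exact, so any gap that is printed comes from the normal approximation itself and not from rounding in the convolution.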
Characteristic Functions of Random Variables
The characteristic function of a random variable $X$ is a powerful tool in probability theory, defined as:
$ \phi_X(t) = \mathbb{E}\left[ e^{itX} \right] $
where $i$ is the imaginary unit and $t$ is a real parameter.
Key Properties
- Existence: The characteristic function exists for every random variable, because $|e^{itX}| = 1$ guarantees that the expectation is finite.
- Uniqueness: It uniquely determines the probability distribution of a random variable.
- Convolution Property: The characteristic function of the sum of independent random variables is the product of their individual characteristic functions: $ \phi_{X+Y}(t) = \phi_X(t) \cdot \phi_Y(t) $
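- Inversion Formula: When a random variable has a probability density function (PDF), for example when $\phi_X$ is absolutely integrable, the PDF can be recovered from the characteristic function using the inverse Fourier transform: $ f_X(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-itx} \phi_X(t) \, dt $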
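For discrete random variables the characteristic function is a finite sum, so the convolution property can be checked directly. The sketch below is illustrative only: the helper `char_fn` and the choice of a fair die and a fair coin as the two independent variables are assumptions made here.

```python
import cmath
from itertools import product

def char_fn(pmf, t):
    """phi(t) = E[exp(i t X)] for a discrete pmf given as {value: probability}."""
    return sum(p * cmath.exp(1j * t * x) for x, p in pmf.items())

die = {k: 1 / 6 for k in range(1, 7)}
coin = {0: 0.5, 1: 0.5}

# pmf of X + Y for independent X (die) and Y (coin), built by convolution.
total = {}
for (x, px), (y, py) in product(die.items(), coin.items()):
    total[x + y] = total.get(x + y, 0.0) + px * py

# The difference between phi_{X+Y}(t) and phi_X(t) * phi_Y(t) should be
# zero up to floating-point rounding.
for t in (0.0, 0.7, 2.0):
    lhs = char_fn(total, t)
    rhs = char_fn(die, t) * char_fn(coin, t)
    print(t, abs(lhs - rhs))
```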
Role of Characteristic Functions in CLT
Characteristic functions simplify the proof and understanding of the Central Limit Theorem because:
- Transforming Convolution to Multiplication: The distribution of a sum of independent random variables is the convolution of their distributions, which corresponds to the product of their characteristic functions.
- Analyzing Limiting Behavior: The characteristic function of the normalized sum converges pointwise to that of the Gaussian distribution, and this convergence implies convergence in distribution.
Formal Proof Idea Using Characteristic Functions
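Let $X_1, X_2, \dots$ be i.i.d. random variables with mean $\mu$ and finite variance $\sigma^2 > 0$. Let $Y_i = (X_i - \mu) / \sigma$ be the standardized variables, which have mean $0$ and variance $1$, and define:
$ S_n = \frac{1}{\sigma \sqrt{n}} \sum_{i=1}^n \left( X_i - \mu \right) = \frac{1}{\sqrt{n}} \sum_{i=1}^n Y_i $
By independence and the convolution property, the characteristic function of $S_n$, denoted $\phi_{S_n}(t)$, is given by:
$ \phi_{S_n}(t) = \left[ \phi_Y\left( \frac{t}{\sqrt{n}} \right) \right]^n $
Because $Y$ has mean $0$ and variance $1$, the Taylor expansion of $\phi_Y$ around $t = 0$ is:
$ \phi_Y(t) = 1 - \frac{t^2}{2} + o(t^2) $
Substituting $t / \sqrt{n}$ for $t$ gives $\phi_{S_n}(t) = \left[ 1 - \frac{t^2}{2n} + o\left( \frac{1}{n} \right) \right]^n$ for each fixed $t$, and since $(1 + a_n / n)^n \to e^{a}$ whenever $a_n \to a$:
$ \phi_{S_n}(t) \to e^{-t^2 / 2} \quad \text{as } n \to \infty $
This is the characteristic function of the standard normal distribution $N(0, 1)$. By Lévy's continuity theorem, pointwise convergence of characteristic functions implies convergence in distribution, which proves the CLT.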
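The limit can also be observed numerically for a concrete distribution. The sketch below assumes the summands are fair die rolls standardized to mean $0$ and variance $1$, that is $Y = (X - 3.5) / \sqrt{35/12}$. The function names and the test point $t = 1.5$ are arbitrary illustrative choices.

```python
import cmath
import math

FACES = range(1, 7)
MU = 3.5
SIGMA = math.sqrt(35 / 12)

def phi_standardized_die(t):
    """Characteristic function of Y = (X - MU) / SIGMA for a fair die X."""
    return sum(cmath.exp(1j * t * (f - MU) / SIGMA) for f in FACES) / 6

def phi_normalized_sum(t, n):
    """Characteristic function of S_n = (Y_1 + ... + Y_n) / sqrt(n)."""
    return phi_standardized_die(t / math.sqrt(n)) ** n

t = 1.5
target = math.exp(-t * t / 2)  # characteristic function of N(0, 1) at t
for n in (1, 10, 100, 1000):
    # Distance between phi_{S_n}(t) and the Gaussian limit.
    print(n, abs(phi_normalized_sum(t, n) - target))
```

The printed distances should shrink as $n$ grows, which is the convergence the proof sketch describes, evaluated at a single value of $t$.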