Gaussian Random Vectors

Introduction

A Gaussian random variable (or normal random variable) is a continuous random variable with a bell-shaped probability density. It is one of the most widely used distributions in probability and statistics because of the Central Limit Theorem, which states that the sum of many independent random variables tends to follow a normal distribution regardless of the distributions of the individual variables. As a result, a simple Gaussian assumption for a naturally occurring distribution often yields an accurate analysis of the system. We define and study both 1D and 2D Gaussian random variables in this experiment.

1D (Univariate) Gaussian Random Variable

Definition

A 1D Gaussian random variable $X$, denoted as $X \sim N(\mu, \sigma^2)$, is characterized by two parameters:

  • Mean $\mu$ (center of the distribution)
  • Variance $\sigma^2$ (spread of the distribution)

The probability density function (PDF) of $X$ is given by:

\begin{equation} f_X(x) = \frac{1}{\sqrt{2\pi \sigma^2}} \exp\left( -\frac{(x - \mu)^2}{2\sigma^2} \right) \end{equation}
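
As a quick sanity check, here is a minimal Python sketch (assuming NumPy and SciPy are available; the parameter values are illustrative) that evaluates this formula directly and compares it against `scipy.stats.norm.pdf`:

```python
import numpy as np
from scipy.stats import norm

mu, sigma = 1.0, 2.0   # illustrative mean and standard deviation (variance = 4)
x = np.linspace(mu - 4 * sigma, mu + 4 * sigma, 9)

# PDF computed directly from the formula above
f_manual = np.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / np.sqrt(2 * np.pi * sigma ** 2)

# PDF from scipy; note that `scale` is the standard deviation, not the variance
f_scipy = norm.pdf(x, loc=mu, scale=sigma)

print(np.allclose(f_manual, f_scipy))  # True
```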

Properties

  1. Symmetry: The normal distribution is symmetric around its mean $\mu$.
  2. Standard Normal Distribution: When $\mu = 0$ and $\sigma^2 = 1$, the Gaussian random variable is called a standard normal variable, denoted by $Z \sim N(0, 1)$. Its PDF is:

\begin{equation} f_Z(z) = \frac{1}{\sqrt{2\pi}} \exp\left( -\frac{z^2}{2} \right) \end{equation}

  3. Transformation of a Gaussian Random Variable:
    • If $X \sim N(\mu, \sigma^2)$ and we transform $Y = aX + b$, then $Y \sim N(a\mu + b, a^2\sigma^2)$; the sketch below verifies this empirically.
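
A minimal simulation sketch (assuming NumPy; the constants are arbitrary illustrative choices) that samples $X$ and checks the mean and variance of $Y = aX + b$:

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma, a, b = 2.0, 3.0, -1.5, 4.0     # illustrative parameters

x = rng.normal(mu, sigma, size=1_000_000)  # X ~ N(2, 9)
y = a * x + b                              # Y = aX + b

print(y.mean(), a * mu + b)           # sample mean  vs.  a*mu + b = 1.0
print(y.var(), (a * sigma) ** 2)      # sample variance  vs.  a^2*sigma^2 = 20.25
```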

Cumulative Distribution Function (CDF)

The CDF of a standard normal variable is denoted by $\Phi(x)$; for $X \sim N(\mu, \sigma^2)$, the probability that $X \leq x$ can be written in terms of $\Phi$:

\begin{equation} P(X \leq x) = \Phi\left( \frac{x - \mu}{\sigma} \right) \quad \text{where} \quad \Phi(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-t^2/2} \, dt \end{equation}

However, there is no closed-form expression for $\Phi(x)$; it is generally computed numerically.
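
In practice, library routines evaluate $\Phi$ for us. A minimal sketch (assuming SciPy; the parameters are illustrative):

```python
from scipy.stats import norm

mu, sigma = 5.0, 2.0   # illustrative parameters
x = 7.0

# Standardize and evaluate Phi((x - mu) / sigma)
print(norm.cdf((x - mu) / sigma))        # Phi(1) ≈ 0.8413
# Equivalently, let scipy handle the standardization
print(norm.cdf(x, loc=mu, scale=sigma))  # same value
```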

Properties of CDF

Here are some properties of the $\Phi$ function that follow from its definition.

  1. $\lim_{x \rightarrow \infty} \Phi(x) = 1$, $\lim_{x \rightarrow -\infty} \Phi(x) = 0$
  2. $\Phi(0) = \frac{1}{2}$
  3. $\Phi(-x) = 1 - \Phi(x)$, for all $x \in \mathbb{R}$.
  4. About 68% of values drawn from a normal distribution are within one standard deviation $\sigma$ of the mean; about 95% of the values lie within two standard deviations; and about 99.7% are within three standard deviations. More precisely, the probability that a normal deviate lies in the range between $\mu - n\sigma$ and $\mu + n\sigma$ is given by \begin{equation} F(\mu + n\sigma) - F(\mu - n\sigma) = \Phi(n) - \Phi(-n) = 2\Phi(n) - 1 \end{equation} where $F$ is the CDF of $X \sim N(\mu, \sigma^2)$.
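
The 68-95-99.7 figures in property 4 follow directly from $2\Phi(n) - 1$; a one-line check (assuming SciPy):

```python
from scipy.stats import norm

for n in (1, 2, 3):
    print(n, 2 * norm.cdf(n) - 1)
# 1 0.6826894921370859
# 2 0.9544997361036416
# 3 0.9973002039367398
```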

Also, since the $\Phi$ function does not have a closed form, it is sometimes useful to use upper or lower bounds. In particular, we can state the following bounds for all $x > 0$: \begin{equation} \frac{1}{\sqrt{2\pi}} \frac{x}{x^2 + 1} \exp\left\{ -\frac{x^2}{2} \right\} \leq 1 - \Phi(x) \leq \frac{1}{\sqrt{2\pi}} \frac{1}{x} \exp\left\{ -\frac{x^2}{2} \right\} \end{equation}
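
A minimal numerical check of these tail bounds (assuming NumPy and SciPy; `norm.sf(x)` is SciPy's survival function, i.e., $1 - \Phi(x)$):

```python
import numpy as np
from scipy.stats import norm

for x in (0.5, 1.0, 2.0, 4.0):
    tail = norm.sf(x)  # 1 - Phi(x)
    lower = x / (x ** 2 + 1) * np.exp(-x ** 2 / 2) / np.sqrt(2 * np.pi)
    upper = np.exp(-x ** 2 / 2) / (x * np.sqrt(2 * np.pi))
    print(f"x={x}: {lower:.6f} <= {tail:.6f} <= {upper:.6f}")
```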

Central Limit Theorem (CLT)

The CLT roughly states that the sum (or average) of a large number of independent and identically distributed (i.i.d.) random variables tends to be normally distributed, even if the original variables are not normal. We will discuss the CLT in detail in the CLT Experiment.
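
As a small preview (a sketch assuming NumPy; the number of summands is an arbitrary choice), standardized sums of i.i.d. Uniform(0, 1) variables already look standard normal:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 48                                        # number of i.i.d. summands
s = rng.uniform(size=(100_000, n)).sum(axis=1)

# Uniform(0, 1) has mean 1/2 and variance 1/12, so standardize the sums
z = (s - n / 2) / np.sqrt(n / 12)
print(z.mean(), z.std())                      # ≈ 0 and ≈ 1
# A histogram of z closely matches the standard normal PDF.
```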


2D Gaussian Random Variable (Bivariate Normal Distribution)

Definition

A bivariate normal distribution describes two jointly normal random variables $X$ and $Y$ with the following parameters:

  • Means: $\mu_X$, $\mu_Y$
  • Variances: $\sigma_X^2$, $\sigma_Y^2$
  • Covariance: $\sigma_{XY} = E\left[ (X - \mu_X)(Y - \mu_Y) \right]$

We denote this as:

\begin{equation} \begin{pmatrix} X \\ Y \end{pmatrix} \sim N\left( \begin{pmatrix} \mu_X \\ \mu_Y \end{pmatrix}, \begin{pmatrix} \sigma_X^2 & \sigma_{XY} \\ \sigma_{XY} & \sigma_Y^2 \end{pmatrix} \right) \end{equation}

Joint Probability Density Function (PDF)

The joint probability density function (PDF) of $X$ and $Y$ is given by:

\begin{equation} f_{X,Y}(x, y) = \frac{1}{2\pi \sigma_X \sigma_Y \sqrt{1 - \rho^2}} \exp\left( -\frac{1}{2(1 - \rho^2)} \left[ \frac{(x - \mu_X)^2}{\sigma_X^2} + \frac{(y - \mu_Y)^2}{\sigma_Y^2} - \frac{2\rho(x - \mu_X)(y - \mu_Y)}{\sigma_X \sigma_Y} \right] \right) \end{equation}

where:

  • $\mu_X$, $\mu_Y$ are the means of $X$ and $Y$
  • $\sigma_X^2$, $\sigma_Y^2$ are the variances of $X$ and $Y$
  • $\rho$ is the correlation coefficient between $X$ and $Y$, given by $\rho = \frac{\sigma_{XY}}{\sigma_X \sigma_Y}$

Below is an example of how the PDF of a 2D Gaussian random variable looks; the sketch that follows produces such a plot.
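
A minimal plotting sketch (assuming NumPy, SciPy, and Matplotlib; the mean and covariance values are illustrative):

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import multivariate_normal

mu = [0.0, 0.0]
cov = [[1.0, 0.5], [0.5, 1.0]]   # unit variances, rho = 0.5 (illustrative)

x, y = np.meshgrid(np.linspace(-3, 3, 100), np.linspace(-3, 3, 100))
z = multivariate_normal(mu, cov).pdf(np.dstack((x, y)))

ax = plt.figure().add_subplot(projection="3d")
ax.plot_surface(x, y, z, cmap="viridis")
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.set_zlabel("f(x, y)")
plt.show()
```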

Key Properties of the Bivariate Normal Distribution

  1. Marginal Distributions:

    • The marginal distribution of $X$ is $X \sim N(\mu_X, \sigma_X^2)$.
    • The marginal distribution of $Y$ is $Y \sim N(\mu_Y, \sigma_Y^2)$.
    • This means that each of $X$ and $Y$ is normally distributed on its own, while their joint behavior is governed by the covariance (equivalently, the correlation).
  2. Independence:

    • If $\rho = 0$, then $X$ and $Y$ are independent. In other words, zero correlation implies independence for jointly normal random variables.

    • Example:

      • Let $X$ and $Y$ be independent random variables with means $0$ and standard deviations $1$. This means $\rho = 0$. Then the joint PDF simplifies to:

      \begin{equation} f_{X,Y}(x, y) = \frac{1}{2\pi} \exp\left( -\frac{1}{2} \left[ x^2 + y^2 \right] \right) \end{equation}

      This is simply the product of two univariate standard normal PDFs.

  3. Conditional Distribution:

    • The conditional distribution of $Y$ given $X = x$ is normal, with the following parameters (these formulas are checked by simulation in the sketch after this list):

      • Mean: $\mu_{Y|X} = \mu_Y + \rho \frac{\sigma_Y}{\sigma_X}(x - \mu_X)$
      • Variance: $\sigma_{Y|X}^2 = (1 - \rho^2) \sigma_Y^2$
    • Example:

      • Suppose $X$ and $Y$ are jointly normal with $\mu_X = 2$, $\mu_Y = 3$, $\sigma_X = 1$, $\sigma_Y = 2$, and $\rho = 0.5$. If $X = 3$, then the conditional distribution of $Y$ given $X = 3$ is:

      \begin{equation} Y \mid X = 3 \sim N\left( 3 + 0.5 \times \frac{2}{1}(3 - 2),\; (1 - 0.5^2) \times 2^2 \right) \end{equation}

      Simplifying:

      \begin{equation} Y \mid X = 3 \sim N(4, 3) \end{equation}

      Therefore, $Y$ given $X = 3$ is normally distributed with a mean of 4 and a variance of 3.

  4. Covariance Matrix:

    • The covariance matrix $\Sigma$ for a bivariate normal distribution summarizes the variances and covariances of the random variables:

      \begin{equation} \Sigma = \begin{pmatrix} \sigma_X^2 & \sigma_{XY} \\ \sigma_{XY} & \sigma_Y^2 \end{pmatrix} \end{equation}

    • In terms of the correlation coefficient $\rho$, the covariance $\sigma_{XY}$ is given by:

      \begin{equation} \sigma_{XY} = \rho \sigma_X \sigma_Y \end{equation}

      Therefore, the covariance matrix can also be expressed as:

      \begin{equation} \Sigma = \begin{pmatrix} \sigma_X^2 & \rho \sigma_X \sigma_Y \\ \rho \sigma_X \sigma_Y & \sigma_Y^2 \end{pmatrix} \end{equation}
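
The marginal and conditional formulas above can be checked by simulation. A minimal sketch (assuming NumPy), using the parameters from the conditional example ($\mu_X = 2$, $\mu_Y = 3$, $\sigma_X = 1$, $\sigma_Y = 2$, $\rho = 0.5$):

```python
import numpy as np

rng = np.random.default_rng(2)
rho, sx, sy = 0.5, 1.0, 2.0
mu = np.array([2.0, 3.0])
cov = np.array([[sx ** 2, rho * sx * sy],
                [rho * sx * sy, sy ** 2]])

xy = rng.multivariate_normal(mu, cov, size=1_000_000)
x, y = xy[:, 0], xy[:, 1]

# Marginal of X: should be close to N(2, 1)
print(x.mean(), x.var())

# Conditional of Y given X ≈ 3: should be close to N(4, 3)
near = np.abs(x - 3.0) < 0.02
print(y[near].mean(), y[near].var())
```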

Example: Computing Covariance Matrix and Joint PDF

Let $X$ and $Y$ be jointly normal with the following parameters:

  • $\mu_X = 1$, $\mu_Y = 2$
  • $\sigma_X = 1$, $\sigma_Y = 2$
  • $\rho = 0.6$
  1. Covariance Matrix: Using $\sigma_{XY} = \rho \sigma_X \sigma_Y = 0.6 \times 1 \times 2 = 1.2$, the covariance matrix is:

    \begin{equation} \Sigma = \begin{pmatrix} 1 & 1.2 \\ 1.2 & 4 \end{pmatrix} \end{equation}

  2. Joint PDF: The joint PDF is:

    \begin{equation} f_{X,Y}(x, y) = \frac{1}{2\pi \cdot 1 \cdot 2 \cdot \sqrt{1 - 0.6^2}} \exp\left( -\frac{1}{2(1 - 0.6^2)} \left[ \frac{(x - 1)^2}{1^2} + \frac{(y - 2)^2}{2^2} - \frac{2 \cdot 0.6 \cdot (x - 1)(y - 2)}{1 \cdot 2} \right] \right) \end{equation}

    Simplifying the coefficient in front:

    \begin{equation} \frac{1}{2\pi \cdot 2 \cdot \sqrt{1 - 0.36}} = \frac{1}{4\pi \cdot \sqrt{0.64}} = \frac{1}{4\pi \cdot 0.8} = \frac{1}{3.2\pi} \end{equation}

    Therefore, the joint PDF is:

    \begin{equation} f_{X,Y}(x, y) = \frac{1}{3.2\pi} \exp\left( -\frac{1}{1.28} \left[ (x - 1)^2 + \frac{(y - 2)^2}{4} - 0.6(x - 1)(y - 2) \right] \right) \end{equation}
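
To check the algebra, the sketch below (assuming NumPy and SciPy) evaluates the simplified expression at an arbitrary test point and compares it with `scipy.stats.multivariate_normal`:

```python
import numpy as np
from scipy.stats import multivariate_normal

mu = [1.0, 2.0]
cov = [[1.0, 1.2], [1.2, 4.0]]
x, y = 1.5, 2.5   # arbitrary test point

# Simplified joint PDF derived above
manual = (1 / (3.2 * np.pi)) * np.exp(
    -(1 / 1.28) * ((x - 1) ** 2 + (y - 2) ** 2 / 4 - 0.6 * (x - 1) * (y - 2))
)
print(manual)
print(multivariate_normal(mu, cov).pdf([x, y]))  # should agree
```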

Linear Combinations of Gaussian Random Variables

A key result of the bivariate normal distribution is that any linear combination of $X$ and $Y$, say $Z = aX + bY$, is also normally distributed.

  • The mean of $Z$ is: $\mu_Z = a\mu_X + b\mu_Y$
  • The variance of $Z$ is: $\sigma_Z^2 = a^2 \sigma_X^2 + b^2 \sigma_Y^2 + 2ab\,\sigma_{XY}$

Example:

Let $X$ and $Y$ be jointly normal with the following parameters:

  • $\mu_X = 1$, $\mu_Y = 2$
  • $\sigma_X = 1$, $\sigma_Y = 2$
  • $\rho = 0.5$, so $\sigma_{XY} = 1$

Now consider $Z = X + 2Y$.

  1. Mean: \begin{equation} \mu_Z = \mu_X + 2\mu_Y = 1 + 2 \times 2 = 5 \end{equation}

  2. Variance: \begin{equation} \sigma_Z^2 = 1^2 \times 1^2 + 2^2 \times 2^2 + 2 \times 1 \times 2 \times 1 = 1 + 16 + 4 = 21 \end{equation}

Thus, $Z \sim N(5, 21)$.
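
A Monte Carlo check of this result (a sketch assuming NumPy):

```python
import numpy as np

rng = np.random.default_rng(3)
mu = [1.0, 2.0]
cov = [[1.0, 1.0], [1.0, 4.0]]   # sigma_X^2 = 1, sigma_Y^2 = 4, sigma_XY = 1

xy = rng.multivariate_normal(mu, cov, size=1_000_000)
z = xy[:, 0] + 2 * xy[:, 1]      # Z = X + 2Y
print(z.mean(), z.var())         # ≈ 5 and ≈ 21
```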

Isocontours

A simple and intuitive way to visualize and understand bivariate Gaussian random variables is through iso-contours. Formally, for a function $f$, iso-contours are defined as the set of points given by

\begin{equation} \{ x \in \mathbb{R}^2 : f(x) = c \} \quad \text{for some } c \in \mathbb{R} \end{equation}

Now we study the shape of the iso-contours, which will be helpful in visualizing the distribution of 2D Gaussian random variables. To obtain the shape of the iso-contours, we need to solve the equation $p(x; \mu, \Sigma) = c$ for some constant $c \in \mathbb{R}$. For simplicity, consider a diagonal covariance matrix $\Sigma = \operatorname{diag}(\sigma_1^2, \sigma_2^2)$ (i.e., $\rho = 0$), for which the PDF factorizes as:

\begin{equation} p(x; \mu, \Sigma) = \frac{1}{2\pi \sigma_1 \sigma_2} \exp\left( -\frac{1}{2\sigma_1^2}(x_1 - \mu_1)^2 - \frac{1}{2\sigma_2^2}(x_2 - \mu_2)^2 \right) \end{equation}

Now, let's consider the level set consisting of all points where $p(x; \mu, \Sigma) = c$ for some constant $c \in \mathbb{R}$. In particular, consider the set of all $x_1, x_2 \in \mathbb{R}$ such that

\begin{equation} \begin{aligned} c & = \frac{1}{2\pi \sigma_1 \sigma_2} \exp\left( -\frac{1}{2\sigma_1^2}(x_1 - \mu_1)^2 - \frac{1}{2\sigma_2^2}(x_2 - \mu_2)^2 \right) \\ 2\pi c \sigma_1 \sigma_2 & = \exp\left( -\frac{1}{2\sigma_1^2}(x_1 - \mu_1)^2 - \frac{1}{2\sigma_2^2}(x_2 - \mu_2)^2 \right) \\ \log\left( 2\pi c \sigma_1 \sigma_2 \right) & = -\frac{1}{2\sigma_1^2}(x_1 - \mu_1)^2 - \frac{1}{2\sigma_2^2}(x_2 - \mu_2)^2 \\ \log\left( \frac{1}{2\pi c \sigma_1 \sigma_2} \right) & = \frac{1}{2\sigma_1^2}(x_1 - \mu_1)^2 + \frac{1}{2\sigma_2^2}(x_2 - \mu_2)^2 \\ 1 & = \frac{(x_1 - \mu_1)^2}{2\sigma_1^2 \log\left( \frac{1}{2\pi c \sigma_1 \sigma_2} \right)} + \frac{(x_2 - \mu_2)^2}{2\sigma_2^2 \log\left( \frac{1}{2\pi c \sigma_1 \sigma_2} \right)} \end{aligned} \end{equation}

Defining

\begin{equation} r_1 = \sqrt{2\sigma_1^2 \log\left( \frac{1}{2\pi c \sigma_1 \sigma_2} \right)} \quad r_2 = \sqrt{2\sigma_2^2 \log\left( \frac{1}{2\pi c \sigma_1 \sigma_2} \right)} \end{equation}

it follows that

\begin{equation} 1 = \left( \frac{x_1 - \mu_1}{r_1} \right)^2 + \left( \frac{x_2 - \mu_2}{r_2} \right)^2 \end{equation}

The obtained equation is that of an axis-aligned ellipse with center $(\mu_1, \mu_2)$, whose axis along $x_1$ has length $2r_1$ and whose axis along $x_2$ has length $2r_2$. Thus, the iso-contours of Gaussian random vectors are ellipses.
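
A quick numerical confirmation (a sketch assuming NumPy; $\sigma_1$, $\sigma_2$, and the level $c$ are illustrative, with $c$ chosen below the peak density $\frac{1}{2\pi\sigma_1\sigma_2}$): every point on the ellipse with radii $r_1$, $r_2$ should have density exactly $c$.

```python
import numpy as np

s1, s2, mu1, mu2 = 1.0, 2.0, 0.0, 0.0   # illustrative parameters
c = 0.02                                 # level below the peak 1/(2*pi*s1*s2) ≈ 0.0796

r1 = np.sqrt(2 * s1 ** 2 * np.log(1 / (2 * np.pi * c * s1 * s2)))
r2 = np.sqrt(2 * s2 ** 2 * np.log(1 / (2 * np.pi * c * s1 * s2)))

t = np.linspace(0, 2 * np.pi, 16)        # parameterize the ellipse
x1 = mu1 + r1 * np.cos(t)
x2 = mu2 + r2 * np.sin(t)

# Evaluate the diagonal-covariance PDF on the ellipse
p = np.exp(-(x1 - mu1) ** 2 / (2 * s1 ** 2)
           - (x2 - mu2) ** 2 / (2 * s2 ** 2)) / (2 * np.pi * s1 * s2)
print(np.allclose(p, c))                 # True
```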


Note that when $\sigma_1 = \sigma_2$, we have $r_1 = r_2$, and the ellipse reduces to a circle. It is also interesting to note that, in the general (correlated) case, the orientation of the ellipse's principal axis reflects the covariance between the two marginals $X$ and $Y$: a positive $\rho$ gives the major axis a positive slope, and a negative $\rho$ a negative slope.
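
The following sketch (assuming NumPy, SciPy, and Matplotlib; unit variances and the $\rho$ values are illustrative) draws iso-contours for $\rho = 0$, $\rho > 0$, and $\rho < 0$, showing the circle and the positively and negatively sloped ellipses described above:

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import multivariate_normal

x, y = np.meshgrid(np.linspace(-3, 3, 200), np.linspace(-3, 3, 200))
pos = np.dstack((x, y))

fig, axes = plt.subplots(1, 3, figsize=(12, 4))
for ax, rho in zip(axes, (0.0, 0.7, -0.7)):
    cov = [[1.0, rho], [rho, 1.0]]       # unit variances, correlation rho
    ax.contour(x, y, multivariate_normal([0.0, 0.0], cov).pdf(pos))
    ax.set_title(f"rho = {rho}")
    ax.set_aspect("equal")
plt.show()
```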