Discrete and Continuous Random Variables

There are two important classes of random variables: discrete random variables and continuous random variables.

Discrete Random Variables

A random variable is called discrete if its range (the set of values that it can take) is finite or at most countably infinite. A random variable that can take an uncountably infinite number of values is not discrete. For example, consider the experiment of choosing a point $a$ from the interval $[-1, 1]$. The random variable that associates the numerical value $X(a) = a^2$ with the outcome $a$ is not discrete, since its range is $[0, 1]$. On the other hand, the random variable that associates with $a$ the numerical value

$$ X(a) = \begin{cases} 1, & a > 0 \\ 0, & a = 0 \\ -1, & a < 0 \end{cases} \tag{6.2} $$

is discrete.

For a discrete random variable X, we define the probability mass function (pmf) of X by

$$ p_X(a) = P(X = a) = P(\{\omega \mid X(\omega) = a\}). $$

Note that $p_X(\cdot)$ is a valid pmf if and only if its values are nonnegative and satisfy the normalization condition

$$ \sum_{i=1}^{\infty} p_X(x_i) = 1, $$

where $\{x_1, x_2, \dots\}$ is the range of the random variable $X$.
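As a quick check, the following Python sketch (our own illustration, standard library only) encodes the pmf of the sign random variable from Eq. (6.2) as a dictionary and verifies the validity conditions, assuming for concreteness that the point $a$ is drawn uniformly from $[-1, 1]$; the name `p_X` is ours.

```python
# pmf of the sign random variable of Eq. (6.2): if a is drawn uniformly
# from [-1, 1], then P(X = 1) = P(X = -1) = 1/2 and P(X = 0) = 0.
p_X = {1: 0.5, 0: 0.0, -1: 0.5}

assert all(p >= 0 for p in p_X.values())     # nonnegative values
assert abs(sum(p_X.values()) - 1.0) < 1e-12  # values sum to 1
```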

The CDF of a random variable $X$ is defined as:

$$ F_X(x) = P(X \leq x) \quad \text{for all } x \in \mathbb{R}. $$

For a discrete random variable $X$ with range $R_X = \{x_1, x_2, x_3, \ldots\}$ (with $x_1 < x_2 < x_3 < \ldots$):

$$ F_X(x) = \sum_{x_k \leq x} p_X(x_k). $$
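To make the summation concrete, here is a small Python helper (our own illustrative code) that evaluates the CDF of a discrete random variable by summing the pmf over all $x_k \leq x$:

```python
def discrete_cdf(pmf, x):
    """F_X(x) for a discrete random variable: sum p_X(x_k) over x_k <= x."""
    return sum(p for xk, p in pmf.items() if xk <= x)

# Example with the sign random variable of Eq. (6.2):
pmf = {-1: 0.5, 0: 0.0, 1: 0.5}
print(discrete_cdf(pmf, -0.5))  # 0.5: the jump at x = -1 is included
print(discrete_cdf(pmf, 2.0))   # 1.0: all mass lies at or below 2
```

Note that the resulting CDF is a staircase function that jumps by $p_X(x_k)$ at each point of the range.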

Types of Discrete Random Variables

Bernoulli Random Variable

Consider the toss of a biased coin, which comes up a head with probability $p$ and a tail with probability $1 - p$. The Bernoulli random variable takes the two values 1 and 0, depending on whether the outcome is a head or a tail:

$$ X(T) = 0, \quad X(H) = 1. $$

The probability mass function (pmf) of the Bernoulli random variable is given by

$$ p_X(0) = 1 - p, \quad p_X(1) = p. $$

The cumulative distribution function (cdf) of the Bernoulli random variable is given by

$$ F_X(x) = \begin{cases} 0, & x < 0 \\ 1 - p, & 0 \le x < 1 \\ 1, & x \ge 1 \end{cases} $$
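The pmf and CDF above translate directly into code. The sketch below is a minimal illustration (function names are ours), including a simulation that estimates $P(X = 1)$ empirically:

```python
import random

def bernoulli_pmf(k, p):
    """pmf of Bernoulli(p): p at k = 1, and 1 - p at k = 0."""
    return p if k == 1 else 1 - p

def bernoulli_cdf(x, p):
    """cdf of Bernoulli(p): 0 below 0, then 1 - p on [0, 1), then 1."""
    if x < 0:
        return 0.0
    return 1 - p if x < 1 else 1.0

# Simulate 10,000 tosses of a coin with head probability p = 0.3.
p = 0.3
tosses = [1 if random.random() < p else 0 for _ in range(10_000)]
print(sum(tosses) / len(tosses))  # empirical estimate of P(X = 1), close to 0.3
```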

Binomial Random Variable

A biased coin is tossed $n$ times. At each toss, the coin comes up a head with probability $p$ and a tail with probability $1 - p$, independently of prior tosses. The sample space is the set of all $2^n$ possible tuples of H, T combinations. For the case of $n = 4$, the sample space is given below:

$$ \Omega = \{TTTT, TTTH, TTHT, TTHH, THTT, THTH, THHT, THHH, HTTT, HTTH, HTHT, HTHH, HHTT, HHTH, HHHT, HHHH\}. $$

For any $\omega \in \Omega$, $X(\omega)$ is defined as the number of heads in $\omega$. The range of values which the random variable $X$ takes is $\{0, 1, \dots, n\}$. The probability mass function (pmf) of the random variable $X$ is given by

$$ p_X(k) = \binom{n}{k} p^k (1-p)^{n-k}, \quad 0 \le k \le n. $$

This random variable is known as a binomial random variable. Note that the above pmf is valid, as it sums to 1 by the binomial theorem:

$$ \sum_{k=0}^{n} p_X(k) = \sum_{k=0}^{n} \binom{n}{k} p^k (1-p)^{n-k} = (p + (1-p))^n = 1. $$
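For instance, a short Python sketch (illustrative; uses only `math.comb` from the standard library) evaluates this pmf and confirms numerically that it sums to 1:

```python
from math import comb

def binomial_pmf(k, n, p):
    """pmf of Binomial(n, p): C(n, k) * p^k * (1 - p)^(n - k)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 4, 0.6
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]
print(pmf)       # P(X = 0), ..., P(X = 4) for four tosses
print(sum(pmf))  # 1.0, as guaranteed by the binomial theorem
```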

Geometric Random Variable

Suppose that we repeatedly and independently toss a biased coin with probability of a head $p$, where $0 < p < 1$, until a head comes up for the first time. The sample space corresponding to this experiment is given by

$$ \Omega = \{H, TH, TTH, TTTH, \dots\}. $$

For any $\omega \in \Omega$, $X(\omega)$ is defined as the number of tosses in $\omega$. The range of values which the random variable $X$ takes is $\{1, 2, \dots\}$. The probability mass function (pmf) of the random variable $X$ is given by

$$ p_X(k) = (1-p)^{k-1} p, \quad k \in \{1, 2, \dots\}. $$

This random variable is known as a geometric random variable. Note that the above pmf is valid, as it sums to 1 by the geometric series formula:

$$ \sum_{k=1}^{\infty} p_X(k) = \sum_{k=1}^{\infty} (1-p)^{k-1} p = p \sum_{k=0}^{\infty} (1-p)^k = p \cdot \frac{1}{1-(1-p)} = 1. $$
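Since the sum is infinite, a numerical check can only truncate it; the sketch below (our own illustration) shows that the partial sums approach 1 quickly, because the neglected tail $(1-p)^K$ decays geometrically:

```python
def geometric_pmf(k, p):
    """pmf of Geometric(p): the first head appears on toss k."""
    return (1 - p) ** (k - 1) * p

p = 0.25
partial = sum(geometric_pmf(k, p) for k in range(1, 201))
print(partial)  # ~1.0; the neglected tail is (1 - p)**200, which is tiny
```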

Poisson Random Variable

A Poisson random variable with parameter $\lambda > 0$ takes nonnegative integer values. Its pmf is given by

$$ p_X(k) = \frac{e^{-\lambda} \lambda^k}{k!}, \quad k = 0, 1, 2, \dots $$

Note that the above pmf is valid, as it sums to 1 by the Taylor series of $e^{\lambda}$:

$$ \sum_{k=0}^{\infty} p_X(k) = e^{-\lambda} \sum_{k=0}^{\infty} \frac{\lambda^k}{k!} = e^{-\lambda} e^{\lambda} = 1. $$

An important property of the Poisson random variable is that it may be used to approximate a binomial random variable when the binomial parameter $n$ is large and $p$ is small, with $\lambda = np$.
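The following sketch (our own illustration, standard library only) compares the two pmfs for a large $n$ and small $p$ with $\lambda = np$:

```python
from math import comb, exp, factorial

def poisson_pmf(k, lam):
    """pmf of Poisson(lam): e^(-lam) * lam^k / k!."""
    return exp(-lam) * lam**k / factorial(k)

def binomial_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Large n, small p, with lam = n * p held moderate.
n, p = 1000, 0.003
lam = n * p  # lam = 3
for k in range(6):
    print(k, round(binomial_pmf(k, n, p), 6), round(poisson_pmf(k, lam), 6))
# The two columns agree closely for every k.
```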

Continuous Random Variables

Random variables with a continuous range of possible values are common. For example, the exact velocity of a vehicle on a highway is a continuous random variable. The CDF of a continuous random variable is a continuous function, meaning it does not have jumps. This aligns with the fact that $P(X = x) = 0$ for all $x$.

For a continuous random variable $X$, the probability of it taking any single value is zero, so we use a probability density function (PDF), denoted $f_X(x)$, to describe its distribution.

The PDF is defined as the derivative of the cumulative distribution function (CDF), $F_X(x)$, wherever the derivative exists:

$$ f_X(x) = \frac{dF_X(x)}{dx} $$

The probability that $X$ falls within an interval $[a, b]$ is the integral of the PDF over that interval:

$$ P(a \leq X \leq b) = \int_a^b f_X(x) \, dx $$

A valid PDF must satisfy two conditions: $f_X(x) \geq 0$ for all $x$, and its total integral must be one, $\int_{-\infty}^{\infty} f_X(x) \, dx = 1$.

Conversely, the CDF can be obtained from the PDF by integrating from negative infinity up to a point $x$:

$$ F_X(x) = P(X \leq x) = \int_{-\infty}^{x} f_X(u) \, du $$
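When the integral has no convenient closed form, the CDF can be approximated numerically. Below is a minimal sketch (our own; the function names, the midpoint rule, and the choice of lower limit are all illustrative assumptions):

```python
def cdf_from_pdf(pdf, x, lower, steps=100_000):
    """Approximate F_X(x) as the integral of the pdf from `lower` to x,
    using the midpoint rule. `lower` should sit far below any region
    where the pdf is nonzero."""
    if x <= lower:
        return 0.0
    h = (x - lower) / steps
    return h * sum(pdf(lower + (i + 0.5) * h) for i in range(steps))

# Sanity check with the Uniform(0, 1) density:
def uniform01(u):
    return 1.0 if 0.0 < u < 1.0 else 0.0

print(cdf_from_pdf(uniform01, 0.3, lower=-1.0))  # ~0.3
```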

Types of Continuous Random Variables

Uniform Random Variable

A continuous random variable $X$ is uniformly distributed over $[a, b]$, denoted $X \sim \text{Uniform}(a, b)$, if:

$$ f_X(x) = \begin{cases} \frac{1}{b-a} & a < x < b \\ 0 & \text{otherwise} \end{cases} $$

The CDF is:

$$ F_X(x) = \begin{cases} 0 & x < a \\ \frac{x - a}{b - a} & a \leq x < b \\ 1 & x \geq b \end{cases} $$
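These formulas translate directly into code; the sketch below (illustrative names, our own) implements both:

```python
def uniform_pdf(x, a, b):
    """pdf of Uniform(a, b): constant 1/(b - a) on (a, b), 0 elsewhere."""
    return 1 / (b - a) if a < x < b else 0.0

def uniform_cdf(x, a, b):
    """cdf of Uniform(a, b): a linear ramp from a up to b."""
    if x < a:
        return 0.0
    return (x - a) / (b - a) if x < b else 1.0

print(uniform_cdf(0.25, 0.0, 1.0))  # 0.25
```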

Exponential Random Variable

The exponential distribution models the time between events. A continuous random variable $X$ is exponentially distributed with parameter $\lambda > 0$, denoted $X \sim \text{Exponential}(\lambda)$, if:

$$ f_X(x) = \begin{cases} \lambda e^{-\lambda x} & x > 0 \\ 0 & \text{otherwise} \end{cases} $$

The CDF is:

$$ F_X(x) = \begin{cases} 1 - e^{-\lambda x} & x \geq 0 \\ 0 & x < 0 \end{cases} $$
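As a quick illustration (our own sketch), interval probabilities follow as differences of CDF values:

```python
from math import exp

def exponential_pdf(x, lam):
    """pdf of Exponential(lam): lam * e^(-lam * x) for x > 0, else 0."""
    return lam * exp(-lam * x) if x > 0 else 0.0

def exponential_cdf(x, lam):
    """cdf of Exponential(lam): 1 - e^(-lam * x) for x >= 0, else 0."""
    return 1 - exp(-lam * x) if x >= 0 else 0.0

lam = 2.0
# P(0.5 <= X <= 1.5) = F_X(1.5) - F_X(0.5):
print(exponential_cdf(1.5, lam) - exponential_cdf(0.5, lam))  # ~0.318
```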

Normal Distribution

The Central Limit Theorem (CLT) states that the appropriately normalized sum of a large number of independent random variables is approximately normal. A standard normal random variable $Z$ is denoted $Z \sim N(0, 1)$ and has PDF:

$$ f_Z(z) = \frac{1}{\sqrt{2\pi}} \exp\left\{ -\frac{z^2}{2} \right\} $$

The CDF is:

$$ F_Z(z) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{z} \exp\left\{ -\frac{u^2}{2} \right\} \, du $$
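This integral has no elementary closed form, but it can be expressed through the error function, which the Python standard library provides. The sketch below (our own illustration) uses the identity $F_Z(z) = \frac{1}{2}\left(1 + \operatorname{erf}(z/\sqrt{2})\right)$:

```python
from math import erf, exp, pi, sqrt

def standard_normal_pdf(z):
    """pdf of N(0, 1)."""
    return exp(-z**2 / 2) / sqrt(2 * pi)

def standard_normal_cdf(z):
    """cdf of N(0, 1), via the error function identity."""
    return 0.5 * (1 + erf(z / sqrt(2)))

print(standard_normal_cdf(0.0))   # 0.5, by symmetry of the pdf
print(standard_normal_cdf(1.96))  # ~0.975, the familiar two-sided 95% point
```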