Functions of a Random Variable
If X is a random variable and Y=g(X), then Y itself is a random variable. Consequently, we can discuss its PMF, CDF, and expected value. The range of Y can be written as:
$$R_Y=\{g(x)\mid x\in R_X\},$$ where $R_X$ is the range of $X$.
To find the PMF of Y=g(X) given the PMF of X, we can write:
$$P_Y(y)=P(Y=y)=P(g(X)=y)=\sum_{x:\,g(x)=y}P_X(x)$$
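As a minimal sketch of this formula, take a hypothetical $X$ uniform on $\{-2,-1,0,1,2\}$ and $Y=X^2$; the PMF of $Y$ collects the probability of every $x$ that maps to the same $y$:

```python
from collections import defaultdict

# Hypothetical PMF of X: uniform on {-2, -1, 0, 1, 2}
P_X = {x: 1 / 5 for x in [-2, -1, 0, 1, 2]}

def pmf_of_g(P_X, g):
    """PMF of Y = g(X): sum P_X(x) over all x with g(x) = y."""
    P_Y = defaultdict(float)
    for x, p in P_X.items():
        P_Y[g(x)] += p
    return dict(P_Y)

P_Y = pmf_of_g(P_X, lambda x: x ** 2)  # Y = X^2
# P_Y == {4: 0.4, 1: 0.4, 0: 0.2}: both x = -2 and x = 2 map to y = 4, etc.
```

Note how the probabilities of $x=-1$ and $x=1$ merge into $P_Y(1)$, exactly as the sum over $\{x:g(x)=y\}$ prescribes.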
1 Expected Value of a Function of a Random Variable (LOTUS)
Let $X$ be a discrete random variable with PMF $P_X(x)$, and let $Y=g(X)$. Suppose we want to find $E[Y]$. One approach is to first find the PMF of $Y$ and then use the expectation formula $E[Y]=E[g(X)]=\sum_{y\in R_Y}y\,P_Y(y)$. However, a more convenient method is the law of the unconscious statistician (LOTUS).
Law of the Unconscious Statistician (LOTUS) for Discrete Random Variables:
$$E[g(X)]=\sum_{x_k\in R_X}g(x_k)\,P_X(x_k)$$
This can be proved by expressing $E[Y]=E[g(X)]=\sum_{y\in R_Y}y\,P_Y(y)$ in terms of $P_X(x)$. In practice, LOTUS is usually easier than the direct definition, since it avoids computing the PMF of $Y$ altogether.
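A quick numerical check that the two routes agree, using the same hypothetical $X$ uniform on $\{-2,\dots,2\}$ with $g(x)=x^2$:

```python
from collections import defaultdict

# Hypothetical setup: X uniform on {-2, ..., 2}, g(x) = x^2
P_X = {x: 1 / 5 for x in [-2, -1, 0, 1, 2]}
g = lambda x: x ** 2

# LOTUS: sum g(x) P_X(x) over the range of X
E_lotus = sum(g(x) * p for x, p in P_X.items())

# Direct route: first build P_Y, then sum y P_Y(y) over the range of Y
P_Y = defaultdict(float)
for x, p in P_X.items():
    P_Y[g(x)] += p
E_direct = sum(y * p for y, p in P_Y.items())

assert abs(E_lotus - E_direct) < 1e-12  # both give E[X^2] = 2.0
```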
For a random variable $Y$, whether discrete or continuous, and a function $g:\mathbb{R}\to\mathbb{R}$, $W=g(Y)$ is also a random variable. Its distribution (PMF or PDF), mean, variance, etc., will generally differ from those of $Y$. Transformations of random variables are crucial in statistics.
Theorem:
Suppose Y is a random variable, g is a transformation, and W=g(Y). Then:
- If $Y$ is discrete, with PMF $p_Y$, we have:
$$E[W]=\sum_{y\in S_Y}g(y)\,p_Y(y)$$
- If $Y$ is continuous, with PDF $f_Y$, we have:
$$E[W]=\int_{-\infty}^{\infty}g(y)\,f_Y(y)\,dy$$
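The continuous case can be sketched numerically. Assuming a hypothetical $Y\sim\mathrm{Uniform}(0,1)$ and $g(y)=y^2$, the integral $\int_0^1 y^2\cdot 1\,dy=1/3$ is approximated by a midpoint Riemann sum:

```python
# Y ~ Uniform(0, 1), so f_Y(y) = 1 on (0, 1); take g(y) = y^2.
# E[g(Y)] = integral of g(y) f_Y(y) dy over (0, 1), which equals 1/3.
n = 100_000
h = 1.0 / n
# Midpoint rule: evaluate g at the center of each subinterval
E = sum(((i + 0.5) * h) ** 2 * 1.0 * h for i in range(n))
# E is very close to 1/3
```

The same loop with a different `g` or a different density `f_Y` approximates any expectation of this form.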
2 The CDF Method
The formulas in this theorem let us compute expectations, but they do not provide the distribution of $W=g(Y)$. To find the CDF $F_W$ of $W$, given the CDF $F_Y$ of $Y$, we can write:
$$F_W(w)=P[W\le w]=P[g(Y)\le w]$$
The probability on the right needs to be expressed in terms of $Y$. If $g$ is strictly increasing, it admits an inverse function $g^{-1}$ and we can write:
$$F_W(w)=P[g(Y)\le w]=P[Y\le g^{-1}(w)]=F_Y(g^{-1}(w))$$
For strictly decreasing $g$:
$$P[g(Y)\le w]=P[Y\ge g^{-1}(w)]$$
In the continuous case, $P[Y\ge y]=1-F_Y(y)$, so:
$$F_W(w)=P[g(Y)\le w]=P[Y\ge g^{-1}(w)]=1-F_Y(g^{-1}(w))$$
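As a hypothetical worked example of the CDF method with an increasing $g$: take $Y\sim\mathrm{Exponential}(1)$, so $F_Y(y)=1-e^{-y}$, and $W=e^Y$. Then $g^{-1}(w)=\ln w$ and $F_W(w)=F_Y(\ln w)=1-1/w$ for $w\ge 1$. The sketch below checks this against a seeded Monte Carlo estimate:

```python
import math
import random

random.seed(0)

# W = e^Y with Y ~ Exponential(1); g(y) = e^y is strictly increasing.
# CDF method: F_W(w) = F_Y(ln w) = 1 - e^{-ln w} = 1 - 1/w for w >= 1.
def F_W(w):
    return 1 - 1 / w if w >= 1 else 0.0

# Monte Carlo check: empirical fraction of samples with W <= w
n = 200_000
samples = [math.exp(random.expovariate(1.0)) for _ in range(n)]
w = 2.0
empirical = sum(s <= w for s in samples) / n
# empirical should be close to F_W(2.0) = 0.5
```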
3 Functions of Two Random Variables
For two discrete random variables X and Y, and Z=g(X,Y), we can determine the PMF of Z as:
$$P_Z(z)=P(g(X,Y)=z)=\sum_{(x_i,y_j)\in A_z}P_{XY}(x_i,y_j),\quad\text{where } A_z=\{(x_i,y_j)\in R_{XY}: g(x_i,y_j)=z\}$$
For E[g(X,Y)], we can use LOTUS:
LOTUS for two discrete random variables:
$$E[g(X,Y)]=\sum_{(x_i,y_j)\in R_{XY}}g(x_i,y_j)\,P_{XY}(x_i,y_j)$$
Linearity of Expectation: For two discrete random variables X and Y, E[X+Y]=E[X]+E[Y].
Let g(X,Y)=X+Y. Using LOTUS, we have:
$$\begin{aligned}
E[X+Y]&=\sum_{(x_i,y_j)\in R_{XY}}(x_i+y_j)\,P_{XY}(x_i,y_j)\\
&=\sum_{(x_i,y_j)\in R_{XY}}x_i\,P_{XY}(x_i,y_j)+\sum_{(x_i,y_j)\in R_{XY}}y_j\,P_{XY}(x_i,y_j)\\
&=\sum_{x_i\in R_X}\sum_{y_j\in R_Y}x_i\,P_{XY}(x_i,y_j)+\sum_{x_i\in R_X}\sum_{y_j\in R_Y}y_j\,P_{XY}(x_i,y_j)\\
&=\sum_{x_i\in R_X}x_i\sum_{y_j\in R_Y}P_{XY}(x_i,y_j)+\sum_{y_j\in R_Y}y_j\sum_{x_i\in R_X}P_{XY}(x_i,y_j)\\
&=\sum_{x_i\in R_X}x_i\,P_X(x_i)+\sum_{y_j\in R_Y}y_j\,P_Y(y_j)\quad\text{(marginal PMFs)}\\
&=E[X]+E[Y]
\end{aligned}$$
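The derivation above can be confirmed numerically on a small hypothetical joint PMF; no independence is assumed, only that the probabilities sum to 1:

```python
# Hypothetical joint PMF on {0, 1} x {0, 1, 2}; entries sum to 1
P_XY = {
    (0, 0): 0.10, (0, 1): 0.20, (0, 2): 0.10,
    (1, 0): 0.15, (1, 1): 0.25, (1, 2): 0.20,
}

# LOTUS with g(x, y) = x + y
E_sum = sum((x + y) * p for (x, y), p in P_XY.items())

# Marginal expectations, summing out the other variable
E_X = sum(x * p for (x, y), p in P_XY.items())
E_Y = sum(y * p for (x, y), p in P_XY.items())

assert abs(E_sum - (E_X + E_Y)) < 1e-12  # linearity of expectation
```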
Functions of Two Continuous Random Variables
For two continuous random variables $X$ and $Y$ and a function $Z=g(X,Y)$, the concepts are similar. For $E[g(X,Y)]$, we use LOTUS:
LOTUS for two continuous random variables:
$$E[g(X,Y)]=\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}g(x,y)\,f_{XY}(x,y)\,dx\,dy$$
If Z=g(X,Y) and we are interested in its distribution, we can start by writing:
$$F_Z(z)=P(Z\le z)=P(g(X,Y)\le z)=\iint_{D}f_{XY}(x,y)\,dx\,dy$$
where $D=\{(x,y)\mid g(x,y)\le z\}$. To find the PDF of $Z$, we differentiate $F_Z(z)$ with respect to $z$.
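To make the region integral concrete, assume hypothetically that $X,Y$ are i.i.d. $\mathrm{Uniform}(0,1)$ and $Z=X+Y$. Then $D=\{(x,y): x+y\le z\}$ and, for $0\le z\le 1$, geometry gives $F_Z(z)=z^2/2$. A midpoint grid over the unit square sketches the double integral:

```python
# X, Y i.i.d. Uniform(0,1): f_XY(x, y) = 1 on the unit square.
# F_Z(z) = double integral of f_XY over D = {(x,y): x + y <= z};
# for 0 <= z <= 1 this is the triangle area z^2 / 2.
def F_Z(z, n=400):
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h          # midpoint of cell in x
        for j in range(n):
            y = (j + 0.5) * h      # midpoint of cell in y
            if x + y <= z:          # indicator of the region D
                total += 1.0 * h * h  # f_XY(x, y) * cell area
    return total

# F_Z(0.5) should be close to 0.5**2 / 2 = 0.125
```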
4 Different Views of a Function of a Random Variable (FRV)
There are several different but essentially equivalent views of a function of a random variable (FRV). We will present two of them, highlighting their differences in emphasis.
Assume we have an underlying probability space $\mathcal{P}=(\Omega,\mathcal{F},P)$ and a random variable $X$ defined on it. Recall that $X$ is a rule that assigns a number $X(\zeta)$ to every $\zeta\in\Omega$. $X$ transforms the $\sigma$-field of events $\mathcal{F}$ into the Borel $\sigma$-field $\mathcal{B}$ of sets of numbers on the real line. If $R_X$ denotes the subset of the real line reached by $X$ as $\zeta$ ranges over $\Omega$, we can regard $X$ as an ordinary function with domain $\Omega$ and range $R_X$. Now, consider a measurable real function $g(x)$ of the real variable $x$.
i) First View ($Y:\Omega\to R_Y$)
For every $\zeta\in\Omega$, we generate a number $g(X(\zeta))=Y(\zeta)$. The rule $Y$, which generates the numbers $\{Y(\zeta)\}$ for random outcomes $\zeta\in\Omega$, is an RV with domain $\Omega$ and range $R_Y\subset\mathbb{R}$. For every Borel set of real numbers $B_Y$, the set $\{\zeta: Y(\zeta)\in B_Y\}$ is an event. Specifically, the event $\{\zeta: Y(\zeta)\le y\}$ is equal to the event $\{\zeta: g(X(\zeta))\le y\}$.
In this view, the emphasis is on Y as a mapping from Ω to RY, with the intermediate role of X being suppressed.
ii) Second View ($Y: R_X\to R_Y$)
For every value of $X(\zeta)$ in the range $R_X$, we generate a new number $Y=g(X)$ whose range is $R_Y$. The rule $Y$, whose domain is $R_X$ and range is $R_Y$, is a function of the random variable $X$. Here, the focus is on viewing $Y$ as a mapping from one set of real numbers to another. A model for this view is to regard $X$ as the input to a system with transformation function $g(\cdot)$. For such a system, an input $x$ is transformed to an output $y=g(x)$, and an input function $X$ is transformed to an output function $Y=g(X)$.
In general, we will write $\{Y\le y\}=\{X\in C_y\}$ in the sequel. For $C_y$ so determined, it follows that:
$$P[Y\le y]=P[X\in C_y]$$
If $C_y$ is empty, then the probability of $\{Y\le y\}$ is zero.
When dealing with the input–output model, it is convenient to omit references to an abstract underlying experiment and deal directly with the RVs $X$ and $Y$. In this approach, the observations on $X$ are the underlying experiments, events are Borel subsets of the real line $\mathbb{R}$, and the set function $P[\cdot]$ is replaced by the distribution function $F_X(\cdot)$. Then $Y$ is a mapping (an RV) whose domain is the range $R_X$ of $X$, and whose range $R_Y$ is a subset of $\mathbb{R}$. The functional properties of $X$ are ignored in favor of viewing $X$ as a mechanism that gives rise to numerically valued random phenomena. In this view, the domain of $X$ is irrelevant.
Additional discussion on the various views of an FRV is available in the literature.