EE 364 Supplemental – Week 05 (Part 1)

Random Variables, CDF/PMF, Continuous Distributions, Gaussian

Kolmogorov – Foundations

Grundbegriffe (1933).

Object	Formalism
\(\Omega\)	sample space
\((\Omega, \mathcal{A})\)	measurable space: sample space with sigma-algebra \(\mathcal{A}\) of events (the sets that can be assigned a probability)
\(P\)	CAT measure
r.v. \(X\)	measurable function (PAM) – “random experiment”

Definition: Random Variable

\(X: \Omega \to \mathbb{R}\) is a random variable (r.v.) on \((\Omega, \mathcal{A})\)

iff PAM: Pullbacks are Always Measurable

iff \(\forall B \in \mathcal{B}(\mathbb{R})\): \(\boxed{X^{-1}(B) \in \mathcal{A}}\)

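where \(\mathcal{B}(\mathbb{R})\) = Borel sigma-algebra on \(\mathbb{R}\): the smallest sigma-algebra that contains all intervals in \(\mathbb{R}\) (its members are the Borel sets).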

\(\therefore P(X^{-1}(B))\) defined if \(X^{-1}(B) \in \mathcal{A}\), since \(P: \mathcal{A} \to [0,1]\).

\(\therefore X\) is not a random variable if \(\exists B \in \mathcal{B}(\mathbb{R})\) with \(X^{-1}(B) \notin \mathcal{A}\).

Example: Coin Flip

Ex: Flip coin twice. \(\Omega = \{H,T\} \times \{H,T\}\), \(\mathcal{A} = 2^\Omega\), \(P(H) = p\), \(P(T) = 1 - p = q\).

Define:

\(X = \text{\# heads in 2 flips}\)
\(Y = (\text{\# H})^2\)
\(Z = (\text{\# H})^3 - 1\)

Uncountably infinite possible r.v.’s on same sample space.

	TT	TH	HT	HH
\(X\)	0	1	1	2
\(Y\)	0	1	1	4
\(Z\)	\(-1\)	0	0	7

Notation “\(P[X = 0]\)” doesn’t quite make sense – “\(X\)” is a function, not a constant. Shorthand.

\[ \begin{aligned} P[X = 0] &= P[X^{-1}(\{0\})] = P[\{TT\}] = q^2 \\ P[X = 1] &= P[X^{-1}(\{1\})] = P[\{TH, HT\}] = 2pq \\ P[X = 2] &= P[X^{-1}(\{2\})] = P[\{HH\}] = p^2 \\ P[X = 3] &= P[X^{-1}(\{3\})] = P[\emptyset] = 0 \end{aligned} \]

Also: \(P[Z = 0] = P[Z^{-1}(\{0\})] = P[\{TH, HT\}] = 2pq\).
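
The pullback computations above can be checked by brute-force enumeration of \(\Omega\). The Python sketch below is illustrative only: the helper names (`prob`, `pullback`, `P_rv`) and the value p = 0.3 are assumptions, not part of the notes, and independent flips are assumed.

```python
from itertools import product

p = 0.3          # assumed P(H); any value in [0, 1] works
q = 1 - p

# Sample space: ordered pairs of flips, e.g. ('H', 'T')
omega = list(product("HT", repeat=2))

def prob(event):
    """P of an event (a set of outcomes), assuming independent flips."""
    total = 0.0
    for w in event:
        heads = w.count("H")
        total += p**heads * q**(2 - heads)
    return total

def X(w):
    return w.count("H")            # number of heads

def Z(w):
    return w.count("H")**3 - 1     # (number of heads)^3 - 1

def pullback(rv, B):
    """Preimage rv^{-1}(B) = {w in omega : rv(w) in B}."""
    return {w for w in omega if rv(w) in B}

def P_rv(rv, value):
    """P[rv = value], computed as P(rv^{-1}({value}))."""
    return prob(pullback(rv, {value}))

for k in range(4):
    print(k, sorted(pullback(X, {k})), P_rv(X, k))
```

The value k = 3 pulls back to the empty set, so its probability is 0, matching the last line of the table of equations.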

Example: Indicator Function

\(I_A: \Omega \to \{0, 1\}\) with \(A \subset \Omega\) and \(A \in \mathcal{A}\).

\[ I_A(\omega) = \begin{cases} 1 & \text{if } \omega \in A \\ 0 & \text{if } \omega \notin A \end{cases} \]

\[ \begin{aligned} I_A^{-1}(\{0\}) &= A^c \in \mathcal{A} &&\text{[by CUT]} \\ I_A^{-1}(\{1\}) &= A \in \mathcal{A} \end{aligned} \]

\(\therefore I_A\) is a r.v. (is “measurable”).

“Fuzzy” set \(\leftrightarrow\) \(I_A: \Omega \to [0,1]\) vs. \(\{0,1\}\).

Example: PAM Can Fail

Ex: Define \(\Omega = (0, 10]\).

\[ \mathcal{A} = \{\emptyset, (0,3), [3,10], \Omega\} \]

\(X: \Omega \to \Omega\), \(X(\omega) = \omega\) (identity function).

Sigma-algebra on the range: \(\mathcal{B}((0,10])\).

Pick \(B = (2, 4) \in \mathcal{B}((0,10])\):

\[ \therefore X^{-1}(B) = X^{-1}((2,4)) = (2,4) \notin \mathcal{A} \]

\(\therefore X\) is NOT a r.v. (PAM fails).
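
A finite analog makes the failure easy to test by machine. In the sketch below the sample space {1, ..., 10}, the two-block sigma-algebra, and the functions `identity` and `coarse` are all hypothetical stand-ins chosen for illustration; they mimic the interval example but are not the same objects.

```python
from itertools import combinations

omega = frozenset(range(1, 11))      # finite stand-in for (0, 10]
low = frozenset({1, 2})              # stand-in for the lower block
high = omega - low                   # stand-in for the upper block
sigma_algebra = {frozenset(), low, high, omega}

def is_sigma_algebra(family, space):
    """Contains the space, closed under complement and pairwise union."""
    if space not in family:
        return False
    for a in family:
        if space - a not in family:
            return False
        for b in family:
            if a | b not in family:
                return False
    return True

def preimage(f, B):
    """f^{-1}(B) = {w in omega : f(w) in B}."""
    return frozenset(w for w in omega if f(w) in B)

def is_measurable(f):
    """PAM check: every subset of the finite range must pull back into the sigma-algebra."""
    values = sorted({f(w) for w in omega})
    for r in range(len(values) + 1):
        for B in combinations(values, r):
            if preimage(f, set(B)) not in sigma_algebra:
                return False
    return True

def identity(w):
    return w

def coarse(w):
    return 0 if w in low else 1      # constant on each block

B = {2, 3}                           # straddles the two blocks
print(sorted(preimage(identity, B)), preimage(identity, B) in sigma_algebra)
print(is_measurable(identity), is_measurable(coarse))
```

The identity map fails because it can distinguish points that the sigma-algebra cannot; a function that is constant on each block passes.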

Probability Mass Function (PMF)

Definition: PMF

The probability mass function (p.m.f.): \[ p_X(x_k) = P[X = x_k] \] where \(x_k\) are the outcomes from a random experiment with countable sample space.

Two properties:

\(p_X(x_k) \geq 0 \quad \forall x_k \in \Omega\)
\(\displaystyle\sum_{x_k \in \Omega} p_X(x_k) = 1\)

Ex: Binomial “distribution” (random variable): \[ P[X = k] = \binom{n}{k} p^k (1-p)^{n-k} \]

\(k\) successes in \(n\) flips.
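
As a quick check of the two PMF properties for the binomial case, here is a minimal Python sketch. The function name `binomial_pmf` and the parameters n = 10, p = 0.3 are assumptions made for the example.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P[X = k] for X ~ binomial(n, p); zero outside 0..n."""
    if not 0 <= k <= n:
        return 0.0
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 10, 0.3                      # assumed example parameters
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]

print(all(v >= 0 for v in pmf))     # property 1: non-negative
print(sum(pmf))                     # property 2: sums to 1 (up to rounding)
```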

Cumulative Distribution Function (CDF)

Definition: CDF

Say \(X: \Omega \to \mathbb{R}\) is a r.v. on p.s. \((\Omega, \mathcal{A}, P)\).

Then the CDF: \[ F_X(x) \triangleq P(X \leq x) = P(\{\omega \in \Omega : X(\omega) \leq x\}) = P(X^{-1}((-\infty, x])) \]

Note

CDF \(F_X\) always exists.

Random Variable	Realization
\(X\)	\(x\)
function	constant
foot	footprint

CDF for Discrete Random Variables

\[ F_X(x) = P[X \leq x] = \sum_{x_k \leq x} p_X(x_k) \]

Ex: Bernoulli random variable. \(P[X=1] = p\), \(P[X=0] = 1-p\).

Note

PMF is a discrete function, defined only for discrete argument \(x_k \in \Omega\). CDF is defined \(\forall x \in \mathbb{R}\).

Properties of the CDF

\(F_X(-\infty) = 0\)
\(F_X(u_1) \leq F_X(u_2)\) if \(u_1 \leq u_2\) – \(F_X\) is nondecreasing
\(F_X(\infty) = 1\)
\(F_X(b) = \lim_{h \to 0^+} F_X(b+h)\) – \(F_X\) is right continuous
\(P[X = b] = \lim_{h \to 0^+} F_X(b+h) - \lim_{h \to 0^-} F_X(b+h)\) – the “jump”

“approach from right” \(-\) “approach from left”

Working with the CDF

Consider event \(\{X \leq u_1\} = \{X \leq u_0\} \cup \{u_0 < X \leq u_1\}\) for \(u_0 < u_1\) (disjoint).

\[ \therefore P[\{X \leq u_1\}] = P[\{X \leq u_0\}] + P[\{u_0 < X \leq u_1\}] \]

Important

\[ P[u_0 < X \leq u_1] = F_X(u_1) - F_X(u_0) \]

\[ P[u_0 \leq X \leq u_1] = P[X = u_0] + P[u_0 < X \leq u_1] = F_X(u_1) - F_X(u_0) + p_X(u_0) \]

Ex: Count # heads in 3 fair coin flips. \(X \sim b(3, 0.5)\).
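
The three-flip example can be worked exactly with rational arithmetic. The sketch below is one possible way to do it; the names `pmf` and `cdf` and the choice of endpoints \(u_0 = 0\), \(u_1 = 2\) are assumptions for illustration.

```python
from fractions import Fraction
from math import comb

n, p = 3, Fraction(1, 2)

# PMF of X ~ b(3, 1/2): {0: 1/8, 1: 3/8, 2: 3/8, 3: 1/8}
pmf = {k: comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)}

def cdf(x):
    """F_X(x) = sum of p_X(x_k) over x_k <= x; defined for every real x."""
    return sum((pk for k, pk in pmf.items() if k <= x), Fraction(0))

u0, u1 = 0, 2
half_open = cdf(u1) - cdf(u0)            # P[u0 < X <= u1]
closed = cdf(u1) - cdf(u0) + pmf[u0]     # P[u0 <= X <= u1]

print(pmf)
print(cdf(-1), cdf(1.5), cdf(3))         # 0, 1/2, 1
print(half_open, closed)                 # 3/4, 7/8
```

Note that `cdf(1.5)` equals `cdf(1)`: between mass points the discrete CDF is flat, and the extra term `pmf[u0]` is exactly the jump at the left endpoint.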

Measuring the Size of Sets

Definition: Injective (1-to-1)

\(f: X \to Y\) is 1-to-1 (“injective”) iff \[ \forall x\, \forall z:\; f(x) = f(z) \implies x = z \]

Definition: Surjective (Onto)

\(f: X \to Y\) is onto (“surjective”) iff \[ \forall y \in Y\; \exists x \in X:\; y = f(x) \qquad (f(X) = Y) \]

Definition: Bijection

\(f: X \to Y\) is a bijection (“1-to-1 correspondence”) iff \(f\) is 1-to-1 and onto.

\(|A|\) = cardinality (“size”) of set \(A\).

Set	Definition
\(\mathbb{N}\)	\(\{1, 2, 3, \ldots\}\) – natural numbers
\(\mathbb{Z}\)	\(\{0, \pm 1, \pm 2, \pm 3, \ldots\}\) – integers; \(\mathbb{Z}^+ = \mathbb{N}\)
\(\mathbb{R}\)	\((-\infty, \infty)\) – reals

Definition: Finite / Infinite

\(A\) is finite iff \(A \xleftrightarrow[\text{onto}]{\text{1-to-1}} S\) for some \(S = \{1, \ldots, n\} \subset \mathbb{N}\), i.e. \(A = \{a_1, \ldots, a_n\}\) for some \(n \in \mathbb{N}\) (or \(A = \emptyset\)).

Else \(A\) is infinite.

Fact: \(A\) infinite \(\iff\) \(A \xleftrightarrow[\text{onto}]{\text{1-to-1}} B\) for some \(B \subset A\) and \(B \neq A\).

\(\therefore\) infinite \(\iff\) \(\sim\)finite.

Countability

Cantor’s Theorem

\(|X| < |2^X|\)

Definitions

Denumerable: \(A\) is denumerable iff \(A \xleftrightarrow[\text{onto}]{\text{1-to-1}} \mathbb{N}\) (e.g. \(\mathbb{Z}\)).
Countable: \(A\) is countable iff (1) \(A\) is finite, or (2) \(A\) is denumerable.

Facts:

\(|\mathbb{Z}| = |\mathbb{N}| = \aleph_0\) (aleph-nought)
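\(|\mathbb{R}| = |2^{\mathbb{Z}}| = 2^{\aleph_0} = \mathfrak{c}\) (power of the continuum)
Under the generalized continuum hypothesis: \(|A| = \aleph_k \implies |2^A| = \aleph_{k+1}\) (\(k = 0, 1, \ldots\)); in particular \(\mathfrak{c} = \aleph_1\)

Cantor’s Continuum Hypothesis: There is no cardinality \(\omega\) such that \(\aleph_0 < \omega < \mathfrak{c}\) (equivalently, \(\mathfrak{c} = \aleph_1\)).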

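Ex: \((0,1)\) same size as \(\mathbb{R}\). \(f: \mathbb{R} \to (0,1)\), \(f(x) = \frac{1}{1 + e^{-x}}\) is 1-to-1 and onto.

Note: adding 2 points \(0, 1 \to [0,1]\): still the same size as \(\mathbb{R}\) (a 1-to-1 onto map exists), but no continuous 1-to-1 onto map with \(\mathbb{R}\).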
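
The map \(f(x) = \frac{1}{1 + e^{-x}}\) quoted above sends every real input into \((0,1)\), and it can be undone by \(f^{-1}(y) = \ln\frac{y}{1-y}\). The Python sketch below spot-checks this round trip numerically; the function names `logistic` and `logit` and the sample points are assumptions for the example, and a numerical check is of course not a proof.

```python
import math

def logistic(x):
    """f(x) = 1 / (1 + e^{-x}); maps the real line into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def logit(y):
    """Inverse map: f^{-1}(y) = ln(y / (1 - y)) for 0 < y < 1."""
    return math.log(y / (1.0 - y))

for x in (-5.0, -1.0, 0.0, 2.0, 5.0):
    y = logistic(x)
    print(x, y, logit(y))       # logit(logistic(x)) recovers x
```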

Aside: \(\pm \infty\) not real numbers.

Continuous Probability Distributions (“Densities”)

Suppose sample space \(\Omega\) not countable (e.g. \(\Omega = [0,1]\), \(\Omega = \mathbb{R}\)).

Assume \(X\) has no point masses (an uncountable \(\Omega\) alone does not guarantee this). \(\therefore\) CDF continuous.

Definition: Continuity

\(f: \mathbb{R} \to \mathbb{R}\) is continuous at \(x_0 \in \mathbb{R}\) iff \[ \forall \varepsilon > 0\; \exists \delta > 0:\; |x - x_0| < \delta \implies |f(x) - f(x_0)| < \varepsilon \]

Inputs close \(\implies\) outputs close.

\(f\) continuous at \(a\) (\(\therefore\) left & right continuous) \(\iff \lim_{x \to a} f(x) = f(a)\) (if \(f\) defined at \(a\)).

Reminder: CDF only requires right continuous.

\(f\) is continuous iff it is continuous at every \(x_0 \in \mathbb{R}\).

Fact: \[ P[X = b] = \lim_{h \to 0^+} F_X(b+h) - \lim_{h \to 0^-} F_X(b+h) = 0 \quad \text{if } F_X \text{ continuous} \]

Not “close to zero” or “very small” – identically 0.

\(\therefore\) If \(X\) is a continuous random variable (\(\therefore\) CDF continuous), then \(P[X = x] = 0 \quad \forall x \in \mathbb{R}\).

i.e. \(P[X = a] = P[X = b] = 0\).

Important

\[ P[a < X \leq b] = P[a \leq X \leq b] = P[a \leq X < b] = P[a < X < b] \]

Probability Density Function (PDF)

Q: If \(P[X = x] = 0\) for all \(x\), how to describe probability of an outcome? (e.g. \(P[\text{height} = 6\text{ft}] = 0\))

A: Use density of probability “near” the outcome. (e.g. \(P[\text{height} \approx 6\text{ft}]\))

Theorem

CDF \(F\) is absolutely continuous iff \(\exists\) a probability density function (pdf) \(f\): \[ F(x) = \int_{-\infty}^{x} f(t)\, dt \]

\(\therefore\) (1) \(f = F' = \frac{dF}{dx}\), and (2) \(\int_{-\infty}^{\infty} f(x)\, dx = 1\).

Definition: PDF

The probability density function (pdf) of a continuous random variable \(X\): \[ \boxed{f_X(x) = \frac{dF_X}{dx}} \]

(if it exists – iff \(X\) is “absolutely continuous”)

\[ \therefore F_X(x) = \int_{-\infty}^{x} f_X(t)\, dt \]

Fundamental Theorem of Calculus

\[ \int_a^b f(x)\, dx = F(b) - F(a) \]

where \(F\) is the anti-derivative.

\[ \therefore P[a \leq X \leq b] = F_X(b) - F_X(a) = \int_a^b f_X(x)\, dx \quad \text{(if it exists)} \]

Behavior of interior is completely determined by behavior at boundary.
Simplifies operation of probability because it hides underlying complexity.

Differentiable Functions

Definition: Differentiable

Function \(f\) is differentiable at \(a\) if \[ \lim_{x \to a} \frac{f(x) - f(a)}{x - a} = \lim_{h \to 0} \frac{f(a+h) - f(a)}{h} = f'(a) \quad \text{exists.} \]

Ex: \(f(x) = |x|\) – continuous everywhere, not differentiable (at \(x = 0\)).

Theorem

Differentiable \(\implies\) Continuous.

Proof. \[ \begin{aligned} f(x) &= f(x) + f(a) - f(a) = \left(\frac{f(x) - f(a)}{x - a}\right)(x - a) + f(a) \\[6pt] \therefore \lim_{x \to a} f(x) &= \lim_{x \to a}\left[\left(\frac{f(x) - f(a)}{x - a}\right)(x - a) + f(a)\right] \\ &= \underbrace{\lim_{x \to a}\frac{f(x) - f(a)}{x - a}}_{= f'(a) \text{ exists by assumption}} \cdot \underbrace{\lim_{x \to a}(x - a)}_{= 0} + f(a) \\ &= f(a) \end{aligned} \]

\(\therefore \lim_{x \to a} f(x) = f(a)\), \(\therefore f\) continuous at \(a\), \(\therefore f\) continuous (since \(\forall a\)). \(\square\)

Insight: PDF as Density

The pdf specifies the density of probability.

\[ \begin{aligned} P[x < X \leq x + \Delta x] &= F_X(x + \Delta x) - F_X(x) \\ &= \Delta x \cdot \frac{F_X(x + \Delta x) - F_X(x)}{\Delta x} \end{aligned} \]

\(\therefore\) for small \(\Delta x\): \(\approx f_X(x) \cdot \Delta x\)

density = prob / length, so prob \(\approx\) length \(\cdot\) density.

So \(f_X(x)\, \Delta x\) approximates the probability that \(X\) falls in a small interval (width \(\Delta x\)) starting at \(x\); \(f_X(x)\) itself is a density, not a probability, and can exceed 1.
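
The approximation prob \(\approx\) density \(\cdot\) length can be watched converging numerically. The sketch below uses the standard normal as an assumed test case, with its CDF written through `math.erf`; the names `phi_pdf`, `Phi`, `interval_prob`, and the point x = 1 are illustrative choices.

```python
import math

def phi_pdf(z):
    """Standard normal density."""
    return math.exp(-z * z / 2) / math.sqrt(2 * math.pi)

def Phi(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def interval_prob(x, dx):
    """Exact P[x < Z <= x + dx] = F(x + dx) - F(x)."""
    return Phi(x + dx) - Phi(x)

x = 1.0
for dx in (0.5, 0.1, 0.01, 0.001):
    exact = interval_prob(x, dx)
    approx = phi_pdf(x) * dx            # density times length
    print(dx, exact, approx, exact / approx)
```

The ratio in the last column tends to 1 as \(\Delta x\) shrinks, which is the statement \(f_X = F_X'\) in numerical form.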

Properties of the PDF

\(f_X(x) \geq 0\)
\(P[a \leq X \leq b] = \int_a^b f_X(x)\, dx\) – “area under the curve”
\(F_X(x) = \int_{-\infty}^{x} f_X(t)\, dt\)
\(\int_{-\infty}^{\infty} f_X(t)\, dt = 1\) – “normalization condition”

Note

(by the normalization condition) Given any non-negative (piecewise continuous) function \(g(x)\) with \(\int_{-\infty}^{\infty} g(x)\, dx = c\), \(0 < c < \infty\), then \(f_X(x) = \frac{g(x)}{c}\) is a valid pdf.
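
A small numerical sketch of this normalization recipe follows. The choice \(g(x) = x(1-x)\) on \([0,1]\) (zero elsewhere), for which \(c = 1/6\), is an assumed example, and `integrate` is a plain midpoint rule written for the illustration, not a library routine.

```python
def g(x):
    """Assumed non-negative example: x(1 - x) on [0, 1], zero elsewhere."""
    return x * (1 - x) if 0 <= x <= 1 else 0.0

def integrate(func, a, b, n=100_000):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))

c = integrate(g, 0.0, 1.0)          # normalizing constant, about 1/6

def f(x):
    """Normalized version of g: a valid pdf."""
    return g(x) / c

print(c)                            # close to 0.16666...
print(integrate(f, 0.0, 1.0))       # close to 1
print(f(0.5))                       # close to 1.5, a density value above 1
```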

BEG CUP: Important EE 364 Distributions

Letter	Distribution	Notes
B	Binomial	(and discrete sampling)
E	Exponential (Gamma)	“waiting time”
G	Gaussian / Normal	“thin-tailed” bell
C	Cauchy	“thick-tailed” bell
U	Uniform
P	Poisson

Uniform PDF

\(X \sim U[a, b]\): \[ f_X(x) = \frac{1}{b - a} \quad \text{if } x \in [a, b], \quad \text{else } 0. \]

\(\therefore\) CDF is a “ramp.”

Standard Uniform: \(Z \sim U(0, 1)\).

Ex: \(X \sim U[0, 10]\)

\[ f_X(x) = \begin{cases} \frac{1}{10} & 0 \leq x \leq 10 \\ 0 & \text{else} \end{cases} \qquad F_X(x) = \begin{cases} 0 & x < 0 \\ \frac{x}{10} & 0 \leq x < 10 \\ 1 & x \geq 10 \end{cases} \]

\[ P[X \leq 5] = \int_{-\infty}^{5} f_X(x)\, dx = \int_0^5 \tfrac{1}{10}\, dx = \tfrac{1}{2} \]

\[ P[2 \leq X \leq 6] = \int_2^6 \tfrac{1}{10}\, dx = \tfrac{4}{10} = \tfrac{2}{5} \]
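
The two uniform probabilities can be reproduced either from the ramp CDF or by integrating the flat density. The sketch below does both; the function names and the midpoint-rule helper are assumptions made for the example.

```python
def uniform_pdf(x, a, b):
    """Density of U[a, b]."""
    return 1.0 / (b - a) if a <= x <= b else 0.0

def uniform_cdf(x, a, b):
    """CDF of U[a, b]: 0, then a ramp, then 1."""
    if x < a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)

def midpoint_integral(func, lo, hi, n=1000):
    """Midpoint-rule approximation of the integral of func over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(func(lo + (i + 0.5) * h) for i in range(n))

a, b = 0.0, 10.0
print(uniform_cdf(5, a, b))                                   # P[X <= 5]
print(uniform_cdf(6, a, b) - uniform_cdf(2, a, b))            # P[2 <= X <= 6] via CDF
print(midpoint_integral(lambda x: uniform_pdf(x, a, b), 2, 6))  # same, via the pdf
```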

Later: Beta random variable.

Gaussian (Normal) PDF

\(X \sim N(\mu, \sigma^2)\): \[ f_X(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2} \]

\(\mu \in \mathbb{R}\) (mean), \(\sigma > 0\) (standard deviation), \(\sigma^2\) = variance, \(x \in \mathbb{R}\).

Standard Normal

\[ Z \sim N(0, 1): \quad f_Z(z) = \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2} \]

Standardize: \(Z \triangleq \frac{X - \mu}{\sigma}\)

Coverage	\(z\)-value
68.3%	\(\pm 1\)
95%	\(\pm 1.96\) (\(\approx \pm 2\sigma\))
95.4%	\(\pm 2\)
99%	\(\pm 2.58\)
99.7%	\(\pm 3\)
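
The coverage figures in the table can be recomputed from the identity \(P[-z \leq Z \leq z] = \operatorname{erf}(z/\sqrt{2})\), which follows from the definition of the error function. A minimal Python sketch, with the function name `coverage` assumed for illustration:

```python
import math

def coverage(z):
    """P[-z <= Z <= z] for Z ~ N(0, 1), using erf."""
    return math.erf(z / math.sqrt(2))

for z in (1.0, 1.96, 2.0, 2.58, 3.0):
    print(f"+/- {z}: {100 * coverage(z):.2f}%")
```

The tabulated percentages are these values rounded, so 1.96 and 2.58 give 95% and 99% only to the displayed precision.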

Standardization and the \(\Phi\)-function

Fact: Gaussian CDF does not have a closed form in terms of elementary functions (but CDF does exist).

\[ F_Z(z) = \Phi(z) \triangleq \int_{-\infty}^{z} f_Z(t)\, dt = \int_{-\infty}^{z} \frac{1}{\sqrt{2\pi}}\, e^{-t^2/2}\, dt \]

(“Phi-function”)

Standardize: \(f_X(x) = \frac{1}{\sigma} f_Z\!\left(\frac{x - \mu}{\sigma}\right)\)

\[ F_X(x) = \int_{-\infty}^{x} f_X(t)\, dt = \int_{-\infty}^{x} \frac{1}{\sigma} f_Z\!\left(\frac{t - \mu}{\sigma}\right) dt = \int_{-\infty}^{\frac{x-\mu}{\sigma}} f_Z(z)\, dz = \Phi\!\left(\frac{x - \mu}{\sigma}\right) \]

Ex: Suppose \(X \sim N(\mu, \sigma^2)\):

\[ P[a < X \leq b] = \Phi\!\left(\frac{b - \mu}{\sigma}\right) - \Phi\!\left(\frac{a - \mu}{\sigma}\right) \]

Symmetry: \[ \Phi(-z) = 1 - \Phi(z) \qquad \text{i.e. } P[Z \leq -z] = P[Z \geq z] \]

Advanced result: \[ \lim_{\sigma \to 0} \frac{1}{\sigma} f_Z\!\left(\frac{x_0 - \mu}{\sigma}\right) = \delta(x_0 - \mu) \]

Gaussian Integral

Theorem

\[ \int_{-\infty}^{\infty} e^{-x^2}\, dx = \sqrt{\pi} \]

Proof. \[ \begin{aligned} \int_{-\infty}^{\infty} e^{-x^2}\, dx &= \sqrt{\left(\int_{-\infty}^{\infty} e^{-x^2}\, dx\right)\left(\int_{-\infty}^{\infty} e^{-y^2}\, dy\right)} = \sqrt{\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} e^{-(x^2 + y^2)}\, dx\, dy} \end{aligned} \]

Polar coordinates: \(dx\, dy \mapsto r\, dr\, d\theta\).

\[ = \sqrt{\int_{\theta=0}^{2\pi}\int_{r=0}^{\infty} r\, e^{-r^2}\, dr\, d\theta} = \sqrt{2\pi \cdot \int_0^{\infty} r\, e^{-r^2}\, dr} = \sqrt{2\pi \cdot \tfrac{1}{2}} = \sqrt{\pi} \quad \square \]
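
As a sanity check on the proof, both the final value \(\sqrt{\pi}\) and the radial integral \(\int_0^\infty r e^{-r^2}\, dr = \tfrac{1}{2}\) can be approximated numerically. In the sketch below the truncation of the infinite range to \([-10, 10]\) and the midpoint-rule helper are assumptions; the neglected tails are of order \(e^{-100}\).

```python
import math

def midpoint_integral(func, lo, hi, n):
    """Midpoint-rule approximation of the integral of func over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(func(lo + (i + 0.5) * h) for i in range(n))

# Full Gaussian integral, truncated to [-10, 10]
gauss = midpoint_integral(lambda x: math.exp(-x * x), -10.0, 10.0, 20_000)

# Radial integral from the polar-coordinate step, truncated to [0, 10]
radial = midpoint_integral(lambda r: r * math.exp(-r * r), 0.0, 10.0, 20_000)

print(gauss, math.sqrt(math.pi))              # both about 1.7724538509
print(radial)                                 # about 0.5
print(math.sqrt(2 * math.pi * radial))        # again about sqrt(pi)
```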

Normal Table Values

\(X \sim N(\mu, \sigma^2)\), CDF: \(F_X(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \int_{-\infty}^{x} e^{-\frac{1}{2}\left(\frac{w-\mu}{\sigma}\right)^2} dw\) – no closed form.

\(Z \sim N(0,1)\): \(\Phi(z) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{z} e^{-w^2/2}\, dw\) \(\therefore\) use table.

See: Normal (Z) Table

Ex: \(X \sim N(1, 4)\) (so \(\mu = 1\), \(\sigma = 2\)):

\[ \begin{aligned} P[X \leq 3] &= F_X(3) = F_Z\!\left(\tfrac{3-1}{2}\right) = \Phi(1) \\[4pt] P[X \leq -1] &= P\!\left[\tfrac{X-1}{2} \leq \tfrac{-1-1}{2}\right] = \Phi(-1) \\[4pt] P[-1 \leq X \leq 5] &= P\!\left[\tfrac{-1-1}{2} \leq Z \leq \tfrac{5-1}{2}\right] = \Phi(2) - \Phi(-1) \\[4pt] P[X \geq 2] &= 1 - P[X \leq 2] = 1 - P\!\left[Z \leq \tfrac{2-1}{2}\right] = 1 - \Phi(0.5) = \Phi(-0.5) \end{aligned} \]

Note on last line: \(\frac{2-1}{2} = 0.5\), and \(1 - \Phi(0.5) = \Phi(-0.5)\) by symmetry.
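
In place of a printed table, the four probabilities for \(X \sim N(1, 4)\) can be evaluated with the error function, using \(\Phi(z) = \tfrac{1}{2}\left(1 + \operatorname{erf}(z/\sqrt{2})\right)\). The sketch below is one way to do this; the helper names `Phi` and `normal_cdf` are assumptions for the example.

```python
import math

def Phi(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def normal_cdf(x, mu, sigma):
    """F_X(x) for X ~ N(mu, sigma^2), by standardizing."""
    return Phi((x - mu) / sigma)

mu, sigma = 1.0, 2.0                     # N(1, 4): variance 4, so sigma = 2

p_le_3 = normal_cdf(3, mu, sigma)                                 # Phi(1)
p_le_m1 = normal_cdf(-1, mu, sigma)                               # Phi(-1)
p_between = normal_cdf(5, mu, sigma) - normal_cdf(-1, mu, sigma)  # Phi(2) - Phi(-1)
p_ge_2 = 1 - normal_cdf(2, mu, sigma)                             # 1 - Phi(0.5)

print(round(p_le_3, 4), round(p_le_m1, 4), round(p_between, 4), round(p_ge_2, 4))
```

The last value agrees with \(\Phi(-0.5)\), confirming the symmetry step.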

Q-function

\[ Q(x) \triangleq 1 - \Phi(x) = P(Z > x) = \tfrac{1}{2} - \tfrac{1}{2}\operatorname{erf}\!\left(\tfrac{x}{\sqrt{2}}\right) \]

for “error function” \(\operatorname{erf}(x) = \frac{2}{\sqrt{\pi}} \int_0^x e^{-u^2}\, du\).
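
A minimal Python sketch of the Q-function follows. It uses the complementary error function, \(\operatorname{erfc}(u) = 1 - \operatorname{erf}(u)\), so that \(Q(x) = \tfrac{1}{2}\operatorname{erfc}(x/\sqrt{2})\); this is algebraically the same as the formula above and is a common choice because it loses less precision in the far tail. The function names are assumptions for the example.

```python
import math

def Phi(x):
    """Standard normal CDF."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def Q(x):
    """Tail probability P(Z > x) = 1 - Phi(x), written with erfc."""
    return 0.5 * math.erfc(x / math.sqrt(2))

for x in (0.0, 1.0, 1.96, 3.0):
    print(x, Q(x), 1 - Phi(x))      # the two columns agree
```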