EE 364 Supplemental – Week 04
Discrete Random Experiments
A “Bernoulli trial” (a.k.a. coin flip): binary outcome H/T, T/F, 0/1, Success/Fail, … with \(P[\text{success}] = p\).
A Bernoulli sequence \(X_1, X_2, \ldots\) is a sequence of independent and identical Bernoulli trials.
A sequence of outcomes \(\{X_k\}\) is a random sample iff the outcomes are independent and identically distributed (i.i.d.).
Random sample \(\iff\) i.i.d.
Ex: 2 coin flips.
\[ P[H_1 \cap T_2] \stackrel{\text{indep.}}{=} P[H_1] \cdot P[T_2] \stackrel{\text{ident.}}{=} P[H] \cdot P[T] \quad (= p \cdot q) \]
Binomial Probability “Distribution”
\(p + (1-p) = 1\) \(\implies\) \(1 = (p + q)^n = \sum_{k=0}^{n} \binom{n}{k} p^k (1-p)^{n-k}\)
Idea: \(\binom{n}{k} p^k (1-p)^{n-k}\)
- \(p^k (1-p)^{n-k}\): prob of getting \(k\) “success” and \(n-k\) “fail” (ordered)
- \(\binom{n}{k}\): # different ways to arrange \(k\) “success” and \(n-k\) “fail” (\(n\) choose \(k\))
\[ P[\text{\# success} = k] = \binom{n}{k} p^k (1-p)^{n-k} \stackrel{\Delta}{=} b(n, k, p) \]
\(X\) = # success (in any order) in \(n\) trials.
Binomial PMF
\[ b(n, k, p) = \binom{n}{k} p^k (1-p)^{n-k} \qquad \text{if } k \leq n \text{ and } p \in [0,1] \]
\(p\) = “success probability” \(= P[\text{Success on one Bernoulli Trial}]\).
\(\therefore b(n, k, p) = P[k \text{ successes in } n \text{ trials}]\).
CDF
\[ P[X \leq k] = P[X\!=\!0] + P[X\!=\!1] + \cdots + P[X\!=\!k] = \sum_{j=0}^{k} b(n, j, p) \]
\(\therefore \sum_{k=0}^{n} b(n,k,p) = 1\) by binomial theorem, since \((p + (1-p))^n = 1^n = 1\).
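The PMF and CDF above are easy to check numerically. A minimal sketch using Python's `math.comb` (the helper names `binom_pmf` and `binom_cdf` are mine):

```python
from math import comb

def binom_pmf(n, k, p):
    """b(n, k, p): probability of exactly k successes in n trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binom_cdf(n, k, p):
    """P[X <= k] = sum of b(n, j, p) for j = 0..k."""
    return sum(binom_pmf(n, j, p) for j in range(k + 1))

# Sanity check: the PMF sums to 1 over k = 0..n (binomial theorem).
total = sum(binom_pmf(10, k, 0.3) for k in range(11))
```

Summing the PMF over all \(k\) recovers the binomial-theorem identity \(\sum_k b(n,k,p) = 1\).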
Examples
Ex: \(P[\text{2 heads in 3 flips}] = P[X\!=\!2] = b(3,2,p)\)
Fair coin: \(p = 0.5\). (“2 heads” means exactly 2.)
\[ = \binom{3}{2}(0.5)^2(0.5)^1 = \frac{3}{8} \]
\(P[\text{at most 3 heads in } n \text{ flips}] = P[X \leq 3] = \sum_{k=0}^{3} b(n,k,0.5) = \sum_{k=0}^{3} \binom{n}{k}(0.5)^k(0.5)^{n-k} = \frac{1}{2^n} \sum_{k=0}^{3} \binom{n}{k}\)
Ex:
| \(P[\text{3 heads in 5 flips}]\) | \(= b(5,3,\tfrac{1}{2}) = \binom{5}{3}(\tfrac{1}{2})^3(\tfrac{1}{2})^2\) | \(= \frac{5}{16}\) |
| \(P[\text{3 heads in 6 flips}]\) | \(= b(6,3,\tfrac{1}{2}) = \binom{6}{3}(\tfrac{1}{2})^3(\tfrac{1}{2})^3\) | \(= \frac{5}{16}\) |
| \(P[\text{2 heads in 5 flips}]\) | \(= b(5,2,\tfrac{1}{2}) = \binom{5}{2}(\tfrac{1}{2})^2(\tfrac{1}{2})^3\) | \(= \frac{5}{16}\) |
Pay $1 to play; win $3 if you get 3 heads in 5 flips. Expected winnings \(3 \cdot \frac{5}{16} = \frac{15}{16} < 1\), so the “rational price” (expected return) is below the $1 cost.
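Each of the three probabilities in the table can be verified directly; a quick sketch (helper name `b` is mine):

```python
from math import comb

def b(n, k, p):
    """Binomial PMF b(n, k, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# All three probabilities above come out to 5/16.
p1 = b(5, 3, 0.5)   # 3 heads in 5 flips
p2 = b(6, 3, 0.5)   # 3 heads in 6 flips
p3 = b(5, 2, 0.5)   # 2 heads in 5 flips

# Expected return of the $1 game paying $3 on a win:
expected_return = 3 * p1   # = 15/16 < 1
```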
Binomial at Peak: Stirling’s Approximation
Ex: Flip a coin \(n\) times with \(p = P[H]\). Suppose \(n\) is even. What is the probability of getting \(\frac{n}{2}\) heads in \(n\) flips?
\[ b\!\left(n, \tfrac{n}{2}, p\right) \approx \sqrt{\tfrac{2}{\pi}} \cdot \frac{1}{\sqrt{n}} \cdot (4pq)^{n/2} \]
\(\therefore\) for \(p = q = \frac{1}{2}\): \(b(n,\tfrac{n}{2},p) \approx \sqrt{\frac{2}{\pi}} \cdot \frac{1}{\sqrt{n}}\)
\(\therefore\) slow (square-root) decay.
Compute: \(P[X\!=\!n \text{ successes in } 2n \text{ flips}] = \binom{2n}{n} \frac{1}{2^{2n}}\).
\(\lim_{n \to \infty} \binom{2n}{n} \cdot \frac{1}{2^{2n}} = ?\) (\(= 0\)).
So many ways of getting \(\# \approx n\) but not \(n\) exactly. \(\therefore P[X\!=\!n] \to 0\).
Proof. \[ \begin{aligned} b(n, \tfrac{n}{2}, p) &= \binom{n}{n/2} p^{n/2} q^{n/2} \qquad \text{(since } n \text{ even)} \\ &= \frac{n!}{((n/2)!)^2} (pq)^{n/2} \\ &\approx \frac{\sqrt{2\pi n}\;\frac{n^n}{e^n}\;(pq)^{n/2}}{\left(\sqrt{2\pi \frac{n}{2}} \left(\frac{n}{2}\right)^{n/2} e^{-n/2}\right)^2} &&\text{[Stirling's]} \\ &= \frac{\sqrt{2\pi n}\;\frac{n^n}{e^n}\;(pq)^{n/2}}{\pi n \cdot \frac{n^n}{2^n} \cdot e^{-n}} \\ &= \sqrt{\frac{2}{\pi}} \cdot \frac{1}{\sqrt{n}} \cdot 2^n (pq)^{n/2} \\ &= \sqrt{\frac{2}{\pi}} \cdot \frac{1}{\sqrt{n}} \cdot (4pq)^{n/2} \end{aligned} \]
\(\therefore\) for \(p = q = \frac{1}{2}\): \((4pq)^{n/2} = 1^{n/2} = 1\), so \(b(n,\tfrac{n}{2},p) \approx \sqrt{\frac{2}{\pi}} \cdot \frac{1}{\sqrt{n}}\). \(\square\)
Ex (computer):
\(b(10, 5, \tfrac{1}{2}) = 0.24609\). Stirling: \(b(10,5,\tfrac{1}{2}) \approx \sqrt{2/(10\pi)} = 0.25231\).
Faster decay if \(p \neq q\).
| \(n\) | \(b(n, \frac{n}{2}, \frac{1}{2})\) | Stirling’s Approx. |
|---|---|---|
| 100 | 0.07959 | 0.07979 |
| 1,000 | 0.02522 | 0.02523 |
| 10,000 | 0.00798 | 0.00798 |
| 100,000 | 0.00252 | 0.00252 |
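The table can be regenerated exactly (Python integers make \(\binom{n}{n/2}/2^n\) computable without overflow); the helper names `exact` and `stirling` are mine:

```python
from math import comb, pi, sqrt

def exact(n):
    """b(n, n/2, 1/2) computed exactly (n even)."""
    return comb(n, n // 2) / 2**n

def stirling(n):
    """Stirling approximation sqrt(2 / (pi * n))."""
    return sqrt(2 / (pi * n))

# Reproduce rows of the comparison table.
rows = [(n, exact(n), stirling(n)) for n in (100, 1000, 10000)]
```

The two columns agree to several digits already at \(n = 1000\), and both decay like \(1/\sqrt{n}\).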
Compare to (intuition): \(\frac{\text{\# success}}{\text{\# total}} \to p\).
Multinomial Probability
Trinomial: 3 Outcomes
“Yes”, “no”, “abstain.” Put \(Z = n - X - Y\) for \(n\) independent trials, each with three possible outcomes of probabilities \(p_x\), \(p_y\), \(p_z\): \(p_x + p_y + p_z = 1\).
\[ \begin{aligned} 1 = 1^n &= (p_x + p_y + p_z)^n = (p_x + (p_y + p_z))^n \\ &\stackrel{\text{B.T.}}{=} \sum_{x=0}^{n} \binom{n}{x} p_x^x (p_y + p_z)^{n-x} \\ &\stackrel{\text{B.T.}}{=} \sum_{x=0}^{n} \sum_{y=0}^{n-x} \binom{n}{x}\binom{n-x}{y} p_x^x\, p_y^y\, p_z^{n-x-y} \\ &= \sum_{\substack{x \geq 0,\, y \geq 0,\, z \geq 0 \\ x+y+z=n}} \frac{n!}{x!\;y!\;z!}\; p_x^x\, p_y^y\, p_z^z \end{aligned} \]
\[ p(x,y) = P[X\!=\!x,\; Y\!=\!y] = \frac{n!}{x!\;y!\;(n-x-y)!}\; p_x^x\, p_y^y\, p_z^{n-x-y} \]
\[ \frac{n!}{x!\;y!\;(n-x-y)!} = \binom{n}{x,y,z} \]
# ways to split \(n\) items into \(x\) Type 1, \(y\) Type 2, and \(z\) Type 3.
Example: Group Assignment
Ex: How many ways to assign 20 students to groups A, B, C with \(k_A = k_B = 6\) and \(k_C = 8\)?
\[ \binom{20}{6,6,8} = \frac{20!}{6!\;6!\;8!} = 116{,}396{,}280 \]
Note: \(\binom{n}{k} = \binom{n}{k, n-k}\) (binomial is special case).
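The multinomial coefficient is straightforward to compute with integer arithmetic; a sketch (the helper name `multinomial` is mine):

```python
from math import comb, factorial

def multinomial(n, *parts):
    """n! / (k1! k2! ... kM!): # ways to split n items into
    labeled groups of the given sizes (sizes must sum to n)."""
    assert sum(parts) == n
    out = factorial(n)
    for k in parts:
        out //= factorial(k)
    return out

# Group-assignment example: 20 students into groups of 6, 6, 8.
ways = multinomial(20, 6, 6, 8)

# Binomial as the special case (n choose k) = (n choose k, n-k).
check = multinomial(5, 3, 2)   # equals comb(5, 3) = 10
```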
Example: Trinomial Sampling
Ex: Sample with replacement. \(p_x = 0.30\), \(p_y = 0.15\), \(p_z = 0.55\). Pick 10.
\(P[\text{2 "X" and 3 "Y" if pick 10}] = \binom{10}{2,3,5}(0.30)^2(0.15)^3(0.55)^5\)
\(P[\text{2 "X" if pick 10}]\) – need “marginal” probability.
\(p(x,y)\) describes “joint” probability of \(X\) and \(Y\). \(p(x)\) and \(p(y)\) describe probability of \(X\) and \(Y\) separately (“marginal”).
Marginal Probability of Trinomial
\[ \begin{aligned} P(X\!=\!x) &= \sum_{y=0}^{n-x} p(x,y) = \sum_{y=0}^{n-x} \frac{n!}{x!\;y!\;(n-x-y)!}\;p_x^x\, p_y^y\, p_z^{n-x-y} \\ &= \frac{n!}{x!\;(n-x)!}\;p_x^x \sum_{y=0}^{n-x} \frac{(n-x)!}{y!\;(n-x-y)!}\;p_y^y\, p_z^{(n-x)-y} \\ &= \binom{n}{x} p_x^x \left[\sum_{y=0}^{n-x} \binom{n-x}{y} p_y^y\, p_z^{(n-x)-y}\right] \\ &\stackrel{\text{B.T.}}{=} \binom{n}{x} p_x^x (p_y + p_z)^{n-x} = \binom{n}{x} p_x^x (1 - p_x)^{n-x} \end{aligned} \]
\(\therefore X \sim \text{Binomial}(n, p_x)\), i.e. \(P(X\!=\!x) = b(n, x, p_x)\): the marginal is binomial.
Sample \(n\), prob of \(x\) “X” regardless of “Y” or “Z” count. Sum over all \(y\) with \(x\) fixed.
Conditional Probability of Trinomial
\[ \begin{aligned} P(X\!=\!x \mid Y\!=\!y) = p(x \mid y) &= \frac{p(x,y)}{p(y)} \\ &= \frac{(n-y)!}{x!\;((n-y)-x)!} \cdot \frac{p_x^x \cdot p_z^{(n-y)-x}}{(1-p_y)^{n-y}} \\ &= \binom{n-y}{x} \left(\frac{p_x}{1 - p_y}\right)^x \left(1 - \frac{p_x}{1 - p_y}\right)^{(n-y)-x} \\ &= b\!\left(n - y,\; x,\; \frac{p_x}{1 - p_y}\right) \end{aligned} \]
\(\therefore\) conditional is binomial.
Sample \(n\), probability of \(x\) “X” given \(y\) “Y” (\(\therefore n - y - x\) “Z”).
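Both derivations (marginal and conditional are binomial) can be confirmed numerically with the trinomial sampling example above (\(p_x = 0.30\), \(p_y = 0.15\), \(n = 10\)); the helper names are mine:

```python
from math import comb

def trinomial_pmf(n, x, y, px, py):
    """Joint p(x, y) for a trinomial with cell probs px, py, pz = 1-px-py."""
    pz = 1 - px - py
    z = n - x - y
    coef = comb(n, x) * comb(n - x, y)   # = n! / (x! y! z!)
    return coef * px**x * py**y * pz**z

def b(n, k, p):
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, px, py = 10, 0.30, 0.15

# Marginal of X: summing the joint over y gives the binomial b(n, x, px).
x = 2
marginal = sum(trinomial_pmf(n, x, y, px, py) for y in range(n - x + 1))

# Conditional of X given Y = y: binomial with n-y trials, prob px/(1-py).
y = 3
cond = trinomial_pmf(n, x, y, px, py) / b(n, y, py)
```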
Multinomial Theorem
\[ (p_1 + \cdots + p_K)^n = \sum_{\substack{\ell_1, \ldots, \ell_K \geq 0 \\ \ell_1 + \cdots + \ell_K = n}} \frac{n!}{\ell_1!\;\ell_2!\;\cdots\;\ell_K!}\; p_1^{\ell_1} \cdots p_K^{\ell_K} \]
(Partition: \(\ell_1 + \cdots + \ell_K = n\).)
Compare: binomial: \(p_x, p_y = 1 - p_x\). Trinomial: \(p_x, p_y, p_z = 1 - p_x - p_y\).
Example: Die Rolling
Ex: Roll a die (“d6”), \(K = 6\). Roll fair die 7 times, \(n = 7\), \(p_1 = \cdots = p_6 = \frac{1}{6}\).
\[ P(\text{3 twos, 2 fours, 2 sixes}) = \binom{7}{0,3,0,2,0,2}\left(\frac{1}{6}\right)^7 = \frac{7 \cdot 6 \cdot 5}{6^7} \approx 0.00075 \]
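Checking the arithmetic for the die example:

```python
from math import factorial

# P(3 twos, 2 fours, 2 sixes in 7 rolls of a fair d6):
# multinomial coefficient 7! / (3! 2! 2!), times (1/6)^7.
coef = factorial(7) // (factorial(3) * factorial(2) * factorial(2))
prob = coef * (1 / 6) ** 7
```

The coefficient is \(7!/(3!\,2!\,2!) = 210 = 7 \cdot 6 \cdot 5\), giving \(210/6^7 \approx 0.00075\).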
Connection: Deep Neural Classifier
- Deep neural classifier with \(K\) output SoftMax neurons (1-in-\(K\) coding)
- \(\therefore x \to N \to N(x)\): 1 roll of \(K\)-sided die (“categorical” multinomial)
- \(\therefore \ln p = \ln p_1^{x_1} \cdots p_K^{x_K} = \sum_{k=1}^{K} x_k \cdot \ln p_k\) – the log-likelihood; its negative, \(-\sum_k x_k \ln p_k\), is the cross-entropy loss
- \(p = p_1^{x_1} \cdots p_K^{x_K}\): unknown (\(p_k\)), data (\(x_k\))
- Q: find “best choice” of \(p_1, \ldots, p_K\) given data/observations \(x_1 \cdots x_K\)
Sequences and Series
\(a_n \to a\) if \(\lim_{n \to \infty} a_n = a\), iff \(\forall \varepsilon > 0\;\; \exists n_0 \in \mathbb{Z}^+: \forall n \geq n_0\quad |a_n - a| < \varepsilon\).
\(\therefore\) Smaller \(\varepsilon > 0\) \(\implies\) takes longer to find \(n_0\).
Partial Sum: \(S_N = \sum_{n=0}^{N} a_n = a_0 + a_1 + \cdots + a_N\)
“\(\sum_{n=0}^{\infty} a_n = S\)” converges iff \(\lim_{N \to \infty} S_N = S\) (\(S\) finite),
iff \(\forall \varepsilon > 0\;\; \exists n_0 \in \mathbb{Z}^+: \forall N \geq n_0\quad |S_N - S| < \varepsilon\). (Else “diverges”.)
Special case: \(\lim_{n \to \infty} a_n = \infty\) iff \(\forall M > 0\;\; \exists n_0 \in \mathbb{Z}^+: \forall n \geq n_0\quad a_n > M\).
A type of non-convergence. Not the same in general as \(\lim a_n \neq a\).
Triangle Inequality and Limit Theorems
\[ |a + b| \leq |a| + |b| \]
Proof. \[ \begin{aligned} |a+b|^2 &= (a+b)^2 = a^2 + b^2 + 2ab \\ &\leq a^2 + b^2 + 2|a| \cdot |b| &&\text{[note: } |x| = \max(x, -x)\text{]} \\ &= |a|^2 + |b|^2 + 2|a| \cdot |b| = (|a| + |b|)^2 \end{aligned} \]
\(\therefore |a + b| \leq |a| + |b|\). \(\square\)
Ex: \((a_n \to a) \land (b_n \to b) \implies (a_n + b_n \to a + b)\)
Proof. Suppose \(a_n \to a\) and \(b_n \to b\). Pick \(\varepsilon > 0\).
\(\exists n_0\): \(\forall n \geq n_0\): \(|a_n - a| < \frac{\varepsilon}{2}\). \(\quad \exists m_0\): \(\forall n \geq m_0\): \(|b_n - b| < \frac{\varepsilon}{2}\).
\(\therefore \forall n \geq \max(n_0, m_0)\):
\[ |(a_n + b_n) - (a + b)| = |(a_n - a) + (b_n - b)| \stackrel{\text{T.I.}}{\leq} |a_n - a| + |b_n - b| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon \quad \square \]
\[ \big||a| - |b|\big| \leq |a - b| \]
Proof. \(|a| - |b| = |(a-b) + b| - |b| \leq |a-b| + |b| - |b| = |a-b|\)
Similarly: \(|b| - |a| \leq |a - b|\), \(\therefore |a| - |b| \geq -|a - b|\).
\[ \therefore -|a-b| \leq |a| - |b| \leq |a-b| \quad \implies \quad \big||a| - |b|\big| \leq |a-b| \quad \square \]
Convergence Tests
Fact: \(\lim_{n \to \infty} \left(1 + \frac{x}{n}\right)^n = e^x\) where \(e^x \stackrel{\Delta}{=} \sum_{n=0}^{\infty} \frac{x^n}{n!}\).
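The first fact is easy to see numerically: \((1 + x/n)^n\) closes in on \(e^x\) as \(n\) grows. A quick sketch:

```python
from math import exp

# (1 + x/n)^n approaches e^x as n grows.
x = 1.5
approx = [(1 + x / n) ** n for n in (10, 100, 10000)]
err = abs(approx[-1] - exp(x))   # error at n = 10000
```

The error shrinks roughly like \(x^2 e^x / (2n)\), so \(n = 10{,}000\) already agrees with \(e^{1.5}\) to about three decimal places.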
Fact: \(\sum_{n=0}^{\infty} a_n = S \implies \lim_{n \to \infty} a_n = 0\). (Converse does NOT hold.)
p-Series Test
\(\sum_{n=1}^{\infty} \frac{1}{n^p}\): converges if \(p > 1\), diverges if \(p \leq 1\).
Ex: \(\sum_{n=1}^{\infty} \frac{1}{n}\) diverges since \(\frac{1}{n} = \frac{1}{n^1}\), \(p = 1\).
\(\sum_{n=1}^{\infty} \frac{1}{n^2}\), \(\sum_{n=1}^{\infty} \frac{1}{n^{1.0001}}\): converge (\(p > 1\)).
Alternating Series Test
\(\sum_{n=1}^{\infty} (-1)^{n+1} a_n\) converges if: (1) \(a_k \geq a_{k+1} > 0\) \(\forall k \in \mathbb{Z}^+\), and (2) \(\lim_{n \to \infty} a_n = 0\).
Ex: \(\sum_{n=1}^{\infty} (-1)^{n+1} \cdot \frac{1}{n}\) converges since \(\frac{1}{n} \geq \frac{1}{n+1}\) \(\forall n\) and \(\lim \frac{1}{n} = 0\).
Absolute and Conditional Convergence
\(\sum_{n=0}^{\infty} a_n\) converges absolutely iff \(\sum_{n=0}^{\infty} |a_n|\) converges.
(\(\therefore\) treat infinite sums like finite sums.)
Facts:
- Absolute convergence \(\implies\) convergence (converse does NOT hold).
- If \(\sum a_n\) converges absolutely and \(\sum b_n\) is any rearrangement of \(\sum a_n\), then \(\sum b_n\) converges and \(\sum b_n = \sum a_n\).
\(\sum_{n=0}^{\infty} a_n\) converges conditionally iff \(\sum a_n\) converges but \(\sum |a_n|\) diverges.
Ex: \(\sum_{n=1}^{\infty} (-1)^{n+1} \cdot \frac{1}{n}\) converges conditionally since \(\sum (-1)^{n+1} \frac{1}{n}\) converges but \(\sum |(-1)^{n+1} \frac{1}{n}| = \sum \frac{1}{n}\) diverges.
(Extra, not tested.) Pick ANY \(x \in \mathbb{R}\). Suppose \(\sum a_n\) converges conditionally. Then \(\exists\) a rearrangement \(\sum b_n\) of \(\sum a_n\) where \(\sum b_n = x\).
Ratio Test
Put \(L \stackrel{\Delta}{=} \lim_{n \to \infty} \left|\frac{a_{n+1}}{a_n}\right|\). Then:
- \(L < 1 \implies\) series converges absolutely
- \(L > 1 \implies\) series diverges
- \(L = 1 \implies\) test fails
Ex: Geometric series: \(a_n = a^n\), \(\sum_{n=0}^{\infty} a^n\).
\[ L = \lim_{n \to \infty} \left|\frac{a^{n+1}}{a^n}\right| = \lim_{n \to \infty} |a| = |a| \]
\(\therefore \sum_{n=0}^{\infty} a^n\) converges absolutely if \(|a| < 1\).
Ex: \(e^x = \sum_{n=0}^{\infty} \frac{x^n}{n!}\)
\[ L = \lim_{n \to \infty} \left|\frac{x^{n+1}/(n+1)!}{x^n/n!}\right| = |x| \cdot \lim_{n \to \infty} \frac{1}{n+1} = 0 \]
\(\therefore\) the series for \(e^x\) converges \(\forall x\).
Ex: \(\sum \frac{1}{n}\): \(L = \lim \frac{n}{n+1} = 1\). \(\therefore\) test fails.
Power Series
Fact: \(|x| \leq c\) iff \(-c \leq x \leq c\) (\(c > 0\)).
Note: Taylor’s series: \(f(x) = \sum_{n=0}^{\infty} \frac{f^{(n)}(0)}{n!} \cdot x^n\), i.e. power series “around 0”.
\[ \sum_{n=0}^{\infty} a_n x^n = a_0 + a_1 x + a_2 x^2 + \cdots \]
Ex: Find all \(x\) so that \(\sum_{n=0}^{\infty} \frac{n}{5^n} x^n\) converges absolutely.
By the ratio test:
\[ \lim_{n \to \infty} \left|\frac{a_{n+1}}{a_n}\right| = \lim_{n \to \infty} \left|\frac{(n+1)x^{n+1}}{n \cdot x^n} \cdot \frac{5^n}{5^{n+1}}\right| = \lim_{n \to \infty} \frac{(n+1)}{5n} \cdot |x| = \frac{1}{5}|x| < 1 \]
iff \(|x| < 5\), i.e. \(-5 < x < 5\).
Check endpoints: \(x = +5\): \(\sum n\) diverges. \(x = -5\): \(\sum (-1)^n n\) diverges.
\(\therefore \sum \frac{n}{5^n} x^n\) converges iff \(-5 < x < 5\) (else diverges).
Theorems on Power Series
- If \(\sum a_n x^n\) converges at some \(x = c \neq 0\), then it converges absolutely for \(|x| < |c|\).
- If \(\sum a_n x^n\) diverges at some \(x = d \neq 0\), then it diverges for \(|x| > |d|\).
Exactly one holds for \(\sum a_n x^n\):
- It converges only if \(x = 0\).
- It converges absolutely for all \(x\) (e.g. \(e^x\)).
- \(\exists r > 0\): it converges absolutely if \(|x| < r\) and diverges if \(|x| > r\).
Case 3: \(r\) is the radius of convergence, interval \((-r, r)\).
Negative Binomial and Geometric
Negative Binomial
\(P(X\!=\!n \text{ trials until } k \text{ successes})\)
\[ = P(\text{head on } n\text{-th flip}) \cdot P(k-1 \text{ heads in first } n-1 \text{ flips}) \]
\[ = p \cdot \binom{n-1}{k-1} p^{k-1} (1-p)^{(n-1)-(k-1)} = \binom{n-1}{k-1} p^k (1-p)^{n-k} \]
\(n\)-th trial is success (independent), first \(n-1\) trials have \(k-1\) successes.
Geometric
\(P(X\!=\!n \text{ trials until } 1^{\text{st}} \text{ success})\)
\(\therefore = \text{NB}(n, k\!=\!1) = \binom{n-1}{0} p^1 (1-p)^{n-1}\)
\[ = p \cdot (1-p)^{n-1} \]
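The reduction of the negative binomial to the geometric at \(k = 1\) can be checked term by term; a sketch (the helper names are mine):

```python
from math import comb

def neg_binomial_pmf(n, k, p):
    """P[k-th success occurs on trial n] = C(n-1, k-1) p^k q^(n-k)."""
    return comb(n - 1, k - 1) * p**k * (1 - p) ** (n - k)

def geometric_pmf(n, p):
    """P[first success on trial n] = p q^(n-1): NB with k = 1."""
    return p * (1 - p) ** (n - 1)

# k = 1 reduces the negative binomial to the geometric, as derived above.
vals = [(neg_binomial_pmf(n, 1, 0.3), geometric_pmf(n, 0.3))
        for n in range(1, 10)]
```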
Geometric Series
\[ \sum_{k=0}^{n} a^k = \begin{cases} n+1 & \text{if } a = 1 \\ \dfrac{1 - a^{n+1}}{1 - a} & \text{if } a \neq 1 \end{cases} \]
Proof. \(S \stackrel{\Delta}{=} a^0 + a^1 + \cdots + a^n = \sum_{k=0}^{n} a^k\)
\(\therefore aS = a + a^2 + \cdots + a^{n+1}\)
\(\therefore S(1-a) = S - aS = (1 + a + \cdots + a^n) - (a + a^2 + \cdots + a^{n+1}) = 1 - a^{n+1}\)
\(\therefore S = \frac{1 - a^{n+1}}{1 - a}\) since \(a \neq 1\). \(\square\)
Corollary 1: \(\sum_{k=0}^{\infty} a^k = \frac{1}{1-a}\) if \(|a| < 1\).
Corollary 2: \(\sum_{k=1}^{\infty} a^k = \frac{a}{1-a}\) if \(|a| < 1\).
Corollary 3: \(\sum_{k=j}^{\infty} a^k = \frac{a^j}{1-a}\) if \(|a| < 1\).
Verifying Geometric is a Valid Probability
\[ \begin{aligned} \sum_{n} p(n) &\stackrel{G(p)}{=} \sum_{n=1}^{\infty} p \cdot (1-p)^{n-1} = p \cdot \sum_{n=1}^{\infty} (1-p)^{n-1} \\ &= p \cdot \sum_{k=0}^{\infty} (1-p)^k \qquad [k = n - 1] \\ &= p \cdot \frac{1}{1 - (1-p)} \qquad [0 \leq 1-p < 1] \\ &= p \cdot \frac{1}{p} = 1 \quad \therefore \text{"valid" probability} \end{aligned} \]
Geometric Tail Probability
\(P(X > k) = q^k\) if \(X \sim G(p)\). \(\quad \therefore P(X \leq k) = 1 - q^k\).
Proof. \[ \begin{aligned} P(X > k) &= P(\{X\!=\!k\!+\!1\} \cup \{X\!=\!k\!+\!2\} \cup \cdots) \\ &\stackrel{\text{CA}}{=} \sum_{x=k+1}^{\infty} p(x) \stackrel{G(p)}{=} \sum_{x=k+1}^{\infty} p \cdot q^{x-1} \\ &= p \cdot \sum_{j=k}^{\infty} q^j \qquad [j = x - 1] \\ &\stackrel{\text{Corr 3}}{=} p \cdot \frac{q^k}{1 - q} = \frac{p}{p} \cdot q^k = q^k \quad \square \end{aligned} \]
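The closed-form tail \(P(X > k) = q^k\) agrees with summing the PMF directly (truncated far enough that the remainder is negligible); a sketch:

```python
def geometric_pmf(n, p):
    """P[first success on trial n] = p * (1-p)^(n-1)."""
    return p * (1 - p) ** (n - 1)

p, k = 0.25, 4
q = 1 - p

# Closed form vs. direct (truncated) summation of the tail.
tail_formula = q ** k
tail_sum = sum(geometric_pmf(x, p) for x in range(k + 1, 500))
```

Here \(q^4 = 0.75^4 = 0.31640625\); the truncation error of the direct sum is on the order of \(q^{500}\), far below floating-point noise.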
Vandermonde’s Identity
\[ \binom{N}{n} = \sum_{k=0}^{n} \binom{N_1}{k}\binom{N_2}{n-k} \qquad \text{if } N = N_1 + N_2 \]
and \(\binom{k}{j} = 0\) for \(j > k\).
\(N\): total pop., \(N_1\): # type 1, \(N_2\): # type 2. Sum to \(N\).
Hypergeometric Probability
Suppose batch of \(N\) items. Sample \(n\) w/o replacement. \(N_1\) type 1, \(N_2\) type 2.
\[ \binom{N}{n} = \binom{N_1 + N_2}{n} = \sum_{k=0}^{n} \binom{N_1}{k}\binom{N_2}{n-k} \]
# different ways to draw \(k\) “good” (Type 1) and \(n - k\) “bad” (Type 2). Add over all ways for 0 “good”, 1 “good”, … \(n\) “good”.
\[ \frac{\binom{N_1}{k}\binom{N_2}{n-k}}{\binom{N}{n}} \geq 0 \qquad \text{and} \qquad \sum_{k=0}^{n} \frac{\binom{N_1}{k}\binom{N_2}{n-k}}{\binom{N}{n}} = 1 \]
\(\therefore\) obeys all axioms of probability, with \(P[k \text{ "good" from draw of } n]\) = hypergeometric p.m.f.
Example: Fishing
Ex: Small lake, \(N = 50\) fish. \(N_1 = 10\) trout, \(N_2 = 40\) large mouth bass.
Bob catches 7 fish w/ replacement (“catch and release”):
\[ P[\text{2 trout and 5 bass}] = b(7, 2, \tfrac{1}{5}) = \binom{7}{2}\left(\frac{1}{5}\right)^2\left(\frac{4}{5}\right)^5 = 0.275 \]
7 fish w/o replacement (“catch and eat”):
\[ P[\text{2 trout and 5 bass}] = \text{Hypergeo}(7, 10, 2, 50) = \frac{\binom{10}{2}\binom{40}{5}}{\binom{50}{7}} = 0.2964 \]
Q: Under what conditions is w/ replacement a “good” approximation to w/o replacement?
A: If the pond is an ocean. \(\implies\) \(N\) big, \(n\) small (fixed ratio \(\frac{N_1}{N}\)).
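Both fishing probabilities can be computed side by side (helper names are mine):

```python
from math import comb

def binom_pmf(n, k, p):
    return comb(n, k) * p**k * (1 - p)**(n - k)

def hypergeom_pmf(N, N1, n, k):
    """P[k type-1 items in a draw of n without replacement]."""
    return comb(N1, k) * comb(N - N1, n - k) / comb(N, n)

# Catch-and-release (with replacement): binomial with p = 10/50.
p_with = binom_pmf(7, 2, 10 / 50)
# Catch-and-eat (without replacement): hypergeometric.
p_without = hypergeom_pmf(50, 10, 7, 2)
```

The two answers (about 0.275 vs. 0.296) differ noticeably here because the sample of 7 is not small relative to the pond of 50.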
Binomial Approximation to the Hypergeometric
Hypergeometric \(\to\) Binomial if \(N \to \infty\) and \(p = \frac{m}{N}\) (fixed).
Proof. Say \(X \sim \text{Hyp}(n, m, k, N)\). Fixed \(k\) and \(n\).
\[ \begin{aligned} \lim_{N \to \infty} P(X\!=\!k) &= \lim_{N \to \infty} \frac{\binom{m}{k}\binom{N-m}{n-k}}{\binom{N}{n}} \\[6pt] &= \lim_{N \to \infty} \binom{n}{k} \cdot \frac{\prod_{j=1}^{k}(m - k + j) \cdot \prod_{j=1}^{n-k}(N - m - n + k + j)}{\prod_{j=1}^{n}(N - n + j)} \cdot \frac{N^n}{N^n} \end{aligned} \]
Multiplying by \(\frac{N^n}{N^n} = 1\) distributes a factor of \(\frac{1}{N}\) onto each term of the products; with \(p = \frac{m}{N}\) held fixed, each factor then tends to \(p\), \(1 - p\), or \(1\) as \(N \to \infty\):
\[ = \binom{n}{k} \cdot \left(\frac{m}{N}\right)^k \left(1 - \frac{m}{N}\right)^{n-k} = \binom{n}{k} p^k (1-p)^{n-k} = b(n, k, p) \quad \square \]
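The convergence is visible numerically: hold \(p = m/N\) fixed and let \(N\) grow, and the hypergeometric probability closes in on \(b(n, k, p)\). A sketch (helper names are mine):

```python
from math import comb

def hypergeom_pmf(N, m, n, k):
    return comb(m, k) * comb(N - m, n - k) / comb(N, n)

def binom_pmf(n, k, p):
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Hold p = m/N = 0.2 fixed; the gap to b(n, k, p) shrinks as N grows.
n, k, p = 7, 2, 0.2
errs = [abs(hypergeom_pmf(N, int(p * N), n, k) - binom_pmf(n, k, p))
        for N in (50, 500, 5000)]
```

The error shrinks roughly like \(1/N\), which is why “the pond is an ocean” (large \(N\), small \(n\)) makes with-replacement a good approximation.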
BEG CUP: Distribution Selection
| w/ replacement | w/o replacement | |
|---|---|---|
| 2 outcomes | Binomial | Hypergeometric |
| 2 outcomes, “until” | Negative Binomial / Geometric | – |
| \(\geq 2\) outcomes | Multinomial | Multivariate Hypergeometric |
Hypergeo \(\to\) Binomial. Multi Hypergeo \(\to\) Multinomial.
Multivariate Hypergeometric: \(\dfrac{\binom{N_1}{n_1}\binom{N_2}{n_2}\cdots\binom{N_K}{n_K}}{\binom{N}{n}}\) with \(n = n_1 + \cdots + n_K\), \(N = N_1 + \cdots + N_K\).
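The multivariate hypergeometric formula above is a direct generalization of the two-type case; a sketch (the helper name `mv_hypergeom_pmf` is mine):

```python
from math import comb

def mv_hypergeom_pmf(Ns, ns):
    """Draw sum(ns) items without replacement from groups of sizes Ns;
    probability of getting exactly ns[i] items from group i."""
    N, n = sum(Ns), sum(ns)
    num = 1
    for Ni, ni in zip(Ns, ns):
        num *= comb(Ni, ni)
    return num / comb(N, n)

# The fishing example (10 trout, 40 bass; catch 2 and 5) is the K = 2 case.
p = mv_hypergeom_pmf([10, 40], [2, 5])
```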
Issue-Spotting Sequence
- Q1: Outcomes? (2 or more?)
- Q2: Sampling? (w/ or w/o replacement)
- Q3: Until structure?
“UNTIL” \(\implies\) negative binomial (\(k = 1\) success: geometric).