EE 364 Supplemental – Week 04

Discrete Random Experiments

NoteDefinition: Bernoulli Trial

A “Bernoulli trial” (a.k.a. coin flip): binary outcome H/T, T/F, 0/1, Success/Fail, … with \(P[\text{success}] = p\).

NoteDefinition: Bernoulli Sequence

A Bernoulli sequence \(X_1, X_2, \ldots\) is a sequence of independent and identical Bernoulli trials.

NoteDefinition: Random Sample

A sequence of outcomes \(\{X_k\}\) is a random sample iff the outcomes are independent and identically distributed (i.i.d.).

Important

Random sample \(\iff\) i.i.d.

Ex: 2 coin flips.

\[ P[H_1 \cap T_2] \stackrel{\text{indep.}}{=} P[H_1] \cdot P[T_2] \stackrel{\text{ident.}}{=} P[H] \cdot P[T] \quad (= p \cdot q) \]

Binomial Probability “Distribution”

\(p + (1-p) = 1\) \(\implies\) \(1 = (p + q)^n = \sum_{k=0}^{n} \binom{n}{k} p^k (1-p)^{n-k}\)

Idea: \(\binom{n}{k} p^k (1-p)^{n-k}\)

  • \(p^k (1-p)^{n-k}\): prob of getting \(k\) “success” and \(n-k\) “fail” (ordered)
  • \(\binom{n}{k}\): # different ways to arrange \(k\) “success” and \(n-k\) “fail” among the \(n\) trials (\(n\) choose \(k\))

\[ P[\text{\# success} = k] = \binom{n}{k} p^k (1-p)^{n-k} \stackrel{\Delta}{=} b(n, k, p) \]

\(X\) = # success (in any order) in \(n\) trials.


Binomial PMF

NoteDefinition: Binomial PMF

\[ b(n, k, p) = \binom{n}{k} p^k (1-p)^{n-k} \qquad \text{if } 0 \leq k \leq n \text{ and } p \in [0,1] \]

\(p\) = “success probability” \(= P[\text{Success on one Bernoulli Trial}]\).

\(\therefore b(n, k, p) = P[k \text{ successes in } n \text{ trials}]\).

CDF

\[ P[X \leq k] = P[X\!=\!0] + P[X\!=\!1] + \cdots + P[X\!=\!k] = \sum_{j=0}^{k} b(n, j, p) \]

\(\therefore \sum_{k=0}^{n} b(n,k,p) = 1\) by binomial theorem, since \((p + (1-p))^n = 1^n = 1\).
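A minimal Python sketch of the PMF and CDF above (the name `b` mirrors the notes' \(b(n,k,p)\); all names are illustrative):

```python
from math import comb

def b(n: int, k: int, p: float) -> float:
    """Binomial PMF: P[exactly k successes in n i.i.d. Bernoulli(p) trials]."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binom_cdf(n: int, k: int, p: float) -> float:
    """Binomial CDF: P[X <= k] = sum of the PMF over j = 0..k."""
    return sum(b(n, j, p) for j in range(k + 1))

# The PMF sums to 1 over k = 0..n (binomial theorem).
assert abs(binom_cdf(10, 10, 0.3) - 1.0) < 1e-12
```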

Examples

Ex: \(P[\text{2 heads in 3 flips}] = P[X\!=\!2] = b(3,2,p)\)

Fair coin: \(p = 0.5\). (“\(X = 2\)” means exactly 2 heads.)

\[ = \binom{3}{2}(0.5)^2(0.5)^1 = \frac{3}{8} \]

\(P[\text{at most 3 heads in } n \text{ flips}] = P[X \leq 3] = \sum_{k=0}^{3} b(n,k,0.5) = \sum_{k=0}^{3} \binom{n}{k}(0.5)^k(0.5)^{n-k} = \frac{1}{2^n} \sum_{k=0}^{3} \binom{n}{k}\)

Ex:

\(P[\text{3 heads in 5 flips}]\) \(= b(5,3,\tfrac{1}{2}) = \binom{5}{3}(\tfrac{1}{2})^3(\tfrac{1}{2})^2\) \(= \frac{5}{16}\)
\(P[\text{3 heads in 6 flips}]\) \(= b(6,3,\tfrac{1}{2}) = \binom{6}{3}(\tfrac{1}{2})^3(\tfrac{1}{2})^3\) \(= \frac{5}{16}\)
\(P[\text{2 heads in 5 flips}]\) \(= b(5,2,\tfrac{1}{2}) = \binom{5}{2}(\tfrac{1}{2})^2(\tfrac{1}{2})^3\) \(= \frac{5}{16}\)

Pay $1 to play, win $3 on success. Since \(P[\text{win}] = \frac{5}{16}\), the expected return is \(3 \cdot \frac{5}{16} = \frac{15}{16} < 1\), so $1 exceeds the “rational price” (expected return) of the game.
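Checking the three probabilities and the expected return numerically (a sketch; `b` is the binomial PMF as defined in the notes):

```python
from math import comb

def b(n, k, p):
    """Binomial PMF b(n, k, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# All three events have probability 5/16.
for n, k in [(5, 3), (6, 3), (5, 2)]:
    assert abs(b(n, k, 0.5) - 5/16) < 1e-12

# $3 prize: expected return 3 * 5/16 = 15/16 < 1, so a $1 entry fee is unfavorable.
expected_return = 3 * b(5, 3, 0.5)
assert abs(expected_return - 15/16) < 1e-12
```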


Binomial at Peak: Stirling’s Approximation

Ex: Flip a coin \(n\) times with \(p = P[H]\). Suppose \(n\) is even. What is the probability of getting \(\frac{n}{2}\) heads in \(n\) flips?

TipResult

\[ b\!\left(n, \tfrac{n}{2}, p\right) \approx \sqrt{\tfrac{2}{\pi}} \cdot \frac{1}{\sqrt{n}} \cdot (4pq)^{n/2} \]

\(\therefore\) for \(p = q = \frac{1}{2}\): \(b\!\left(n, \tfrac{n}{2}, \tfrac{1}{2}\right) \approx \sqrt{\frac{2}{\pi}} \cdot \frac{1}{\sqrt{n}}\)

\(\therefore\) slow (square-root) decay.

Computer: \(P[X\!=\!n \text{ success in } 2n \text{ flips}] = \binom{2n}{n} \frac{1}{2^{2n}}\).

\(\lim_{n \to \infty} \binom{2n}{n} \cdot \frac{1}{2^{2n}} = ?\) (\(= 0\)).

So many ways of getting \(\# \approx n\) but not \(n\) exactly. \(\therefore P[X\!=\!n] \to 0\).

Proof. \[ \begin{aligned} b(n, \tfrac{n}{2}, p) &= \binom{n}{n/2} p^{n/2} q^{n/2} \qquad \text{(since } n \text{ even)} \\ &= \frac{n!}{((n/2)!)^2} (pq)^{n/2} \\ &\approx \frac{\sqrt{2\pi n}\;\frac{n^n}{e^n}\;(pq)^{n/2}}{\left(\sqrt{2\pi \frac{n}{2}} \left(\frac{n}{2}\right)^{n/2} e^{-n/2}\right)^2} &&\text{[Stirling's]} \\ &= \frac{\sqrt{2\pi n}\;\frac{n^n}{e^n}\;(pq)^{n/2}}{\pi n \cdot \frac{n^n}{2^n} \cdot e^{-n}} \\ &= \sqrt{\frac{2}{\pi}} \cdot \frac{1}{\sqrt{n}} \cdot 2^n (pq)^{n/2} \\ &= \sqrt{\frac{2}{\pi}} \cdot \frac{1}{\sqrt{n}} \cdot (4pq)^{n/2} \end{aligned} \]

\(\therefore\) for \(p = q = \frac{1}{2}\): \((4pq)^{n/2} = 1^{n/2} = 1\), so \(b\!\left(n, \tfrac{n}{2}, \tfrac{1}{2}\right) \approx \sqrt{\frac{2}{\pi}} \cdot \frac{1}{\sqrt{n}}\). \(\square\)

Ex (computer):

\(b(10, 5, \tfrac{1}{2}) = 0.24609\). Stirling: \(b(10,5,\tfrac{1}{2}) \approx \sqrt{2/(10\pi)} = 0.25231\).

Faster decay if \(p \neq q\).

| \(n\) | \(b(n, \frac{n}{2}, \frac{1}{2})\) | Stirling’s Approx. |
|---|---|---|
| 100 | 0.07959 | 0.07979 |
| 1,000 | 0.02522 | 0.02523 |
| 10,000 | 0.00798 | 0.00798 |
| 100,000 | 0.00252 | 0.00252 |
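The table can be reproduced with log-gamma arithmetic (direct evaluation of \(\binom{n}{n/2}/2^n\) overflows for large \(n\)); a sketch:

```python
from math import lgamma, log, exp, pi, sqrt

def b_half(n: int) -> float:
    """Exact b(n, n/2, 1/2), computed in log space to avoid overflow."""
    k = n // 2
    return exp(lgamma(n + 1) - 2 * lgamma(k + 1) - n * log(2))

def stirling(n: int) -> float:
    """Stirling approximation sqrt(2 / (pi * n))."""
    return sqrt(2 / (pi * n))

for n in (100, 1_000, 10_000, 100_000):
    print(n, round(b_half(n), 5), round(stirling(n), 5))
```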

Compare to (intuition): \(\frac{\text{\# success}}{\text{\# total}} \to p\).


Multinomial Probability

Trinomial: 3 Outcomes

“Yes”, “no”, “abstain.” Put \(Z = n - X - Y\) for \(n\) independent trials, each with three outcomes having probabilities \(p_x\), \(p_y\), \(p_z\): \(p_x + p_y + p_z = 1\).

\[ \begin{aligned} 1 = 1^n &= (p_x + p_y + p_z)^n = (p_x + (p_y + p_z))^n \\ &\stackrel{\text{B.T.}}{=} \sum_{x=0}^{n} \binom{n}{x} p_x^x (p_y + p_z)^{n-x} \\ &\stackrel{\text{B.T.}}{=} \sum_{x=0}^{n} \sum_{y=0}^{n-x} \binom{n}{x}\binom{n-x}{y} p_x^x\, p_y^y\, p_z^{n-x-y} \\ &= \sum_{\substack{x \geq 0,\, y \geq 0,\, z \geq 0 \\ x+y+z=n}} \frac{n!}{x!\;y!\;z!}\; p_x^x\, p_y^y\, p_z^z \end{aligned} \]

NoteDefinition: Trinomial PMF

\[ p(x,y) = P[X\!=\!x,\; Y\!=\!y] = \frac{n!}{x!\;y!\;(n-x-y)!}\; p_x^x\, p_y^y\, p_z^{n-x-y} \]

\[ \frac{n!}{x!\;y!\;(n-x-y)!} = \binom{n}{x,y,z} \]

# ways to choose \(n\) with \(x\) Type 1, \(y\) Type 2, and \(z\) Type 3.

Example: Group Assignment

Ex: How many ways to assign 20 students to groups A, B, C with \(k_A = k_B = 6\) and \(k_C = 8\)?

\[ \binom{20}{6,6,8} = \frac{20!}{6!\;6!\;8!} = 116{,}396{,}280 \]
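The coefficient is easy to check with integer arithmetic; a sketch:

```python
from math import factorial

def multinomial(*counts: int) -> int:
    """n! / (k1! k2! ... kK!) for n = k1 + ... + kK."""
    out = factorial(sum(counts))
    for k in counts:
        out //= factorial(k)
    return out

assert multinomial(6, 6, 8) == 116_396_280
# Binomial as a special case: C(n, k) = multinomial(k, n - k).
assert multinomial(3, 17) == 1140  # C(20, 3)
```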

Note: \(\binom{n}{k} = \binom{n}{k, n-k}\) (binomial is special case).

Example: Trinomial Sampling

Ex: Sample with replacement. \(p_x = 0.30\), \(p_y = 0.15\), \(p_z = 0.55\). Pick 10.

  1. \(P[\text{2 "X" and 3 "Y" if pick 10}] = \binom{10}{2,3,5}(0.30)^2(0.15)^3(0.55)^5\)

  2. \(P[\text{2 "X" if pick 10}]\) – need “marginal” probability.

\(p(x,y)\) describes “joint” probability of \(X\) and \(Y\). \(p(x)\) and \(p(y)\) describe probability of \(X\) and \(Y\) separately (“marginal”).

Marginal Probability of Trinomial

\[ \begin{aligned} P(X\!=\!x) &= \sum_{y=0}^{n-x} p(x,y) = \sum_{y=0}^{n-x} \frac{n!}{x!\;y!\;(n-x-y)!}\;p_x^x\, p_y^y\, p_z^{n-x-y} \\ &= \frac{n!}{x!\;(n-x)!}\;p_x^x \sum_{y=0}^{n-x} \frac{(n-x)!}{y!\;(n-x-y)!}\;p_y^y\, p_z^{(n-x)-y} \\ &= \binom{n}{x} p_x^x \left[\sum_{y=0}^{n-x} \binom{n-x}{y} p_y^y\, p_z^{(n-x)-y}\right] \\ &\stackrel{\text{B.T.}}{=} \binom{n}{x} p_x^x (p_y + p_z)^{n-x} = \binom{n}{x} p_x^x (1 - p_x)^{n-x} \end{aligned} \]

Important

\(\therefore X \sim \text{Binomial}(n, p_x)\), i.e. \(P(X\!=\!x) = b(n, x, p_x)\): the marginal is binomial.

Sample \(n\), prob of \(x\) “X” regardless of “Y” or “Z” count. Sum over all \(y\) with \(x\) fixed.
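Numerically confirming that the trinomial marginal collapses to a binomial (a sketch, using the probabilities from the sampling example above):

```python
from math import comb, factorial

def trinomial_pmf(n, x, y, px, py):
    """Joint PMF P[X=x, Y=y] of the trinomial."""
    z = n - x - y
    coef = factorial(n) // (factorial(x) * factorial(y) * factorial(z))
    return coef * px**x * py**y * (1 - px - py)**z

n, px, py = 10, 0.30, 0.15
# Marginal P[X = 2]: sum the joint PMF over every feasible y.
marginal = sum(trinomial_pmf(n, 2, y, px, py) for y in range(n - 2 + 1))
binomial = comb(n, 2) * px**2 * (1 - px)**(n - 2)
assert abs(marginal - binomial) < 1e-12
```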

Conditional Probability of Trinomial

\[ \begin{aligned} P(X\!=\!x \mid Y\!=\!y) = p(x \mid y) &= \frac{p(x,y)}{p(y)} \\ &= \frac{(n-y)!}{x!\;((n-y)-x)!} \cdot \frac{p_x^x \cdot p_z^{(n-y)-x}}{(1-p_y)^{n-y}} \\ &= \binom{n-y}{x} \left(\frac{p_x}{1 - p_y}\right)^x \left(1 - \frac{p_x}{1 - p_y}\right)^{(n-y)-x} \\ &= b\!\left(n - y,\; x,\; \frac{p_x}{1 - p_y}\right) \end{aligned} \]

Important

\(\therefore\) conditional is binomial.

Sample \(n\), probability of \(x\) “X” given \(y\) “Y” (\(\therefore n - y - x\) “Z”).
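The same kind of numeric check for the conditional (a sketch, same illustrative numbers):

```python
from math import comb, factorial

def trinomial_pmf(n, x, y, px, py):
    """Joint PMF P[X=x, Y=y] of the trinomial."""
    z = n - x - y
    coef = factorial(n) // (factorial(x) * factorial(y) * factorial(z))
    return coef * px**x * py**y * (1 - px - py)**z

n, x, y, px, py = 10, 2, 3, 0.30, 0.15
# p(x | y) = p(x, y) / p(y), where Y is (marginally) binomial.
cond = trinomial_pmf(n, x, y, px, py) / (comb(n, y) * py**y * (1 - py)**(n - y))
pp = px / (1 - py)  # success probability of the conditional binomial
binom = comb(n - y, x) * pp**x * (1 - pp)**(n - y - x)
assert abs(cond - binom) < 1e-12
```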

Multinomial Theorem

TipTheorem (Multinomial)

\[ (p_1 + \cdots + p_K)^n = \sum_{\substack{\ell_1, \ldots, \ell_K \geq 0 \\ \ell_1 + \cdots + \ell_K = n}} \frac{n!}{\ell_1!\;\ell_2!\;\cdots\;\ell_K!}\; p_1^{\ell_1} \cdots p_K^{\ell_K} \]

(Partition: \(\ell_1 + \cdots + \ell_K = n\).)

Compare: binomial: \(p_x, p_y = 1 - p_x\). Trinomial: \(p_x, p_y, p_z = 1 - p_x - p_y\).

Example: Die Rolling

Ex: Roll a die (“d6”), \(K = 6\). Roll fair die 7 times, \(n = 7\), \(p_1 = \cdots = p_6 = \frac{1}{6}\).

\[ P(\text{3 twos, 2 fours, 2 sixes}) = \binom{7}{0,3,0,2,0,2}\left(\frac{1}{6}\right)^7 = \frac{7 \cdot 6 \cdot 5}{6^7} \approx 0.00075 \]
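Checking the die-roll arithmetic (a sketch):

```python
from math import factorial

# 7 rolls: 3 twos, 2 fours, 2 sixes (and zero of each other face).
coef = factorial(7) // (factorial(3) * factorial(2) * factorial(2))
prob = coef * (1 / 6) ** 7
assert coef == 210                  # = 7 * 6 * 5
assert abs(prob - 0.00075) < 1e-5   # 210 / 6^7
```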

Connection: Deep Neural Classifier

  • Deep neural classifier with \(K\) output SoftMax neurons (1-in-\(K\) coding)
  • \(\therefore x \to N \to N(x)\): 1 roll of \(K\)-sided die (“categorical” multinomial)
  • \(\therefore \ln p = \ln p_1^{x_1} \cdots p_K^{x_K} = \sum_{k=1}^{K} x_k \cdot \ln p_k\) – cross-entropy
  • \(p = p_1^{x_1} \cdots p_K^{x_K}\): unknown (\(p_k\)), data (\(x_k\))
  • Q: find “best choice” of \(p_1, \ldots, p_K\) given data/observations \(x_1 \cdots x_K\)

Sequences and Series

NoteDefinition: Limit of a Sequence

\(a_n \to a\) if \(\lim_{n \to \infty} a_n = a\), iff \(\forall \varepsilon > 0\;\; \exists n_0 \in \mathbb{Z}^+: \forall n \geq n_0\quad |a_n - a| < \varepsilon\).

\(\therefore\) Smaller \(\varepsilon > 0\) \(\implies\) takes longer to find \(n_0\).

Partial Sum: \(S_N = \sum_{n=0}^{N} a_n = a_0 + a_1 + \cdots + a_N\)

NoteDefinition: Infinite Series

\(\sum_{n=0}^{\infty} a_n = S\)” converges iff \(\lim_{N \to \infty} S_N = S\) (\(S\) finite),

iff \(\forall \varepsilon > 0\;\; \exists n_0 \in \mathbb{Z}^+: \forall N \geq n_0\quad |S_N - S| < \varepsilon\). (Else “diverges”.)

Special case: \(\lim_{n \to \infty} a_n = \infty\) iff \(\forall M > 0\;\; \exists n_0 \in \mathbb{Z}^+: \forall n \geq n_0\quad a_n > M\).

This is one type of non-convergence; in general, \(\lim a_n \neq a\) does not mean \(a_n \to \infty\).


Triangle Inequality and Limit Theorems

TipTheorem (Triangle Inequality)

\[ |a + b| \leq |a| + |b| \]

Proof. \[ \begin{aligned} |a+b|^2 &= (a+b)^2 = a^2 + b^2 + 2ab \\ &\leq a^2 + b^2 + 2|a| \cdot |b| &&\text{[note: } |x| = \max(x, -x)\text{]} \\ &= |a|^2 + |b|^2 + 2|a| \cdot |b| = (|a| + |b|)^2 \end{aligned} \]

\(\therefore |a + b| \leq |a| + |b|\). \(\square\)

Ex: \((a_n \to a) \land (b_n \to b) \implies (a_n + b_n \to a + b)\)

Proof. Suppose \(a_n \to a\) and \(b_n \to b\). Pick \(\varepsilon > 0\).

\(\exists n_0\): \(\forall n \geq n_0\): \(|a_n - a| < \frac{\varepsilon}{2}\). \(\quad \exists m_0\): \(\forall n \geq m_0\): \(|b_n - b| < \frac{\varepsilon}{2}\).

\(\therefore \forall n \geq \max(n_0, m_0)\):

\[ |(a_n + b_n) - (a + b)| = |(a_n - a) + (b_n - b)| \stackrel{\text{T.I.}}{\leq} |a_n - a| + |b_n - b| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon \quad \square \]

TipTheorem (Reverse Triangle Inequality)

\[ \big||a| - |b|\big| \leq |a - b| \]

Proof. \(|a| - |b| = |(a-b) + b| - |b| \leq |a-b| + |b| - |b| = |a-b|\)

Similarly: \(|b| - |a| \leq |a - b|\), \(\therefore |a| - |b| \geq -|a - b|\).

\[ \therefore -|a-b| \leq |a| - |b| \leq |a-b| \quad \implies \quad \big||a| - |b|\big| \leq |a-b| \quad \square \]


Convergence Tests

Fact: \(\lim_{n \to \infty} \left(1 + \frac{x}{n}\right)^n = e^x\) where \(e^x \stackrel{\Delta}{=} \sum_{n=0}^{\infty} \frac{x^n}{n!}\).

Fact: \(\sum_{n=0}^{\infty} a_n = S \implies \lim_{n \to \infty} a_n = 0\). (Converse does NOT hold.)

p-Series Test

\(\sum_{n=1}^{\infty} \frac{1}{n^p}\): converges if \(p > 1\), diverges if \(p \leq 1\).

Ex: \(\sum_{n=1}^{\infty} \frac{1}{n}\) diverges since \(\frac{1}{n} = \frac{1}{n^1}\), \(p = 1\).

\(\sum_{n=1}^{\infty} \frac{1}{n^2}\), \(\sum_{n=1}^{\infty} \frac{1}{n^{1.0001}}\): converge (\(p > 1\)).

Alternating Series Test

\(\sum_{n=1}^{\infty} (-1)^{n+1} a_n\) converges if: (1) \(a_k \geq a_{k+1} > 0\) \(\forall k \in \mathbb{Z}^+\), and (2) \(\lim_{n \to \infty} a_n = 0\).

Ex: \(\sum_{n=1}^{\infty} (-1)^{n+1} \cdot \frac{1}{n}\) converges since \(\frac{1}{n} \geq \frac{1}{n+1}\) \(\forall n\) and \(\lim \frac{1}{n} = 0\).

Absolute and Conditional Convergence

NoteDefinition: Absolute Convergence

\(\sum_{n=0}^{\infty} a_n\) converges absolutely iff \(\sum_{n=0}^{\infty} |a_n|\) converges.

(\(\therefore\) treat infinite sums like finite sums.)

Facts:

  1. Absolute convergence \(\implies\) convergence (converse does NOT hold).
  2. If \(\sum a_n\) converges absolutely and \(\sum b_n\) is any rearrangement of \(\sum a_n\), then \(\sum b_n\) converges and \(\sum b_n = \sum a_n\).
NoteDefinition: Conditional Convergence

\(\sum_{n=0}^{\infty} a_n\) converges conditionally iff \(\sum a_n\) converges but \(\sum |a_n|\) diverges.

Ex: \(\sum_{n=1}^{\infty} (-1)^{n+1} \cdot \frac{1}{n}\) converges conditionally since \(\sum (-1)^{n+1} \frac{1}{n}\) converges but \(\sum |(-1)^{n+1} \frac{1}{n}| = \sum \frac{1}{n}\) diverges.

TipTheorem (Riemann’s Rearrangement Theorem)

(Extra, not tested.) Pick ANY \(x \in \mathbb{R}\). Suppose \(\sum a_n\) converges conditionally. Then \(\exists\) a rearrangement \(\sum b_n\) of \(\sum a_n\) where \(\sum b_n = x\).


Ratio Test

TipTest (Ratio Test)

Put \(L \stackrel{\Delta}{=} \lim_{n \to \infty} \left|\frac{a_{n+1}}{a_n}\right|\). Then:

  1. \(L < 1 \implies\) series converges absolutely
  2. \(L > 1 \implies\) series diverges
  3. \(L = 1 \implies\) test fails

Ex: Geometric series: \(a_n = a^n\), \(\sum_{n=0}^{\infty} a^n\).

\[ L = \lim_{n \to \infty} \left|\frac{a^{n+1}}{a^n}\right| = \lim_{n \to \infty} |a| = |a| \]

\(\therefore \sum_{n=0}^{\infty} a^n\) converges absolutely if \(|a| < 1\).

Ex: \(e^x = \sum_{n=0}^{\infty} \frac{x^n}{n!}\)

\[ L = \lim_{n \to \infty} \left|\frac{x^{n+1}/(n+1)!}{x^n/n!}\right| = |x| \cdot \lim_{n \to \infty} \frac{1}{n+1} = 0 \]

\(\therefore\) the series for \(e^x\) converges (absolutely) \(\forall x\).
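Since \(L = 0\) for every \(x\), the partial sums converge quickly; a small numeric sketch:

```python
import math

def exp_partial(x: float, N: int) -> float:
    """Partial sum S_N of the power series for e^x."""
    return sum(x**n / math.factorial(n) for n in range(N + 1))

# 30 terms already match math.exp to ~1e-9 at x = 3.
assert abs(exp_partial(3.0, 30) - math.exp(3.0)) < 1e-9
```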

Ex: \(\sum \frac{1}{n}\): \(L = \lim \frac{n}{n+1} = 1\). \(\therefore\) test fails.


Power Series

Fact: \(|x| \leq c\) iff \(-c \leq x \leq c\) (\(c > 0\)).

Note: Taylor series about \(0\) (Maclaurin series): \(f(x) = \sum_{n=0}^{\infty} \frac{f^{(n)}(0)}{n!} \cdot x^n\), i.e. a power series “around 0”.

\[ \sum_{n=0}^{\infty} a_n x^n = a_0 + a_1 x + a_2 x^2 + \cdots \]

Ex: Find all \(x\) so that \(\sum_{n=0}^{\infty} \frac{n}{5^n} x^n\) converges absolutely.

By the ratio test:

\[ \lim_{n \to \infty} \left|\frac{a_{n+1}}{a_n}\right| = \lim_{n \to \infty} \left|\frac{(n+1)x^{n+1}}{n \cdot x^n} \cdot \frac{5^n}{5^{n+1}}\right| = \lim_{n \to \infty} \frac{(n+1)}{5n} \cdot |x| = \frac{1}{5}|x| < 1 \]

iff \(|x| < 5\), i.e. \(-5 < x < 5\).

Check endpoints: \(x = +5\): \(\sum n\) diverges. \(x = -5\): \(\sum (-1)^n n\) diverges.

\(\therefore \sum \frac{n}{5^n} x^n\) converges iff \(-5 < x < 5\) (else diverges).
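A numeric sketch: inside \(|x| < 5\) the partial sums settle to the known closed form \(\sum_{n \ge 1} n a^n = \frac{a}{(1-a)^2}\) with \(a = x/5\).

```python
def partial(x: float, N: int) -> float:
    """Partial sum of sum_{n=0}^{N} n * (x/5)^n."""
    return sum(n * (x / 5) ** n for n in range(N + 1))

x = 4.0                      # inside the interval (-5, 5)
a = x / 5
closed = a / (1 - a) ** 2    # closed form of sum_{n>=1} n*a^n, |a| < 1
assert abs(partial(x, 500) - closed) < 1e-9

# At x = 5 the terms are n * 1^n = n, which do not even tend to 0: divergence.
```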

Theorems on Power Series

TipTheorem
  1. If \(\sum a_n x^n\) converges at some \(x = c \neq 0\), then it converges absolutely for all \(|x| < |c|\).
  2. If \(\sum a_n x^n\) diverges at some \(x = d \neq 0\), then it diverges for all \(|x| > |d|\).
TipTheorem (Radius of Convergence)

Exactly one holds for \(\sum a_n x^n\):

  1. It converges only if \(x = 0\).
  2. It converges absolutely for all \(x\) (e.g. \(e^x\)).
  3. \(\exists r > 0\): it converges absolutely if \(|x| < r\) and diverges if \(|x| > r\).

Case 3: \(r\) is the radius of convergence, interval \((-r, r)\).


Negative Binomial and Geometric

Negative Binomial

Note\(X \sim \text{NB}\): Negative Binomial

\(P(X\!=\!n \text{ trials until } k \text{ successes})\)

\[ = P(\text{head on } n\text{-th flip}) \cdot P(k-1 \text{ heads in first } n-1 \text{ flips}) \]

\[ = p \cdot \binom{n-1}{k-1} p^{k-1} (1-p)^{(n-1)-(k-1)} = \binom{n-1}{k-1} p^k (1-p)^{n-k} \]

\(n\)-th trial is success (independent), first \(n-1\) trials have \(k-1\) successes.
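A sketch of the NB PMF; for \(k = 1\) it reduces to the geometric PMF, and (numerically) it sums to 1 over \(n \geq k\):

```python
from math import comb

def neg_binom_pmf(n: int, k: int, p: float) -> float:
    """P[exactly n trials are needed to see the k-th success]."""
    return comb(n - 1, k - 1) * p**k * (1 - p)**(n - k)

p = 0.25
# k = 1 is the geometric PMF p * q^(n-1).
assert all(abs(neg_binom_pmf(n, 1, p) - p * (1 - p)**(n - 1)) < 1e-12
           for n in range(1, 20))
# Valid PMF: sums to 1 over n = k, k+1, ... (truncated; the tail is negligible).
assert abs(sum(neg_binom_pmf(n, 3, p) for n in range(3, 500)) - 1.0) < 1e-9
```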

Geometric

Note\(X \sim G(p)\): Geometric

\(P(X\!=\!n \text{ trials until } 1^{\text{st}} \text{ success})\)

\(\therefore = \text{NB}(n, k\!=\!1) = \binom{n-1}{0} p^1 (1-p)^{n-1}\)

\[ = p \cdot (1-p)^{n-1} \]


Geometric Series

TipTheorem (Finite Geometric Sum)

\[ \sum_{k=0}^{n} a^k = \begin{cases} n+1 & \text{if } a = 1 \\ \dfrac{1 - a^{n+1}}{1 - a} & \text{if } a \neq 1 \end{cases} \]

Proof. \(S \stackrel{\Delta}{=} a^0 + a^1 + \cdots + a^n = \sum_{k=0}^{n} a^k\)

\(\therefore aS = a + a^2 + \cdots + a^{n+1}\)

\(\therefore S(1-a) = S - aS = (1 + a + \cdots + a^n) - (a + a^2 + \cdots + a^{n+1}) = 1 - a^{n+1}\)

\(\therefore S = \frac{1 - a^{n+1}}{1 - a}\) since \(a \neq 1\). \(\square\)

Corollary 1: \(\sum_{k=0}^{\infty} a^k = \frac{1}{1-a}\) if \(|a| < 1\).

Corollary 2: \(\sum_{k=1}^{\infty} a^k = \frac{a}{1-a}\) if \(|a| < 1\).

Corollary 3: \(\sum_{k=j}^{\infty} a^k = \frac{a^j}{1-a}\) if \(|a| < 1\).
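The corollaries are easy to check numerically (truncating the infinite sum far enough that the tail is negligible):

```python
a, j = 0.6, 4
# Corollary 3: sum_{k=j}^inf a^k = a^j / (1 - a), truncated at 600 terms.
approx = sum(a**k for k in range(j, 600))
assert abs(approx - a**j / (1 - a)) < 1e-12
# Corollary 1: sum_{k=0}^inf a^k = 1 / (1 - a).
assert abs(sum(a**k for k in range(600)) - 1 / (1 - a)) < 1e-12
```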

Verifying Geometric is a Valid Probability

\[ \begin{aligned} \sum_{n=1}^{\infty} p(n) &\stackrel{G(p)}{=} \sum_{n=1}^{\infty} p \cdot (1-p)^{n-1} = p \cdot \sum_{n=1}^{\infty} (1-p)^{n-1} \\ &= p \cdot \sum_{k=0}^{\infty} (1-p)^k \qquad [k = n - 1] \\ &= p \cdot \frac{1}{1 - (1-p)} \qquad [0 \leq 1-p < 1] \\ &= p \cdot \frac{1}{p} = 1 \quad \therefore \text{"valid" probability} \end{aligned} \]


Geometric Tail Probability

TipTheorem

\(P(X > k) = q^k\) if \(X \sim G(p)\). \(\quad \therefore P(X \leq k) = 1 - q^k\).

Proof. \[ \begin{aligned} P(X > k) &= P(\{X\!=\!k\!+\!1\} \cup \{X\!=\!k\!+\!2\} \cup \cdots) \\ &\stackrel{\text{CA}}{=} \sum_{x=k+1}^{\infty} p(x) \stackrel{G(p)}{=} \sum_{x=k+1}^{\infty} p \cdot q^{x-1} \\ &= p \cdot \sum_{j=k}^{\infty} q^j \qquad [j = x - 1] \\ &\stackrel{\text{Corr 3}}{=} p \cdot \frac{q^k}{1 - q} = \frac{p}{p} \cdot q^k = q^k \quad \square \end{aligned} \]
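Verifying the tail formula numerically (a sketch):

```python
p, k = 0.3, 5
q = 1 - p
# P[X > k] = sum over x = k+1, k+2, ... of p * q^(x-1), truncated.
tail = sum(p * q ** (x - 1) for x in range(k + 1, 300))
assert abs(tail - q**k) < 1e-12
```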

Vandermonde’s Identity

TipFact (Vandermonde’s Identity)

\[ \binom{N}{n} = \sum_{k=0}^{n} \binom{N_1}{k}\binom{N_2}{n-k} \qquad \text{if } N = N_1 + N_2 \]

and \(\binom{k}{j} = 0\) for \(j > k\).

\(N\): total pop., \(N_1\): # type 1, \(N_2\): # type 2. Sum to \(N\).


Hypergeometric Probability

Suppose batch of \(N\) items. Sample \(n\) w/o replacement. \(N_1\) type 1, \(N_2\) type 2.

\[ \binom{N}{n} = \binom{N_1 + N_2}{n} = \sum_{k=0}^{n} \binom{N_1}{k}\binom{N_2}{n-k} \]

# different ways to draw \(k\) “good” (Type 1) and \(n - k\) “bad” (Type 2). Add over all ways for 0 “good”, 1 “good”, … \(n\) “good”.

\[ \frac{\binom{N_1}{k}\binom{N_2}{n-k}}{\binom{N}{n}} \geq 0 \qquad \text{and} \qquad \sum_{k=0}^{n} \frac{\binom{N_1}{k}\binom{N_2}{n-k}}{\binom{N}{n}} = 1 \]

\(\therefore\) obeys all axioms of probability, with \(P[k \text{ "good" from draw of } n]\) = hypergeometric p.m.f.

Example: Fishing

Ex: Small lake, \(N = 50\) fish. \(N_1 = 10\) trout, \(N_2 = 40\) large mouth bass.

Bob catches 7 fish w/ replacement (“catch and release”):

\[ P[\text{2 trout and 5 bass}] = b(7, 2, \tfrac{1}{5}) = \binom{7}{2}\left(\frac{1}{5}\right)^2\left(\frac{4}{5}\right)^5 = 0.275 \]

7 fish w/o replacement (“catch and eat”):

\[ P[\text{2 trout and 5 bass}] = \text{Hypergeo}(7, 10, 2, 50) = \frac{\binom{10}{2}\binom{40}{5}}{\binom{50}{7}} = 0.2964 \]
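Both fishing probabilities in a few lines (a sketch):

```python
from math import comb

# With replacement ("catch and release"): binomial, p = 10/50 = 0.2.
p_binom = comb(7, 2) * 0.2**2 * 0.8**5
# Without replacement ("catch and eat"): hypergeometric.
p_hyper = comb(10, 2) * comb(40, 5) / comb(50, 7)
assert abs(p_binom - 0.2753) < 1e-4
assert abs(p_hyper - 0.2964) < 1e-4
```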

Q: Under what conditions is w/ replacement a “good” approximation to w/o replacement?

A: If the pond is an ocean. \(\implies\) \(N\) big, \(n\) small (fixed ratio \(\frac{N_1}{N}\)).


Binomial Approximation to the Hypergeometric

TipTheorem

Hypergeometric \(\to\) Binomial as \(N \to \infty\) with \(p = \frac{m}{N}\) held fixed (\(m\) = number of type-1 items).

Proof. Say \(X \sim \text{Hyp}(n, m, k, N)\). Fixed \(k\) and \(n\).

\[ \begin{aligned} \lim_{N \to \infty} P(X\!=\!k) &= \lim_{N \to \infty} \frac{\binom{m}{k}\binom{N-m}{n-k}}{\binom{N}{n}} \\[6pt] &= \lim_{N \to \infty} \binom{n}{k} \cdot \frac{\prod_{j=1}^{k}(m - k + j) \cdot \prod_{j=1}^{n-k}(N - m - n + k + j)}{\prod_{j=1}^{n}(N - n + j)} \cdot \frac{N^n}{N^n} \end{aligned} \]

Multiplying by \(\frac{N^n}{N^n} = 1\) pairs one factor of \(N\) with each term of the products. As \(N \to \infty\) (with \(\frac{m}{N}\) fixed), each ratio in the first product tends to \(\frac{m}{N}\), each in the second to \(1 - \frac{m}{N}\), and each in the denominator to \(1\):

\[ = \binom{n}{k} \cdot \left(\frac{m}{N}\right)^k \left(1 - \frac{m}{N}\right)^{n-k} = \binom{n}{k} p^k (1-p)^{n-k} = b(n, k, p) \quad \square \]
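Watching the hypergeometric PMF approach the binomial as \(N\) grows with \(m/N = p\) fixed (a sketch):

```python
from math import comb

def hyper_pmf(N, m, n, k):
    """P[k type-1 items in a draw of n from N, m of which are type 1]."""
    return comb(m, k) * comb(N - m, n - k) / comb(N, n)

def binom_pmf(n, k, p):
    return comb(n, k) * p**k * (1 - p)**(n - k)

p, n, k = 0.2, 7, 2
target = binom_pmf(n, k, p)
for N in (50, 500, 5000):
    m = int(p * N)          # keep m/N fixed at p
    print(N, abs(hyper_pmf(N, m, n, k) - target))  # gap shrinks as N grows
```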


Recap: Distribution Selection

| | w/ replacement | w/o replacement |
|---|---|---|
| 2 outcomes | Binomial | Hypergeometric |
| 2 outcomes, “until” | Negative Binomial / Geometric | |
| \(\geq 2\) outcomes | Multinomial | Multivariate Hypergeometric |

Hypergeo \(\to\) Binomial. Multi Hypergeo \(\to\) Multinomial.

Multivariate Hypergeometric: \(\dfrac{\binom{N_1}{n_1}\binom{N_2}{n_2}\cdots\binom{N_K}{n_K}}{\binom{N}{n}}\) with \(n = n_1 + \cdots + n_K\), \(N = N_1 + \cdots + N_K\).
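A sketch of the multivariate hypergeometric PMF; with two types it reduces to the ordinary hypergeometric:

```python
from math import comb, prod

def mv_hypergeom(Ns, ns):
    """P[drawing exactly ns[i] items of each type i], sampling w/o replacement."""
    N, n = sum(Ns), sum(ns)
    return prod(comb(Ni, ni) for Ni, ni in zip(Ns, ns)) / comb(N, n)

# Two types: same as the fishing example's hypergeometric probability.
assert abs(mv_hypergeom((10, 40), (2, 5)) - 0.2964) < 1e-4
# Valid PMF: summing over all splits of n = 7 gives 1 (Vandermonde).
total = sum(mv_hypergeom((10, 40), (k, 7 - k)) for k in range(8))
assert abs(total - 1.0) < 1e-12
```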

Issue-Spotting Sequence

  • Q1: Outcomes? (2 or more?)
  • Q2: Sampling? (w/ or w/o replacement)
  • Q3: Until structure?

“UNTIL” \(\implies\) negative binomial (\(k = 1\) success: geometric).