EE 364 Supplemental – Week 02
Rules of Inference
(The logical schema)
Modus Ponens
(Modus ponendo ponens – “mode that affirms by affirming”)
| Premise 1: | \(P \to Q\) |
| Premise 2: | \(P\) |
| Conclusion: | \(Q\) |
\[ [(P \to Q) \mathbin{\&} P] \to Q \]
Proof. By truth table:
| \(P\) | \(Q\) | \(P \to Q\) | \((P \to Q) \land P\) | \(\to\) | \(Q\) |
|---|---|---|---|---|---|
| 1 | 1 | 1 | 1 | 1 | 1 |
| 0 | 1 | 1 | 0 | 1 | 1 |
| 1 | 0 | 0 | 0 | 1 | 0 |
| 0 | 0 | 1 | 0 | 1 | 0 |
All entries in the implication column are 1. \(\square\)
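The truth-table proof is easy to double-check by brute force. A minimal Python sketch (the `implies` helper is ours, encoding the material conditional):

```python
from itertools import product

def implies(a, b):
    """Material conditional: a -> b."""
    return (not a) or b

# [(P -> Q) & P] -> Q must be true under every assignment (a tautology).
rows = [implies(implies(P, Q) and P, Q)
        for P, Q in product([True, False], repeat=2)]
print(all(rows))  # True: modus ponens is valid
```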
Modus Tollens
(Modus tollendo tollens – “mode that denies by denying”)
| Premise 1: | \(P \to Q\) |
| Premise 2: | \(\sim Q\) |
| Conclusion: | \(\sim P\) |
\[ [(P \to Q) \mathbin{\&} \sim Q] \to \sim P \]
A fallacy is a logical error (not factual error).
Truth:
- factual truth: “grass is green”
- logical truth: “green is green” \(\leftarrow\) tautology
Fallacy of Affirming the Consequent
| P1: | \(P \to Q\) |
| P2: | \(Q\) |
| C: | \(\therefore P\) |
(Invalid!)
Fallacy of Denying the Antecedent
| P1: | \(P \to Q\) |
| P2: | \(\sim P\) |
| C: | \(\therefore \sim Q\) |
(Invalid!)
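Both fallacies fail the same brute-force test: each has a counterexample assignment (Python sketch; `implies` encodes the material conditional):

```python
from itertools import product

def implies(a, b):
    """Material conditional: a -> b."""
    return (not a) or b

assignments = list(product([True, False], repeat=2))
# Affirming the consequent: [(P -> Q) & Q] -> P
affirm = [implies(implies(P, Q) and Q, P) for P, Q in assignments]
# Denying the antecedent: [(P -> Q) & ~P] -> ~Q
deny = [implies(implies(P, Q) and not P, not Q) for P, Q in assignments]

print(all(affirm), all(deny))  # False False: neither is a tautology
```

The counterexample for both is \(P = 0\), \(Q = 1\).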
Conditional, Contrapositive, Converse
| Form | Statement | |
|---|---|---|
| Conditional | \(P \to Q\) | |
| Contrapositive | \(\sim Q \to \sim P\) | same logical truth as conditional |
| Converse | \(Q \to P\) | NOT equivalent to the conditional |
i.e. \([P \to Q] \iff [\sim Q \to \sim P]\)
\[ P \to Q \;=\; \sim P \lor Q \]
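These equivalences, and the non-equivalence of the converse, are mechanical to verify (Python sketch):

```python
from itertools import product

def implies(a, b):
    """Material conditional: a -> b."""
    return (not a) or b

rows = list(product([True, False], repeat=2))
contra_ok = all(implies(P, Q) == implies(not Q, not P) for P, Q in rows)
disj_ok = all(implies(P, Q) == ((not P) or Q) for P, Q in rows)
converse_ok = all(implies(P, Q) == implies(Q, P) for P, Q in rows)

print(contra_ok, disj_ok, converse_ok)  # True True False
```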
An argument is valid iff the premises logically imply the conclusion (can’t make if-part TRUE and then-part FALSE).
A proposition is valid by structure, not content.
An argument is sound iff it is valid and all premises are true.
Example: \([(P \lor Q) \to R] \leftrightarrow [P \to (Q \to R)]\)
| \(P\) | \(Q\) | \(R\) | \((P \lor Q) \to R\) | \(\leftrightarrow\) | \(P \to (Q \to R)\) |
|---|---|---|---|---|---|
| 1 | 1 | 1 | 1 | 1 | 1 |
| 0 | 1 | 1 | 1 | 1 | 1 |
| 1 | 0 | 1 | 1 | 1 | 1 |
| 0 | 0 | 1 | 1 | 1 | 1 |
| 1 | 1 | 0 | 0 | 1 | 0 |
| 0 | 1 | 0 | 0 | 0 | 1 |
| 1 | 0 | 0 | 0 | 0 | 1 |
| 0 | 0 | 0 | 1 | 1 | 1 |
Rows \((P, Q, R) = (1, 0, 0)\) and \((0, 1, 0)\) produce 0 in the biconditional column, so the final column is not all 1’s.
\(\therefore\) not a tautology: the two forms are not logically equivalent.
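A quick enumeration confirms exactly which assignments break the biconditional (Python sketch):

```python
from itertools import product

def implies(a, b):
    """Material conditional: a -> b."""
    return (not a) or b

failing = []
for P, Q, R in product([True, False], repeat=3):
    lhs = implies(P or Q, R)          # (P v Q) -> R
    rhs = implies(P, implies(Q, R))   # P -> (Q -> R)
    if lhs != rhs:
        failing.append((int(P), int(Q), int(R)))

print(failing)  # [(1, 0, 0), (0, 1, 0)]: not a tautology
```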
Independence
Suppose \((\Omega, \mathcal{A}, P)\) is a probability space. Events \(A\) and \(B\) are (statistically) independent w.r.t. \(P\) iff \[ P(A \cap B) = P(A) \cdot P(B) \]
“Joint factors into marginals.”
Events \(A_1, A_2, \ldots\) are (mutually) independent \(\iff\) \(P\!\left(\bigcap_{k \in S} A_k\right) = \prod_{k \in S} P(A_k)\) for every finite index set \(S\).
Q&A on Independence
Q: Can \(A\) be independent of itself?
A: iff \(P(A \cap A) = P(A) \cdot P(A)\) \(\iff\) \(P(A) = P(A)^2\) \(\iff\) \(P[A] = 0\) or \(P[A] = 1\).
Q: Can \(A\) be independent of \(\Omega\)?
A: iff \(P(A \cap \Omega) = P(A) \cdot P(\Omega)\) \(\iff\) \(P(A) = P(A) \cdot 1\), which always holds. \(\therefore\) every event is independent of \(\Omega\).
Q: Does \(A\) indep. \(B\) \(\implies\) \(A \cap B = \emptyset\) (disjoint)?
A: No. If \(A\) and \(B\) are disjoint, then \(P(A \cap B) = 0\), so independence forces \(P(A) \cdot P(B) = 0\), i.e. \(P[A] = 0\) or \(P[B] = 0\). Disjoint events with positive probability are never independent.
Example: Coin Flip
Coin flip with \(P[\text{head}] = p\), \(\therefore P[\text{tail}] = 1 - p\). (Independent flips.)
Flip twice:
\[ \begin{aligned} P[H_1 \cap H_2] &= P[H_1] \cdot P[H_2] = P[H]^2 = p^2 \\ P[H_1 \cap T_2] &= P[H_1] \cdot P[T_2] = P[H] \cdot P[T] = p(1-p) \\ P[T_1 \cap H_2] &= P[T_1] \cdot P[H_2] = P[T] \cdot P[H] = (1-p)p \\ P[T_1 \cap T_2] &= P[T_1] \cdot P[T_2] = P[T]^2 = (1-p)^2 \end{aligned} \]
\[ P[\{H_1 \cap H_2\} \cup \{H_1 \cap T_2\} \cup \{T_1 \cap H_2\} \cup \{T_1 \cap T_2\}] = p^2 + 2p(1-p) + (1-p)^2 = P[\Omega] = 1 \]
Q: \(P[\text{1 tail AND 1 head (any order)}]\)
\[ = P[\{H_1 \cap T_2\} \cup \{T_1 \cap H_2\}] \stackrel{\text{disjoint, CA}}{=} P[H_1 \cap T_2] + P[T_1 \cap H_2] = 2p(1-p) \]
Q: \(P[\text{at least 1 tail}]\)
\[ = P[\{H_1 \cap T_2\} \cup \{T_1 \cap H_2\} \cup \{T_1 \cap T_2\}] = 2p(1-p) + (1-p)^2 \]
Or by complement:
\[ = 1 - P[(\text{at least 1 tail})^c] = 1 - P[\text{no tail}] = 1 - p^2 \]
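The four-outcome bookkeeping above is easy to check numerically for any \(p\) (Python sketch; \(p = 0.3\) is an arbitrary illustration value):

```python
from itertools import product

p = 0.3  # P[head]; any p in [0, 1] works

# Probability of each of the four outcomes of two independent flips.
pr = {(a, b): (p if a == "H" else 1 - p) * (p if b == "H" else 1 - p)
      for a, b in product("HT", repeat=2)}

total = sum(pr.values())                    # outcomes partition Omega
one_each = pr[("H", "T")] + pr[("T", "H")]  # one head AND one tail
at_least_one_tail = 1 - pr[("H", "H")]      # complement of "no tail"

print(round(total, 10), round(one_each, 10), round(at_least_one_tail, 10))
# 1.0 0.42 0.91
```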
Conditional Probability
\[ P(B \mid A) = \frac{P(A \cap B)}{P(A)} \qquad \text{if } P(A) > 0 \]
\(P(A \cap B)\): joint; \(P(A)\): marginal; \(P(B \mid A)\): conditional.
\(\therefore\) \(A\) & \(B\) independent \(\iff\) \(P(B \mid A) = P(B)\) \(\iff\) \(P(A \mid B) = P(A)\) (assuming \(P(A), P(B) > 0\))
Proof. \[ P(A \mid B) = \frac{P(A \cap B)}{P(B)} = \frac{P(A) \cdot \cancel{P(B)}}{\cancel{P(B)}} = P(A) \quad \square \]
\(P[A] = 0\) does NOT imply \(A = \emptyset\). (But \(A = \emptyset \implies P[A] = 0\).)
Conditional probability is not symmetric: in general \(P[A \mid B] \neq P[B \mid A]\).
Intuition for Conditioning
If you know \(A\) occurred, you constrain the size of the sample space of \(B\):
- \(B \cap A\)
- \(B^c \cap A\)
The value \(P[A]\) normalizes area to 1.
Example: 4-Ball Urn
Ex: Urn experiment with 4 numbered balls: #1 and #2 blue, #3 and #4 red.
Events: \(A\): blue ball, \(B\): even ball, \(C\): ball \(> 2\).
\[ P[A \mid B] = \frac{P[A \cap B]}{P[B]} = \frac{1/4}{1/2} = \frac{1}{2} \]
“Blue given even.” Note: \(= P[A]\), so \(A\) and \(B\) are independent.
\[ P[A \mid C] = \frac{P[A \cap C]}{P[C]} = \frac{0}{1/2} = 0 \]
“Blue given \(> 2\).” Note: \(\neq P[A]\), so \(A\) and \(C\) are not independent.
\(\therefore\) conditioning can (but may not) affect computed probability.
Chain Rule for Joint Probability
\[ P[A \cap B] = P[A \mid B] \cdot P[B] = P[B \mid A] \cdot P[A] \]
Example: 5-Ball Urn, Drawing 2
Ex: Urn experiment with 5 balls: 2 blue, 3 red. Draw 2 balls. \(P[\text{2 blue balls}]\)?
Consider as 2 “sub” experiments:
\[ P[B_1 \cap B_2] = P[B_1] \cdot P[B_2 \mid B_1] = \frac{2}{5} \cdot \frac{1}{4} = \frac{1}{10} \]
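The chain-rule answer can be confirmed by enumerating all ordered draws (Python sketch):

```python
from fractions import Fraction
from itertools import permutations

balls = ["b", "b", "r", "r", "r"]  # 2 blue, 3 red

draws = list(permutations(range(5), 2))  # ordered draws w/o replacement
both_blue = Fraction(
    sum(1 for i, j in draws if balls[i] == "b" and balls[j] == "b"),
    len(draws),
)
print(both_blue)  # 1/10
```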
Suppose \(A\) and \(B\) are independent. Then:
- \(P[A \mid B] = P[A]\)
- \(P[B \mid A] = P[B]\)
Proof. \[ P[B \mid A] = \frac{P[A \cap B]}{P[A]} = \frac{P[B] \cdot \cancel{P[A]}}{\cancel{P[A]}} = P[B] \]
And similarly for \(P[A \mid B]\). \(\square\)
Conditioning does not affect the probability of independent events.
Probabilistic Modus Ponens
| Premise 1: | \(P(B \mid A) \geq c\) |
| Premise 2: | \(P(A) \geq a\) |
| Conclusion: | \(P(B) \geq a \cdot c\) |
Proof. \[ \begin{aligned} P(B) &\geq P(A \cap B) &&\text{[monotonicity]} \\ &= P(A) \cdot \frac{P(A \cap B)}{P(A)} &&\text{[}P(A) > 0\text{]} \\ &= P(A) \cdot P(B \mid A) &&\text{[defn cond. prob.]} \\ &\geq a \cdot c &&\text{[premise 1 and 2]} \quad \square \end{aligned} \]
Check: \(a = c = 1\) \(\implies\) \(P(B) = 1\).
Probabilistic Modus Tollens
| Premise 1: | \(P(B \mid A) \geq c > 0\) |
| Premise 2: | \(P(B) \leq b\) |
| Conclusion: | \(P(A) \leq \min\!\left(1, \;\dfrac{b}{c}\right)\) |
Proof. \[ \begin{aligned} P(A) &\leq P(A) \cdot \frac{P(B)}{P(A \cap B)} &&\text{[since } P(A \cap B) \leq P(B)\text{]} \\[6pt] &= \frac{P(B)}{P(B \mid A)} \\[6pt] &\leq \frac{b}{P(B \mid A)} &&\text{[premise 2]} \\[6pt] &\leq \frac{b}{c} &&\text{[premise 1]} \end{aligned} \]
Always \(P(A) \leq 1\), \(\therefore P(A) \leq \min\!\left(1, \frac{b}{c}\right)\). \(\square\)
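Both probabilistic bounds can be stress-tested on randomly generated two-event spaces (Python sketch; the four random weights are an arbitrary joint distribution, not part of the notes):

```python
import random

random.seed(0)

for _ in range(1000):
    # Random joint distribution over (A&B, A&B^c, A^c&B, A^c&B^c).
    w = [random.random() for _ in range(4)]
    s = sum(w)
    pAB, pABc, pAcB, pAcBc = (x / s for x in w)
    pA, pB = pAB + pABc, pAB + pAcB
    pB_given_A = pAB / pA

    # Probabilistic modus ponens: P(B) >= P(A) * P(B | A).
    assert pB >= pA * pB_given_A - 1e-12
    # Probabilistic modus tollens: P(A) <= min(1, P(B) / P(B | A)).
    assert pA <= min(1, pB / pB_given_A) + 1e-12

print("both bounds hold on 1000 random spaces")
```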
Proof by Induction
Prove \(A(n)\) holds \(\forall n\):
- Prove \(A(1)\) (basis step)
- Show \(A(n) \implies A(n+1)\) (induction step)
Ex: \(n^2 + n\) even \(\forall n\).
- Basis: \(n = 1\): \(1^2 + 1 = 2 = 2 \cdot k\) for \(k = 1\). \(\therefore\) even. QED-basis.
- Induction: Suppose \(n^2 + n = 2k'\) for some \(k' \in \mathbb{Z}^+\). Then
\[ (n+1)^2 + (n+1) = n^2 + 2n + 1 + n + 1 = \underbrace{n^2 + n}_{2k'} + 2(n+1) = 2(k' + n + 1) \quad \square \]
Boole’s Inequality
\[ P\left(\bigcup_{k=1}^{n} A_k\right) \leq \sum_{k=1}^{n} P(A_k) \]
Proof. By induction on \(n = 2, 3, 4, \ldots\)
Basis Step: \(n = 2\).
\[ P(A_1 \cup A_2) = P(A_1) + P(A_2) - P(A_1 \cap A_2) \leq P(A_1) + P(A_2) = \sum_{k=1}^{2} P(A_k) \]
(Addition theorem, \(P[A] \geq 0\).) QED-basis.
Induction Step: Assume holds for \(n\). Derive for \(n + 1\).
Induction Hypothesis (IH): \(P\!\left(\bigcup_{k=1}^{n} A_k\right) \leq \sum_{k=1}^{n} P(A_k)\)
\[ \begin{aligned} P\left(\bigcup_{k=1}^{n+1} A_k\right) &= P\left(\left(\bigcup_{k=1}^{n} A_k\right) \cup A_{n+1}\right) \\ &\leq P\left(\bigcup_{k=1}^{n} A_k\right) + P(A_{n+1}) \\ &\leq \sum_{k=1}^{n} P(A_k) + P(A_{n+1}) \\ &= \sum_{k=1}^{n+1} P(A_k) \quad \square \end{aligned} \]
Inclusion-Exclusion
\[ P\left(\bigcup_{k=1}^{n} A_k\right) = \sum_{k=1}^{n} P(A_k) - \sum_{i < j} P(A_i \cap A_j) + \sum_{i < j < k} P(A_i \cap A_j \cap A_k) - \cdots + (-1)^{n+1} P(A_1 \cap A_2 \cap \cdots \cap A_n) \]
Generalizes the two-event addition theorem to \(n\) events; unlike Boole’s inequality, it is an equality.
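Boole’s inequality and the three-event inclusion-exclusion identity can both be checked on a small finite space (Python sketch; the uniform space and events are made up for illustration):

```python
from fractions import Fraction
from itertools import combinations

# Uniform space Omega = {0,...,9}; three overlapping events.
omega = range(10)
A = [{0, 1, 2, 3}, {2, 3, 4, 5}, {5, 6, 7}]

def prob(event):
    return Fraction(len(event), len(omega))

lhs = prob(set().union(*A))       # P(union), exact
boole = sum(prob(a) for a in A)   # Boole upper bound
ie = (boole
      - sum(prob(a & b) for a, b in combinations(A, 2))
      + prob(A[0] & A[1] & A[2])) # inclusion-exclusion, n = 3

print(lhs, boole, ie)  # 4/5 11/10 4/5
```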
Technique for Unions
Unions are usually hard to work with, \(\therefore\) use either:
- De Morgan’s
- Additivity (find disjoint decomposition)
Multiplication Theorem
\[ P\left(\bigcap_{k=1}^{n} A_k\right) = P(A_1) \cdot P(A_2 \mid A_1) \cdots P(A_n \mid A_1 \cap \cdots \cap A_{n-1}) \]
Compare to:
- Independence
- \(= P(A_1) \cdot P(A_2) \cdots P(A_n) = \prod_{k=1}^{n} P(A_k)\)
No dependence between events.
- Markov
- \(= P(A_1) \cdot P(A_2 \mid A_1) \cdot P(A_3 \mid A_2) \cdots P(A_n \mid A_{n-1}) = P(A_1) \cdot \prod_{k=2}^{n} P(A_k \mid A_{k-1})\)
Each event depends only on the previous one.
Example: Drawing 3 Aces
With replacement (independent samples): \(P[A_1 \cap A_2] = P[A_1] \cdot P[A_2]\)
Without replacement: \(P[A_1 \cap A_2] = P[A_1] \cdot P[A_2 \mid A_1]\)
\[ \begin{aligned} P[\text{draw 3 aces, w/o replacement}] &= P[A_1 \cap A_2 \cap A_3] \\ &= P[A_1] \cdot P[A_2 \mid A_1] \cdot P[A_3 \mid A_1 \cap A_2] \\ &= \frac{4}{52} \cdot \frac{3}{51} \cdot \frac{2}{50} \\ &= \frac{1}{5525} \approx 0.00018 \end{aligned} \]
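Sanity check with exact arithmetic (Python sketch):

```python
from fractions import Fraction

# Chain rule without replacement: 4 aces in a 52-card deck, draw 3.
p_three_aces = Fraction(4, 52) * Fraction(3, 51) * Fraction(2, 50)
print(p_three_aces, float(p_three_aces))  # 1/5525, about 0.000181
```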
Proof of Multiplication Theorem
Proof. By induction on \(n = 2, 3, 4, \ldots\)
Basis Step: \(n = 2\). \[ P(A_1 \cap A_2) = P(A_1) \cdot P(A_2 \mid A_1) \quad \text{by defn cond. prob.} \] QED-basis.
Induction Step: Assume holds for \(n\). Derive for \(n + 1\).
IH: \(P\!\left(\bigcap_{k=1}^{n} A_k\right) = P(A_1) \cdot P(A_2 \mid A_1) \cdots P(A_n \mid A_1 \cap \cdots \cap A_{n-1})\)
\[ \begin{aligned} P\left(\bigcap_{k=1}^{n+1} A_k\right) &= P\left(\bigcap_{k=1}^{n} A_k \;\cap\; A_{n+1}\right) \\ &\stackrel{\text{assoc.}}{=} P\left(\left(\bigcap_{k=1}^{n} A_k\right) \cap A_{n+1}\right) \\ &= P\left(\bigcap_{k=1}^{n} A_k\right) \cdot P\!\left(A_{n+1} \;\middle|\; \bigcap_{k=1}^{n} A_k\right) \\ &= P(A_1) \cdot P(A_2 \mid A_1) \cdots P(A_n \mid A_1 \cap \cdots \cap A_{n-1}) \cdot P\!\left(A_{n+1} \;\middle|\; \bigcap_{k=1}^{n} A_k\right) \quad \square \end{aligned} \]
Partition and Total Probability
Suppose \((\Omega, \mathcal{A})\) is a measurable space. Then \(\{H_k\} \subset \mathcal{A}\) partition \(\Omega\) iff:
- \(\bigcup_k H_k = \Omega\)
- Pairwise disjoint: \(H_i \cap H_j = \emptyset\) if \(i \neq j\)
Union that does not overlap \(\therefore\) CA applies.
\[ P(E) = \sum_k P(H_k) \cdot P(E \mid H_k) \qquad \text{if } \{H_k\} \text{ partitions } \Omega \]
Proof. \[ \begin{aligned} P(E) &= P[E \cap \Omega] \\ &= P\!\left[E \cap \left(\bigcup_k H_k\right)\right] &&\text{[}\{H_k\}\text{ partition } \Omega\text{]} \\ &= P\left(\bigcup_k (E \cap H_k)\right) \\ &= \sum_k P(E \cap H_k) &&\text{[}\{H_k\}\text{ partition } \Omega \therefore \text{disjoint]} \\ &= \sum_k P(H_k) \cdot P(E \mid H_k) &&\text{[defn cond. prob.]} \quad \square \end{aligned} \]
Example: 5-Ball Urn, Total Probability
Ex: Urn with 5 balls: 2 blue, 3 red. \(P[\text{second ball is red}]\)?
Events: \(B_1 = \{b_1 \cap b_2,\; b_1 \cap r_2\}\) (first ball blue), \(\sim\! B_1 = R_1 = \{r_1 \cap b_2,\; r_1 \cap r_2\}\) (first ball red).
Partition: \(B_1\) vs. \(\sim B_1\).
\[ \begin{aligned} P[r_2] &= P[r_2 \mid b_1] \cdot P[b_1] + P[r_2 \mid r_1] \cdot P[r_1] \\ &= \frac{3}{4} \cdot \frac{2}{5} + \frac{1}{2} \cdot \frac{3}{5} \\ &= \frac{3}{5} \quad (60\%) \end{aligned} \]
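The same enumeration idea verifies the total-probability answer (Python sketch):

```python
from fractions import Fraction
from itertools import permutations

balls = ["b", "b", "r", "r", "r"]  # 2 blue, 3 red

draws = list(permutations(range(5), 2))  # ordered draws w/o replacement
second_red = Fraction(sum(1 for i, j in draws if balls[j] == "r"),
                      len(draws))
print(second_red)  # 3/5
```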
Another Representation
Data = Evidence. Hypotheses \(H_1, H_2, \ldots, H_n\) partition sample space.
\[ P[E] = \sum_{k=1}^{n} \underbrace{P[E \mid H_k]}_{\text{"likelihood"}} \cdot \underbrace{P[H_k]}_{\text{"prior"}} \]
- Total Expectation
- \(E_X[X] = E_Y[E[X \mid Y]]\)
- Total Variance
- \(V_X[X] = E_Y[V[X \mid Y]] + V_Y[E[X \mid Y]]\)
Extremely important for problem solving.
Special Case: \(\Omega = A \cup A^c\)
\[ \therefore P(B) \stackrel{\text{total prob.}}{=} P(A) \cdot P(B \mid A) + P(A^c) \cdot P(B \mid A^c) \]
For any \(B\).
\[ P[B \mid A] = \frac{P[A \mid B] \cdot P[B]}{P[A]} \qquad \text{for any event } B,\; \text{if } P[A] > 0 \]
Proof. \[ P[B \mid A] = \frac{P[A \cap B]}{P[A]} = \frac{P[A \mid B] \cdot P[B]}{P[A]} \quad \square \]
(Rewriting \(P[A \cap B] = P[A \mid B] \cdot P[B]\).)
Bayes’ Theorem
\[ P(H_j \mid E) = \frac{P(E \mid H_j) \cdot P(H_j)}{\sum_k P(E \mid H_k) \cdot P(H_k)} \qquad \text{if } \{H_k\} \text{ partitions } \Omega \]
Proof. \[ \begin{aligned} P(H_j \mid E) &= \frac{P(E \cap H_j)}{P(E)} &&\text{[defn cond. prob.]} \\[6pt] &= \frac{P(H_j) \cdot P(E \mid H_j)}{P(E)} &&\text{[defn cond. prob.]} \\[6pt] &= \frac{P(H_j) \cdot P(E \mid H_j)}{\sum_k P(H_k) \cdot P(E \mid H_k)} &&\text{[total prob.]} \quad \square \end{aligned} \]
| Term | Notation |
|---|---|
| Prior | \(P(H_j)\) |
| Posterior | \(P(H_j \mid E)\) |
| Likelihood | \(P(E \mid H_j)\) |
\(\therefore\) with \(a = P(E \mid H_j) \cdot P(H_j)\) and \(b = \sum_{k \neq j} P(E \mid H_k) \cdot P(H_k)\): \(P(H_j \mid E) = \frac{a}{a+b}\)
IMAC Format
- I – Issue
- Associative memory (“spot the issue”)
- M – Math rule
- Rote memorization
- A – Apply math to facts
- (Analysis) Pattern matching; identify element from given facts
- C – Conclusion
- Deductive logic
Mnemonic: “Party Unconditionally To Conquer Bayes”
| Letter | Step | |
|---|---|---|
| P | Partition | \(\{H_k\}\) or \(A \cup A^c\) |
| U | Unconditional probability | \(P(B)\) |
| T | Total Probability | |
| C | Conditional Probability | \(P(B \mid A)\) |
| B | Bayes’ Theorem | |
Issue Spotting Sequence
- Is there a partition?
- Unconditional probability \(P(B)\)? \(\therefore\) Use Total Probability.
- Conditional probability \(P(B \mid A)\)? \(\therefore\) Use Bayes’ Theorem.
Example: Bayes’ – Special Case \(H\) vs \(H^c\)
\[ P[H \mid E] = \frac{P[E \mid H] \cdot P[H]}{P[E \mid H] \cdot P[H] + P[E \mid H^c] \cdot P[H^c]} \]
\(H\) and \(H^c\) partition \(\Omega\): \(H \cup H^c = \Omega\), \(H \cap H^c = \emptyset\).
Example: Cancer Screening
Ex: Cancer (\(H\)) vs. not cancer (\(H^c\)). Evidence = test with diagnostics:
- Prior: 1% of women have breast cancer (\(\therefore\) 99% do not)
- 80% of tests are positive when cancer is there (20% false negatives)
- 9.6% of tests are positive when cancer is not there (90.4% true negatives)
| | Cancer (\(H\)) – 1% | \(\sim\)Cancer (\(H^c\)) – 99% |
|---|---|---|
| Test \(+\) | TP: 0.80 | FP: 0.096 |
| Test \(-\) | FN: 0.20 | TN: 0.904 |
Columns sum to 1.
\[ \begin{aligned} P[\text{Cancer} \mid +] &= \frac{P[+ \mid \text{Cancer}] \cdot P[\text{Cancer}]}{P[+ \mid \text{Cancer}] \cdot P[\text{Cancer}] + P[+ \mid \sim\!\text{Cancer}] \cdot P[\sim\!\text{Cancer}]} \\[6pt] &= \frac{(0.80)(0.01)}{(0.80)(0.01) + (0.096)(0.99)} \\[6pt] &= 0.0776 \quad (\sim 7.76\%) \end{aligned} \]
\(\therefore\) get second opinion.
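The computation packages naturally as a two-hypothesis Bayes function (Python sketch; the function name and signature are ours):

```python
def posterior(prior, sensitivity, false_positive_rate):
    """P[H | +] via Bayes over the partition {H, H^c}."""
    num = sensitivity * prior
    return num / (num + false_positive_rate * (1 - prior))

p_cancer_given_pos = posterior(prior=0.01, sensitivity=0.80,
                               false_positive_rate=0.096)
print(round(p_cancer_given_pos, 4))  # 0.0776
```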
Example: Binary Channel
Problem: Transmit a single noisy bit. Guess transmitted given received.
\[ \begin{aligned} P[Y\!=\!1 \mid X\!=\!1] &= 1 - \gamma \\ P[Y\!=\!1 \mid X\!=\!0] &= \varepsilon \end{aligned} \]
\[ \begin{aligned} P[Y\!=\!0 \mid X\!=\!1] &= \gamma \quad \text{(error)} \\ P[Y\!=\!0 \mid X\!=\!0] &= 1 - \varepsilon \end{aligned} \]
A binary symmetric channel has \(\gamma = \varepsilon\).
MAP Estimation
Q: Suppose \(Y = 1\) (received 1). Guess \(X = 0\) or \(X = 1\)?
A: Depends on \(\gamma\) and \(\varepsilon\). Choose hypothesis (\(X = 0\) or \(1\)) with higher probability.
- Estimate \(\hat{X} = 1\) if \(P[X\!=\!1 \mid Y\!=\!1] > P[X\!=\!0 \mid Y\!=\!1]\)
- \(\hat{X} = 0\) otherwise
“Pick biggest”
The MAP (maximum a posteriori) estimate is the hypothesis that maximizes the probability given the observation: \[ \hat{H}^{\text{MAP}} = \arg\max_k P(H_k \mid E) \]
Deriving the Decision Rule
In our problem: \(P[X\!=\!1 \mid Y\!=\!1] \;\underset{\hat{X}=0}{\overset{\hat{X}=1}{\gtrless}}\; P[X\!=\!0 \mid Y\!=\!1]\)
By Bayes (same denominator on both sides, cancels):
\[ P[Y\!=\!1 \mid X\!=\!1] \cdot P[X\!=\!1] \;\gtrless\; P[Y\!=\!1 \mid X\!=\!0] \cdot P[X\!=\!0] \]
\[ (1 - \gamma) \cdot p \;\gtrless\; \varepsilon \cdot (1 - p) \]
where \(p = P[X = 1]\).
“Odds” form.
\[ \text{iff} \quad \frac{P[X\!=\!1]}{P[X\!=\!0]} \;\gtrless\; \frac{\varepsilon}{1 - \gamma} \]
Suppose \(P[X\!=\!1] = P[X\!=\!0]\) (transmit 50% 1 and 50% 0):
\[ \implies \frac{0.50}{0.50} = 1 \;\gtrless\; \frac{\varepsilon}{1 - \gamma} \]
\(\therefore\) if \(Y = 1\), choose:
- \(\hat{X} = 1\) if \(\frac{\varepsilon}{1 - \gamma} < 1\), i.e. \(\varepsilon < 1 - \gamma\)
\(P[\text{flip}] < P[\text{not flip}]\)
- \(\hat{X} = 0\) if \(\frac{\varepsilon}{1 - \gamma} > 1\), i.e. \(\varepsilon > 1 - \gamma\)
\(P[\text{flip}] > P[\text{not flip}]\)
MAP hypothesis test.
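The whole decision rule fits in a few lines (Python sketch; `map_decision` and its signature are ours):

```python
def map_decision(y, p, gamma, eps):
    """MAP estimate of X given received bit y.

    p     = P[X=1] (prior)
    gamma = P[Y=0 | X=1] (1 -> 0 flip probability)
    eps   = P[Y=1 | X=0] (0 -> 1 flip probability)
    """
    # Compare unnormalized posteriors P[Y=y | X=x] * P[X=x];
    # the common denominator P[Y=y] cancels.
    like1 = (1 - gamma) if y == 1 else gamma
    like0 = eps if y == 1 else (1 - eps)
    return 1 if like1 * p > like0 * (1 - p) else 0

# Symmetric channel, equal priors: trust the received bit
# whenever eps < 1 - gamma (flip less likely than no flip).
print(map_decision(y=1, p=0.5, gamma=0.1, eps=0.1))  # 1
print(map_decision(y=0, p=0.5, gamma=0.1, eps=0.1))  # 0
```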