Bayes Theorem
Given M mutually exclusive and exhaustive events B1, B2, ..., BM, regarded as
outcomes or decisions, Bayes Theorem computes the probability of any one of these events Bi
conditioned on (given) an event A called the findings. This conditional probability, p(Bi|A), is
called the a posteriori probability ("after the fact" or "after the findings"). It is computed in
terms of the a priori probability p(Bi) ("before the fact" or "before the findings"), the
probability of the findings p(A), and the expression p(A|Bi), which is also called the likelihood
function (the probability of the findings given the event Bi). That is,
$$
p(B_i \mid A) = \frac{p(A \mid B_i)\, p(B_i)}{p(A)} = \frac{p(A \mid B_i)\, p(B_i)}{\sum_{j=1}^{M} p(A \mid B_j)\, p(B_j)}
$$
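As an illustration of the formula, the following is a minimal Python sketch that computes the a posteriori probabilities p(Bi|A) from a list of a priori probabilities p(Bi) and a list of likelihoods p(A|Bi). The function name posterior and the numbers for M = 3 are hypothetical and chosen only for illustration; exact fractions are used so that the results can be checked by hand.

```python
import math
from fractions import Fraction


def posterior(priors, likelihoods):
    """Return the list of p(Bi|A), given p(Bi) and p(A|Bi) for i = 1..M."""
    if len(priors) != len(likelihoods):
        raise ValueError("need exactly one likelihood p(A|Bi) per event Bi")
    if not math.isclose(sum(priors), 1):
        raise ValueError("priors must sum to 1 (the events Bi are exhaustive)")
    # Numerators of the formula: p(A|Bi) * p(Bi), one per event.
    joint = [lik * pri for lik, pri in zip(likelihoods, priors)]
    # Denominator: p(A) = sum over j of p(A|Bj) * p(Bj).
    p_a = sum(joint)
    if p_a == 0:
        raise ValueError("p(A) is zero, so the posterior is undefined")
    return [j / p_a for j in joint]


# Hypothetical example values (not taken from the text).
priors = [Fraction(1, 2), Fraction(3, 10), Fraction(1, 5)]        # p(Bi)
likelihoods = [Fraction(1, 10), Fraction(1, 2), Fraction(9, 10)]  # p(A|Bi)
post = posterior(priors, likelihoods)
print(post)  # the posteriors p(B1|A), p(B2|A), p(B3|A)
```

Because the denominator is the sum of the numerators, the returned probabilities always sum to one, whatever priors and likelihoods are supplied.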
Axioms of Probability Theory
The derivation of Bayes Theorem is based on the three axioms of probability theory.
Consider a random situation defined on a sample description space S, where a probability
function p(·) assigns a real number p(E) to every event E in S.
The probability function must satisfy three axioms:
Axiom 1: p(E) is greater than or equal to zero for every event E.
Axiom 2: p(S) = 1 for the certain event S.
Axiom 3: p(E + F) = p(E) + p(F) if EF = 0, where + denotes union,
EF is the intersection of E and F, and 0 denotes the impossible
(empty) event. In words, the probability of the union of two
mutually exclusive events is the sum of their probabilities.
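The three axioms can be checked mechanically on a small finite sample space. The sketch below assumes a hypothetical example, a fair six-sided die, and tests each axiom over every event (every subset of S). The names S, weight, p, E and F are illustrative only. Note that in the code the operator | is Python's set union, not the conditioning bar.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical sample description space: the six faces of a fair die.
S = frozenset(range(1, 7))
weight = {outcome: Fraction(1, 6) for outcome in S}


def p(event):
    """Probability of an event, where an event is a subset of S."""
    return sum((weight[o] for o in event), Fraction(0))


# Every event in S: all 2**6 = 64 subsets, including the empty event.
events = [frozenset(c) for r in range(len(S) + 1) for c in combinations(S, r)]

# Axiom 1: p(E) >= 0 for every event E.
axiom1 = all(p(e) >= 0 for e in events)
# Axiom 2: p(S) = 1 for the certain event S.
axiom2 = p(S) == 1
# Axiom 3: p(E + F) = p(E) + p(F) whenever E and F do not intersect.
axiom3 = all(
    p(e | f) == p(e) + p(f)
    for e in events
    for f in events
    if not (e & f)
)

E = frozenset({1, 2})  # "one or two"
F = frozenset({5, 6})  # "five or six"
print(axiom1, axiom2, axiom3, p(E | F))
```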
Derivation of Bayes Theorem
By symmetry,
p(Bi,A) = p(A,Bi)
Using the definition of conditional probability, it follows that
p(Bi|A)p(A) = p(A|Bi)p(Bi)
or
p(Bi|A) = p(A|Bi)p(Bi) / p(A)
Because B1, B2, ..., BM are mutually exclusive and exhaustive events which cover the entire
decision or outcome space (Axiom 2), the event A is the union of the mutually exclusive
events (A,B1), (A,B2), ..., (A,BM). Applying Axiom 3 repeatedly to this union results in
p(A) = p(A,B1) + p(A,B2) + ... + p(A,BM)
or
p(A) = p(A|B1)p(B1) + p(A|B2)p(B2) + ... + p(A|BM)p(BM)
which completes the derivation of Bayes Theorem.
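Each step of the derivation can be verified numerically. The sketch below starts from a hypothetical table of joint probabilities for M = 3 (the values are illustrative assumptions, not data from the text), forms the conditional probabilities from their definition, and checks both the product identity p(Bi|A)p(A) = p(A|Bi)p(Bi) and the expansion of p(A) as a sum over the events Bi.

```python
from fractions import Fraction

# Hypothetical joint probabilities for M = 3 events B1, B2, B3.
joint_a = [Fraction(1, 20), Fraction(3, 20), Fraction(9, 50)]      # p(A, Bi)
joint_not_a = [Fraction(9, 20), Fraction(3, 20), Fraction(1, 50)]  # p(not A, Bi)

# Marginal probabilities obtained by summing the joint table.
p_b = [a + n for a, n in zip(joint_a, joint_not_a)]  # p(Bi)
p_a = sum(joint_a)                                   # p(A) = sum of p(A, Bi)

# Definition of conditional probability.
p_a_given_b = [a / b for a, b in zip(joint_a, p_b)]  # p(A|Bi) = p(A,Bi) / p(Bi)
p_b_given_a = [a / p_a for a in joint_a]             # p(Bi|A) = p(Bi,A) / p(A)

# Step 1: p(Bi|A) p(A) = p(A|Bi) p(Bi) for every i.
product_identity_holds = all(
    pba * p_a == pab * pb
    for pba, pab, pb in zip(p_b_given_a, p_a_given_b, p_b)
)

# Step 2: p(A) = p(A|B1)p(B1) + ... + p(A|BM)p(BM).
total_probability_holds = p_a == sum(
    pab * pb for pab, pb in zip(p_a_given_b, p_b)
)

print(product_identity_holds, total_probability_holds)
```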
Limitations of Bayes Theorem
A major limitation of Bayes Theorem is that it does not provide a way to construct
or estimate the conditional probabilities (likelihoods) p(A|Bi). That is
accomplished in the discipline of Statistical Pattern Recognition as created by
Dr. Edward A. Patrick. Bayes Theorem in the context of Statistical Pattern Recognition
has been referred to as Bayes Framework.
Another limitation of Bayes Theorem is the restriction that events in the decision or
outcome space must be mutually exclusive. This eliminates the possibility of complex
classes such as (Bi,Bj) being directly considered in the decision space. The facility of
including complex classes is provided by Patrick's Theorem.