**Next message:**Danhall Anna: "Mealey Response 3"**Previous message:**HARNAD Stevan: "Koehler: The Bayesian Calculation"**Next in thread:**Dock Jenny: "Re: Koehler: Conditional Probability"**Maybe reply:**Dock Jenny: "Re: Koehler: Conditional Probability"**Maybe reply:**HARNAD Stevan: "Re: Koehler: Conditional Probability"**Messages sorted by:**[ date ] [ thread ] [ subject ] [ author ]

On Sun, 31 May 1998 Eal195@aol.com wrote:

> Dear Stevan,
> I'm tying myself in knots over 'inverse fallacy', any chance of a kid-sib
> explanation? I can't seem to grasp the difference between (to use Koehler's
> example) an innocent person (not source) being a "match" for a DNA sample and
> its posterior probability/inverse. Isn't the probability of the sample
> matching an innocent person the same? Or does it mean matching a particular
> innocent person? Posterior probability is the probability of an event
> occurring given another event isn't it?
> Thanks
> Liz

First let me review probability, conditional probability, prior

probability, posterior probability and Bayes' Rule. I'll use a slight

variant of the examples I used in class:

The probability of something happening can be interpreted as a kind of

percentage. If the probability that someone will be Tall (where "Tall"

means 5 feet 9 inches or taller) is 50%, that means out of every 10

people, 5, on average, should be Tall.

A conditional probability is a probability of something happening GIVEN

that something else happens. The probability of someone being Tall may

be 50% when they are chosen at random, but the probability of a Male

being Tall (the probability of being Tall GIVEN that someone is a Male)

may be, let's say, 80%.
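To make the "percentage" reading concrete, here is a minimal sketch in Python that gets both numbers by plain counting. The 100-person population and the way it splits into tall and short males and females are my own assumptions, chosen only so that the counts reproduce the 50% and 80% above; the helper names are made up for the illustration.

```python
from fractions import Fraction

# Hypothetical population of 100 people as (sex, is_tall) records.
# The counts are invented so that P(T) = 50% and P(T|M) = 80%.
population = (
    [("M", True)] * 32 + [("M", False)] * 8
    + [("F", True)] * 18 + [("F", False)] * 42
)

def probability(records, event):
    """Fraction of the records for which event(record) is true."""
    hits = sum(1 for r in records if event(r))
    return Fraction(hits, len(records))

def conditional(records, event, given):
    """P(event | given): keep only the records where `given` holds, then count."""
    subset = [r for r in records if given(r)]
    return probability(subset, event)

def is_tall(record):
    return record[1]

def is_male(record):
    return record[0] == "M"

p_tall = probability(population, is_tall)                    # 50 of 100
p_tall_given_male = conditional(population, is_tall, is_male)  # 32 of 40
print(p_tall, p_tall_given_male)
```

The only difference between the two calls is the group you count in: everyone for P(T), Males only for P(T|M).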

Now this conditional probability that someone is Tall given that they
are a Male, P(T|M), should not be confused with the inverse conditional
probability, that someone is a Male given that they are Tall, P(M|T).
I'll show you how to calculate P(M|T) in a moment. For that you'll need
Bayes' Rule.

First we need the probability of being a Male. Let's say that's 40%
(amplifying the fact that there are slightly fewer males than females).

So we have the probability of being Tall:

P(T)=50%

Then we have the probability of being tall GIVEN that someone is a Male:

P(T|M)=80%

Then we have the probability of being Male:

P(M)=40%

Now we need Bayes' Rule. Bayes' Rule is based on the JOINT probability of

two events, that is, the probability that they will BOTH happen. The

probability that two events will happen together depends on whether or

not they are INDEPENDENT. If they are independent, the answer is easy,

it's just the probability of the one times the probability of the other.

As an example of independent events, there's no reason to believe that

the probability of being Dark differs for males and females: let's say

it's 50% in both cases. Then the probability that someone is Male is:

P(M)=40%

The probability of being Dark is:

P(D)=50%

Moreover, the CONDITIONAL probability of being Dark GIVEN you are Male

will be exactly the same as the probability of just being Dark, namely,

50%, because the two are independent; that's what it means to be

independent.

So the JOINT probability of being Male and Dark is:

P(M AND D) =

P(M) x P(D) =

40% of 50% = 20%.

But with Tall and Male this does not work: They are not independent.

The joint probability is NOT the product of P(M) and P(T), even though

they are again 40% and 50%. This is because the CONDITIONAL probability

of being Tall GIVEN that you are a male is not the same as the

probability of just being Tall, it's much higher:

P(T|M)=80%

And THAT's the probability you need for figuring out the JOINT probability

of being both Tall and Male.

The joint probability can be calculated either of two ways, because

of a certain symmetry you will see in a moment:

It is always the probability of the first event times the CONDITIONAL

probability of the second event, GIVEN the first. So with Tall and

Male, it's:

P(M AND T) =

P(M) x P(T|M) =

40% x 80% = 32%

This means that Tall Males are more common than Dark Males. About 20%
(less than 1/4) of the people you stop randomly on the street will be
both Dark and Male (the rest will be not Dark, or Female, or both), but 32%
(more than 1/4) of the people you stop on the street will be both Tall
and Male (because there are fewer Tall Females).
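The contrast between the two cases can be sketched in a few lines of Python. This is only an illustration using the numbers from this message; the function name `joint` is my own. It shows that one formula, P(A) x P(B|A), covers both cases, and that simply multiplying P(M) by P(T) gives the wrong answer when the events are not independent.

```python
from fractions import Fraction

def joint(p_first, p_second_given_first):
    """P(A AND B) = P(A) x P(B|A); valid whether or not A and B are independent."""
    return p_first * p_second_given_first

p_male = Fraction(40, 100)
p_dark = Fraction(50, 100)
p_tall = Fraction(50, 100)

# Dark is independent of Male, so P(D|M) is just P(D).
p_dark_given_male = p_dark
# Tall is NOT independent of Male: P(T|M) is 80%, not 50%.
p_tall_given_male = Fraction(80, 100)

p_male_and_dark = joint(p_male, p_dark_given_male)  # 1/5, i.e. 20%
p_male_and_tall = joint(p_male, p_tall_given_male)  # 8/25, i.e. 32%
naive_product = p_male * p_tall                     # 1/5: wrong for Male AND Tall

print(p_male_and_dark, p_male_and_tall, naive_product)
```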

The other way you can calculate the joint probability of being Male and

Tall is the following. It is identical to the previous formula, just

swapping the Ms and Ts:

P(M AND T) =

P(T) x P(M|T) =

50% x ? = 32%

We know the answer has to be the same, 32% (because P(M AND T) equals
itself), and we know that P(T) is 50%; but earlier I said that we don't
know what P(M|T) is: we don't know the CONDITIONAL probability that
someone is Male GIVEN that he is Tall, P(M|T); so far we only know the
conditional probability that someone is Tall given that he is Male,

P(T|M)=80%

So Bayes' rule simply uses this symmetry, that

P(M AND T) = P(M) x P(T|M) = P(T) x P(M|T)

and just does a little algebra to rewrite it as:

P(M|T) = [P(M) x P(T|M)] / P(T)

Using that we can calculate P(M|T) = (40% x 80%) / 50% =

(because this is easier to calculate with fractions than with
percentages -- AND THAT IS ONE OF KOEHLER'S AND GIGERENZER'S POINTS:
FRACTIONS ARE RELATIVE FREQUENCIES, AND EASIER TO UNDERSTAND THAN
PROBABILITIES AND PERCENTAGES WHEN IT COMES TO CALCULATIONS LIKE THIS):

(4/10 x 8/10) / (5/10) =

(32/100) / (5/10) = 32/100 x 10/5 = 320/500 = 64/100 = 64%
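If you want to check the arithmetic, here is a minimal Python sketch of the same calculation with exact fractions. The function name `bayes` and its argument names are my own choice for the illustration, not standard notation.

```python
from fractions import Fraction

def bayes(p_a, p_b_given_a, p_b):
    """Return P(A|B) = P(A) x P(B|A) / P(B)."""
    if p_b == 0:
        raise ValueError("P(B) must be non-zero")
    return p_a * p_b_given_a / p_b

p_male = Fraction(4, 10)             # P(M)
p_tall_given_male = Fraction(8, 10)  # P(T|M)
p_tall = Fraction(5, 10)             # P(T)

p_male_given_tall = bayes(p_male, p_tall_given_male, p_tall)
print(p_male_given_tall, float(p_male_given_tall))
```

In relative-frequency terms: out of every 100 people, 50 are Tall and 32 are Tall Males, so 32 out of 50 Tall people are Male, which is the same 64%.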

Now I can tell you exactly what the "inverse fallacy" is:

It is to confuse P(T|M) and P(M|T). In this case, it would be to

confuse 80% with 64%. As you can see, if you are told that the

probability that someone is Tall given that he is Male is 80%, that

does not yet tell you what the probability is that someone is Male

given that they are Tall! To know that, you first need to know the

probability of being Male, P(M), 40%, and the probability of being

Tall, P(T), 50%, the base rates. And then you have to apply Bayes' Rule!

To translate this into the question of DNA evidence, the probability

that, GIVEN that someone is (in reality) innocent, the DNA finds them

guilty, P(G|I), is sometimes misdescribed to the jury, and

misunderstood by them, as the probability that, GIVEN that the DNA

implies they are Guilty, someone is in reality Innocent, P(I|G). These

are clearly very different, and one cannot be derived from the other

without knowing the base rates P(G) and P(I).

Suppose the probability that the DNA will call someone guilty when they

are really innocent is very low, and the jury misinterprets this as

meaning that the probability that they are really innocent when the DNA

calls them guilty is very low; so they vote guilty. But maybe there are

a lot of falsely accused people and a lot of DNA tests, so the probability

of being one of the (rare) false calls of the test is not so low!
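To put numbers on this, here is a sketch in Python. Every figure in it is invented for illustration and comes neither from Koehler nor from any real case: I assume 10,000 people could have left the sample, only one of them did, the test wrongly calls an innocent person a match 1 time in 1000, and the true source always matches. P(G) is obtained by adding up the two ways a match can happen (an innocent person matching, or the source matching).

```python
from fractions import Fraction

# All figures below are hypothetical, chosen only to illustrate the fallacy.
p_innocent = Fraction(9_999, 10_000)         # P(I): base rate of not being the source
p_match_given_innocent = Fraction(1, 1_000)  # P(G|I): false-match rate
p_match_given_source = Fraction(1)           # assumed: the true source always matches

p_source = 1 - p_innocent
# P(G): probability of a match, summed over innocent people and the source.
p_match = p_innocent * p_match_given_innocent + p_source * p_match_given_source

# Bayes' Rule: P(I|G) = P(I) x P(G|I) / P(G)
p_innocent_given_match = p_innocent * p_match_given_innocent / p_match

print(float(p_match_given_innocent), float(p_innocent_given_match))
```

Under these made-up assumptions P(G|I) is 0.1%, yet P(I|G) comes out at roughly 91%: about ten innocent people are expected to match for every one true source.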

P.S. Bayes' Rule is often used in statistics to UPDATE the probability of

hypotheses in the face of new evidence (data). So it is often written

in the form:

P(H|E) = P(E|H)P(H) / P(E)

This means the probability of the Hypothesis GIVEN the new Evidence

(otherwise known as the "posterior probability" or the new, recalculated

probability, once you've done the calculation) is:

the probability of the Evidence GIVEN the Hypothesis

times the probability of the Hypothesis,

all divided by the probability of the evidence

The best way to get such data is by actually counting frequencies:

How often does E happen? How often does E happen given that H is true?

And how probable did we think H was before? P(H) is called the "prior"
probability, and this is what is being updated or adjusted, on the basis
of the new evidence, to give the new conditional probability, P(H|E), the
"posterior" probability.

The prior probability is also the base rate.
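Here is a small Python sketch of the update, starting from counted frequencies. The tallies are hypothetical, invented only so the arithmetic is easy to follow, and the variable name `likelihood` for P(E|H) is my own label.

```python
from fractions import Fraction

# Hypothetical tallies from 200 observed cases (invented for illustration).
n_total = 200
n_h = 50        # cases where the Hypothesis was true
n_e = 40        # cases where the Evidence occurred
n_e_and_h = 30  # cases where both held

prior = Fraction(n_h, n_total)         # P(H), the base rate
evidence = Fraction(n_e, n_total)      # P(E)
likelihood = Fraction(n_e_and_h, n_h)  # P(E|H)

posterior = likelihood * prior / evidence  # P(H|E) by Bayes' Rule
direct_count = Fraction(n_e_and_h, n_e)    # P(H|E) counted directly

print(prior, posterior, direct_count)
```

The posterior from Bayes' Rule agrees with the direct count of H among the cases where E occurred, which is what you would expect when all the probabilities come from the same tallies.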

Cheers, Stevan


This archive was generated by hypermail 2b30: Tue Feb 13 2001 - 16:23:22 GMT