Koehler Base Rates Quote/Comment-Ready

From: HARNAD Stevan (harnad@cogsci.soton.ac.uk)
Date: Thu May 14 1998 - 22:10:47 BST


[Shrink-wrapped version of Psycoloquy version of Koehler, followed
by shrink-wrapped Abstract and Conclusions of BBS version.]

Jonathan J. Koehler (1993) The Base Rate Fallacy Myth. Psycoloquy:
4(49) Base Rate (1)

THE BASE RATE FALLACY MYTH

Jonathan J. Koehler

Abstract

few tasks map unambiguously into the simple, narrow standard of good
decision making.

literature does not support the widely held belief that people ignore
base rates.

we know very little about how the ambiguous, unreliable and unstable
base rates of the real world are and should be used by decision makers
with complex goals.

the existing research paradigm should be replaced by an empirical
program that examines real world base rate use with more flexible
standards

Lawyer/Engineer Problem:

When told that the population is 70% lawyers and 30% engineers, and asked:

What is P that a randomly chosen person is Lawyer/Engineer?

Subjects said 70%/30%

When told that the population is 70% engineers and 30% lawyers, and asked:

What is P that a randomly chosen person is Lawyer/Engineer?

Subjects said 70%/30%

When told that the random person was "conservative,
nonpolitical/nonsocial, hobbies: carpentry and puzzles"

Subjects chose 90% engineer under both conditions: base rates seem to have no
effect.
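The Bayesian norm makes the anomaly concrete. As a sketch (the likelihood ratio below is not given in the problem; it is back-inferred from subjects' 90% answer under the 70%-engineer condition), the same description should yield a much lower posterior under the 30%-engineer base rate:

```python
# Bayesian check for the lawyer/engineer problem.
# The likelihood ratio is hypothetical: back-inferred from the 90% "engineer"
# judgment under the 70%-engineer base rate. The base rates come from the
# problem statement.

def posterior_from_prior(prior, likelihood_ratio):
    """Combine prior odds with a likelihood ratio (odds form of Bayes' theorem)."""
    prior_odds = prior / (1 - prior)
    post_odds = prior_odds * likelihood_ratio
    return post_odds / (1 + post_odds)

# If the description yields a 90% engineer judgment under a 0.70 prior,
# the implied likelihood ratio is:
lr = (0.9 / 0.1) / (0.7 / 0.3)           # = 27/7, about 3.86

# Under the 0.30-engineer base rate the Bayesian posterior should drop:
print(posterior_from_prior(0.30, lr))    # about 0.62, not the 90% subjects gave
```

So a Bayesian who took the base rates seriously would answer roughly 90% in one condition and roughly 62% in the other; subjects answered 90% in both.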

Taxi cab problem (Tversky and Kahneman, 1980):

"A cab was involved in a hit-and-run accident at night. Two cab
companies, the Green and the Blue, operate in the city. You are given
the following data:

(i) 85% of the cabs in the city are Green and 15% are Blue.

(ii) A witness identified the cab as Blue.

The court tested the witness's ability to identify cabs under the
appropriate visibility conditions.

When presented with a sample of cabs (half Blue and half Green), the witness
was right 80% of the time and wrong 20% of the time.

What is the probability that the cab involved in the accident was Blue
rather than Green?"
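The "official" Bayesian solution, which treats the 15% base rate as the prior and the court-test results as the likelihoods, works out to roughly 41%:

```python
# Standard Bayesian solution to the taxi cab problem: the 15% base rate
# is the prior, and the court test supplies the likelihoods.

p_blue = 0.15                  # base rate of Blue cabs
p_say_blue_given_blue = 0.80   # witness correct when the cab is Blue
p_say_blue_given_green = 0.20  # witness wrong when the cab is Green

numerator = p_say_blue_given_blue * p_blue
denominator = numerator + p_say_blue_given_green * (1 - p_blue)
print(numerator / denominator)  # about 0.41 (0.12 / 0.29)
```

Subjects typically answer 80%, matching the witness's accuracy; base rate investigators count the gap between 80% and 41% as the fallacy.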

1.0. INTRODUCTION

1.1. Psychologists use Bayes' theorem as a normative model for
combining base rates with other probabilistic information

base rate is the relative frequency with which an event occurs or an
attribute is present in a population

Bayes' theorem follows directly from

multiplicative rule of probability:

the joint probability of two events, H and E, equals the CONDITIONAL
probability of one event GIVEN the other, multiplied by the probability of
that other event:

P(H&E) = P(H|E)P(E)

P(H&E) = P(E|H)P(H)

Therefore: P(H|E) = P(E|H)P(H) / P(E)

where P(E) = P(E|H)P(H)+P(E|-H)P(-H) for binary hypotheses.

Odds form:

P(H|E) / P(-H|E) = [P(E|H)P(H) / P(E)] / [P(E|-H)P(-H) / P(E)]

                 = P(E|H)P(H) / P(E|-H)P(-H)

                 = [P(H) / P(-H)] X [P(E|H) / P(E|-H)]

H and E stand for Hypothesis and Evidence

P(H) and P(-H) are the probabilities that H is true or false PRIOR to the
collection of additional evidence.

P(H) and P(-H) are "prior probabilities" and their ratio is the "prior
odds."

P(E|H) and P(E|-H)

the information value of the evidence if H is true or false, respectively;
their ratio is the "likelihood ratio."

P(H|E) and P(-H|E)

are the probabilities that the hypothesis is true or false in light of the
evidence; their ratio is the "posterior odds," which combines the prior odds
and the likelihood ratio.
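The two forms are equivalent, which a quick numerical check confirms (the probabilities below are arbitrary examples, not from the article):

```python
# Numerical check that the odds form agrees with the direct form of
# Bayes' theorem: posterior odds = prior odds x likelihood ratio.
# All probabilities here are arbitrary illustrative values.

p_h = 0.3              # P(H), the prior
p_e_given_h = 0.9      # P(E|H)
p_e_given_not_h = 0.2  # P(E|-H)

# Direct form: P(H|E) = P(E|H)P(H) / P(E)
p_e = p_e_given_h * p_h + p_e_given_not_h * (1 - p_h)
p_h_given_e = p_e_given_h * p_h / p_e

# Odds form: prior odds times likelihood ratio
prior_odds = p_h / (1 - p_h)
likelihood_ratio = p_e_given_h / p_e_given_not_h
posterior_odds = prior_odds * likelihood_ratio

# The two routes agree:
assert abs(p_h_given_e / (1 - p_h_given_e) - posterior_odds) < 1e-12
print(p_h_given_e)  # about 0.659 (0.27 / 0.41)
```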

base rates and Bayesian methods will sometimes improve predictive
accuracy, but there is no single, clear standard for using base rate
information in most realistic decision situations

not because Bayesian methods are logically flawed or lead to unresolvable
paradoxes, but because base rates do not map into the Bayesian framework in
most real-world problems.

1.2. The failure to appreciate this has led to a vast oversale of the
so-called "base rate fallacy" in the probabilistic judgment literature

According to this fallacy, people routinely ignore base rates and it is
an error to do so

base rates are equated with prior probabilities, and deviations between
subjects' judgments and the Bayesian posterior probability are used to
measure the extent of base rate fallacy

1.3. both the normative and the descriptive components of the base
rate fallacy have been exaggerated.

1.4. normatively: there is an important difference between
identifying a theoretically sound normative rule for combining
probabilistic information and applying that rule to tasks.

depends on how well the task meets the assumptions of the rule. Where
key assumptions are violated or unchecked, the superiority of the
normative rule is an untested empirical matter

1.5. empirically, little evidence that people ignore base rates

base rates almost always do influence judgments

2.0. QUESTIONABLE ASSUMPTIONS

2.1. Should people make greater use of base rates? Not clear. The
ecological validity of the base rate literature is low.

Subjects asked hypothetical questions about unfamiliar and unrealistic
situations.

Subjects are presented with a single base rate to treat as perfectly
reliable; failure to do so is counted as an error.

such "errors" tell us little about whether people should make greater use of
base rates in daily life.

2.2. researchers have not given the normative component of the base
rate fallacy the attention it deserves.

make simplifying assumptions to invoke a normative standard

2.2.1. "Subjects' prior beliefs are precisely represented by the
single base rate statistic provided by the experimenter"

use Bayes' theorem to determine whether subjects rely on base rates
enough.

why should a base rate be equated with a subject's prior belief? prior
beliefs may be informed by base rates, but the two need not be
identical.

always have additional information that does (and should) affect one's
prior beliefs.

2.2.2. "The context within which base rate problems are presented and
solved does not and should not influence their solutions":

but subjects use the psychological context of the problem as a cue

2.2.3. "Subjects understand and accept that the individuating
information in base rate problems is a random sample from an
unambiguous reference class":

lawyer-engineer problem;

descriptions were NOT in reality randomly selected, and subjects could
have suspected as much

when subjects performed and observed the random sampling for
themselves, the influence of base rates was much stronger than it was
when random sampling was only verbally asserted.

Similarly, reassurances about the representativeness of base rates in causal
attribution studies promoted greater use of them.

2.2.4. "Subjects' prior beliefs are and should be made independently
of their assessments of the diagnostic strength of individuating
information."

usually unrealistic to assume that likelihood ratios are independent of
either base rates or prior probabilities.

signal detection literature shows that ratio of hit rate to false alarm
rate (the likelihood ratio) depends on signal probabilities (i.e., base
rates).

accuracy of likelihood information derived from observer reports
changes as the observer's knowledge of the base rates changes.

in taxi cab problem:

witness who is aware there are many more Green cabs than Blue cabs
probably predisposed to see Green cabs in ambiguous situations.

Hence the actual probability that a cab in an accident was Blue given
that a witness SAYS so may be much closer to the responses given by
untrained subjects (80%) than to the solution presented by base rate
investigators (41%).
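The sensitivity of the 41% answer to this assumption is easy to show. The first pair of rates below is from the court test; the second pair is hypothetical, illustrating a base-rate-aware witness who rarely calls an ambiguous cab "Blue":

```python
# Koehler's point: the 41% answer assumes the witness's error rates are
# independent of the base rates. The second call below uses a hypothetical
# false-alarm rate for a witness predisposed to see Green, who says "Blue"
# only when quite sure.

def p_blue_given_says_blue(hit_rate, false_alarm_rate, base_rate=0.15):
    """P(cab is Blue | witness says Blue) via Bayes' theorem."""
    num = hit_rate * base_rate
    return num / (num + false_alarm_rate * (1 - base_rate))

# Court-test rates, as assumed by base rate investigators:
print(p_blue_given_says_blue(0.80, 0.20))  # about 0.41

# Hypothetical rates for a base-rate-aware witness:
print(p_blue_given_says_blue(0.80, 0.05))  # about 0.74, much closer to 80%
```

Cutting the false-alarm rate from 20% to 5% nearly doubles the posterior, so the "correct" answer depends heavily on an assumption the standard analysis never checks.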

2.5 mechanical applications of Bayes' theorem will not help measure a
"base rate fallacy" when key assumptions of the model are not checked
or violated

or when there are features of the task that may CORRECTLY influence
subjects' responses.

3.0. BASE RATES ARE NOT IGNORED

3.1. frequently claimed that people ignore base rates

This is a considerable exaggeration at best.

many studies have concluded that base rates are given less weight than
case-specific information in particular situations; few have shown they
are ignored.

3.2. Some of the confusion comes from the unfortunate use of "ignore" when
subjects are simply giving less weight to base rates than to other,
case-specific information.

In Kahneman and Tversky's (1973) lawyer-engineer experiment there WAS a
small but statistically significant main effect for the base rate (p <
.01).

3.3. numerous attempts to replicate lawyer-engineer results.

seven lawyer-engineer experiments

base rates uniformly influence subjects' judgments in the presence of
diagnostic individuating information. Differences between high and low
base rate groups ranged from 2% to 30%, with an average near 10%.

3.4. four studies revealed strong base rate effects

3.5. has been shown that base rates influence social judgments, moral
judgments, auditing judgments, medical judgments, sports judgments

not surprising. base rates commonly used in daily life,

Baseball managers routinely "play the percentages," choosing
left-handed batters to face right-handed pitchers and vice versa;

police officers stop and detain suspected criminals on the basis of
background characteristics;

voters mistrust the political promises of even their most favored
politicians.

4.0. EMERGENCE OF THE MYTH

4.1. how did base rate neglect myth emerge?

(1) Kuhn stressed that a simple and powerful theory can withstand
empirical challenge when the challenging data are not accompanied by a
simple, general theory of their own.

base rate neglect thesis sprang from the "heuristics and biases
paradigm" that dominated research on judgment and decision making in
the 1970s and 1980s.

paradigm was extremely critical of people's intuitive judgments about
probabilistic events, claiming that people make such judgments via
simple error-prone heuristics.

well-known heuristic, "representativeness," suggests that people's
judgments about the probability of category membership depend on how
similar the features of the target are to the essential features of the
category.

judgments that Viki is an accountant depend upon how similar Viki's
interests, background, talents, etc., are to those ordinarily
associated with accountants.

4.2. As evidence in support of this heuristic mounted, base rate
neglect became an easy sell. If people use the representativeness
heuristic, and if base rates are typically less representative of a
category's central features than individuating information, then it
follows that people will ignore base rates.

When subsequent research failed to support the theory, the data were
ignored and the theory persisted; the underlying principle was too
attractive to abandon on account of data.

result was a simplification and misinterpretation of the body of
literature by observers, researchers and reviewers alike.

4.3. (2) psychologists' misperception of the base rate literature may
also be attributed to heuristic thinking.

to make sense of complex and sometimes conflicting data, scientists may
search for (or create) simple conclusions to represent a given body of
research.

simple and general statements about a literature can become more
authoritative than either the existing data or the claims made about
the data by the original authors.

"Hawthorne Effect": a series of studies at the Hawthorne electrical plant in
the late 1920s is widely cited as demonstrating that workers' productivity
increased regardless of the type of change made in their work environment;
this interpretation is a gross simplification and distortion of the actual
findings.

5.0. TOWARD AN ECOLOGICALLY VALID RESEARCH PROGRAM

5.1. If we are to increase our understanding of how people should and do
use base rates, an ecologically sound program of research is required.

current paradigm relies too heavily on artificial task environments

says too little about use of imperfect base rates of natural ecology

may matter little how much attention people pay to base rates in
problems of the lawyer-engineer type.

In the real world, those who appreciate the unreliability of certain
types of base rates may make better decisions than those who do not.

next generation of studies must examine base rate usage in more
realistic decision problems and environments.

5.2. begin by speculating about when judgmental accuracy is and is not
likely to be impaired by a relative inattention to base rates.

likely to impede accuracy when the base rates conflict with other
sources of information and are high in relative diagnosticity.

In medical diagnoses, inattention to reliable low base rates could lead
to extensive overdiagnosis and excessive treatment.

general base rate for hypothyroidism is less than 1 in 1000 among young
adult males, although the primary symptoms -- dermatological
problems, depression, and fatigue -- are quite common.

A diagnostician who disregards the base rate and relies solely on the
individual's symptomatology will overdiagnose this disease.
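The arithmetic behind the overdiagnosis worry is stark. The 1-in-1000 base rate is from the text; the symptom likelihoods below are hypothetical illustrations:

```python
# Why ignoring a low base rate leads to overdiagnosis: even a symptom
# pattern far more common among hypothyroid patients yields a tiny
# posterior when the disease itself is rare. The base rate is from the
# text; both symptom likelihoods are hypothetical.

base_rate = 0.001                # hypothyroidism in young adult males
p_symptoms_given_disease = 0.90  # hypothetical: most patients show the symptoms
p_symptoms_given_healthy = 0.10  # hypothetical: symptoms are common anyway

num = p_symptoms_given_disease * base_rate
posterior = num / (num + p_symptoms_given_healthy * (1 - base_rate))
print(posterior)  # under 0.01: symptoms alone barely support the diagnosis
```

Even with symptoms nine times as likely in patients as in healthy men, fewer than 1 in 100 symptomatic men would actually have the disease.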

5.3. may be situations where inattention to base rates will not impede
accuracy

e.g., where there is redundant information

6.0. CONCERNS OTHER THAN ACCURACY

6.1. many situations in which accuracy is not the only basis for
evaluating performance.

e.g., American legal system is greatly concerned with fairness and due
process, which may interfere with accuracy

certain types of highly valid evidence routinely excluded in court
(e.g., illegally obtained confessions) because of other judicial
values.

it has been argued that admitting base rates in court violates the legal
norm of individualized justice and that base rates should therefore be
excluded as well.

6.2. costs of error considerations may persuade decision makers to
make judgments that they believe are inaccurate.

criminal guilt must be proved "beyond a reasonable doubt" to minimize
erroneous convictions

juries will often return "not guilty" verdicts in cases where they
believe the defendant is guilty (but not beyond a reasonable doubt).

6.3. often need to take factors other than judgmental accuracy into
account.

even where accuracy is goal, must also take into account costs such as
time, mental effort, and money

problems will have multiple solutions

whenever the assumptions, goals and values of decision makers vary,
people exposed to identical information may arrive at different
solutions, none necessarily erroneous.

6.4. person- and situation-specific criteria can provide richer
insights and more useful recommendations than existing program.

[Shrink-wrapped Abstract and Conclusions of BBS version of Koehler]

THE BASE RATE FALLACY RECONSIDERED: DESCRIPTIVE, NORMATIVE AND
METHODOLOGICAL CHALLENGES

Jonathan J. Koehler

Abstract

contrary to the literature, base rates are almost always used

their
degree of use depends on task structure and internal task representation.

few tasks map unambiguously into the simple, narrow framework considered the
standard of good decision making.

current work fails to consider how the ambiguous, unreliable and unstable base
rates of the real world should be used in the informationally rich and
criterion-complex natural environment.

more ecologically valid research
is called for.

6. SUMMARY AND CONCLUSION

oversold on the base rate fallacy from an empirical, normative,
and methodological standpoint.

thorough examination of the base rate literature shows base rates
almost always used; depends on task representation
and structure.

In frequency-based tasks, and in tasks structured to sensitize people to
base rates, people do use them.

people attach little weight to base rates only in certain tasks and contexts,
many quite unlike those that exist naturally

growing body of work shows people are
capable of sound statistical reasoning when information is learned and
presented in certain ways

fits well with observations made from daily life:

Baseball managers routinely "play
the percentages" by choosing left-handed batters to face right-handed
pitchers and vice versa

police officers stop and detain suspected
criminals on the basis of background characteristics

voters
mistrust the political promises of even their most favored politicians

At the normative level, popular form of the base rate fallacy should be
rejected

few tasks map unambiguously into the narrow framework considered
the standard of acceptable decision-making

Mechanical use of Bayes' theorem to identify errors is inappropriate when its
key assumptions (e.g., independence of prior probabilities and likelihoods) or
the decision-makers' representation of the task (e.g., equivalence of base
rates and prior probabilities) are not checked or grossly violated

potential ambiguity, unreliability and
instability of base rates under natural conditions reduces their
diagnosticity.

no single
normative standard for base rate usage

there may be situations (e.g., informationally redundant environments, natural
sampling of information) in which base rates can be safely ignored without
reducing predictive accuracy

even where certain formulae might increase predictive accuracy, prescriptive
models for base rate usage should be sensitive to a variety of policy
considerations (e.g., cost of error, fairness)

should stop searching
for performance errors relative to an inflexible standard in laboratory tasks.

Instead, should pursue a more ecologically valid program of research on

(a) patterns of base rate use in the natural ecology, and

(b) when and how people would benefit from adjusting the intuitive weights
they assign to base rates in real world tasks.

where research shows that decision-makers would benefit from using base
rates more:

encourage frequentist problem representations, or sensitize decision-makers
(e.g., provide for direct base rate experience).

If the stimuli, incentives, performance standards and other contextual
features in base rate studies are far removed from those encountered in the
real world, what are we to conclude?

that people
would be richer, more successful and happier if only they paid more
attention to base rates?

teach this in our schools?

advise
professional auditors (who already pay substantial attention
to base rates) that they too would benefit from attending
more closely to base rates?

What to tell Olympic basketball coaches, jurors, weathermen, stockbrokers
and others who must sort through a morass of ambiguous, unstable or
conflicting base rates to estimate how likely it is that an event did or
will happen?

These are the types of questions that
need to be addressed in contexts that are richer, and with performance
standards that are more comprehensive, than those used to
date.



This archive was generated by hypermail 2b30 : Tue Feb 13 2001 - 16:23:21 GMT