November 27, 2009

How we know what isn’t so, by Thomas Gilovich

Posted in Behaviour at 07:27 by graham

This well-written book by Thomas Gilovich, social psychologist and CSI Fellow, explains some of the reasoning and deduction errors we make when trying to understand the world, and ways to avoid making those errors.

This is an easy and engaging read, and offers several straightforward techniques to avoid making common reasoning errors. I recommend you look up How We Know What Isn’t So: The Fallibility of Human Reason in Everyday Life in your local library, or get it second-hand from Amazon for less than a posh cup of coffee.

These are my notes / summary of the book.


I. Cognitive determinants of belief

2. Something out of nothing: The misperception and misinterpretation of random data

We are predisposed to see order, pattern, and meaning in the world, and we find randomness, chaos, and meaninglessness unsatisfying. As a consequence we tend to ‘see’ order where there is none, and we spot meaningful patterns where only the vagaries of chance are operating.

Detecting patterns and seeing connections is very useful, and leads to discovery and advance. But the tendency is so strong that we sometimes do it when there is nothing to spot.

Example of hot hand in basketball. Why?

  • Availability heuristic: We remember the long streaks, forget the single shots. Can interpret near-misses as evidence of ‘hot’ or ‘cold’ player.
  • People have faulty intuitions about what chance sequences look like.

    See: R. Falk (1981) The perception of randomness. In Proceedings, Fifth international conference for the psychology of mathematics education.

    In a coin toss, people expect a near perfect alternation of heads and tails. In a series of 20 tosses, there is roughly a 50/50 chance of getting four heads in a row, and roughly a 10% chance of six in a row. Seeing such streaks as meaningful is the clustering illusion.
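
    The streak figures can be checked exactly rather than by intuition. The sketch below is my own illustration, not something from the book: the function name is made up, and it assumes a fair coin and counts streaks of heads only.

```python
def prob_streak_of_heads(n_tosses, streak_length):
    """Probability of at least one run of `streak_length` heads in `n_tosses` fair tosses."""
    # state[r] = probability that no full streak has occurred yet and the
    # sequence currently ends in exactly r consecutive heads.
    state = [0.0] * streak_length
    state[0] = 1.0
    for _ in range(n_tosses):
        new_state = [0.0] * streak_length
        for run, p in enumerate(state):
            new_state[0] += p * 0.5            # tails: the current run resets
            if run + 1 < streak_length:
                new_state[run + 1] += p * 0.5  # heads: the current run grows
            # otherwise the streak is completed and that probability leaves `state`
        state = new_state
    return 1.0 - sum(state)


for length in (4, 6):
    print(length, round(prob_streak_of_heads(20, length), 3))
```

    Under those assumptions the results come out at about 48% for four heads in a row and about 12% for six, which is in line with the rounded figures above.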

The St Louis Gateway Arch is one of the world’s largest optical illusions. It appears much taller than it is wide, yet height and base width are the same.

Representativeness heuristic: “Like goes with like”. We expect things that go together to look similar. We expect instances of a type (librarian) to look like the prototype (a cliche librarian). Complex effects stem from complex causes. Effects look like their causes: Jagged handwriting means jagged nerves, heartburn comes from hot/spicy foods, etc.

Representativeness usually very helpful. Occasionally mis-applied. The salient (‘prominent or conspicuous’) feature of something is what we remember, and apply the representativeness heuristic to that. The salient feature of a random sequence is the even mix of all the outcomes.

The law of averages is what statisticians correctly call the law of large numbers. The even mix of outcomes only holds for very large random samples. There is no ‘law of small numbers’.
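
To make the difference between small and large samples concrete, here is a small sketch of my own (not from the book). It computes the exact chance that a fair coin gives an ‘even-looking’ mix, which I arbitrarily define as between 40% and 60% heads. The function name and the 40/60 band are assumptions chosen for illustration.

```python
from math import comb


def prob_share_of_heads_between(n_tosses, low_pct=40, high_pct=60):
    """Exact probability that the share of heads in n fair tosses is within [low_pct%, high_pct%]."""
    favourable = sum(
        comb(n_tosses, heads)
        for heads in range(n_tosses + 1)
        # Integer comparison avoids floating-point trouble at the band edges.
        if low_pct * n_tosses <= 100 * heads <= high_pct * n_tosses
    )
    return favourable / 2 ** n_tosses


for n in (10, 100, 1000):
    print(n, prob_share_of_heads_between(n))
```

With 10 tosses the mix lands in that band only about 66% of the time; with 1000 tosses it does so more than 99.9% of the time. The even mix is a property of large samples, not small ones.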

Stock market: Random walk, but chartists insist they see patterns in the randomness.

In any random distribution (e.g. x,y plots on a graph), there will be a way of segmenting it (along the axes, diagonally, in bands, etc) that will appear non-random. Carving up data after the fact is meaningless. If we think we see a pattern in the data, test that pattern on other independent sets of data. Unfortunately, for most people hypotheses constructed on one set of data are considered proved by that same data. This is why medical trials announce their goal, the expected outcome, ahead of time.

Once we think we see order in randomness, humans are exceptionally good at justifying it with a post-hoc theory. Improv comedy teaches ‘jump and justify’; say something random then justify it, it’s surprisingly easy. For example people led to believe they are above or below the average at some task can explain the difference quite easily, even when the experimenters assigned the above/below at random. The same shows up in experiments with split-brain patients, who justify what the other part of the brain just did.

Once a person has (mis)identified a random pattern as a ‘real’ phenomenon, it will not exist as a puzzling, isolated fact about the world. Rather, it is quickly explained and readily integrated into the person’s pre-existing theories and beliefs. These theories then serve to bias that person’s evaluation of new information. People cling tenaciously to their beliefs in the face of hostile evidence.

Regression to the mean: When two variables are related, but imperfectly so, an extreme value on one tends to be matched by a less extreme value on the other. A high number roll on the dice tends to be followed by a lower number, and a low number by a higher one. A company’s disastrous years tend to be followed by better ones. Students with exceptionally good grades in high-school tend to do slightly less well in college.

Regression can be understood by noting that in any performance (game, test, financial year, etc) there is a part of talent and a part of chance. A very high score is more likely to come from a good student with luck in their favor than from an extremely good student with luck working against them, simply because there are more ‘good’ students than ‘extremely good’ students. So a very high score is likely to be followed by a slightly lower one, because it is unlikely that chance will be that much in their favor two times in a row.
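
The ‘talent plus chance’ explanation is easy to simulate. This is a rough sketch of my own, not something from the book: it assumes each score is simply talent plus luck, both drawn from a standard normal distribution, and every name and number in it is made up for illustration.

```python
import random


def simulate_regression(n_students=10000, top_share=0.05, seed=0):
    """Return the top scorers' average on test 1 and the same students' average on test 2."""
    rng = random.Random(seed)
    talent = [rng.gauss(0, 1) for _ in range(n_students)]
    # Each test score = fixed talent + fresh, independent luck.
    first = [t + rng.gauss(0, 1) for t in talent]
    second = [t + rng.gauss(0, 1) for t in talent]

    # Pick the students with the highest scores on the first test.
    n_top = int(n_students * top_share)
    top = sorted(range(n_students), key=lambda i: first[i], reverse=True)[:n_top]

    mean_first = sum(first[i] for i in top) / n_top
    mean_second = sum(second[i] for i in top) / n_top
    return mean_first, mean_second


mean_first, mean_second = simulate_regression()
print("top scorers, test 1:", round(mean_first, 2))
print("same students, test 2:", round(mean_second, 2))
```

The top scorers still do better than average on the second test (their talent is real), but noticeably worse than on the first, with no reward, punishment, or other cause involved.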

Most people understand regression, but make two mistakes:

  • Insufficiently ‘regressive’ when making predictions. The ‘next’ value after an extreme one is closer to the average than they tend to predict.
  • Regression fallacy: Fail to recognize statistical regression, and explain it away with a superfluous and often complex causal theory. Ad-hoc justification where none was needed.

The regression fallacy shapes people’s (mostly parents’ and teachers’) perception of the effectiveness of rewards and punishments. A good performance is likely to be followed by a less good one, and a bad one by a less bad one. If the good performance is rewarded, the reward will be perceived as ineffective because the good performance was not repeated. If a bad performance is punished the punishment will be perceived as effective because the bad performance will not be repeated. That notwithstanding, psychologists have known for some time that rewarding desirable behavior is generally more effective in shaping behavior than punishing undesirable responses. See: B. F. Skinner (1953) Science and human behavior.

3. Too much from too little: The misinterpretation of incomplete and unrepresentative data

Using empirical evidence as proof: “I’ve seen it happen”, “I know someone who did”, etc. If a phenomenon exists, there must be some positive evidence – ‘instances’ of its existence visible to us. So empirical positive statements are necessary for a belief to be true, but they are not sufficient. We also need to know what goes in the other boxes.

Many of the beliefs we hold are about the relationships between two variables (‘takes vitamin C’ and ‘gets better’). Say we were investigating ‘vitamin C megadose’ on cancer patients. We notice some patients get better after taking mega-doses of vitamin C. That prompts us to start investigating. Positive outcomes under treatment are one box, we need to know the other three:

                      Takes Vitamin C    Doesn’t take Vitamin C
Gets better                  a                     b
Doesn’t get better           c                     d

If we stop at noticing some patients getting better, all we have is box a. All we have is the ‘illusion of validity’.

For vitamin C to be effective, the probability of getting better after taking it [a / (a + c)] must be higher than the probability of getting better after not taking it [b / (b + d)].
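
As a quick illustration of why all four boxes matter, here is a sketch of my own with entirely made-up patient counts (they are not from the book or from any study). The function and parameter names are hypothetical.

```python
def recovery_rates(treated_better, treated_not_better,
                   untreated_better, untreated_not_better):
    """Return (recovery rate with treatment, recovery rate without treatment)."""
    treated_rate = treated_better / (treated_better + treated_not_better)
    untreated_rate = untreated_better / (untreated_better + untreated_not_better)
    return treated_rate, untreated_rate


# Hypothetical counts: 30 treated patients got better, which sounds impressive
# until it is compared with the other three boxes.
treated_rate, untreated_rate = recovery_rates(
    treated_better=30, treated_not_better=70,
    untreated_better=15, untreated_not_better=35,
)
print(treated_rate, untreated_rate)
```

In this invented example both groups recover at the same 30% rate, so the 30 success stories on their own say nothing about whether the treatment works.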

This is often difficult to do intuitively, because box ‘a’ is the most salient. We notice things happening, not things not happening.

We often seek only evidence to confirm our beliefs; we should also seek information to disprove them, and only hold our beliefs if such evidence is not available.

Often the evidence in the other boxes is not available, particularly in evaluating selection criteria. How do we test whether a company’s interview process is effective? We are asking about the relationship between ‘passes interview’ and ‘performs well at job’. By its very nature, we don’t have the data on job performance for people who didn’t pass the interview.

                   Passes selection    Doesn’t pass
Performs                  a                 ?
Doesn’t perform           c                 ?

The only data we have is a and c, we can only compare the success rate of those who pass the selection. If the base rate of success is high (most people could perform well, only the best performers apply, etc), a would be higher than c whatever the selection criteria, so we would spuriously conclude our criteria are good.
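
A small simulation shows how misleading a and c can be on their own. This is my own sketch with invented numbers: it assumes 80% of all applicants would perform well, and it ‘selects’ people purely at random, so the selection criteria are worthless by construction.

```python
import random


def random_selection_outcome(n_applicants=1000, n_selected=200,
                             base_rate=0.8, seed=1):
    """Select applicants at random and count how many perform (a) and don't perform (c)."""
    rng = random.Random(seed)
    # True/False: would this applicant perform well if selected?
    performs = [rng.random() < base_rate for _ in range(n_applicants)]
    # The 'selection process' is a random draw, i.e. it has no validity at all.
    selected = rng.sample(range(n_applicants), n_selected)
    a = sum(performs[i] for i in selected)
    c = n_selected - a
    return a, c


a, c = random_selection_outcome()
print("performs:", a, "doesn't perform:", c)
```

Even though the selection here is a coin flip, a comes out far larger than c, simply because the base rate is high. Looking only at the people we selected, the process would appear to work.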

It is not possible to be confident about the selection process without the data on those who were not selected.

Furthermore, simply being accepted can give someone a competitive advantage. Being admitted to a better school, getting a research grant, or working with high-performing colleagues will all improve an individual’s performance compared to someone of similar initial ability who didn’t get selected.

Effectiveness of public policy is similarly difficult to measure, because we can’t both set the policy and not set it.

Often the lifestyles we lead, the roles we play, and the positions we occupy in a social network, deny us access to important classes of information and thus distort our view of the world. Overcoming that bias is difficult: We must first recognize the existence of a class of information we have not been exposed to, and then accurately characterize what that information is like.

Self-fulfilling prophecies: These have received a lot of attention. For a prophecy to be self-fulfilling there must be a mechanism that translates the expectation into confirmatory action. They often serve to exaggerate a belief that holds a kernel of truth. Thinking a bank is in trouble (usually with good reason), creditors will withdraw their money, and the bank will really be in trouble. Behaving in an unfriendly and defensive manner because you think someone is hostile will often produce that very hostility.

True self-fulfilling prophecies are ones in which a person’s belief elicits the very behavior originally anticipated. A seemingly self-fulfilling prophecy is one which alters a person’s world or limits their responses, in such a way as to make it very difficult or impossible for that expectation to be proved wrong. If someone thinks that I am unfriendly, they will avoid me, so I will have no opportunity to prove them wrong. If a sports player is thought incompetent, he won’t get to play, so won’t be able to prove himself competent. The continued absence of positive contribution can easily be mistaken for an absence of talent, when it is simply an absence of opportunity.

Negative first-impressions are generally more stable than positive ones, because we keep interacting only with people who created positive first impressions.

4. Seeing what we expect to see: The biased evaluation of ambiguous and inconsistent data

Information consistent with our pre-existing beliefs is generally accepted at face value, whereas evidence that contradicts them is critically scrutinized and discounted. Our beliefs are much less responsive than they should be to the implications of new information. This is a sane and necessary strategy. If a belief has a lifetime of support, it is perfectly valid to be skeptical of evidence that contradicts it. Well supported beliefs have earned their inertia. We need to be wary of the beliefs that don’t have a solid foundation, such as cultural stereotypes, social norms, and traditions.

Ambiguous information is usually perceived in a way that fits our expectations. We may not even be aware of the ambiguity – it gets resolved before reaching conscious awareness.

Gamblers tend to attribute their losses to outside forces (chance, the referee, a team injury), but their wins to themselves (knew the team well, studied the form). We re-write our history to discount our losses and bolster our wins.

By carefully scrutinizing information that does not fit our beliefs, we can usually find a way of either discounting or re-interpreting it.

Scientists use a set of formal safeguards to prevent their own erroneous thinking affecting their results:

  • Statistical measures to guard against the mis-perception of randomness
  • Control groups and random sampling to avoid drawing inference from incomplete or unrepresentative data
  • ‘Blind’ observers to eliminate experimenter bias.
  • Precisely specify in advance the meaning of various outcomes, and objectively determine those outcomes.

These rules are the ‘context of justification’, used when testing an idea. Idea generating is much more open. Science works by flashes of inspiration followed by rigorous testing. The rigorous testing is what differentiates scientific inquiry from everyday life.

Multiple endpoints: By not precisely specifying an expected outcome, we can pick any one and claim it as success. Psychics will use very vague descriptions, so that we can apply them to our lives and perceive them as true. If the subject is fuzzy (‘personal well-being’ for example), we are likely to seize upon any subsequent measure of it that fits our initial beliefs.

Variable windows: By not specifying an endpoint to a prediction, it stays ‘open’ until it comes true. ‘Things happen in threes’ because we only stop counting ‘things happening’ after the third one has occurred, whenever that may be. The third event defines the time window for our prediction. With a wide enough variable window, beliefs can only be confirmed.

Multi-faceted expectation: For any two sufficiently complex entities, we can produce a mapping of one onto the other that will produce a certain amount of overlap, and allows us to claim they are similar in some way. Often used with the representativeness heuristic.

You have a strong need for other people to like you and for them to admire you. At times you are extroverted, affable, and sociable, while at other times you are introverted, wary, and reserved. You have a great deal of unused energy which you have not turned to your advantage. While you have some personality weaknesses, you are generally able to compensate for them. You prefer a certain amount of change and variety and become dissatisfied when hemmed in by restrictions and limitations. You pride yourself on being an independent thinker and do not accept others’ opinions without satisfactory proof. You have a tendency to be critical of yourself. Some of your aspirations tend to be pretty unrealistic.

If you see yourself in that description, you are not alone. It is multi-faceted, so most people will find the part that relates to them the most salient, and the statements are so general (multiple endpoints) that they are bound to ring true. This is how horoscopes work.

One-sided versus two-sided events:

  • One-sided events are ones that are memorable only when they turn out a certain way: ‘The phone always rings when I’m in the bath’, ‘You wait forever for a bus and three come along at once’, etc. If the phone does not ring whilst you are in the bath, the non-event does not register.
  • Two-sided events are ones that register regardless of the outcome: Vacations, dates, gambling, buying a stock, etc. Both outcomes stand out from the stream of experience.

In two-sided events, often negative outcomes are remembered better than positive ones, as they require more rationalization to incorporate them into our self-image and understanding of the world. In one-sided events, we are more likely to remember the ‘side’ of events that has meaning to us, which is usually the one that reinforces our world-view. If I believe my dreams are prophetic, I will remember those much more than the other dreams. If I believe that strange things happen during a new moon, I will notice and remember the strange things, not the ordinary things (strange things not happening).

Many one-sided events only ‘exist’ when they are confirmed. If a fortune teller says you will have twins, and you have twins, you remember the fortune teller, link the two events, and form a strong memory. If you have one child, you probably won’t think of the fortune teller at all. And even if you did, the fortune is not contradicted, simply unconfirmed. You could have twins later in life.

One-sided events tend to be temporally unfocused, they have variable windows to be confirmed in. A ‘prophetic’ dream, a fortune, doesn’t have a fixed date, so only the positive outcome is acknowledged. Two-sided events tend to have fixed windows: A sporting event, a vacation, a job interview. The closed window forces us to acknowledge both outcomes.

In two-sided events, both outcomes produce the same intensity of emotion (winning or losing a bet). In a one-sided event, only one of the outcomes has any emotional weight: It ‘always’ seems to rain when you forget your umbrella, because getting wet has an emotional attachment (hair ruined, clothes soaked, etc), whereas not-getting-wet doesn’t.

Definitional asymmetries: ‘American tourists in London are loud’, ‘I can always spot fake breasts’, ‘You need to hit rock-bottom to bounce back’. All of these are difficult or impossible to disprove: We only notice the loud people, we don’t notice fake breasts that we didn’t spot, and how can you say if someone has hit rock-bottom or not? Definitional asymmetries are all one-sided events.

One-sided events tend to have a ‘normal’ outcome with a high base rate, which we don’t notice because it is part of everyday life (‘look, it’s not a full moon’), and a less usual, more salient outcome (‘uh oh, it’s a full moon!’).

All of the above relate to the availability heuristic.

II. Motivational and social determinants of questionable beliefs

5. Seeing what we want to see: Motivational determinants of belief

Endowment effect: We value something more when it is ours. Ownership creates an inertia that prevents many potentially beneficial transactions from occurring.

The endowment effect applied to humans is the Lake Wobegon effect, or Illusory superiority. This is particularly true on ambiguous traits (‘intelligence’, ‘sensitivity’, ‘idealism’), and less true on more specific traits (‘thriftiness’, ‘being well-read’). If a specific definition is given in the question, the above average effect tends to disappear: People are not lying or cheating, it’s just that the first thing that comes to mind when asked an ambiguous question is usually something they are good at, something salient in their lives.

We attribute success internally and failure externally: we succeeded thanks to our own resources, and we failed because of other people or the environment. Many psychologists hold that we do this to maintain self-esteem. There is also a cognitive explanation: Succeeding at something is at least partly due to our own effort, and thus warrants some internal attribution. Failing at something usually happens despite our best efforts, so often involves an unfavorable external situation.

People are more likely to believe things they want to believe, but are constrained by objective evidence and the need to construct a justification that would persuade a dispassionate observer. We draw the desired conclusion only if we can muster up enough evidence to support it. It is in this sense that most people think of themselves as objective. People often do not realize that their selection and interpretation of data is biased by their goals – the data could be interpreted a different way, and there is other data they ignored. They may well be able to justify opposite conclusions on different occasions.

How we ‘filter’ data to ensure it supports our goals:

  • Seeking only to confirm
  • Select who we consult. By judiciously choosing whom we consult on an issue, we can increase our chances of hearing what we want to hear.
  • Amount of information: If the initial results confirm our expectations, we stop looking. If they don’t, we keep looking.

We should not stop at ‘Can I believe this?’, but should progress to ‘Must I believe this?’.

These are some of the ways we skew the evidence in the world, frame it to support our beliefs. This is healthy. People who can’t frame effectively run the risk of depression.

Beliefs are like possessions: We acquire the ones we think will make us feel good, and cling tightly to the ones we have.

6. Believing what we are told: The biasing effects of secondhand information

Telling a good story: The speaker needs their message to be worthy of the listener’s attention. For the listener the interaction must be worthwhile in some way. The message must be understandable (not assume too much knowledge of the listener) and yet not too detailed (not assume too little knowledge of the listener).

Sharpening and Leveling: What the speaker construes as the gist of the message is emphasized, ‘sharpened’, whereas details thought to be less important are de-emphasized, ‘leveled’.

One result of sharpening and leveling is that we often develop exaggerated or extreme views of people we have only been told about.

One way to make a message more entertaining or seemingly informative is to increase its immediacy. Instead of the story happening to a friend of a friend’s colleague, have it happen directly to a friend or family member, or even better, you. Often such alterations are intended for the self-aggrandizement of the speaker. It places them closer to center-stage. Other times it is simply an effort to make the story more salient, more vivid and concrete.

Increasing immediacy makes it difficult for the listener to accurately gauge the reliability of your message:

  • Reducing the hops: We all know that the more hops a message has been through, the less reliable it is. Increasing the immediacy of a message for the sake of entertainment or self-aggrandizement also makes a message seem more reliable than it is.
  • Changing the origination (‘my brother’, instead of ‘my brother’s friend from work’): Your brother might be trustworthy, so we trust the message, but your brother’s friend is an inveterate liar. The original source of the message has been obscured.

Presenting and accepting remote accounts as secondhand is misleading when estimating the prevalence of a phenomenon in the general population. If something happened to lots of your friends’ cousins, it is happening a lot. If lots of your friends’ cousins heard a story about someone else’s friend’s cousin, and passed it on, there might be a single case.

In attempting to be informative, we might level some of the qualifications in the original message. This is often the case when scientific findings are reported in the news media. Sometimes the facts are stretched to ‘help’ their audience get the message. This results in hysterical public service campaigns, that elicit far more fear than the original risk warranted (drugs, ‘stranger danger’, terrorism, etc). The facts are stretched beyond recognition to make a more compelling story. Parents are often guilty of this distortion in attempting to motivate behavior in their children.

The desire to entertain often creates a conflict for the speaker between satisfying the goal of accuracy and the goal of entertainment. There is often a tacit agreement between speaker and audience that the truth may be stretched: Tabloids and ‘light entertainment’ news shows are granted this permission. “One of the most common sources of such inaccuracy is the dissemination of unfounded or fallacious claims by news and other media organizations that try and entice by their ability to entertain.” The demand for news has been met by an artificial increase in supply.

Plausibility: Inaccurate or fictitious stories are sometimes told and retold because they just seem so plausible, that we let our critical guard down. Our standards for plausibility are often very low, a decent irony is often enough (the creator of the song “Don’t worry be happy” committing suicide, for example, or someone at the patent office resigning because there was nothing left to invent).

Summary: As we have seen in previous chapters, the data from our own experience is often biased and incomplete. In this chapter we saw that data from others is too. It is therefore important to locate unbiased, complete sources of base-rate data (scientific inquiry mainly), and use those to assess how likely it is that our own perceptions or those of our social group are true. If the base rate data and our personal experience concur, then we are likely correct. If they don’t, a reliable base rate should be our guide. If the base rate data is not reliable, at least we know that we don’t know.

How to assess secondhand information:

  • Consider the source: The New York Times or the National Enquirer? A rock star or a researcher? An actor who plays a doctor on TV, or a practicing medical doctor?
  • Trust facts, distrust projections: Predicting the future is hard, even for an expert. What looks like an exponential curve can turn out to be sigmoidal.
  • Watch for Sharpening and Leveling.
  • Be wary of testimonials. One striking human interest story does not tell you anything about prevalence or risk.

7. The imagined agreement of others: Exaggerated impressions of social support

What we believe is heavily influenced by what we think others believe. This is usually a good strategy. However we often exaggerate the extent to which others hold the same beliefs as us. We think we have more social support for our opinions than we really do.

This is the false consensus effect. It is a relative effect: We realize when our belief is in the minority, but we underestimate by how much.

Reasons:

  • Motivational: A desire to maintain a positive assessment of our own judgment.
  • Social: We interact with people who agree with us
  • Context interpretation: We assume that given the same context, everyone will infer the same thing. As we see others in the same context, we assume they come to the same conclusions as us. As a result the false consensus effect is strongest for beliefs we attribute to external factors (buy stock in Ford or Google), and weakest for ones we attribute internally (name your son Jacob or Ian). The context is often ambiguous, and different people resolve the ambiguity in different ways, often without noticing it.

Inadequate feedback from others: People are generally reluctant to openly question another person’s beliefs (Adults, that is. Children do it all the time). Only intimate friends and relatives can be counted upon for honest feedback. More casual acquaintances often side-step the awkwardness of disagreement and thus leave us without essential corrective feedback. Because so much disagreement remains hidden, our beliefs are not properly shaped by healthy scrutiny and debate.

III. Examples of questionable and erroneous beliefs

(not summarized)

IV. Where do we go from here

11. Challenging dubious beliefs: The roles of social science

The real purpose of [the] scientific method is to make sure Nature hasn’t misled you into thinking you know something you actually don’t know. R. Pirsig, Zen and the Art of Motorcycle Maintenance.

To avoid erroneous beliefs it is necessary that we develop certain habits of mind that can shore up various deficiencies in our everyday inferential abilities. Fortunately, there is reason to believe that these corrective habits of mind are not difficult to develop. Students familiar with the work on errors and biases readily apply the learning to their everyday lives.

The most important mental habit is realizing the folly of trying to draw conclusions from incomplete and unrepresentative data.

Ask: What do the other three cells look like?

Information presented as firsthand is often secondhand or more remote, and from a less trustworthy source. Be sure you know where your information originated before assessing its value.

Many of these habits are core to scientific research. Familiarity with that world helps: It gives you valuable exposure to uncertainty and doubt, a healthy skepticism, and the awareness of how hard it can be to really know something with certainty.

Exposure to the ‘probabilistic’ sciences (psychology, economics, social sciences, etc) may be more effective in teaching these habits than the ‘deterministic’ (physics, chemistry, etc) sciences. Probabilistic sciences deal with phenomena that are not perfectly predictable, and with causes that are generally neither necessary nor sufficient. The death of a spouse is associated with deterioration in health. However not all partners’ health deteriorates (not sufficient), and people’s health deteriorates for other reasons (not necessary). To compensate for this lack of determinism, probabilistic scientists must be aware of statistical regression, sample bias, and the importance of control groups.

Social sciences deal with everyday phenomena, so it is easier for them than for the physical sciences to make the probabilistic tools of reasoning readily understandable and available.

Social scientists should no longer suffer from physics envy. They cannot match the explanatory power or predictive precision of physics, but precisely because of this, they are better equipped to deal with the messy, complex phenomena of real life. Their tools and process, rather than their content, may turn out to be the social sciences’ most valuable discoveries.

2 Comments »

  1. Graham King said,

    February 21, 2010 at 07:08

    @Jerry, Thanks for your comment.

    As far as I know, this book is the best coverage of the Hot Hand. Thomas Gilovich, the author, did most of that research.

    Other excellent books which cover the social implications of mis-understanding randomness are ‘Fear’ by Dan Gardner, and ‘Nudge’ by Cass Sunstein and Richard Thaler. The ‘Behavior’ section of my blog covers some of this: http://www.darkcoding.net/category/behaviour/

    An excellent journalist, blogger and medical doctor, who often covers randomness is Ben Goldacre, blogging at BadScience.net. Here are a couple of relevant recent posts: http://www.badscience.net/2010/02/guns-dont-kill-people-puppies-do/ and http://www.badscience.net/2010/01/voices-of-the-ancients/

    I also really, really recommend investing an hour into listening to this speech of his: http://www.badscience.net/2008/01/mp3-lecture-more-than-molecules-–-how-pill-pushers-and-the-media-medicalise-social-problems/

  2. Jerry D. Taylor said,

    February 19, 2010 at 19:32

    Hi I too read the book and thought it was excellent. I am impressed with how much you seemed to have gotten from the book. Do you have other books of this nature to recommend? I am especially interested in the Hot Hand concept or just randomness in general. I will forward a link to your site to a few friends to see what they think. Thanks for taking time to write the review. Jerry Taylor
