
Correlation does not imply causation


Revision as of 15:33, 26 January 2007

When a correlation is observed between two variables, whether in the sciences or through statistics, it may be assumed that there is a cause-and-effect relationship between them. However, according to logical reasoning, a correlation does not imply causation and should not immediately be considered a cause-and-effect relationship. When two events that occur together are prematurely considered to be cause and effect, and the only evidence is that the relationship is correlational, it is a logical fallacy to conclude that one factor is causing the other. This type of logical fallacy is known as cum hoc ergo propter hoc (Latin for "with this, therefore because of this") and false cause.

Usage

In the most literal sense, the statement "Correlation does not imply causation" may sometimes be incorrect, depending on which meaning of "imply" is intended. In logic, "imply" means

  • To involve as a necessary circumstance. Under this meaning, the phrase is correct.

This is the meaning intended by statisticians when they use the phrase. Indeed, p implies q has the technical meaning of logical implication: if p then q symbolized as p ⇒ q.

However, in everyday English, "imply" often means

  • To indicate or suggest.

Under this second meaning, the statement "Correlation does not suggest causation" is not necessarily true: a demonstrably consistent correlation often suggests, or increases the probability of, some causal relationship (that is, it implies causation in the latter sense of the term). What the correlation does not do is prove causation, as arguments that follow the cum hoc ergo propter hoc pattern of reasoning assert.

Edward Tufte, in a criticism of the brevity of Microsoft PowerPoint presentations, deprecates the use of "is" to relate correlation and causation (as in "Correlation is not causation"), on the grounds that the statement is incomplete and therefore inaccurate. While it is not the case that correlation is causation, simply stating their nonequivalence omits information about their relationship. Tufte suggests that the shortest true statement that can be made about causality and correlation must be expanded to at least either

Empirically observed covariation is a necessary but not sufficient condition for causality.

or

Correlation is not causation but it sure is a hint.

General pattern

The cum hoc ergo propter hoc logical fallacy can be expressed as follows:

  • A occurs in correlation with B.
  • Therefore, A causes B.

In this type of logical fallacy, one makes a premature conclusion about causality after observing only a correlation between two or more factors. Generally, if one factor (A) is observed to only be correlated with another factor (B), it is sometimes taken for granted that A is causing B even when no evidence supports this. This is a logical fallacy because there are at least four other possibilities:

  1. B may be the cause of A, or
  2. some unknown third factor is actually the cause of the relationship between A and B, or
  3. the "relationship" is so complex that it can be labelled coincidental (i.e., two events occurring at the same time that have no simple relationship to each other besides the fact that they are occurring at the same time), or
  4. B may be the cause of A at the same time as A is the cause of B (contradicting the claim that the only relationship between A and B is that A causes B). This describes a self-reinforcing system.

In other words, no conclusion can be drawn about the existence or the direction of a cause-and-effect relationship from the fact that A is correlated with B alone. Determining whether there is an actual cause-and-effect relationship requires further investigation, even when the relationship between A and B is statistically significant, a large effect size is observed, or a large part of the variance is explained.
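The point about direction can be illustrated with a small simulation. The following is only a sketch under stated assumptions: the data-generating process (B drives A with a made-up coefficient of 2.0) and the helper names pearson and simulate_b_causes_a are hypothetical and do not come from any cited study. The Pearson coefficient is symmetric in its two arguments, so the same value is obtained whichever variable is treated as the "cause".

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def simulate_b_causes_a(n=2000, seed=0):
    """Hypothetical world in which B drives A, not the other way round."""
    rng = random.Random(seed)
    b = [rng.gauss(0, 1) for _ in range(n)]
    # A is computed from B plus independent noise (assumed coefficient 2.0).
    a = [2.0 * b_i + rng.gauss(0, 1) for b_i in b]
    return a, b


a, b = simulate_b_causes_a()
r_ab = pearson(a, b)
r_ba = pearson(b, a)
print(f"r(A, B) = {r_ab:.3f}, r(B, A) = {r_ba:.3f}")
```

Both calls return the same strong correlation even though, by construction, only B influences A. Nothing in the coefficient itself distinguishes "A causes B" from possibility (1) above.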

Examples

Sleeping with one's shoes on is strongly correlated with waking up with a headache.
Therefore, sleeping with one's shoes on causes headache.

The above example commits the correlation-implies-causation fallacy, as it prematurely concludes that sleeping with one's shoes on causes headache. A more plausible explanation is that both are caused by a third factor, in this case alcohol intoxication, which thereby gives rise to a correlation. Thus, this is a case of possibility (2) above.
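A third-factor explanation of this kind can be sketched in a simulation. All probabilities below are invented for illustration, and the function names simulate_night and headache_rate are hypothetical. In the simulated data, shoes and headaches each depend only on intoxication and never on each other, yet the pooled data still show a large gap in headache rates.

```python
import random


def simulate_night(rng):
    """One hypothetical person-night; all probabilities are made up."""
    drunk = rng.random() < 0.3
    # Both outcomes depend only on intoxication, never on each other.
    shoes_on = rng.random() < (0.7 if drunk else 0.05)
    headache = rng.random() < (0.8 if drunk else 0.1)
    return drunk, shoes_on, headache


def headache_rate(nights, shoes_on, drunk=None):
    """Share of nights with a headache among nights matching the filters."""
    selected = [h for d, s, h in nights
                if s == shoes_on and (drunk is None or d == drunk)]
    return sum(selected) / len(selected)


rng = random.Random(42)
nights = [simulate_night(rng) for _ in range(100_000)]

# Pooled comparison: shoes appear to "cause" headaches.
pooled_gap = headache_rate(nights, True) - headache_rate(nights, False)
# Within each level of the third factor, the gap largely disappears.
drunk_gap = headache_rate(nights, True, True) - headache_rate(nights, False, True)
sober_gap = headache_rate(nights, True, False) - headache_rate(nights, False, False)

print(f"pooled gap: {pooled_gap:+.3f}")
print(f"drunk only: {drunk_gap:+.3f}")
print(f"sober only: {sober_gap:+.3f}")
```

Comparing like with like (only drunk nights, or only sober nights) is one simple way to check whether a suspected third factor accounts for a correlation. It works only for third factors that have actually been measured.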

A recent scientific example:

Young children who sleep with the light on are much more likely to develop myopia in later life.

This result of a study at the University of Pennsylvania Medical Center was published in the May 13, 1999, issue of Nature and received much coverage in the popular press at the time. However, a later study at Ohio State University did not find any link between infants sleeping with the light on and developing myopia. It did find a strong link between parental myopia and the development of child myopia, and it also noted that myopic parents were more likely to leave a light on in their children's bedroom. This is a case of possibility (2) above.

Another example:

Since the 1950s, both the atmospheric CO2 level and crime levels have increased sharply.
Hence, atmospheric CO2 causes crime.

The above example arguably makes the mistake of prematurely concluding a causal relationship where the relationship between the variables, if any, is so complex it may be labelled coincidental. The two events have no simple relationship to each other besides the fact that they are occurring at the same time. This is a case of possibility (3) above.
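One common way such a coincidental correlation arises is that both quantities simply rise over the same period. The sketch below assumes this shared-trend reading, which is an interpretation rather than something the example itself states. The two series are synthetic stand-ins, not real CO2 or crime data, and the helper names and parameters are hypothetical.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def trending_series(n, slope, noise, rng):
    """Linear upward trend plus independent Gaussian noise."""
    return [slope * t + rng.gauss(0, noise) for t in range(n)]


def first_differences(xs):
    """Period-to-period changes, which remove a linear trend."""
    return [later - earlier for earlier, later in zip(xs, xs[1:])]


rng = random.Random(1)
# Synthetic stand-ins: 600 periods, generated independently of each other.
co2 = trending_series(600, slope=0.1, noise=3.0, rng=rng)
crime = trending_series(600, slope=2.0, noise=60.0, rng=rng)

levels_r = pearson(co2, crime)
changes_r = pearson(first_differences(co2), first_differences(crime))
print(f"correlation of levels : {levels_r:.3f}")
print(f"correlation of changes: {changes_r:.3f}")
```

The levels are strongly correlated only because both series trend upward. Once the trend is removed by differencing, the two independently generated series show little correlation.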

Another example:

Not eating causes anorexia nervosa.

Depending on the evidence used to support this statement, it can be shown to be a correlation-implies-causation error of either type (1) or type (4) described above. Having the disease anorexia nervosa may be the cause of not eating, which is type (1). This could, however, also be an example of type (4): not eating may contribute to anorexia nervosa, while having developed anorexia nervosa in turn causes one not to eat. Empirical evidence would be necessary to make a causative statement.

A more complex example:

Scientific research finds that people who use cannabis (A) have a higher prevalence of psychiatric disorders (B) than those who do not.

This particular correlation is sometimes used to support the theory that the use of cannabis causes a psychiatric disorder (A is the cause of B). Although this may be possible, a cause-and-effect relationship cannot automatically be inferred from research that has only determined that people who use cannabis are more likely to develop a psychiatric disorder. From the same research, it may also be the case that (1) having a predisposition for a psychiatric disorder causes these individuals to use cannabis (B causes A), or (2) some unknown third factor (e.g., poverty) is the actual cause of the higher number of people (compared to the general public) who both use cannabis and have been diagnosed with a psychiatric disorder. Alternatively, it may be that the effects of cannabis are found more pleasurable by persons with certain psychiatric disorders. To assume that A causes B is tempting, but further scientific investigation of the type that can isolate extraneous variables is needed when research has only determined a statistical correlation.

Flying Spaghetti Monsterism, a parody religion founded in 2005, satirically states that there is a correlation between the number of pirates and many natural disasters. Bobby Henderson, the creator of this religion, put forth the argument that:

Global warming, earthquakes, hurricanes, and other natural disasters are a direct effect of the shrinking numbers of pirates since the 1800s.

This helps to show that things with statistically significant correlations are not necessarily causally related.

An episode of The Simpsons (Season 7, "Much Apu About Nothing") serves as a good example of this principle. Springfield had just spent millions of dollars creating a highly sophisticated "Bear Patrol" in response to the sighting of a single bear the week before.

Homer: Not a bear in sight. The "Bear Patrol" is working like a charm!
Lisa: That's specious reasoning, Dad.
Homer: Thanks, honey.
Lisa: By your logic, I could claim that this rock keeps tigers away.
Homer: Hmm. How does it work?
Lisa: It doesn't work. (pause) It's just a stupid rock!
Homer: Uh-huh.
Lisa: But I don't see any tigers around, do you?
Homer: (pause) Lisa, I want to buy your rock.

Determining causation


David Hume argued that causality cannot be perceived (and therefore cannot be known or proven), and instead we can only perceive correlation. However, he argued that we can use the scientific method to rule out false causes.

In modern science, causation is defined by a counterfactual. Suppose that a student performed poorly on a test and guesses that the cause was not studying. To prove this, we think of the counterfactual: the same student writing the same test under the same circumstances, but having studied the night before. This counterfactual is certainly possible, but it is not what happened, and so the counterfactual test score cannot be observed. If we could rewind history and change only one small thing (making the student study for the exam), then causation could be observed by comparing version 1 to version 2. Because we cannot rewind history and replay events after making small controlled changes, causation can only be inferred, never exactly known. This is referred to as the Fundamental Problem of Causal Inference: it is impossible to directly observe causal effects.
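The counterfactual idea can be made concrete with a toy model. In the sketch below, each simulated student has two potential scores, one with studying and one without, but only the score matching what actually happened is ever observed. The Student class, its field names and all numbers are hypothetical choices made for illustration. The individual effect can be computed here only because the simulation, unlike reality, stores both versions of history.

```python
import random
from dataclasses import dataclass


@dataclass
class Student:
    score_if_studied: float      # potential outcome with studying
    score_if_not_studied: float  # potential outcome without studying
    studied: bool                # what actually happened

    @property
    def observed_score(self):
        """The only score that reality ever reveals."""
        if self.studied:
            return self.score_if_studied
        return self.score_if_not_studied

    @property
    def true_effect(self):
        """Individual causal effect; computable only inside a simulation."""
        return self.score_if_studied - self.score_if_not_studied


def make_student(rng):
    baseline = rng.gauss(60, 10)
    effect = rng.gauss(8, 3)  # assumed benefit of studying, in points
    return Student(baseline + effect, baseline, studied=rng.random() < 0.5)


rng = random.Random(7)
students = [make_student(rng) for _ in range(5)]
for s in students:
    print(f"studied={s.studied!s:5}  observed={s.observed_score:5.1f}  "
          f"unobservable effect={s.true_effect:+.1f}")
```

A real data set would contain only the studied and observed columns. The last column is exactly what the Fundamental Problem of Causal Inference says cannot be measured for any single individual.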

The central goal of scientific experiments and statistical methods is to approximate the counterfactual state of the world as closely as possible. For example, one could run an experiment on identical twins who were known to consistently get the same grades on their tests. One twin is sent to study for six hours while the other is sent to the amusement park. If their test scores suddenly diverged by a large degree, this would be strong evidence that studying (or going to the amusement park) had a causal effect on test scores. In this case, correlation between studying and test scores would almost certainly imply causation.

Well-designed statistical studies replace the equality of individuals in the previous example with the equality of groups. This is achieved by randomly assigning the subjects to two or more groups. Placing the subjects randomly in the treatment and placebo groups ensures that the groups are highly likely to be reasonably equal in all relevant aspects. If the treatment has a significantly different effect from the placebo, one can conclude that the treatment is likely to have a causal effect on the disease. This likelihood can be quantified in statistical terms by the p-value.
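Randomization and the resulting p-value can be sketched as follows. This is a minimal illustration under assumed numbers: 400 simulated subjects, a built-in treatment effect of 6 points, and a permutation test as one of several possible ways to obtain a p-value. The function names are hypothetical, and the permutation test is a choice made here rather than a method named in the text.

```python
import random


def mean(xs):
    return sum(xs) / len(xs)


def randomize(subjects, rng):
    """Randomly split subjects into two equal-sized groups."""
    shuffled = subjects[:]
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def permutation_p_value(treated, control, rng, rounds=2000):
    """Share of random relabelings whose mean gap is at least the observed one."""
    observed = abs(mean(treated) - mean(control))
    pooled = treated + control  # new list, so the inputs are not modified
    hits = 0
    for _ in range(rounds):
        rng.shuffle(pooled)
        gap = abs(mean(pooled[:len(treated)]) - mean(pooled[len(treated):]))
        if gap >= observed:
            hits += 1
    return hits / rounds


rng = random.Random(3)
# Hypothetical baseline outcome for 400 subjects (higher is better).
baselines = [rng.gauss(50, 10) for _ in range(400)]
treatment_group, placebo_group = randomize(baselines, rng)

TRUE_EFFECT = 6.0  # assumed effect of the treatment, built into the simulation
treated_outcomes = [b + TRUE_EFFECT for b in treatment_group]
placebo_outcomes = placebo_group

p = permutation_p_value(treated_outcomes, placebo_outcomes, rng)
gap = mean(treated_outcomes) - mean(placebo_outcomes)
print(f"mean gap = {gap:.2f}, p = {p:.4f}")
```

Because group membership was decided by a random shuffle, no third factor can systematically differ between the groups. A small p-value therefore indicates that a gap of this size would rarely arise from the random assignment alone.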


References and notes

  1. Karl L. Wuensch, Department of Psychology, East Carolina University. When does correlation imply causation?
  2. Tufte, Edward R. (2006). The Cognitive Style of PowerPoint: Pitching Out Corrupts Within. Cheshire, Connecticut: Graphics Press. p. 5. ISBN 0-9613921-5-0.
  3. CNN, May 13, 1999. Night-light may lead to nearsightedness.
  4. Ohio State University Research News, March 9, 2000. Night lights don't lead to nearsightedness, study suggests.
  5. Henderson, Bobby (2005). "Church of the Flying Spaghetti Monster" (HTML). Retrieved 2006-06-11.
