Introductory note:
Last year, with others in my doctoral cohort at Columbia University, I spent significant time reading, thinking, and writing about some of the philosophical and theoretical issues in causal inference. This essay was initially written as part of the requirements for Prof. Sharon B. Schwartz's class on Causal Inference. The walls of the Columbia Mailman School of Public Health can tell you stories about how brutal the class material is. But they can also tell you how wonderful Sharon is as an instructor, mentor, and person.
Maybe it was the brooding fall weather, the 2 a.m. anxiety that hit me every Sunday before class, the urge to write too much where it is not needed, or something else entirely that continues to escape my intellect, but the essays turned out much better than I had expected. So I thought I would start putting them up.
This is a work in progress. These are my independent takes. That said, the things you like here are most likely inspired by others around me who are better than me. For the things you don't like, I am the sole custodian.
This is hopefully the first of many essays, written as my thoughts evolve and I go from knowing nothing to knowing enough to know that I don't know much.
My goal in publishing this is simple: help people think and talk about causal inference - one of the most meaningful pursuits across all disciplines - without the barrier of math.
In this essay, I posit that causation, i.e., a causal effect, in epidemiology under the potential outcomes or counterfactual framework is about estimating the effect of a cause compared with other cause(s), where a cause is a variable with well-defined counterfactuals that is manipulable or potentially exposable over all units[1] of the study population. In the first section, I explain the elements of this definition based on a synthesis of the literature from thinkers of the counterfactual school. In the second section, I elaborate on some implications of this definition, drawing on arguments made by those same thinkers.
Elements of the definition of causation
Deconstructing the above definition is necessary. The first noteworthy element is the 'effect of a cause'. Under the counterfactual framework, the goal is to investigate the effects of a cause X (say) and not to hunt down (all possible) causes of the outcome Y. This line of questioning, 'does X cause Y?', centers the cause and treats Y merely as a measurement at the receiving end of the causal effect. This is different from the traditional approach in epidemiology, where investigators try to find the necessary, sufficient, or component causes of an outcome. Holland 1986 notes the distinction between the effect of a cause and the causes of an effect explicitly. Rubin 1974 nods to it at the outset by treating 'causal treatment effects' as the grounding notion for causal inference in randomized and non-randomized studies.
The second element to focus on is that causes are manipulable or potentially exposable over all units of the study population. All the thinkers in the counterfactual school elaborate on causation following the language of (randomized) experiments, where the experimenter assigns a treatment. It is this potential to assign the exposure to any study unit in the population that makes it a cause. If you cannot control the process of assignment, it cannot be studied as a cause in the counterfactual framework. To denote this, Rubin 1974 uses causal and treatment effects interchangeably. Holland 1986 clarifies that his juxtaposition of cause and treatment is meant to depict a cause as an action (or set of actions: 'X' and 'not X') to which a study unit can theoretically be exposed, regardless of whether the action is actually present or absent, and regardless of randomization. I believe Hernan & Robins 2020 go a step further, since they clearly and explicitly treat the intervention as the cause and categorically avoid the word exposure. The use of 'intervention' signals that the investigator, who is ideally an experimenter or, more reluctantly, an observer, has control over the cause X to assign, remove, or change (manipulate) it.
The third element is the notion of a 'cause compared to another cause'. As noted above, the counterfactual school thinkers have adopted the language of experiments. By counterfactual, the thinkers of this school mean: 'What would Y be had X not been the way it is?' This way of thinking, according to Rubin 1974, Hernan 2005, and Hernan & Robins 2020, is how humans naturally think about causation, although those without training may be unable to formalize it. Rubin and Holland further argue that lay people solve causal problems this way in their day-to-day lives. Holland 1986 also notes that scientists, in the absence of statistical apparatus, inferred causation through this kind of counterfactual thinking, relying on careful observation and a justifiable assumption that everything else was homogeneous before and after the exposure. Hence, the causal effect of X on Y can only be captured by the contrast between Y under X and Y under the absence of X (control) or under another cause, say X'. Mathematically, the causal effect = Y_X − Y_X' for the given study unit at the same time point in the post-treatment period (i.e., the temporality of cause preceding outcome needs to be ensured).[2]
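To make the contrast concrete, here is a minimal sketch in Python for a single hypothetical study unit. The values are invented purely for illustration; the point is only that the causal effect is a difference between two potential outcomes of the same unit at the same post-treatment time.

```python
# A minimal sketch with invented values for one hypothetical study unit.
# Both potential outcomes are written down here only for illustration;
# in reality, at most one of them can ever be observed (see the next element).

y_under_x = 1        # Y_X: the outcome had the unit received cause X
y_under_x_prime = 0  # Y_X': the outcome had the unit received cause X' instead

# The individual causal effect of X (compared with X') for this unit
individual_causal_effect = y_under_x - y_under_x_prime
print(individual_causal_effect)  # 1
```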
The fourth element is about estimating causal effects. Theoretically, the causal effect as defined above (Y_X − Y_X') would require simultaneous observation of the two Ys for a given individual study unit. Both Holland 1986 and Rubin 1974 note the pragmatic impossibility of such a measurement, with Holland calling it the 'fundamental problem of causal inference'. Hence, individual causal effects cannot be assessed. The statistical solution in such a case is to estimate an aggregate causal effect as the aggregate of the individual causal effects for a group of study units. Rubin 1974 and Hernan & Robins 2020 both note that different aggregate effects can be computed. However, all the thinkers seem to agree on the average causal effect as the common, easily interpretable, and computable aggregate causal effect. Hernan & Robins 2020 also present the calculations for measures of effect familiar to epidemiologists, including the (causal) risk ratio, risk difference, and odds ratio, under the counterfactual framework. At first glance, these do not look different from those used for presenting associations. However, Holland 1986 notes the subtle point of departure between the two. Associational inferences are statistical inferences based on the joint distribution of X and Y over the study population, while inferences about (individual) causal effects are about the Y_X − Y_X' contrast for a given study unit at some post-treatment time, under several assumptions, which I have discussed as elements here.
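As a toy illustration of the aggregate measures, the sketch below pretends we can see both potential outcomes for every unit in a small hypothetical population of six units, an omniscient view that the fundamental problem of causal inference rules out in practice. The numbers are invented; the sketch only shows how the average causal effect and the causal risk difference, risk ratio, and odds ratio would be computed from the counterfactual risks.

```python
# A toy, omniscient potential-outcomes table for six hypothetical units.
# y1 = Y_X (outcome under cause X), y0 = Y_X' (outcome under cause X').
units = [
    {"y1": 1, "y0": 0},
    {"y1": 1, "y0": 1},
    {"y1": 0, "y0": 0},
    {"y1": 1, "y0": 0},
    {"y1": 0, "y0": 1},
    {"y1": 1, "y0": 1},
]
n = len(units)

# Counterfactual risks: the proportion of units with Y = 1 had everyone
# received X, and had everyone received X' instead.
risk_under_x = sum(u["y1"] for u in units) / n        # Pr[Y_X = 1]
risk_under_x_prime = sum(u["y0"] for u in units) / n  # Pr[Y_X' = 1]

# Average causal effect: the average of the individual contrasts Y_X - Y_X'.
# For a binary outcome this equals the causal risk difference.
average_causal_effect = sum(u["y1"] - u["y0"] for u in units) / n

causal_risk_difference = risk_under_x - risk_under_x_prime
causal_risk_ratio = risk_under_x / risk_under_x_prime
causal_odds_ratio = (risk_under_x / (1 - risk_under_x)) / (
    risk_under_x_prime / (1 - risk_under_x_prime)
)

print(average_causal_effect)   # 0.166...
print(causal_risk_difference)  # 0.166...
print(causal_risk_ratio)       # 1.33...
print(causal_odds_ratio)       # 2.0
```

In a real study, only one of y1 or y0 is ever observed for each unit, which is why the associational contrast between units that happened to receive X and units that received X' is not, in general, the causal contrast computed above.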
The final element requires spelling out what is entailed by 'well-defined'. My reading is that well-defined counterfactuals of a cause include at least three things. The first, and arguably the most important, component of a well-defined counterfactual has to do with the postulate that all study units must be potentially exposable to the cause. Except for perhaps an implicit nod by Holland 1986, Rubin 1974 and Hernan & Robins 2020 side-step the fact that this core idea is a postulate. The emphasis is on postulate because this is an assumption that is not universally true (like an axiom) but is needed as a foundation for the framework. So, at least theoretically, it should be possible to expose any random study unit to the specific treatment (say X vs. X'). Only then is it possible to have the counterfactuals Y_X and Y_X' necessary for defining the effect of cause X compared with cause X'. If Y_X' is impossible, even in theory, then one cannot use the counterfactual framework to define and estimate individual causal effects, since it would break the consistency assumption for treatment X' per Hernan & Robins 2020. In the absence of a possible conception of individual causal effects, aggregate causal effects are impossible.
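The link to consistency (see footnote [2]) can be sketched in the same toy notation, assuming a hypothetical unit: the observed outcome is the potential outcome under the treatment actually received, so if a counterfactual such as Y_X' is not even conceivable for a unit, there is nothing for the contrast Y_X − Y_X' to refer to.

```python
# A small sketch of consistency with invented values: the observed Y of a
# unit equals its potential outcome under the treatment it actually received.
def observed_outcome(received_x: bool, y_under_x: int, y_under_x_prime: int) -> int:
    """If the treatment is X, Y = Y_X; if it is X', Y = Y_X'."""
    return y_under_x if received_x else y_under_x_prime

print(observed_outcome(True, y_under_x=1, y_under_x_prime=0))   # Y = Y_X = 1
print(observed_outcome(False, y_under_x=1, y_under_x_prime=0))  # Y = Y_X' = 0
```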
The second component can be considered in terms of the construct validity of the cause. Beyond the possibility of the counterfactual, the appropriate specification of the counterfactual is necessary. Hernan 2005 notes this in the context of studying the effect of a BMI change achieved through exercise vs. dieting on a health outcome. Simply specifying BMI as the cause is ill-defined in this case, since the underlying counterfactuals would be different for the exercise and dieting scenarios. Hernan & Robins 2020 also note this for surgical treatments where one suspects variability in performance across surgeons. A cause with good construct validity would then be a detailed intervention protocol including all the underlying considerations. The issue of construct validity also applies to the outcome Y.
The third and more minor component is measurement validity, as noted by both Holland 1986 and Rubin 1974. Rubin 1974 focuses more explicitly on the measurement validity of Y, noting that the causal effects estimated by Y_X − Y_X' at the post-treatment time for the study units assume that measurement or technical errors are negligible, or at least known to the investigator and manageable.
Implications of the definition
While there might be multiple implications of defining causal effects under the counterfactual framework in the above way, I wish to draw attention to the one I consider most consequential: the framework limits what epidemiologists can study as causes. While Hernan expresses some discomfort, albeit implicitly, Holland 1986 is more explicit in noting that attributes, i.e., variables recording inherent properties of the study units (e.g., gender or race in the case of individual persons), cannot be studied as causes under the prescribed framework (although one can study them as causes under another framework, or certainly make associational inferences about them).
For instance, the counterfactual school thinkers would argue that the question 'Does having a high IQ cause better school performance?' cannot be answered. Here, we assume the construct and measurement validity of IQ (exposure) and school performance (outcome), and that IQ (high vs. low, for simplicity) is an inherent attribute of a child (the study unit). Theoretically, a child cannot have both a high IQ and a low IQ. Whatever IQ they have is integral to that child's identity and makeup. Hence, a given child is not potentially exposable to different IQ levels. Put otherwise, IQ is non-manipulable. An investigator cannot control the assignment of this exposure to even the smallest extent. Further, adopting the line of argument from Hernan 2005, one cannot design an experiment in which IQ levels are randomly assigned to a group of children enrolled in a study. There can never be a confirmatory independent experimental assessment of the effects obtained from an observational study. Hence, a causal treatment effect can never be established. To the counterfactual school of thinkers, studying causal effects is crucial because they are directly tied to interventions. As noted in the epidemiology wars, it is the ability to intervene in an evidence-based manner that makes epidemiology the science of public health.
[1] I follow Holland 1986 in using the generic term study unit. Other thinkers seem to focus more explicitly on individuals. However, like Holland, I believe that nothing stops us from applying the definition to any kind of study unit.
[2] I am assuming that I do not need to explain the notation since it is standard in the sources. Hence, I am also side-stepping the formal definition of consistency: if the treatment is X, then Y = Y_X.