Author Talk: Causation in Population Health Informatics & Data Science
Authors Olaf Dammann and Benjamin Smart talked with us about their new book “Causation in Population Health Informatics and Data Science.”
You are an interesting team, with an MD, an MS in epidemiology, and a PhD in metaphysics between you. How does this intense interdisciplinarity shape the book and your collaborative working relationship?
The book was conceptualized from the beginning as an interdisciplinary endeavor. Our collaboration started after OD had already finished a first draft of all chapters. BS then offered substantial changes and additions, in particular to chapters 3 and 4. Both of us went over the entire manuscript multiple times, and the result is an interdisciplinary “philosophy of population health science” view.
We had initially recognized the need for an account of causation for the population health sciences that goes beyond and is different from the potential outcomes approach (POA) currently championed in analytical biostatistics and epidemiology.
The POA is a counterfactual framework for causal inference in epidemiology and biostatistics. It is rooted in the idea that you cannot both expose and not expose a subject to an intervention in order to see if it works. Since you cannot create parallel universes, you must instead use a comparison case or group that is as similar as possible to the exposed subject except for treatment status. The POA offers tools to facilitate causal inference in observational studies, for example, by quasi-randomization of subjects to the treatment or control condition. Our approach is more holistic and less technical.
We think that decades, if not centuries, of work in the philosophy of science are basically neglected by theoreticians in the health sciences. Therefore, we deliberately wanted to de-emphasize techniques such as directed acyclic graphs (DAGs), which are helpful in modeling prima facie causal relationships but cannot contribute much to our understanding of what causes in population health actually are and how we can identify them.
Instead, we first outline our perception of what health data science is (chapter 2), review the metaphysics of causation (chapter 3), and discuss some issues related to causal inference (chapter 4). Chapter 5 addresses how we arrive at knowledge produced by the data > information > evidence sequence, chapter 6 focuses on the notion of population risk, and chapter 7 advocates for an approach that integrates multiple kinds of evidence from different “systems sciences.” This is (hopefully) only the first edition of the book, and we sincerely hope to expand it in the future.
Can you explain what you mean by a systems-based approach to causal inference in health informatics? Why is it an important direction for population health science?
A systems perspective on causality positions the book within the realm of systems approaches to the health sciences, i.e., systems biology and systems medicine (chapter 7). In population health, systems epidemiology is central to this idea, enabling the interplay between observation and experiment, and featuring a cycle from hypothesis via experiment and simulation towards refined hypothesis. Causal inference requires evidence from all such systems of representation (biological and social systems, etc.) as well as from systems of inquiry (lab experiment and population data science). In essence, we believe that for comprehensive causal-mechanical explanation, much more is needed than just one lab experiment or one randomized study. Much of this evidence will be non-experimental (as in the smoking-lung cancer connection), and the question is how to conceptually integrate evidence from experimental and non-experimental work performed in human populations and in the lab. The communication between scientists from these different research fields is currently not formalized in population health. We ask: might it be helpful to have a framework that does just that?
What can your readers gain from thinking more about the metaphysics of illness causation?
We don’t need to have an account of causation to find causes. We can know that something is a cause without knowing exactly why that is. But we do need an account of causation if we want to provide etiological explanations, which are based on causal and mechanistic evidence. Each system we look at (biology, psychology, sociology) might have its own kinds of causation, in the sense of causal mechanism. We will have an argumentative advantage if we have a solid grasp on what we are talking about when we are talking about causation at each of these levels and between them.
Take, for example, preterm birth. There are genetic, social, environmental, psychological, and other factors that have been identified as important. If these factors are all causal, are they all operating via the same kind of causation? If not, what are the differences? How can we integrate information from all relevant causal systems into one etiological explanation of preterm birth occurrence? Our book lays the groundwork for this kind of thinking and opens the door for holistic theory development in population health science.