Many readers will remember the sequence of events in which former football star O.J. Simpson was acquitted in a criminal trial of the murder of his former wife and a friend, yet found liable for damages in a civil suit brought by the family of one of the victims. Leaving aside the sociological roots of the not-guilty verdict in the United States' tragic history of racial antagonisms, in analytical terms the discrepancy can be explained by the higher standard of proof in a criminal trial (proof beyond a reasonable doubt) compared with a civil proceeding, where a claim for damages can be sustained on a preponderance of the evidence or, in some common law jurisdictions, on the balance of probabilities.
The idea of a standard of proof is critical to understanding the question posed in the title of this posting. A classic article published in 1978 by economist Talbot Page (1) used this concept to analyze public policies toward "environmental risks" like toxic chemicals, which share such characteristics as incomplete knowledge of the mechanism of action, long latency periods between exposure and illness, and irreversibility. He pointed out that most forms of scientific inquiry are organized around minimizing Type I errors – that is, 'false positives' or incorrect rejections of the null hypothesis. Page used the analogy of the standard of proof in criminal trials, and went on to argue that minimizing Type I errors may be a thoroughly inappropriate principle when applied to use of scientific evidence in public policy, because it fails to take into account uncertainty and consequences. Stated another way, "a risk/benefit assessment," albeit often an implicit one, "is part of every public policy action which is based upon the interpretation of the results of a scientific investigation." (2)
Waiting for "evidence of dead bodies" may be inappropriate when responding to health threats from environmental hazards.
Photo by biofriendly, reproduced under a Creative Commons licence.
This point has often been lost in controversies about controlling toxic exposures in the environment and the workplace, with industry resisting regulation by demanding stronger – usually epidemiological – evidence and trying to cast the issue as one of scientific uncertainty: demanding what another economist has described as a "tobacco industry standard of proof." (3) Page correctly pointed out that: "In its extreme, the approach of limiting false positives requires positive evidence of 'dead bodies' before acting." This is, in fact, the standard of proof that has often been applied to research on the health effects of environmental hazards. A further point of importance is that the conventional threshold of statistical significance – 95 percent confidence, or equivalently a 5 percent significance level – may require extremely large and unmanageable sample sizes when the prevalence of a particular adverse outcome is only moderately elevated over background levels. (4) As Page pointed out, "there is literally no information content in a negative finding unless there is an analysis of ... the probability of a false negative." (1)
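To give a rough sense of the sample-size problem, here is a minimal sketch using the standard normal-approximation formula for comparing two proportions. The formula is a textbook one, not taken from Page or from reference (4), and the function name, the 1 percent background prevalence, the 80 percent power target and the relative risks are all illustrative assumptions rather than figures from any study.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_per_group(p_background, relative_risk, alpha=0.05, power=0.80):
    """Approximate number of people needed in EACH group (exposed and
    unexposed) to detect a given relative risk with a two-sided test.

    Hypothetical helper for illustration; uses the usual
    normal-approximation formula for two independent proportions.
    """
    p_exposed = p_background * relative_risk
    if not (0 < p_background < 1 and 0 < p_exposed < 1):
        raise ValueError("prevalences must lie strictly between 0 and 1")
    if p_exposed == p_background:
        raise ValueError("relative risk must differ from 1")

    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # guards against false positives
    z_beta = NormalDist().inv_cdf(power)           # guards against false negatives
    p_bar = (p_background + p_exposed) / 2

    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p_background * (1 - p_background)
                        + p_exposed * (1 - p_exposed))
    ) ** 2
    return ceil(numerator / (p_exposed - p_background) ** 2)


if __name__ == "__main__":
    # Assumed background prevalence of 1 percent (illustrative only).
    for rr in (1.2, 1.5, 2.0):
        n = sample_size_per_group(0.01, rr)
        print(f"relative risk {rr}: about {n:,} people per group")
```

Under these assumptions, a 20 percent elevation over a 1 percent background prevalence calls for tens of thousands of people in each group, while a doubling of risk calls for only a few thousand. A study much smaller than that has a high probability of a false negative, which is why a "negative finding" from it carries so little information.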
Choosing a standard of proof for purposes of public health policy therefore is unavoidably an ethical decision, having to do – as yet another author pointed out at around the same time – with the relative acceptability of being wrong in different kinds of ways (5) while we wait for evidence that may or may not be obtainable. Interestingly, a workshop on conceptual and methodological issues in public health science held at the University of Cambridge in 2010 revisited these questions, suggesting that understanding of them in the relevant research communities remains incomplete, even as they remain topical with respect to such issues as environmental causes of breast cancer.
The question of how much evidence is needed for action on social determinants of health underscores the value-laden nature of choices about the appropriate standard of proof. At least two issues are critical.
First, what kinds of research findings are relevant? Clinical epidemiology now widely accepts a hierarchy of evidence with the randomized controlled trial (RCT) at the top; presumably, this is what two authors writing on global health governance had in mind when they claimed that "[f]ew global health interventions are evidence-based, and interventions to improve population health among the poor are often untested ..." To some of us, this assertion is nothing short of bizarre, and neglects the fact that many interventions outside clinical settings cannot be assessed using RCTs, for reasons of ethics, logistics, or both. Colleagues and I pointed out a decade ago, in the context of research on preventing mental illness, that "choosing certain research strategies and standards of proof means the big questions ... probably will not be studied in ways that demonstrate the effectiveness of larger-scale, contextual interventions, and even the small questions will be asked in ways that seriously circumscribe the set of possible answers."
A methodologically pluralist approach, organized around what a former colleague calls a "portfolio of evidence," will yield more meaningful and policy-relevant answers. Unbeknownst to us, Michael Marmot had made a similar point the previous year in a general discussion of evidence for influences on population health: "The further upstream we go in our search for causes ... the less applicable is the randomized controlled trial. .... We must therefore rely on observational evidence and judgment in formulating policies to reduce inequalities in health. In this process, the best should not be the enemy of the good. While we should not formulate policies in the absence of evidence to support them, we must not be paralyzed into inaction while we wait for the evidence to be absolutely unimpeachable." (6) He continues to make this point.
395,000 Ontarians received help from food banks in March, 2011.
Image courtesy Ontario Association of Food Banks.
Second, is it necessary to wait for evidence that a particular policy or intervention leads to improved health outcomes, or is it sufficient to have evidence of reduction in risk factors or what might be called intermediate biological variables (like markers of allostatic load, in the context of prolonged stress) that are known to have an adverse effect on health outcomes? This question gains urgency from knowledge of the cumulative effects of negative contextual influences on health over the life course: "waiting for dead bodies" in this case, as in others, can amount to carrying out a large-scale experiment on non-consenting subjects, the results of which may not be available for a generation. Obviously, ongoing evaluation of interventions and policy changes is important, but how much more do we need to know before (for instance) doing what it takes to reduce food insecurity among people for whom eating a healthy diet while paying market rents is arithmetically impossible?
This is a rather polemical way of stating the question, but it is useful for getting at the hard politics of debates about evidence. Many policies and interventions needed to reduce health disparities by way of social determinants of health will be explicitly redistributive – starting with reductions in income inequality, as noted in a forthcoming editorial in the American Journal of Public Health. As mentioned above, companies facing costly regulation of their activities have long found it attractive to frame their opposition as based on the insufficiency of scientific evidence. Similarly, those who stand to lose from tackling "the inequitable distribution of power, money, and resources" – one of the three overarching recommendations of the Commission on Social Determinants of Health – may frame their opposition in terms of the need for more evidence rather than simple self-interest. One-percenters, and those on a fast track to that status, are not a natural constituency for redistributive policies. This is not, of course, the only explanation for hostility to the social determinants of health agenda, but it cannot be disregarded. Against this background, it's especially important to keep in mind that the appropriate questions are not only about the strength of evidence, but also about how uncertainty should be resolved in a context where "deferring a decision is a decision in itself." They are, in other words, rooted firmly in the domain of public health ethics. Only by insisting on this point can we be sure that debates about when and how to act involve – as they should – the language of values and social justice.
(1) Page, T. (1978) A Generic View of Toxic Chemicals and Similar Risks. Ecology Law Quarterly, 7, 207-244.
(2) Darby, W. (1979) An Example of Decision-Making on Environmental Carcinogens: The Delaney Clause. Journal of Environmental Systems, 9, 109-117.
(3) Crocker, T.D. (1984) Scientific Truths and Policy Truths in Acid Deposition Research. In T. Crocker, ed., Economic Perspectives on Acid Deposition Control (pp. 65-79). Ann Arbor Science Acid Precipitation Series vol. 8. Boston: Butterworth.
(4) See e.g. Higginson, J., Muir, C.S., Muñoz, N. (1992) Human Cancer: Epidemiology and Environmental Causes (pp. 39-44). Cambridge: Cambridge University Press.
(5) Jellinek, S. D. (1981) On the Inevitability of Being Wrong. Annals of the New York Academy of Sciences, 363, 43-47.
(6) Marmot, M. (2000) Inequalities in Health: Causes and Policy Implications. In A. Tarlov & R. St. Peter, eds., The Society and Population Health Reader, vol. 2: A State and Community Perspective (pp. 293-309). New York: New Press.