Should the “gold standard” of randomized controlled trials really be the standard for transition treatment?

Standards for what counts as valid, acceptable evidence can be applied selectively or in bad faith to reach a preferred conclusion. This maneuver sometimes appears in anti-trans discourse: clinical studies showing the beneficial effects of transition treatment are labeled methodologically inadequate, and therefore insufficient to demonstrate that transitioning is beneficial or necessary in treating gender dysphoria. For instance, Hruz, Mayer, & McHugh (2017) assert:

Though there is very little scientific evidence relating to the effects of puberty suppression on children with gender dysphoria — and there certainly have been no controlled clinical trials comparing the outcomes of puberty suppression to the outcomes of alternative therapeutic approaches . . .

Coauthor Paul Hruz further stated during a 2017 Heritage Foundation panel:

“If you are going to take a standard approach to a treatment condition of any sort — not just gender issues, and make that drastic of a change — one would expect that there was a landmark study that was done, a randomized, controlled trial or a series of very important findings that consistently showed that this is a good idea,” warned Paul Hruz, a St. Louis-based doctor who is a professor of pediatrics, endocrinology, cell biology and physiology at the Washington University School of Medicine.

Such critiques pertain to different kinds of study design, and randomized controlled trials are just one kind of study design. In a randomized controlled trial, researchers randomly allocate subjects into a treatment group that is subjected to a given experimental condition, or a control group that is not. When the results for each group are compared following the study, any observed differences can thus be attributed to the experimental condition. (There are many other factors that can come to bear on the validity of these observations or the attribution of cause, but that is the basic idea.) This is the type of study design that actually is an experiment.

Conversely, an observational study does not carry out such an experiment – it does not randomly allocate subjects to treatment or control groups to compare the effects of a given intervention to its absence. For instance, an observational study might involve administering an antidepressant to all subjects in the study, and measuring their levels of depressive symptoms over time. This is not an experiment, as researchers are not controlling the variable of whether subjects receive a treatment or not.
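The difference between the two designs can be illustrated with a toy simulation. This is only a sketch under invented assumptions: the effect size, the hypothetical "baseline" factor, and the treatment probabilities are made up for illustration and are not drawn from any study discussed here. It shows random allocation recovering a known effect, a naive observational comparison being thrown off when that baseline factor influences who gets treated, and a simple stratified comparison recovering the effect once the factor is measured and accounted for.

```python
import random
from statistics import mean

TRUE_EFFECT = 2.0  # invented effect size, for illustration only


def diff_in_means(rows):
    """Mean outcome of treated subjects minus mean outcome of untreated subjects."""
    treated = [y for t, y in rows if t]
    control = [y for t, y in rows if not t]
    return mean(treated) - mean(control)


def simulate(n=20000, seed=0):
    rng = random.Random(seed)
    rct, obs, obs_high, obs_low = [], [], [], []
    for _ in range(n):
        # Hypothetical baseline factor that also affects the outcome.
        baseline = rng.gauss(0, 1)

        # Randomized trial: a coin flip decides who is treated.
        treated = rng.random() < 0.5
        rct.append((treated, baseline + TRUE_EFFECT * treated + rng.gauss(0, 1)))

        # Observational setting: the baseline factor influences who is treated.
        treated = rng.random() < (0.8 if baseline > 0 else 0.2)
        row = (treated, baseline + TRUE_EFFECT * treated + rng.gauss(0, 1))
        obs.append(row)
        (obs_high if baseline > 0 else obs_low).append(row)

    # Stratified estimate: compare like with like, then weight by stratum size.
    adjusted = (
        len(obs_high) * diff_in_means(obs_high)
        + len(obs_low) * diff_in_means(obs_low)
    ) / n
    return diff_in_means(rct), diff_in_means(obs), adjusted


if __name__ == "__main__":
    rct_est, naive_est, adjusted_est = simulate()
    print(f"randomized estimate:      {rct_est:.2f}")
    print(f"naive observational:      {naive_est:.2f}")
    print(f"stratified observational: {adjusted_est:.2f}")
```

In this toy setup the stratified estimate works because the factor driving treatment is fully measured; real observational research cannot always guarantee that, which is why the conduct of a study matters as much as its design label.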

But a randomized controlled trial is not always possible or advisable for every treatment. When the treatment being studied is transition – which can include hormone-suppressing puberty blockers, cross-sex hormone therapy, and a number of gender-affirming surgeries – the issues with conducting a randomized controlled trial are rather obvious. Theo Schall of the Johns Hopkins Berman Institute of Bioethics outlines the problems that would crop up:

The gold standard for medical and social research is the randomized controlled trial, where study participants are randomly assigned to receive an intervention or to act as a control. An ideal study would be randomized, controlled, and have a large number of participants for statistical validity/generalizability. There are a number of reasons for the dearth of randomized controlled trials. Historically, there hasn’t been a lot of funding for trans health. There also aren’t all that many trans people in any given place, so statistically powerful research would require a wide geographic reach (and thus be more expensive). And there are ethical problems with true randomization. We can’t ask non-trans (or “cisgender”) people to get genital surgery just for comparison’s sake. It’s similarly difficult to ask gender dysphoric people, who suffer in contemporary social conditions, to wait on transition just so they can act as a control group. We can use cisgender people as controls instead, but this can have an impact on the specificity of research results.

Withholding treatments known to be highly effective in relieving gender dysphoria from people suffering from it would be considered unethical. Such ethical obstacles arise in other cases where the benefit of a treatment is clear: randomized controlled trials have at times been ended early because continuing to withhold a clearly effective treatment from the control group was found to be unethical. Blinding of whether an individual subject was in the treatment or control group would also be impossible, as the effects of gender-affirming medical treatment would be obvious to both the subject and the researcher.

So, Paul Hruz and his associates have set their standard of quality for clinical evidence: a randomized controlled trial. Medical transition is a kind of treatment that cannot be studied within such a trial. This makes it a simple matter for Hruz and others to reach their desired conclusion: that there will never be adequate evidence to support the use of medical transition as a treatment for gender dysphoria – and therefore it should not be used.

But this would only be the case so long as Hruz’s premise holds – that a randomized controlled trial is the only valid source of evidence to support such a treatment. Is this the case? No. A randomized controlled trial is superior to an observational study in that it allows for the direct attribution of causation – the finding that a given intervention caused a given response, whereas its absence produced a differing response. Observational studies merely find that a certain treatment is associated with a particular outcome.

That inability to attribute causation does not mean that an observational study’s findings are invalid or useless as evidence on the effects associated with a given treatment. Indeed, it would be a remarkable coincidence if study after study on transition treatment just happened to show a beneficial effect on trans people across various outcomes, consistently and repeatedly. Schall continues:

Instead, the literature consists primarily of small observational trials with a fairly small number of participants each. In situations like these, scientists aggregate the data from those small trials into a meta-analysis so they can draw larger conclusions than would otherwise be possible. It’s not a perfect system, though – there’s a real risk of magnifying underlying problems. For example, if the small studies oversample trans women and undersample trans men (which tends to be the case), the meta-analysis may be similarly less generalizable to trans men. The absence of a control means it’s hard to tell if or to what extent the therapies are responsible for the outcomes. This doesn’t mean hormones, surgery, etc., aren’t good therapies – it just means the proof is weak. . . . If there was a lot of anecdotal evidence that suggested the extant literature was wrong, I’d be more inclined to grant weight to the opposition argument. But since anecdotal evidence seems to align with the weak scientific evidence rather than refute it, it appears that the current standard of care simply lacks robust evidence. This situation is likely to change with increased interest in trans healthcare and research.
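The aggregation step Schall describes can be sketched in a few lines. Schall does not name a pooling method, so the one used here is an assumption: fixed-effect inverse-variance weighting, a common textbook approach in which each study is weighted by the inverse of its squared standard error. The three example studies and their numbers are invented purely for illustration.

```python
from math import sqrt


def pool_fixed_effect(studies):
    """Pool (effect, standard_error) pairs using inverse-variance weights.

    Returns the pooled effect estimate and its standard error.
    """
    weights = [1.0 / se ** 2 for _, se in studies]
    total = sum(weights)
    pooled = sum(w * effect for w, (effect, _) in zip(weights, studies)) / total
    return pooled, sqrt(1.0 / total)


if __name__ == "__main__":
    # Invented small studies: (effect estimate, standard error).
    small_studies = [(0.8, 0.4), (1.2, 0.5), (0.9, 0.3)]
    pooled, pooled_se = pool_fixed_effect(small_studies)
    print(f"pooled effect: {pooled:.2f}, standard error: {pooled_se:.2f}")
```

The pooled standard error is smaller than that of any single study, which is the sense in which aggregation permits larger conclusions. The weighting does nothing to correct a bias shared by all the inputs, such as the sampling imbalance Schall mentions.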

And how do the results of observational studies compare to those of randomized controlled trials? Do observational studies of a given treatment generally find results that are at odds with, or otherwise substantially different from, the results of randomized controlled trials of the same treatment? This does not appear to be the case. A Cochrane Collaboration review found that the two designs generally return similar results, and that when their results do differ, it is important to look for causes other than the study design (Anglemyer, Horvath, & Bero, 2014):

. . . on average, there is little evidence for significant effect estimate differences between observational studies and RCTs, regardless of specific observational study design, heterogeneity, or inclusion of studies of pharmacological interventions. Factors other than study design per se need to be considered when exploring reasons for a lack of agreement between results of RCTs and observational studies.

Benson & Hartz (2000) compared the results of observational studies and randomized controlled trials for a variety of treatments, finding:

We found little evidence that estimates of treatment effects in observational studies reported after 1984 are either consistently larger than or qualitatively different from those obtained in randomized, controlled trials. . . . The fundamental criticism of observational studies is that unrecognized confounding factors may distort the results. According to the conventional wisdom, this distortion is sufficiently common and unpredictable that observational studies are not reliable and should not be funded. Our results suggest that observational studies usually do provide valid information.

Hannan (2008) further notes:

. . . for the most part, RCTs and OS do arrive at the same conclusions. Furthermore, RCTs and OS can be used synergistically to obtain more and better information about the relative merits of alternative interventions/treatments. . . . The design and ultimate conduct of the study is the principal criterion to consider, not the type of study per se.

Hruz’s implicit contention that because a study is not a randomized controlled trial, that study’s findings are not of sufficient quality and rigor to inform clinical practice, is not supported by what is known about the patterns of findings of these respective study designs. The findings of observational studies largely do tend to be consistent with those of randomized controlled trials, and where randomized controlled trials may be impossible, properly-conducted observational studies can serve to provide valuable data – data that we would not have if we were to presume that the alternative to a randomized controlled trial is to conduct no study at all. ■


Zinnia Jones: My work focuses on insights to be found across transgender sociology, public health, psychiatry, history of medicine, cognitive science, the social processes of science, transgender feminism, and human rights, taking an analytic approach that intersects these many perspectives and is guided by the lived experiences of transgender people. I live in Orlando with my family, and work mainly in technical writing.


Kay says:

February 1, 2020 at 5:21 PM

Yes!! I’m also wondering about how the specific diagnostic framing of gender dysphoria as something treated by a certain medication leads to wrong conclusions like this. It’s often felt to me like it can strip trans people of our agency over our own bodies. Somewhere it feels like the positive desire for hormonal transition gets lost—the diagnostic system sees problems and suffering that needs to be fixed, but seems unable to hold that we feel a genuine desire for the correct hormones as well. And medically, that desire is not legible as an expression of need from our bodies. It’s still a bit hard for me to put words to it precisely because it seems like something amiss in the medical ontological fabric. A lot of what I’ve experienced in transitioning is that my body desires estrogen as if it’s a nutrient that I just happen to not endogenously produce. That desire is independent from the suffering from dysphoria, and it doesn’t really make sense to speak of treating or fixing desire. For the most part the suffering comes from the long term impacts of testosterone and lack of access to estrogen (which, also relevant to this discussion, is something the medical industry has artificially restricted access to. Exogenous hormones have existed in folk medicine practices for thousands of years). I also wonder about how this distinction appears in trans kids. I think that if someone had tried to apply the lens of gender dysphoria to me before puberty and before testosterone, it wouldn’t have shown up. But if I had been asked directly, I might have been able to articulate the desires and needs of my body.

This is all intertwined with the ways that the binary narrative of transitioning from one gender to another, while true for some trans people, is entirely insufficient as a general model and leads to all sorts of wrong ideas.