Meta-research metrics matter: letter regarding article "indirect tolerability comparison of Deutetrabenazine and Tetrabenazine for Huntington disease".

Here we discuss the report by Claassen and colleagues describing an indirect treatment comparison between tetrabenazine and deutetrabenazine for chorea in Huntington's disease using individual patient data. We note the potential for discrepancies in apparently statistically significant findings due to the rank reversal phenomenon. We provide some cautionary observations and suggestions concerning the limitations of indirect comparisons and the low likelihood that good quality evidence will become available to guide clinical decisions when choosing between these two agents.


To the Editor,
We read with interest the report by Claassen and colleagues describing an indirect treatment comparison between tetrabenazine (TBZ) and deutetrabenazine (DEU) for chorea in Huntington's disease (HD) using individual patient data [1].
DEU is a form of TBZ that has been chemically modified to optimize its pharmacokinetic properties. Both are vesicular monoamine transporter 2 (VMAT2) inhibitors, and each was tested successfully against placebo for chorea associated with HD [2,3]. No double-blinded head-to-head trial comparing the efficacy and safety profiles of these compounds has been performed or is planned.
Indirect treatment comparisons are useful meta-research tools when little or no data directly comparing treatments are available [4]. In meta-research, the use of individual patient data instead of aggregate data has many potential advantages, such as the power to study subgroups and to control for confounding factors. To some extent Claassen et al. used individual patient data, since raw patient data from the FIRST-HD trial testing DEU were incorporated into the analysis. This partially overcomes possible reporting bias from the literature, such as adverse event frequencies not recorded in primary reports, and provides the opportunity to use more complex and complete statistical models adjusted for important covariates. It is a shame that individual patient data from the TETRA-HD trial [3] were not included, especially since both trials were performed by the same study consortium (the Huntington Study Group). In addition, it would have been both possible and interesting to undertake exploratory analyses to find out whether subgroups of patients with different genders, CAG repeat lengths, baseline levels of functional ability, motor symptoms and quality of life differed with regard to safety profile.
We also think it unfortunate that the authors limited their report to safety data, since it would be extremely relevant to learn how the efficacy profiles of TBZ and DEU compare using the individual patient data available to them. We recently performed an indirect treatment comparison using all published aggregate data from the same trials, and found no difference between the primary efficacy outcomes of the two trials: the mean difference between TBZ and DEU in the change from baseline in total maximal chorea score was −1.00 (95% confidence interval: −3.04 to 1.04) [5].
Interestingly, in contrast to Claassen and co-authors' findings, our safety outcomes, expressed as odds ratios of TBZ over DEU, did not demonstrate any difference between the two agents (Table 1). However, when converting our raw dataset to risk differences instead of odds ratios, as presented by Claassen et al., our results were in line with theirs with regard to direction and magnitude of effect. This is a well-described statistical phenomenon called rank reversal. It stems from the fact that different measures (e.g. risk differences, risk ratios, and odds ratios) are affected differently by dissimilar baseline risks [6]. Bucher's model of indirect treatment comparisons was originally designed for odds ratios, but others have applied it to risk ratios and risk differences, with proper adjustment.
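To illustrate how the choice of metric can change the apparent conclusion, the following is a minimal sketch of a Bucher-style adjusted indirect comparison applied to the same event counts on the odds ratio and risk difference scales. The counts are hypothetical and are not taken from FIRST-HD, TETRA-HD, or Table 1; the function names and the normal-approximation 95% confidence intervals are likewise assumptions made for this illustration.

```python
import math

def odds_ratio(events_t, n_t, events_c, n_c):
    """Return (log odds ratio, variance) for one placebo-controlled trial."""
    a, b = events_t, n_t - events_t      # treated: events, non-events
    c, d = events_c, n_c - events_c      # placebo: events, non-events
    log_or = math.log((a * d) / (b * c))
    var = 1 / a + 1 / b + 1 / c + 1 / d  # Woolf variance of the log OR
    return log_or, var

def risk_difference(events_t, n_t, events_c, n_c):
    """Return (risk difference, variance) for one placebo-controlled trial."""
    p_t, p_c = events_t / n_t, events_c / n_c
    var = p_t * (1 - p_t) / n_t + p_c * (1 - p_c) / n_c
    return p_t - p_c, var

def bucher(effect_a, effect_b, z=1.96):
    """Indirect A-versus-B estimate via the common placebo comparator.

    The two trial-level effects are subtracted and their variances added.
    Returns (estimate, lower 95% limit, upper 95% limit).
    """
    (est_a, var_a), (est_b, var_b) = effect_a, effect_b
    est = est_a - est_b
    se = math.sqrt(var_a + var_b)
    return est, est - z * se, est + z * se

# HYPOTHETICAL counts (events, n) chosen only to show the effect of
# dissimilar placebo risks; they are not data from any real trial.
trial_a = dict(events_t=25, n_t=50, events_c=5, n_c=50)  # agent A vs placebo
trial_b = dict(events_t=10, n_t=50, events_c=2, n_c=50)  # agent B vs placebo

or_est, or_lo, or_hi = bucher(odds_ratio(**trial_a), odds_ratio(**trial_b))
rd_est, rd_lo, rd_hi = bucher(risk_difference(**trial_a),
                              risk_difference(**trial_b))

print("Indirect OR (A vs B): %.2f (%.2f to %.2f)"
      % (math.exp(or_est), math.exp(or_lo), math.exp(or_hi)))
print("Indirect RD (A vs B): %.2f (%.2f to %.2f)" % (rd_est, rd_lo, rd_hi))
```

With these invented counts the indirect odds ratio is approximately 1.5 with a confidence interval that spans 1 (about 0.22 to 10.1), whereas the indirect risk difference is 0.24 with a confidence interval that excludes 0 (about 0.04 to 0.44). The same data therefore appear non-significant on one scale and significant on the other.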
When indirectly comparing treatments, the choice of presented metric matters. We believe it is important for readers to be made aware of the potential for apparently discrepant findings that may arise from the rank reversal phenomenon [6]. Undoubtedly, risk differences increase interpretability, but odds ratios are the only measure that guarantees the avoidance of impossible predicted event rates when extrapolating the results to real populations (for example, applying a risk reduction of 0.1 to a population with a baseline risk of 0.05 would give rise to an apparent population risk below zero, which is impossible); relative measures are also known to be more consistent than absolute measures [7-9].
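The extrapolation example in the preceding paragraph can be checked numerically. The sketch below applies a risk difference and an odds ratio to the same baseline risk; the risk difference of 0.1 is treated as a reduction (that is, −0.1), and the odds ratio of 0.5 is a hypothetical value chosen purely for illustration.

```python
def apply_risk_difference(baseline_risk, rd):
    """Predicted risk after adding an absolute risk difference."""
    return baseline_risk + rd

def apply_odds_ratio(baseline_risk, odds_ratio):
    """Predicted risk after scaling the baseline odds by an odds ratio."""
    odds = baseline_risk / (1 - baseline_risk)
    new_odds = odds * odds_ratio
    return new_odds / (1 + new_odds)  # convert odds back to a risk

baseline = 0.05                        # baseline risk from the example above
risk_from_rd = apply_risk_difference(baseline, -0.10)
risk_from_or = apply_odds_ratio(baseline, 0.5)  # hypothetical OR of 0.5

print("Risk after applying RD of -0.10:", risk_from_rd)
print("Risk after applying OR of 0.50:", risk_from_or)
```

The risk difference yields a negative predicted risk, which is not a valid probability, while the odds ratio yields a predicted risk of about 0.026. Because the odds are converted back to a risk, the result lies between 0 and 1 for any positive odds ratio.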
Choice of data and outcomes aside, an overarching concern is that none of the included clinical trials was appropriately powered to investigate the safety of these compounds, and an indirect treatment comparison of one trial per agent (i.e. TBZ versus placebo, and DEU versus placebo) cannot improve the precision of the results. Therefore, the results of these comparisons [1,5] should be interpreted with caution, and although regarded as the best available evidence, they are nonetheless of low quality according to the Grading of Recommendations Assessment, Development and Evaluation (GRADE) framework [10].
Given all this, only a direct comparison between TBZ and DEU would be able to rigorously test whether the efficacy profiles of these compounds significantly differ, and our sample size calculation (over 600 participants) suggests that such a trial is unlikely to take place [5]. Further postauthorisation safety studies or other observational studies will be required to provide robust evidence on the safety profile of DEU to inform safe prescribing decisions.

None.
Availability of data and materials
Not applicable.

Authors' contributions
FBR conceptualized, designed, interpreted, wrote the first draft and revised the manuscript. GSD, JC and JJF interpreted and revised the manuscript for intellectual content. EJW conceptualized, designed, interpreted, and revised the manuscript. All authors read and approved the final manuscript.

Ethics approval and consent to participate
Not applicable.

Consent for publication
Not applicable.

Competing interests
All the authors of this manuscript performed and published an aggregate data indirect treatment comparison between tetrabenazine and deutetrabenazine for chorea in Huntington's disease. The authors declare that they have no other competing interests.

Table 1. Presentation as risk differences produces statistically significant differences between DEU and TBZ that are not seen when presented as odds ratios; neither approach is intrinsically more accurate and an awareness of the difference is important. TBZ, tetrabenazine; DEU, deutetrabenazine; OR, odds ratio; 95% CI, 95% confidence interval; RD, risk difference; SAE, severe adverse events; *p-value < 0.05