
Sunday, 6 October 2019

WHY WE NEED RANDOMIZATION IN CLINICAL TRIALS


                                                          Tom Leonard October 2019

                                                                 INTRODUCTION

More and more clinical trials for medical and psychiatric medications are failing to adhere to key standards, i.e. randomization of the subjects and replication of the experiment to reduce the effects of unlucky randomization (see Fundamentals of Clinical Trials by Friedman et al., Springer, 1981). However, without such standards, any statistical conclusions are at best highly subjective and at worst totally misleading. An incisive comment by Dr. Ewart Shaw, whom I met a couple of years ago at the University of Warwick, is appended, and Professor Gillian Raab of Edinburgh University is preparing a paper on further justifications of randomisation in clinical trials.

***While there are some practical problems in the implementation of randomisation, it should always be attempted in order to avoid functionally useless and possibly harmful experiments on multiple human subjects.***

EXAMPLE: While the highly questionable CATIE study (Lieberman et al., 2005; Shortreed and Moodie, 2012) randomly assigned 1493 patients with chronic schizophrenia to different, very harmful atypical anti-psychotics, 74% of these patients discontinued their drugs within 18 months of treatment. Many of these were harmed by side effects and others by receiving inappropriate treatment for their condition. If the patients hadn't been assigned at random, then the ill-gotten conclusions would have been totally useless. As it was, the 1493 patients in the study were not chosen at random from any large 'population of interest'. Therefore, the results obtained by the multitudinous co-authors were ungeneralisable, in the sense that they were effectively irrelevant to any large population.

      Shortreed and Moodie later used these results while attempting to justify a horribly irresponsible 'optimal scheme' for switching patients between different anti-psychotics, and hence between different sets of potentially harmful side effects.



Professor Jeffrey Lieberman, erstwhile Chairman of Psychiatry,  Columbia University, a place where Lucifer lingers.  


I have sent a copy of this article to the editor of a forthcoming Springer volume where, I understand, some of the contributing authors, guided by wishy-washy philosophies and offbeat philosophers, are still advocating Lindley-Novick exchangeability assumptions as an excuse to avoid randomization. I think that this is very misleading for practitioners, and puts human lives at risk.

       I CHALLENGE THE EDITOR (who is a leading medical statistician) to respond as a comment on this blogpost, and to say why a volume with several harmful chapters should be published at all.


                                     1. Single Sample of Binary Observations


Suppose that it is required to estimate the proportion θ out of the N people in a population S who suffer from a disease D, and that a sample of size n is selected for this purpose. For i = 1,...,n, let
                                     
                                  x(i) = 1 if the ith person suffers from disease D,
                                  x(i) = 0 otherwise.

      If EITHER the n people have been chosen at random WITH replacement from S
OR the n people have been chosen at random WITHOUT replacement from S, and N is large
compared with n,

      THEN the binary responses x(1),...,x(n) may be taken to be independent with common
expectation θ.

In this case, the observed frequency

                                                   y = x(1) + ... + x(n)

possesses a binomial distribution with success probability θ and sample size n. Consequently, the sample proportion z = y/n is an unbiased estimator of θ with variance θ(1-θ)/n.
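
As a quick numerical check, here is a minimal Python sketch (assuming only numpy; the values of θ and n anticipate the example below, and the sampling is with replacement, matching the first condition above):

    import numpy as np

    rng = np.random.default_rng(0)
    theta, n, reps = 0.01, 10_000, 2_000

    # Each replicate draws n binary responses at random (with replacement),
    # so x(1),...,x(n) are independent with common expectation theta.
    x = rng.binomial(1, theta, size=(reps, n))
    z = x.mean(axis=1)                     # sample proportion z = y/n per replicate

    print("mean of z  :", z.mean())        # close to theta = 0.01
    print("var of z   :", z.var())         # close to theta*(1-theta)/n = 9.9e-7
    print("theoretical:", theta * (1 - theta) / n)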

For example, when the true θ is 0.01 and n=10,000, z has expectation 0.01 and standard deviation sqrt(0.01 x 0.99/10000), which is about 0.001.

If θ is unknown, n=10,000 and we observe z=0.0102, then this is an unbiased estimate of θ with an estimated standard error of about 0.001. We can therefore be approximately 95.44% confident that the true θ lies in the interval (0.0082, 0.0122).
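
The arithmetic behind this interval, as a sketch in code (the 95.44% level corresponds to plus or minus two standard errors under the normal approximation):

    import math

    n, z = 10_000, 0.0102
    se = math.sqrt(z * (1 - z) / n)       # estimated standard error, about 0.001
    lo, hi = z - 2 * se, z + 2 * se       # +/- 2 se gives ~95.44% confidence
    print(round(se, 4), (round(lo, 4), round(hi, 4)))   # 0.001 (0.0082, 0.0122)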

Note that such very large sample sizes are needed to evaluate population proportions to reasonable degrees of accuracy, even when the data result from a controlled, randomized experiment.

Unfortunately, if we have purely observational data, where no randomization is completed at the design stage, then there are no grounds, without further assumption, for taking the binary responses to be independent or indeed to possess a common mean. Indeed, the 'obvious' assumption that y possesses a binomial distribution would be at best highly subjective and at worst misleading, as would any assumptions about the expectation and variance of z. The binomial assumption is nevertheless all too frequently made in practice, often on grounds of (simple-minded!) 'simplicity'. A suitably parametrised hypergeometric distribution for y, when the sampling is without replacement, again demands that the sample be chosen at random, in which case it is exact rather than approximate.

One possibility when analysing non-randomized data would be to follow Lindley and Novick (1981)
by taking x(1),...,x(n) to be exchangeable in the sense of De Finetti (1937), i.e. in formal terms by
assuming that the joint distribution of these binary responses is invariant under any permutation of the suffixes. In subjective terms, you could make this assumption a priori if you feel, before viewing the n observations, that you have a symmetry of information about these binary responses.

Exchangeability implies that each x(i) possesses a binary distribution with a common expectation θ,
and hence that z is an unbiased estimator of this expectation, which by a conceptual leap could be taken to be the unknown population proportion. However, it does NOT imply that the binary responses are independent, or that y possesses a binomial distribution (or a hypergeometric distribution when the sampling is without replacement). For example, no estimable standard deviation of z is obviously available. In the extreme case where x(1) = x(2) = ... = x(n) with probability one, the responses are exchangeable and z remains unbiased, and yet var(z) = θ(1-θ) however large n may be.

Suppose, in conceptual terms, that we would be prepared to assume exchangeability of the binary
responses whatever the value of n. (For full mathematical rigour we would need to address the situation where the sampling is with replacement, so that arbitrarily large values of n can be considered. If the sampling is instead without replacement, the two-stage structure described below will be completely general whenever the x(i) are positively correlated.)

De Finetti's much-celebrated exchangeability theorem then tells us that the joint distribution of the binary responses must be describable in the following two stages, for some choice of the c.d.f. F:

Stage 1: Conditional on the value of a random variable u on the unit interval (0,1), the responses
x(1),...,x(n) possess independent binary distributions with common expectation u.

Stage 2: The random variable u possesses c.d.f. F and expectation θ.
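
The two-stage structure is easy to mimic in simulation. Here is a brief Python sketch, with F taken, purely for illustration, to be the c.d.f. of a beta distribution (anticipating the example below; the parameter values are hypothetical):

    import numpy as np

    rng = np.random.default_rng(1)
    n = 50

    # Stage 2 (drawn first in code): u from F -- illustratively, a Beta(2, 8)
    # c.d.f., which has expectation theta = 0.2.
    u = rng.beta(2.0, 8.0)

    # Stage 1: given u, the n responses are independent Bernoulli(u).
    x = rng.binomial(1, u, size=n)
    print("u =", u, " sample proportion =", x.mean())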

For example, let F denote the c.d.f. of a beta distribution with parameters
α = γθ and β = γ(1-θ), and hence with mean θ and variance θ(1-θ)/(γ+1).

Then the observed y possesses a beta-binomial distribution with mean nθ and variance
var(y) = nτθ(1-θ), where τ = (n+γ)/(1+γ) is the overdispersion factor. (τ tends to unity as γ tends to infinity, in which case u has mean θ and zero variance, corresponding to the previous binomial assumption for y.)

Consequently, the unbiased estimator z has mean θ and variance τθ(1-θ)/n. A large value of τ would greatly inflate this variance, together with any estimated standard error for z.
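
A short simulation illustrates the overdispersion (the values of θ, γ and n are chosen purely for illustration): with γ small, var(z) is far larger than the naive binomial value θ(1-θ)/n.

    import numpy as np

    rng = np.random.default_rng(2)
    theta, gamma, n, reps = 0.2, 5.0, 1_000, 20_000

    # Beta parameters alpha = gamma*theta and beta = gamma*(1-theta), so u has
    # mean theta and variance theta*(1-theta)/(gamma+1).
    u = rng.beta(gamma * theta, gamma * (1 - theta), size=reps)
    y = rng.binomial(n, u)                # beta-binomial draws of the frequency y
    z = y / n

    tau = (n + gamma) / (1 + gamma)       # overdispersion factor
    print("simulated var(z)     :", z.var())
    print("tau*theta*(1-theta)/n:", tau * theta * (1 - theta) / n)
    print("naive binomial value :", theta * (1 - theta) / n)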

Unfortunately, F is not identifiable, beyond its mean θ, from the current data set. For example, if F is taken to be the c.d.f. of a beta distribution, then the overdispersion factor τ is not identifiable from the current data, whatever the sample size!

This is because the joint probability mass function of the binary responses, unconditional on u,
depends only on the observed frequency y, the sample size n and the unknown c.d.f. F. When viewed as a functional of F, this is the likelihood functional of F, given the data, and therefore summarises the
information in the data about F.

As the likelihood functional depends upon the data only via the one-dimensional statistic y, nothing about F apart from the mean θ can be estimated from the data. (The likelihood can be expressed as an expectation, with respect to u given F, of a function of u and y.)
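
In symbols, with e(1),...,e(n) denoting the observed zeros and ones, the two-stage representation gives

                 prob{ x(1) = e(1),...,x(n) = e(n) } = ∫ u^y (1-u)^(n-y) dF(u),

where y = e(1) + ... + e(n) and the integral is taken over the unit interval. Viewed as a functional of F, the right-hand side therefore depends on the data only through y and n.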

Consequently, the Lindley-Novick exchangeability assumption is of very limited use indeed.
While it justifies unbiased estimation of the population proportion, it does not justify more
general inferences, unless large amounts of information (e.g. prior information) from other
sources are combined with the information in the sample.

Moreover, replication does not obviously help. Suppose that we take r samples of size n from
the same population S. Then, without randomisation, it would not be obvious how to justify an assumption of independence of the r samples of binary responses.

If the r×n responses are instead taken to be exchangeable, then De Finetti's theorem implies
their conditional independence given u, but nothing more than their common mean can be estimated
from the replicated data set.
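
A tiny simulation makes the point: if, at one extreme of dependence, all r samples share a single draw of u, then averaging over replicates does not make the overall proportion concentrate around θ, however large r is. The following sketch is illustrative only:

    import numpy as np

    rng = np.random.default_rng(3)
    theta, gamma, n, r = 0.2, 5.0, 1_000, 100

    # One shared draw of u for all r replicates: the r*n responses are
    # exchangeable but far from independent, so replication does not make
    # the overall proportion concentrate around theta.
    u = rng.beta(gamma * theta, gamma * (1 - theta))
    z_bar = rng.binomial(n, u, size=r).mean() / n
    print("u =", u, " overall proportion =", z_bar)   # z_bar tracks u, not theta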



                                            Dennis Lindley (my Ph.D. supervisor at UCL, 1971-73).
                                            When Dennis was appointed to the Chair of Statistics at UCL
                                            in 1967, it was said that it was 'as if a Jehovah's Witness had
                                            become Pope'. One of the first papers he gave me to read was
                                            De Finetti's 1937 paper on subjective probability and
                                            exchangeability. It was like an edict from Rome.

                                                          2. Clinical Trials
    
     In clinical trials comparing the recovery rate for patients receiving a drug with the rate for patients receiving a placebo, some patients should be assigned at random to the treatment group and the remainder to the control group. Moreover, reference is not always made, as it should be, to a larger population.

     Lindley and Novick quite amazingly claim that their particular exchangeability assumptions replace the need for randomization in clinical trials (in particular, those requiring the comparison of binary observations in a treatment group with those in a control group). However, by obvious mathematical extensions of the arguments of Section 1 (which now take N to denote the total number of patients participating in the trial), no valid statistical inferences can be drawn from such trials without further strong assumptions. Similar arguments hold for more complex clinical trials.
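
For concreteness, here is a minimal Python sketch of the standard two-sample comparison that random assignment justifies (the trial size and the recovery probabilities of 0.6 and 0.4 are hypothetical numbers, used purely for illustration):

    import numpy as np

    rng = np.random.default_rng(4)
    N = 400                               # patients in the trial

    # Random assignment: exactly half to treatment, half to control.
    treated = rng.permutation(N) < N // 2

    # Hypothetical recovery probabilities (0.6 vs 0.4), for illustration only.
    recovered = np.where(treated, rng.random(N) < 0.6, rng.random(N) < 0.4)

    p_t = recovered[treated].mean()
    p_c = recovered[~treated].mean()
    se = np.sqrt(p_t * (1 - p_t) / treated.sum()
                 + p_c * (1 - p_c) / (~treated).sum())
    print("estimated difference:", p_t - p_c, "+/-", 2 * se)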

   Moreover, Lindley and Novick claim that subjective assumptions of exchangeability can be used to resolve the classical Simpson's paradox/confounding variable problem in clinical trials. This is blatantly untrue in any objective sense. For further discussion of Simpson's paradox in this context see Leonard (1999, Ch. 3), where the paradox and its resolution by randomization are described in detail via a three-directional approach.




                                                                       Ewart Shaw

                                                     Comment from Dr. Ewart Shaw

I agree completely (also about the more general case). I discussed this briefly with Dennis (Lindley) over thirty years ago, mainly saying that because of the financial and other pressures on researchers, and the scope for unintentional or intentional bias, I couldn't trust any non-randomised trial, and would hope that those responsible for making possibly far-reaching decisions based on the trial's results wouldn't trust it either, no matter how clever the model and how well-intentioned the researchers. So the researchers would have carried out a functionally useless experiment on human subjects, which is simply immoral. I'm not sure how convinced Dennis was by my arguments!




                                                           Acknowledgement

         I would like to thank Gillian Raab for recent discussions on this topic. Gillian is working on other justifications of randomisation in clinical trials. She spent part of her career working with Professor David Finney, who was famous for developing systems which sought to ensure the safety of drugs.
