Analysis of clinical trials


The analysis of clinical trials involves many related topics, including the choice of analysis set and the handling of missing data.

One basic guidance document on this topic is the International Conference on Harmonisation of Technical Requirements for Registration of Pharmaceuticals for Human Use guidance E9.[1]

Choice of analysis set

Failure to include all participants in the analysis may bias the trial results. However, most trials do not yield perfect data. "Protocol violations" may occur, such as when patients do not receive the full or correct intervention, or when a few ineligible patients are randomly allocated in error. Although most clinical trials are carefully planned, many problems can arise during the conduct of the study. Some examples are as follows:

  • Patients who do not satisfy the inclusion and/or exclusion criteria are included in the trial,
  • A patient is randomized to Treatment A, but is treated with Treatment B,
  • Some patients drop out from the study, or
  • Some patients are not compliant, that is, do not take their medication as instructed, and so on.

As treated

An as-treated analysis compares subjects according to the treatment regimen they actually received, regardless of the treatment to which they were assigned.

Intention to treat

Randomized clinical trials analyzed by the intention-to-treat (ITT) approach provide fair comparisons among the treatment groups because the approach avoids the bias associated with non-random loss of participants. The basic ITT principle is that participants should be analysed in the groups to which they were randomized, regardless of whether they received or adhered to the allocated intervention. However, medical investigators often have difficulty accepting ITT analysis because of practical issues such as missing data or non-adherence to the protocol.

Per protocol

Alternatively, the analysis can be restricted to only those participants who fulfill the protocol in terms of eligibility, adherence to the intervention, and outcome assessment. This is known as an "on-treatment" or "per-protocol" analysis. A per-protocol analysis represents a "best-case scenario" for revealing the effect of the drug being studied. However, because it restricts the analysis to a selected patient population, it does not show all effects of the new drug. Further, adherence to treatment may be affected by other factors that influence the outcome. Accordingly, per-protocol effects are at risk of bias, whereas the intention-to-treat estimate is not.[2]
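The three analysis sets differ only in how participants are grouped and which participants are kept. The following Python sketch illustrates this with made-up data; the `Participant` record, its field names, and the outcome scores are hypothetical and are not taken from any guidance document.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass
class Participant:
    pid: int
    assigned: str    # arm allocated at randomization
    received: str    # treatment actually taken
    adherent: bool   # fulfilled the protocol (hypothetical flag)
    outcome: float   # hypothetical outcome score


def group_means(participants, key, keep=lambda p: True):
    """Mean outcome per arm, grouping by `key` and keeping rows where `keep` is true."""
    groups = defaultdict(list)
    for p in participants:
        if keep(p):
            groups[key(p)].append(p.outcome)
    return {arm: mean(values) for arm, values in groups.items()}


# Made-up data, for illustration only.
trial = [
    Participant(1, "A", "A", True, 10.0),
    Participant(2, "A", "A", True, 12.0),
    Participant(3, "A", "B", False, 6.0),   # randomized to A, treated with B
    Participant(4, "B", "B", True, 7.0),
    Participant(5, "B", "B", False, 5.0),   # non-compliant
    Participant(6, "B", "B", True, 9.0),
]

# Intention to treat: group by the assigned arm, keep everyone.
itt = group_means(trial, key=lambda p: p.assigned)

# As treated: group by the treatment actually received.
as_treated = group_means(trial, key=lambda p: p.received)

# Per protocol: group by the assigned arm, keep only protocol-adherent participants.
per_protocol = group_means(trial, key=lambda p: p.assigned, keep=lambda p: p.adherent)

print("ITT:", itt)
print("As treated:", as_treated)
print("Per protocol:", per_protocol)
```

In this toy data set, participant 3 counts toward arm A under intention to treat, toward arm B under as treated, and is excluded under per protocol, so the three summaries differ even though the underlying data are identical.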

Handling missing data

One of the most important problems in analyzing a clinical trial is dropout. Under the Declaration of Helsinki, patients in clinical trials must participate entirely voluntarily and must have the right to leave the trial at any time. This ethical imperative makes missing data an inevitable problem of clinical trials, and requires appropriate analysis methods to account for it. Since patients often drop out because they find that a treatment does not seem to be working for them or because it causes harmful side effects, missing data is often correlated with the treatment's efficacy or safety. This type of selection bias makes a reliable assessment of a clinical trial's results particularly difficult. Methods to address missing data make assumptions about the relationship between dropout and study results in order to produce results that account for the missing data. Because the assumptions underlying a particular method may be inappropriate to a given study, care and expertise are required to address the issue.

Last observation carried forward

One method of handling missing data is simply to impute, or fill in, values based on existing data. A standard method to do this is the Last-Observation-Carried-Forward (LOCF) method.

The LOCF method allows data from patients who drop out to be kept in the analysis. However, research has shown that this method gives a biased estimate of the treatment effect and underestimates the variability of the estimated result.[3][4] As an example, assume that there are 8 weekly assessments after the baseline observation. If a patient drops out of the study after the third week, then the week-3 value is "carried forward" and assumed to be his or her score for the 5 missing data points. The assumption is that patients improve gradually from the start of the study until the end, so that carrying forward an intermediate value is a conservative estimate of how well the person would have done had he or she remained in the study. The advantages of the LOCF approach are that:

  • It minimises the number of the subjects who are eliminated from the analysis, and
  • It allows the analysis to examine the trends over time, rather than focusing simply on the endpoint.
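The eight-week example above can be sketched in a few lines of Python. The function name `locf` and the scores are hypothetical and chosen only for illustration; missing assessments are represented by `None`.

```python
from typing import List, Optional


def locf(scores: List[Optional[float]]) -> List[Optional[float]]:
    """Fill each missing value (None) with the most recent observed value.

    Values missing before any observation has been made stay None,
    because there is nothing to carry forward.
    """
    filled = []
    last = None
    for score in scores:
        if score is not None:
            last = score
        filled.append(last)
    return filled


# Hypothetical scores: 8 weekly assessments, dropout after week 3.
weekly = [20.0, 18.0, 17.0, None, None, None, None, None]
completed = locf(weekly)
print(completed)
# [20.0, 18.0, 17.0, 17.0, 17.0, 17.0, 17.0, 17.0]
```

The week-3 score of 17.0 fills all five missing assessments, so the imputed trajectory is flat from week 3 onward regardless of what would actually have happened to the patient.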

However, the National Academy of Sciences, in an advisory report to the Food and Drug Administration on missing data in clinical trials, recommended against the uncritical use of methods like LOCF, stating that "Single imputation methods like last observation carried forward and baseline observation carried forward should not be used as the primary approach to the treatment of missing data unless the assumptions that underlie them are scientifically justified."[5]

The basic assumption underlying LOCF is that patients who are given treatments get better, so that treating missing data as if the past had continued unchanged is conservative. This assumption is often not true. Many drugs treat conditions, such as cancer, heart failure, or AIDS, in which patients are expected to get worse or die while under observation, and in which success comes from maintaining the status quo, prolonging life, or preventing deterioration, not from curing or improving. In addition, even curative drugs may have harmful and sometimes deadly side effects and safety problems. In these trial contexts, handling missing data as if the past had continued unchanged may result in overreporting efficacy or underreporting harmful safety problems, biasing the results in ways that make the investigational treatment appear safer or more efficacious than it actually is.

In addition, even when they do not add inappropriate bias, simple imputation methods overestimate the precision and reliability of the estimates and the power of the trial to assess the treatment. When data is missing, the sample size on which estimates are based is lowered. Simple imputation methods fail to account for this decrease in sample size, and hence tend to underestimate the variability of the results.
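The effect on apparent precision can be seen in a small numerical sketch. For simplicity it uses mean imputation rather than LOCF as the single-imputation method, and the outcome values are made up; the point is only that the imputed values are treated as if they were real observations.

```python
from math import sqrt
from statistics import mean, stdev


def standard_error(values):
    """Standard error of the mean: sample standard deviation / sqrt(n)."""
    return stdev(values) / sqrt(len(values))


# Hypothetical endpoint scores for the 6 patients who completed the trial.
observed = [4.0, 6.0, 8.0, 10.0, 12.0, 14.0]

# 4 patients dropped out; single imputation fills each with the observed mean.
n_missing = 4
imputed = observed + [mean(observed)] * n_missing

se_observed = standard_error(observed)   # based on the 6 real observations
se_imputed = standard_error(imputed)     # treats 10 values as if all were observed

print(f"SE from observed data only: {se_observed:.3f}")
print(f"SE after single imputation: {se_imputed:.3f}")
```

The standard error falls from about 1.53 to about 0.88 even though no new information was collected: the imputed values add nothing to the spread of the data while inflating the sample size used in the calculation.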

Multiple imputation methods

The National Academy of Sciences advisory panel instead recommended methods that provide valid type I error rates under explicitly stated assumptions taking missing data status into account, and the use of multiple imputation methods based on all the data available in the model. It recommended more widespread use of bootstrap and generalized estimating equation (GEE) methods whenever the assumptions underlying them, such as missing at random (MAR) for GEE methods, can be justified. It advised collecting auxiliary data believed to be associated with dropout to provide more robust and reliable models, collecting information about the reason for dropout, and, if possible, following up on dropouts and obtaining efficacy outcome data. Finally, it recommended sensitivity analyses as part of clinical trial reporting to assess the sensitivity of the results to the assumptions about the missing data mechanism.[5]
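The paragraph above does not say how results from several imputed data sets are combined. One widely used convention is Rubin's rules, in which the pooled variance adds the average within-imputation variance to an inflated between-imputation variance. The sketch below assumes that convention; the function name `pool_rubin` and the numbers are hypothetical, standing in for treatment-effect estimates obtained by analysing each imputed data set separately.

```python
from math import sqrt
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Combine per-imputation estimates and their variances using Rubin's rules.

    estimates: point estimate from each of the m imputed data sets
    variances: squared standard error of each estimate
    Returns the pooled estimate and its pooled standard error.
    """
    m = len(estimates)
    pooled = mean(estimates)
    within = mean(variances)                 # average within-imputation variance
    between = variance(estimates)            # between-imputation variance (divisor m - 1)
    total = within + (1 + 1 / m) * between   # total variance
    return pooled, sqrt(total)


# Made-up results from m = 5 imputed data sets.
estimates = [2.0, 2.5, 1.5, 2.2, 1.8]
variances = [0.30, 0.28, 0.32, 0.29, 0.31]

pooled, se = pool_rubin(estimates, variances)
print(f"pooled estimate = {pooled:.2f}, pooled SE = {se:.3f}")
```

Unlike a single imputation, the pooled standard error here is larger than the typical within-imputation standard error, because the disagreement between imputed data sets is counted as additional uncertainty.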

While the methods recommended by the National Academy of Sciences report are more recently developed, more robust, and will work under a wider variety of conditions than single-imputation methods like LOCF, no known method for handling missing data is valid under all conditions. As the 1998 International Conference on Harmonisation E9 Guidance on Statistical Principles for Clinical Trials noted, "Unfortunately, no universally applicable methods of handling missing values can be recommended."[1] Expert statistical and medical judgment is needed to select, from the available imperfect techniques, the method most appropriate to the particular trial's conditions, goals, endpoints, statistical methods, and context.


  1. ^ a b International Conference on Harmonization, Guidance for Industry E9, Statistical Principles for Clinical Trials, 1998
  2. ^ Sussman, Jeremy B.; Hayward, Rodney A. (2010-05-04). "An IV for the RCT: using instrumental variables to adjust for treatment contamination in randomised controlled trials". BMJ (Clinical Research Ed.). 340: c2073. doi:10.1136/bmj.c2073. ISSN 1756-1833. PMC 3230230. PMID 20442226.
  3. ^ Salim, Agus; MacKinnon, Andrew; Christensen, Helen; Griffiths, Kathleen (2008). "Comparison of data analysis strategies for intent-to-treat analysis in pre-test–post-test designs with substantial dropout rates". Psychiatry Research. 160 (3): 335–345. doi:10.1016/j.psychres.2007.08.005. PMID 18718673.
  4. ^ Molnar, F. J.; Hutton, B.; Fergusson, D. (2008). "Does analysis using "last observation carried forward" introduce bias in dementia research?". Canadian Medical Association Journal. 179 (8): 751–753. doi:10.1503/cmaj.080820. PMC 2553855. PMID 18838445.
  5. ^ a b National Research Council; Division of Behavioral and Social Sciences and Education; Committee on National Statistics; Panel on Handling Missing Data in Clinical Trials (2010). The Prevention and Treatment of Missing Data in Clinical Trials. pp. 110–112. doi:10.17226/12955. ISBN 978-0-309-15814-5. PMC 3771340. PMID 24983040.
  • AR Waladkhani. (2008). Conducting clinical trials. A theoretical and practical guide. ISBN 978-3-940934-00-0