Evidence based conclusions on the efficacy of a treatment - What can be learned from risk assessment?
A comparative clinical trial for which the randomization of the patients is either
impossible, unwelcome, or inopportune lacks a basic justification for establishing
a causal relationship between the treatment and the health outcome. Various designs of
observational studies have been developed with the aim of identifying and defining
treatments which may have a curative or palliative effect for patients despite the absence
of this methodological requirement. The discussions of pros and cons of these approaches
make obvious the need for new methodologies for clinical studies when treatments and
effects are to be related in a non-randomized set-up. In this situation, it may be helpful
to adopt an approach similar to that of toxicology where the exposure to hazardous
substances is related to the possible noxious effects on human health. Usually randomized
studies are unavailable for risk assessments so that toxicological epidemiology has to
base its conclusions on the best available evidence. In this contribution, analogies,
resemblances, and dissimilarities between risk assessment and treatment evaluation using
non-randomized studies are shown and, on the basis of their partial concordance, a proposal for
Evidence Based Therapy Assessment (EBTA) is derived for establishing a causal
relationship between a treatment and its effect on disease relief. EBTA may be
helpful for structuring, ordering and weighting medical evidence when consensus on
treatment recommendations has to be found in the face of results from randomized as well
as non-randomized studies, and other data.

The well-designed, well-conducted and correctly evaluated randomized clinical trial (RCT) is the method of first choice for the comparison of treatments in preventive, curative or palliative medicine. It is directed towards answering a scientific question on the superiority or equivalence of treatments. The principle of randomization protects against bias, which in clinical studies may result from any of the many uncontrolled or uncontrollable factors acting on health or interacting with the disease and the treatment. In R.A. Fisher's words:
Thus, randomization became the basic methodology for assuring causality, which may be denoted as statistical causality, without claiming that there are no other ways to establish causality. If a statistically significant difference in a clinical outcome variable is detected in an RCT, then, subject to the statistical error probabilities, the difference between the treatments is considered to be its cause. In short, the RCT is free of selection bias, controls known and unknown prognostic factors, and provides a sound basis for statistical hypothesis testing. It thus became the foundation of experimental medical research. However, it is well known that there are medical questions and situations of medical decision making where randomization is not welcome, not feasible or not possible. A large amount of methodological research has been invested in the field of non-randomized treatment evaluation. Most approaches have tried to define criteria suitable for reconstituting, as well as possible, analogous prerequisites of causality. It is now widely agreed that two basic principles have to be met when abandoning randomization: prospectively planned studies and control of prognostic factors and confounders. Note that these principles were formulated very early, see Louis [2]:
See also the paper of M. Gail [3] in honor of Jerome Cornfield and Bradford Hill. In observational studies these requirements can hardly be fulfilled. The major problems noted for observational studies are the absence of structural and observational identity of the treatment groups compared, the missing theoretical basis of significance testing, the danger of selecting spurious effects through multiple testing of many endpoints, and, most important, the missing distinction between the effects of the treatment and those of the prognostic factors [4].
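The last problem named above, the missing distinction between treatment effect and prognostic factors, can be made concrete with a small numerical sketch. The counts below are hypothetical and invented purely for illustration; they are not taken from any study cited here. In this constructed example the new treatment is given preferentially to patients with a good prognosis, so the crude comparison suggests a large benefit although the response rates are identical within each prognostic stratum.

```python
# Hypothetical counts, invented for illustration only (not from any study).
# Each stratum of a prognostic factor maps treatment -> (responders, patients).
strata = {
    "good prognosis": {"new": (72, 80), "standard": (18, 20)},
    "poor prognosis": {"new": (4, 20), "standard": (16, 80)},
}


def rate(responders, patients):
    """Observed response rate."""
    return responders / patients


def crude_rate(treatment):
    """Response rate pooled over all strata, ignoring the prognostic factor."""
    responders = sum(s[treatment][0] for s in strata.values())
    patients = sum(s[treatment][1] for s in strata.values())
    return rate(responders, patients)


# Crude difference: compares treatment groups as if they were exchangeable.
crude_diff = crude_rate("new") - crude_rate("standard")

# Stratum-specific differences: compare like with like.
stratum_diffs = {
    name: rate(*s["new"]) - rate(*s["standard"]) for name, s in strata.items()
}

print(f"crude difference in response rate: {crude_diff:.2f}")
for name, diff in stratum_diffs.items():
    print(f"difference within '{name}': {diff:.2f}")
```

Stratification is only one way of controlling a known prognostic factor, and it cannot help with unknown ones, which is the protection that randomization offers.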
2. The Motivation

2.1 Brain Tumor Clinical Trials

Except for surgery and surgery combined with radiotherapy, there are no established standard treatment modalities for brain tumors. In particular, research on the therapy of refractory patients and on palliative therapy is needed. A Medline survey of comparative trials published between 1986 and 1995 resulted in a total of 59 studies, 49 (83%) of which were definitely randomized (2 were non-randomized and 6 were unclear). However, most of these studies (44, 75%) had fewer than 100 cases per treatment arm and therefore had low power for detecting differences in survival. The percentage of RCTs among 202 registered ongoing protocols [5] is even lower (26%). Those studies intend to improve the results of surgery and to find optimal radio- or chemotherapy, more active agents, and better treatment modalities pre-, inter-, or post-surgery. But even with multi-center trials the sample sizes remain small unless studies of different tumor types are combined, because of the large variety of brain tumors. Therefore, it is not clear whether RCTs contribute much to overall progress in the treatment of brain tumors. The situation for other tumor locations may be similar, and the proportion of RCTs among all trials may be even lower.

2.2 The Number of Patients in Randomized and in Non-Randomized Studies

Increased use of observational studies has been motivated by the fact that only a small percentage of patients are treated in clinical trials [10]. Possible reasons for this range from missing financial support to lack of knowledge and missing education of medical doctors [11]. For example, it has been estimated that about 36% of all cancer cases present with inoperable or metastatic disease. For a country like Germany this amounts to more than 90 000 patients with advanced cancer annually, most of whom are candidates for clinical trials with innovative palliative treatments.
By contrast, the number of patients reported from clinical trials ranges in the lower thousands per year.

2.3 The Need to Supplement

The issues raised above show the need for new approaches beyond the RCT in terms of
public health efforts. It should also be noted that for the individual patient facing a
disease like cancer, the decision on the appropriate treatment is a dramatic choice
which should be based on all available knowledge. For better consensus on individual
patient treatment new approaches are evolving. One is the Evidence Based
Medicine concept of the Cochrane Collaboration, which is implementing outcome-oriented
information systems to support and guide the doctor's decision [14]. Other systems use the
information from clinical trials in knowledge-based medical decision
systems [15]. Most of these systems are restricted to RCTs as the information
source because of their high quality and availability. However, a large amount of clinical
information on the effects of a drug and its performance in patients is obtained before
and beyond the RCT. It therefore seems indicated and natural to include this information in
the process of deciding upon the individual patient's treatment, to
meta-analyze all the results available at present, and to formulate individual treatment
recommendations. Fundamental dilemmas of the RCT have been presented by Taylor et al. [16],
who, based on a survey among the ECOG, raised serious questions regarding the
generalizability of RCT findings, the role of end-points other than survival, and the
impact of RCTs on the behavior of clinicians.

3. The Method Based on Similarity with Risk Assessment

3.1 Basic Elements of Risk Assessment

Risk assessment is the scientific activity of evaluating the toxic properties of a chemical and the conditions of human exposure to it, both to ascertain the likelihood that exposed humans will be adversely affected and to characterize the nature of the effects they may experience. The assessment consists of four steps: hazard identification, dose-response assessment, exposure assessment, and risk characterization [17]. A weight of evidence analysis has been established which takes into account human epidemiological data, animal data, and also other data on the agent and the mechanisms of its toxic action. Hazard assessment addresses the question of whether the agent poses a hazard to humans and under which circumstances this will become manifest. Available data are evaluated for information on the conditions under which a hazard may be expressed, and they are investigated for signs of the mode of action of an agent. Dose-response assessment sets out with the data in the range of observation and then extrapolates to low doses, preferably using dose-response models based on general concepts of mode of action or, in their absence, on curve-fitting procedures. Exposure assessment identifies the data and scenarios of their observation, the distribution of exposure in the population investigated, and the pathways of exposure. Risk characterization provides an integrative analysis for supporting the risk manager in public health. Results and evidence are summarized, and the quality of the data and the degree of confidence in the estimates are described. In cancer risk assessment, for example, the adverse events are the occurrence of a tumor or death caused by the tumor. The agent acts via an exposure process which is often only partially known or very difficult to reconstruct.
One distinguishes between risk and hazard, where the latter is the capacity to produce a particular type of adverse health effect. Hazard is to be considered more as a qualitative endpoint, and typically it is determined via association or correlation studies, especially by applying statistical hypothesis tests. The definition of the appropriate model function describing the relationship between exposure and risk has been controversial in risk assessment for many years. For a valid risk assessment, data have been used from epidemiology and from animal experiments, but also from physical, chemical and structural analyses of the chemical, which describe its properties and inform about metabolism, toxicokinetics and toxicity. Additionally, biomarker data for exposure and effect have been incorporated into decision processes, see [18] for an example.
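The dose-response step described in this section, a model function fitted in the observed range and then extrapolated to low doses, can be sketched with the classical one-hit model. The text does not prescribe a particular model function; the one-hit model is chosen here only as a familiar example, and the parameter values `Q0` and `Q1` are hypothetical rather than estimates from any data set.

```python
import math


def one_hit_risk(dose, q0, q1):
    """One-hit model: P(d) = 1 - exp(-(q0 + q1 * d)), with q0, q1 >= 0."""
    return 1.0 - math.exp(-(q0 + q1 * dose))


def extra_risk(dose, q0, q1):
    """Extra risk over background: (P(d) - P(0)) / (1 - P(0))."""
    p0 = one_hit_risk(0.0, q0, q1)
    return (one_hit_risk(dose, q0, q1) - p0) / (1.0 - p0)


# Hypothetical parameters (assumed for illustration, not fitted to data).
Q0, Q1 = 0.05, 0.02

# At doses far below the observed range the extra risk is nearly linear in
# dose, with slope q1; this is the basis of linear low-dose extrapolation.
low_dose = 0.001
exact = extra_risk(low_dose, Q0, Q1)
linear = Q1 * low_dose

print(f"extra risk at dose {low_dose}: {exact:.3e}")
print(f"linear approximation q1 * d: {linear:.3e}")
```

The choice of model function matters mostly outside the observed dose range, which is why it has been debated for so long: different models that fit the observed data equally well can give very different low-dose estimates.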
Table 1: Factors for weighting evidence of effects in the presence of data from humans, animals and other sources.

3.2 The Similarity

There are some obvious similarities between risk assessment and drug effect assessment, not only in statistical methodology, which is often nearly identical in both areas (e.g. when survival methods are applied), but also conceptually. Consider, e.g., the definition of the basic elements of both tasks: risk and benefit. A risk is the probability that an adverse event occurs in a subject who is exposed to a specific known agent. A therapeutic benefit is the probability that the disease (the adverse event) disappears in the investigated subject if the specific treatment is given. The assessment of the therapeutic benefit of a treatment is therefore analogous to the evaluation of therapeutic properties of a drug. As in risk assessment, it becomes decisive to ascertain the likelihood that a treated human shows an effect and to characterize the nature of the effects he or she experiences. Therefore, a methodology valid for risk assessment should be transferable to the assessment of therapies. Table 1 summarizes factors for weighting evidence for risk assessment [17] as possible factors of interest in efficacy assessment. Note that the considerations above have been restricted to the assessment of treatment efficacy, as is also the case in most meta-analyses. When it comes to drug safety, the analogy is even more obvious because of the similarity of the endpoints.

3.3 The Dissimilarity

Clearly there are various differences between risk assessment and drug research which cannot be neglected and which prohibit a formal translation of decision criteria from one area to the other. A major difference is that in treatment evaluation the target population is a group of patients, who might show more extreme reactions than persons assumed to be healthy when exposed. Also, the risk-benefit relation is quite different in clinical trials.
While in clinical trials one usually wants to prove the existence of an effect, the aim of risk assessment is mostly to show the absence of an effect, that is, the safety of an agent or procedure. The number of drugs is rather limited, and therefore usually much more information has to be processed for one drug than in risk assessment, where thousands of agents await their assessment, calling for screening procedures and qualitative assessment. Often, when an agent has been classified as a carcinogen, it can be replaced by a non-noxious substance. If a drug has been identified as active, the search for a more active treatment may just be starting.

3.4 Weight of Evidence Evaluation in Risk Assessment

A recent reexamination of the EPA guidelines for risk assessment [17] resulted in the proposal of a weight of evidence evaluation which combines evidence from humans, animals and other data sources. Table 1 exhibits the major factors used in this approach for risk assessment. Most important are the independent studies with positive or null/negative outcome. Each study can be weighted further by a number of criteria. It is quite obvious that these criteria are applicable to the assessment of clinical data on the treatment of cancer patients. Table 2 shows the summary of this sort of evidence, when risk is classified by three descriptors: known/likely, cannot be determined, and not likely. By relating hazard assessment to efficacy assessment, exposure assessment to dosing, dose-response assessment to dose intensity studies, and risk characterization to efficacy characterization, one should be able to obtain similar criteria for the assessment of the efficacy of therapies.

3.5 The Role of Dose

The scaling of dose is generally of relatively low concern in clinical trials compared to risk assessment. The reason is the dominant role of the low-dose extrapolation problem in risk assessment, which has no equivalent in drug research.
Hence, dose plays no prominent role in therapeutic research, where the goal is to achieve a standardized dosing scheme with doses ranging in a small window only. Therefore, dose-response analysis has no direct counterpart in drug efficacy analysis. On the other hand, dose intensity has been of concern in recent discussions of high-dose treatment of cancer patients. Also, in clinical trials underdosing or overdosing should be avoided. In addition, in drug research one might be interested in the lowest dose producing a prespecified therapeutic effect, see [19]. Unbiased estimation of dose in a clinical trial is by no means trivial. One has to account for the time course of the drug application, compliance, dose reduction because of adverse events, and other complications arising during the treatment period [20].
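Why the delivered dose differs from the planned dose once reductions and delays are taken into account can be illustrated with a relative dose intensity calculation. This is a minimal sketch under assumptions: the `Cycle` record, the per-cycle bookkeeping and the example values are hypothetical, and the method of [20] may handle the time course and compliance differently.

```python
from dataclasses import dataclass


@dataclass
class Cycle:
    planned_dose: float    # dose planned for the cycle, e.g. in mg/m^2
    delivered_dose: float  # dose actually given, after reductions or omissions
    planned_days: int      # planned cycle length
    actual_days: int       # actual cycle length, including delays


def relative_dose_intensity(cycles):
    """Delivered dose per unit time divided by planned dose per unit time."""
    if not cycles:
        raise ValueError("at least one cycle is required")
    planned = sum(c.planned_dose for c in cycles) / sum(c.planned_days for c in cycles)
    delivered = sum(c.delivered_dose for c in cycles) / sum(c.actual_days for c in cycles)
    return delivered / planned


# Hypothetical patient: a 25% dose reduction from cycle 2 onwards and a
# one-week delay of cycle 2 because of an adverse event.
cycles = [
    Cycle(planned_dose=100.0, delivered_dose=100.0, planned_days=21, actual_days=21),
    Cycle(planned_dose=100.0, delivered_dose=75.0, planned_days=21, actual_days=28),
    Cycle(planned_dose=100.0, delivered_dose=75.0, planned_days=21, actual_days=21),
]

rdi = relative_dose_intensity(cycles)
print(f"relative dose intensity: {rdi:.2f}")
```

An analysis by planned dose would treat this patient as fully dosed, whereas the delivered intensity is clearly lower; which of the two quantities is the relevant "dose" depends on the question being asked.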
Table 2: Summarizing Weight of Evidence in RISK ASSESSMENT

4. Evidence Based Therapy Assessment (EBTA)

Motivated by the weight of evidence evaluation for carcinogenic risk assessment described above, a procedure is outlined below which may be helpful in combining evidence from clinical studies of different methodological quality and information from human randomized and non-randomized studies as well as from non-human studies. As in risk assessment, a two-step procedure is proposed where the second step combines the evidence collected in the first. A tentative scheme for that combination is given in Table 3. Note, however, that this is not meant to be a recipe for use in practice. Rather, it is intended to provide general guidance on how to proceed when relevant results from different studies come together. The procedure must be adapted to the special circumstances.

4.1 A Tentative Procedure

An Evidence Based Therapy Assessment procedure is defined in two steps, based on the translation of epidemiological human data into clinical human data, subdivided into data from randomized and from non-randomized studies, and on the translation of animal data into preclinical data. The first step consists of four parts.

STEP I
Table 3: A tentative summary of weight of
evidence for efficacy characterisation of a drug by a description in the three categories
present/likely (PL), cannot be determined (CANNOT) and not
likely/not present (UNLIKE) when data from randomized (RCT), non-randomized
(NON-RCT) and experimental non-human (NON-HUM) studies are available.

STEP II

The second step of efficacy characterization is for the integration of these four parts
into a summary which should provide treatment recommendations differentiated with respect
to prognostic factors, which have to include the disease stage, the age of the patient and
others. At this stage results from all three types of sources (randomized trials;
non-randomized trials and observational studies in general; and non-human findings) will be
available. In order to provide an idea of how to combine such information, a tentative
classification into three descriptors of evidence of efficacy is given. If not overdone,
classifications of this type are thought to be useful for objective consensus on treatment
preferences.

5. Discussion

5.1 Evidence Based Conclusion versus Evidence Based Medicine

Evidence based therapy assessment (EBTA) has been introduced above as a general means for deciding upon the efficacy of treatments and for deriving treatment recommendations. As in risk assessment, from which it has been translated, it may serve as a more rational tool than those used at present, e.g. at consensus conferences where evidence for and against single treatment modalities is collected, weighted, discussed and summarized. Assessment of the therapeutic evidence for a treatment is strictly a scientific process consisting of efficacy identification, dosing assessment and dose-response assessment for both efficacy and safety, and efficacy assessment from designed studies. These parts are put together in a comprehensive efficacy characterization, which should be the basis for disease and health management. The latter is a public health task and will have to evaluate the economic, social and political consequences of implementing the efficacy characterization. This approach is therefore based on a strict separation of the assessment phase and the management phase. In this respect EBTA acts in the same field as meta-analyses, which combine evidence from a number of well-defined and selected RCTs and provide an efficacy estimate on the basis of the best available information and the best available method. Indeed, if exclusively RCTs had been performed for a medical question, the procedure would collapse to a meta-analysis. EBTA stands in some contrast to Evidence Based Medicine (EBM), introduced by Sackett & Rosenberg [22] (see also [23]) and propagated within the Cochrane Collaboration [24]. EBM is a means to facilitate the translation of new medical evidence into clinical practice for better service to all patients, but it also influences public health decisions and government health policy.
Besides knowledge identification and synthesis, which belong to science, there is also the component of targeting the new evidence to decision makers, which can be considered a needed interface between the therapeutic evidence assessment and the disease and health management. Obviously, the dilemma of public health policy between the individual-patient ethic of effectiveness and the population-health ethic of efficiency [25] is neither solved nor addressed in the model of EBTA, but may be touched on in EBM. It seems important for drug research to locate efficacy and effectiveness in the science part and efficiency in the economic part of public health policy.

5.2 The Role of Information

The most important part of all efforts to improve the decision on the best treatment in EBM or EBTA is the efficacy assessment of clinical studies. Efforts are ongoing to collect this information systematically, see e.g. the Cochrane Collaboration [24], and to improve this information, see e.g. the CONSORT agreements [26]. Quality scores have been developed for RCTs [27]. Results from non-randomized studies may also be classified by a quality score which may take into account exclusion/non-exclusion of confounding, power, follow-up and missing values, characterization of dosing, quality of documentation, availability of a protocol, quality of publication, etc.

5.3 The Role of Modeling

Statistical models have been applied intensively in therapeutic research, only to
mention time-to-event modeling with the proportional hazards model, which has
also been used with success in tumor incidence analysis in risk assessment. However, the
mechanistic modeling of risk assessment, see [28], has no counterpart in the assessment
of treatment effects, although there have been many attempts to model the disease and the
treatment process, which unfortunately were less successful than in risk assessment. In
Step II of the procedure above all information should be combined to describe the
mechanisms of drug action and its curative effect on the disease. If possible, a
biologically based model should be built for predicting the regression of the disease as
a function of the treatment. Default assumptions in complex modeling should be allowed.
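
For reference, the proportional hazards model mentioned in this section can be written in its standard form. The notation is an assumption made here for illustration: h_0(t) denotes the unspecified baseline hazard, x = (x_1, ..., x_p) the covariate vector, and x_1 is taken to be a treatment indicator (or, in risk assessment, an exposure indicator), so that exp(beta_1) is the hazard ratio of treated versus untreated subjects with otherwise equal covariates.

```latex
h(t \mid x) = h_0(t)\,\exp\!\left(\beta^{\top} x\right),
\qquad
\frac{h(t \mid x_1 = 1, x_2, \dots, x_p)}{h(t \mid x_1 = 0, x_2, \dots, x_p)} = \exp(\beta_1)
```

The model is purely statistical: it describes how covariates scale the hazard but says nothing about the biological mechanism, which is the gap that the mechanistic models referred to above are meant to fill.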