Editors:
U. Abel,
A. Koch

Published by symposion

Nonrandomized Comparative Clinical Studies

Proceedings of the International Conference on Nonrandomized Comparative Clinical Studies in Heidelberg, April 10-11, 1997

"The 60-Minutes-Myocardial Infarction Project": Comparison with a Registry and a Randomized Clinical Trial

A. Koch, A. Hörmann, H. Löwel, J. Senges

Abstract

There is an ongoing debate about whether observational studies can produce reliable information on treatment comparisons. Randomized clinical trials are the accepted gold standard for this purpose. It is, however, impossible to investigate all important issues in randomized trials. Large observational studies are frequently performed and large clinical databases are available. It is therefore a relevant question how reliable data from nonrandomized studies are. In this contribution, data from a large nonrandomized multicenter study on decision-making with respect to thrombolytic treatment in patients with acute myocardial infarction are compared with a randomized clinical trial and a population-based registry. It is demonstrated that similar event rates are observed in those subgroups of the observational study that are comparable with the randomized study or the registry. In particular, there is no indication of an underreporting of deaths in the observational study, which would invalidate from the outset all treatment comparisons based on it. Although such results can hardly be generalized to other situations, they might help balance the view on so-called horror stories, where results of observational studies could not be verified in subsequent randomized clinical trials.

Introduction

Randomized clinical trials are the accepted gold standard for the comparison of two or more treatments. It is therefore advised to follow this standard whenever possible. However, medical research cannot get by without observational studies, because there are situations in which a randomized treatment comparison is inadequate or even impossible in practice. Moreover, the number of parameters that might influence the efficacy or safety of a treatment is so vast that, given only limited resources, it is impossible to investigate even the most prominent ones in adequately sized randomized trials[1].

These points demonstrate the need to deal with observational clinical studies. Influenced by a number of "horror stories"[2], where randomized clinical trials did not confirm the effects postulated on the basis of previously performed observational studies, the community of biostatisticians is extremely reserved in recommending observational studies or even participating in their planning and conduct. This attitude, however, is not based on scientific grounds. For example, the "horror stories" might result from or be favored by publication bias: if a hypothesis derived from observational studies is confirmed in subsequent randomized trials (which is the normal process of gaining evidence in medical research), this is of no greater methodological interest. Moreover, it has not yet been proven that observational studies that are planned with the same methodological rigor as is accepted for the planning and conduct of randomized trials will fail to produce treatment effects similar to those of randomized studies (see [3] for a more elaborate discussion of this point).

The main problem of observational studies is that it is difficult to prove both internal validity (which is brought about by the process of randomization) and external validity. Often it remains unclear whether the results from observational studies provide a realistic picture of the situation under investigation, or whether the patients are just a spontaneous collection of cases, documented at the discretion of the participating doctors.

Large-scale observational studies have been performed with the purpose of investigating doctors' behaviour in everyday practice, gaining knowledge from documented observations of treated patients, and demonstrating the quality of patient care. Health insurance companies and hospitals collect large databases, and it is a question of ongoing debate whether these sources can be used to perform treatment comparisons[4,5].

The main prerequisite for these comparisons is that the validity of the data sources has been examined. By means of a comparison with the results of a randomized clinical trial and a registry, this paper attempts to demonstrate that a large observational study of patients with acute myocardial infarction provides unbiased estimates for mortality of patients during the course of the in-hospital stay and that these data can therefore be used to investigate certain subgroups of patients that have, up to now, not been studied in randomized clinical trials.

Subjects and methods

Data sources

In this paper the results from three different sources are compared. The '60-minutes myocardial infarction project' (the myocardial infarction project) was initiated to investigate prehospital delay in patients with acute myocardial infarction, the door-to-needle time in the subgroup of patients that underwent thrombolytic treatment, and current practice in the evaluation of indications and contraindications for thrombolytic treatment. Between July 1st, 1992 and September 30th, 1994 a total of 14980 patients were documented. The 136 participating hospitals varied widely in size: nine centers contributed more than 250 patients to the study, while 36 hospitals documented the in-hospital stay for fewer than 50 patients.

Various relevant medical questions can be investigated with these data. For example, during the course of the study some centers incorporated acute coronary angioplasty into clinical practice as an alternative treatment to thrombolysis and as a treatment option for those patients for whom strong contraindications to thrombolytic treatment exist. By the end of the study, 634 patients had undergone acute coronary angioplasty, which, at that time, was the largest available case series on this treatment. It was of great interest to evaluate the efficacy and safety of coronary angioplasty in comparison with thrombolytic treatment, which is still the current standard for re-opening occluded coronary vessels. In order to give a meaningful interpretation of the results of the matched comparison of patients undergoing coronary angioplasty with those receiving thrombolysis, the completeness of the patient collection needed to be examined. The results of the treatment comparison, which demonstrate a slight superiority for patients treated with angioplasty, have been published in [6]. For a detailed presentation of the project and its results, see the biostatistical report[7].

Information on baseline characteristics and outcome variables in this study is compared with the GUSTO trial[8], a four-arm multinational multicenter trial investigating four thrombolytic strategies, which was performed in 15 countries with 1081 hospitals recruiting 41021 patients. Patient recruitment started on December 27th, 1990 and was completed at approximately the time the first phase of the myocardial infarction project ended (February 22nd, 1993).

The third data source is the Augsburg registry, which was established in 1984 as a center for the World Health Organisation project entitled 'Multinational Monitoring of Trends and Determinants in Cardiovascular Disease' (MONICA). The source population for the Augsburg registry included residents (aged 25 to 74 years) of the city of Augsburg, the county of Augsburg, and the county of Aichach-Friedberg. The design of the Augsburg registry has been described in detail in a series of publications[9,10]. The registry covers 13 hospitals within and 13 hospitals outside the study area, respectively, and has been operated by the major hospitals in the region, which provide acute care to approximately 75% of acute myocardial infarction patients of the region.

Statistical methods

The reliability of the data of the myocardial infarction project was assessed by means of two descriptive comparisons:

1) By imposing the criteria for inclusion / exclusion of the GUSTO trial on the original patient data of the myocardial infarction project, a subset of patients was identified. In this subgroup the baseline characteristics as well as the survival rates were then compared with the published information from the main publication of the GUSTO trial[8]. Provided that baseline patient characteristics were comparable in the two data sources, the hypothesis was that potential differences in the outcome variables were the consequence of (a) selection in randomized clinical trials or (b) an underreporting of certain cases with respect to potentially severe complications in the observational study.

2) The comparison with the Augsburg registry was performed by means of imposing the age restrictions of the registry on the original patient data of the myocardial infarction project and by identifying concurrent patients that were admitted to hospital and documented in the Augsburg registry. Again, the two subgroups were compared descriptively.

Results

Comparison with the GUSTO-trial

Table 1: Criteria for inclusion of the GUSTO trial and definition of the respective subgroup in the myocardial infarction project:

| criterion | patient included in the GUSTO trial | definition of the respective subgroup in the myocardial infarction project |
|---|---|---|
| prehospital delay | < 6 h | < 6 h |
| duration of chest pain | > 20 min. | information not available |
| EKG signs | ST-segm. elevation > 0.1 mV in two or more limb leads or > 0.2 mV in two or more precordial leads | information not available (EKG is definitive for an acute myocardial infarction) |
| systolic blood pressure | ≤ 180 mm Hg or resp. to therapy | information not available (≤ 180 mm Hg) |
| previous stroke | no | ✓ |
| active bleeding | no | information not available (active ulcer) |
| previous trauma or surgical intervention | no | ✓ |
| noncompressible vascular puncture | no | vascular puncture |
| previous treatment with streptokinase | no | information not available |
| previous participation in the trial | no | ✓ |

✓: the respective information is available from the questionnaire of the myocardial infarction project and can be used for the definition of the subgroup.

Table 1 summarizes the eligibility criteria for patients enrolled in the GUSTO trial and the corresponding definitions that were used to define a comparable subgroup in the myocardial infarction project. Whenever the corresponding information was not directly available from the items in the questionnaire of the myocardial infarction project, the surrogate definitions that have been used are given in parentheses. With respect to the diagnosis of the myocardial infarction, precise criteria for EKG signs were specified in the GUSTO protocol, whereas in the protocol of the myocardial infarction project only a summary evaluation of the treating physician was asked for. Patients with previous stroke and previous trauma or surgical intervention could be identified and were excluded. Information on active bleeding was not directly available. In the myocardial infarction project, peptic ulcer was asked for as a concomitant disease. It was assumed that doctors record this information only if it is relevant for the current treatment decision. Except for the rare situation in which a secondary event in the same patient was recorded during the observational period of the myocardial infarction project, no information on previous treatment with streptokinase was available for patients with a reinfarction. We expect, however, that this information, although important for the assessment of differences in the efficacy of thrombolytic agents, is of only minor importance for the clinical course of currently treated patients.

In summary, most relevant parameters for the definition of a subgroup that is comparable to the GUSTO patients can be derived from the questionnaire of the myocardial infarction project.
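The subgroup definition summarized in Table 1 amounts to a record filter. The following Python sketch is purely illustrative: all field names are hypothetical and do not reflect the actual questionnaire items of the myocardial infarction project, and treating any documented vascular puncture as an exclusion is an assumption made here. Only those criteria for which Table 1 reports usable information or a surrogate are applied.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    # All field names are hypothetical stand-ins for questionnaire items.
    prehospital_delay_h: float        # hours from symptom onset to admission
    ecg_definitive: bool              # physician's summary evaluation of the EKG
    systolic_bp_mmhg: float
    previous_stroke: bool
    active_ulcer: bool                # surrogate for active bleeding
    previous_trauma_or_surgery: bool
    vascular_puncture: bool           # assumed here to be an exclusion


def meets_gusto_like_criteria(p: Patient) -> bool:
    """Return True if the record passes the surrogate criteria of Table 1."""
    return (
        p.prehospital_delay_h < 6
        and p.ecg_definitive
        and p.systolic_bp_mmhg <= 180
        and not p.previous_stroke
        and not p.active_ulcer
        and not p.previous_trauma_or_surgery
        and not p.vascular_puncture
    )


def select_subgroup(patients):
    """Keep only the records that satisfy every available criterion."""
    return [p for p in patients if meets_gusto_like_criteria(p)]
```

Criteria without any available information (duration of chest pain, previous treatment with streptokinase) cannot be represented in such a filter, which is why the resulting subgroup is only approximately comparable to the trial population.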

4659 out of 14360 patients documented in the myocardial infarction project fulfilled the criteria mentioned above. 4086 (87%) and thus the vast majority of these patients received a thrombolytic treatment. It was intended to investigate and compare also the typical complications associated with lysis. Therefore, the following analyses refer to the subgroup of 4086 patients that have been treated with lysis.

Table 2: Baseline characteristics of the patients in the GUSTO trial and the corresponding subpopulation of the myocardial infarction project:

| | GUSTO trial | subgroup of the myocardial infarction project |
|---|---|---|
| number of patients | 9841 | 4086 |
| age (years)† | 62 (52, 70) | 62 (54, 70) |
| female sex (%) | 25 | 25 |
| systolic blood pressure (mmHg) | 130 (111, 144) | 130 (120, 150) |
| heart rate† | 73 (62, 85) | 76 (64, 89) |
| previous infarction (%) | 16 | 15 |
| time to randomisation (min) | 120 (90, 180) | 115 (65, 180)ƒ |
| time to treatment (min) | 164 (115, 232) | 145 (96, 210) |

ƒ The myocardial infarction project is a nonrandomized study. Thus prehospital delay is defined as the time between onset of symptoms and admission to the hospital.

† Values followed by numbers in parentheses are median values with the 25th and the 75th percentiles shown inside the parentheses.

Table 2 presents the baseline characteristics of the two populations. The four treatment groups of the GUSTO trial were well balanced with respect to these parameters except for the observed time to treatment, where the median in the four groups ranged from 164 to 170 minutes. In general, the results of the GUSTO trial and the myocardial infarction project are in good agreement. The exceptions are the delays between the onset of symptoms and the time of randomization or the start of treatment. Possible reasons for the longer time intervals observed in the GUSTO trial are differences in admission strategies in various European and American countries and the process of seeking the patient's consent to randomization, as well as the randomization process itself.
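The summary format used in Table 2, a median with the 25th and 75th percentiles, can be reproduced with the Python standard library. This is a minimal sketch: the sample values are invented for illustration and are not data from either study, and the "inclusive" interpolation rule is an assumption, since the paper does not state which percentile definition was used.

```python
from statistics import quantiles


def median_with_quartiles(values):
    """Return (median, 25th percentile, 75th percentile) of a sample."""
    # n=4 yields the three quartile cut points; "inclusive" treats the
    # sample minimum and maximum as the 0th and 100th percentiles.
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    return q2, q1, q3


# Invented example values (minutes); not data from either study.
delays = [60, 65, 90, 115, 120, 180, 240]
median, p25, p75 = median_with_quartiles(delays)
print(f"{median:g} ({p25:g}, {p75:g})")
```

Different software packages use slightly different percentile definitions, so small discrepancies in the quartiles of two data sources need not reflect real differences between the populations.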

Table 3: Variables to assess efficacy and safety of the administered treatments:

| | GUSTO trial† | subgroup of the myocardial infarction project |
|---|---|---|
| mortality within 24 hours | 2.3 - 2.9 | day of arrival: 2.1; first day: 1.7; death within 48 hours: 4.6 |
| 30 day mortality | 6.3 - 7.4 | 8.7 (in the hospital) |
| stroke | 0.49 - 0.94 | 0.5 |
| bleeding leading to transfusion | 5.1 - 5.6 | 0.6 |
| allergic reaction | 1.6 - 5.8 | 1.2 |
| allergic shock | 0.2 - 0.7 | 0.1 |

† The ranges of results observed in the four treatment groups of the randomized clinical trial are given.

Results on mortality and complications (Table 3) need some further comments. In the GUSTO trial, precise information on the date and time of death was recorded. By contrast, only the date of death is available in the myocardial infarction project. In addition, doctors in the myocardial infarction project were asked to specify whether death occurred within 48 hours after admission to the hospital. In the four treatment groups of the GUSTO trial, 2.3% to 2.9% of the patients died within 24 hours after admission to the coronary care unit, leading to a significant treatment effect in this study. In the myocardial infarction project, 2.1% of the documented patients did not survive the day of admission to the hospital, and 3.8% died on the day of arrival in the hospital or on the first day after admission. In the myocardial infarction project, the 30-day mortality was higher. It has to be noted that no information is available on whether patients were moved to secondary care units or were discharged early from the hospital. Thus, the higher mortality observed in the myocardial infarction project might indicate some hidden selection in randomized clinical trials that cannot be assessed from differences in baseline characteristics in the two patient populations (except for a slightly higher mean heart rate which might indicate a higher number of high risk patients in the subgroup of the myocardial infarction project).

In summary, the comparison of survival rates does not provide any indication for a serious underreporting of deaths in the observational study, such as might have been expected in advance.

On the other hand, it has to be mentioned that, especially for a well-defined complication (bleeding leading to transfusion), there was a large discrepancy between the results from the randomized clinical trial and the observational study. We had expected to see some underreporting of minor complications in the observational study. It is known that the reported number of adverse events is highest with a new drug and diminishes in the time after its introduction, even if the frequency of administration remains constant in the population. One might expect that doctors gradually become accustomed to the complications that they considered noteworthy when a treatment was new, and that recording of adverse events is therefore more accurate and sensitive in randomized clinical trials than in everyday practice. In the current comparison, however, all complications considered (allergic reaction, allergic shock, and bleeding leading to transfusion), including the severe ones, were reported less frequently than expected from the results of the randomized clinical trial.

This result makes it advisable not to use severe bleeding complications in treatment comparisons based on the results of the observational study.

Comparison with the Augsburg registry

The comparison of the myocardial infarction project and the Augsburg registry was performed by the selection of a subgroup in each of the two data sources. Cases in the Augsburg registry were included if the hospital admission was between July 1st, 1992 and September 30th, 1994 (the observational period of the myocardial infarction project). As the myocardial infarction project was restricted to transmural infarctions, only these patients were eligible for the comparison.

The subgroup of the myocardial infarction project was defined (1) by restricting patients' age to the range of 25 to 74 years, and (2) by including only cases that survived beyond the first 24 hours after admission to the hospital (according to the regulations of the registry, full clinical information on type and diagnosis of the infarction is available only if a patient survives for more than 24 hours).

Table 4: Selection of cases in the Augsburg registry and comparison with the corresponding subgroup in the myocardial infarction project: mortality:

| | number of observations | number of deaths | percentage |
|---|---|---|---|
| Augsburg registry, all cases | 2154 | 753 (before admission to the hospital) | 35.0 |
| Augsburg registry, patients admitted to the hospital | 1401 | 446 (day of arrival) | 31.8 |
| Augsburg registry, transmural infarctions | 755 | 75 (in the hospital) | 9.9 |
| myocardial infarction project (subgroup) | 10327 | 1002 (in the hospital) | 9.7 |

Table 4 describes the selection of cases in the Augsburg registry and tabulates the associated mortality. In total, 2154 patients were registered in the time interval specified above. 753 cases (35%) died before hospital admission. In this fairly large patient group, the hospital personnel did not even have a chance to intervene. Of the remaining 1401 cases, 446 (31.8%) died on the day of admission. In the remaining group of 950 patients, 755 cases (79%) were classified as transmural infarctions. Survival rates in this subgroup are compared with the rates in the subgroup of the myocardial infarction project defined above. These results do not provide any indication for a serious underreporting of deaths in the myocardial infarction project.
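The percentages in Table 4 follow directly from the reported counts. The short Python sketch below recomputes them; the counts are taken from Table 4, while the helper name `percentage` and the rounding to one decimal place are choices made here for illustration.

```python
# Counts taken from Table 4: (label, number of observations, number of deaths)
TABLE_4 = [
    ("Augsburg registry, all cases", 2154, 753),
    ("Augsburg registry, patients admitted to the hospital", 1401, 446),
    ("Augsburg registry, transmural infarctions", 755, 75),
    ("myocardial infarction project (subgroup)", 10327, 1002),
]


def percentage(deaths: int, observations: int) -> float:
    """Deaths as a percentage of observations, rounded to one decimal."""
    return round(100 * deaths / observations, 1)


rates = {label: percentage(d, n) for label, n, d in TABLE_4}
for label, rate in rates.items():
    print(f"{label}: {rate}")
```

The last two rows are the ones being compared: in-hospital mortality among transmural infarctions in the registry and in the corresponding subgroup of the myocardial infarction project.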

Table 5: Distribution of age and gender in the various subgroups of the Augsburg registry and the corresponding subgroup in the myocardial infarction project:

| | number of observations | 25-55 years | 56-70 years | 70-74 years | male gender |
|---|---|---|---|---|---|
| Augsburg registry, all cases | 2154 | 21.8 % | 55.1 % | 23.1 % | 71.3 % |
| Augsburg registry, patients admitted to the hospital | 1401 | 22.6 % | 55.5 % | 21.9 % | 72.7 % |
| Augsburg registry, transmural infarctions | 755 | 29.3 % | 54.3 % | 16.4 % | 75.5 % |
| myocardial infarction project (subgroup) | 10327 | 29.9 % | 56.5 % | 13.6 % | 75.9 % |

Table 5 shows how the selection process used to define a comparable subgroup in the Augsburg registry influences the distribution of age and sex. The results indicate that elderly patients have a higher risk of dying before hospital admission and that males are overrepresented in the group of patients that arrive at the hospital. Again, the overall comparison shows the two subpopulations selected from the registry and the observational study to be in good agreement.

Discussion

Whenever results from nonrandomized studies are used to compare two treatments, one has to accept that a difference in the efficacy of the two treatments is only one reason for observed effects between two patient groups. Even if the groups are comparable with respect to observed baseline characteristics, it is only the process of randomization that controls for all (and even the unknown) prognostic factors at the date of randomisation. Multivariate methods can only adjust for differences between the two treatment groups with respect to known prognostic factors.

Although some potential sources of bias that might invalidate the results from studies with historical controls (e.g., stage migration due to changes in diagnostic methods, the so-called "Will Rogers Phenomenon"[11], or chronology bias[12]) are of minor importance in the case of prospectively documented parallel groups, it can still not be excluded that patients who are treated differently are in fact different from the beginning.

Randomized clinical trials are undertaken to assess the superiority of one treatment over another. On the other hand, one has to accept that, due to the large number of important prognostic factors and the variety of possible modifications in the type of drug, its dosage and form of application, results from randomized clinical trials are far from sufficient to guide treatment decisions in everyday practice. Doctors constantly have to weigh indications and contraindications for certain treatments to find a balance between efficacy and safety when administering a certain drug to a certain patient. As large observational studies have demonstrated, it frequently occurs that a treatment is administered even if the patient does not match all the criteria for inclusion in the randomized clinical trials that have assessed the efficacy of this treatment.

Observational studies give the opportunity to investigate the outcome in subgroups of patients that have never been included in randomized clinical trials. A close analysis of how the safety or efficacy of two treatments can be assessed in observational studies clearly indicates that valid results can only be derived if the documented cases form a random sample from all relevant cases, or if all relevant cases have been documented. Both conditions, which are requirements for the interpretation of observed effects in observational studies, are hard to assess from a dataset. External information is necessary to make definitive statements about the validity of the database.

It is important to provide some additional evidence for the absence of a so-called documentation bias (which may arise if patients are documented at the discretion of the participating doctors). This paper has tried to show how documentation from registries and the results from randomized clinical trials can help increase the confidence in the conclusions drawn from treatment comparisons in an observational study.

Overall, the results for the comparison of a subgroup of patients from the myocardial infarction project that met the inclusion criteria of the randomized clinical trial, and another subgroup that was selected according to the restrictions of the Augsburg registry demonstrate a good general agreement of the corresponding distributional parameters.

It is, of course, not justified to draw general conclusions about the validity of results from observational studies. The myocardial infarction project was a prospective study that had been thoroughly planned and monitored. A relatively simple approach using only two questionnaires per patient was used and attempts were made to avoid unnecessary work for the participating doctors. This is also reflected by a high percentage of formally complete and correct case record forms (> 90%). Moreover, principal investigators in coronary care units are possibly more experienced in participating in clinical trials and following their rules than doctors in other areas of medicine.

We have presented an example of an observational study that is in good agreement with a similar randomized clinical trial (another example is tonsillectomy for recurrent throat infection [13], see [14] for further examples). These examples help balance the possibly biased view of observational studies resulting from some landmark papers [15-17] that described bad experiences with nonrandomized clinical studies.

Acknowledgement

The first author is indebted to Heike Dinkel for her continuous support of his activities and the current analyses based on the data of the myocardial infarction project that have been presented in this paper.

References

[1] Koch A, Windeler J, Abel U: Anwendungsbeobachtungen: zu Begriff und Nutzen. Med Klin 91:103-105, 1996
[2] Weinstein MC: Allocation of subjects in medical experiments. N Engl J Med 291:1278-1285, 1974
[3] Abel U, Koch A: Randomisation in klinischen Studien: Empirisch begründet oder nur ein Dogma? Internist 38:318-324, 1997
[4] Byar DP: Why data bases should not replace randomized clinical trials. Biometrics 36:337-342, 1980
[5] Hlatky MA: Using databases to evaluate therapy. Stat Med 10:647-652, 1991
[6] Zahn R, Koch A, Rustige J, Schiele R, Wirtzfeld A, Neuhaus K, Kuhn H, Gülker H, Senges J: Primary angioplasty versus thrombolysis in the treatment of acute myocardial infarction - a matched pairs study. Am J Cardiol 79:264-269, 1997
[7] Koch A, Dinkel H: Das 60-Minuten-Herzinfarktprojekt: Biometrischer Abschlußbericht. Technical Report der Abteilung Medizinische Biometrie Nr. 26, 1996
[8] The GUSTO Investigators: An international randomized trial comparing four thrombolytic strategies for acute myocardial infarction. N Engl J Med 329:673-682, 1993
[9] Löwel H, Lewis M, Hörmann A, Keil U: Case findings, data quality aspects and comparability of myocardial infarction registers: results of a south German register study. J Clin Epidemiol 44:249-260, 1991
[10] The WHO MONICA project: Myocardial infarction and coronary deaths in the World Health Organisation MONICA Project: registration procedures, event rates, and case-fatality rates in 38 populations from 21 countries in four continents. Circulation 90:583-612, 1994
[11] Feinstein AR, Sosin DM, Wells CK: The Will Rogers phenomenon. Stage migration and new diagnostic techniques as a source of misleading statistics for survival in cancer. N Engl J Med 312:1604-1608, 1985
[12] Haines SJ: Randomized clinical trials in the evaluation of surgical innovation. J Neurosurg 51:5-11, 1979
[13] Paradise JL, Bluestone CD, Bachmann RZ, Colborn DK, Bernard BS, Taylor FH, Rogers KD, Schwarzbach RH, Stool SE, Friday GA, Smith IH, Saez CA: Efficacy of tonsillectomy for recurrent throat infection in severely affected children. N Engl J Med 310:674-683, 1984
[14] Ellenberg JH: Biostatistical collaboration in medical research. Biometrics 46:1-32, 1990
[15] Sacks H, Chalmers TC, Smith J: Randomized versus historical controls for clinical trials. Am J Med 72:233-240, 1982
[16] Byar DP, Simon RM, Friedewald WT, Schlesselmann JJ, Demets DL, Ellenberg JH, Gail MH, Ware JH: Randomized clinical trials. N Engl J Med 295:74-80, 1976
[17] Mantel N: Cautions on the use of medical databases. Stat Med 2:355-362, 1983