Introduction

Coronary heart disease (CHD) is the major cause of death in developed countries. In recent decades, the mortality rate of CHD has declined in the United States and Western Europe, including the Netherlands [1–4]. Two possible explanations for this decreasing mortality rate are a decline in the population risk of CHD leading to a lower incidence, or a better survival of cases with CHD resulting in a lower case-fatality rate. Information about the trend in the incidence rate could be used to distinguish between these two explanations [5].

Because cardiovascular disease registries are lacking in most countries, record linkage with national hospital discharge and mortality data is often used to estimate the incidence of CHD [5, 6]. Recently, Koek et al. [7] estimated the incidence rate of a first acute myocardial infarction in 2000 by record linkage to the Dutch hospital discharge register (HDR) and causes of death registry from Statistics Netherlands. They found a crude incidence rate (per 100,000 persons per year) of 293 in men and 174 in women.

The validity of these estimates, however, depends on the completeness and accuracy of the data in the national registers and the accuracy of the linkage. Therefore, several studies have investigated the validity of data about CHD or acute myocardial infarction in national registers by comparing them with specific study registers [8–15]. These studies showed a wide range in the estimated values for the validity of the data in national registers.

Less is known about the validity of incidence estimates using record linkage with national hospital discharge and mortality data for unstable angina pectoris and heart failure. These estimates may be more problematic, because the diagnoses of these diseases are more difficult to make. Only two studies investigated the validity of the diagnosis of heart failure in national HDRs [16, 17]. Both studies indeed showed lower values for the validity of this diagnosis in national registers.

In this study, we used the disease registry of the cardiovascular registry Maastricht (CAREMA) cohort study to estimate the incidence rates of CHD, acute myocardial infarction, unstable angina pectoris, heart failure, and sudden cardiac arrest. We compared these incidence rates with the incidence rates estimated using hospital discharge data to assess the completeness and validity of the latter.

Materials and methods

Study population

The CAREMA cohort consists of participants of two large monitoring projects in the Netherlands living in the Maastricht region: the monitoring project on cardiovascular risk factors (PPHVZ) 1987–1991 [18] and the monitoring project on chronic disease risk factors (MORGEN Project) 1993–1997 [19], including the transition year (1992) between these projects. Each year, a random sample of people aged 20–59 years was selected from the municipal registries of Maastricht and surrounding communities: Eijsden, Margraten, Meerssen, and Valkenburg aan de Geul. Between 1987 and 1997, 21,662 men and women, born between 1927 and 1977, were included in this study, of whom 21,148 participants (97.6%) had given informed consent to retrieve information from the municipal population registries and from their general practitioner and specialist.

Follow up

Migration and mortality follow-up

A migration and mortality follow-up was performed by record linkage of the CAREMA cohort to the municipal population registries. During follow-up until 31 December 2003, 2,106 persons (10.0%) had migrated to a municipality outside the Maastricht region, 621 persons (2.9%) had emigrated, and 791 persons (3.7%) had died. Furthermore, 12 persons (0.1%) were lost to follow-up, of whom 9 persons appeared to have migrated out of the Netherlands just before their baseline study date.

Cardiologic follow-up

Cardiologic follow-up was performed by record linkage of the CAREMA cohort to several hospital registries of the University Hospital Maastricht (UHM). In April 2004, the cohort was linked to the hospital information system (HIS) of the UHM using a combination of date of birth, gender and the first four characters of the family name [20]. In the HIS, 20,632 cohort members (97.6%) could be found. Subsequently, these subjects were linked to the cardiology information system (CIS) of the UHM department of cardiology using the personal identification number of the HIS as identifier. For all people that visited the UHM department of cardiology, the CIS contains all reports to the general practitioner and information from visits to the emergency ward or outpatient clinic for heart problems, hospital admissions for cardiologic diseases, physical examinations and treatments. Among the 20,632 persons, 4,694 (22.8%) were known in the CIS. The cardiologic history of these persons was abstracted and coded by trained registrars under guidance of a cardiologist (AG).
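The linkage step described above can be illustrated with a minimal sketch. It assumes that each record is a dictionary with hypothetical field names (`id`, `birth_date`, `gender`, `family_name`), that names are compared case-insensitively, and that a key shared by several hospital records is treated as ambiguous and left unlinked. None of these details are stated in the text; the actual procedure is described in [20].

```python
from datetime import date


def linkage_key(birth_date, gender, family_name):
    """Key built from date of birth, gender and the first four name characters."""
    # Assumption: trimming whitespace and upper-casing before comparison.
    return (
        birth_date.isoformat(),
        gender.strip().upper(),
        family_name.strip().upper()[:4],
    )


def link_records(cohort, hospital):
    """Return {cohort_id: hospital_id} for keys that match exactly one hospital record."""
    index = {}
    for rec in hospital:
        key = linkage_key(rec["birth_date"], rec["gender"], rec["family_name"])
        index.setdefault(key, []).append(rec["id"])

    links = {}
    for rec in cohort:
        key = linkage_key(rec["birth_date"], rec["gender"], rec["family_name"])
        candidates = index.get(key, [])
        # Assumption: ambiguous keys (several candidates) are not linked automatically.
        if len(candidates) == 1:
            links[rec["id"]] = candidates[0]
    return links
```

Because only four characters of the family name enter the key, spelling variants in the rest of the name do not prevent a match, at the cost of a higher chance that two different persons share a key.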

Furthermore, the CAREMA cohort was linked to the Maastricht circulatory arrest registry (MCAR) [21] of the UHM department of cardiology to identify people who suffered from a sudden cardiac arrest.

For participants who died, the cause of death was obtained from Statistics Netherlands. Causes of death have been coded according to the ninth revision of the international classification of diseases (ICD-9) until 1996, and thereafter according to the tenth revision (ICD-10). Among the 791 deceased cohort members, 276 persons (34.9%) had a cardiovascular disease as primary or secondary cause of death (ICD-9 390–459; ICD-10 I00-I99). The cause of death was unknown for 24 cohort members (3.0%) who died outside the Netherlands, while five persons (0.6%) could not be linked to the causes of death registry from Statistics Netherlands.

The following data were registered in the CIS-based registry: date of migration to a municipality outside the Maastricht region, date of emigration, date of death including cause of death, and the presence of a clinical diagnosis including date of diagnosis and several other characteristics of an acute or silent myocardial infarction, unstable or stable angina pectoris, heart failure, atrial fibrillation, ventricular fibrillation/tachycardia, several cardiologic treatments, and sudden cardiac arrest. All data were checked for completeness, possible errors, and inconsistencies.

In addition, the CAREMA cohort was linked to the HDR of the UHM to improve the completeness of the cardiologic follow-up. In the HDR, the discharge diagnoses of all admissions to the UHM have been registered using the ninth revision of the international classification of diseases (ICD-9-CM). By this linkage, only four participants were found with a discharge diagnosis of CHD (ICD-9 codes 410–414) in the HDR who were not linked to the CIS. After their medical history had been checked, none of them proved to be an additional case for the analyses.

Because some delay might have occurred in the registration of events in the hospital registries, the follow-up was censored at 31 December 2003 to ensure the completeness of the follow-up.

Statistical analyses

Incidence estimates

In the present study, CHD is defined as incident acute myocardial infarction, unstable angina pectoris, coronary artery bypass grafting (CABG), percutaneous transluminal coronary angioplasty (PTCA), or CHD death.

Incident cases were defined in two ways, i.e., based on causes of death and either the CIS or the HDR. Persons with cardiac diseases as primary or secondary cause of death according to Statistics Netherlands were defined as cases using the following ICD-codes: ICD-9 410–414 and ICD-10 I20–I25 for CHD; ICD-9 410 and ICD-10 I21–I22 for acute myocardial infarction; ICD-9 413 and ICD-10 I20 for unstable angina pectoris; ICD-9 428 and ICD-10 I50 for heart failure; and ICD-9 798 and ICD-10 I46, R96, and R98 for sudden cardiac arrest. In addition, cases were defined according to the clinical diagnosis of the disease, made by experienced cardiologists, as extracted from CIS for the CIS-based definition. This clinical diagnosis was mostly based on the diagnosis mentioned in the report to the general practitioner. Furthermore, additional information, such as enzyme levels, ECG and echo findings, was recorded in CIS and was used to check whether the patient’s clinical signs and symptoms were in agreement with this diagnosis. For the HDR-based definition, cases were defined according to their hospital discharge diagnosis using the following ICD-9 codes: 410, 411.1, and 413.1 for CHD; 410 for acute myocardial infarction; 411.1 and 413.1 for unstable angina pectoris; and 428 for heart failure.
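The HDR-based case definition above amounts to a lookup from discharge codes to diseases. The sketch below is one possible way to express it, assuming codes are stored as strings such as "410.1" and that a listed code also covers its subcodes (prefix matching). Both are assumptions about the data format, not details given in the text.

```python
# ICD-9 discharge codes per disease, as listed for the HDR-based definition.
HDR_CODES = {
    "CHD": ("410", "411.1", "413.1"),
    "acute myocardial infarction": ("410",),
    "unstable angina pectoris": ("411.1", "413.1"),
    "heart failure": ("428",),
}


def hdr_diseases(icd9_code):
    """Return the set of study diseases that a discharge code counts towards."""
    code = icd9_code.strip()
    # Assumption: prefix matching, so "410.1" falls under "410".
    return {
        disease
        for disease, prefixes in HDR_CODES.items()
        if any(code.startswith(prefix) for prefix in prefixes)
    }
```

Under this reading, a discharge code of 413.9 (other and unspecified angina pectoris) maps to none of the study diseases in the HDR-based definition.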

During follow-up, participants may have had multiple cardiac diseases. For each disease separately, incident cases were defined according to the first occurrence of that disease, irrespective of the occurrence of other diseases investigated in this study. For this reason, the sum of cases with an acute myocardial infarction and cases with unstable angina pectoris is higher than the total number of CHD cases.

For each cardiac disease separately, person-time at risk was calculated from baseline until the end of follow-up, defined as whichever of the following occurred first: the disease event (date of clinical diagnosis for the CIS-based definition, date of hospital admission for the HDR-based definition), migration to a municipality outside the Maastricht region, emigration, death, or censoring at 31 December 2003. Incidence rates were calculated as the number of incident cases divided by the disease-specific person-time at risk.
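The person-time calculation can be sketched as follows. The function and argument names are hypothetical, and the conversion from days to years (365.25 days per year) is an assumption; the text does not state how person-time was converted.

```python
from datetime import date

CENSOR_DATE = date(2003, 12, 31)  # administrative end of follow-up


def follow_up(baseline, event=None, migration=None, emigration=None, death=None):
    """Return (person_years, is_incident_case) for one participant and one disease."""
    end = min(d for d in (event, migration, emigration, death, CENSOR_DATE) if d is not None)
    # Assumption: 365.25 days per year.
    years = max((end - baseline).days, 0) / 365.25
    # The participant is an incident case only if the event is what ended follow-up.
    is_case = event is not None and event <= end
    return years, is_case


def incidence_rate_per_100k(n_cases, total_person_years):
    """Incident cases divided by person-time, expressed per 100,000 person-years."""
    return 100_000 * n_cases / total_person_years
```

A participant who migrates out of the region before a later diagnosis contributes person-time only up to the migration date and is not counted as a case.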

In the analyses, participants with a migration date to a municipality outside the Maastricht region before their baseline study date (n = 26) and participants lost to follow-up (n = 12) were excluded, leaving 21,110 cohort members. In addition, cases with CHD at baseline (n = 347), based on self-report or diagnosis date before baseline in the CIS-based registry, were excluded in the analyses of CHD, acute myocardial infarction, and unstable angina pectoris. Cases with heart failure at baseline (n = 7), based on a diagnosis date before baseline in the CIS-based registry, were excluded in the analyses of heart failure. For sudden cardiac arrest, no prevalent cases were excluded.

Comparison between CIS-based and HDR-based definitions

In the analyses, the CIS-based registry was used as gold standard. A positive match between the registries was defined as a registration with the specific disease in both the CIS-based and HDR-based registry within a time frame of 6 months prior to or post diagnosis in CIS (true positives). Sensitivity was calculated as the number of cases with a positive match divided by the total number of cases in the CIS-based registry. Positive predictive value was calculated as the number of cases with a positive match divided by the total number of cases in the HDR-based registry. The 95% confidence intervals were calculated using the standard error of the estimate of the binomial distribution in the usual manner. Stratified analyses were performed for sex, age at diagnosis, and study period.
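A minimal sketch of this comparison is given below. It assumes each registry is a mapping from a person identifier to a diagnosis date, approximates "6 months" as 183 days, and reads "in the usual manner" as the normal-approximation (Wald) interval p ± 1.96·SE. All three are assumptions made for illustration.

```python
from datetime import date
from math import sqrt

WINDOW_DAYS = 183  # assumption: 6 months approximated as 183 days


def positive_match(cis_date, hdr_date):
    """True if the HDR registration lies within 6 months before or after the CIS diagnosis."""
    return abs((hdr_date - cis_date).days) <= WINDOW_DAYS


def proportion_with_ci(successes, total, z=1.96):
    """Proportion with a Wald 95% confidence interval, clipped to [0, 1]."""
    p = successes / total
    se = sqrt(p * (1 - p) / total)
    return p, max(0.0, p - z * se), min(1.0, p + z * se)


def validity(cis, hdr):
    """Sensitivity and positive predictive value of the HDR against the CIS."""
    true_pos = sum(
        1 for pid, cis_date in cis.items()
        if pid in hdr and positive_match(cis_date, hdr[pid])
    )
    return {
        "sensitivity": proportion_with_ci(true_pos, len(cis)),
        "ppv": proportion_with_ci(true_pos, len(hdr)),
    }
```

In this sketch a person registered in both sources more than 6 months apart is not a positive match, but still counts in both denominators.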

Results

Incidence estimates

During follow-up, 815 cases were registered with CHD in the CIS-based registry, 481 cases with acute myocardial infarction, and 390 cases with unstable angina pectoris (Table 1). The incidence rates per 100,000 person-years were 362.2 for CHD, 212.2 for acute myocardial infarction, and 171.8 for unstable angina pectoris (Table 2). In addition, 154 cases were registered with heart failure of whom 68 cases (44.2%) self-reported CHD at baseline or had been diagnosed with CHD prior to the diagnosis of heart failure in CIS (Table 1). Among the 152 cases with sudden cardiac arrest in the CIS-based registry, 57 cases (37.5%) self-reported CHD at baseline or had been diagnosed with CHD or heart failure prior to the diagnosis of sudden cardiac arrest in CIS. The incidence rates per 100,000 person-years were 66.4 for heart failure and 65.4 for sudden cardiac arrest (Table 2).

Table 1 Baseline characteristics of the CAREMA cohort in The Netherlands, 1987–2003
Table 2 Estimated incidence rates from the HDR-based and CIS-based registry in the CAREMA cohort in 1987–2003

In the HDR-based registry, 656 cases were registered with CHD during follow-up, 417 cases with acute myocardial infarction, 269 cases with unstable angina pectoris, and 84 cases with heart failure. There were no cases with sudden cardiac arrest as discharge diagnosis in the HDR. The incidence rates of these diseases derived from the HDR-based registry were lower than those derived from the CIS-based registry (Table 2). Especially in the older age categories (50–59 and 60–69 years) of both men and women, the incidence rates from the HDR-based registry were lower compared with those from the CIS-based registry, except for female cases with acute myocardial infarction. For both men and women, the estimated incidence rates per age category from the CIS-based and HDR-based registry are given in the Appendix.

Validity of the HDR-based registry

For the HDR-based definition of CHD, the sensitivity and positive predictive value were 72 and 91%, respectively (Table 3). A positive match was found in 590 (70.8%) of the 833 cases with CHD in one or both of the registries. For 43 persons (5.2%), the time difference between the diagnosis in the HDR-based and CIS-based registry was longer than 6 months. Furthermore, 182 CHD cases (21.8%) were found only in the CIS-based registry, while 18 cases (2.2%) were found only in the HDR-based registry. Because normally CHD refers to the ICD-9 codes 410–414, the HDR-definition of CHD was extended to these ICD-9 codes in additional analyses. In doing so, the sensitivity increased from 72 to 85% while the positive predictive value decreased from 91 to 85%.
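The reported CHD percentages can be reproduced from the four counts in this paragraph. The short check below assumes that cases registered in both sources more than 6 months apart count in both denominators; the resulting HDR denominator of 651 is an inference and is not stated explicitly in the text.

```python
matched = 590         # in both registries, within 6 months
outside_window = 43   # in both registries, more than 6 months apart
cis_only = 182        # only in the CIS-based registry
hdr_only = 18         # only in the HDR-based registry

either = matched + outside_window + cis_only + hdr_only  # cases in one or both registries
cis_total = matched + outside_window + cis_only          # gold-standard cases
hdr_total = matched + outside_window + hdr_only          # inferred HDR denominator

sensitivity = matched / cis_total
ppv = matched / hdr_total
share_matched = matched / either

print(f"either={either}, cis_total={cis_total}, hdr_total={hdr_total}")
print(f"sensitivity={sensitivity:.1%}, ppv={ppv:.1%}, matched={share_matched:.1%}")
```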

Table 3 Sensitivity (Se) and positive predictive values (PPV) for the cardiovascular diseases in the HDR-based registry compared with the CIS-based registry in 1987–2003 within a time frame of 6 months prior to or post diagnosis in CIS

For the HDR-based definition of acute myocardial infarction, the sensitivity and positive predictive value were 84 and 97%, respectively (Table 3). For 404 cases (83.3%), a positive match was found between the HDR-based and the CIS-based registry. For eight cases (1.6%), the time difference between the diagnosis in the HDR-based and CIS-based registry was longer than 6 months. The remaining cases (15.1%) were found in either one of the registries.

The sensitivity and positive predictive value for the HDR-based definition of unstable angina pectoris (53 and 78%, respectively) were substantially lower (Table 3). Only 208 out of the 420 cases with unstable angina pectoris (49.5%) were found in both registries, while 185 cases (44.0%) were registered only in either one of the registries. For 27 cases (6.4%), the time between the diagnoses was more than 6 months.

The most important reasons to be only registered in the CIS-based registry for cases with CHD, acute myocardial infarction, and unstable angina pectoris were as follows: a different discharge diagnosis in the HDR-based registry (68, 59, and 75%, respectively), mostly ICD-9 codes 413.90 (other and unspecified angina pectoris) and 414.00 (coronary atherosclerosis), a diagnosis based on outpatient files (18, 6, and 21%, respectively), or a hospital admission in another Dutch hospital (1, 3, and 0%, respectively), or in a foreign hospital (4, 9, and 1%, respectively). All cases with CHD, acute myocardial infarction and unstable angina pectoris that were registered only in the HDR-based registry had a diagnosis of another (11, 100, and 40%, respectively) or no cardiovascular disease (89, 0, and 60%, respectively) according to the CIS-based registry.

For the HDR-based definition of heart failure, the sensitivity and positive predictive value were 43 and 80%, respectively (Table 3). A positive match was found in 66 cases (41.8%), while 79 cases (50.0%) were registered only in either one of the registries. For 13 cases (8.2%), the time between the diagnoses was more than 6 months. The most important reasons for cases with heart failure to be registered only in the CIS-based registry were a diagnosis based on outpatient files (36%), a different discharge diagnosis in the HDR-based registry (36%), or a diagnosis of heart failure made during hospital admission for another cardiovascular disease, but not registered as discharge diagnosis (21%).

In the stratified analyses, slightly higher sensitivities were found in women compared with men, except for heart failure (data not shown). By contrast, the positive predictive value was slightly higher in men than in women. Furthermore, higher sensitivities were found in the age category <50 years compared with the age category ≥50 years, except for heart failure. For CHD and acute myocardial infarction, the positive predictive values were also higher in the age category <50 years, while these values were higher in the age category ≥50 years for unstable angina pectoris and heart failure. Both the sensitivities and positive predictive values were higher in the study period 1996–2003 than in the study period 1987–1995, except for the sensitivity of heart failure.

Discussion

In the comparison between the HDR-based and CIS-based registry, relatively high sensitivities and positive predictive values were found for CHD and acute myocardial infarction, while these values were considerably lower for unstable angina pectoris and heart failure. Furthermore, high percentages of the cases were only found in the CIS-based registry, varying from 14.2% for acute myocardial infarction to 47.5% for heart failure. These cases were missed or miscoded in the HDR-based registry. As a consequence, the incidence rates in the HDR-based registry were considerably lower than the incidence rates in the CIS-based registry, especially for unstable angina pectoris and heart failure.

Several reasons may have contributed to the differences found between the HDR-based and CIS-based registry. The diagnoses from CIS were abstracted and coded by trained registrars under guidance of a cardiologist (AG). Therefore, the diagnoses in the CIS-based registry are probably less susceptible to misclassification.

Furthermore, the CIS also contains information about visits to the outpatient clinic and emergency ward for heart problems. Cases diagnosed at these departments with cardiac diseases that do not warrant hospitalisation were still registered in the CIS-based registry. These cases were missed when only data from the HDR were used, leading to an underestimation of the incidence rates, especially for diagnoses that do not warrant hospitalisation, as can be seen from this study.

For the incidence estimates in this study, data were used from the University Hospital Maastricht (UHM). Because of the central and unique position of this hospital in the study region, the cardiologic follow-up is expected to be almost complete. Only a few cases will be missed, partly due to a diagnosis in another Dutch or in a foreign hospital. However, some of these cases may have visited the outpatient clinic of the UHM department of cardiology at a later stage, so that they were still registered in the CIS-based registry but not in the HDR-based registry. Nonetheless, cases diagnosed in another Dutch hospital would probably be found when data from the national HDR are used, while cases diagnosed in a foreign country would still be missed. In the Netherlands, however, record linkage to the national HDR is difficult, because of the limited number of identifying variables in this register.

Because the definition of CHD in the CIS-based registry was restricted to a clinical diagnosis of acute myocardial infarction, unstable angina pectoris, CABG or PTCA, we also narrowed the definition of CHD in the HDR-based registry to cases with ICD-9 codes 410, 411.1, and 413.1 as discharge diagnosis. When the definition in the HDR-based registry was extended to ICD-9 codes 410–414, the sensitivity increased from 72 to 85%, which can be explained by the larger range of ICD-9 codes used for specific CHD in the HDR. The positive predictive value, however, decreased from 91 to 85%, which can be explained by the inclusion of all ischemic heart diseases in the HDR-based registry, including stable angina pectoris and chronic CHD that were not included in the CIS-based registry.

The identification of incident cases in this study was partly performed by record linkage of the CAREMA cohort to the causes of death registry from Statistics Netherlands. Although we did not validate these cases, several studies in the Netherlands showed that the registration and coding of causes of death by Statistics Netherlands had a higher validity compared with other European countries [22, 23]. Because the cause of death registry was used in both the CIS-based and HDR-based registry for the identification of incident cases, this has led to an improvement of the comparison between these registries. Additional analyses, in which incident cases identified by linkage to Statistics Netherlands were excluded, showed small decreases in positive predictive values but considerable decreases in the sensitivities. This was due to a relatively large increase in the number of cases registered only in the CIS-based registry after exclusion. This means that the record linkage with the causes of death registry was especially favourable for the completeness of the HDR-based registry.

In this study, clinical diagnoses in CIS were used for the identification of cases instead of diagnostic criteria. However, 321 of the 417 cases (77.0%) with a clinical diagnosis of acute myocardial infarction in the CIS-based registry met the diagnostic criteria of the European Society of Cardiology and the American College of Cardiology [24]. The remaining 23.0% had incomplete data. Of the 321 cases that met the diagnostic criteria in the CIS-based registry, 291 cases (90.7%) were also registered with acute myocardial infarction in the HDR. Thus, even for a diagnosis of an acute myocardial infarction based on diagnostic criteria, a considerable part of the cases was not registered with this diagnosis in the HDR. During follow-up, more sensitive screening tests became available for the diagnosis of an acute myocardial infarction. Because of these tests, clinical decision-making may have changed during the follow-up period.

The estimates of the incidence rates from the HDR-based registry are comparable to those reported by Koek et al. [7] for the Netherlands as a whole. For comparison purposes, we calculated an expected incidence rate of acute myocardial infarction using the national incidence rates of Koek et al., age- and gender-standardized to the CAREMA cohort. This expected incidence rate was higher than the incidence rate in the HDR-based registry (201.9 and 183.4 per 100,000 person-years, respectively), which may be explained by a lower incidence rate in the study population [25], which is restricted to the Maastricht region, compared with the average Dutch population. Furthermore, regional differences in the coverage and validity of local hospital discharge registries may also explain this discrepancy. Conversely, the expected rate was lower than the incidence rate in the CIS-based registry (212.2 per 100,000 person-years).
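One common way to obtain such an expected rate is to apply the national stratum-specific rates to the cohort's own person-years per age and gender stratum. The sketch below illustrates that approach; whether the authors weighted by person-years or by numbers of participants is not stated, and the strata and figures in the usage example are hypothetical.

```python
def expected_rate_per_100k(national_rates, cohort_person_years):
    """Expected cohort rate when national stratum rates apply to the cohort's person-years.

    national_rates: {stratum: rate per 100,000 person-years}
    cohort_person_years: {stratum: person-years observed in the cohort}
    """
    expected_cases = sum(
        national_rates[stratum] / 100_000 * person_years
        for stratum, person_years in cohort_person_years.items()
    )
    return 100_000 * expected_cases / sum(cohort_person_years.values())


# Hypothetical strata and values, for illustration only.
rates = {("men", "40-49"): 200.0, ("women", "40-49"): 100.0}
person_years = {("men", "40-49"): 1000.0, ("women", "40-49"): 3000.0}
print(expected_rate_per_100k(rates, person_years))
```

The expected rate is a weighted average of the national rates, so a cohort that is younger or contains more women than the national population yields a lower expected rate than the crude national figure.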

Several studies have investigated the validity of hospital discharge and/or mortality data on acute myocardial infarction by comparing these data with specific study registers [8–15] or physician reviews [5, 26–28]. In these studies, a wide range of estimated values for the sensitivity and positive predictive value was found due to differences in case identification. However, most of the studies demonstrated that hospital discharge and/or mortality data underestimate the incidence of acute myocardial infarction in the population, as was found in our study.

Furthermore, three Finnish validation studies found higher sensitivities and positive predictive values in men compared with women [9, 12, 15]. In our study, the positive predictive value was also higher in men, while the sensitivity was higher in women. In the stratified analyses, we also found higher sensitivities and positive predictive values in the study period 1996–2003 compared with the period 1987–1995. This suggests that the validity of the HDR has improved over time.

Only two studies investigated the validity of hospital discharge data on heart failure using the definition of heart failure by the European Society of Cardiology [29]. Ingelsson et al. [16] found a positive predictive value of 82% which is slightly higher than the value of 80% found in our study. A considerably lower value of 65% was found in the study by Khand et al. [17]. However, Khand et al. used a broader range of ICD-codes (including ICD-10 codes I25.5 and I42.9) which are probably less sensitive for a definite diagnosis of heart failure.

In many of the above-mentioned studies, data from the WHO MONICA project were used [8, 9, 11–15]. This project is a multicenter study which monitors the incidence of myocardial infarction (MI) in several countries using study-specific MI registers. All events that occurred in the study population were registered according to previously defined diagnostic criteria. Although the registration in the CIS-based registry was based on a clinical diagnosis made by experienced cardiologists, a large number of cases with an acute myocardial infarction (77%) met the diagnostic criteria of the European Society of Cardiology and the American College of Cardiology as described earlier in this discussion. In addition, all registered cases with complete data fulfilled these diagnostic criteria. In the CIS-based registry, moreover, not only diagnoses of an acute myocardial infarction were registered but also diagnoses of silent myocardial infarctions, unstable and stable angina pectoris, and heart failure. However, the registrations in the CIS-based registry were only made for people living in the Maastricht region. In the MONICA project, the centers also did not have national coverage [8]. Therefore, the estimated incidence rates of both the CIS-based registry and the MI registers of the MONICA project may not be generalised to a national level.

Because cardiovascular disease registries are lacking in most countries, record linkage with hospital discharge and mortality data is often used to estimate the incidence rates of CHD and other cardiologic diseases. However, this study and previous studies have shown that a considerable part of the cases is missed or miscoded using hospital discharge data. Therefore, incidence rates based on these data may underestimate the true incidence rates, especially for unstable angina pectoris and heart failure.

Furthermore, an accurate identification of cases is even more important in etiological studies in which risk estimates are based on the comparison between cases and non-cases. Case identification based on hospital discharge and mortality data may lead to biases in the results of these studies. Therefore, these data should be used with caution in epidemiological studies, especially in etiological studies. Although the CIS-based registry has several advantages over HDRs, some events may still be missed. In etiological studies, it is important to keep in mind the potential weaknesses of such registries.