Interact CardioVasc Thorac Surg 2009;9:203-208. doi:10.1510/icvts.2008.199083 © 2009 European Association of Cardio-Thoracic Surgery
The first Latin-American risk stratification system for cardiac surgery: can be used as a graphic pocket-card score
a Instituto de Cardiología, Hospital Español de Buenos Aires, 2975 Belgrano Avenue, Ciudad autónoma de Buenos Aires, C1209AQK, Buenos Aires, Argentina Received 24 November 2008; received in revised form 27 April 2009; accepted 29 April 2009
*Corresponding author. Department of Cardiovascular Surgery, Instituto FLENI, 2325 Montañeses C1428AQK, Belgrano, Buenos Aires, Argentina. Tel.: +54 11-5777-3200; fax: +54 11-5777-3209.
This study aimed to develop the first Latin-American risk model that can be used as a simple, pocket-card graphic score at the bedside. The risk model was developed on 2903 patients who underwent cardiac surgery at the Spanish Hospital of Buenos Aires, Argentina, between June 1994 and December 1999. Internal validation was performed on 708 patients between January 2000 and June 2001 at the same center. External validation was performed on 1087 patients between February 2000 and January 2007 at three other centers in Argentina. In the development dataset the area under the receiver operating characteristic (ROC) curve was 0.73 and the Hosmer–Lemeshow (HL) test gave P=0.88. In the internal validation the area under the ROC curve was 0.77. In the external validation the area under the ROC curve was 0.81, but imperfect calibration was detected because the observed in-hospital mortality (3.96%) was significantly lower than in the development dataset (8.20%) (P<0.0001). Recalibration was done in 2007, showing an excellent level of agreement between the observed and predicted mortality rates in all patients (P=0.92). This is the first risk model for cardiac surgery developed in a Latin-American population with both internal and external validation. A simple graphic pocket-card score allows easy bedside application with acceptable statistical precision.
Key Words: Risk stratification; Cardiac surgery; Outcome
Accurate risk assessment has become an essential part of cardiac surgery practice worldwide: it offers an invaluable tool for quality assurance and makes the process of obtaining informed consent from patients more feasible and more ethical [1]. During the last two decades, several risk assessment models have been developed to predict the risk of mortality after cardiac surgery based on a patient's preoperative parameters, most of which were developed in North America and Europe [1–5]. However, we and others have recently shown that geographical differences in the risk profiles of patients, different surgical strategies, different types of cardiac surgery and different centers cause major variability in the performance of those risk stratification or assessment models [6–9]. Therefore, it is not surprising that risk stratification models perform worse when applied to patient groups other than the ones on which they were developed [8–10]. The aim of this study was to develop a local risk model in a Latin-American population, to validate its performance both internally and externally, and to compare it with established risk stratification models. We also aimed to provide a simple, user-friendly graphic pocket-card score for bedside use.
2.1. Data collection
The database at the Instituto de Cardiologia, Hospital Español de Buenos Aires, Argentina was established in June 1994 in line with, and based on, the Society of Thoracic Surgeons (STS) database [2]. A full list of established STS standard definitions for preoperative risk factors and complications can be accessed on the website http://www.sts.org. Clinical outcome was based on in-hospital mortality, defined as death during the initial hospitalization. Data were collected prospectively on all adult patients undergoing cardiac surgery, continuously monitored for missing and incorrect entries, and audited by the institution. Complete data were available for a total of 4698 patients, representing 100% of the patients identified in our database during the study period. Data on all adult patients who underwent cardiac surgical procedures and were registered in our database between June 1994 and January 2007 were included in the study. Cardiac surgical procedures included isolated coronary artery bypass grafting (CABG), any valve repair or replacement, valve surgery with CABG, thoracic aortic surgery, cardiac surgery with carotid endarterectomy, adult congenital cardiac surgery and heart transplantation. Patients who underwent implantation or explantation of ventricular assist devices as their primary surgery were excluded from the study. The analysis was run in three consecutive steps. The first step was a retrospective analysis of the data on 2903 patients who underwent cardiac surgery between June 1994 and December 1999 at the Instituto de Cardiologia, Hospital Español de Buenos Aires in order to develop the risk model (development dataset). The second step was an internal prospective validation performed on 708 patients operated on between January 2000 and June 2001 at the same institute (internal validation dataset).
The third step was an external prospective validation of the model on 1087 patients operated on at three other hospitals in Buenos Aires: Instituto FLENI, Clinica Suizo-Argentina and Sanatorio de la Trinidad between February 2000 and January 2007 (external validation dataset).
2.3. Model development and validation
Forty-nine preoperative variables were analyzed. Every risk factor was analyzed as a categorical variable. For continuous variables whose relationship with outcome was not linear, such as age, appropriate cut points were taken from the literature [1]. Univariate analyses were performed with the χ²-test or Fisher exact test. Continuous variables were expressed as mean±S.D. and categorical variables were expressed as percentages. A multivariate stepwise regression model was used to identify risk factors for in-hospital mortality. Factors included were those significant (P<0.05) by univariate analyses or those selected according to clinical importance criteria. Then, a standard logistic regression analysis was formulated using the development dataset. The original regression coefficients for each variable were used to calculate the patient-specific predicted probability of operative mortality according to the logistic regression equation. Model discrimination was assessed by the area under the receiver operating characteristic (ROC) curve [11] and model calibration was assessed by the Hosmer–Lemeshow (HL) test [12]. The reliability of risk score prediction was also evaluated by comparing the observed mortality rates with those predicted by the risk score in all patients and across quintiles of risk as previously suggested [1, 2, 7, 13] (Table 3). The difference between the mean observed mortality and the mean expected mortality was evaluated by paired t-test [14]. A value of P<0.05 was considered significant. Because mortality rates for the validation dataset were lower than rates in the development dataset (3.96% vs. 
8.20%), the 1999-original model was recalibrated for use on group validation [10, 13, 15]: a logistic regression equation for hospital mortality prediction was derived with the 1999-original model as the independent variable and in-hospital mortality in validation dataset as the dependent variable [14, 15]. Prognostic performance of our local score was compared with the European System for Cardiac Operative Risk Evaluation (EuroSCORE) and the Parsonnet score 2000-version (2000-Parsonnet score) [3–5]. Data analyses were performed using SPSS statistical software package, version 13.0.
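The two calculations described above, the patient-specific probability from the logistic regression equation and the recalibration of the 1999-original prediction, can be illustrated with a minimal Python sketch. It assumes that risk factors enter as 0/1 indicators and that recalibration is applied to the logit of the original prediction through an intercept `a` and slope `b`; the text does not state whether the original probability or its logit was used as the independent variable, so that choice, the function names and any numeric values passed in are illustrative assumptions, not the authors' fitted model.

```python
import math

def logit(p):
    """Log-odds of a probability p (0 < p < 1)."""
    return math.log(p / (1 - p))

def inv_logit(z):
    """Inverse of the logit: maps a linear predictor to a probability."""
    return 1 / (1 + math.exp(-z))

def predicted_mortality(intercept, coefficients, risk_factors):
    """Logistic regression equation; risk_factors are 0/1 indicators (assumed)."""
    z = intercept + sum(b * x for b, x in zip(coefficients, risk_factors))
    return inv_logit(z)

def recalibrate(p_original, a, b):
    """Recalibrated probability, assuming the form a + b * logit(p_original).

    a and b would be estimated by regressing observed in-hospital mortality
    in the validation dataset on the original prediction.
    """
    return inv_logit(a + b * logit(p_original))
```

With `a = 0` and `b = 1` the original prediction is returned unchanged; a negative `a` lowers every prediction, which is the direction needed when observed mortality in the validation dataset is below the predicted mortality.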
2.4. Development of graphic risk score
We aimed to develop a simplified graphic score to calculate the risk using a convenient pocket-sized card at the bedside. All coefficients were multiplied by 10 and rounded to the nearest half-integer, following empiric criteria for clinical significance. The total risk score is the sum of the point values assigned to each risk factor detected at the time of patient evaluation. A graph was plotted via a mathematical function to assess the relationship between the clinical approximate model and the original logistic regression model, patient by patient.
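The conversion from regression coefficients to card points can be sketched in Python as follows. The coefficient values and factor names below are hypothetical placeholders chosen for illustration; they are not the values reported in Table 2, and the half-up handling of ties is an assumption of this sketch.

```python
import math

def to_points(beta):
    """Multiply a coefficient by 10 and round to the nearest half-integer."""
    return math.floor(beta * 10 * 2 + 0.5) / 2

# Hypothetical coefficients for illustration only (not the values in Table 2).
example_betas = {"factor_a": 0.62, "factor_b": 0.88, "factor_c": 1.13}
points = {name: to_points(beta) for name, beta in example_betas.items()}

def total_score(present_factors):
    """Additive score: sum of the points of the risk factors that are present."""
    return sum(points[name] for name in present_factors)
```

Under these placeholder values, a patient presenting `factor_a` and `factor_c` would score 6.0 + 11.5 = 17.5 points, and that total is the value read against the graphic curve.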
Patient characteristics and respective mortality rates of the development and external validation datasets are summarized in Table 1. The development dataset consisted of 2903 patients with a mean age of 62.8±11.6 years; 26.5% were female and mortality was 8.2%. The internal validation dataset consisted of 708 patients with similar characteristics and a mortality of 8.3%. The external validation dataset consisted of 1087 patients with a mean age of 62.9±11.9 years; 23.3% were female and mortality was 3.96%. Multivariate analyses identified 18 risk factors related to in-hospital mortality. Risk factors, beta coefficients, odds ratios, 95% confidence intervals (CI), and weighted scores are listed in Table 2. The area under the ROC curve for the regression model was 0.73 (95% CI, 0.66–0.79); the HL test was non-significant (P=0.88).
The 18 risk factors were weighted for the definitive scoring system. The correlation coefficient between full multivariable model and the graphic score was 0.97. Based on this, we developed a graphic pocket-sized additive score card that can be used at bedside to calculate risk (Fig. 1).
The internal validation dataset showed reasonable discrimination, with an area under the ROC curve of 0.77 (95% CI, 0.74–0.80). The external validation dataset showed better discrimination, with an area under the ROC curve of 0.81 (95% CI, 0.75–0.87), but imperfect calibration due to significantly lower in-hospital mortality rates in the external validation dataset (Table 1). The graphic score demonstrated a strong predictive capacity, with an area under the ROC curve of 0.81, similar to the full multivariable model. To improve our graphic score, recalibration was done (Fig. 2). Observed and predicted mortality for the 1999-original and 2007-recalibrated models in five clinically relevant risk groups, spanning the spectrum of patient risk, are presented in Table 3. Discrimination of the 2007-recalibrated model was similar to that of the 1999-original model, with an area under the ROC curve of 0.81 (95% CI, 0.75–0.87) (Fig. 3). The HL test was non-significant (χ²=1.51, P=0.68) and an excellent level of agreement between the observed and predicted rates of mortality in all patients (P=0.92) was observed. This predictive power was maintained across the five quintiles of risk.
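For readers who want to see how an HL statistic of this kind relates to its P value, the following Python sketch computes the statistic from grouped observed and expected deaths and evaluates the upper tail of a chi-square distribution with 3 degrees of freedom. The degrees of freedom are an assumption here (five risk groups minus two); the article does not report them, and the grouped input format is likewise illustrative.

```python
import math

def hosmer_lemeshow(groups):
    """HL statistic from a list of (n_patients, observed_deaths, expected_deaths)."""
    stat = 0.0
    for n, observed, expected in groups:
        # Each risk group contributes one term for deaths and one for survivors.
        stat += (observed - expected) ** 2 / expected
        stat += ((n - observed) - (n - expected)) ** 2 / (n - expected)
    return stat

def chi2_sf_df3(x):
    """Upper-tail probability of a chi-square variable with 3 degrees of freedom."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)

p_value = chi2_sf_df3(1.51)  # statistic reported in the text
print(round(p_value, 2))     # 0.68
```

Under that assumption the reported statistic of 1.51 corresponds to P of about 0.68, which agrees with the value given in the text.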
The area under the ROC curve for EuroSCORE was 0.80 (95% CI, 0.74–0.86), similar to that of the graphic score (0.81; 95% CI, 0.75–0.87), whereas the 2000-Parsonnet score showed an area under the ROC curve of 0.70 (95% CI, 0.61–0.79) (Fig. 3). The EuroSCORE showed good calibration in all patients (P=0.12) but overestimated mortality in the lowest quintiles of risk (quintiles 1 and 2). Calibration of the 2000-Parsonnet score in all patients was poor (P=0.002) because it underestimated mortality in the highest quintile of risk (quintile 5) (Table 3).
This study develops the first Latin-American risk stratification system, validates it both internally and externally and creates a simple graphic score that can be used as a pocket-card at the bedside. Recent evidence showed that risk scoring systems perform worse when used in patient populations different from the ones on which they were developed [8–10]. This could be attributed to geographic differences in risk profile and to demographic and epidemiological variation in co-morbidity, lifestyle and socioeconomic factors between countries and continents. There are also considerable differences in surgical strategies, types of surgery, center volumes and economic resources. To our knowledge, this is the first attempt to develop a Latin-American risk stratification model and to validate it both internally and externally, the latter extending to eight years after development. The type of external validation that we performed was more stringent than randomly splitting the data into development and validation datasets [8, 13]. Furthermore, we developed a simple graphic score in the form of a pocket-sized card and validated its performance. In this graphic model, the sum of absolute values indicates a point on the graphic curve. This graphic view allows the patient, family and physicians to better comprehend the potential mortality risk of surgery based on the patient's preoperative parameters. Data used in the analyses were collected according to STS standards [2] and, thus, our model has the advantage of being developed on objective definitions. The avoidance of subjective definitions has proven essential to acceptable performance of risk models [7, 10]. We believe that the combination of a graphic methodology with objective definitions has made the graphic score both clinically reliable and attractive. In the external validation dataset, the 1999-original model showed good discrimination but poor calibration due to overestimation of in-hospital mortality. 
The recalibrated model showed significant improvement, an experience similar to that reported for prior models [10, 13, 15]. This model also showed excellent calibration in all patients, and its predictive power was maintained across the five quintiles of risk. When compared to EuroSCORE, the graphic score predicted mortality with a comparable area under the ROC curve of >0.80 in the external validation dataset. However, the 2000-Parsonnet score had an area under the ROC curve of 0.70. EuroSCORE showed good calibration in all patients but overestimated mortality in the lowest quintiles of risk. The 2000-Parsonnet score had poor calibration in all patients and underestimated mortality in the highest quintile of risk (Table 3). Although this model was developed in Argentina as a representative community of Latin-America, its external validation was limited to patients operated on in three centers in the same country. For the risk model to be more representative of the general Latin-American population, datasets from other countries should have been involved in the development or in the external validation. Nevertheless, this is the first risk model for cardiac surgery developed in a Latin-American population and validated internally, externally and temporally. The graphic score we developed is a simple pocket-card that allows easy bedside application with acceptable statistical precision.