
Development and validation of risk assessment models for diabetes-related complications based on the DCCT/EDIC data

Vincenzo Lagani (a,⁎), Franco Chiarugi (a), Shona Thomson (b), Jo Fursse (c), Edin Lakasing (c), Russell W. Jones (c), Ioannis Tsamardinos (a,d)

(a) Institute of Computer Science, Foundation for Research and Technology - Hellas, Heraklion, Greece

(b) Herts Valley Clinical Commission Group, Hertfordshire, United Kingdom

(c) Chorleywood Health Center, Chorleywood, United Kingdom

(d) Computer Science Department, University of Crete, Heraklion, Greece

Article info

Article history:
Received 28 November 2014
Received in revised form 10 February 2015
Accepted 1 March 2015
Available online xxxx

Keywords: Risk assessment models; Risk stratification; Risk factors; Diabetes complications; Risk model external validation

Abstract

Aim: To derive and validate a set of computational models able to assess the risk of developing complications and experiencing adverse events for patients with diabetes. The models are developed on data from the Diabetes Control and Complications Trial (DCCT) and the Epidemiology of Diabetes Interventions and Complications (EDIC) studies, and are validated on an external, retrospectively collected cohort.

Methods: We selected fifty-one clinical parameters measured at baseline during the DCCT as potential risk factors for the following adverse outcomes: Cardiovascular Diseases (CVD), Hypoglycemia, Ketoacidosis, Microalbuminuria, Proteinuria, Neuropathy and Retinopathy. For each outcome we applied a data-mining analysis protocol in order to identify the best-performing signature, i.e., the smallest set of clinical parameters that, considered jointly, are maximally predictive for the selected outcome. The predictive models built on the selected signatures underwent both an internal validation on the DCCT/EDIC data and an external validation on a retrospective cohort of 393 diabetes patients (49 Type I and 344 Type II) from the Chorleywood Medical Center, UK.

Results: The selected predictive signatures contain five to fifteen risk factors, depending on the specific outcome. Internal validation performances, as measured by the Concordance Index (CI), range from 0.62 to 0.83, indicating good predictive power. The models achieved comparable performances for the Type I and, quite surprisingly, the Type II external cohorts.

Conclusions: Data-mining analyses of the DCCT/EDIC data allow the identification of accurate predictive models for diabetes-related complications. We also present initial evidence that these models can be applied to a more recent, European population.

© 2015 Published by Elsevier Inc.

Journal of Diabetes and Its Complications xxx (2015) xxx–xxx
http://dx.doi.org/10.1016/j.jdiacomp.2015.03.001
1056-8727/© 2015 Published by Elsevier Inc.

Conflicts of interest: The authors declare that there are no conflicts of interest.

⁎ Corresponding author: N. Plastira 100, Vassilika Vouton, GR-700 13 Heraklion, Crete, Greece. Tel.: +30 2810 391070; fax: +30 2810 391428. E-mail address: vlagani@ics.forth.gr (V. Lagani).

1. Introduction

Computational models for assessing the risk of diabetes-related complications are becoming more and more prevalent in diabetes clinical research (Palmer, 2013). Risk assessment models can be defined as mathematical tools that evaluate the risk of experiencing an adverse outcome on the basis of the patient's clinical profile. These models are employed in clinical practice to assist clinicians in stratifying patients according to the severity of their condition and the possible evolution of their clinical trajectories. Moreover, devising risk assessment models usually leads to the identification of novel risk factors associated with a given complication. In turn, this knowledge potentially grants a better understanding of diabetes pathophysiology (Ajmera, Swat, Laibe, Le, & Chelliah, 2013).

We analyzed the information collected during the Diabetes Control and Complications Trial (DCCT) (The Diabetes Control and Complications Trial Research Group, 1993) and the Epidemiology of Diabetes Interventions and Complications study (EDIC) (Nathan et al., 2005) to derive risk assessment models for seven different diabetes-related complications and adverse events: Cardiovascular Diseases (CVD), Hypoglycemia, Ketoacidosis, Microalbuminuria, Proteinuria, Neuropathy and Retinopathy. In particular, for each complication we tried to identify the minimal set of clinical parameters that, considered jointly, are maximally predictive. Identifying such minimal sets of risk factors leads to models that are easier to interpret and that may provide insight into the mechanisms underlying the disease, while the discarded factors are either irrelevant or redundant given the selected ones. Borrowing a term commonly used in genomic research (Subramanian & Simon, 2010), we hereafter refer to such parsimonious, predictive sets of risk factors as predictive signatures.

During our analyses we employed a complex machine-learning protocol (Lagani & Tsamardinos, 2010) in order to simultaneously (a) identify the predictive signatures, (b) derive the best models over the selected signatures and (c) obtain an unbiased assessment of the performance of the models on the DCCT/EDIC data (internal validation). Moreover, we retrospectively collected data from 393 diabetes patients (49 Type I and 344 Type II) followed at the Chorleywood Medical Center (CHC), United Kingdom (UK), in the period 2004–2014. The models were evaluated on this external cohort in order to assess their transferability to a population whose characteristics differ from those of the cohort followed in the DCCT/EDIC study.

The results of the validation indicate that models trained on a USA/Canada cohort of diabetes patients enrolled in the 1980s can indeed transfer to a cohort of contemporary European patients. Transferability increases when the models are re-calibrated on the new data while conserving the original predictive signature. This suggests that, while the effect size of each risk factor may change over time and across geographical areas, factors that were highly predictive in the 1980s can still help clinicians correctly stratify diabetes patients according to their risk.

2. Research design and methods

2.1. DCCT/EDIC data

The DCCT design has been described elsewhere (The Diabetes Control and Complications Trial Research Group, 1993). Briefly, 1441 Type I diabetes patients (13 to 39 years of age) were enrolled in the study from 1983 to 1989 and followed, on average, for 6.5 years. The study was designed as a randomized controlled trial, with patients randomly assigned to conventional or intensive insulin therapy. Two distinct cohorts were enrolled: the primary intervention cohort was composed of patients with albumin excretion rate ≤ 40 mg/24 h, no retinopathy and a diabetes duration of 1 to 5 years, while the secondary intervention cohort comprised subjects with a longer history of diabetes (1 to 15 years), mild to moderate non-proliferative diabetic retinopathy, and albumin excretion rate ≤ 200 mg/24 h. An exhaustive clinical examination was performed at baseline (including medical history, physical examination, electrocardiogram, and laboratory analyses), while patients' conditions and risk factors were re-assessed annually, with glycosylated hemoglobin measured quarterly (The DCCT Research Group, 1987).

In 1994, 1394 of the original 1441 DCCT patients (97%) agreed to participate in a long-term follow-up, the EDIC study, whose main objective was to collect prospective data on the evolution of macrovascular and microvascular complications (Epidemiology of Diabetes Interventions and Complications (EDIC) Research Group, 1999). The EDIC followed the same methods as the DCCT, with only minor modifications in the schedule of the measurements of glycosylated hemoglobin (measured annually) and of fasting lipid levels and renal function (re-assessed every two years).

For our analyses we selected fifty-one clinical parameters measured at DCCT baseline (see Table 1 in the Supplementary Material). These clinical parameters were selected by a panel of clinical practitioners as the ones commonly used to date in the treatment of diabetes. The remaining parameters were either measured solely during the DCCT for research purposes or are no longer employed in clinical practice. This selection was performed in order to align our results with the medical procedures followed in modern clinical settings.

2.2. Outcomes definition

We have defined seven different outcomes, each one corresponding to a severe diabetes-related complication or adverse event. Several studies (Nathan et al., 2005; The Diabetes Control and Complications Trial Research Group, 1993, 1995a, 1995b, 1995c, 1995d, 1997) have defined and studied similar diabetes-related complications on the DCCT/EDIC data. Whenever possible, we have adopted the definitions suggested by these previous works.

2.2.1. Cardiovascular disease (CVD)

Following the work presented in (Nathan et al., 2005), we define CVD as the first occurrence of any of the following events: Cardiovascular death, Acute Myocardial Infarction, Bypass graft/Angioplasty, Angina Pectoris, Cardiac Arrhythmia, Major ECG abnormality, Silent Myocardial Infarction, Congestive Heart Failure, Transient Ischemic Attack, Arterial Event requiring surgery.

The relatively young age of the subjects included in the DCCT study led to a particularly low incidence of CVD events: only twenty-eight subjects (1.94%) experienced any macro or microvascular complications. One of the main objectives of the EDIC study was to record and study the incidence of CVD complications in the DCCT cohort after the end of the DCCT follow-up. We decided to define two distinct outcomes for cardiovascular diseases: the first one, hereafter named CVD-DCCT, takes into consideration the DCCT follow-up and includes only the CVD events that occurred during the DCCT study; the second outcome, namely CVD-EDIC, considers the combined follow-up period of both DCCT and EDIC and includes the CVD events that occurred in both studies.

2.2.2. Hypoglycemia and ketoacidosis

The Hypoglycemia and Ketoacidosis outcomes were defined as any serious hypoglycemic and ketoacidosis event, respectively, requiring hospitalization, as reported by the patients at each quarterly visit.

2.2.3. Microalbuminuria and proteinuria

Microalbuminuria was defined as an albumin/creatinine ratio (ACR) greater than or equal to 2.5 mg/mmol (men) or 3.5 mg/mmol (women) (The National Collaborating Centre for Chronic Conditions, 2008), or an albumin concentration greater than or equal to 20 mg/l, while Proteinuria was identified by an albumin/creatinine ratio greater than or equal to 30 mg/mmol or an albumin concentration greater than or equal to 200 mg/l.
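The thresholds above can be written down as two small predicates. This is only an illustrative sketch: the function names, the argument names and the `"male"`/`"female"` labels are assumptions introduced here, and missing measurements are represented by `None`.

```python
from typing import Optional


def _at_least(value: Optional[float], threshold: float) -> bool:
    """True when a measurement is available and reaches the threshold."""
    return value is not None and value >= threshold


def is_microalbuminuria(acr_mg_mmol: Optional[float],
                        albumin_mg_l: Optional[float],
                        sex: str) -> bool:
    """ACR >= 2.5 mg/mmol (men) or 3.5 mg/mmol (women), or albumin >= 20 mg/l."""
    acr_threshold = 2.5 if sex == "male" else 3.5  # sex labels are hypothetical
    return _at_least(acr_mg_mmol, acr_threshold) or _at_least(albumin_mg_l, 20.0)


def is_proteinuria(acr_mg_mmol: Optional[float],
                   albumin_mg_l: Optional[float]) -> bool:
    """ACR >= 30 mg/mmol, or albumin concentration >= 200 mg/l."""
    return _at_least(acr_mg_mmol, 30.0) or _at_least(albumin_mg_l, 200.0)
```

Read literally, any measurement that satisfies the Proteinuria definition also satisfies the Microalbuminuria one, so the two predicates are nested rather than mutually exclusive.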

2.2.4. Neuropathy

The Neuropathy outcome was defined as the presence of abnormalities in the autonomic function. During the DCCT, Neuropathy was diagnosed on the basis of "physical examination and history confirmed by unequivocal abnormality of either nerve conduction or autonomic nervous system" (The Diabetes Control and Complications Trial Research Group, 1995d). In the CHC validation cohort we used an alternative definition based on the presence of bowel/bladder dysfunction or erectile dysfunction.

2.2.5. Retinopathy

The presence and severity of retinopathy were assessed in the DCCT study according to a scale derived from the Early Treatment Diabetic Retinopathy Study Scale (ETDRS) (see Tables 1–2 in The Diabetes Control and Complications Trial Research Group, 1995e). Currently, the UK Retinopathy Severity (UKRS) scale (The Royal College of Ophthalmologists, 2012) is usually employed in clinical practice in the UK. We translated the DCCT–ETDRS measurements into UKRS values, according to the conversion schema reported in Table 1.1 of the Diabetic Retinopathy Guidelines (The Royal College of Ophthalmologists, 2012) (see also Table 3 in the Supplementary Material). After the conversion, we adopted an approach similar to (The Diabetes Control and Complications Trial Research Group, 1995e) and defined a "retinopathy event" as any worsening in the retina condition that lasted at least six months.

2.3. Derivation of the computational models and internal validation

The goals of our analyses are (a) identifying the best predictive signature for each outcome, (b) fitting a computational risk-assessment model over each signature and (c) assessing the predictive performance of these models. The presence of censoring in the DCCT/EDIC data requires the adoption of specialized methods for achieving these goals. "Censoring" in this context means that the information about the outcome can be partial; in particular, the data used in this work are affected by right-censoring, i.e., for some subjects the exact time-to-event is not known, and the only available information is that they were event-free up to a given point (the follow-up time).

More formally, the baseline visit of the DCCT data can be represented as a dataset $D$ containing $m = 1441$ diabetes patients, where each patient is represented as a vector of measurements $x_i$ defined over a set of $n = 51$ risk factors $X = \{X_1, \ldots, X_j, \ldots, X_n\}$. Each outcome $k$ is represented by a set of pairs $O_k = \{(t_i, \delta_i)\}$, where $\delta_i$ is a binary variable indicating whether subject $i$ experienced the specific event ($\delta_i = 1$) or not ($\delta_i = 0$), while $t_i$ is the recorded time-to-event or follow-up time. The best signature and predictive model for each outcome are denoted $X_k^* \subseteq X$ and $M_k^*$, respectively.

Survival Max–Min Parents and Children (SMMPC; Lagani & Tsamardinos, 2010), Lasso Cox Regression (Tibshirani, 1997), Bayesian Variable Selection (BVS; Faraggi & Simon, 1998), and Forward and Univariate Selection (Bøvelstad et al., 2007) were employed as feature selection methods for identifying the best performing signatures. These feature selection methods are based on different theoretical foundations and assumptions; however, they all attempt to identify a set $X^* \subseteq X$ that is highly predictive with respect to the outcome. Notably, while all methods try to keep $X^*$ parsimonious, only SMMPC provides theoretical guarantees about retrieving a minimal-size $X^*$ (Tsamardinos, Brown, & Aliferis, 2006).

Once a signature $X^*$ is identified, predictive models can be fitted over it. Cox regression (Cox, 1972), Ridge Cox regression (Van Houwelingen, Bruinsma, Hart, Van't Veer, & Wessels, 2006), Accelerated Failure Time (AFT) models (Kalbfleisch & Prentice, 1980), Random Survival Forest (RSF; Ishwaran, Kogalur, Eugene, & Blackstone, 2008) and Support Vector Machine Censored Regression (SVCR; Shivaswamy, Chu, & Jansche, 2007) were employed as regression algorithms for model fitting. All regression methods provide models that are able to calculate a single-point risk estimate for any new subject $x_{m+1}$, in the form $r_{m+1,k} = M_k^*(x_{m+1})$. These estimates can then be used for ranking patients according to their relative risk. In particular, for (Ridge) Cox Regression and AFT models the risk estimates are given by $r_i = \sum_j \beta_j x_{ij}$, where $\beta_j$ is the coefficient of risk factor $X_j$ provided by the regression procedure. SVCR and RSF predictions are given by weighted combinations of kernel-function products and of single survival-tree predictions, respectively.
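For the linear models, the risk estimate and the resulting patient ranking can be sketched in a few lines. The helper names and any coefficient values passed to them are hypothetical and serve only to illustrate the formula $r_i = \sum_j \beta_j x_{ij}$; they are not the fitted coefficients of any model in this paper.

```python
def linear_risk(beta, x):
    """Single-point risk estimate r_i = sum_j beta_j * x_ij for one patient."""
    return sum(b * v for b, v in zip(beta, x))


def rank_by_risk(beta, patients):
    """Indices of the patients sorted from highest to lowest estimated risk."""
    scores = [linear_risk(beta, x) for x in patients]
    return sorted(range(len(patients)), key=lambda i: scores[i], reverse=True)
```

Only the ordering of the scores matters for stratification, which is why performance is later measured with a rank-based metric.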

Each of these feature selection and regression algorithms requires the user to provide one or more "hyper-parameters", i.e., parameters that are not directly estimated from the data and that must be specified a priori. For example, the hyper-parameter $\lambda$ in the Lasso Cox Regression regulates the level of shrinkage of the coefficients and, indirectly, the number of variables to be included in the regression model. SVCR models require the specification of an appropriate kernel function and cost parameter $C$. The hyper-parameters used for each method are listed in the Supplementary Material.

We employed a complex experimentation protocol in order to (a) find for each outcome the best combination of feature selection and regression algorithms, along with their respective optimal hyper-parameters (model selection) and (b) provide an unbiased assessment of the predictive performance of the selected model (internal validation/performance estimation).

Model selection was performed through cross validation. In cross validation, the data are partitioned into $N$ separate folds, and each fold is in turn held out for performance estimation (test set) while the rest of the data (training set) is employed for deriving predictive models. When $N$ is equal to the number of samples, the procedure is named leave-one-out. The configuration that obtains the best average performance over the $N$ folds is then applied to the whole set of data, in order to obtain the final predictive signature $X^*$ and the corresponding model $M^*$.

The predictive performance of the final models was assessed through nested cross validation (Statnikov, Aliferis, Tsamardinos, Hardin, & Levy, 2005). Nested cross validation is an extension of the common cross validation procedure, where an inner loop of cross validation is performed within each training set. The inner loop serves for selecting the best combination of algorithms and hyper-parameters, while the $N$ test sets of the outer cross validation are used exclusively for performance estimation. The procedure provides a vector $P = \{P_1, \ldots, P_N\}$ of estimated performances, whose average value $\bar{P}$ is typically taken as a single-point estimate. Notably, nested cross validation estimates are usually conservative (Tsamardinos, Lagani, & Rakhshani, 2014). Figs. 1 and 2 in the Supplementary Material provide a visual representation of both procedures.

All performances are measured in terms of the Concordance Index (CI; Uno, Cai, Pencina, D'Agostino, & Wei, 2011). The CI metric is specific to right-censored survival data, and it can be interpreted as the probability that the model will correctly rank two randomly selected subjects according to their actual risk of experiencing a given event. Similarly to the Area Under the Receiver Operating Characteristic Curve (AUC; Fawcett, 2006) for binary classification problems, a CI equal to one indicates a perfect ranking in terms of relative risk, while a value of 0.5 indicates a random ordering.
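To make the pairwise interpretation of the CI concrete, the sketch below counts concordant pairs on right-censored data. It is a simple Harrell-type estimator used here as a stand-in; it is not necessarily the estimator of Uno et al. (2011) cited above, which additionally re-weights pairs to account for the censoring distribution. The function name and the handling of tied risks (counted as 0.5) are assumptions.

```python
def concordance_index(times, events, risks):
    """Fraction of comparable pairs in which the higher-risk subject fails first.

    A pair (i, j) is comparable when subject i experienced the event and
    t_i < t_j; subject j may have had a later event or been censored later.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A constant risk score yields 0.5 under this convention, which matches the "random ordering" reference value mentioned in the text.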

In both nested and standard cross validation the variables of each training set are standardized to have zero mean and unit standard deviation. Test sets are standardized according to the mean and standard deviation of the corresponding training set. Moreover, categorical variables are transformed into sets of binary variables, one binary variable for each category. In this way the feature selection methods are free to include in each model only the categories that are relevant for the outcome at hand.
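The interplay between nested cross validation and training-set standardization can be sketched as follows. This is a generic skeleton under stated assumptions, not the protocol implementation used in this work: `fit` and `score` are caller-supplied callables (hypothetical signatures `fit(X_train, y_train, config)` and `score(model, X_test, y_test)`), folds are interleaved rather than randomized, and the population standard deviation is used for scaling.

```python
import statistics


def k_folds(n, k):
    """Split positions 0..n-1 into k interleaved folds."""
    return [list(range(f, n, k)) for f in range(k)]


def standardize(train_rows, test_rows):
    """Scale both sets with the mean and standard deviation of the training set only."""
    cols = list(zip(*train_rows))
    means = [statistics.fmean(c) for c in cols]
    sds = [statistics.pstdev(c) or 1.0 for c in cols]  # guard against constant columns

    def scale(rows):
        return [[(v - m) / s for v, m, s in zip(r, means, sds)] for r in rows]

    return scale(train_rows), scale(test_rows)


def cross_validate(X, y, idx, config, fit, score, k):
    """Average score of one configuration over k folds of the rows listed in idx."""
    scores = []
    for fold in k_folds(len(idx), k):
        test = [idx[p] for p in fold]
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        X_tr, X_te = standardize([X[i] for i in train], [X[i] for i in test])
        model = fit(X_tr, [y[i] for i in train], config)
        scores.append(score(model, X_te, [y[i] for i in test]))
    return statistics.fmean(scores)


def nested_cv(X, y, configs, fit, score, k_outer=5, k_inner=3):
    """Outer folds estimate performance; inner folds select the configuration."""
    all_idx = list(range(len(X)))
    outer_scores = []
    for fold in k_folds(len(X), k_outer):
        held_out = set(fold)
        train = [i for i in all_idx if i not in held_out]
        # Model selection sees the outer training set only.
        best = max(configs,
                   key=lambda c: cross_validate(X, y, train, c, fit, score, k_inner))
        X_tr, X_te = standardize([X[i] for i in train], [X[i] for i in fold])
        model = fit(X_tr, [y[i] for i in train], best)
        outer_scores.append(score(model, X_te, [y[i] for i in fold]))
    return outer_scores
```

The returned list plays the role of the vector of per-fold performances, and its mean is the single-point estimate. Because the outer test fold never takes part in either standardization or configuration selection, the estimate is not contaminated by the model-selection step.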

2.4. External validation

Validation data were retrospectively collected from 393 diabetes patients who were admitted to the CHC premises between 2004 and 2014. Forty-nine patients (12.5%) had Type I diabetes, while the remaining ones were diagnosed with Type II diabetes. For each patient and for each outcome we considered the first visit where the risk factors included in the corresponding predictive signature were measured. Patients who had already developed a specific complication at the time of the first visit were not employed for the validation of the respective predictive model. Missing values were replaced with the average or mode values of the respective predictors, as calculated on the DCCT baseline data. The data collection procedure produced seven distinct datasets, one per outcome, with a number of included subjects ranging between 274 and 343 and with an average follow-up between 37.6 and 69.4 months. Table 2 in the Supplementary Material describes the distribution of the validation data and compares it with the DCCT cohort.
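The imputation rule described above, in which fill-in values come from the development data rather than from the validation cohort, can be sketched as follows. The dictionary-based row format and the variable names used in the usage note are assumptions made for illustration.

```python
import statistics


def reference_fill_values(reference_rows, categorical):
    """Per-variable fill value computed on the reference (development) cohort:
    the mode for categorical variables, the mean for numerical ones."""
    fill = {}
    for name in reference_rows[0]:
        observed = [row[name] for row in reference_rows if row[name] is not None]
        if name in categorical:
            fill[name] = statistics.mode(observed)
        else:
            fill[name] = statistics.fmean(observed)
    return fill


def impute(rows, fill):
    """Replace missing entries (None) in the validation rows with reference values."""
    return [{k: (fill[k] if v is None else v) for k, v in row.items()} for row in rows]
```

For example, with hypothetical reference rows containing an `"hba1c"` column with values 8.0 and 10.0, a validation patient with a missing `"hba1c"` would receive 9.0, regardless of the values observed in the validation cohort itself.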

3. Results

3.1. The predictive signatures and their interplay

The final risk assessment models are reported in Table 1. Each model is composed of a number of risk factors ranging from five to ten, for a total of twenty-five risk factors included in at least one model. The regression algorithm chosen by the model selection procedure varies across outcomes: Ridge Cox Regression for the CVD-DCCT, Ketoacidosis and Proteinuria outcomes; Accelerated Failure Time models for CVD-EDIC, Neuropathy and Retinopathy; and linear-kernel Support Vector Machines and Random Survival Forest for Hypoglycemia and Microalbuminuria, respectively. The corresponding optimal feature selection methods are reported in Supplementary Table 4.

Table 1
Risk assessment models.

Models (columns): CVD-DCCT (Cox Regression); CVD-EDIC (Accelerated Failure Model); Hypoglycemia (Support Vector Machine); Ketoacidosis (Ridge Cox Regression); Microalbuminuria (Random Survival Forest); Proteinuria (Ridge Cox Regression); Neuropathy (Accelerated Failure Model); Retinopathy (Accelerated Failure Model); # Models.

Clinical parameters (rows), with coefficients and categories where available:

Marital Status: −0.146 (Never Married), 0.095 (Divorced), −80.667 (Married), 5.24E-006 (Widowed), |0.043| (Married), −0.26 (Married). # Models: 5
Albumin-urine value (mg/24 h)
Insulin Regime (Strict/Standard control): 380.5 (Strict), |0.054| (Strict), −0.036 (Strict). # Models: 3
Retinopathy level (R0, R1, R2, R3): |0.091| (R2), 16.113 (R2), 1.437 (R0), −0.931 (R2). # Models: 3
Total Insulin Daily Dosage (Units/Weight)
Post Pubescent diabetes duration (in months)
Total diabetes duration (in months)
Presence of neuropathy
Patient's occupation: 0.056 (Manager), 0.074 (Clerical), 0.031 (Laborer), −0.088 (Student), −2.61E-005 (Manager). # Models: 2
Smoke (never/ex-smoker/current): −0.114 (Never), 0.128 (Current), −431.822 (ex-smoker). # Models: 2
Patient's body mass index (kg/m²): 0.096. # Models: 1
Patient attempted suicide
Creatinine Clearance (ml/min)
Family history of IDDM
Family History of NIDDM
HDL serum cholesterol (mg/dl)
Systolic Blood Pressure
Past history of severe hypoglycemia
Glomerular filtration rate (ml/min)
Gender specific ideal body weight
Hospitalization(s) due to ketoacidosis in past year

Each row represents a risk factor, while each column reports a single model. The header shows the outcome of interest for each model along with the regression algorithm selected by the model-selection procedure (see the Methods section). Cells report model coefficients, with empty cells indicating risk factors not included in the corresponding model. Categorical risk factors can have multiple coefficients, one for each category included in the model. The semantics of the coefficients depends on the regression algorithm used: log-hazard ratio for Ridge Cox Regression, survival time multipliers for Accelerated Failure Time models and (linear kernel) Support Vector Machines, relative variable importance for Random Survival Forest (see text for more details). The signs of the original AFT and SVCR coefficients have been switched so that positive values indicate an increase in risk in all models. Microalbuminuria coefficients are reported as absolute values whose signs do not reflect an increase or decrease of the risk.

Each regression algorithm produces coefficients with a specific interpretation. In particular, Ridge Cox Regression coefficients represent a change of the hazard ratio on the logarithmic scale. This means that for an increase of one standard deviation (i.e., 1.594%, DCCT scale) in glycated hemoglobin (HbA1c) the hazard of a CVD complication becomes $e^{0.204} \approx 1.23$ times higher. AFT and linear-kernel SVCR coefficients act as linear multipliers for the expected time to event. This means that for the same increase in HbA1c the expected time before developing Neuropathy decreases by 4.812 months. RSF usually provides highly non-linear models, where the effect of each single risk factor can vary depending on the values of the other predictors. Consequently, covariates in an RSF model do not have a single, unambiguous coefficient, i.e., it is not generally possible to assess whether a factor has a protective or deleterious effect. However, a method has been developed for estimating Variable IMPortance (VIMP) in RSF models, where the VIMP is proportional to the contribution of the variable to the predictive performance of the model (Ishwaran, 2007). The VIMP values for the Microalbuminuria model in Table 1 have been scaled so that they sum to one, for ease of comparison.
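The two coefficient conventions discussed in this section, log-hazard coefficients and importances rescaled to sum to one, reduce to simple arithmetic. The sketch below reproduces the HbA1c example from the text; the helper names and the importance values in the usage note are hypothetical.

```python
import math


def hazard_ratio(log_hazard_coefficient, delta=1.0):
    """Multiplicative change in hazard for an increase of `delta` units
    (here: standard deviations) in a Cox-type risk factor."""
    return math.exp(log_hazard_coefficient * delta)


def normalize_importances(vimp):
    """Rescale absolute importance values so that they sum to one."""
    total = sum(abs(v) for v in vimp.values())
    return {name: abs(v) / total for name, v in vimp.items()}
```

With the coefficient quoted above, `hazard_ratio(0.204)` is about 1.23 for a one standard deviation increase in HbA1c. Calling `normalize_importances({"a": 2.0, "b": 6.0})` on two made-up importances returns 0.25 and 0.75.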

Given these different interpretations, it is not possible to compare effect sizes across different models. However, within each model the absolute value of each coefficient is directly proportional to the effect size of the corresponding predictor, and can be used for ranking the factors against each other. We further set the signs of all coefficients such that positive values indicate an increase in risk while negative values indicate a decrease (except for the VIMP values of the RSF model, which are reported in absolute value).

3.2. Internal and external validation

Table 2 reports and contrasts the results of both internal and external validation. For the internal validation, we report the average CI values obtained on the DCCT data through the nested cross validation procedure. These values represent our expectations of the performance that the models should achieve when applied to a validation cohort coming from the same population as the training data, i.e., a hypothetical validation cohort collected in the same years, in the same geographical area and with characteristics similar to those of the DCCT data (Tsamardinos et al., 2014). The models' average CI values lie in the range [0.6204–0.8330], meaning that we expect all models to provide a relevant improvement with respect to a random ranking of the patients (CI = 0.5). For all models, the CI is statistically significantly greater than 0.5 (p-value ≤ 0.001, as calculated with a one-tailed t-test). For each model, we also report the interval spanned by the CI values calculated over the outer folds of the nested cross validation procedure.

For the external validation, the final models were separately applied to the Type I and Type II diabetes patients of the Chorleywood cohort. The resulting CI values estimate the predictive ability of the models on a UK-based population collected in recent times. Interestingly, several models perform surprisingly well, reaching performances that are statistically significantly better than random guessing. For each model we also report the bootstrapped estimates (Efron & Tibshirani, 1986) of the 95% confidence interval and a permutation-based p-value assessing the null hypothesis H0: CI ≤ 0.5. These permutation p-values are obtained by comparing the observed CI value with the null distribution obtained by randomly permuting the order of the predictions 10,000 times.
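The permutation test and the bootstrap interval described above can be sketched as follows. Several details are assumptions made for this sketch rather than facts from the text: the Harrell-type concordance estimator, the percentile form of the bootstrap interval, the "+1" correction in the p-value, and the reduced default number of resamples (the text uses 10,000 permutations).

```python
import random


def c_index(times, events, risks):
    """Harrell-type concordance on right-censored data (NaN if no comparable pair)."""
    num, den = 0.0, 0
    for i in range(len(times)):
        if not events[i]:
            continue
        for j in range(len(times)):
            if times[i] < times[j]:
                den += 1
                if risks[i] > risks[j]:
                    num += 1.0
                elif risks[i] == risks[j]:
                    num += 0.5
    return num / den if den else float("nan")


def permutation_p_value(times, events, risks, n_perm=1000, seed=0):
    """One-sided p-value for H0: CI <= 0.5, obtained by shuffling the predictions."""
    rng = random.Random(seed)
    observed = c_index(times, events, risks)
    shuffled = list(risks)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if c_index(times, events, shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)  # the +1 correction is an assumption


def bootstrap_interval(times, events, risks, n_boot=1000, level=0.95, seed=0):
    """Percentile bootstrap interval for the concordance index."""
    rng = random.Random(seed)
    n = len(times)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        value = c_index([times[i] for i in idx],
                        [events[i] for i in idx],
                        [risks[i] for i in idx])
        if value == value:  # skip resamples without comparable pairs (NaN)
            stats.append(value)
    stats.sort()
    lower = stats[int((1 - level) / 2 * (len(stats) - 1))]
    upper = stats[int((1 + level) / 2 * (len(stats) - 1))]
    return lower, upper
```

With very few events, as in some of the external cohorts, many bootstrap resamples contain almost no comparable pairs, which is one reason the corresponding intervals in Table 2 are so wide.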

For Type I diabetes, several models manage to achieve a relevant and statistically significant predictive performance, particularly the Microalbuminuria, Neuropathy and Retinopathy models. The CVD-DCCT and Hypoglycemia models are also borderline significant. The validation cohorts for the remaining models contain fewer than 5 events, and the respective results should be interpreted with caution.

The external validation on Type II patients brought positive results as well. In particular, both CVD models, as well as the Microalbuminuria and Proteinuria models, achieve statistically significant results on a relatively large number of events. The results of the Hypoglycemia model are barely significant, but it is interesting to note that this model achieves almost identical results in both the Type I and Type II external cohorts. The Neuropathy and Retinopathy models did not prove to be better than random, and the Ketoacidosis model was not applicable to Type II diabetes patients.

3.3 Calibration and re-assessment of the risk models Risk factors’ effect on the probability of developing diabetes-related complications may differ across geographical areas or over time, due to several reasons For example, the association between a given risk factor and the outcome may be (partially) mediated by a third, unknown and unmeasured quantity If the value of this third quantity changes across different places, or over time, then also the association between the risk factor and the outcome changes or even ceases It is worthwhile to underline that the DCCT and Chorleywood cohorts were collected in different countries, and the DCCT data collection started in 1983, while the earliest recorded visit in Chorley-wood was performed in 2004 (N20 years difference) Moreover, treatment options for diabetes patients (Franz et al., 2003; Gallen,

2004) and nutritional habits (Kuklina, Carrol, Shaw, & Hirsch, 2013) have provably undergone considerable changes during this period This implies that the models derived from the DCCT data may need

to be re-calibrated or revised in order to provide accurate predictions

on the Chorleywood cohorts, since the effects of the risk factors may differ between the two populations

We follow the approach suggested byVan Houwelingen (2000)for assessing the calibration of the single-point risk estimates r against a known outcome O = {(δi, ti)} The approach consists infitting a Cox regression model h(t|r) = h0(t)exp(α ⋅ r), where h(t|r) is the hazard

at time t given r, h0is the baseline hazard function, andα is the single coefficient of the model A perfectly calibrated model would produce

Table 2

Results of the internal and external validation of the models.

Type I Diabetes

Internal Validation

Type I Diabetes External Validation

Type II Diabetes External Validation Model name # Events Average

CI Cross-Validation

CI Interval

p-value H0: Aver.

CI ≤ 0.5

# Events CI CI 95%

Confidence Interval

p-value H0: CI

≤ 0.5

# Events CI CI 95%

Confidence Interval

p-value H0: CI

≤ 0.5 CVD-DCCT 28 0.7257 [0.50962–0.8629] 0.0001 5 0.6887 [0.4923–0.86207] 0.0932 32 0.7143 [0.62384–0.80563] b0.0001 CVD-EDIC 127 0.6204 [0.5549–0.69224] ≤0.0001 5 0.4862 [0.18084–0.81984] 0.5246 33 0.6099 [0.50211–0.71809] 0.0165 Hypoglycemia 408 0.6694 [0.58766–0.75118] ≤0.0001 8 0.6903 [0.5–0.8691] 0.0584 5 0.7002 [0.19012–0.97115] 0.0084 Ketoacidosis 130 0.6745 [0.59412–0.75479] ≤0.0001 3 0.8182 [0.23077–1] 0.0367 – – – Microalbuminuria 299 0.7421 [0.6751–0.77652] ≤0.0001 6 0.824 [0.66234–0.96875] 0.0078 116 0.5701 [0.52144–0.62193] 0.0058 Proteinuria 44 0.8330 [0.53521–0.96223] ≤0.0001 0 – – – 28 0.6569 [0.53261–0.77125] 0.0027 Neuropathy 149 0.6661 [0.54626–0.74187] ≤0.0001 6 0.735 [0.55102–0.90754] 0.0429 20 0.4359 [0.32132–0.56216] 0.8239 Retinopathy 969 0.6564 [0.60826–0.6745] ≤0.0001 17 0.7201 [0.58669–0.8745] 0.0025 70 0.5451 [0.47399–0.6189] 0.119 External validation was separately performed on Type I and Type II diabetes patients, while internal validation was performed only on Type I patients (as the DCCT study focused exclusively on Type I diabetes) For the internal validation and for each model (rows) we report the total number of events, the predictive performance expressed as nested-cross validated Concordance Index (CI), the interval spanned by the CI values obtained in the external loop of the nested-cross validation, and a p-value assessing the null-hypothesis that the CI is less or equal than 0.5, i.e., that the risk stratification provided by the model is not better than random For the external validations we report the CI values obtained by applying the final models on the external cohorts, along with the 95% confidence interval estimated through bootstrapping The p-values for the internal evaluation are calculated through one-tail t-test, while for the external evaluation they are obtained through a permutation-based test (see text for more detail).


α ≈ 1, while higher or lower values would indicate an under-estimation or over-estimation of the actual risk, respectively. Table 3 shows the calibration Cox regression coefficients for each outcome. The best calibrated models appear to be the ones corresponding to CVD-DCCT, Proteinuria and Retinopathy (the latter on the Type I cohort only), while all the other models provide predictions that are somewhat overly optimistic or pessimistic. These results suggest that the models should be revised and re-evaluated on the new data in order to provide more accurate predictions. We thus decided to re-fit the coefficients of the models on the external cohorts and to assess the predictive performance of the revised models through cross-validation. Specifically, for each outcome and external cohort we performed a ten-fold cross-validation using the same signature, regression method and hyper-parameter configuration selected on the DCCT/EDIC data. For outcomes with fewer than 10 recorded events we employed a leave-one-out cross-validation scheme, and the performance was calculated on all predictions pooled together. Adopting this revision procedure implies the assumption that the signatures selected on the data from the DCCT baseline visits also have valuable predictive power for the Chorleywood cohort.
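The calibration check discussed above can be illustrated with a minimal sketch. It is not the authors' code: it assumes the calibration coefficient α is the slope of a Cox model whose only covariate is the risk score, and it fits that slope with a plain Newton-Raphson iteration on the Breslow partial likelihood. The function name and the toy data are hypothetical.

```python
import math

def calibration_slope(risk, time, event, n_iter=25, tol=1e-9):
    """Fit h(t | r) = h0(t) * exp(alpha * r) and return alpha.

    Newton-Raphson on the Breslow partial log-likelihood with a single
    covariate (the risk score produced by the model under evaluation).
    """
    n = len(risk)
    alpha = 0.0
    for _ in range(n_iter):
        grad = 0.0
        hess = 0.0
        for i in range(n):
            if not event[i]:
                continue  # censored subjects only contribute through risk sets
            # Risk set: subjects still under observation at time[i].
            s0 = s1 = s2 = 0.0
            for j in range(n):
                if time[j] >= time[i]:
                    w = math.exp(alpha * risk[j])
                    s0 += w
                    s1 += w * risk[j]
                    s2 += w * risk[j] ** 2
            mean = s1 / s0
            grad += risk[i] - mean           # score contribution
            hess -= s2 / s0 - mean ** 2      # minus the weighted variance
        if hess == 0.0:
            break
        step = grad / hess
        alpha -= step
        if abs(step) < tol:
            break
    return alpha

# Toy data, invented for illustration only (not from the Chorleywood cohort).
toy_risk = [0.2, 1.5, 0.7, 2.1, 1.0, 0.1, 1.8, 0.9]
toy_time = [9, 3, 4, 2, 8, 10, 6, 5]
toy_event = [1, 1, 1, 1, 0, 0, 1, 1]

alpha = calibration_slope(toy_risk, toy_time, toy_event)
print(f"calibration slope: {alpha:.3f}")
```

A slope near one would mean the risk scores are on the right scale for the new cohort; rescaling the scores by a constant rescales the slope by the inverse of that constant. The sketch has no safeguard against perfectly separated data, where the likelihood has no finite maximum.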

The results of model revision are reported in Table 3. All the models showed at least a slight improvement in terms of average CI, except for the CVD-DCCT and Hypoglycemia models in the Type II diabetes cohort and for Neuropathy in the Type I cohort. Some models achieve a perfect score (CI = 1), although the limited number of events available for these outcomes means that these results should be interpreted with caution.
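The choice between ten-fold and leave-one-out cross-validation used in the revision procedure can be sketched as follows. This is only an illustration under stated assumptions: the threshold of 10 events comes from the text, while the function names, the unstratified random fold assignment and the toy "model" are hypothetical stand-ins for the actual regression methods. How the held-out predictions are scored (per fold or pooled) is left to the caller.

```python
import random

def make_folds(n_samples, n_events, k=10, min_events=10, seed=0):
    """Return a list of test-index lists.

    Ten-fold cross-validation when at least `min_events` events were
    recorded, leave-one-out otherwise.
    """
    indices = list(range(n_samples))
    if n_events < min_events:
        return [[i] for i in indices]            # leave-one-out
    random.Random(seed).shuffle(indices)
    return [indices[f::k] for f in range(k)]     # k roughly equal folds

def held_out_predictions(rows, n_events, fit, predict):
    """Refit on every training split and collect the held-out predictions."""
    predictions = [None] * len(rows)
    for test_idx in make_folds(len(rows), n_events):
        held_out = set(test_idx)
        train = [row for i, row in enumerate(rows) if i not in held_out]
        model = fit(train)
        for i in test_idx:
            predictions[i] = predict(model, rows[i])
    return predictions

# Hypothetical stand-ins for a real regression method.
def fit_mean(train):
    return sum(train) / len(train)

def predict_centered(model, row):
    return row - model

toy_rows = [float(v) for v in range(12)]
# Only 5 events, so leave-one-out is selected.
toy_predictions = held_out_predictions(toy_rows, n_events=5,
                                       fit=fit_mean, predict=predict_centered)
```

With leave-one-out, each fold holds a single subject, so a concordance index cannot be computed per fold; this is why the predictions have to be pooled before scoring in that setting.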

4 Discussion

4.1 Main findings

The main contribution of the present work is the derivation of a set of computational models for assessing the risk of developing diabetes-related complications. The models have been derived from the baseline-visit data of the DCCT study and from the DCCT/EDIC follow-up information. Furthermore, the derivation of the models led to the identification of the minimal-size, maximally predictive set of features for each considered outcome, out of an initial set of fifty-one clinical parameters measured at the DCCT baseline visit. Table 3 reports the clinical parameters included in each risk assessment model, along with their respective coefficients. Negative coefficients indicate protective factors, while factors with positive coefficients are associated with increased risk.
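The reading of coefficient signs can be made concrete with a small sketch. It assumes a log-linear (Cox-type) model in which the risk score is a weighted sum of the predictors; the coefficient values, predictor names and patient below are hypothetical and are not taken from the paper's models.

```python
import math

# Hypothetical coefficients, for illustration only.
hypothetical_coefficients = {"hba1c": 0.30, "married": -0.25}

def risk_score(patient, coefs):
    """Linear predictor: sum of coefficient * predictor value."""
    return sum(coefs[name] * patient[name] for name in coefs)

def effect_direction(coef):
    """Positive coefficients raise the score, negative ones lower it."""
    if coef > 0:
        return "risk factor"
    if coef < 0:
        return "protective factor"
    return "no effect"

def hazard_ratio(coef, delta=1.0):
    """Multiplicative change in hazard for a `delta` increase of the predictor."""
    return math.exp(coef * delta)

example_patient = {"hba1c": 9.0, "married": 1.0}
print(risk_score(example_patient, hypothetical_coefficients))
for name, coef in hypothetical_coefficients.items():
    print(name, effect_direction(coef), round(hazard_ratio(coef), 3))
```

Under this assumption a negative coefficient corresponds to a hazard ratio below one, which is what "protective" means in practice.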

The level of glycated hemoglobin (HbA1c) proved to be the most relevant risk factor, being included in seven models out of eight. In particular, high values of HbA1c are associated with an increased risk of developing diabetes-related complications. This is fully in line with the current literature (Huang, Liu, Moffet, John, & Karter, 2011; Marcovecchio, Dalton, Chiarelli, & Dunger, 2011; Weber & Schnell, 2009) and in particular with the previous studies on the DCCT cohort (The Diabetes Control and Complications Trial Research Group, 1996).

Our analyses also point out the relevance of marital status for predicting the probability of developing diabetes-related complications and adverse events. Being married is associated with a lower risk of experiencing hypoglycemia or retinopathy worsening. The presence of a spouse is known to have a beneficial effect in different pathologies (Chung, Moser, Lennie, & Riegel, 2006; Goodwin, Hunt, Key, & Samet, 1987; Sugarman, Bauer, Barber, Hayes, & Hughes, 1993), and a recent work has demonstrated that, in heart failure patients, this beneficial effect is mediated by medication adherence (Wu et al., 2014). Thus, a possible explanation for our results is that being married increases adherence to medication or diet, which in turn improves the patient’s prognosis. For the CVD and Ketoacidosis models, being respectively divorced or widowed increases the risk of experiencing an adverse event. In this case marital status may act as a proxy for the patient’s age, since both the divorced and widowed DCCT sub-cohorts are characterized by an older age than the rest.

The baseline value of the urine-albumin excretion rate turns out to be predictive of renal complications (i.e., Microalbuminuria and Proteinuria), a result already known in the medical literature (Newman et al., 2005), and of the development of cardiovascular diseases and Neuropathy. The CVD-DCCT and CVD-EDIC models are in agreement with the CVD risk factors previously identified on the DCCT/EDIC data; in particular, all elements in the signature of the DCCT-EDIC model are listed among the clinical characteristics at DCCT baseline that were significantly associated with cardiovascular disease over the course of the DCCT/EDIC Study (Nathan et al., 2005).

The predictive signatures of both the CVD-DCCT and CVD-EDIC models closely resemble the results of different studies focusing on identifying relevant risk factors for cardiovascular complications in diabetes patients. In particular, our results are in good agreement with the results of the UK Prospective Diabetes Study (UKPDS).

The UKPDS was a landmark randomized controlled trial, conducted over a period of 14 years (1977–1991), that involved 5102 patients followed, on average, for a period of 10.7 years. The study showed that strict control of blood glucose and blood pressure can lower the risk of diabetes-related complications in individuals recently diagnosed with Type II diabetes (Turner & Holman, 1996). Several risk assessment models were developed on the basis of the UKPDS data. The first UKPDS model (Stevens, Kothari, Adler, & Stratton,

Table 3
Results of models’ recalibration and re-assessment.

| Model name | Type I revised: # events | Type I revised: calibration α | Type I revised: average CI | Type I revised: cross-validation CI interval | Type I revised: p-value (H0: average CI ≤ 0.5) | Type II revised: # events | Type II revised: calibration α | Type II revised: average CI | Type II revised: cross-validation CI interval | Type II revised: p-value (H0: average CI ≤ 0.5) |
|---|---|---|---|---|---|---|---|---|---|---|
| CVD–DCCT | 5 | 0.4989 | 1 | – | <0.0001 | 32 | 1.2238 | 0.6757 | [0.54386–0.92683] | 0.003 |
| CVD–EDIC | 5 | −0.0011 | 0.6422 | – | 0.2712 | 33 | 0.0013 | 0.6621 | [0.54348–0.83871] | 0.0002 |
| Microalbuminuria | 6 | 0.0975 | 1 | – | <0.0001 | 116 | 0.0368 | 0.5932 | [0.41146–0.67516] | 0.0023 |
| Neuropathy | 6 | 0.0096 | 0.6496 | – | 0.2436 | 20 | −0.0036 | 0.5285 | [0.076923–0.87234] | 0.3833 |
| Retinopathy | 17 | 0.6625 | 0.7521 | [0.5–1] | 0.0039 | 70 | 0.0822 | 0.5664 | [0.42063–0.76238] | 0.0381 |

For each outcome and external cohort, the calibration of the corresponding model is assessed (a) by applying the model on the external cohort and (b) by using the resulting vector of risk scores r_i as a predictor in a Cox regression. Cox coefficients close to one indicate well calibrated models. The predictive capabilities of the selected signatures are then re-assessed using only the external cohort data. Specifically, for each outcome and external cohort the predictive performance of the selected signature, regression method and hyper-parameter configuration is assessed through ten-fold cross-validation. For each model and each cohort the number of events, the calibration Cox regression coefficient α, and the cross-validated CI value along with its corresponding interval over the cross-validation folds are reported. The statistical significance of the CI values is assessed through a one-tailed t-test. Outcomes with fewer than 10 recorded events were evaluated with a leave-one-out cross-validation scheme, which allows better performance estimation.


2001) included Age, Gender, Race, Smoking, HbA1c, Systolic Blood Pressure and Total Cholesterol/HDL Cholesterol ratio as predictors, and focused on assessing the probability of developing Coronary Heart Disease (CHD). The second version of the model (Clarke et al., 2004) provides seven different mathematical equations for predicting as many diabetes-related complications (stroke, heart failure, fatal or non-fatal MI, other IHD, amputation, renal failure and blindness) and three different equations for assessing the risk of mortality. This second model is based on the same predictors as the first one, but it also includes information about the patients’ medical history (previous occurrences of diabetes-related adverse events) and physiology (Body Mass Index, BMI). The latest version of the UKPDS model was published recently (Hayes, Leal, Gray, Holman, & Clarke, 2013), and it slightly modifies the previous versions by including information about micro- or macro-albuminuria, estimated GFR, heart rate, white blood cell count and hemoglobin.

Interestingly, both our CVD models include a subset of the UKPDS predictors, namely Age, Smoking and HbA1c. The CVD-DCCT model also includes Systolic Blood Pressure and Weight, both considered in the latest version of the UKPDS engine. Moreover, our CVD models and the UKPDS models fully agree on the direction of effect of the common predictors, i.e., all common predictors act as risk factors, and never as protective factors.

The Hypoglycemia model suggests that being married and having a family history of non-insulin-dependent diabetes have a protective effect against hypoglycemic events, while a past history of severe hypoglycemia, strict glucose control and an elevated number of insulin units per kg of weight significantly increase the patient’s risk. It is worth noting that the negative effect of strict glucose control on the probability of experiencing hypoglycemia was one of the main findings of the DCCT study. In particular, strict glucose control is known to lower the risk of several diabetes-related complications, but not of hypoglycemia (The Diabetes Control and Complications Trial Research Group, 1995a).

The Ketoacidosis model includes several factors, the most relevant ones being (according to the magnitude of their respective coefficients) HbA1c, Total Insulin Dosage, Post-Pubescent diabetes duration, Cholesterol, Hospitalization(s) due to ketoacidosis in the past year (risk factors) and Gender-specific ideal body weight (protective factor). To the best of our knowledge this is the first study providing a predictive model for assessing the risk of experiencing ketoacidosis. Studies investigating the association of clinical parameters with ketoacidosis exist (Egger, Davey Smith, Stettler, & Diem, 1997); however, they do not provide quantitative models for estimating the risk of ketoacidosis. These studies generally point out that an intensified treatment is associated with the probability of experiencing ketoacidosis, which is in agreement with our results.

The two models related to renal complications (Microalbuminuria and Proteinuria) share several predictive factors whose relevance in the development of renal complications in diabetes patients is already known in the literature and was even assessed on the DCCT data (Lopes-Virella et al., 2013): HbA1c (The Diabetes Control and Complications Trial Research Group, 1996), Albumin-urine value over 24 h (Newman et al., 2005), Insulin Regime (The Diabetes Control and Complications Trial Research Group, 1995c) and Total diabetes duration. A recent study (Vergouwe et al., 2010) conducted on 1115 Type I diabetes patients also confirms the relevance of HbA1c and Albumin-urine value for predicting the progression of microalbuminuria, while another study (Elley et al., 2013) conducted on a large New Zealand cohort (25,736 Type II diabetes patients) and focusing on End-Stage Renal Disease (ESRD) also identifies HbA1c and Total diabetes duration as relevant risk factors.

The Neuropathy and Retinopathy models also share part of their predictors, particularly HbA1c, the Retinopathy level at baseline, and Post-pubescent diabetes duration, all factors that were found to be associated with low peripheral nerve conduction (an indicator of neuropathy) in a study of 456 Type I diabetes individuals (Charles et al., 2010). The association between HbA1c and Retinopathy progression has already been studied and established (The Diabetes Control and Complications Trial Research Group, 1995b).

One further relevant contribution of our study is the validation of the models on the retrospective cohort collected at the Chorleywood Health Center. For the Type I diabetes external cohort, four models out of seven achieved statistically significant (p-value < 0.05) results, while two models (CVD-DCCT and Hypoglycemia) achieved appreciable CI performance (0.6887 and 0.6903, respectively) that was borderline statistically significant. In the case of the Type II diabetes cohort, five models out of seven achieved results statistically significantly better than random guessing (CI > 0.5).
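To make the notion of "significantly better than random guessing" concrete, the sketch below computes a concordance index on right-censored data and a one-sided permutation p-value for the null hypothesis CI ≤ 0.5. The permutation scheme actually used in the study is not detailed in this section, so shuffling the risk scores across subjects is an assumption; the function names and the toy data are hypothetical.

```python
import random

def concordance_index(risk, time, event):
    """Harrell-style C: share of comparable pairs ordered correctly by risk."""
    concordant = 0.0
    comparable = 0
    n = len(risk)
    for i in range(n):
        if not event[i]:
            continue  # a pair needs an observed event for the earlier subject
        for j in range(n):
            if time[i] < time[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5   # tied scores count as half
    return concordant / comparable

def permutation_p_value(risk, time, event, n_perm=2000, seed=0):
    """One-sided p-value for H0: CI <= 0.5, by shuffling the risk scores."""
    rng = random.Random(seed)
    observed = concordance_index(risk, time, event)
    shuffled = list(risk)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if concordance_index(shuffled, time, event) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return observed, (hits + 1) / (n_perm + 1)

# Toy data, invented for illustration only.
toy_risk = [0.2, 1.5, 0.7, 2.1, 1.0, 0.1, 1.8, 0.9]
toy_time = [9, 3, 4, 2, 8, 10, 6, 5]
toy_event = [1, 1, 1, 1, 0, 0, 1, 1]

ci, p_value = permutation_p_value(toy_risk, toy_time, toy_event)
print(f"CI = {ci:.4f}, permutation p-value = {p_value:.4f}")
```

With very few events, as in several of the Type I outcomes, only a handful of comparable pairs exist, so both the CI and its p-value are coarse; this is consistent with the wide confidence intervals reported for those cohorts.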

Models’ transferability generally increases when the models are re-calibrated on the new data while the original predictive factors are retained. All revised models perform better in terms of CI than the original models, with the exception of Neuropathy for the Type I cohort and CVD-DCCT and Hypoglycemia for the Type II cohort. However, we note that for these models the revised CI values are within the 95% confidence interval of the CI results of the original models. In general, these results support our hypothesis that the predictive signatures selected on the DCCT/EDIC data are able to give accurate predictions on the cohorts collected in Chorleywood.

4.2 Study limitations

The first relevant limitation of this study is the relatively restricted number of subjects and adverse events in the external validation cohorts. In some cases the scarcity of recorded events did not allow a precise estimation of the models’ performance and of the respective confidence intervals. Thus, our results only suggest that our models successfully transfer across populations; more extensive studies on larger cohorts of Type I and Type II diabetes patients are needed in order to gather further evidence.

One more limitation concerns the Hypoglycemia and Ketoacidosis models. Accurately evaluating the probability of experiencing these adverse events would require some short-term information about nutrition and physical activity, which is not present in the list of considered predictors. Despite this limitation, both models achieve a good level of predictive performance in both the internal and external validation.

5 Conclusions

We use the DCCT/EDIC data to derive a set of computational models for assessing the risk of developing diabetes-related complications in diabetes patients. Each model is defined over a parsimonious set of predictors (clinical parameters) with maximal predictive power for its specific outcome. Predictors included in the models are generally in agreement with the current literature regarding risk factors for diabetes-related complications. When applied to a retrospective validation cohort collected in the UK, the models often provide predictions that are significantly better than random, supporting the hypothesis that the models transfer to a population that is geographically distant from, and more recent than, the one originally examined in the DCCT/EDIC studies. Future work will focus on the validation of the models on larger cohorts of diabetes patients, both Type I and Type II, in order to further strengthen the results presented here.

Acknowledgements

This work was performed in the framework of the FP7 Integrated Project REACTION (Remote Accessibility to Diabetes Management and Therapy in Operational Healthcare Networks), partially funded by the European Commission under Grant Agreement 248590. The work was also partially funded by the EPILOGEAS GSRT ARISTEIA II project, No. 3446.


The Diabetes Control and Complications Trial (DCCT) and its follow-up, the Epidemiology of Diabetes Interventions and Complications (EDIC) study, were conducted by the DCCT/EDIC Research Group and supported by National Institute of Health grants and contracts and by the General Clinical Research Center Program, NCRR. The data (and samples) from the DCCT/EDIC study were supplied by the NIDDK Central Repositories. This manuscript was not prepared under the auspices of the DCCT/EDIC study and does not represent analyses or conclusions of the DCCT/EDIC study group, the NIDDK Central Repositories, or the NIH.

The authors would also like to thank the medical and technical personnel of the Chorleywood Health Center for their indispensable assistance.

Appendix A Supplementary data

Supplementary data and methods to this article can be found online at http://dx.doi.org/10.1016/j.jdiacomp.2015.03.001

References

Ajmera, I., Swat, M., Laibe, C., Le Novère, N., & Chelliah, V. (2013). The impact of mathematical modeling on the understanding of diabetes and related complications. CPT: Pharmacometrics & Systems Pharmacology, 2, e54. Available from: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=3731829&tool=pmcentrez&rendertype=abstract

Bøvelstad, H. M., Nygård, S., Størvold, H. L., Aldrin, M., Borgan, Ø., Frigessi, A., et al. (2007). Predicting survival from microarray data: A comparative study. Bioinformatics, 23, 2080–2087.

Charles, M., Soedamah-Muthu, S. S., Tesfaye, S., Fuller, J. H., Arezzo, J. C., Chaturvedi, N., et al. (2010). Low peripheral nerve conduction velocities and amplitudes are strongly related to diabetic microvascular complications in type 1 diabetes: The EURODIAB Prospective Complications Study. Diabetes Care, 33(12), 2648–2653. Available from: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=2992206&tool=pmcentrez&rendertype=abstract

Chung, M. L., Moser, D. K., Lennie, T. A., & Riegel, B. (2006). Abstract 2509: Spouses enhance medication adherence in patients with heart failure. Circulation, 114(18_MeetingAbstracts), II_518. Available from: http://circ.ahajournals.org/cgi/content/meeting_abstract/114/18_MeetingAbstracts/II_518

Clarke, P. M., Gray, A. M., Briggs, A., Farmer, A. J., Fenn, P., Stevens, R. J., et al. (2004). A model to estimate the lifetime health outcomes of patients with type 2 diabetes: The United Kingdom Prospective Diabetes Study (UKPDS) Outcomes Model (UKPDS no. 68). Diabetologia, 47(10), 1747–1759. Available from: http://www.ncbi.nlm.nih.gov/pubmed/15517152

Cox, D. R. (1972). Regression models and life-tables. Journal of the Royal Statistical Society, Series B, 34, 187–220.

Efron, B., & Tibshirani, R. (1986). Bootstrap methods for standard errors, confidence intervals, and other measures of statistical accuracy. Statistical Science, 1(1), 54–75.

Egger, M., Davey Smith, G., Stettler, C., & Diem, P. (1997). Risk of adverse effects of intensified treatment in insulin-dependent diabetes mellitus: A meta-analysis. Diabetic Medicine, 14(11), 919–928. Available from: http://www.ncbi.nlm.nih.gov/pubmed/9400915

Elley, C. R., Robinson, T., Moyes, S. A., Kenealy, T., Collins, J., Robinson, E., et al. (2013). Derivation and validation of a renal risk score for people with type 2 diabetes. Diabetes Care, 36, 3113–3120.

Epidemiology of Diabetes Interventions and Complications (EDIC) Research Group (1999). Design, implementation, and preliminary results of a long-term follow-up of the Diabetes Control and Complications Trial cohort. Diabetes Care, 22(1), 99–111. Available from: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=2745938&tool=pmcentrez&rendertype=abstract

Faraggi, D., & Simon, R. (1998). Bayesian variable selection method for censored survival data. Biometrics, 54, 1475–1485.

Fawcett, T. (2006). An introduction to ROC analysis. Pattern Recognition Letters, 27, 861–874.

Franz, M. J., Warshaw, H., Daly, A. E., Green-Pastors, J., Arnold, M. S., & Bantle, J. (2003). Evolution of diabetes medical nutrition therapy. Postgraduate Medical Journal, 79(927), 30–35. Available from: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=1742592&tool=pmcentrez&rendertype=abstract

Gallen, I. (2004). Review: The evolution of insulin treatment in type 1 diabetes: The advent of analogues. The British Journal of Diabetes & Vascular Disease, 4(6), 378–381.

Goodwin, J. S., Hunt, W. C., Key, C. R., & Samet, J. M. (1987). The effect of marital status on stage, treatment, and survival of cancer patients. JAMA, 258, 3125–3130.

Hayes, A. J., Leal, J., Gray, A. M., Holman, R. R., & Clarke, P. M. (2013). UKPDS outcomes model 2: A new version of a model to simulate lifetime health outcomes of patients with type 2 diabetes mellitus using data from the 30 year United Kingdom Prospective Diabetes Study: UKPDS 82. Diabetologia, 56(9), 1925–1933. Available from: http://www.ncbi.nlm.nih.gov/pubmed/23793713

Huang, E. S., Liu, J. Y., Moffet, H. H., John, P. M., & Karter, A. J. (2011). Glycemic control, complications, and death in older diabetic patients. Diabetes Care, 34, 1329–1336. http://dx.doi.org/10.2337/dc10-2377

Ishwaran, H. (2007). Variable importance in binary regression trees and forests. Electronic Journal of Statistics, 1, 519–537.

Ishwaran, H., Kogalur, U. B., Blackstone, E. H., & Lauer, M. S. (2008). Random survival forests. Annals of Applied Statistics, 2(3), 841–860.

Kalbfleisch, J. D., & Prentice, R. L. (1980). The statistical analysis of failure time data. New York: John Wiley and Sons. Available from: http://proquest.umi.com/pqdweb?did=745641091&Fmt=7&clientId=3748&RQT=309&VName=PQD

Kuklina, E. V., Carrol, M. D., Shaw, K. M., & Hirsch, R. (2013). Trends in high LDL cholesterol, cholesterol-lowering medication use, and dietary saturated-fat intake: United States, 1976–2010. p. 7. Available from: http://www.cdc.gov/nchs/data/databriefs/db117.pdf

Lagani, V., & Tsamardinos, I. (2010). Structure-based variable selection for survival data. Bioinformatics, 26(15), 1887–1894. Available from: http://www.ncbi.nlm.nih.gov/pubmed/20519286

Lopes-Virella, M. F., Baker, N. L., Hunt, K. J., Cleary, P. A., Klein, R., & Virella, G. (2013). Baseline markers of inflammation are associated with progression to macro-albuminuria in type 1 diabetic subjects. Diabetes Care, 36, 2317–2323. Available from: http://www.ncbi.nlm.nih.gov/pubmed/23514730

Marcovecchio, M. L., Dalton, R. N., Chiarelli, F., & Dunger, D. B. (2011). A1C variability as an independent risk factor for microalbuminuria in young people with type 1 diabetes. Diabetes Care, 34, 1011–1013.

Nathan, D. M., Cleary, P. A., Backlund, J.-Y. C., Genuth, S. M., Lachin, J. M., Orchard, T. J., et al. (2005). Intensive diabetes treatment and cardiovascular disease in patients with type 1 diabetes. The New England Journal of Medicine, 353, 2643–2653.

Newman, D. J., Mattock, M. B., Dawnay, A. B. S., Kerry, S., McGuire, A., Yaqoob, M., et al. (2005). Systematic review on urine albumin testing for early detection of diabetic complications. Health Technology Assessment, 9(30), iii–vi, xiii–163. Available from: http://www.ncbi.nlm.nih.gov/pubmed/16095545

Palmer, A. J. (2013). Computer modeling of diabetes and its complications: A report on the fifth Mount Hood challenge meeting. Value in Health, 16, 670–685.

Shivaswamy, P. K., Chu, W. C. W., & Jansche, M. (2007). A support vector approach to censored targets. Seventh IEEE International Conference on Data Mining (ICDM 2007).

Statnikov, A., Aliferis, C. F., Tsamardinos, I., Hardin, D., & Levy, S. (2005). A comprehensive evaluation of multicategory classification methods for microarray gene expression cancer diagnosis. Bioinformatics, 21(5), 631–643. Available from: http://www.ncbi.nlm.nih.gov/pubmed/15374862

Stevens, R. J., Kothari, V., Adler, A. I., & Stratton, I. M. (2001). The UKPDS risk engine: A model for the risk of coronary heart disease in type II diabetes (UKPDS 56). Clinical Science (London), 101, 671–679.

Subramanian, J., & Simon, R. (2010). What should physicians look for in evaluating prognostic gene-expression signatures? Nature Reviews Clinical Oncology, 7(6), 327–334. http://dx.doi.org/10.1038/nrclinonc.2010.60

Sugarman, J. R., Bauer, M. C., Barber, E. L., Hayes, J. L., & Hughes, J. W. (1993). Factors associated with failure to complete treatment for diabetic retinopathy among Navajo Indians. Diabetes Care, 16(1), 326–328. Available from: http://www.ncbi.nlm.nih.gov/pubmed/8422803

The DCCT Research Group (1987). Feasibility of centralized measurements of glycated hemoglobin in the Diabetes Control and Complications Trial: A multicenter study. Clinical Chemistry, 33(12), 2267–2271. Available from: http://www.ncbi.nlm.nih.gov/pubmed/3319291

The Diabetes Control and Complications Trial Research Group (1993). The effect of intensive treatment of diabetes on the development and progression of long-term complications in insulin-dependent diabetes mellitus. The New England Journal of Medicine, 329(14), 977–986. http://dx.doi.org/10.1056/NEJM199309303291401

The Diabetes Control and Complications Trial Research Group (1995a). Adverse events and their association with treatment regimens in the diabetes control and complications trial. Diabetes Care, 18, 1415–1427. Available from: http://eutils.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi?dbfrom=pubmed&id=8722064&retmode=ref&cmd=prlinks

The Diabetes Control and Complications Trial Research Group (1995b). The relationship of glycemic exposure (HbA1c) to the risk of development and progression of retinopathy in the Diabetes Control and Complications Trial. Diabetes, 44, 968–983.

The Diabetes Control and Complications Trial Research Group (1995c). Effect of intensive therapy on the development and progression of diabetic nephropathy in the Diabetes Control and Complications Trial. Kidney International, 47, 1703–1720.

The Diabetes Control and Complications Trial Research Group (1995d). The effect of intensive diabetes therapy on the development and progression of neuropathy. Annals of Internal Medicine, 122(8), 561–568. Available from: http://www.ncbi.nlm.nih.gov/pubmed/7887548

The Diabetes Control and Complications Trial Research Group (1995e). The effect of intensive diabetes treatment on the progression of diabetic retinopathy in insulin-dependent diabetes mellitus. Archives of Ophthalmology, 113(1), 36–51. Available from: http://www.ncbi.nlm.nih.gov/pubmed/7826293

The Diabetes Control and Complications Trial Research Group (1996). The absence of a glycemic threshold for the development of long-term complications: The perspective of the Diabetes Control and Complications Trial. Diabetes, 45(10), 1289–1298. Available from: http://www.ncbi.nlm.nih.gov/pubmed/8826962

The Diabetes Control and Complications Trial Research Group (1997). Clustering of long-term complications in families with diabetes in the diabetes control and complications trial. Diabetes, 46, 1829–1839.

The National Collaborating Centre for Chronic Conditions (2008). Type 2 Diabetes: National clinical guideline for management in primary and secondary care (update).

The Royal College of Ophthalmologists (2012). Diabetic Retinopathy Guidelines. London.

Tibshirani, R. (1997). The lasso method for variable selection in the Cox model. Statistics in Medicine, 16, 385–395.

Tsamardinos, I., Brown, L. E., & Aliferis, C. F. (2006). The max-min hill-climbing Bayesian network structure learning algorithm. Machine Learning, 65(1), 31–78.

Tsamardinos, I., Lagani, V., & Rakhshani, A. (2014). Performance-estimation properties of cross-validation-based protocols with simultaneous hyper-parameter optimization. SETN’14: Proceedings of the Hellenic Conference on Artificial Intelligence.

Turner, R. C., & Holman, R. R. (1996). The UK Prospective Diabetes Study. UK Prospective Diabetes Study Group. Annals of Medicine, 439–444.

Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B., & Wei, L. J. (2011). On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Statistics in Medicine, 30, 1105–1117.

Van Houwelingen, H. C. (2000). Validation, calibration, revision and combination of prognostic survival models. Statistics in Medicine, 19, 3401–3415.

Van Houwelingen, H. C., Bruinsma, T., Hart, A. A. M., Van’t Veer, L. J., & Wessels, L. F. A. (2006). Cross-validated Cox regression on microarray gene expression data. Statistics in Medicine, 25, 3201–3216.

Vergouwe, Y., Soedamah-Muthu, S. S., Zgibor, J., Chaturvedi, N., Forsblom, C., Snell-Bergeon, J. K., et al. (2010). Progression to microalbuminuria in type 1 diabetes: Development and validation of a prediction rule. Diabetologia, 53, 254–262.

Weber, C., & Schnell, O. (2009). The assessment of glycemic variability and its impact on diabetes-related complications: An overview. Diabetes Technology & Therapeutics, 11, 623–633.

Wu, J.-R., Lennie, T. A., Chung, M. L., Frazier, S. K., Dekker, R. L., Biddle, M. J., et al. (2014). Medication adherence mediates the relationship between marital status and cardiac event-free survival in patients with heart failure. Heart & Lung, 41(2), 107–114. Available from: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=3288268&tool=pmcentrez&rendertype=abstract
