
Application of data mining techniques in the prediction of coronary artery disease: use of anaesthesia time-series and patient risk factor data



THE PREDICTION OF CORONARY ARTERY DISEASE:

Ellen Pitt, B.Sc (Hons), M.B.,B.S (UQ), M.IT (QUT)

Dr Richi Nayak

Submitted in fulfilment of the requirements for the degree of

Master of Information Technology (Research)

School of Information Systems
Faculty of Science and Technology
Queensland University of Technology

[2009]


Keywords

Anaesthesia, physiological data, time-series, clustering, feature selection, predictors of outcome, anaesthesia complications, cardiac risk factors, data mining

Abstract

The high morbidity and mortality associated with atherosclerotic coronary vascular disease (CVD) and its complications are being lessened by increased knowledge of risk factors, effective preventative measures and proven therapeutic interventions. However, significant CVD morbidity remains, and sudden cardiac death continues to be a presenting feature for some patients subsequently diagnosed with CVD. Coronary vascular disease is also the leading cause of anaesthesia related complications. Stress electrocardiography/exercise testing is predictive of 10 year risk of CVD events, and the cardiovascular variables used to score this test are monitored peri-operatively. Similar physiological time-series datasets are being subjected to data mining methods for the prediction of medical diagnoses and outcomes. This study aims to find predictors of CVD using anaesthesia time-series data and patient risk factor data. Several pre-processing and predictive data mining methods are applied to this data.

Physiological time-series data related to anaesthetic procedures are subjected to pre-processing methods for removal of outliers, calculation of moving averages as well as data summarisation and data abstraction methods. Feature selection methods of both wrapper and filter types are applied to derived physiological time-series variable sets alone and to the same variables combined with risk factor variables. The ability of these methods to identify subsets of highly correlated but non-redundant variables is assessed. The major dataset is derived from the entire anaesthesia population, and subsets of this population are considered to be at increased anaesthesia risk based on their need for more intensive monitoring (invasive haemodynamic monitoring and additional ECG leads). Because of the unbalanced class distribution in the data, majority class under-sampling and the Kappa statistic, together with misclassification rate and area under the ROC curve (AUC), are used for evaluation of models generated using different prediction algorithms.
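As a rough illustration of the first two pre-processing steps named above (outlier removal and moving averages), the following minimal Python sketch may help. It is not the thesis's implementation: the function names, the window size and the heart-rate values are hypothetical. The 3 standard deviation threshold echoes the title of Table 4-6, but whether the thesis applies it per case or per variable is not stated here.

```python
from statistics import mean, pstdev


def remove_outliers(series, n_sd=3.0):
    """Drop points lying more than n_sd standard deviations from the series mean."""
    mu = mean(series)
    sd = pstdev(series)
    if sd == 0:
        # A constant series has no outliers by this definition.
        return list(series)
    return [x for x in series if abs(x - mu) <= n_sd * sd]


def moving_average(series, window=3):
    """Simple moving average; returns len(series) - window + 1 values."""
    if window < 1 or window > len(series):
        raise ValueError("window must be between 1 and len(series)")
    return [mean(series[i:i + window]) for i in range(len(series) - window + 1)]


# Hypothetical heart-rate trace (bpm) containing one artefact spike (250).
hr = [72, 74, 73, 75, 74, 73, 72, 74, 75, 73, 74, 72, 73, 250, 74, 73]
cleaned = remove_outliers(hr)          # the spike at 250 is dropped
smoothed = moving_average(cleaned, 3)  # window size is an assumption
```

Note that a single spike can only exceed a 3 standard deviation threshold when the series is long enough (more than about ten points), which is one reason a short trace may need a different rule.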

The performance of models derived from feature reduced datasets reveals the filter method, Cfs subset evaluation, to be most consistently effective, although Consistency derived subsets tended to give slightly increased accuracy but markedly increased complexity. The use of misclassification rate (MR) for model performance evaluation is influenced by class distribution. This influence could be eliminated by consideration of the AUC or Kappa statistic as well as by evaluation of subsets with an under-sampled majority class. The noise and outlier removal pre-processing methods produced models with MR ranging from 10.69 to 12.62, with the lowest value being for data from which both outliers and noise were removed (MR 10.69). For the raw time-series dataset, MR is 12.34. Feature selection results in a reduction in MR to between 9.8 and 10.16, with time segmented summary data (dataset F) MR being 9.8 and raw time-series summary data (dataset A) being 9.92.


methods, Cfs could identify a subset of correlated and non-redundant variables from the time-series alone datasets, but models derived from these subsets are of one leaf only. MR values are consistent with class distribution in the subset folds evaluated in the n-cross validation method.

For models based on Cfs selected time-series derived and risk factor (RF) variables, the MR ranges from 8.83 to 10.36, with dataset RF_A (raw time-series data and RF) being 8.85 and dataset RF_F (time segmented time-series variables and RF) being 9.09. The models based on counts of outliers and counts of data points outside normal range (dataset RF_E) and derived variables based on time-series transformed using Symbolic Aggregate Approximation (SAX) with associated time-series pattern cluster membership (dataset RF_G) perform the least well, with MR of 10.25 and 10.36 respectively. For coronary vascular disease prediction, nearest neighbour (NNge) and the support vector machine based method, SMO, have the highest MR of 10.1 and 10.28, while logistic regression (LR) and the decision tree (DT) method, J48, have MR of 8.85 and 9.0 respectively. DT rules are most comprehensible and clinically relevant. The predictive accuracy increase achieved by addition of risk factor variables to time-series variable based models is significant. The addition of time-series derived variables to models based on risk factor variables alone is associated with a trend to improved performance.
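Dataset RF_G above relies on the Symbolic Aggregate Approximation. For readers unfamiliar with it, the sketch below shows the general idea of SAX as described by Lin, Keogh et al: z-normalise the series, reduce it with piecewise aggregate approximation (PAA), then map each segment mean to a letter using breakpoints that cut the standard normal distribution into equal-probability bins. The function names, alphabet size and input values are assumptions for illustration only; the parameters used in the thesis are not given in this section.

```python
from bisect import bisect_right
from statistics import NormalDist, mean, pstdev


def paa(series, n_segments):
    """Piecewise aggregate approximation: mean of each equal-width frame."""
    n = len(series)
    if n % n_segments != 0:
        raise ValueError("this sketch assumes len(series) is a multiple of n_segments")
    width = n // n_segments
    return [mean(series[i:i + width]) for i in range(0, n, width)]


def sax(series, n_segments, alphabet="abcd"):
    """Return the SAX word for a series (z-normalise, PAA, discretise)."""
    mu, sd = mean(series), pstdev(series)
    if sd == 0:
        raise ValueError("constant series cannot be z-normalised")
    z = [(x - mu) / sd for x in series]
    k = len(alphabet)
    # Breakpoints splitting N(0, 1) into k equal-probability regions.
    breakpoints = [NormalDist().inv_cdf(i / k) for i in range(1, k)]
    # A value equal to a breakpoint falls into the higher bin.
    return "".join(alphabet[bisect_right(breakpoints, v)] for v in paa(z, n_segments))


# Hypothetical steadily rising trace reduced to a four-letter word.
word = sax([1, 2, 3, 4, 5, 6, 7, 8], n_segments=4)  # "abcd"
```

Words of this kind can then be clustered or counted, which is one way a variable such as time-series pattern cluster membership could be derived.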

Data mining of feature reduced, anaesthesia time-series variables together with risk factor variables can produce compact and moderately accurate models able to predict coronary vascular disease. Decision tree analysis of time-series data combined with risk factor variables yields rules which are more accurate than models based on time-series data alone. The limited additional value provided by electrocardiographic variables when compared to use of risk factors alone is similar to recent suggestions that exercise electrocardiography (exECG) under standardised conditions has limited additional diagnostic value over risk factor analysis and symptom pattern. The pre-processing used in this study had limited effect when time-series variables and risk factor variables are used as model input. In the absence of risk factor input, the use of time-series variables after outlier removal and time-series variables based on physiological variable values being outside the accepted normal range is associated with some improvement in model performance.
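The abstract's remark that MR is influenced by class distribution, while the Kappa statistic is less so, can be made concrete with a small worked example. The sketch below computes both measures from a binary confusion matrix. The counts are invented for illustration (1000 cases with 10% positives) and are not taken from the thesis; the function names are likewise hypothetical.

```python
def misclassification_rate(tp, fp, fn, tn):
    """Fraction of cases assigned to the wrong class."""
    return (fp + fn) / (tp + fp + fn + tn)


def cohen_kappa(tp, fp, fn, tn):
    """Cohen's Kappa: agreement beyond that expected by chance."""
    n = tp + fp + fn + tn
    observed = (tp + tn) / n
    # Chance agreement from the row and column marginals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    if expected == 1:
        return 0.0
    return (observed - expected) / (1 - expected)


# Hypothetical counts: a model that always predicts "no disease" ...
majority = dict(tp=0, fp=0, fn=100, tn=900)
# ... and a model that finds some of the positive cases.
model = dict(tp=40, fp=30, fn=60, tn=870)

mr_majority = misclassification_rate(**majority)  # 0.10
kappa_majority = cohen_kappa(**majority)          # 0.0
mr_model = misclassification_rate(**model)        # 0.09
kappa_model = cohen_kappa(**model)                # about 0.42
```

In this invented case the two MR values differ by only one percentage point, yet Kappa separates a model with no discriminating ability from one with moderate agreement, which is why MR alone can mislead on unbalanced data.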


Table of Contents

Keywords iii

Abstract v

Table of Contents vii

List of Tables xiii

List of Figures xv

List of Appendices xix

List of Abbreviations xxi

Statement of Original Authorship xxv

Acknowledgements xxvi

1 CHAPTER 1: INTRODUCTION 1

1.1 Background 1

1.2 Context 2

1.3 Research Objective 2

1.4 Research Questions 3

1.5 Thesis Outline 3

1.6 Significant Results 5

1.7 Other Findings 6

2 CHAPTER 2: LITERATURE REVIEW 7

2.1 Coronary Vascular Disease 8

2.1.1 Impact of cardiovascular disease 8

2.1.2 Risk factors and associated vascular disease 9

2.1.3 Diagnostic methods 11

2.1.4 Risk factor modification and revascularisation 23

2.2 Anaesthesia 25


2.2.2 Anaesthesia monitoring 27

2.2.3 Choice of anaesthetic agent 30

2.2.4 Quality assurance 30

2.2.5 Summary 31

2.3 Data Mining Process 32

2.3.1 Data preparation 34

2.3.2 Modelling 38

2.3.3 Evaluation methods 41

2.4 Related Work 43

2.4.1 Issues in data mining medical databases 44

2.4.2 Application in medical domain 45

2.4.3 Summary 50

2.5 Implications 50

3 CHAPTER 3: RESEARCH DESIGN 53

3.1 Data Acquisition 54

3.2 Data Selection 55

3.2.1 Target variable selection 55

3.2.2 Case selection and segmentation 55

3.2.3 Variable selection 56

3.3 Data Pre-Processing 57

3.3.1 Data exploration 57

3.3.2 Pre-processing tasks 58

3.4 Data Modelling 59

3.4.1 Datasets 60

3.4.2 Feature selection / dimension reduction 60

3.4.3 Modelling methods 60

3.5 Performance Evaluation 62


3.7 Ethics and Limitations 63

3.7.1 Ethical considerations 63

3.7.2 Limitations 63

3.8 Conclusion 65

4 CHAPTER 4: DATA EXPLORATION 67

4.1 Database Description 67

4.1.1 Variable groups 67

4.2 Time-Series Data 68

4.3 Demographic and Clinical Characteristics 77

4.3.1 Missing data 78

4.3.2 Gender 79

4.3.3 Age 80

4.3.4 ASA class 81

4.3.5 Case duration 84

4.3.6 Vascular disease 86

4.3.7 Risk factor characteristics 91

4.3.8 Weight distribution 94

4.3.9 Primary diagnosis group 95

4.4 Conclusion 95

5 CHAPTER 5: DATA PRE-PROCESSING 97

5.1 Data Selection 97

5.1.1 Target selection 97

5.1.2 Variable selection 97

5.1.3 Case selection and risk segmentation 98

5.2 Data Preparation 101

5.2.1 Outlier and noise removal 102

5.2.2 Time-series data reduction 103


5.2.4 Feature selection 117

5.2.5 Time-series dimension reduction methods 117

5.3 Conclusions 120

6 CHAPTER 6: DATA MODELLING AND ANALYSIS 121

6.1 Feature Selection Methods 122

6.1.1 Model accuracy 122

6.1.2 Model complexity 124

6.1.3 Effect of feature selection on other measures of model performance 125

6.2 Datasets 126

6.3 Evaluation Measures 129

6.3.1 Misclassification rate for balanced and unbalanced data 129

6.3.2 Area under ROC curve in balanced and unbalanced data 131

6.3.3 Sensitivity, specificity and predictive values 133

6.3.4 Kappa statistic in unbalanced data 135

6.4 Effect of ASA and its imputation 139

6.5 Prediction Methods 141

6.5.1 Comparison of methods from each class of prediction algorithms 141

6.5.2 Comparison of decision tree and rule based prediction algorithms 145

6.6 Effect of A Priori Risk Stratification 147

6.7 Model Complexity 151

6.7.1 Effect of dataset 151

6.7.2 Effect of pre-processing method and risk factor data 153

6.8 Comparison of Models for Prediction of corVD and anyVD 163

6.9 Effect of Non Coronary Vascular Disease Status on Prediction of corVD 168

6.10 Primary hypotheses and Statistical analyses 170

6.11 Summary of Findings 171

6.12 Summary 173


7.1 Comparison with Exercise ECG 175

7.2 Known Difficulties with Prediction Based on Stress/Exercise ECG 178

7.3 Challenges of Time-Series Datasets 181

7.3.1 Pre-processing methods 182

7.3.2 Predictive method 184

7.4 Limitations 185

7.5 Research Design Lessons and Future Studies 186

7.5.1 Research Study Design Lessons 186

7.5.2 Future work 187

7.6 Conclusions 188

BIBLIOGRAPHY 191

APPENDICES 203

Appendix A: Time-series characteristics for Phase 1 and Phase 2 203

Appendix B: Description of ICD code groups 204

Appendix C: Examples of ST depression and elevation (Yanowitz 1996) 205

Appendix D: Variations of ST depression (Yanowitz 1996) 208

Appendix E: ECG lead placement (Yanowitz 1996) 209

Appendix F: Distribution of risk factor variables (shows proportion of type 1 diabetes, current and past smoking and various lipid disorders) 210

Appendix G: Distribution of vascular diseases burden per case 211

Appendix H: Examples of time-series derived variables 212

Appendix I: Effect of data pre-processing on count of heart rate values reaching adequate predicted maximum heart rate (phase 2, general populations and high risk subset) 213

Appendix J: Distribution of HR values in different patient groups 214

Appendix K: Effect of removal of outliers and noise on the number of cases considered to have significant ST deviation 216

Appendix L: Comparison of DT and rule based methods for selected datasets (AUC) 217

Appendix M: Comparison of DT and rule based methods for selected datasets (MR) 218

Appendix N: Comparison of methods and dataset for risk category subsets (MR) 219

Appendix O: Comparison of methods and datasets for risk category subsets (AUC) 220

Appendix P: Comparison of methods and datasets for risk category subsets (Kappa statistic) 221

Appendix Q: Comparison of methods in prediction of anyVD in selected subsets 222

Appendix R: Description of ST classes 225

Appendix S: Examples of J48 decision tree rule sets 226

Appendix T: Examples of decision trees 230

Appendix U: Comparison of DT and rule based methods for prediction of anyVD (MR and AUC) 233

List of Tables

Table 2-1: Diagnostic tests for coronary artery disease 13

Table 2-2: Confusion matrix for binary classification 42

Table 4-1: Variable groups 68

Table 4-2: Number of data rows in Phase 1 and Phase 2 datasets 70

Table 4-3: Count of cases for which each of trend variables was measured 71

Table 4-4: Comparison of time-series characteristics for total valid time-series and time-series with clinical data available (Phase 1) 72

Table 4-5: Time-series characteristics for risk groups 72

Table 4-6: Outlier characteristics (at 3 standard deviations from mean) 73

Table 4-7: Available demographic data for each of the datasets 78

Table 4-8: Available data (count and percent) with ranges for age, gender, ASA class and weight 79

Table 4-9: Distribution of cardiovascular diagnoses of relevance to the study 87

Table 4-10: Distribution of Vascular disease count 88

Table 4-11: Vascular diagnoses 89

Table 5-1: Components of case selection process 100

Table 5-2: Study populations and description 101

Table 5-3: Summary of datasets with variable examples 105

Table 6-1: Other performance measures for models based on all dataset either with or without RF variables (general population) 133

Table 6-2: Other performance measures for models based on all dataset either with or without RF variables (high risk population) 134

Table 6-3: Other performance measures for models based on all dataset either with or without RF variables (general population) 134

Table 6-4: Other performance measures for models based on all dataset either with or without RF variables (high risk population) 135

Table 6-5: Model complexity for general population (all variables, corVD prediction) 153

Table 6-6: Model complexity for general population (all variables, corVD prediction) 154

Table 6-7: Model complexity for all variables in high risk population (corVD) 155

Table 6-8: Model complexity for Cfs variables in high risk subset (corVD) 155

Table 6-9: Summary of model complexity for general population and prediction of corVD showing Cfs variables and subset merit 157


Table 6-11: Summary of model complexity for general population and prediction of anyVD showing Cfs variables and subset merit 160

Table 6-12: Summary of model complexity for high risk population and prediction of anyVD showing Cfs variables and subset merit 161

Table 6-13: Statistical analysis of hypotheses 170

List of Figures

Figure 2-1: Example of an electrocardiogram showing a normal heart beat (Yanowitz 1996) 16

Figure 2-2: Examples of ST/HR correlation plots associated with exercise stress tests (Hamasaki, Nakano et al 1998) 19

Figure 2-3: The CRISP-DM process model (CRISP-DM 2000) 33

Figure 2-4: Symbolic representation of a time-series dataset (Lin, Keogh et al 2003) 37

Figure 3-1: Study design flow diagram (dataset B contains variables related to time-series from which outliers have been removed and dataset E contains counts of outliers and abnormal values) 62

Figure 4-1: Distribution of monitored variables (Phase 1=Test; Phase 2=Validn) 69

Figure 4-2: Distribution of NIBP measurements 73

Figure 4-3: Distribution of NIBP measurements as a percent of case duration 74

Figure 4-4: Heart Rate control chart for Phase 1 dataset 74

Figure 4-5: ST segment level control chart for Phase 1 dataset 75

Figure 4-6: SpO2 control chart for cases in Phase 1 dataset 75

Figure 4-7: Mean HR values for entire case and for initial one fifth of case in VD categories 76

Figure 4-8: Standard deviation for heart rates in the VD categories 76

Figure 4-9: Mean ST values for entire case and for initial one fifth of case in VD categories 77

Figure 4-10: Gender distribution for all cases compared to low, high and very high risk cases 80

Figure 4-11: Age distribution in phase 1 and phase 2 80

Figure 4-12: Age distribution in risk subsets 81

Figure 4-13: Distribution of ASA classes in phase 1 and phase 2 cases 81

Figure 4-14: Distribution of emergency ASA class cases as a percent of cases in each class (phase 1 and phase 2 cases) 82

Figure 4-15: ASA class distribution for risk categories 82

Figure 4-16: ASA class distribution for emergency cases as percentage of total cases (risk subsets) 83

Figure 4-17: Patient characteristics in relation to vascular disease burden 84

Figure 4-18: Distribution of case duration for cases of Trend Length 30 minutes or greater and without evidence of case segmentation 85

Figure 4-19: Case duration for risk subsets 85

Figure 4-20: Case duration in relation to vascular disease status 86

Figure 4-21: Distribution of vascular disease location (phase 1 and phase 2) 88


Figure 4-23: Distribution of vascular disease 90

Figure 4-24: Risk factor distribution in phase 1 and phase 2 92

Figure 4-25: Distribution of risk factor as percentage of cases 92

Figure 4-26: Distribution of risk factor count (phase 1 and phase 2) 93

Figure 4-27: Distribution of risk factor count for risk subsets 93

Figure 4-28: Distribution of risk factor as percentage of cases 94

Figure 4-29: Weight distribution for phase 1 and phase 2 cases 94

Figure 4-30: Distribution of primary diagnostic groups (ICD code groups) 95

Figure 5-1: Derived variables 106

Figure 5-2: Distribution of corVD in stClass_4A subsets 107

Figure 5-3: Distribution of anyVD amongst stClass_4a (high risk population) 108

Figure 5-4: Distribution of corVD amongst stClass_4a categories (low risk population) 109

Figure 5-5: Example of time-series and correlation plots for HR and ST segments 110

Figure 5-6: Examples of ST/HR correlation based on raw time-series data (dataset A, upper section), outlier removed data (dataset B, mid section) and data with outliers removed and smoothed (dataset C, lower section) 111

Figure 5-7: Time-series and correlation plots for raw data, data with outliers removed and data with outliers and noise removed (Case A) 112

Figure 5-8: Time-series and ST/HR correlation plot (Case B) 112

Figure 5-9: Time-series and ST/HR correlation plot (Case C) 113

Figure 5-10: Time-series and ST/HR correlation plots (case D) 113

Figure 5-11: Time-series and correlation plots (Case E) 114

Figure 5-12: Time-series and ST/HR correlation plots (Case F) 114

Figure 5-13: Example time-series plots consistent with repeated episodes of stress and recovery but magnitude of ST depression was not significant (Case F) 115

Figure 5-14: Box plot of ST segment levels in Dataset A, Dataset B and Dataset C for cases in which three ECG leads were monitored 115

Figure 5-15: Box plot of HR in Dataset A, Dataset B and Dataset C for cases in which three ECG leads were monitored 116

Figure 5-16: Clusters for heart rate time-series patterns in Phase 1 cases in general population 119

Figure 5-17: ST segment time-series clusters (phase 2, high risk subset) 119

Figure 6-1: Comparison of feature selection method for high risk subset cases using dataset A and MR 123


Figure 6-3: Comparison of feature selection methods (model size) 125

Figure 6-4: Comparison of feature selection methods (other model performance measures) 125

Figure 6-5: Comparison of models based on time-series data alone or in combination with risk factor variables (corVD, general population) (MR) 127

Figure 6-6: Comparison of models based on time-series data alone or in combination with risk factor variables (corVD, general population) (AUC) 128

Figure 6-7: Comparison of models based on time-series data alone or in combination with risk factor variables (corVD, high risk population) (MR) 128

Figure 6-8: Comparison of models based on time-series data alone or in combination with risk factor variables (corVD, high risk population) (AUC) 129

Figure 6-9: Prediction of coronary vascular disease using both stratified cross validation and SpreadSubsample cross validation with J48 decision tree (misclassification rate) 130

Figure 6-10: Prediction of coronary vascular disease using both stratified cross validation and SpreadSubsample cross validation with J48 decision tree (misclassification rate) 131

Figure 6-11: Prediction of coronary vascular disease using both stratified cross validation and SpreadSubsample cross validation with J48 decision tree (AUC) 132

Figure 6-12: Prediction of any vascular disease using both stratified cross validation and SpreadSubsample cross validation with J48 decision tree (AUC) 132

Figure 6-13: Use of MR for evaluation of model performance based on selected subsets (RF_only, RF_A and RF_F) in the general and risk subset groups (DT J48, corVD) 136

Figure 6-14: Use of AUC for evaluation of model performance based on selected subsets (RF_only, RF_A and RF_F) in the general and risk subset groups (DT J48, corVD) 136

Figure 6-15: Use of Kappa statistic for evaluation of model performance based on selected subsets (RF_only, RF_A and RF_F) in the general and risk subset groups (DT J48, corVD) 137

Figure 6-16: Use of MR for evaluation of model performance based on selected subsets (RF_only, RF_A and RF_F) in the general and risk subset groups (DT J48, anyVD) 137

Figure 6-17: Use of AUC for evaluation of model performance based on selected subsets (RF_only, RF_A and RF_F) in the general and risk subset groups (DT J48, anyVD) 138

Figure 6-18: Effect of inclusion of ASA class in corVD prediction models for general populations (MR)


Figure 6-23: Performance evaluation using the Kappa statistic for select data subsets in the low risk and very high risk categories 143

Figure 6-24: Comparison of methods in prediction of corVD in selected datasets and using stratified cross validation in general population and high risk subset (MR) 144

Figure 6-25: Comparison of methods in prediction of corVD in selected datasets and using stratified cross validation (AUC) 144

Figure 6-26: Comparison of DT and rule based methods in the prediction of coronary VD in low risk and high risk populations following Cfs feature reduction (MR) 145

Figure 6-27: Comparison of DT and rule based methods in the prediction of coronary VD in low and high risk subsets following Cfs feature reduction (AUC) 146

Figure 6-28: Effect of risk category (RiskC) on prediction of corVD (MR) 148

Figure 6-29: Effect of risk category on prediction of corVD (Sensitivity and PPV) 148

Figure 6-30: Effect of risk category on prediction of corVD (AUC and Kappa) 149

Figure 6-31: Effect of risk category on prediction of corVD (Specificity and NPV) 149

Figure 6-32: Measure of model complexity in prediction of corVD, all variables 152

Figure 6-33: Measures of model complexity in prediction of corVD using Cfs data 152

Figure 6-34: Comparison models for prediction of corVD and anyVD in general population (Sensitivity) 164

Figure 6-35: Comparison models for prediction of corVD and anyVD in general population (AUC) 165

Figure 6-36: Comparison models for prediction of corVD and anyVD in general population (PPV) 165

Figure 6-37: Comparison of models based on time-series data alone or in combination with risk factor variables (anyVD, general population, Cfs) (MR) 166

Figure 6-38: Comparison of models based on time-series data alone or in combination with risk factor variables (anyVD, general population) (AUC) 166

Figure 6-39: Comparison of models based on time-series data alone or in combination with risk factor variables (anyVD, high risk population, Cfs) (MR) 167

Figure 6-40: Comparison of models based on time-series data alone or in combination with risk factor variables (anyVD, high risk population, Cfs) (AUC) 167

Figure 6-41: Effect of nonCorVD status and risk category on prediction of corVD 168

Figure 6-42: Effect of nonCorVD status and risk category on prediction of corVD (AUC) 169

Figure 6-43: Effect of nonCorVD status and risk category on prediction of corVD (Kappa) 169


List of Appendices

Appendix A: Time-series characteristics for Phase 1 and Phase 2 203

Appendix B: Description of ICD code groups 204

Appendix C: Examples of ST depression and elevation (Yanowitz 1996) 205

Appendix D: Variations of ST depression (Yanowitz 1996) 208

Appendix E: ECG lead placement (Yanowitz 1996) 209

Appendix F: Distribution of risk factor variables (shows proportion of type 1 diabetes, current and past smoking and various lipid disorders) 210

Appendix G: Distribution of vascular diseases burden per case 211

Appendix H: Examples of time-series derived variables 212

Appendix I: Effect of data pre-processing on count of heart rate values reaching adequate predicted maximum heart rate (phase 2, general populations and high risk subset) 213

Appendix J: Distribution of HR values in different patient groups 214

Appendix K: Effect of removal of outliers and noise on the number of cases considered to have significant ST deviation 216

Appendix L: Comparison of DT and rule based methods for selected datasets (AUC) 217

Appendix M: Comparison of DT and rule based methods for selected datasets (MR) 218

Appendix N: Comparison of methods and dataset for risk category subsets (MR) 219

Appendix O: Comparison of methods and datasets for risk category subsets (AUC) 220

Appendix P: Comparison of methods and datasets for risk category subsets (Kappa statistic) 221

Appendix Q: Comparison of methods in prediction of anyVD in selected subsets 222

Appendix R: Description of ST classes 225

Appendix S: Examples of J48 decision tree rule sets 226

Appendix T: Examples of decision trees 230

Appendix U: Comparison of DT and rule based methods for prediction of anyVD 233

List of Abbreviations

Abbreviation Description

AARK Automated anaesthesia record keeping

ABPI Ankle brachial pressure index

AHA/ACC American Heart Association / American College of Cardiology

AIM Anaesthetic information management

ANN Artificial neural networks

ASA American Society of Anesthesiologists

BP / SBP Blood pressure / systolic blood pressure

bpm Beats per minute

Bradycardia Heart rate below normal range

CABG Coronary artery bypass grafting

CAD Coronary artery disease = CVD

CART Classification and regression tree

CCTA Coronary computer tomographic angiography

cerVD Cerebral vascular disease

CHD Coronary heart disease = CVD

Chronotropic HR response Ability to increase heart rate in response to increased demand

Claudication Pain related to ischaemic tissue associated with atherosclerotic vascular disease, frequently in legs and initially with exercise only

CO2ET, FI Inspired (FI) and expired (end tidal, ET) concentration of carbon dioxide

CombinVD Coronary vascular disease and non coronary vascular disease

corVD Term representing the presence of coronary vascular disease in the models developed here. It has the same meaning as CVD, CHD

CRP C-reactive protein, a marker of inflammation

CVD Cardiovascular disease

CVP Central venous pressure measured via a venous line extended to a central vein

DESET, FI Inspired (fraction inspired, FI) and expired (end-tidal, ET) concentration of Desflurane, a volatile anaesthetic agent

DM Diabetes mellitus / data mining


FiO2 Fraction of inspired oxygen

FORF Random forests

FRS Framingham risk score

HR recovery Rate at which the HR returns to baseline level following exercise or pharmacological stimulus

HRminDiff HR at which there is least variability in ST segment level

HRV Heart rate variability

HSM Health stability measure

Hypertension BP above normal range

Hyperventilation Rapid respiratory rate

Hypokalemia Low serum potassium level

Hypotension BP below normal range

Hypovolaemia Reduced volume of blood in the intravascular space

IL-6 Interleukin-6, a marker of inflammation

LogR/ LR Logistic regression

LVEF Left ventricular ejection fraction, a measure of heart pump function

maxDiff Maximum variability of ST segment level at particular heart rate

MCDA Multi-criteria decision analysis

MET Metabolic equivalents, a measure of exercise performed

minDiff Minimum variability of ST segment level at specific HR

NIBP(sys) Non invasive Blood Pressure (systolic)

nonCorVD Non coronary vascular disease (either cerebral or peripheral)

O2ET, FI Inspired (FI) and expired (end tidal, ET) concentration of oxygen

PAA Piecewise aggregate approximation

PAD Peripheral arterial disease

PART Partial regression tree

PCA Piecewise constant approximation

perVD Peripheral vascular disease

PLA Piecewise linear approximation

PLR Piecewise linear regression

PMV Prolonged mechanical ventilation

PTCA Percutaneous transluminal coronary angioplasty (reducing lipid


R-R interval Distance between two consecutive R waves on the ECG

SAX Symbolic aggregate approximation

Serum creatinine Measure of kidney disease

SEVET /FI Inspired (fraction inspired, FI) and expired (end-tidal, ET) concentration of Sevoflurane

SIRS Systemic inflammatory response syndrome

SMO Optimised support vector machines

SpO2 Peripheral oxygen saturation

SRI Stress recovery index

ST Level of ST segment in ECG tracing

SVM Support vector machines

T, A, T/A Thoracic, abdominal, thoraco-abdominal (aorta)

T wave Final component of ECG tracing, reflects repolarisation

Tachycardia Heart rate above normal range

TAN Tree augmented Naïve Bayes

TWA T-wave alternans, represents a beat to beat alteration in the shape, amplitude and timing of the ST segment and T wave

Urea Another measure of kidney function


Statement of Original Authorship

The work contained in this thesis has not been previously submitted to meet requirements for an award at this or any other higher education institution. To the best of my knowledge and belief, the thesis contains no material previously published or written by another person except where due reference is made.

( Ellen Pitt )

Date: 13 August, 2009


Acknowledgements

For the assistance and support I have received from my supervisors Dr Richi Nayak, Dr Alan Tickle and Dr Philip Cumpston, I am truly grateful. The contribution of the Department of Anaesthesia and Health Information Services staff who supplied the datasets for the study is acknowledged with thanks, as is the help I have received from my fellow students.

CHAPTER 1: INTRODUCTION

This chapter outlines the background (Section 1.1) and context (Section 1.2) of this study in the areas of atherosclerotic coronary artery disease and anaesthesia process and complications. It also addresses the process of data mining, in particular the use of time-series analyses in the setting of automatically collected anaesthesia related time-series data. The chapter also sets out the research objectives (Section 1.3) and research questions (Section 1.4) and provides an outline of the thesis (Section 1.5) and a summary of the significant findings (Section 1.6).

1.1 BACKGROUND

Atherosclerotic coronary vascular disease is a leading cause of morbidity and mortality worldwide (WHO 2005). Risk factors, preventative measures, diagnostic techniques and treatment are well known (Wilson, D'Agostino et al 1998). This knowledge has resulted in a reduction in the death rate from the disease, but sudden cardiac death and significant cardiac morbidity resulting from late diagnosis of the disease remain a major public health concern (Wilson, D'Agostino et al 1998). As well, cardiac complications are the most likely cause of anaesthesia related deaths and post-operative morbidity (Wallace 2007). Prior knowledge of a cardiac diagnosis impacts decisions regarding the mode of anaesthesia delivery and intensity of perioperative monitoring (de Silva 2007). Hence, any means of identifying patients at increased risk for coronary vascular disease would be of benefit. Physiological monitoring during all cases of anaesthesia now provides haemodynamic and electrocardiographic data with the potential to identify previously undiagnosed coronary artery disease.

This project has been undertaken to assess features of a large automatically collected database of physiological time-series data (AARK/AIM) in combination with several clinical features and to determine whether data mining analysis is able to identify predictors of atherosclerotic coronary vascular disease or a surrogate measure, the presence of any atherosclerotic disease related diagnosis (Murabito, Evans et al. 2003). Should such predictive rules be identified and validated in a larger population from several settings, it is envisioned that an inference engine/smart alarm capable of warning anaesthetists of increased coronary vascular disease risk could be of benefit in optimising the process of anaesthesia delivery and reducing associated cardiac complications. As well, there is the potential that accelerated diagnosis for some patients would lessen subsequent cardiac morbidity and mortality.


1.2 CONTEXT

Increasingly, both supervised (predictive) and unsupervised (clustering, association rules) data mining methods are being applied to medical datasets, in particular to time-series physiological data (Bellazzi and Zupan 2008). The mining of medical time-series data, as with time-series analysis in other domains, is complicated by several features, primarily data size, the presence of noise, and scale and offset inconsistencies. Features of particular relevance in medical data analysis are the high likelihood that manually recorded data will be unavailable for analysis (Bellazzi and Zupan 2008) and that electronic data will contain significant noise. Despite these problems, more recently developed methods for analysis of large time-series databases with instances of different duration have been shown to allow dimension reduction without loss of significant information (Lin, Keogh et al. 2003; Ratanamahatana, Vlachos et al. 2005). One such method is the Symbolic Aggregate Approximation (Lin, Keogh et al. 2003), which has been shown to allow clustering of electrocardiographic data at least as effectively as the gold standard measure of Euclidean distance.
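The dimension reduction performed by the Symbolic Aggregate Approximation can be illustrated with a minimal sketch. It is not the implementation used in this study: the function names, the four-letter alphabet, the use of the population standard deviation and the sample heart rate values are all assumptions made for illustration. The series is z-normalised, averaged over equal-width segments (piecewise aggregate approximation), and each segment mean is mapped to a letter using breakpoints that divide the standard normal distribution into equiprobable regions.

```python
from statistics import NormalDist, mean, pstdev


def z_normalise(series):
    """Rescale a series to zero mean and unit standard deviation."""
    mu = mean(series)
    sigma = pstdev(series)  # assumption: population standard deviation
    if sigma == 0:
        return [0.0] * len(series)  # a flat series carries no shape information
    return [(x - mu) / sigma for x in series]


def paa(series, n_segments):
    """Piecewise aggregate approximation: the mean of each of n_segments chunks."""
    n = len(series)
    if not 0 < n_segments <= n:
        raise ValueError("n_segments must be between 1 and len(series)")
    bounds = [i * n // n_segments for i in range(n_segments + 1)]
    return [mean(series[a:b]) for a, b in zip(bounds, bounds[1:])]


def sax_word(series, n_segments=4, alphabet="abcd"):
    """Convert a numeric series to a SAX word of length n_segments."""
    k = len(alphabet)
    # Breakpoints split the standard normal curve into k equiprobable regions.
    breakpoints = [NormalDist().inv_cdf(j / k) for j in range(1, k)]
    word = []
    for value in paa(z_normalise(series), n_segments):
        index = sum(value > b for b in breakpoints)  # region the mean falls in
        word.append(alphabet[index])
    return "".join(word)


# Hypothetical heart rate readings (beats per minute), for illustration only.
heart_rate = [62, 64, 70, 75, 88, 95, 90, 72]
print(sax_word(heart_rate, n_segments=4, alphabet="abcd"))
```

A series of any length is reduced to a short word over a small alphabet, so cases of different duration can be compared on the same footing. The segment count and alphabet size control how much detail is retained.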

As well as the need for dimension reduction in time-series databases, the benefit of feature selection in static databases is well known and is associated with a marked reduction in model complexity and frequently with an increase in predictive accuracy (Witten and Frank 2005).

Features of intraoperative monitoring in certain populations and ambulatory and intensive care monitoring data are increasingly being mined for predictors of significant outcomes and diagnoses (Ramon, Fierens et al. 2007), and methods specific to pre-processing physiological variable monitoring output have been proposed (Verduijn, Sacchi et al. 2007).

1.3 RESEARCH OBJECTIVE

This study attempts to assess the ability of anaesthesia related data to predict atherosclerotic coronary vascular disease (CVD) in ways similar to stress electrocardiography. The application of this method to a potentially otherwise unscreened population at risk for coronary artery disease is significant. It has the potential to broaden primary and secondary prevention measures to those who may not otherwise avail themselves of preventative medical care. Past efforts to identify increased cardiac risk from anaesthesia electrocardiographic monitoring data have been mostly in high risk vascular surgical procedures. There are limited studies in which large numbers of unselected, non-cardiac anaesthesia cases have been subjected to data mining (Landesberg 2005).


Cardiovascular disease is important in the general population and is of relevance in determining cardiovascular risk associated with anaesthesia delivery as well as the most appropriate anaesthetic agent. The research objective for this study, therefore, is to determine what features of an anaesthesia information management (AIM) database may be useful in the prediction of CVD. Several prediction data mining methods are to be applied to datasets of time-series variables and clinical data from the hospital’s health record database. Health record variables include several cardiovascular risk factors and the target variables of coronary vascular disease diagnoses and a diagnosis of atherosclerosis involving other major arteries (peripheral, aortic or cerebral). Data are subjected to feature reduction to allow development of predictive rules with lower complexity and potentially higher accuracy.

A secondary goal of the study is to assess the utility of this electronic database as a means of quality assurance. The use of electronic databases has been shown to be more effective than the current manual review of cases and self reported incidents/events (Grant, Ludbrook et al. 2008).

1.4 RESEARCH QUESTIONS

The questions this study attempts to answer are the following:

1) Can the techniques of data mining applied to time-series data predict the presence of CVD (labelled corVD in database) in patients undergoing non-cardiac anaesthesia? This question will address the effect of different pre-processing and prediction methods on the accuracy and complexity of prediction models as well as the best measure of model performance.

2) Can the addition of the patient related CVD risk factors as prediction input variables improve the performance of models based on time-series derived variables alone?

1.5 THESIS OUTLINE

The remaining segments of this report address the literature review (Chapter 2), research design (Chapter 3), data exploration (Chapter 4) and pre-processing (Chapter 5). Data modelling and analysis of results are presented in Chapter 6, with discussion and conclusions being addressed in Chapter 7.

The literature reviewed in Section 2.1 relates to various elements of atherosclerotic coronary vascular disease, in particular its risk factors, prevention, diagnosis and treatment. The importance of the risk factors hypertension, diabetes, cigarette smoking and lipid disorders is stressed, as is the importance of early diagnosis in the prevention of sudden cardiac death and long term cardiac morbidity and mortality. The diagnostic methods used in this disease and their effectiveness in screening asymptomatic and symptomatic patients in high and low risk populations are reviewed. The need for effective detection methods in this disease is highlighted by the increasing evidence that risk modification measures, revascularisation methods and treatments that reduce a patient’s risk of sudden cardiac death are effective in reducing morbidity and mortality. Section 2.2 addresses the relevant segments of the anaesthesia literature, covering primarily the increased anaesthesia risk posed by the presence of coronary vascular disease. Because coronary heart disease is the primary cause of anaesthesia associated deaths, identification of risk factors for underlying disease has been studied. The use of automatically collected anaesthesia related physiological time-series data in a general population has been more limited. Literature related to the importance of electronically collected physiological time-series data and its benefit in quality control initiatives is also presented.

Chapter 2 also reviews the concepts of data mining in relationship to both static and temporal databases. Focus is on feature selection methods, prediction methods and applications of these methods, primarily in medical contexts but especially where physiological monitoring data has been used. The importance of pre-processing data and the difference in these methods as they are applied to static and time-series databases are reviewed, as are the various categories of predictive and descriptive data mining methods. The implications of the literature review for this study are presented in Section 2.4.

Chapter 3, Research Design, provides an overview of the data acquisition and the processing and modelling methods employed in this study. It covers the selection of the target variables and cases as well as time-series and risk factor variables. Data exploration describes the characteristics of the time-series data and the demographic and clinical characteristics of the anaesthesia cases studied here. The feature selection (FS) methods Wrapper, Consistency, Cfs and ReliefF are compared in the pre-processing of static data. The dimension reduction method SAX (Symbolic Aggregate Approximation) transformation and other quantitative abstraction methods are used for the large time-series dataset. Modelling techniques include at least one from each of the major predictive algorithm categories (decision tree (DT) and rule based as well as nearest neighbour, regression, support vector machine and neural network based methods). Based on the known comprehensibility advantages of DT and rule based methods, several of these algorithms are used. Evaluation methods based on balanced data are compared with n-fold cross-validation, and accuracy and model complexity are compared for the different datasets and methods. The importance of the unbalanced nature of the data being analysed here is assessed and means of lessening its impact are presented.

Chapter 4, Data Exploration, addresses in more detail the data exploration component of the study and presents this as it relates to the time-series data and to the demographic and clinical characteristics of the patients. The use of time-series variables as a means of grouping the patients into risk categories is described. The application of standard pre-processing methods such as outlier removal and noise reduction as well as SAX dimension reduction for the time-series data is described. Datasets derived using these and other temporal abstraction methods and the addition of risk factor variables to time-series derived data are also described.

Chapter 5, Data Pre-Processing, addresses in more detail the methods used in data selection (case, target and input variables) as well as the methods used for outlier and noise removal. Feature selection and dimension reduction methods are described. Chapter 6, Data Modelling and Result Analysis, shows the results of modelling and attempts to identify what methods are best suited to analysis of AIM data. Discussion and conclusions are presented in Chapter 7, which also addresses what future investigations of this data may be undertaken and the implications of the results for development of an inference engine/smart monitor for use in anaesthesia practice.

1.6 SIGNIFICANT FINDINGS

• Decision tree and rule based methods produce models which are more comprehensible to the clinicians for whom these models are derived.

• The decision tree method (J48) produces model accuracies comparable with those produced using logistic regression and neural network methods. Amongst the DT and rule based methods, J48 models have accuracies that are either comparable or in the mid range, depending on the dataset tested.

2. The accuracy and other performance measures for time-series based prediction models are significantly improved by the addition of risk factor variables to the modelling process.


1.7 OTHER FINDINGS

Other findings in this study include the following:

• Misclassification rate alone is an insensitive measure of performance in this dataset with unbalanced class distribution. Area under the ROC curve (AUC) and Kappa statistics are less biased by the class distribution. Sensitivity, specificity and positive and negative predictive values are of value to clinicians.

• Pre-processing methods have limited effect on models based on risk factor (RF) and time-series variables. However, for models based on time-series variables only, the use of time-series data from which outliers have been removed results in models which perform marginally better, as do models based on time-series variables related to the presence of physiological values outside the normal range. Models based on summary data from the entire raw time-series and models based on summary data from segments of the time-series have similar accuracy.

• Inclusion of non coronary vascular disease status with other risk factors results in some improvement in model performance. This improvement reaches statistical significance for models based only on all RF variables in the general population (P = 0.055) as it does for models based on Cfs reduced datasets derived from all time-series variables (pre-processed by different methods) with RF variables (RF_all) (P = 0.005).

• Data could be pre-processed such that, for several physiological variables, some assessment of the deviation of values from the mean for each case or from recognised normal values could be made. Assessment of the clinical relevance of these deviations would require additional clinical input. The response to such values has been used by others to assess clinical significance. The data may be useful as a screening tool in quality assurance efforts.
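The point made above about misclassification rate in an unbalanced dataset can be sketched with a short calculation. The function name and the confusion matrix counts below are hypothetical and are not results from this study. The sketch derives the measures named in the findings from the four cells of a two-class confusion matrix, with Cohen's Kappa computed as observed agreement corrected for the agreement expected by chance.

```python
import math


def _ratio(numerator, denominator):
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else math.nan


def classification_measures(tp, fp, fn, tn):
    """Performance measures from true/false positive and negative counts."""
    total = tp + fp + fn + tn
    accuracy = _ratio(tp + tn, total)
    # Agreement expected by chance, from the row and column totals.
    expected = _ratio((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn), total * total)
    return {
        "misclassification_rate": 1 - accuracy,
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "ppv": _ratio(tp, tp + fp),
        "npv": _ratio(tn, tn + fn),
        "kappa": _ratio(accuracy - expected, 1 - expected),
    }


# Hypothetical case: 10 diseased patients in 100, and a model that predicts
# "no disease" for everyone. Counts are for illustration only.
always_negative = classification_measures(tp=0, fp=0, fn=10, tn=90)
for name, value in always_negative.items():
    print(f"{name}: {value:.2f}")
```

In this hypothetical case the model misclassifies only one patient in ten yet detects none of the diseased patients, so its sensitivity and Kappa are both zero. This is why a low misclassification rate on its own says little when one class dominates.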


This chapter expands upon the cardiological (Section 2.1) and anaesthetic (Section 2.2) issues raised in the introduction and addresses the data mining (Section 2.3) literature relevant to the technique in general and, more specifically, to the use of physiological time-series data in medical prediction tasks.

Section 2.1 addresses the high mortality and morbidity associated with atherosclerotic coronary artery disease as well as associated risk factors, diagnostic methods and treatment. It highlights the importance of prevention and early diagnosis. Known preventative strategies as well as diagnostic and treatment methods are addressed. The details of an ECG based, non-invasive means of predicting coronary artery disease risk in both symptomatic and asymptomatic patients are also covered. Such description leads to consideration of the similarities between stress electrocardiographic data and that collected during anaesthesia and monitored in intensive care patients.

Section 2.2 briefly describes the process of anaesthesia, the associated risks as well as the importance of physiological monitoring in the reduction of this risk. Cardiovascular risk and its implication in the choice of anaesthesia mode are addressed. Recent studies regarding the use of automatic monitoring data in quality assurance and the use of electrocardiographic changes in the perioperative period as a predictor of postoperative cardiac events are reviewed. The use of electronically stored data in such efforts has been shown to be more effective than current methods of quality management (Grant, Ludbrook et al. 2008).

Section 2.3 describes data pre-processing methods, in particular feature reduction methods for static databases and dimension reduction methods for time-series databases. The predictive data mining methods to be used here are reviewed briefly and the advantages and disadvantages of each are addressed. The descriptive method, clustering, is also briefly reviewed (Witten and Frank 2005). Section 2.4 reviews the application of data mining methods in the medical domain with reference to the difficulties in the use of medical data. Also addressed is the use of physiological trend and other clinical data in the prediction of important clinical outcomes. Section 2.5 highlights the implications of the reviewed works for this study.

The literature review shows the importance of coronary artery disease both in general and in relation to anaesthesia practice and stresses the importance of diagnosis at a stage when risk factor management and treatment can result in the reduction of associated death and chronic disease. The similarity between exercise electrocardiography (exECG) and anaesthesia physiological time-series data suggests that the latter may have a similar role in prediction of future coronary events. If such could be shown, the currently under-utilized monitor output stored in anaesthesia information management systems (AIM) may be of use as a screening procedure. It is to this end that the current study has been undertaken, and the inclusion of risk factor variables in the prediction models is in keeping with the importance of risk factor analysis in general cardiology.

2.1 CORONARY VASCULAR DISEASE

This section reviews the literature related to the incidence of cardiovascular disease as well as known risk factors and associations. The benefit of risk factor modification for prevention and treatment together with revascularisation measures is described, but the major focus of this section is the review of diagnostic techniques with particular emphasis on exercise electrocardiography (exECG). ECG and non ECG abnormalities of the test as they relate to diagnosis and prediction of cardiovascular events are described, as are factors that confound the interpretation of exECG and limit its application in diagnostic and prediction tasks. Proposed solutions to these confounding issues, including consideration of other clinical features in addition to exECG results, are addressed. The controversy regarding the use of exECG in asymptomatic patients is also reviewed.

2.1.1 Impact of cardiovascular disease

Cardiovascular disease (CVD) incidence and prevalence are high (WHO 2005) and mortality varies worldwide, ranging from 106 to 844 per 100,000 population (2004) while, for Australia that year, mortality was 140 per 100,000. In developed nations, the death rate from the disease is decreasing as primary and secondary preventive measures are instituted (Rogers, Frederick et al. 2008). For Australia in 2004/05 (Australian Bureau of Statistics 2004-5), 3.8% of the population reported heart, cerebral or other vascular conditions. Of these, 28% reported angina, 20% other ischaemic heart disease, 12% cerebrovascular disease and 35% oedema and heart failure, with 27% reporting disease of blood vessels. The prevalence of heart, stroke or vascular conditions had decreased from 4.3% in 2001. Direct health care costs for cardiovascular disease represented 11% of total health expenditure (5.4 billion dollars) for 2000-01.

Long term disability is associated with the complications of this disease, specifically chronic heart failure. Also associated with CVD is the risk of sudden cardiac death (SCD) related to electrical instability in damaged heart muscle. SCD can be a presenting feature of this disease (Goldberger, Cain et al. 2008; Kaikkonen, Kortelainen et al. 2009). In the US, the annual incidence of sudden arrhythmic death has been estimated to be between 184,000 and 462,000, and most of those affected have structural heart disease. Sudden death may be the first indication of disease and it is known to occur most frequently in those with low to intermediate risk based on risk factor profile. The identification of patients at risk for SCD is of significance since device (Chapa, Lee et al. 2008) and pharmacologic means of preventing or aborting SCD exist (Goldberger, Cain et al. 2008). SCD can occur with total and often sudden occlusion of a large coronary artery supplying a significant proportion of heart muscle. Several predictors of such events have been proposed and include risk factor profile, stress electrocardiographic features and markers of the inflammatory process. Inflammation is considered important in the rupture of lipid plaque, the presence of which is the hallmark of this disease (Goldstein, Demetriou et al. 2000). Following review of these disease predictors, effectiveness of current treatment options is addressed and the importance of early diagnosis in the prevention of SCD and long term disability is stressed. Diagnostic measures including the exECG are described.

2.1.2 Risk factors and associated vascular disease

This section addresses the literature relevant to the pathophysiology of atherosclerotic coronary artery disease and known risk factors as well as associated vascular diseases, in particular peripheral arterial disease (PAD). Risk factors are again addressed in the section on issues in interpretation and use of exECG results for disease diagnosis and patient management.

2.1.2.1 Risk factors

The pathophysiology of this disease relates to deposition of lipids in vessel walls early in life, and progression is associated with on-going vascular inflammation (Napoli, Lerman et al. 2006). Risk factors have been considered to be modifiable (e.g., hypertension, smoking, elevated cholesterol) or non-modifiable (e.g., age, sex and family history) (Bonetti, Lerman et al. 2003), but data from the Framingham risk study showed lifestyle factors to be strong predictors of CVD risk (Wilson, D'Agostino et al. 1998; Pearson, Blair et al. 2002; Anand, Lahiri et al. 2003; Keteyian, Brawner et al. 2008). Obesity has not been considered an independent risk factor for the development of coronary artery disease but it is associated with increased incidence of hypertension, diabetes (Type 2) and lipid disorders. Efforts to reduce obesity in society are important indirectly in the reduction in coronary vascular morbidity and mortality (Church, Barlow et al. 2005). Abdominal obesity has more recently been considered to be an independent risk factor for atherosclerosis. Endothelial dysfunction is associated with obesity and in childhood it can be reversed with diet and exercise (Woo, Chook et al. 2004).

Excessive body fat, particularly metabolically active abdominal fatty deposits, has been shown to be related to inflammatory markers such as C-reactive protein (CRP) and fibrinogen level (Kahn, Zinman et al. 2006) after adjustment for other cardiac risk factors. Such other cardiac risk factors are often associated with increased levels of CRP and interleukin 6 (IL-6). The potential for these inflammatory markers to be an intermediary for the risk factors of age, body mass index, smoking, hypertension, activity, high density lipoprotein, depression and diabetes is supported by the presence of increased levels of these inflammatory markers in association with these risk factors. The syndrome identified as Metabolic Syndrome is associated with increased risk of diabetes and cardiovascular disease. The components of the syndrome include abdominal obesity, elevated blood glucose, triglycerides and blood pressure as well as reduced high density lipids (Firdaus and Lyons 2007). It is considered to be a chronic inflammatory state and is associated with increased levels of inflammatory markers.

The Framingham Risk Score predicts risk of cardiac events based on the presence and severity of known risk factors (Wilson, D'Agostino et al. 1998; NCEP 2002). It uses categorical variables for blood pressure (systolic and diastolic), total cholesterol, and LDL-cholesterol to effectively predict coronary heart disease (CHD) in a middle-aged white population sample without overt CVD. Risk is stratified into groups according to 10 year mortality: low (<6%), intermediate (6-9%), high (10-19%) and very high (>20%).
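The stratification described above can be written as a small lookup, shown here only as an illustrative sketch. The function name is hypothetical, and the handling of values that fall between the quoted bands (for example 9.5% or exactly 20%) is an assumption: half-open intervals are used so that every percentage maps to one group.

```python
def framingham_risk_group(ten_year_risk_percent):
    """Map a 10 year risk estimate (in percent) to the groups quoted in the text.

    Assumption: bands are treated as half-open intervals, so 6 <= risk < 10 is
    intermediate, 10 <= risk < 20 is high, and 20 or more is very high.
    """
    if not 0 <= ten_year_risk_percent <= 100:
        raise ValueError("risk must be a percentage between 0 and 100")
    if ten_year_risk_percent < 6:
        return "low"
    if ten_year_risk_percent < 10:
        return "intermediate"
    if ten_year_risk_percent < 20:
        return "high"
    return "very high"


# Hypothetical risk estimates, for illustration only.
for risk in (3, 8, 15, 25):
    print(risk, framingham_risk_group(risk))
```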

2.1.2.2 Associated vascular disease

Coronary vascular disease is known to be strongly associated with atherosclerotic disease in other vessels, either cerebral with associated risk of stroke or peripheral with claudication and risk of limb loss (Hooi, Stoffers et al. 1999; Hooi, Kester et al. 2004; Norman, Eikelboom et al. 2004; Eigenbrodt, Sukhija et al. 2007). Both coronary and other vascular atherosclerosis may be asymptomatic and several studies have used non-invasive measures to detect asymptomatic peripheral arterial disease (PAD). Methods include comparison of blood pressure readings taken in the arm and lower leg using an ultrasound device. The ankle brachial pressure index (ABPI) is ankle pressure/brachial pressure. A value of <0.9 has a sensitivity of 95% and a specificity of 99% for atherosclerotic PAD. Values of less than 0.5 indicate severe peripheral arterial stenosis (Vogt, McKenna et al. 1993; Alzamora, Baena-Diez et al. 2007), and studies have shown an increased risk of cardiovascular events in those with PAD, both symptomatic and asymptomatic (Hooi, Stoffers et al. 1998). Those with PAD have the same risk of death from cardiovascular disease as those with known coronary or cerebral vascular disease and are four times more likely to die within 10 years than patients without disease. The prevalence of PAD in epidemiological studies using the ABPI is estimated to be 10-25% in men and women over age 55 but only 10-20% of those identified with disease in epidemiological studies are symptomatic (Meijer et al., 1998). In a Western Australian study the prevalence was 10.6% for men aged 65-69 but increased to 23.3% for men 75-79 years of age (Fowler et al., 2002). A study of a larger and somewhat younger population in the Netherlands (Hooi et al., 2004) used Cox proportional hazard models in 3649 patients aged 40-78 years and found a significant association between PAD and cardiovascular morbidity (hazard ratio 1.6, 95% confidence interval 1.3-2.1), total mortality (hazard ratio 1.4, 95% confidence interval 1.1-1.8) and cardiovascular mortality (hazard ratio 1.5, 95% confidence interval 1.1-2.1).
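The ankle brachial pressure index and the thresholds quoted above (<0.9 and <0.5) can be expressed as a brief sketch. The function names, the category labels and the pressure readings are hypothetical, and the use of systolic pressures in mmHg is an assumption, since the text states only that the index is ankle pressure divided by brachial pressure.

```python
def abpi(ankle_pressure_mmhg, brachial_pressure_mmhg):
    """Ankle brachial pressure index: ankle pressure / brachial pressure."""
    if ankle_pressure_mmhg < 0 or brachial_pressure_mmhg <= 0:
        raise ValueError("pressures must be positive")
    return ankle_pressure_mmhg / brachial_pressure_mmhg


def classify_abpi(index):
    """Apply the two thresholds quoted in the text (labels are illustrative)."""
    if index < 0.5:
        return "severe peripheral arterial stenosis"
    if index < 0.9:
        return "consistent with PAD"
    return "not below the 0.9 threshold"


# Hypothetical readings (assumed systolic, mmHg), for illustration only.
for ankle, arm in ((50, 125), (96, 120), (130, 125)):
    index = abpi(ankle, arm)
    print(f"ankle {ankle}, arm {arm}: ABPI {index:.2f}, {classify_abpi(index)}")
```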

A meta-analysis based on 9 studies has shown the sensitivity and specificity of an ABPI < 0.9 to be 16.5% and 92.5% respectively for the prediction of incident CVD and 16.0% and 92.2% respectively for prediction of incident stroke (Doobay and Anand 2005). The sensitivity for this index (ABPI) in the prediction of cardiovascular mortality was higher at 41.0% with a specificity of 87.9%. The corresponding positive likelihood ratios were 2.53 for coronary heart disease, 2.45 for stroke and 5.61 for cardiovascular death. In a Framingham study population (251 males and 423 females with mean age of 80), 20% were shown to have an ABPI of < 0.9 and only 18% of these were symptomatic. The hazard ratio for stroke in this population was 2.0 in those with a low ABPI but no significant relation with CVD or death was found (HR 1.2 and 1.4 respectively).

Similarly to PAD, evidence of cerebrovascular atherosclerosis is also associated with increased hazard ratios for prevalent myocardial infarction and new cardiac events (Eigenbrodt, Sukhija et al. 2007; Rizzo, Corrado et al. 2008). In a population of 11,225, the odds ratio for prevalent MI in females was 2.0 and 1.16 for males. After correction for risk factors in a logistic regression model, these values were 1.75 and 1.27 respectively. The presence of carotid vascular disease was shown to add minimally to the predictive power of risk factors alone.

Recent evaluation of several self reported measures of health status has shown that they predict mortality in the 12 months following admission with an acute coronary syndrome (unstable angina, acute myocardial infarction), even after adjusting for standard clinical risk factors (Thombs, Ziegelstein et al. 2008). Logistic regression modelling was used in the analysis. Data regarding the cause of death were not available.

2.1.3 Diagnostic methods

CVD can be diagnosed using several imaging and functional studies. Other investigations are used to assess the stability of lipid plaque and the risk of fatal cardiac arrhythmia. These investigations are summarised in Table 2-1. A brief review of investigations aimed at predicting unstable coronary plaque and risk of fatal arrhythmia is included following description of imaging methods. More detail regarding the functional method of interest to this study (exECG) follows.

2.1.3.1 Imaging methods

The “gold standard” imaging method is coronary angiography, by which vessel occlusion can be quantified and localized. This is an invasive and expensive test with attendant risk (Anderson, Adams et al. 2007). It is able to document coronary vessel narrowing at rest but, despite its being the “gold standard”, can give no indication of the stability of the plaque in the vessel wall. Hence, seemingly subcritical vessel narrowing could be suddenly converted to total or critical occlusion if unstable lipid plaque ruptures (Ambrose 2006). Less invasive investigations include combined functional/imaging tests such as exercise echocardiography (Marwick, Case et al. 2001) and exercise nuclear medicine scans (Smith Jr, Amsterdam et al. 2000; Mieres JH, Shaw LJ et al. 2005; Bedetti, Pasanisi et al. 2008; Blumenthal and Mieres 2008). Computed tomography (CT) for identification of calcified coronary plaque (LaMont, Budoff et al. 2002; Anand, Lahiri et al. 2003) is also used. CT scanning and magnetic resonance imaging (MRI) are also being used to image coronary arteries less invasively, in a manner similar to coronary angiography (Budoff, Dowe et al. 2008). MRI has been used to diagnose acute coronary syndromes (ACS) (Kwong, Schussheim et al. 2003; Lockie, Nagel et al. 2009). The relationship between CT angiographic (CCTA) plaque characteristic (calcified, non-calcified and mixed), ST segment depression and exECG (Duke Treadmill) scores (DTS) has also been studied (Lin, Saba et al. 2009). The ST segment is a component of the electrocardiogram (ECG) which is either elevated or depressed in the presence of myocardial ischaemia. The number of coronary artery segments with mixed plaque, as compared to calcified or non-calcified plaque, was more predictive of ST depression and DTS, suggesting a role for this test in obviating the need for invasive angiography in those with a positive exECG but no plaque on CCTA. CCTA is also able to detect obstructive plaque earlier than exECG.


Table 2-1: Diagnostic tests for coronary artery disease

Category | Test | Description | Invasive/Non-invasive
Functional | exECG | Changes in (ST segment). Exercise or pharmacologic stress based on standardised protocols. Exercise capacity, HR recovery | Non-invasive
Functional | Stress/Dobutamine echocardiography | Images the heart wall post exECG and identifies poorly functioning areas associated with occluded coronary arteries | Non-invasive
Functional | Stress nucleotide study | As above but uses nuclear scanning methods for visualisation of heart wall function | Non-invasive
Imaging | Coronary calcium CT | Identifies calcified coronary artery plaque, but it is known that not all plaque is calcified (calcifies later in natural history of disease) | Non-invasive
Imaging | … tomography | Similar to nucleotide imaging but allowing 3D views | Non-invasive
Imaging | … | … extent but give no indication of plaque stability | Invasive
Imaging | … | … coronary plaque, and MRI has been used to show extent of damaged myocardium in ACS | Non-invasive
Biochemical | CRP, homocysteine, interleukin | Markers of systemic inflammation and are correlated with CVD risk | …
Biochemical | … | … usually released if irreversible damage to the myocardium has occurred, as in MI | …
Test of ventricular irritability | HRV, LVEF, QRS width, ambulatory ECG, electrophysiology | Assess risk of SCD | ?


2.1.3.2 Measures of ventricular instability

Some electrocardiographic markers of ventricular instability have been identified, as have several factors known to trigger or modulate potentially fatal cardiac arrhythmias (changes in autonomic nervous system, metabolic disturbances and myocardial ischaemia) (Goldberger, Cain et al. 2008). More recently, there has been interest in the significance of heart rate variability (HRV) and its ability to predict CVD (Kop, Verdino et al. 2001). High and low frequency variability in R-R interval is considered to reflect autonomic control of heart rate, and exercise induced HRV during and after clinical exercise testing strongly predicts cardiovascular and all-cause mortality independent of clinical factors and exercise responses (Dewey, Freeman et al. 2007), with hazard ratios for cardiovascular death being up to 5.9. Reduced heart rate variability has also been shown to be associated with the inflammatory markers CRP and interleukin-6 (Lampert, Bremner et al. 2008).
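Heart rate variability is derived from the sequence of R-R intervals. The text above refers to high and low frequency variability; the sketch below instead computes two simpler and widely used time-domain summaries, SDNN (standard deviation of the intervals) and RMSSD (root mean square of successive differences). This is a substitution made for brevity and is not the method of the cited studies. The function names and the interval values are hypothetical.

```python
import math
from statistics import stdev


def sdnn(rr_intervals_ms):
    """Standard deviation of R-R intervals (sample standard deviation), in ms."""
    if len(rr_intervals_ms) < 2:
        raise ValueError("at least two R-R intervals are required")
    return stdev(rr_intervals_ms)


def rmssd(rr_intervals_ms):
    """Root mean square of successive R-R interval differences, in ms."""
    if len(rr_intervals_ms) < 2:
        raise ValueError("at least two R-R intervals are required")
    diffs = [b - a for a, b in zip(rr_intervals_ms, rr_intervals_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


# Hypothetical R-R intervals in milliseconds, for illustration only.
rr = [800, 810, 790, 800]
print(f"SDNN  = {sdnn(rr):.1f} ms")
print(f"RMSSD = {rmssd(rr):.1f} ms")
```

SDNN reflects overall variability across the recording, while RMSSD emphasises beat-to-beat changes. Frequency-domain measures would require resampling and spectral estimation, which are beyond this sketch.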

Techniques used for non-invasive identification of patients at risk for SCD include measures of left ventricular function (LVEF) and electrocardiographic abnormalities such as increased QRS duration and short term HR variability (HRV). Other indicators include ventricular arrhythmia on ambulatory ECG monitoring, long term HRV as well as the exECG and electrophysiological studies. Newer methods being applied include MRI assessment of infarct size. Medical therapy and implantable defibrillator use are effective in reducing the incidence of SCD in those considered at increased risk (Chapa, Lee et al. 2008; Goldberger, Cain et al. 2008).

2.1.3.3 Markers of inflammation and other biochemical tests

Fatal consequences of unstable plaque are not predicted by either imaging or functional studies. Attempts have been made to identify the presence of unstable plaque, and blood markers such as homocysteine, highly sensitive C-reactive protein (hsCRP) and several other markers of inflammation (De Ruijter, Westendorp et al. 2009) have been studied. This recent study of the elderly in the Netherlands showed that the standard Framingham risk score was not able to predict 5 year cardiovascular mortality (AUC 0.53 with 95% CI 0.42-0.63). The Framingham risk profile prediction was validated only for those aged up to 75 years (Pearson, Blair et al. 2002). The predictive model based on the biomarker homocysteine alone had an AUC of 0.65 (0.55-0.75). This value did not increase with the addition of the standard Framingham risk score or a combination of four biomarkers (homocysteine, folic acid, C-reactive protein and interleukin 6).

2.1.3.4 Functional tests

Functional methods assess the heart’s response to the stress of exercise or the increase in heart rate associated with pharmacologic stimulus (dobutamine) and the associated changes in an
