
CHAPTER 4: RESEARCH RESULTS AND DISCUSSION

4.2. CRITERIA FOR MODEL VERIFICATION AND EVALUATION

Out-of-sample testing is an important part of developing predictive models and evaluating their performance. This is a commonly used technique to evaluate the accuracy and effectiveness of machine learning models in practice. Applying this technique will assist us in addressing the following issues:

Assess generalization ability: When a model is trained on a data set, it can "learn" specific characteristics of that data. By testing on a test data set that the model has never seen before, we can evaluate the model's generalization ability, that is, its ability to predict well on new data.

Overfitting detection: Overfitting occurs when a model is fitted too closely to the training data, leading to poor performance on new data. Out-of-sample testing helps detect and evaluate overfitting by comparing performance on the training data and the test data.

Model refinement: Results from out-of-sample testing can be used to refine the model, including adjusting hyperparameters, testing additional variables, or even reselecting the prediction algorithm. This helps improve model performance on new data.

Increase the reliability of the results: By using out-of-sample testing, we can obtain a more accurate estimate of the model's performance on real data, increasing the reliability of the results compared to using only training data.

Therefore, out-of-sample testing helps ensure that the model has good predictive ability on new data and can generalize from the training data. In this study, we use 80% of the data (1,109 observations) to build the model and the remaining 20% (277 observations) to retest it, as sketched below.
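A minimal sketch of this 80/20 split using scikit-learn follows. The file name, column names, and random seed are illustrative assumptions; only the 80/20 proportion (1,109 / 277 observations) comes from the text.

```python
# Sketch of the 80/20 out-of-sample split described above.
import pandas as pd
from sklearn.model_selection import train_test_split

df = pd.read_csv("lgd_data.csv")            # hypothetical file with 1,386 observations
X = df.drop(columns=["LGD"])                # explanatory variables X_1 ... X_n
y = df["LGD"]                               # target: loss given default

# 80% for model building, 20% held out for out-of-sample testing
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
```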

Accordingly, the forecast results of the models presented in Section 4.3 are tested and evaluated on this out-of-sample data set.

Estimating the LGD parameter

Figure 4.3. Estimation results of the models in LGD estimation

Source: Calculated from the inspection system

Figure 4.3 shows the regression coefficients of the variables in the OLS regression model, in which the coefficient of X_5 is large, indicating the strong influence of this variable on LGD. The remaining variables all have rather low regression coefficients.

Meanwhile, all three machine learning models (Decision Tree, Random Forest, and XGBoost) show that variables X_7 and X_11 play an important role in explaining LGD.

Thus, the research models show that the debt-to-total-assets ratio (X_5), the current ratio (X_7), and the cash return on equity (CRE, X_11) play an important role in explaining the LGD risk parameter; a sketch of how such importances can be read off the fitted models follows.
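Continuing from the split sketch above, variable importances can be extracted from a fitted tree ensemble as below. The hyperparameters shown are illustrative assumptions, not the settings used in the thesis.

```python
# Sketch: impurity-based variable importances from a fitted Random Forest.
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

rf = RandomForestRegressor(n_estimators=500, random_state=42)  # assumed settings
rf.fit(X_train, y_train)                    # X_train: financial ratios, y_train: LGD

# One importance score per explanatory variable X_1 ... X_n
importances = pd.Series(rf.feature_importances_, index=X_train.columns)
print(importances.sort_values(ascending=False))
```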

Figure 4.4. Estimation results of decision trees in LGD estimation

Source: Calculated from the inspection system

Table 4.4. LGD prediction results of models on out-of-sample data sets

Model                     Dataset         MAE        MSE          RMSE        R2             MAPE
Linear Regression (OLS)   Out-of-sample   3.570076   359.566337   18.963026   -5445.945487   22.748863
Decision Tree             Out-of-sample   0.150053   0.036422     0.190844    0.448308       0.305566
Random Forest             Out-of-sample   0.112674   0.019933     0.141183    0.698071       0.238164
XGBoost                   Out-of-sample   0.121662   0.024155     0.155417    0.634122       0.250680

Source: Calculated from the inspection system

The LGD prediction results demonstrate clear differences in model performance, with Random Forest and XGBoost emerging as the most accurate options. Random Forest consistently outperforms the other models, achieving the lowest mean absolute error (MAE) at 0.112674, mean squared error (MSE) at 0.019933, and root mean squared error (RMSE) at 0.141183. Additionally, its R2 value of 0.698 indicates that it explains nearly 70% of the variance in the data, making it highly reliable for predicting LGD values. Random Forest's ensemble approach effectively reduces overfitting and captures complex nonlinear relationships within the dataset, resulting in highly accurate predictions across different data scenarios.
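As a reference for how the five criteria in Table 4.4 can be computed, below is a minimal scikit-learn sketch; it assumes the fitted rf model and the held-out X_test, y_test from the earlier sketches, and mean_absolute_percentage_error requires scikit-learn 0.24 or later.

```python
# Sketch: computing MAE, MSE, RMSE, R2, and MAPE on the out-of-sample set.
import numpy as np
from sklearn.metrics import (
    mean_absolute_error,
    mean_absolute_percentage_error,
    mean_squared_error,
    r2_score,
)

y_pred = rf.predict(X_test)                 # out-of-sample predictions

mae = mean_absolute_error(y_test, y_pred)
mse = mean_squared_error(y_test, y_pred)
rmse = np.sqrt(mse)                         # RMSE is the square root of MSE
r2 = r2_score(y_test, y_pred)
mape = mean_absolute_percentage_error(y_test, y_pred)

print(f"MAE={mae:.6f}  MSE={mse:.6f}  RMSE={rmse:.6f}  "
      f"R2={r2:.6f}  MAPE={mape:.6f}")
```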

Following closely, XGBoost also performs well, with an MAE of 0.121662, MSE of 0.024155, and RMSE of 0.155417. Its R2 of 0.634 shows that XGBoost captures around 63.4% of the data variance, making it a solid alternative to Random Forest. While its errors are marginally higher, XGBoost’s gradient boosting technique is powerful for managing complex data and can potentially match or exceed Random Forest’s performance with further parameter tuning and optimization.
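A hedged sketch of the further parameter tuning mentioned above, via grid search with five-fold cross-validation on the training data only; the grid values are illustrative assumptions, not settings taken from the thesis.

```python
# Sketch: hyperparameter tuning for XGBoost with GridSearchCV.
from sklearn.model_selection import GridSearchCV
from xgboost import XGBRegressor

param_grid = {
    "n_estimators": [200, 500],
    "max_depth": [3, 5, 7],
    "learning_rate": [0.01, 0.05, 0.1],
}
search = GridSearchCV(
    XGBRegressor(random_state=42),
    param_grid,
    scoring="neg_mean_squared_error",       # select the grid point with lowest MSE
    cv=5,
)
search.fit(X_train, y_train)                # tune on training data only
print(search.best_params_, -search.best_score_)
```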

The Decision Tree model presents a moderate level of accuracy, with an MAE of 0.150053 and an R2 of 0.448, indicating that it explains about 44.8% of the variability in the data. While it offers a significant improvement over linear regression by capturing more of the data’s nonlinear patterns, it lacks the depth and ensemble advantages that make Random Forest and XGBoost more reliable for generalization. Decision trees, as individual models, are also prone to overfitting, which may impact their robustness when applied to new data.

Linear Regression (OLS), however, performs poorly across all metrics, with a high MAE of 3.570076, an MSE of 359.566337, and an RMSE of 18.963026. Its negative R2 of -5445.945487 indicates that it fails to capture the underlying relationships between variables effectively and performs worse than a simple mean prediction. This substantial deviation from observed values suggests that OLS is unsuitable for LGD prediction in this dataset, likely due to the presence of complex nonlinear relationships that a linear model cannot capture.
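For reference, the standard definition of the coefficient of determination makes clear why it can be negative: R2 drops below zero whenever the model's squared prediction errors exceed those of always predicting the sample mean.

```latex
R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}
```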

In summary, Random Forest stands out as the most reliable and accurate model for predicting LGD, followed by XGBoost, which also shows strong performance and potential for optimization. Decision Tree offers a viable but less robust alternative, while Linear Regression fails to meet the predictive accuracy needed for effective LGD forecasting. For practical application, Random Forest would likely deliver the most consistent results in capturing the complexities of LGD, with XGBoost as a close, tunable alternative.

SUMMARY

Chapter 4 focuses on analyzing and testing the reliability of risk estimation models through the use of research data and the application of testing and evaluation criteria. It details how the models are tested on out-of-sample data and covers the specific criteria used to evaluate the effectiveness of the estimated models, such as accuracy, coverage, and the degree to which risk is reflected.

The estimation results present the estimated LGD parameter, an important factor in credit risk assessment. Through these results, the chapter provides insight into the applicability and reliability of the models in forecasting and managing risk, and contributes to the optimization of risk management processes at financial institutions.

Next, based on these research results, Chapter 5 recommends an application process and proposes a set of criteria for building a risk-weight estimation model at commercial banks.
