Towards Predicting ESG Score Based on Bank Annual Report Contents
CHAPTER 1: INTRODUCTION
Theoretical basis for topic selection
ESG, which stands for environmental, social, and governance, encompasses standards that assess an organization's impact on the environment and society. While primarily relevant in the context of investing, ESG criteria also extend to customers, suppliers, employees, and the broader public.
Environmental: Focuses on climate change, greenhouse gas (GHG) emissions, natural resources, energy use, and biodiversity.
Social: Assesses the relationship of the business with the community and people.
Governance: Refers to the business structure and management, including the protection of stakeholders' interests.
The press and social media play a crucial role in promoting awareness of businesses' ESG activities, exerting pressure for compliance with sustainability standards. Rapid reporting of ESG violations via platforms like Twitter, YouTube, Facebook, TikTok, and LinkedIn can significantly tarnish a company's reputation. For instance, a TikTok story highlighting a company's environmental impact can elicit immediate public and investor backlash, influencing investment decisions. Companies facing media criticism for ESG non-compliance often suffer reputational damage, leading to potential loss of capital and customers. This media pressure compels businesses to adhere to ESG standards, continuously improve, and enhance transparency. Additionally, movements advocating for boycotts of brands that disregard environmental and human rights issues force companies to adjust their strategies to regain public trust.
In Vietnam, the government has implemented Resolution 55 to promote renewable energy development, while major companies like Vinamilk, Vingroup, and FPT have adopted ESG practices to enhance their reputation and attract global investment. As international investors increasingly prioritize ESG factors for assessing business sustainability and risk, global ESG assets are projected to reach $50 trillion by 2025, representing one-third of total assets under management. However, many enterprises, particularly SMEs, view ESG as a cost rather than an opportunity, largely due to a lack of awareness and the absence of a comprehensive legal framework. Additionally, challenges in assessing and reporting ESG data stem from inconsistent international standards. By embracing ESG, businesses can improve their reputation, draw in investments, and cater to the demands of environmentally conscious consumers, ultimately contributing to sustainable development and fostering a balanced business ecosystem that integrates economic, social, and environmental benefits.
Companies gain a competitive edge by engaging in ESG initiatives, as 80% of consumers are willing to pay more for sustainably sourced products, according to PwC's 2024 survey. Tracking and reporting ESG metrics is crucial for enhancing corporate brands, as leaders who prioritize improved working conditions, diversity, and social contributions attract investors through transparent documentation. Furthermore, ESG practices, such as recycling and energy-saving measures, can boost financial performance. With 46% of consumers willing to pay more for eco-friendly products, businesses that capitalize on growing social awareness can enhance customer loyalty and improve overall operations sustainably, adapting to changing market conditions while identifying cost-saving opportunities.
Improving ESG performance not only ensures compliance with regulations but also positively influences equity returns, as demonstrated by McKinsey's research, which indicates that ESG proposals benefit equity returns 63% of the time. This is particularly advantageous for companies seeking to raise capital. Furthermore, a study by Marsh and McLennan reveals that organizations with high employee satisfaction boast ESG scores that are 14% above the global average, facilitating talent attraction and retention.
Implementing ESG standards poses challenges due to their ambiguity and the lack of uniform measurement methods among various rating organizations. A business's success, particularly for banks, hinges not only on financial performance but also on its commitment to environmental and social responsibilities, underpinned by effective governance. In an increasingly globalized world, ESG criteria are becoming essential for evaluating sustainable business practices. Predicting ESG scores from annual report content remains a complex yet vital task, especially in Vietnam's banking sector. It is crucial for stakeholders (investors, regulators, and the public) to understand a bank's ESG commitments, whether they are completed, ongoing, or planned. Automating the classification of these ESG topics based on textual inputs from annual reports is key to enhancing transparency and accountability. Each statement falls into one of four categories:
• Past: Actions or commitments that have already been completed
• Present: Ongoing initiatives or policies currently in implementation
• Future: Planned actions or commitments not yet realized
• Mention: Statements with limited clarity about plans or commitments
Applied Artificial Intelligence in ESG
Artificial Intelligence (AI) is increasingly essential for optimizing and automating monitoring and evaluation processes, enhancing the effectiveness of ESG implementation in businesses. The advantages of AI include increased efficiency, reduced costs, improved accuracy, and valuable insights that guide effective ESG decision-making and strategies. However, challenges such as poor data quality, ethical concerns, and significant initial investment costs must be addressed. The scattered and inconsistent nature of ESG data complicates the training of AI models and necessitates stringent data security measures. To fully leverage AI's potential, businesses must carefully evaluate these challenges and develop a strategic implementation plan.
Object and Structure of the Thesis
The primary goal is to create a predictive model for ESG scores using text data from annual reports of banks and businesses. This involves investigating the efficacy of NLP tools and intensity analysis models to evaluate the actual implementation of ESG actions, rather than merely referencing them.
The structure includes five main chapters:
● Theoretical basis for topic selection
● Overview of ESG research and existing scoring systems
● Application of NLP and machine learning/deep learning in ESG analysis
● Pre-data collection and processing
● Model development and evaluation metrics
● Insights and implications for ESG assessment
Research Questions and Objectives
This study focuses on optimizing the evaluation of Environmental, Social, and Governance (ESG) criteria by streamlining the process of analyzing annual and sustainability reports, ultimately aiming to save human resources, reduce costs, and minimize time investment.
1. How can natural language processing techniques be applied to analyze corporate annual reports for ESG-related content?
2. How accurate are the models for predicting ESG labels in the banking industry in Vietnam?
3. Should banks develop strategies focusing on the Environmental, Social, or Governance area to improve ESG scores as well as investor confidence?
This research aims to deliver valuable insights for industry professionals looking to enhance the foundations of Sustainable Development in Vietnam, with the potential to significantly influence ESG development.
● To identify relevant ESG-related keywords and themes from corporate annual reports using NLP techniques
● To develop and evaluate models for predicting ESG labels and ESG scores
● To propose an ESG scoring framework that incorporates action analysis and keyword weighting
● To create a prototype web application for automating ESG score prediction.
CHAPTER 2: LITERATURE REVIEW
Related Works
In recent years, ESG research has gained significant traction as businesses prioritize sustainable development. Numerous studies examine the influence of environmental, social, and governance (ESG) factors on business performance and investment choices.
ESG ratings explainability through machine learning techniques
This research investigates the application of machine learning models and explanation techniques to elucidate Refinitiv ESG scores, utilizing data from Refinitiv Eikon software. It emphasizes three core pillars, Environmental (E), Social (S), and Governance (G), focusing on companies with a market capitalization exceeding one million USD. The study includes a preprocessing phase to address missing data, eliminate features with zero variance, and remove redundant features. Following preprocessing, various models, including Linear Regression, Ridge Regression, Lasso Regression, k-nearest neighbor regression, Decision Trees, Artificial Neural Networks (ANNs), boosting regression (AdaBoost), and Random Forest, were employed to predict ESG scores.
The SHAP (SHapley Additive exPlanations) technique elucidates the contribution of each feature in predicting ESG scores, enhancing the understanding of complex "black box" models like Random Forest and Gradient Boosting. Findings indicate that Ridge and Lasso excel in predicting ESG scores within the Environmental pillar of the Information Technology sector, while Artificial Neural Networks (ANN) demonstrate superior performance in assessing the Social and Governance pillars. Conversely, the Decision Tree model exhibits the least effectiveness in this context.
This study emphasizes the importance of model explanation techniques while identifying the primary challenge in predicting ESG factors: the diverse standards and measurement methods that vary across countries and industries. The lack of analytical data is a further limitation. These challenges need to be addressed to improve the comparability and broader applicability of ESG scores [8].
ESG2PreEM: Automated ESG grade assessment framework using pre-trained ensemble models
The ESG2PreEM framework utilizes advanced pre-trained language models like BERT, RoBERTa, and ALBERT to automate ESG scoring, enhancing the accuracy and transparency of evaluations for Dow Jones Industrial Average (DJIA) companies. This research leverages extensive data from LexisNexis and Refinitiv-Sustainable Leadership Monitor, incorporating over 450 ESG indicators that encompass Environmental, Social, and Governance factors.
The process begins with data collection and preprocessing to ensure label balance, followed by the extraction of key terms using TF-IDF. Deep learning models are then trained to categorize ESG data into three pillars, with a focus on comparing the performance of BERT, RoBERTa, and ALBERT models. The combination of BERT and ALBERT yields the highest accuracy at 80.79% with a batch size of 20. After labeling the data by ESG groups, the FinBERT method assesses the sentiment of articles as positive, neutral, or negative, measuring the influence of ESG factors on companies. A scoring model based on E, S, and G weights calculates the ESG level through label frequency and sentiment. The ESG2PreEM model's results are compared with MSCI ratings for 10 companies, revealing a strong correlation of 0.9 and indicating close alignment with MSCI's evaluations.
The validation results indicate that ESG2PreEM enhances the accuracy of ESG scores and promotes the integration of advanced NLP models in ESG assessments. Future research should broaden data collection to include diverse sources like social media and online forums, reducing reliance on a single data source. This study provides a potential solution framework for organizations that want to automate and improve transparency in ESG assessment [9].
Key Insights and Research Directions
Recent studies underscore the significant impact of NLP and machine learning on enhancing ESG analysis, showcasing the effectiveness of automated tools for data extraction and feature importance evaluation in measuring corporate ESG performance. The integration of explainable techniques like SHAP promotes model transparency, thereby addressing reliability concerns in ESG scoring. Future research should focus on cross-industry and cross-regional applications, refine data preprocessing methods, and adopt international standard ESG assessment frameworks. While many studies leverage financial data to forecast ESG outcomes, the use of NLP tools for analyzing non-financial data, such as annual reports, remains underexplored, particularly within Vietnam's banking sector.
Related research on ESG score and label prediction using ML and AI
The research titled “Firms' Profitability and ESG Score: A Machine Learning Approach” employed various regression and machine learning techniques, including GLM, decision trees, random forests, and gradient boosting, to forecast corporate earnings (EBIT) based on ESG scores. To promote algorithmic fairness and detect potential data issues, the study utilized interpretation tools like partial dependence diagrams.
The article "Building Machine Learning Systems for Automated ESG" discusses the use of Natural Language Processing (NLP) to analyze reports and news for key ESG sentiments and features. It highlights the application of various machine learning models, including linear regression, decision trees, and random forests, for ESG scoring, with SHAP enhancing model transparency. Additionally, the system is engineered to scale across different industries and manage large datasets.
Recent research has utilized machine learning techniques including LightGBM and deep learning, typically applying transparent algorithms like linear regression and k-nearest neighbors before moving to black-box algorithms such as artificial neural networks and random forests. This approach aims to enhance the precision of ESG score reproduction.
The study "Learning-Based Approach to Identify Key Drivers for Improving Corporate ESG Ratings" evaluated various algorithms, ultimately selecting gradient boosting because it achieved the lowest RMSE. Additionally, other research has explored the impact of ESG factors on stock returns by integrating the Capital Asset Pricing Model (CAPM) with machine learning techniques, including decision trees, LightGBM, and deep learning.
In addition to machine learning approaches, traditional methods like linear regression, VIF tests, and fixed/random effects models have been utilized to analyze the impact of ESG on financial metrics such as ROA, Tobin’s Q, and stock performance. These studies, drawing on ESG and financial data from various sources, indicate that higher ESG scores can positively affect financial performance, particularly during economic crises and in the context of national pension fund investments.
Recent studies have employed a range of machine learning techniques, from basic to advanced, alongside interpretation tools, to assess and forecast ESG scores while examining the influence of ESG factors on corporate financial performance.
Application of NLP and ML/DL in this ESG analysis
Support Vector Machines (SVM) are primarily utilized for classification tasks, although they can also be applied to regression. In SVM, data points are represented in an n-dimensional space, where each dimension corresponds to a feature. The objective is to identify a hyper-plane that separates the different classes within the dataset. Essentially, a hyper-plane acts as a linear boundary that divides the space into two distinct regions.
SVM is designed to tolerate exceptions, recognizing that it is often impossible to perfectly separate two classes with a single straight line. By seeking the hyper-plane with the maximum margin while allowing some points to violate it, SVM can accommodate outliers in the data.
SVM is a widely used classification method known for its effectiveness in handling large datasets and text classification challenges, particularly in high-dimensional spaces. In the realm of supervised learning, SVM frequently demonstrates superior performance compared to other algorithmic approaches.
SVM offers high flexibility by supporting both linear and non-linear kernels, which enhances classification performance. Additionally, SVM is memory-efficient, as its decision function depends only on a subset of the training points (the support vectors). However, a significant drawback arises when the number of attributes (p) greatly exceeds the number of data points (n).
Applications of SVM include diagnosing diabetes, hyperlipidemia, kidney failure, and similar conditions; fraud detection; image classification (before CNNs and deep CNNs became dominant); and text classification [11].
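The maximum-margin idea can be illustrated with a few lines of plain Python. This is a minimal sketch rather than the implementation used in this study: the weight vector `w`, the bias `b`, and the toy points are hypothetical, and class labels are assumed to be encoded as +1 and -1. The hinge loss shown here is the standard soft-margin penalty, which is what lets an SVM tolerate points that fall inside the margin or on the wrong side.

```python
def decision_value(w, b, x):
    """Signed score w.x + b of a point x relative to the hyper-plane."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def predict(w, b, x):
    """Class label (+1 or -1) given by the side of the hyper-plane."""
    return 1 if decision_value(w, b, x) >= 0 else -1


def hinge_loss(w, b, x, y):
    """Soft-margin penalty: zero when the point is correctly classified
    with a functional margin of at least 1, positive otherwise."""
    return max(0.0, 1.0 - y * decision_value(w, b, x))


# Hypothetical hyper-plane x1 + x2 - 3 = 0 and three toy points of class +1.
w, b = [1.0, 1.0], -3.0
inside_margin = hinge_loss(w, b, [2.0, 1.5], 1)   # score 0.5, penalized lightly
well_separated = hinge_loss(w, b, [4.0, 3.0], 1)  # score 4.0, no penalty
misclassified = hinge_loss(w, b, [1.0, 1.0], 1)   # score -1.0, penalized most
```

Training an SVM amounts to choosing `w` and `b` so that the sum of these penalties plus a term that rewards a wide margin is as small as possible.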
Artificial Neural Networks (ANN) are computational models inspired by the structure and function of the human brain, which consists of billions of interconnected neurons. These networks change based on the input and output they receive, allowing them to identify complex relationships and patterns in data. Like the brain's dendrites that transmit information from various senses, ANNs process information by sending messages between neurons to solve different problems. As a result, ANNs can be viewed as non-linear statistical models that adapt to diverse data streams, reflecting the dynamic nature of biological neural networks.
Figure 2.3: Basic structure of artificial neural network ANN [14]
In artificial neural networks (ANN), nodes serve as the primary units for receiving input data. These nodes perform simple operations on the data, and the results are relayed to other neurons within the network. The result generated at each node is referred to as its activation value or node value. Each link in an ANN is associated with a weight, and the network learns by changing these weight values [14].
In artificial neural networks (ANN), arrows illustrate connections between neurons, indicating the flow of information. The architecture consists of a single input layer, a single output layer, and one or more hidden layers, depending on the problem at hand. All nodes, except for those in the input layer, are fully connected to the preceding layer's nodes. Each hidden layer node takes the outputs of the previous layer and combines them with weights to generate its own output. If the ANN's output falls short of expectations, the weights are adjusted to improve performance. This ability to learn and adapt is crucial, positioning the model as a vital tool in computer science and artificial intelligence.
● Input layer: Takes information in. Input nodes process the data, analyze or classify it, and then pass it to the next layer
● Hidden layer: Serves as an intermediary, receiving data from the input layer and potentially other hidden layers. Each hidden layer evaluates the output from the previous layer, processes the information, and forwards it to the subsequent layer, allowing for complex data analysis through multiple layers
● Output Layer: Gives the final result There may be one or more nodes in this layer.[16]
Artificial Neural Networks (ANN) excel at performing tasks that can be challenging for humans, such as error detection in autopilot systems, target tracking, and integrated circuit chip layout analysis. Beyond these applications, ANN is also utilized across various fields, including medicine, telecommunications, and transportation.
Fully Connected Neural Networks (FC Layer)
Fully connected (FC) layers, or dense layers, play a crucial role in neural networks, particularly in deep learning. In these layers, each neuron is connected to every neuron from the preceding layer, facilitating comprehensive data processing and feature extraction.
A convolutional neural network (CNN) is structured with four primary layer types: convolution, pooling, activation, and fully connected (FC) layers. In this architecture, the convolutional and pooling layers are positioned at the beginning to extract features, while the fully connected layers follow to convert these features into final output predictions. In fully connected feedforward networks, FC layers are the essential components that map input data directly to outputs, creating a highly interconnected network where every input influences every output.
The way the FC layer works in a neural network is determined by the structure of the FC layers. Every neuron in one layer connects to every neuron in the next layer; this is its dense connectivity characteristic.
● Bias: adjusts the output along with the weighted sum of the inputs
● Activation function: ReLU, Sigmoid or Tanh introduces non-linearity into the model, allowing the model to learn complex patterns and behaviors
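The forward pass of a single fully connected layer can be sketched as follows. This is an illustrative example with hypothetical weights, biases, and inputs, not the network trained in this study: each neuron computes the weighted sum of all inputs, adds its bias, and applies an activation function.

```python
import math


def relu(z):
    """Rectified linear unit: passes positive values, zeroes the rest."""
    return max(0.0, z)


def sigmoid(z):
    """Squashes any real value into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def dense_forward(inputs, weights, biases, activation=relu):
    """Forward pass of one fully connected layer.

    weights[j][i] is the weight from input i to neuron j, so every
    neuron sees every input (dense connectivity).
    """
    outputs = []
    for neuron_weights, bias in zip(weights, biases):
        weighted_sum = sum(w * x for w, x in zip(neuron_weights, inputs))
        outputs.append(activation(weighted_sum + bias))
    return outputs


# Hypothetical layer mapping 3 inputs to 2 neurons.
x = [1.0, 2.0, 3.0]
W = [[0.5, -1.0, 0.5],   # weights of neuron 0
     [1.0, 1.0, -1.0]]   # weights of neuron 1
b = [1.0, -0.5]
hidden = dense_forward(x, W, b)
```

Swapping `activation=sigmoid` (or a Tanh function) changes only the non-linearity; the dense connectivity stays the same.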
Fully connected (FC) layers excel at synthesizing and abstracting features identified by preceding layers, transforming them into formats conducive to precise predictions. Their capacity to merge diverse information enables FC layers to capture and approximate intricate patterns.
Fully connected (FC) layers play a crucial role in predictive modeling by serving as the final layer in neural networks for classification and regression tasks. They transform high-level features into scores, which are often processed through a Softmax function to yield probabilistic class predictions. This ensures tailored outputs for various problems, whether predicting multiple categories or continuous variables. Through the application of activation functions, FC layers help the model generalize from training data to unseen situations. Their flexibility allows for diverse applications, including speech and image recognition as well as natural language processing. Additionally, FC layers are commonly combined with techniques like Dropout and L2 regularization to prevent overfitting, with Dropout randomly disabling a portion of neurons during training to promote the learning of more robust features.
In deep neural networks, if a neuron learns a useful representation, other neurons deeper in the network may overly rely on it, hindering the learning of general rules. Dropout addresses this issue by randomly deactivating neurons during training, compelling other neurons to develop diverse representations, which makes the overall model more robust. However, it is crucial to disable dropout during prediction to avoid poor outcomes. L2 regularization, in turn, encourages the model to identify simpler and more generalized patterns.
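The Softmax output and the train-versus-predict behavior of Dropout can be sketched in plain Python. The example is illustrative only: the scores and dropout rate are hypothetical, and the dropout shown is the common "inverted" variant (surviving activations are rescaled during training), which is an assumption rather than a detail stated in this study.

```python
import math
import random


def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    shift = max(scores)  # subtracting the max avoids overflow in exp
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dropout(activations, rate, training, rng=random):
    """Inverted dropout for 0 <= rate < 1.

    During training each activation is zeroed with probability `rate`
    and survivors are rescaled by 1 / (1 - rate); at prediction time
    the activations pass through unchanged.
    """
    if not training or rate == 0.0:
        return list(activations)
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


# Hypothetical scores for the four action labels (past, present, future, mention).
probs = softmax([2.0, 1.0, 0.1, -1.0])
predicted_label = probs.index(max(probs))

# Same hidden activations, training mode versus prediction mode.
hidden = [1.0, 2.0, 3.0, 4.0]
train_out = dropout(hidden, rate=0.5, training=True, rng=random.Random(0))
infer_out = dropout(hidden, rate=0.5, training=False)
```

Because the rescaling happens during training, nothing needs to change at prediction time beyond passing `training=False`.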
Figure 2.5: FC layer with Dropout [20]
Fully connected (FC) layers are straightforward to implement in most neural networks, provided the input data is appropriately processed. However, they come with a high number of parameters, leading to increased computational complexity and memory usage. Additionally, without techniques like dropout or L2 regularization, these layers are prone to overfitting. FC layers also fail to leverage spatial hierarchies, making them less effective for image data and similar inputs.
CHAPTER 3: RESEARCH METHODOLOGY
Overall Workflow
This chapter outlines the research workflows designed for the analysis and categorization of text data concerning the temporal implementation of ESG actions and the prediction of ESG scores. These workflows enhance the transparency, specificity, and accessibility of the research for readers.
● Start: This is the initial step of the process
● Report Collection: Collect the required data from the text document source This is the input data for the NLP system
● Data Preprocessing: Clean and prepare the data, including noise removal, text normalization, word segmentation, stop word removal, or data labeling
● ESG Action Temporal Categorization: Categorize actions or information related to ESG (Environmental, Social, Governance) The labels will be categorized as follows:
○ Label 0: “ESG action in the past”: Actions or commitments that have already been completed
○ Label 1: “ESG action in present”: Ongoing initiatives or policies currently in implementation
○ Label 2: “ESG action in future”: Planned actions or commitments not yet realized
○ Label 3: “ESG action only mentioned”: Statements of limited clarity about plans or commitments
● Training & Evaluation: Use labeled data to train machine learning models. Evaluate model performance based on accuracy, recall, and other metrics
Once the categorization of ESG actions is finalized, the next step is to explore the potential for developing an ESG scoring system. Utilizing data from the previous phase, we can create a predictive model to evaluate an organization's future ESG compliance. While this goal is promising, it remains a secondary focus in the broader context of ESG assessment.
● End: The process ends after the model is evaluated and can be deployed or used.
Data Collection
Nguyen Thi Yen Nhi played a crucial role in the initial stages of data collection, preprocessing, and preparation, guiding the process from steps 1 to 5. I then took over from Nhi at step 6 to develop the model and conduct a more detailed analysis. Our collaboration was essential in ensuring a smooth and efficient workflow, ultimately enhancing the quality of the study's final results.
● Data Collection: The first step is to collect the data needed to perform the analysis
● Identifying banks: Select important banks that play a key role in the banking industry in Vietnam
● Collecting annual reports: Collect reports to create a rich and valuable data source
● Filtering ESG-related content: Filter out irrelevant content in the reports, keeping only the parts related to ESG factors (Environment, Society, Governance)
● Extracting content into sentences: Break down the filtered reports into sentences for easy processing in the next step
● Labeling sentences by ESG topics: Review each sentence and label them according to the main ESG topics such as Environment, Society, or Governance
● Check: Check the contents of the raw dataset for consistency and quality
● Label sentences with ESG actions: Label the extracted sentences with the temporal category of the ESG action. Integrate the new labels into the dataset, checking one last time for accuracy
● End: Complete the process after labeling is complete Save the labeled dataset in a suitable format (CSV, JSON, etc.) to be ready for further processing.
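The final saving step can be sketched with Python's standard `csv` module. This is a hedged illustration, not the study's actual script: the column names mirror the dataset columns described in Chapter 4 ("name", "year", "text", "label"), the label ids follow the four temporal categories from the overall workflow, and the bank name and sentences are hypothetical. An in-memory buffer stands in for a real file.

```python
import csv
import io

# Label ids for the four temporal categories of ESG actions.
ACTION_LABELS = {
    0: "ESG action in the past",
    1: "ESG action in present",
    2: "ESG action in future",
    3: "ESG action only mentioned",
}

FIELDS = ["name", "year", "text", "label"]


def write_labeled_sentences(rows, handle):
    """Write labeled sentences as CSV, rejecting unknown label ids."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writeheader()
    for row in rows:
        if row["label"] not in ACTION_LABELS:
            raise ValueError(f"unknown label id: {row['label']!r}")
        writer.writerow(row)


# Hypothetical rows for illustration only.
rows = [
    {"name": "Bank A", "year": 2023,
     "text": "The bank installed solar panels in 2022.", "label": 0},
    {"name": "Bank A", "year": 2023,
     "text": "The bank plans to cut paper use by 2025.", "label": 2},
]
buffer = io.StringIO()
write_labeled_sentences(rows, buffer)
csv_text = buffer.getvalue()
```

Validating the label id at write time is one way to support the final accuracy check, since a mistyped label fails loudly instead of reaching the training data.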
Data Preprocessing
● Data Preprocessing: Start the data processing process
● Unicode Encoding Standardization: Ensure text data is standardized according to Unicode encoding to avoid display or processing errors
● Standardize Vietnamese Accent Typing: Synchronize Vietnamese accent typing to ensure data consistency
● Checking for Proper Vietnamese Words: Remove or correct misspelled words that do not exist in Vietnamese
● Convert Uppercase to Lowercase: Ensure all data is in lowercase for ease of processing
● Remove Special Characters: Remove unnecessary characters (such as @, #, $) to retain the main content
● Remove Stopwords: Remove words that are not meaningful in context, such as "and", "of", "is", to reduce noise for the model
● Tokenize Sentences Using PhoBERT's Tokenizer: Use PhoBERT's tokenizer to split the text into tokens (words or phrases)
● Put Tokenized Sentences into Model with Attention Mask: Send the tokenized data to the PhoBERT model, along with an attention mask for context processing
● Use the CLS Vector for Classification: Take the CLS vector (classification token) from the model's output as the primary input for the classifier, so that classification is anchored in a single representative feature vector for each sentence
● End: Complete the data processing process
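The text-cleaning steps that precede tokenization can be sketched with the standard library alone. This is a minimal sketch under stated assumptions: the three-word stop word list is a hypothetical stand-in for a full Vietnamese list, the sample sentence is invented, and the spell-checking and PhoBERT tokenization steps are left out because they depend on external resources.

```python
import re
import unicodedata

# Hypothetical, deliberately tiny stop word list ("and", "of", "is");
# a real run would load a full Vietnamese stop word list.
STOPWORDS = {unicodedata.normalize("NFC", w) for w in ("và", "của", "là")}


def preprocess(sentence, stopwords=STOPWORDS):
    """Normalize Unicode, lowercase, strip special characters, drop stop words."""
    # NFC folds letters typed as base + combining marks into one code point.
    text = unicodedata.normalize("NFC", sentence)
    text = text.lower()
    # \w is Unicode-aware in Python 3, so accented letters are kept.
    text = re.sub(r"[^\w\s]", " ", text)
    tokens = [tok for tok in text.split() if tok not in stopwords]
    return " ".join(tokens)


# The accented "e" below is written as "e" plus two combining marks on purpose.
raw = "Ngân hàng và tie\u0323\u0302n ích #ESG @2023"
cleaned = preprocess(raw)
```

Normalizing before stripping special characters matters here: combining accent marks are not word characters, so removing punctuation first would silently delete accents from decomposed text.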
Model Development and Evaluation
Figure 3.4: Model Development & Evaluation Workflow
● Start: This overview phase includes model training and evaluation steps
○ Machine Learning Models (ML): Apply popular machine learning methods to train pre-processed data The model used is the SVM Classifier
○ Deep Learning Models (DL): Apply deep learning methods with models such as:
■ ANN Classifier (Artificial Neural Network)
■ FCNN (Fully Connected Neural Network)
■ PhoBERT Classifier (PhoBERT-based Classifier)
2 Testing & Evaluating: The performance of trained models is evaluated on a separate dataset (test data) This process is to measure the accuracy, classification ability and efficiency of each model
● End: Complete the process after the models have been evaluated and selected
CHAPTER 4: RESULT AND DISCUSSION
Build Dataset
As mentioned earlier, we initially collected 523 annual reports in PDF format from
This study examines 37 of the largest banks in Vietnam, focusing on 230 reports spanning 10 consecutive years from 2014 to 2023. The reports come in two types of PDF format: Image-Only PDFs and Digitally Created PDFs. The format significantly impacts data collection success, as Digitally Created PDFs, generated through software like Excel or Microsoft Word, contain editable electronic text and images. In contrast, Image-Only PDFs lack an underlying text layer, so their content can only be viewed, not edited or extracted, unless it is converted using Optical Character Recognition (OCR) technology.
Figure 4.1: Illustration for Digitally Created PDFs [27]
To understand the difference between the two types of PDFs, let's take a look at some illustrations and examples of them from general to specific concepts in this study:
Figure 4.2: Digitally Created PDFs example - Vietcombank Annual Report 2023
Figure 4.3: Image-Only PDF example - SAIGONBANK Annual Report 2022
In Chapter 3, we discussed the importance of identifying the type of PDF when extracting text from annual reports. We utilize the PyPDF library for this purpose. During our review, we identified two classifications: “Digitally Created PDF,” which allows for successful text extraction and further analysis, and “Image-Only PDF,” where text extraction fails. This classification enables us to apply the appropriate extraction method to each PDF type.
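One way to turn "text extraction succeeds or fails" into a rule is sketched below. It is an assumption-laden illustration rather than the study's code: `classify_pdf` is a hypothetical helper, it expects the per-page strings already returned by the extraction library, and the threshold of 20 characters per page is an arbitrary value chosen for the example.

```python
def classify_pdf(page_texts, min_chars_per_page=20):
    """Label a report by whether a usable text layer was extracted.

    `page_texts` holds the string extracted from each page (an empty
    string or None when extraction returned nothing).
    """
    if not page_texts:
        return "Image-Only PDF"
    total_chars = sum(len((text or "").strip()) for text in page_texts)
    if total_chars / len(page_texts) >= min_chars_per_page:
        return "Digitally Created PDF"
    return "Image-Only PDF"


# Hypothetical extraction results for two reports.
digital_report = ["Total assets increased compared with the previous year.",
                  "The bank continued its green credit programme."]
scanned_report = ["", None, "  "]

digital_kind = classify_pdf(digital_report)
scanned_kind = classify_pdf(scanned_report)
```

Averaging over pages, instead of testing a single page, keeps a digitally created report with a blank cover page from being sent to OCR by mistake.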
Extracting text from PDF files
To enhance the processing of Image-Only PDF files, we utilize Optical Character Recognition (OCR) technology to automatically identify the coordinates of text boxes. This approach enables precise text extraction from these predominantly graphical documents.
Digitally created PDF files allow for simpler processing, utilizing the pdfminer library to convert their content to plain text. This library enables quick and direct extraction of text from PDF files without requiring extra processing steps. By combining these two methods, the text extraction process is streamlined, ensuring accurate conversion for subsequent analysis.
Formatting plain texts into sentences
Analyzing annual reports presents significant challenges in differentiating between titles, plain text, table text, images, and footnotes. Existing libraries like PyPDF and pdfminer lack the capability to make these distinctions. To address this issue, we developed an alternative approach that utilizes text size information for better classification.
In our analysis of font sizes within the documents, we found that titles typically use larger fonts, while text in tables and images is presented in smaller sizes than standard plain text. Our findings revealed that certain font sizes were more prevalent than others. Consequently, we established a criterion to define common text: any font size that constitutes at least 80% of the total text is classified as common, while other sizes are excluded from extraction. This approach enhances the efficiency of our text extraction process.
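The font-size criterion can be sketched as a small filter. The example is illustrative and makes two assumptions not stated in the text: each text span arrives as a hypothetical `(text, font_size)` pair, and a size's share of "the total text" is measured in characters.

```python
from collections import Counter


def common_font_sizes(spans, threshold=0.8):
    """Return the font sizes whose share of all characters meets the threshold.

    `spans` is a list of (text, font_size) pairs.
    """
    chars_per_size = Counter()
    for text, size in spans:
        chars_per_size[size] += len(text)
    total = sum(chars_per_size.values())
    if total == 0:
        return set()
    return {size for size, count in chars_per_size.items()
            if count / total >= threshold}


def keep_common_text(spans, threshold=0.8):
    """Keep only the spans printed in a common font size."""
    sizes = common_font_sizes(spans, threshold)
    return [text for text, size in spans if size in sizes]


# Hypothetical page: a 10-character title, 85 characters of body text
# split over two spans, and a 5-character footnote.
spans = [
    ("T" * 10, 24.0),
    ("b" * 50, 11.0),
    ("b" * 35, 11.0),
    ("f" * 5, 8.0),
]
body_sizes = common_font_sizes(spans)
body_text = keep_common_text(spans)
```

With a threshold above 50%, at most one font size can qualify per document, so the filter effectively selects the dominant body-text size.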
In deeper processing of the text content, we found that it is typically coherent and consists of complete sentences ending with periods. By eliminating irrelevant sections and retaining only these full sentences, we enhance the automatic processing of content related to sentence structure and figures, leading to more accurate analysis.
As a result, we successfully converted the PDF files into sentences. A total of 11,899 sentences were extracted in this step.
After the data was built and manually labeled, the dataset used for the classification models looks as shown in the figure below:
Figure 4.4: Dataset used in this project
The data has four columns, in order from left to right: “name” (the bank’s name), “year” (the year of the annual report), “text” (a sentence extracted from the annual report), and “label” (the corresponding temporal ESG action label). Each column has its own role; here we focus on the “text” and “label” columns.
To accurately label the level of ESG action in Vietnam's banking industry, all sentences unrelated to ESG will be eliminated. The criteria for determining ESG relevance will adhere to the Harvard ESG standards.
After identifying the ESG statements, the subsequent step involves categorizing each statement by assigning one of four action-level labels. Keywords for these labels are directly sourced from the dataset.
Table 4.2: List of ESG Action Labels
To assess the performance of a model in text classification for detecting a business's ESG actions over time, several key evaluation metrics are utilized, including Accuracy, F1-score, Precision, and Recall. Classification accuracy serves as a fundamental metric, particularly effective when the dataset is balanced with an equal number of samples across categories.
In a classification task, high accuracy alone can be misleading, as it may hide a substantial number of misclassified instances. The F1-score, which ranges from 0 to 1, is a crucial metric that balances precision and recall, reflecting both how accurate the positive classifications are and how robustly the model identifies instances. High precision paired with low recall yields impressive precision figures, yet it risks overlooking a significant number of instances. Therefore, a higher F1-score reflects better model performance.
Here are the formulas to calculate accuracy, precision, recall and F1-score:
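In their standard form, for a given class, these metrics are written in terms of true positives (TP), true negatives (TN), false positives (FP) and false negatives (FN). This notation is assumed here, and the text does not state how per-class values are averaged across the four labels.

```latex
\begin{align}
\text{Accuracy}  &= \frac{TP + TN}{TP + TN + FP + FN} \\
\text{Precision} &= \frac{TP}{TP + FP} \\
\text{Recall}    &= \frac{TP}{TP + FN} \\
F_1              &= \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}
\end{align}
```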
Accuracy measures the proportion of correct predictions among all observations, providing a quick assessment of model performance. Precision, also known as positive predictive value, indicates the ratio of correctly predicted positive observations to the total predicted positives. Recall, or sensitivity, assesses the ratio of correctly predicted positives to all actual positives, making these metrics essential for analyzing imbalanced datasets. The F1-score balances precision and recall, effectively addressing the impact of false positives (predicting Yes when the actual is No) and false negatives (predicting No when the actual is Yes).
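As a small worked example of these definitions, the sketch below computes accuracy and per-label precision, recall and F1 for a multi-class prediction, treating one label at a time as the positive class. The function names and the toy labels are illustrative assumptions, not results from this project.

```python
def accuracy(y_true, y_pred):
    """Share of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_label_metrics(y_true, y_pred, label):
    """Precision, recall and F1 with `label` treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Toy example with the four action labels 0-3
y_true = [0, 0, 1, 1, 2, 3]
y_pred = [0, 1, 1, 1, 2, 0]
print("accuracy:", accuracy(y_true, y_pred))
for label in range(4):
    print(label, per_label_metrics(y_true, y_pred, label))
```

Averaging the per-label values (macro averaging) is one common way to obtain a single score for all four labels.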
Experiment & Results
Figure 4.5: Average number of ESG activities throughout the years
The chart above illustrates the number of ESG-related activities over the years from 2014 to 2023. The period 2014 - 2018 grew slowly and fluctuated between 200 and
Between 2019 and 2023, the interest and commitment of banks to ESG activities experienced fluctuations, with a notable drop below 200 in 2019, likely due to shifts in strategic priorities and external economic factors. However, from 2020 onwards, there was significant growth in the number of ESG initiatives, marking a breakthrough period for banks in their commitment to sustainability.
By 2023, the number of ESG-related activities surged to over 600, tripling the figures from before 2020, driven by stricter government regulations and heightened investor demands. Additionally, growing social awareness of ESG issues has intensified competition among banks, compelling them to enhance their brand image through proactive ESG initiatives.
This means that the ESG trend is becoming a top priority, a "must" factor, not just a
Developing a systematic and transparent ESG strategy is essential for banks to capitalize on long-term investment opportunities and enhance their involvement in ESG initiatives. It is crucial to adopt cautious and adaptable strategies, particularly in response to crises like the one experienced in 2019. The COVID-19 pandemic has profoundly impacted the global economy, including Vietnam, highlighting the importance of ESG as businesses increasingly prioritize sustainable practices.
society compete to shift toward sustainable development. This is an opportunity to improve governance, ensure employee rights, and manage environmental risks.
Figure 4.6: ESG activities proportion throughout the years
The chart shows the percentage of ESG labels (0, 1, 2, 3) by year. The labels are defined as follows:
● Label 0: “ESG-related action in the past”
● Label 1: “ESG-related action in present”
● Label 2: “ESG-related action in future”
● Label 3: “ESG-related action only mentioned”
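The yearly label shares plotted in such a chart could be computed as in the sketch below, assuming the dataset is available as `(year, label)` pairs. The function name, the `LABELS` mapping layout and the sample rows are illustrative assumptions.

```python
from collections import Counter, defaultdict

# Label meanings as defined in the list above
LABELS = {
    0: "ESG-related action in the past",
    1: "ESG-related action in present",
    2: "ESG-related action in future",
    3: "ESG-related action only mentioned",
}


def label_proportions(rows):
    """Map each year to the share of sentences carrying each label."""
    counts = defaultdict(Counter)
    for year, label in rows:
        counts[year][label] += 1
    return {
        year: {label: c[label] / sum(c.values()) for label in LABELS}
        for year, c in sorted(counts.items())
    }


# Usage with made-up rows
rows = [(2022, 0), (2022, 3), (2022, 3), (2022, 2), (2023, 1)]
for year, shares in label_proportions(rows).items():
    print(year, shares)
```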
Label 0 has consistently represented a significant portion of ESG activities over the years, suggesting that many actions are being recorded. The stable or slightly declining trend may indicate that banks are prioritizing future actions over merely reporting results. In contrast, Label 1 has consistently accounted for a smaller proportion, highlighting that the number of ongoing but incomplete ESG activities is relatively low, possibly because banks are concentrating on achieving their ESG goals before documenting progress.
In the overall report, Label 2 holds the lowest weighting, indicating that banks are less forthcoming about their future plans, opting instead to act when opportunities present themselves. Conversely, Label 3 consistently has the highest weighting across nearly all years. Together with Label 0, it accounts for the majority of the ESG content in annual reports.
To enhance the effectiveness of ESG commitments, it is crucial to examine the impact of Label 3's weighting. If this weighting remains unchanged, a deeper analysis of how these commitments translate into actionable steps is necessary.
To enhance ESG (Environmental, Social, and Governance) performance, banks must establish clear plans and strategies, ensuring effective implementation in the future. Continuous monitoring of ongoing ESG activities is essential to bridge the gap between commitments and actual actions. Furthermore, detailed reporting on achieved results is crucial to showcase ESG efforts, while reducing the reliance on vague mentions can significantly bolster the bank's reputation and foster greater trust among investors and customers.
Figure 4.7: Top 5 most active banks in ESG in Vietnam
The Vietnam Bank for Agriculture and Rural Development (Agribank) stands out with an average of over 120 ESG-related actions annually, reflecting its strong commitment to sustainable development in rural areas and agriculture. Other leading banks, including Military Commercial Joint Stock Bank, National Commercial Joint Stock Bank, Joint Stock Commercial Bank for Foreign Trade of Vietnam, and Maritime Commercial Joint Stock Bank, follow closely with 100 to 120 ESG initiatives each year. This relatively small gap indicates a collective dedication among these institutions to achieve ESG goals. Major financial entities in Vietnam are increasingly engaging in ESG practices to enhance their reputation, attract investors, and support sustainable development efforts.
After manually categorizing the statements into four ESG action labels, the subsequent crucial step is to validate and ensure the reliability of these labels. Cohen's Kappa is used to evaluate the level of agreement among labelers. Using IBM SPSS Statistics 27 for the Kappa calculation, we compiled the following results.
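For reference, Cohen's Kappa for two labelers compares the observed agreement with the agreement expected by chance. The sketch below is a minimal stand-alone computation with an assumed function name and toy annotations; it is not the SPSS procedure used in the study.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's Kappa for two labelers annotating the same items."""
    n = len(labels_a)
    if n == 0 or n != len(labels_b):
        raise ValueError("both labelers must annotate the same non-empty set of items")
    # Observed agreement: share of items given the same label by both labelers
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each labeler's marginal label frequencies
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both labelers used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Toy annotations with the four action labels
labeler_1 = [0, 0, 1, 1, 2, 3, 3, 3]
labeler_2 = [0, 0, 1, 0, 2, 3, 3, 2]
print(round(cohen_kappa(labeler_1, labeler_2), 3))
```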
The study demonstrates a very high level of agreement with a Kappa value of 0.897, indicating results that are nearly perfect. The statistical significance (p