
Applying machine learning algorithms for stock price forecasting


DOCUMENT INFORMATION

Basic information

Title Applying machine learning algorithms for stock price forecasting
Author Nguyen Thu Huyen
Supervisor Dr. Nguyen Doan Dong
Institution Vietnam National University, Hanoi International School
Major Business Data Analytics
Type Graduation project
Year of publication 2024
City Hanoi
Pages 68
File size 8.5 MB


Structure

  • I. CHAPTER 1: INTRODUCTION
    • 1.1. MOTIVATION
    • 1.2. OBJECTIVES OF THE STUDY
    • 1.3. SUBJECT AND SCOPE OF RESEARCH
    • 1.4. RESEARCH METHODS
    • 1.5. SCIENTIFIC AND PRACTICAL SIGNIFICANCE
    • 1.6. THESIS LAYOUT
  • II. CHAPTER 2: THEORETICAL BACKGROUND
    • 2.1. BASIC ISSUES ABOUT STOCKS
      • 2.1.1. Stock concept
      • 2.1.2. Securities classification
    • 2.2. OVERVIEW OF THE STOCK MARKET
      • 2.2.1. Stock market concept
      • 2.2.2. The role of the stock market
      • 2.2.3. Classification of stock markets
    • 2.3. MACHINE LEARNING AND THE STOCK MARKET
    • 2.4. DEEP LEARNING AND THE STOCK MARKET
      • 2.4.1. Recurrent neural network (RNN)
      • 2.4.2. Long short-term memory neural network (LSTM)
      • 2.4.3. Transformer Network
    • 2.5. STOCK PRICE PREDICTION
      • 2.5.1. Overview
      • 2.5.2. Approaches
      • 2.5.3. Research proposal
  • III. CHAPTER 3: PROPOSED METHODS
    • 3.1. OVERVIEW OF THE PROPOSED METHODS
    • 3.2. CHARACTERISTICS OF THE PROPOSED MODEL
      • 3.2.1. Structure
      • 3.2.2. Evaluation methods
  • IV. CHAPTER 4: EXPERIMENT
    • 4.1. DATA
    • 4.2. DATA PROCESSING
    • 4.3. TRAIN
    • 4.4. EVALUATION
    • 4.5. BUILD AN APPLICATION THAT VISUALIZES RESULTS
  • V. CHAPTER 5: CONCLUSION
    • 5.1. CONCLUSION
    • 5.2. FUTURE WORKS

Content


CHAPTER 1: INTRODUCTION

1.1. MOTIVATION

Stock markets are essential for the robust development of both developed and developing economies. In Vietnam, the Ho Chi Minh City Stock Exchange (HOSE), established over 24 years ago, has played a significant role in the country's economic transformation since its inaugural trading session on December 28. Since July 2000, the securities industry has made notable progress, reflecting the overall growth of Vietnam's economy.

As of April 2024, Vietnam's stock market plays a crucial role in capital mobilization for the economy, having raised VND 418,271 billion in 2023, a 33.5% increase over the previous year. Since its inception, the stock market has mobilized approximately 2.7 million billion VND. From 2011 to April 2024, capital mobilization through the stock market totaled around 1.8 million billion VND, contributing an average of 20.5% of the country's total social investment capital.

Share market: By the end of March 2024, stock market capitalization was estimated at nearly 6.7 million billion VND, an increase of 12.2% compared to the end of 2023 [24]. In 2023, stock market capitalization reached nearly 6 million billion VND, up 14% from the start of the year [26].

Bond market: By the end of February 2024, the bond market had a listed value of more than 2,040 trillion VND, an increase of 17.1% compared to the end of 2023.

As of 2019, Vietnam's bond market capitalization accounted for over 30.3% of GDP, with the corporate bond segment making up nearly 10.9%. By April 2024, stock market capitalization is projected to reach approximately 92% of GDP, playing a crucial role in developing a modern financial system that integrates the stock and currency-credit markets. Notably, the VN-Index reached 1,286.11 points by the end of March 2024, an increase of 13.8% compared to the end of 2023.

In the first quarter of 2024, Vietnam's stock market demonstrated strong and stable growth, with the average transaction value rising to 22,529 billion VND per session, an increase of 28.2% compared to the 2023 average. This growth occurred despite ongoing challenges and fluctuations in the macroeconomic and international landscape.

Vietnam's stock market plays a crucial role in the country's economic restructuring by enhancing state-owned enterprise reform through equitization and state capital divestment, supported by modern auction mechanisms and listing processes. Additionally, it serves as a vital channel for capital mobilization, bolstering the state budget and restructuring public investment. The stock market also aids in the restructuring of credit institutions, particularly commercial banks, by promoting transparency and operational efficiency. With accessible and complete information on stock prices, it empowers business managers, investors, and individuals to make informed decisions, thereby fostering trust in the market while enhancing its overall efficiency.

Predicting the stock market has great practical significance and has attracted considerable attention from researchers worldwide. Various solutions have been proposed, each with its own advantages and disadvantages. Currently, machine learning stands out as one of the most effective approaches for stock price forecasting. Therefore, I have selected "Applying Machine Learning Algorithms for Stock Price Forecasting" as the focus of my graduation thesis.

1.2. OBJECTIVES OF THE STUDY

This thesis focuses on researching and solving the problem of predicting stock prices in the banking stock market in Vietnam with the top 4 large banks such as BIDV,

Using data from Yahoo Finance, a leading platform for financial stock evaluation, I preprocess and extract features to apply machine learning techniques, specifically supervised deep learning with Transformers, for predicting stock prices of major banks like Vietinbank, Vietcombank, and Techcombank. This approach aims to develop the best-performing predictive model for stock price forecasting.
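The thesis does not specify its preprocessing steps at this point, so the following is only a minimal sketch of one common way to turn a price series into supervised training samples. Min-max scaling, the sliding-window layout, the function names `min_max_scale` and `make_windows`, and the price values are all assumptions made for illustration, not the author's pipeline.

```python
def min_max_scale(prices):
    """Scale a list of prices linearly into the range [0, 1]."""
    lo, hi = min(prices), max(prices)
    if hi == lo:
        # A constant series carries no scale information; map it to zeros.
        return [0.0 for _ in prices]
    return [(p - lo) / (hi - lo) for p in prices]


def make_windows(series, window):
    """Build (input window, next value) pairs for supervised learning."""
    pairs = []
    for i in range(len(series) - window):
        pairs.append((series[i:i + window], series[i + window]))
    return pairs


# Hypothetical closing prices (illustrative values, not real market data).
closes = [10.0, 12.0, 11.0, 13.0, 14.0, 12.0]
scaled = min_max_scale(closes)
samples = make_windows(scaled, window=3)

for features, target in samples:
    print(features, "->", target)
```

Each sample pairs the three most recent scaled prices with the price that follows them, which is the shape a sequence model such as an LSTM or Transformer would typically consume.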

1.3. SUBJECT AND SCOPE OF RESEARCH

The thesis focuses on analyzing stock data sourced from Yahoo Finance, a leading platform for evaluating and analyzing global financial stocks. This research specifically examines the stock performance of large corporations and companies worldwide.

Scope of research: Deep learning and machine learning techniques are applied to the problem of stock price prediction for stocks with a wide range of trading days.

1.4. RESEARCH METHODS

Theoretical research methods in stock market analysis involve synthesizing and examining documents related to securities and stocks, employing algorithms for stock price prediction, and integrating deep learning techniques. This research also encompasses acquiring knowledge of the stock market, machine learning, and computer programming to enhance predictive accuracy and decision-making in trading.

Experimental research methods involve several key steps: first, understanding the theory and defining the problem; next, proposing a model for investigation; then, building and developing applications based on that model; and finally, installing a testing program to evaluate the results obtained.

- Comparison and evaluation methods: Analyze and evaluate the proposed model with previous research models using different metrics.
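The section does not name the metrics used for comparison. As a hedged illustration only, the sketch below computes three error metrics that are widely used for price forecasts: MAE, RMSE, and MAPE. The choice of these metrics and the sample values are assumptions, not taken from the thesis.

```python
import math


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error; penalizes large errors more than MAE."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


def mape(actual, predicted):
    """Mean absolute percentage error; assumes no actual value is zero."""
    ratios = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * ratios / len(actual)


# Hypothetical actual and predicted prices for three trading days.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 400.0]

for name, metric in (("MAE", mae), ("RMSE", rmse), ("MAPE %", mape)):
    print(name, round(metric(actual, predicted), 4))
```

Comparing two models then amounts to computing the same metrics on the same held-out data and preferring the model with the lower errors.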

1.5. SCIENTIFIC AND PRACTICAL SIGNIFICANCE

The thesis presents a novel approach to feature extraction aimed at addressing the challenges of stock price prediction. By evaluating various models, it seeks to identify and develop the most effective predictive model for stock prices.

This study presents a price prediction model and evaluates the performance of various machine learning techniques, including Artificial Neural Networks (ANN), Support Vector Regression (SVR), and Random Forest, alongside deep learning models such as Long Short-Term Memory (LSTM) and Transformer. The analysis is based on a dataset of 2,000 to 5,000 stock samples sourced from Yahoo Finance.

This thesis presents a visually driven platform that applies predictive models to forecast stock prices of Vietnam's leading large banks. Additionally, the application offers comprehensive statistics on model performance and detailed data analysis, enhancing the understanding of stock market trends.

1.6. THESIS LAYOUT

Chapter 1 provides an overview of the stock price prediction research field, outlining the key prediction problems and objectives. It defines the scope of the research and identifies the primary objects of study, while detailing the methodologies employed to achieve the best results in stock price forecasting.

Chapter 2 provides essential background knowledge relevant to the thesis, focusing on machine learning concepts such as artificial neural networks and backpropagation. It also explores the deep learning models that are trained, including Long Short-Term Memory (LSTM) networks and Transformers. Additionally, this chapter offers an overview of stock market prediction and discusses various research approaches and proposals in the field.

In Chapter 3, I explore the proposed Transformer model designed for stock price prediction, detailing its key characteristics. Additionally, I outline the methods used to evaluate the effectiveness of this model.

Chapter 4 details the experiments conducted, outlining the data preprocessing steps, model training processes, and evaluation methods employed. It compares the results of each tested model across various data set organizations. This analysis supports commentary on the performance of the implemented models, ultimately leading to the selection of the best model for result visualization.

Chapter 5 summarizes the main ideas of the thesis, analyzes the pros and cons of the trained models, and on that basis proposes an appropriate model for the problem of predicting stock prices, along with possible directions for future work that are approachable and applicable in practice.

CHAPTER 2: THEORETICAL BACKGROUND

2.1. BASIC ISSUES ABOUT STOCKS

Securities serve as instruments for mobilizing medium- and long-term capital. They are valuable papers that are both convertible and transferable, establishing ownership and debt relationships between the holder and the issuer. Each type of security typically possesses the following properties.

- Liquidity: The ability of a security to be quickly converted into cash without significantly affecting its market value. Liquidity is demonstrated through active trading on the market, where the security can be easily bought, sold, and exchanged.

- Profitability: Investor income is generated from increases in stock prices on the market, or from annual interest payments.

- Risk: This is a basic characteristic of securities. In the process of exchange, buying, and reselling, the price of a security may fall or its value may be lost entirely; this is called risk.

Shares are securities that signify ownership and legal interest in the assets and income of a joint stock company. The company's total capital is divided into small, equal portions known as shares, which are owned by shareholders. Shares can be issued as physical certificates or recorded as book entries, and only joint stock companies can issue shares. The initial recorded value of a share is its par value, also referred to as nominal value. Dividends are the payments received from capital contributions, while stock prices fluctuate during trading sessions on the stock market, independent of par value. Shares are categorized into two types.

Common stock is a type of equity that remains in existence as long as the issuing company operates, without a predetermined maturity date or fixed interest rate. Profits are distributed to shareholders at the end of each settlement period. Holders of common stock have voting rights, the right to purchase additional shares, and the right to participate in shareholder meetings.

Voting preference shares are allocated specifically to founding shareholders, who are required to retain these shares for a designated duration. During this period, they are prohibited from transferring or exchanging their holdings.

Financial preference shares are akin to common shares but come with certain restrictions: their holders cannot participate in elections or assume board positions. However, they benefit from a fixed annual dividend rate, receive priority in dividend payments, and are ahead of common stockholders in asset distribution in the event of liquidation or dissolution.

A bond is a financial security that represents the issuer's commitment to pay the bondholder a predetermined sum of money at designated intervals and under specific conditions. As a form of debt security, bonds can be issued as physical certificates or recorded as book entries. Various types of bonds exist, each serving different investment needs.

Bearer bonds do not identify the bondholder on the certificate or in the issuing organization's records, making them easy to transfer. Because they are simple to trade, bearer bonds are frequently bought and sold on the stock market.

- Registered bonds: The name and address of the bondholder are recorded on the certificate and in the books of the issuing organization. This type is rarely traded in the market.

Government bonds are securities issued by the government to address budget deficits, making them a favored choice for risk-averse investors due to their minimal payout risk.

- Construction bonds: A type of bond issued to mobilize capital to build infrastructure projects or public welfare projects

Corporate bonds are financial instruments issued by companies to raise medium- to long-term capital. In this arrangement, the company acts as the borrower while the bondholder serves as the creditor. The company is obligated to pay interest and return the principal to bondholders according to the terms outlined in the bond contract. There are various types of corporate bonds, including secured bonds, unsecured bonds, and redeemable bonds.

Investment fund certificates are securities issued by fund management companies to raise capital from investors. This capital is used to buy and sell various securities with the aim of generating profits, which are subsequently distributed to the investors. Investment fund certificates can be categorized into different types based on their characteristics and investment strategies.

- Mutual investment fund: This is a type of fund that everyone can participate in

- Private investment fund: Limited to only a certain group of people

An open-ended investment fund allows investors to buy and sell fund certificates at any time, as the fund continuously issues new securities to attract capital. The fund is always prepared to repurchase the securities it has issued, providing liquidity and flexibility for investors.

- Stock investment fund: A fund that specializes in investing in a certain type of stock

- Bond investment fund: A fund that invests in a certain type of bond

A mixed investment fund allocates capital across a diverse range of securities deemed effective, including various derivatives. These derivatives may include instruments such as stock purchase warrants, forward contracts, and options, allowing for a flexible approach to investment strategies.

2.2. OVERVIEW OF THE STOCK MARKET

The production process relies on essential factors such as labor, capital, land, and technology, with capital becoming increasingly vital. However, a disparity exists: those with profitable investment opportunities often lack capital, while those with capital have limited investment options. This gap creates the need for intermediary organizations to connect parties with a capital surplus and parties with a capital deficit. The banking system emerged to address this need, but to further attract idle monetary resources for direct investment in production, the stock market was established. The stock market serves as a platform for trading securities, catering to participants with diverse investment objectives.

2.2.2 The role of the stock market

The establishment of the World Trade Organization (WTO), the European Union, and various regional markets has intensified the trend of international and regional economic integration, compelling countries to develop their economies with speed and efficiency. This global shift underscores the significant role of the stock market in facilitating economic growth across nations.

The stock market serves as a vital mechanism for capital accumulation, concentration, and distribution, facilitating the timely transfer of funds to meet economic development needs. In South Korea, the stock market has contributed significantly to economic growth over the past three decades, ranking as the 13th largest stock market globally by the 1990s and supporting sustained average economic growth.

In 1995, the average national income per capita exceeded 10,000 USD, with a financial market rate of 9% per year. The information factor plays a crucial role in competitive market dynamics, facilitating effective capital distribution. The stock market, as a leader in technological innovation, adapts quickly to environmental changes, ensuring that all investors receive timely updates to analyze and price securities accurately. This fosters robust competition among financial institutions, prompting commercial banks to optimize their operations and reduce costs. By raising capital through the stock market, companies can strengthen their equity and reduce their reliance on expensive loans and stringent bank regulations. Additionally, the presence of stock markets is vital for attracting foreign investment, thereby supporting efficient resource allocation both domestically and internationally.

The stock market plays a crucial role in promoting equitable wealth distribution, reducing the concentration of corporate power, and engaging the middle class, which enhances social oversight of the distribution process. This fosters fair competition, stimulates economic growth, and contributes to the development of a fair and democratic society.

The stock market plays a crucial role in separating ownership from management in businesses, particularly as they grow in scale. This separation calls for specialized management, which the stock market supports by facilitating capital accumulation and promoting the equitization of state-owned enterprises. By reducing managerial inefficiencies, the stock market helps align the interests of owners, managers, and employees.

The successful internationalization of the stock market enhances competition and enables companies to access cheaper capital, which in turn increases foreign investment and strengthens the global competitiveness of the economy. This expansion of business opportunities for domestic companies is exemplified by nations such as South Korea, Singapore, Thailand, and Malaysia, which have effectively leveraged the advantages offered by their stock markets.

However, countries also need to consider possible negative impacts such as excessive money supply growth, inflationary pressure, and capital flight.

The stock market plays a crucial role in enabling the government to mobilize monetary and financial resources while offering valuable insights into future business cycles, aiding both government and corporate investment planning. It also facilitates economic restructuring. However, the stock market is not without its drawbacks, including speculation, power conflicts, and price bubbles, as well as insider trading and market manipulation, which can harm minority shareholders and deter investors. Therefore, it is essential for market managers to mitigate these negative impacts to safeguard investor interests and maintain market efficiency.

The stock market plays a crucial role in various aspects of the economy, and its effectiveness relies heavily on the active participation of market participants and the management strategies implemented by the government.

The primary market is where securities are first issued to the public, enabling issuers to raise capital by selling these securities to investors. This process transforms temporarily idle funds into investments, moving capital from short-term to long-term uses. By providing essential capital for various investments, the primary market plays a crucial role in enhancing the economy's investment capacity. Securities can be issued through two main methods: private issuance and public issuance.

The secondary market facilitates transactions involving securities initially issued in the primary market. By enhancing liquidity and providing a platform for buying and selling existing securities, the secondary market supports issuers in raising capital, ultimately contributing to the growth of the primary market. In the stock market, primary and secondary market transactions coexist, which reflects their interconnectedness and importance in capital formation.

Centralized market: A place where transactions, exchanges, and purchases of securities are carried out through the stock exchange (also known as the trading floor).

Decentralized market: Also known as the OTC market, this market operates anywhere that stock trading and exchange activities take place.

Figure 2.1: Stock market classification model

The bond market: The market for purchasing, selling, and issuing new bonds

The stock market: The market for purchasing, trading, and issuing new shares. This market's activities include both securities investment and securities trading.

Figure 2.2: Model of different types of stock markets

The spot market is a trading platform where securities are exchanged at a price established at the time of contract signing, with payment and delivery occurring within a specified timeframe thereafter. This market plays a crucial role in mobilizing and enhancing capital.

The forward trading market is a platform where securities are bought and sold at pre-agreed contract prices, with payment and delivery occurring after the contract signing date, typically within a period of 30 to 60 days.

The futures trading market operates similarly to the forward trading market, but it differs in that it uses standardized sales contracts with specific performance conditions. Additionally, it requires a deposit, secured under escrow regulations, to ensure the integrity of the contract.

2.3. MACHINE LEARNING AND THE STOCK MARKET

The Fourth Industrial Revolution, characterized by the rise of technology, is transforming all aspects of life and has become a competitive objective for economies and businesses alike. Central to this revolution are Artificial Intelligence (AI) and Machine Learning, which have been extensively researched and applied across various sectors, particularly in the banking and financial technology (FinTech) industries. These advancements have led to significant improvements in efficiency, timeliness, and accuracy in supervision, forecasting, reporting, and decision-making processes.

Machine learning, a subset of artificial intelligence, is dedicated to creating methods that enable systems to learn automatically from data in order to solve specific problems. For instance, machines can be trained to sort email messages into the correct folders by identifying whether they are spam. Although they go by different names, statistical inference and machine learning share fundamentally similar principles.

Machine learning is closely tied to statistics, as both disciplines analyze data; however, machine learning places more emphasis on the complexity of the algorithms used for computation. Many inference problems are NP-hard, which has led to the development of approximate inference algorithms within the field of machine learning.


Machine learning is widely used across various industries, including data mining, medical diagnosis, stock market analysis, fraud detection, DNA sequencing, speech and handwriting recognition, automatic translation, gaming, and robot navigation. These applications share a common feature: they rely on machine learning algorithms, which act as a "logical brain" that processes digitized input data through multiple layers, with complexity and intelligence enhanced through deep learning techniques.

The successful implementation of Machine Learning techniques in artificial intelligence systems has enabled major corporations like Facebook, Amazon, and Google to achieve remarkable advancements in less than a decade.

Machine Learning, when integrated with quantitative analysis models, plays a crucial role in finance, banking, and stock markets by identifying patterns in data samples and making accurate predictions that improve decision-making, business continuity, and risk management. Competition in the banking and stock market sectors is intense, with emerging tech companies like Feedzai and Shift Technology, alongside industry giants such as IBM, Google, and Alibaba, leveraging technological advancements to gain a competitive edge in these fields.

Machine Learning is increasingly transforming the finance and banking sectors, with notable examples such as Monzo, a UK-based startup bank. Monzo has developed a rapid analysis and forecasting model that detects and prevents fraudulent activities, showing the potential of AI-driven solutions to enhance security and operational efficiency in the financial industry.

In response to fraud attempted while transactions were being completed, Monzo's measures reduced the fraud rate on prepaid cards from 0.85% in June 2016 to under 0.1% by January 2017. Additionally, technology firms like Xcelerit and Kinetica equip banks and investment companies with advanced systems that monitor and detect potential risks in real time, enabling strict oversight of capital requirements.

In 2017, JPMorgan Chase introduced COiN, an intelligent contract management platform that uses Machine Learning and can review 12,000 commercial credit contracts in seconds, equivalent to 360,000 hours of work by a regular employee.

Machine learning plays a crucial role in the financial sector, particularly in stock market prediction, investment optimization, financial data processing, and trading strategy implementation. Its advanced data processing capabilities make it applicable across various domains. Consequently, the stock market is considered one of the most significant and preferred areas for machine learning research and development within the financial industry.

2.4. DEEP LEARNING AND THE STOCK MARKET

In recent years, advancements in computer processing power and the accumulation of vast amounts of data have propelled Machine Learning forward, leading to the emergence of Deep Learning (DL). This field uses multi-layered artificial neural networks, inspired by biological systems, and relies on GPUs and specialized hardware. Deep Learning employs a cascade of layers with nonlinear processing units to extract and transform data features. Each layer processes the output of the previous one. The aim is to advance technologies built on artificial neural networks, such as language processing, speech recognition, machine translation, and natural language understanding.
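The idea of a cascade of layers, each consuming the previous layer's output, can be sketched in a few lines. This is a minimal illustration, not the thesis's model: the tanh nonlinearity, the layer sizes, and every weight value are assumptions chosen only to make the example runnable.

```python
import math


def dense(inputs, weights, biases):
    """One layer: a weighted sum per unit followed by a tanh nonlinearity."""
    return [
        math.tanh(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def forward(inputs, layers):
    """Feed the output of each layer into the next one."""
    activation = inputs
    for weights, biases in layers:
        activation = dense(activation, weights, biases)
    return activation


# Hypothetical parameters: 2 inputs -> 3 hidden units -> 1 output.
layers = [
    ([[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]], [0.0, 0.1, -0.1]),
    ([[0.7, -0.5, 0.2]], [0.05]),
]
output = forward([1.0, 2.0], layers)
print(output)
```

In a trained network the weights would be learned from data rather than written by hand; the structure of the forward pass stays the same.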

Figure 2.3: Brief history of deep learning

(Source: Machine learning VS Deep learning)

Deep learning is a leading data-driven machine learning approach that is currently being researched and applied. Its capacity to surpass human performance in certain cognitive tasks highlights its potential. These attributes position deep learning as a highly promising approach within the field of artificial intelligence.

Deep Learning as well as Machine Learning can be divided into 4 main groups:

In the next section we present the deep learning model researched in the thesis: the supervised Transformer deep learning model.

When reading an essay, individuals comprehend each word in context, relying on previous information rather than analyzing each word in isolation. This shows the importance of memory in processing thoughts. Similarly, a neural network aiming to emulate human cognition must retain and use prior information to interpret relationships and meanings.

In analyzing a movie clip, it is essential to interpret events based on preceding occurrences. While a traditional artificial neural network (ANN) processes the entire movie as one comprehensive signal vector, a recurrent neural network (RNN) segments the movie into distinct moments, such as individual frames. Consequently, the input for the RNN consists of a sequence of vectors, each representing a frame, from the beginning to the end of the movie. This allows the RNN to use information from earlier frames to make predictions about subsequent frames.

A recurrent neural network retains its output over time by looping it back into the network, so that information is preserved from one time step to the next within the same data point.

Figure 2.4: Simple recurrent network model

The model illustrates a loop represented by the backward arrow, with a condensed view of the network on the left that highlights this loop. On the right, the network is shown after it has been unrolled (opened) over time. As noted above, the network's input is no longer a single signal vector x but a series of signal vectors.

At time $t$, the input signal of the data point is the signal vector $x_t$, and the network produces the hidden result $h_t$, which also serves as the output signal vector at that time.

The recurrent network processes the input signal $x_t$ together with the previous hidden result $h_{t-1}$. These signals are combined using their own parameter matrices to compute the hidden result $h_t$. This process continues until the data point is fully read from start to finish.

The basic formula the recurrent network uses to compute the hidden signal vector $h_t$ is:

$$h_t = \varphi(W x_t + U h_{t-1})$$

The matrix $W$ plays the same role as the weight matrix in a basic artificial neural network, while the matrix $U$ holds the weights applied to the previous result signal. The function $\varphi$ compresses each value of the vector into a bounded range (e.g., -1 to 1) and is differentiable, so derivatives can be computed. A common choice is tanh.
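The recurrence described above can be sketched as a short forward pass. This is a minimal illustration that assumes the update $h_t = \tanh(W x_t + U h_{t-1})$ with a zero initial hidden vector; the names `rnn_forward` and `mat_vec`, the dimensions, and all weight values are hypothetical.

```python
import math


def mat_vec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def rnn_forward(xs, W, U, h0):
    """Return the hidden vector after every time step.

    Assumed update: h_t = tanh(W x_t + U h_{t-1}).
    """
    h = h0
    history = []
    for x in xs:
        wx = mat_vec(W, x)  # contribution of the current input
        uh = mat_vec(U, h)  # contribution of the previous hidden result
        h = [math.tanh(a + b) for a, b in zip(wx, uh)]
        history.append(h)
    return history


# Hypothetical weights for 2-dimensional inputs and hidden vectors.
W = [[0.5, -0.1], [0.2, 0.3]]
U = [[0.1, 0.0], [0.0, 0.1]]
xs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # a sequence of three signal vectors

hidden = rnn_forward(xs, W, U, h0=[0.0, 0.0])
for t, h in enumerate(hidden, start=1):
    print(t, h)
```

The same matrices W and U are reused at every time step, which is what lets the network read sequences of any length.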

The problem of dependence on information from the distant past

With the ideas and formulas above, the recurrent network runs into some problems that are difficult to solve. A typical example is the vanishing gradient problem [13].

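In the backpropagation calculation, the gradient contains a series of products that represent the levels of dependence between time steps:

$$\frac{\partial E_t}{\partial w} = \sum_{k=1}^{t} \frac{\partial E_t}{\partial h_t} \left( \prod_{j=k+1}^{t} \frac{\partial h_j}{\partial h_{j-1}} \right) \frac{\partial h_k}{\partial w}$$

where $E$ is the squared error, $t$ is the output time step of the training sample, $w$ is the weight matrix, and $h_j$ is the hidden signal vector at time $j$. When the factors in the product have absolute value less than 1, the product shrinks toward zero as the number of steps grows.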

When a neural network is given ample time to absorb information, it eventually reaches a saturation point where it can no longer learn effectively This phenomenon mirrors the concept of overfitting, where the network becomes overwhelmed by excessive data, hindering its ability to generalize and adapt to new information.

In language processing, context plays a crucial role in predicting words. For instance, in the phrase "Fish swim under ", the proximity of the word "fish" allows us to easily infer the next word as "water." This short distance helps the recurrent network learn effectively. Conversely, in a sentence like "I lived in Vietnam since I was young, so I can speak Vietnamese," the relevant information is placed much further back, so the network must retain and recall it to make an accurate prediction. This highlights the importance of long-range dependencies in language models.

In theory, it is possible to address these challenges by increasing the amount of information to be remembered and adjusting the learning rate coefficient (η) accordingly. In practice, however, recurrent networks struggle to learn information that is too distant, primarily because of the vanishing dependency issue.

Recurrent Neural Networks (RNNs) face challenges in retaining crucial past information for accurate predictions. When the time gap between relevant data points is large, the network struggles to use this information effectively: the longer the distance, the weaker the dependency becomes, often approaching zero. From that point onwards, the network will no longer learn anything.
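A small numeric illustration shows how quickly such a product decays. The per-step factor of 0.9 is a hypothetical value chosen only for the example; in a real network the factors vary from step to step.

```python
def gradient_factor(per_step_factor, steps):
    """Product of `steps` identical dependency factors."""
    return per_step_factor ** steps

# Dependency on a word 5 steps back versus 100 steps back.
near = gradient_factor(0.9, 5)    # about 0.59
far = gradient_factor(0.9, 100)   # about 2.7e-5
```

With the same factor, the contribution of a time step 100 positions back is more than four orders of magnitude smaller than that of a step 5 positions back, which is why distant information is effectively lost.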

2.4.2 Long short-term memory neural network (LSTM)

To address the challenge of relying on historical information, particularly the limitations of recurrent networks in processing distant data, the long short-term memory (LSTM) neural network was developed. Introduced by Hochreiter and Schmidhuber in 1997, LSTMs have since been enhanced by numerous researchers, making them a pivotal advancement in deep learning.

STOCK PRICE PREDICTION

Numerous research projects worldwide focus on stock price prediction, primarily using machine learning algorithms alongside theoretical investment analysis techniques to improve forecasting accuracy from historical stock market data. Stock market forecasting presents significant challenges. One of the methods employed is the Autoregressive Integrated Moving Average (ARIMA), which addresses time series forecasting problems. While ARIMA is effective on linear and stable time series data, its performance diminishes on the non-linear and volatile data of the stock market. Researchers have therefore increasingly directed their work towards combining algorithms to find better methods for the problem of predicting stock prices.

There are many different approaches used to predict stock prices such as:

The hybrid technique combining ARIMA and SVM [10] is effective for forecasting, as it predicts the linear components with ARIMA while capturing the non-linear aspects with SVM. This approach acknowledges that accurate forecasting often involves both linear and non-linear elements.

Combining Wavelet with SVM [6], where SVM is used for forecasting and Wavelet is used to analyze the stock data.

The Artificial Neural Network (ANN) was employed in later research. In order to forecast partially non-linear stock price data, ANN was coupled with the ARIMA algorithm [1].

Another efficient hybrid method for this problem combines wavelet transforms and artificial neural networks [3].

Stock price prediction is another application for the Convolutional Neural Network (CNN) algorithm [16]

Recurrent neural network models (RNNs) with reinforcement learning have been proposed in a few studies [5][9][11][12].

Recent research employs Generative Adversarial Networks (GANs) with a two-layer architecture, where a Long Short-Term Memory (LSTM) network serves as the generator and a Bidirectional LSTM (BiLSTM) network serves as the discriminator, distinguishing the stock data predicted by the LSTM from the real data.

Another study uses an unsupervised learning method for stock price prediction with two layers: an LSTM used to predict and a CNN used to distinguish between predicted and actual stock data.

A further study presents a predictive model for stock data based on a Generative Adversarial Network (GAN) with unsupervised learning. This model consists of two main components: a Multi-Layer Perceptron (MLP) for forecasting stock prices and a Long Short-Term Memory (LSTM) layer that differentiates between predicted and actual stock data.

This thesis explores an approach to stock price prediction based on supervised deep learning, specifically a Transformer model that comprises three encoder layers. By analyzing historical stock market data, this method aims to improve the accuracy of stock price forecasts.

This study introduces a Transformer model for time series analysis, featuring three encoder layers for stock price prediction from time series data. Research indicates that multiple encoder layers improve the effectiveness of Transformer models in time series forecasting. After extensive experimentation, I found that three encoder layers provide a suitable depth, allowing the model to learn and extract complex features without risking overfitting or training difficulties. This configuration strikes a balance between performance and computational efficiency: too many layers can prolong training and demand excessive resources, while too few may fail to capture critical features.

Each encoder layer uses a feed-forward neural network and a multi-head attention mechanism to capture complex and long-term relationships in data. To prevent overfitting, techniques like Dropout and Residual Connections are implemented, while a learning rate reduction schedule improves the optimization of the training process.

- Parameters of the model, such as the number of neurons and layers: I have tested these as well, and the parameters presented in the thesis are the ones that give the best results. For this reason I chose these parameters for the final model, which was used to build the application that visualizes the results.

CHAPTER 3: PROPOSED METHODS

OVERVIEW OF THE PROPOSED METHODS

The overall proposed model is presented in Figure 3.1

Figure 3.1: Structure of two Attention mechanisms

The general proposed model includes two main parts: model training and a web application.

- Model training: we begin with raw stock data, which undergoes preprocessing to improve its quality. We then train a Transformer deep learning model that processes the prepared data and extracts useful features from it.

- Web application: the user picks one of four stock codes, belonging to four major banks (BIDV, Vietinbank, Vietcombank, Techcombank), and the data is preprocessed. The application then uses one of the five available models (ANN, SVR, Random Forest, LSTM and Transformer) to predict and prints the results to the web interface.

Details of the proposed model are presented in the next sections

CHARACTERISTICS OF THE PROPOSED MODEL

The Transformer is a supervised machine learning model that uses an attention mechanism to focus on significant segments of time series data, capturing intricate correlations. By employing three attention heads, the model learns diverse features, which improves prediction accuracy. Layer Normalization stabilizes the training process and reduces overfitting, particularly on noisy financial data. The Dense layer with ReLU activation generates non-linear features, while two Dense layers in each encoder increase the model's learning capacity. Residual Connections preserve information and maintain stable gradients, while Dropout mitigates overfitting by randomly deactivating neurons during training. Exponential decay of the learning rate keeps the training process stable and helps avoid local minima. With three encoder layers incorporating multi-head attention, layer normalization, dense neural networks, and residual connections, the model strikes a balance between complexity and performance. This architecture is well suited to stock price prediction, which leads to the Transformer structure proposed below.

Figure 3.2: My proposed Transformer structure

● First Encoder Layer: Processes the initial inputs and creates new features through the Attention and Feed Forward Network mechanisms.

● Second Encoder Layer: Processes the output from the first encoder layer, further refining it and capturing more complex features from the data.

● Third Encoder Layer: Processes the output from the second encoder layer, giving the model an additional step to learn and refine features from the data.

Figure 3.3: Details of my proposed Transformer structure

Three stacked encoders are used to learn and capture intricate features from time series data, improving prediction accuracy and efficiency. Each encoder layer incorporates Layer Normalization, Multi-Head Attention, Dropout, Residual Connections, and a Feed Forward Network. Together, these elements improve the model's predictive performance and stability.

Layer Normalization, applied at the beginning of each encoder layer, accelerates and stabilizes training by normalizing inputs across features. This ensures that the features of each input have a mean of zero and a standard deviation of one, which helps performance and convergence.

$\mathrm{LayerNorm}(x) = \frac{x - \mu}{\sqrt{\sigma^2 + \epsilon}}$ (3.1)

where $x$ represents the input vector, $\mu$ is the mean, $\sigma^2$ is the variance of the input features, and $\epsilon$ is a tiny constant that prevents division by zero. By reducing internal covariate shift, Layer Normalization improves the stability and effectiveness of the training.
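Equation (3.1) can be checked numerically with a short NumPy sketch. It is an illustration under simplifying assumptions: the learnable scale and shift parameters that library implementations usually add are omitted, and the input values are made up.

```python
import numpy as np

def layer_norm(x, eps=1e-6):
    """Normalize each row of x across its features, as in equation (3.1)."""
    mu = x.mean(axis=-1, keepdims=True)    # per-sample mean
    var = x.var(axis=-1, keepdims=True)    # per-sample variance
    return (x - mu) / np.sqrt(var + eps)

# Two hypothetical samples with very different scales.
x = np.array([[1.0, 2.0, 3.0, 4.0],
              [10.0, 20.0, 30.0, 40.0]])
normalized = layer_norm(x)
```

Both rows of `normalized` end up with mean close to 0 and standard deviation close to 1, even though the raw rows differ by a factor of ten.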

The Multi-Head Attention method enables the model to focus on various segments of the input sequence simultaneously, allowing it to uncover different relationships within the data. Each attention head performs scaled dot-product attention:

$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\left(\frac{QK^T}{\sqrt{d_k}}\right)V$ (3.2)

where $Q$ (query), $K$ (key), and $V$ (value) are linear transformations of the input, and $d_k$ is the dimension of the key vectors. Using multiple attention heads lets the model focus on different input features at once, improving its ability to capture complex patterns.
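A single attention head following equation (3.2) can be sketched in NumPy as below. The sequence length, model width, and random projection matrices `W_q`, `W_k`, `W_v` are assumptions made for the example; a multi-head layer would run several such heads in parallel and concatenate their outputs.

```python
import numpy as np

def softmax(z, axis=-1):
    """Numerically stable softmax."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    """softmax(Q K^T / sqrt(d_k)) V for one attention head."""
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)       # similarity of every query to every key
    weights = softmax(scores, axis=-1)    # each row sums to 1
    return weights @ V, weights

# Hypothetical input: 6 time steps, 8 features, projected to d_k = 4.
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 8))
W_q, W_k, W_v = (rng.normal(size=(8, 4)) for _ in range(3))
output, weights = scaled_dot_product_attention(X @ W_q, X @ W_k, X @ W_v)
```

Row `i` of `weights` shows how much time step `i` attends to every other time step, which is how the model can relate distant days directly.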

Dropout is a regularization technique that randomly deactivates a fraction of input units during training to reduce the risk of overfitting. Because the network cannot rely too heavily on any specific unit, it generalizes better to unseen data.

$\mathrm{Dropout}(x, p) = x \odot m, \quad m \sim \mathrm{Bernoulli}(1 - p)$ (3.3)

where $x$ is the input and $p$ is the dropout rate, i.e. the probability of deactivating a neuron, so each unit is kept with probability $1 - p$.
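The dropout operation in (3.3) can be sketched as follows. One detail here goes beyond the formula in the text and is an assumption of this example: the kept units are rescaled by 1/(1 - p) ("inverted dropout"), as common deep learning libraries do, so that the expected activation is unchanged between training and inference.

```python
import numpy as np

def dropout(x, p, rng, training=True):
    """Zero each unit with probability p; rescale survivors by 1 / (1 - p)."""
    if not training or p == 0.0:
        return x                            # dropout is disabled at inference
    keep_mask = rng.random(x.shape) >= p    # True with probability 1 - p
    return x * keep_mask / (1.0 - p)

rng = np.random.default_rng(0)
x = np.ones(10_000)
dropped = dropout(x, p=0.2, rng=rng)
```

Roughly 20% of the entries of `dropped` are zero, and the mean stays close to 1 because of the rescaling.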

Deep network training can be hindered by the vanishing gradient problem, which can be addressed using residual connections, also called skip connections. These connections add a layer's input to its output, so gradients can flow more freely through the network. A residual connection is defined as:

$\mathrm{Output} = \mathrm{Input} + \mathrm{Transformation}(\mathrm{Input})$ (3.4)

These connections enable the model to learn identity functions, ensuring that the original input information is preserved and allowing for deeper architectures.

The Feed Forward Network (FFN) is applied to each position in the sequence separately. It is composed of two linear transformations with a ReLU activation function in between:

$\mathrm{FFN}(x) = \mathrm{ReLU}(xW_1 + b_1)W_2 + b_2$ (3.5)

where $W_1$ and $W_2$ are weight matrices and $b_1$ and $b_2$ are bias terms. The FFN introduces non-linear transformations that improve the model's expressiveness, enabling it to capture intricate relationships within the data.

The proposed model stacks three encoder layers, each built from the components above. This configuration lets the model learn hierarchical representations of the data, which is important for precise time series prediction.
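Equations (3.4) and (3.5) combine naturally: the feed-forward network is the transformation, and the residual connection adds its input back. The sketch below assumes hypothetical sizes (`d_model = 8`, `d_ff = 16`) and random weights; it is meant only to show the shape of the computation.

```python
import numpy as np

def feed_forward(x, W1, b1, W2, b2):
    """FFN(x) = ReLU(x W1 + b1) W2 + b2, applied to every position (row)."""
    return np.maximum(0.0, x @ W1 + b1) @ W2 + b2

def residual(x, sublayer):
    """Output = Input + Transformation(Input)."""
    return x + sublayer(x)

rng = np.random.default_rng(0)
d_model, d_ff = 8, 16                       # hypothetical layer widths
x = rng.normal(size=(6, d_model))           # 6 time steps
W1 = rng.normal(scale=0.1, size=(d_model, d_ff))
b1 = np.zeros(d_ff)
W2 = rng.normal(scale=0.1, size=(d_ff, d_model))
b2 = np.zeros(d_model)

out = residual(x, lambda h: feed_forward(h, W1, b1, W2, b2))
```

If the sublayer outputs zeros, `out` equals `x`, which is the identity behaviour that lets information and gradients pass through deep stacks unchanged.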

Optimizer with Learning Rate Scheduler

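An Exponential Decay learning rate schedule is used to reduce the learning rate over time, helping the model to converge more smoothly. The learning rate schedule formula is:

$\eta_s = \eta_0 \cdot r^{s / k}$

where $\eta_0$ is the initial learning rate, $r$ is the decay rate, $k$ is the number of decay steps, and $s$ is the current training step.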

This ensures that the learning rate decreases gradually, improving the training stability
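As an illustration, an exponential decay schedule in its usual form multiplies the initial learning rate by `decay_rate ** (step / decay_steps)`. The hyperparameter values below are hypothetical, since the values used in the thesis are not stated in this section.

```python
def exponential_decay(initial_lr, decay_rate, decay_steps, step):
    """Learning rate after `step` training steps under exponential decay."""
    return initial_lr * decay_rate ** (step / decay_steps)

# Hypothetical settings: start at 1e-3 and shrink by 10% every 1000 steps.
schedule = [exponential_decay(1e-3, 0.9, 1000, s) for s in (0, 1000, 2000)]
```

The three values in `schedule` are 1e-3, 9e-4 and 8.1e-4 (up to floating-point rounding), so the rate falls smoothly instead of in abrupt jumps.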

The Adam optimizer is used to compile the model, together with a learning rate scheduler that exponentially decays the learning rate over time. This gives smoother convergence by progressively lowering the learning rate throughout the training process. The mean squared error loss function is used to measure the prediction error.

The model was trained for 100 epochs on the training dataset with a batch size of 32, optimizing its weights to reduce the prediction error. The loss is the mean squared error:

$MSE = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$ (3.6)

where $y_i$ are the true values and $\hat{y}_i$ are the predicted values.

We assess the forecasting effectiveness of our model using several statistical criteria: the R-square index, which indicates how well the model explains the dependent variable, together with Root Mean Square Error (RMSE), Mean Absolute Error (MAE), and Mean Absolute Percentage Error (MAPE).

R-square

$R^2 = 1 - \frac{SS_{res}}{SS_{total}}$

$SS_{res}$ is the residual sum of squares, calculated by: $SS_{res} = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$

$SS_{total}$ is the total sum of squares, calculated by: $SS_{total} = \sum_{i=1}^{n} (y_i - \bar{y})^2$, where $\bar{y}$ is the mean value of the dependent variable.

The R-square value has a range of 0 to 1; the closer the value is to 1, the more accurate the model.

Root Mean Square Error (RMSE)

$RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2}$

Mean Absolute Error (MAE)

$MAE = \frac{1}{n} \sum_{i=1}^{n} |y_i - \hat{y}_i|$

Mean Absolute Percentage Error (MAPE)

$MAPE = \frac{100\%}{n} \sum_{i=1}^{n} \left| \frac{y_i - \hat{y}_i}{y_i} \right|$

Low values of RMSE, MAE and MAPE mean that the predicted closing price approximates the real data.
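The four criteria can be computed directly from their standard definitions. The sketch below uses NumPy and a tiny made-up set of prices; the function name `regression_metrics` is chosen for the example and is not taken from the thesis code.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Return R-square, RMSE, MAE and MAPE (in percent)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred
    ss_res = np.sum(err ** 2)                          # residual sum of squares
    ss_total = np.sum((y_true - y_true.mean()) ** 2)   # total sum of squares
    return {
        "r2": 1.0 - ss_res / ss_total,
        "rmse": np.sqrt(np.mean(err ** 2)),
        "mae": np.mean(np.abs(err)),
        "mape": 100.0 * np.mean(np.abs(err / y_true)),
    }

# Hypothetical closing prices and predictions.
metrics = regression_metrics([100, 110, 120, 130], [102, 108, 121, 127])
```

For this toy input the residual sum of squares is 18 and the total sum of squares is 500, so R-square is 0.964 and MAE is exactly 2.0. Note that MAPE is undefined when a true value is zero, which is not a concern for stock prices.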

CHAPTER 4: EXPERIMENT

DATA

We assess our model using stock data from the Vietnam Joint Stock Commercial Bank for Industry and Trade (Vietinbank), sourced from Yahoo Finance, a prominent business news website that ranks as the largest in the United States by monthly traffic. Our analysis covers a trading period of approximately 10 years, yielding around 5,000 data points. Trading days are not continuous, as they are interrupted by weekends and holidays, as illustrated for several stocks in Table 4.1.

Vietinbank stock, with starting price, highest, lowest price and closing price, quantity

Shares of BIDV bank, with starting, highest, lowest and closing prices of trading, quantity of 2000 samples

Vietcombank shares, with starting price, highest, lowest price and closing price, quantity 2000 samples

Techcombank shares, with starting, highest, lowest and closing prices, quantity 2000 samples

Data is collected by crawling it directly from the Yahoo Finance page. The number of samples therefore depends on the data published on Yahoo Finance.

DATA PROCESSING

Data preprocessing is essential for preparing time series data for analysis, as it maintains the temporal order and continuity of the data points. It includes several key steps that ensure the dataset is clean, complete, and standardized, which is vital for accurate and efficient modeling.

The first step is to convert the date information into a proper datetime format. This preserves the temporal dependencies and allows precise operations on a series of dates. Converting the date values to datetime ensures that they can be ordered and indexed appropriately.

Organizing data in chronological order by date maintains the integrity of the event sequence. For a dataset denoted as $D$, the sorting operation can be written as $D_{sorted} = \mathrm{sort}(D, \mathrm{Date})$.

This operation maintains the correct temporal order, which is vital for time series analysis where past values influence future observations

To handle missing dates in a dataset, reindexing is used to cover all dates within a specified range, giving the complete time series needed for precise modeling. This involves defining a new index set, $T = \{t_1, t_2, \ldots, t_n\}$, which includes all potential dates, while the original dataset contains only a subset of these dates, $\{d_1, d_2, \ldots, d_m\}$. Missing values are then populated with forward fill, so each gap takes the last known value.
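The datetime conversion, sorting, reindexing and forward-fill steps described above can be sketched with pandas. The column names `Date` and `Close`, the daily calendar frequency, and the three sample rows are assumptions for the example, not values from the thesis dataset.

```python
import pandas as pd

# Hypothetical raw rows: unsorted, dates stored as strings, one day missing.
raw = pd.DataFrame({
    "Date": ["2024-01-05", "2024-01-02", "2024-01-03"],
    "Close": [31.5, 30.0, 30.8],
})

raw["Date"] = pd.to_datetime(raw["Date"])        # strings -> datetime
df = raw.sort_values("Date").set_index("Date")   # chronological order

# Build the full index T of all calendar days in the range, then forward fill.
full_index = pd.date_range(df.index.min(), df.index.max(), freq="D")
df_full = df.reindex(full_index).ffill()
```

Here 2024-01-04 is absent from the raw data, so after reindexing it receives the previous day's close (30.8). Whether to fill calendar days or only exchange trading days is a design choice that depends on how the model consumes the series.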

Normalization, or standardization, is performed so that numeric columns have a mean of 0 and a standard deviation of 1. This is achieved using the formula:

$z = \frac{x - \mu}{\sigma}$

where $x$ is the original value and $\mu$ and $\sigma$ are the mean and standard deviation of its column. Standardization ensures that all features contribute on a comparable scale, preventing those with larger numerical ranges from overshadowing the others. It is especially important for models trained with gradient descent, which are sensitive to the scale of the input data.
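A minimal sketch of column-wise standardization is shown below. One choice in it is an assumption that the text does not spell out: the mean and standard deviation are computed on the training rows only and then reused for the test rows, which avoids leaking information from the test period into training.

```python
import numpy as np

def standardize(train, test):
    """Scale columns to zero mean and unit variance using training statistics."""
    mu = train.mean(axis=0)
    sigma = train.std(axis=0)
    return (train - mu) / sigma, (test - mu) / sigma

# Hypothetical columns: price (small range) and volume (large range).
train = np.array([[10.0, 1000.0], [12.0, 1500.0], [14.0, 2000.0]])
test = np.array([[16.0, 2500.0]])
train_z, test_z = standardize(train, test)
```

After scaling, price and volume occupy the same numerical range, so neither dominates the gradient updates purely because of its units.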

For stock price forecasting, the target variable is the next day's closing price. Mathematically, this is written as $R_t = P_{t+1}$, where $P_t$ is the closing price at time $t$. Shifting the closing price column by one period aligns each row's target with the following day's closing price.
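In pandas, this shift is a single call. The column names `Close` and `Target` and the four prices are assumptions made for the example.

```python
import pandas as pd

prices = pd.DataFrame({"Close": [30.0, 30.8, 31.5, 31.1]})

# Target for day t is the close of day t + 1.
prices["Target"] = prices["Close"].shift(-1)

# The last row has no next day, so its target is missing and is dropped.
supervised = prices.dropna(subset=["Target"])
```

The resulting `supervised` frame has one row fewer than `prices`, and each remaining row pairs a day's close with the following day's close.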

Splitting Data into Features and Target

The dataset is divided into features and the target variable, with $X$ denoting the features and $y$ the target. The data source provides columns for opening price, high price, low price, closing price, volume, dividends, and stock splits; the features used are:

$X = \{'Open', 'High', 'Low', 'Close', 'Volume'\}$, $y = \{Target\}$

This separation is crucial for training and evaluating predictive models

Our objective is to predict the next day's closing price using data from the previous t days, focusing on five key parameters. The generator layer plays a crucial role in determining the distribution of actual data, enabling us to derive the closing price from generated data based on five day-ahead variables. We split our data into training and testing sets to detect overfitting and to evaluate the model's performance on unseen data. With $N$ the total number of data points, the split index $S$ is calculated as:

$S = \lfloor 0.8 \times N \rfloor$

That is, 80% of the data is allocated for training. The training set consists of the first $S$ data points, while the testing set contains the remaining $N - S$ data points.

This split lets us train the model on the majority of the available data and evaluate it on a held-out portion, which shows how well it generalizes to new, unseen data.
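The windowing and the chronological 80/20 split can be sketched as follows. The window length of 10 days and the synthetic arrays are assumptions for the example (the text leaves t unspecified), and `target` is assumed to be already aligned so that row i holds the next-day close for day i.

```python
import numpy as np

def make_windows(features, target, window):
    """Stack the previous `window` days of features for every prediction."""
    X, y = [], []
    for end in range(window, len(features) + 1):
        X.append(features[end - window:end])   # days end-window .. end-1
        y.append(target[end - 1])              # target aligned with the last day
    return np.array(X), np.array(y)

def chronological_split(X, y, train_fraction=0.8):
    """First part for training, the rest for testing; no shuffling."""
    split = int(train_fraction * len(X))
    return X[:split], X[split:], y[:split], y[split:]

# Synthetic stand-in data: 50 days, 5 features per day.
n_days, n_features, window = 50, 5, 10
features = np.arange(n_days * n_features, dtype=float).reshape(n_days, n_features)
target = np.arange(1, n_days + 1, dtype=float)

X, y = make_windows(features, target, window)
X_train, X_test, y_train, y_test = chronological_split(X, y)
```

With 50 days and a window of 10 there are 41 samples, of which the first 32 go to training and the last 9 to testing. Keeping the order intact means the model is always evaluated on dates later than anything it was trained on.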

TRAIN

We analyzed four major banks in Vietnam (Techcombank, BIDV, Vietinbank, and Vietcombank), known for their substantial transaction volumes and extensive operations across the country. Our study spans from 2009 to 2024 and uses approximately 5,000 samples, including 4,000 training samples and 1,000 test samples. The training data is taken from historical records between 2009 and 2020, while the test set is based on subsequent data.

The experiments were run on a laptop with the macOS Sonoma 14.0 operating system, 8GB of RAM and an Apple M1 chip. Training uses Python with the open-source Keras library developed by Google.

We conduct experiments on the different stocks using several models: Support Vector Regression (SVR), Artificial Neural Networks (ANN), Random Forest, Long Short-Term Memory (LSTM), and Transformer. Each neural model is trained for approximately 100 epochs.

In our experiment on Vietinbank (CTG) stock data, we compared the methods using the same error measures and the same number of features. The results are outlined below.

Table 4.5: Evaluation results using the ANN model

Figure 4.1: Stock price chart using ANN method

Table 4.6: Evaluation results using the SVR model

Figure 4.2: Stock price chart using SVR method

Table 4.7: Evaluation results using the Random Forest model

Figure 4.3: Stock price chart using Random Forest method

Table 4.8: Evaluation results using the LSTM model

Figure 4.4: Stock price chart using LSTM method

Table 4.9: Evaluation results using the Transformer model

The table above shows that the results of the Transformer model with 3 encoder layers are better than those of the other experiments.

Figure 4.5: Stock price chart using the Transformer method – 3 encoder layers

Comparing the above models with the Transformer – 3 encoder layers model, I get the following results:

Table 4.10: Measurement results of the methods

The graphs of the measurements for each model show that the proposed Transformer model achieved valuable and encouraging results, confirming that it performed well during training.

EVALUATION

We calculate RMSE using average values across five distinct data categories, and do the same for MAE and MAPE. We chose ANN, SVR, Random Forest, and LSTM as baseline techniques for comparison with our proposed model, as these are established methods for stock market prediction. The forecast results are presented in the table above, with the best outcome highlighted in red.

Low values of MAE, RMSE, and MAPE indicate that our closing price predictions closely align with the actual data, and the R-square value shows how well the model explains the dependent variable. Our prediction method outperformed the other approaches.

The findings indicate that training speed and prediction accuracy are both improved, demonstrating the network's effectiveness for training, testing, and learning. Compared with the baselines, the Transformer model yields the lowest error metrics, indicating that its predictions are closely aligned with actual stock prices.

BUILD AN APPLICATION THAT VISUALIZES RESULTS

I developed a web interface that allows users to assess the accuracy of stock price predictions generated by the models. The website has three subpages: Home, Illustration, and Result Prediction. It is built with the Flask library on a Python server, using HTML, CSS, and Bootstrap for the design and the Bokeh library for data visualization. Users can select one of four stock codes: CTG (Vietinbank), BID (BIDV), VCB (Vietcombank), and TCB (Techcombank). Based on the model the user chooses, the site returns the price prediction results and shows statistics on the training data and the outcomes of the various models, including ANN, SVR, Random Forest, and LSTM.

Figure 4.10: "Model Evaluation" function - Vietinbank example

Figure 4.11: "Prediction" function - Vietinbank example

CHAPTER 5: CONCLUSION

CONCLUSION

This thesis evaluates five machine learning and deep learning models and finds that the Transformer model with three encoder layers delivers the best results on the dataset.

The analysis shows that the supervised Transformer model outperforms the other methods in predicting stock prices. The results, the data statistics and the user interface are connected in a single application, demonstrating the effectiveness of this deep learning approach. Overall, the thesis contributes to addressing the challenges of stock price prediction using the Transformer model.

The thesis also has several limitations, particularly regarding data quality. Because the data is collected from financial media companies, the datasets are limited, difficult to control, and sometimes contain erroneous stock information. In addition, the time series are not continuous, which reduces the diversity of the data and affects the accuracy of the results.

Processing and programming focus on existing features and on four criteria (RMSE, R-square, MAE, MAPE); other features have not been exploited, and the testing methods have not been comprehensively evaluated.

This research highlights the significant potential of the Transformer model for stock price prediction, paving the way for various development opportunities in the application of deep learning within the financial sector.

FUTURE WORKS

In this thesis, to forecast stock prices, I proposed using the Transformer, a supervised deep learning technique. I ran my experiments on actual datasets and used additional techniques to assess the outcomes. The outcomes demonstrate that the proposed Transformer approach produces the best results.

In order to analyze, assess, and determine which deep learning model is best for the stock price forecasting problem, I will also conduct experiments with various models

Enhancing the forecast model's practical value requires updating and expanding data sources with diverse information. By using web scraping and API techniques to mine data from various online platforms, we can gather rich and continuously updated datasets. This approach not only improves the accuracy of forecasts but also enables the model to adapt more effectively to ongoing market fluctuations.

A promising avenue of research involves using deep learning techniques to extract key features from data, improving model optimization. This enables more accurate learning from distributed datasets, leading to better predictions and smaller errors. Additionally, the rapid development of the internet supports ongoing model updates and real-time data processing, further improving the accuracy and timeliness of stock price forecasts.

[1] Areekul, P., Senjyu, T., Toyama, H., Yona, A., (2010) “A hybrid arima and neural network model for short-term price forecasting in deregulated market” IEEE Transactions on Power Systems Pwrs

[2] Box, G.E.P., Jenkins, G.M., (1976) “Time series analysis: Forecasting and control” Journal of Time (31), 238– 242

[3] Chandar, S.K., Sumathi, M., Sivanandam, S.N., (2016) “Prediction of stock market price using hybrid of wavelet transform and artificial neural network”

Indian Journal of Science & Technology ()9

[4] Denny Britz (2015), “Recurrent Neural Networks Tutorial”

[5] Ding, X., Zhang, Y., Liu, T., Duan, J., (2015) “Deep learning for event-driven stock prediction, in: Proceedings of the Twenty-Fourth International Joint

Conference on Artificial Intelligence), IJCAI 2015, Buenos Aires, Argentina, July 25-31, 2015, pp 2327–2333

[6] Huang, S., Wang, H., (2006) “Combining time-scale feature extractions with svms for stock index forecasting, in: Neural Information Processing”, 13th

International Conference, ICONIP 2006, Hong Kong, China, October 3-6, 2006, Proceedings, Part III, pp 390–399

[8] Kang Zhang, Guoqiang Zhong, Junyu Dong, Shengke Wang, Yong Wang

(2018), “Stock Market Prediction Based on Generative Adversarial Network”

[9] Nevmyvaka, Y., Feng, Y., Kearns, M.J., (2006) “Reinforcement learning for optimized trade execution”, in: Machine Learning, Proceedings of the Twenty-Third International Conference

[10] Pai, P.F., Lin, C.S., (2005) A hybrid arima and support vector machines model in stock price forecasting Omega (33), 497–505

[11] Rather, A.M., Agarwal, A., Sastry, V.N., (2015) Recurrent neural network and a hybrid model for prediction of stock returns Expert Syst (61) Appl 42, 3234–

[12] Saad, E.W., Prokhorov, D.V., II, D.C.W., (1998) Comparative study of stock trend prediction using time delay, recurrent and probabilistic neural networks IEEE Trans Neural Networks (9), 1456– 1470

[13] S Hochreiter (1997), “The Vanishing Gradient Problem During Learning Recurrent Neural Nets and Problem Solutions", World Scientific

[14] S Hochreiter, J Schmidhuber, (1997),”Long short-term memory”

[15] Thân Quang Khoát, “Học Máy (Machine Learning)”, năm 2016

[16] Tsantekidis, A., et al., (2017) “Forecasting stock prices from the limit order book using convolutional neural networks”, in: 19th IEEE Conference on Business Informatics, Thessaloniki, Greece, pp 7–12

[17] Vaswani, A., Brain, G., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A., Kaiser, Ł and Polosukhin, I (2017) Attention Is All You Need

[18] Xingyu Zhou, Zhisong Pan, Guyu Hu, Siqi Tang, Cheng Zhao,(2018), “Stock Market Prediction on High-Frequency Data Using Generative Adversarial Nets”

[19]Https://machinelearningcoban.com/2018/06/22/deeplearning.html

[20]Http://vnba.org.vn/index.php?option=com_k2&view=item&id720:vietinban k-voi-ung-dung-machine-learning-trong-hoat-dong- nganhang&Itemid'2&lang=en&cid7:fin-tech-block-chain

[21]Https://www.apsi.vn/kien-thuc-can-ban-ve-chung-khoan-va-thitruong-chung- khoan.html

[22]Https://dautucophieu.net/tong-quan-ve-ttck-viet-nam-15-namhinh-thanh-va- phat-trien.html

[23]Https://text.123doc.net/document/230610-nhung-van-de-co-bancua-thi-truong- chung-khoan.html

[24]https://kinhtedouong.vn/hoat-dong-huy-dong-von-qua-ttck-trong-nam-2023-co- su-khoi-sac-99337.html

[25]https://vietstock.vn/2024/03/quy-12024-von-hoa-thi-truong-co-phieu-tang-hon- 12-830-1170227.htm

[26]https://cafef.vn/sau-22-nam-phat-trien-quy-mo-von-hoa-ttck-viet-nam-tang- gap-7840-lan-hon-6-trieu-tai-khoan-duoc-mo-moi-20220725144921234.chn

[27]https://vietstock.vn/2023/12/nam-2023-vn-index-tang-hon-12-von-hoa-thi- truong-tang-gan-30-ty-usd-830-1138117.html

[28]https://24hmoney.vn/news/quy-mo-von-hoa-thi-truong-co-phieu-dat-gan-5-6- trieu-ty-dong-c1a2094867.html

SOCIALIST REPUBLIC OF VIETNAM Independence – Freedom - Happiness

EXPLANATORY REPORT ON CHANGES/ADDITIONS BASED ON THE DECISION OF GRADUATION THESIS COMMITTEE

FOR UNDERGRADUATE PROGRAMS WITH DEGREE AWARDED BY

Student’s full name: NGUYEN THU HUYEN

Graduation thesis topic: Applying Machine learning algorithms for Stock

According to VNU-IS’s decision no …… QĐ/TQT, dated … / … / ……., a Graduation Thesis Committee has been established for Bachelor programs at Vietnam National University, Hanoi, overseeing the defense and subsequent modifications of theses in the specified sections.

No Change/Addition Suggestions by the Committee Detailed Changes/ Additions Page

1 Adding legend and revising the overall workflow

Change the direction of the arrow and redraw the updated general model in accordance with the updated program

Re-running model after applying correct split of train- test for time series

The train and test sets have been re-divided quantitatively to enhance the resulting program, with updated results reflecting the time period and data volume This approach is tailored to effectively address the challenges of forecasting data over time.

Date posted: 26/02/2025, 21:50
