INTRODUCTION
Vietnam stock market overview
The Vietnamese stock market has experienced significant growth over the past 20 years, following the launch of the Ho Chi Minh City Stock Exchange in July 2000, which began with the inaugural tickers REE and SAM. Currently, there are 1,605 companies listed and registered across two stock exchanges, with a trading volume of 150 billion shares. As of early 2020, the market capitalization reached nearly 5.7 million billion VND, representing 102.74% of the country's GDP, highlighting the vital role of the Vietnamese stock market in the national economy.
Vietnam's stock market consists of three main exchanges: the Ho Chi Minh City Stock Exchange (HOSE), the Hanoi Stock Exchange (HNX), and the Unlisted Public Company Market (UPCoM). Among these, HOSE is recognized as the largest exchange in terms of scale and trading volume.
As of 2019, HOSE featured 382 listed companies, with a trading volume of 8.8 billion shares and an average trading value exceeding 4,000 billion VND per session. HOSE's market capitalization represented 88% of the total market, equivalent to 54.3% of the country's GDP. To qualify for listing on HOSE, companies must meet stringent criteria, including a minimum charter capital of 120 billion VND at the time of registration, which is significantly higher than the 30 billion VND required by HNX. Additionally, companies must have been operational as joint-stock entities for at least two years prior to listing.
HOSE has stricter listing requirements compared to HNX, with a two-year profitability record needed for listed enterprises, whereas HNX only requires one year. Additionally, HOSE mandates a minimum of 300 non-major shareholders holding at least 20% of the company's voting stock, in contrast to HNX's requirement of 100 shareholders with at least 15%. Furthermore, HOSE enforces higher standards for information disclosure, requiring companies to reveal all debts owed to internal parties, major shareholders, and related individuals.
Vietnam's stock market has seen significant growth, rising from just 3,000 trading accounts in 2000 to 2.5 million accounts today. Notably, there are approximately 33,000 accounts belonging to foreign organizations and individuals, holding nearly 35 billion USD in securities as of June 30, 2020. During this period, many foreign fund management companies also joined Vietnam's stock market. Over the period 2009-2019, investment funds achieved relatively good results compared to the average growth rate of Vietnam's stock market (Figure 1.1).
Figure 1.1: The performance of investment funds in the period of 2009 - 2019
Between 2017 and 2019, Vietnam's stock market encountered significant challenges due to the complexities of the US-China trade war and a recession in major global economies. This turmoil resulted in disappointing investment outcomes for both domestic and foreign investment funds, with their portfolio values declining even more sharply than the overall market downturn.
Figure 1.2: The performance of investment funds in the period of 2017 - 2019
Research by Brinson, Singer, and Beebower (1991) highlights that asset allocation significantly influences portfolio performance, accounting for 91.5% of investment results, while factors like security selection and timing contribute only 9%. In the context of Vietnam's stock market, the concepts of asset allocation and optimal portfolio selection are still emerging and encounter several challenges.
Figure 1.3: Determinants of portfolio performance
The use of quantitative methods for asset allocation and optimal portfolio selection is relatively new in Vietnam's stock market, particularly among individual investors. Most Vietnamese investors rely on fundamental and technical analysis for stock selection, often constructing their investment portfolios based on personal intuition rather than specific quantitative approaches. While a small number of individual investors do employ quantitative models for optimal portfolio choices, these models tend to be traditional and have significant limitations in their applicability.
The unique characteristics of the Vietnam stock market pose challenges for investors, particularly for investment funds aiming to utilize quantitative models for optimal portfolio selection. A significant issue is the availability of data; despite two decades of market development, the initial phase saw a limited number of companies, and the quality of information from that time remains unreliable. This scarcity and inconsistency in data impact both its duration and overall effectiveness for analysis.
[Figure 1.3 legend: Asset Allocation; Security Selection; Market Timing; Other Factors]
Modern portfolio optimization requires extensive and reliable research data, yet regulations in the Vietnamese stock market complicate the creation of optimal portfolios using quantitative models. Daily trading limits on HOSE (±7%), HNX (±10%), and UPCoM (±15%) restrict stock price fluctuations, which can stabilize investor sentiment during market volatility. However, these regulations hinder the accurate reflection of market events in stock prices, making it challenging for investors to predict future portfolio fluctuations.
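To make the effect of the daily limits concrete, the following minimal Python sketch clips a price move to the venue's band and counts how many sessions a one-off shock would need to be fully priced in. The function names, the symmetric treatment of the band around the reference price, and the neglect of tick-size rounding are assumptions made for illustration, not rules taken from the exchanges' regulations.

```python
# Daily price-limit bands quoted in the text, as fractions of the reference price.
PRICE_LIMITS = {"HOSE": 0.07, "HNX": 0.10, "UPCoM": 0.15}


def clamp_to_limit(reference_price, unconstrained_price, venue):
    """Clip the price the market would otherwise reach to the venue's daily band.

    Tick-size rounding is ignored (a simplifying assumption).
    """
    limit = PRICE_LIMITS[venue]
    floor = reference_price * (1.0 - limit)
    ceiling = reference_price * (1.0 + limit)
    return min(max(unconstrained_price, floor), ceiling)


def sessions_to_absorb(shock, venue):
    """Count sessions until a one-off fractional shock is fully reflected in price."""
    price, target, sessions = 1.0, 1.0 + shock, 0
    while abs(price - target) > 1e-12:
        # Each session the price moves as far towards the target as the band allows.
        price = clamp_to_limit(price, target, venue)
        sessions += 1
    return sessions


if __name__ == "__main__":
    for venue in PRICE_LIMITS:
        print(venue, sessions_to_absorb(-0.20, venue))
```

Under these assumptions, a shock that an unconstrained market would price in within one session is spread over several sessions, which is why returns observed on a single day can understate the underlying event.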
The settlement delays in the Vietnam stock market significantly impact portfolio optimization models, as investors face a three-business-day wait (T+3) after purchasing stocks before they can sell them. Following a sale, an additional two-business-day wait (T+2) is required before initiating new buying transactions or receiving interest payments, which can increase risks and transaction costs. These limitations hinder the application of high-frequency trading models in Vietnam, necessitating that investors account for these restrictions when developing and implementing optimal portfolio selection strategies.
Liquidity risk significantly impacts the practical application of portfolio optimization models, as a small market size and limited daily trading volume can lead to high risks for investors wishing to execute large trades. Therefore, when developing and back-testing quantitative models, it is crucial for investors to consider slippage in stock trading activities, particularly for high-volume transactions. Otherwise, the theoretical prices could be significantly different from the actual buying and selling prices, which in turn affects the reliability of the optimal portfolio selection models.
Selecting an appropriate portfolio optimization method is crucial for investors in the Vietnam stock market. This dissertation will highlight effective strategies for choosing optimal investment portfolios, focusing on both the stock market in general and the specific context of Vietnam.
Problem statements
Modern Portfolio Theory (MPT), introduced by Harry Markowitz in 1952, has significantly influenced investment portfolio construction for over 65 years. MPT aims to maximize returns for a given level of risk by optimizing asset weightings. Despite its widespread use in investment practices, recent challenges have emerged regarding MPT's foundational assumptions, particularly concerning the reliance on the mean and covariance matrix of asset returns.
To effectively implement investment techniques, it is crucial for investors to accurately estimate the mean and covariance matrix of asset returns. Typically, sample mean and covariance matrix methods are used; however, these estimators often exhibit instability due to estimation errors, causing portfolio weights to fluctuate over time. Consequently, mean-variance portfolios can be challenging for portfolio managers to apply in practice. Furthermore, numerous empirical studies, including those by Michaud (1989), have demonstrated that these portfolios tend to underperform in terms of mean and variance metrics during out-of-sample periods.
To address the challenges of Modern Portfolio Theory (MPT), two primary strategies can be employed: developing better methods for estimating the expected return, and developing better methods for estimating the covariance matrix of assets in portfolio optimization. For the first strategy, established models, including the Capital Asset Pricing Model (CAPM) and the Fama-French model, have been widely used to estimate expected return parameters.
The Capital Asset Pricing Model (CAPM) is a one-factor model that establishes a relationship between systematic risk and expected asset returns. However, the Fama-French model suggests that expected returns should also consider additional factors beyond the CAPM's beta coefficient, including size risk, value risk, profitability, and investment factors. To enhance expected return estimations, researchers and portfolio managers utilize robust estimators such as truncated/trimmed means and winsorized means, as demonstrated by Martin, Clark, and Green (2010). Furthermore, advanced robust estimation techniques like M-estimators, S-estimators, and Bayes-Stein estimators have been developed to address the limitations posed by non-stationary returns in expected return calculations.
Improving the estimation of expected asset returns addresses some limitations of Modern Portfolio Theory (MPT). However, Merton's research (1980) highlights the challenges in accurately measuring expected returns. Most asset pricing models assume a constant relationship between an asset's expected return and the market's expected return, which simplifies estimation but requires extensive time series data for accuracy. Although this assumption of constant expected returns is recognized as unrealistic, relaxing it complicates the estimation process further. This leads to a new research focus on selecting portfolios based on covariance matrix estimation rather than expected returns. Recent studies emphasize the importance of accurately estimating covariance matrix parameters, as this approach can enhance stability and reduce risk in investment portfolio selection.
The instability of mean-variance portfolios arises from the challenges in estimating mean asset returns. Consequently, minimum-variance portfolios have gained popularity among researchers and portfolio managers. This approach focuses primarily on estimating the covariance matrix, which reduces sensitivity to fluctuations in return estimates.
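As an illustration of why the minimum-variance portfolio needs only the covariance matrix, the sketch below computes the closed-form global minimum-variance weights with NumPy. It assumes the only constraint is that the weights sum to one (so short positions are allowed), which may differ from the constraints used in this dissertation; the function name and the example numbers are hypothetical.

```python
import numpy as np


def min_variance_weights(cov):
    """Closed-form global minimum-variance weights: w = S^-1 1 / (1' S^-1 1).

    Assumes the only constraint is that weights sum to one, so short
    positions are allowed; `cov` must be invertible.
    """
    cov = np.asarray(cov, dtype=float)
    ones = np.ones(cov.shape[0])
    raw = np.linalg.solve(cov, ones)   # S^-1 1 without forming the inverse explicitly
    return raw / raw.sum()


# Two uncorrelated assets with volatilities of 20% and 10% (illustrative numbers).
example_cov = np.array([[0.04, 0.00],
                        [0.00, 0.01]])
example_weights = min_variance_weights(example_cov)
print(example_weights)
```

No expected-return estimate enters the calculation, so any estimation error in the resulting weights comes from the covariance matrix alone.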
Jagannathan and Ma (2003) highlighted significant estimation errors, suggesting that the sample mean's inaccuracies are substantial enough to warrant ignoring it entirely. Their research provides empirical evidence indicating that minimum-variance portfolios tend to outperform mean-variance portfolios in terms of the Sharpe ratio and other performance metrics during out-of-sample periods (DeMiguel, 2005; Jagannathan and Ma, 2003).
According to DeMiguel (2009), while the minimum-variance portfolio is not influenced by mean return estimations, it remains significantly affected by estimation errors in the covariance matrix. Ignoring the mean therefore does not remove the portfolio's sensitivity to estimation inaccuracies.
Portfolios constructed using the sample covariance matrix rely on the maximum likelihood estimator (MLE) for normally distributed returns, which is theoretically the most efficient due to its smallest asymptotic variance when the data aligns with the assumed distribution. However, concerns arise regarding the effectiveness of the sample covariance matrix in generating suitable portfolios. Huber (2004) highlighted that the efficiency of MLEs is highly sensitive to deviations from normality in asset-return distributions, indicating that MLEs may not be optimal for data that diverges even slightly from a normal distribution. This is particularly relevant for portfolio selection, as substantial evidence shows that empirical return distributions typically differ from normal distributions.
The effectiveness of minimum-variance portfolio research hinges on the reliable estimation of the covariance matrix. Traditional methods, such as the sample covariance matrix (SCM) and ordinary least squares (OLS), encounter significant challenges in high-dimensional portfolios. The increased dimensionality heightens the risk of unexpected errors during computations, and insufficient sample data can hinder accurate estimation of the true covariance matrix.
The covariance matrix often becomes ill-conditioned or singular, leading to poor portfolio performance and profit generation. To address this issue, researchers and portfolio managers have developed new covariance matrix estimators. Notably, Ledoit and Wolf (2003) introduced a shrinkage estimator that combines a rough sample covariance matrix with a structured target matrix, allowing for a customizable balance between bias and variance. This shrinkage technique offers a theoretically and empirically sound solution to high-dimensional portfolio covariance estimation, ensuring a well-defined covariance matrix. Liu (2014) advanced this concept by utilizing a weighted average of multiple shrinkage target matrices. Building on their earlier work, Ledoit and Wolf (2017a, 2017b) applied a nonlinear transformation to eigenvalues derived solely from sample data, maximizing out-of-sample expected utility. Their numerical and empirical investigations revealed significant improvements over simple diversification, demonstrating robustness against deviations from normality. Additionally, DeMiguel et al. (2013) reviewed shrinkage frameworks for asset optimization and introduced new shrinkage-based techniques for return means and covariance matrices. Candelon et al. (2012) further enhanced this research by proposing a double shrinkage adaptation that improves the stability of covariance matrix estimation even for small sample sizes, using a ridge regression approach to shrink all the weights towards the equally weighted portfolio.
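The idea of linear shrinkage can be sketched in a few lines of NumPy: the estimate is a weighted average of the sample covariance matrix and a structured target. In this sketch the target is a scaled identity matrix and the shrinkage intensity `delta` is supplied by hand; both choices, like the function name and the simulated data, are assumptions for illustration. The estimators cited above derive the intensity from the data, which is not reproduced here.

```python
import numpy as np


def shrink_towards_identity(returns, delta):
    """Blend the sample covariance with a scaled identity target.

    returns: (T, N) array of asset returns; delta: shrinkage intensity in [0, 1].
    delta is chosen by hand here, which is an assumption of this sketch.
    """
    sample = np.cov(returns, rowvar=False)
    n_assets = sample.shape[0]
    # Target keeps the average variance on the diagonal and zero covariances.
    target = np.eye(n_assets) * np.trace(sample) / n_assets
    return delta * target + (1.0 - delta) * sample


rng = np.random.default_rng(0)
few_obs = rng.normal(0.0, 0.02, size=(5, 10))   # 5 observations, 10 assets: T < N
sample_cov = np.cov(few_obs, rowvar=False)
shrunk_cov = shrink_towards_identity(few_obs, delta=0.3)

print(np.linalg.matrix_rank(sample_cov))         # rank deficient when T < N
print(np.linalg.eigvalsh(shrunk_cov).min() > 0)  # shrunk matrix is invertible
```

With fewer observations than assets the sample covariance matrix is singular and cannot be inverted for optimization, whereas any positive shrinkage intensity towards this target yields a positive-definite matrix.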
The choice of covariance matrix estimators significantly impacts the performance of optimized portfolios. Investors can enhance their portfolio outcomes by transitioning from traditional sample covariance estimators to newer alternatives. However, the lack of comprehensive research on the out-of-sample performance of these new estimators creates uncertainty. Consequently, portfolio managers may hesitate to invest based on unverified methods, as the absence of a solid foundation in this area poses a risk to their capital.
The traditional covariance matrix estimator struggles to deliver expected results due to the rapid increase in investment assets in the financial market, often outpacing the observed sample size. This scenario necessitates the exploration and application of new covariance matrix estimators. Additionally, there remains significant debate regarding the applicability and effectiveness of various covariance matrix estimation methods across different markets.
Robust estimators of the covariance matrix have primarily been studied in developed markets, with limited research in emerging and developing financial markets, particularly in Vietnam. There is a notable absence of studies focusing on the selection of covariance matrix estimators for portfolio optimization, especially regarding shrinkage methods. This presents a significant opportunity for investigation into how these estimators impact minimum-variance optimized portfolios and their performance within the Vietnamese stock market.
Objectives and research questions
This dissertation aims to explore whether investors can enhance the performance of minimum-variance optimized portfolios by adjusting the covariance matrix estimators. Additionally, it will identify the most suitable covariance matrix estimators for portfolio optimization in the Vietnam stock market, based on out-of-sample portfolio performance metrics.
In order to achieve the above objectives, this dissertation will attempt to answer the research questions as follows:
Question 1: How do the robust estimators of the covariance matrix perform on out-of-sample performance metrics such as portfolio return, level of risk, portfolio turnover, maximum drawdown, winning rate, and Jensen’s Alpha when selecting minimum-variance optimized portfolios?
Question 2: How do the estimators of the covariance matrix affect the out-of-sample performance of minimum-variance optimized portfolios when the number of assets in the portfolio changes?
Question 3: Can alternative covariance matrix estimators for portfolio optimization beat the traditional covariance matrix estimator and stock market benchmarks out of sample?
Research Methodology
To effectively meet the research objectives and address the outlined questions, the author must select a suitable research method. This dissertation utilizes various research methodologies to achieve these aims.
This study investigates the impact of different covariance matrix estimators on optimal portfolio selection, utilizing six specific methods: the sample covariance matrix (SCM), the single index model (SIM), the constant correlation model (CCM), the shrinkage towards single index model (SSIM), the shrinkage towards constant correlation model (SCCM), and the shrinkage towards identity matrix (STIM). Among these, the SCM is the standard covariance matrix estimator, SIM and CCM represent model-based approaches, and SSIM, SCCM, and STIM are categorized as shrinkage techniques. Minimum-variance optimization is then employed to create optimal portfolios using the covariance matrices estimated by each of these methods.
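For readers unfamiliar with the two model-based estimators, the following NumPy sketch shows one common way to build them. It is a simplified illustration rather than the implementation used in this dissertation: the function names are hypothetical, the market proxy in the usage lines is an equally weighted average of the simulated assets, and residual variances use T - 1 degrees of freedom instead of the T - 2 an OLS regression would use.

```python
import numpy as np


def constant_correlation_cov(returns):
    """Keep sample variances, replace every pairwise correlation by their average."""
    sample = np.cov(returns, rowvar=False)
    std = np.sqrt(np.diag(sample))
    corr = sample / np.outer(std, std)
    n = corr.shape[0]
    avg_corr = (corr.sum() - n) / (n * (n - 1))   # mean of the off-diagonal entries
    model_corr = np.full((n, n), avg_corr)
    np.fill_diagonal(model_corr, 1.0)
    return model_corr * np.outer(std, std)


def single_index_cov(returns, market):
    """Single-index structure: market_var * beta beta' + diag(residual variances)."""
    dev_r = returns - returns.mean(axis=0)
    dev_m = market - market.mean()
    dof = len(market) - 1                          # simplification: T - 1 throughout
    market_var = dev_m @ dev_m / dof
    betas = dev_r.T @ dev_m / dof / market_var     # beta_i = cov(r_i, r_m) / var(r_m)
    residuals = dev_r - np.outer(dev_m, betas)
    resid_var = (residuals ** 2).sum(axis=0) / dof
    return market_var * np.outer(betas, betas) + np.diag(resid_var)


rng = np.random.default_rng(1)
sim_returns = rng.normal(0.001, 0.03, size=(60, 4))   # simulated weekly returns
market_proxy = sim_returns.mean(axis=1)               # assumed equal-weighted proxy
ccm = constant_correlation_cov(sim_returns)
sim = single_index_cov(sim_returns, market_proxy)
```

Both estimators leave the individual variances of the sample covariance matrix unchanged and impose structure only on the covariances, which is what makes them natural targets for the shrinkage variants.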
To assess the feasibility and potential applications of the covariance matrix estimators discussed, a back-testing process was developed using Python. This process builds on the back-testing platform established in previous research by Tran et al. (2020). Through this back-testing procedure, the statistical properties of the covariance matrix estimators will be analyzed, offering insights into which estimator may yield profitable results in real-world scenarios.
The back-testing process estimates key portfolio performance metrics essential for evaluating portfolios, including basic criteria such as portfolio return and volatility, alongside additional metrics like portfolio turnover, maximum drawdown, winning rate, and Jensen’s Alpha. This research employs a "rolling-horizon" technique, which is a reactive scheduling method that iteratively updates the optimization horizon as new information becomes available, allowing for optimal portfolio selection based on current data. The input data for this back-testing process consists of weekly stock price series, which are used to calculate weekly returns during the optimization procedure. In addition, when calculating the portfolio performance metrics, transaction costs are considered at every rebalancing point.
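The rolling-horizon procedure can be sketched as a simple loop: at each week, only the most recent window of returns is used to estimate the covariance matrix and compute weights, and those weights are then applied to the following week's returns. The sketch below is not the back-testing platform referenced above; the function names, the weekly rebalancing frequency, the window length, the unconstrained minimum-variance solution, and the simulated data are all assumptions for illustration, and transaction costs are left out.

```python
import numpy as np


def min_variance_weights(cov):
    """Unconstrained minimum-variance weights (weights sum to one, shorts allowed)."""
    ones = np.ones(cov.shape[0])
    raw = np.linalg.solve(cov, ones)
    return raw / raw.sum()


def rolling_backtest(returns, window, estimator):
    """Out-of-sample weekly portfolio returns from a rolling estimation window.

    returns:   (T, N) array of weekly simple returns
    window:    number of past weeks used for each estimate
    estimator: function mapping a (window, N) array to an (N, N) covariance matrix
    """
    realized = []
    for t in range(window, returns.shape[0]):
        history = returns[t - window:t]            # data available before week t only
        weights = min_variance_weights(estimator(history))
        realized.append(float(weights @ returns[t]))
    return np.array(realized)


rng = np.random.default_rng(2)
weekly_returns = rng.normal(0.002, 0.03, size=(60, 4))   # simulated data
oos_returns = rolling_backtest(
    weekly_returns,
    window=40,
    estimator=lambda history: np.cov(history, rowvar=False),
)
print(oos_returns.shape)
```

Because the estimator is passed in as a function, the same loop can compare the sample covariance matrix with any alternative estimator on identical data.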
Finally, the estimated performance metrics are used to assess the differences among covariance matrix estimators for optimal portfolio selection. To test whether the performance metrics of two specific estimators differ significantly, p-values are calculated using the bootstrapping methodology outlined in DeMiguel (2009).
Expected contributions
After answering the research questions and achieving the research objectives, this dissertation is expected to make the following contributions:
Empirical research on the Vietnamese stock market demonstrates that investors can enhance their portfolio performance by utilizing estimation methods to adjust the covariance matrix parameter in portfolio optimization. The findings indicate that model-based estimators of the covariance matrix, such as SIM and CCM, along with shrinkage estimators like SSIM, SCCM, and STIM, significantly outperform the traditional sample covariance matrix (SCM) across nearly all tested portfolios (N = 50, 100).
This superiority holds across the portfolio performance metrics analyzed, including portfolio return, risk level, turnover, maximum drawdown, winning rate, and Jensen’s Alpha, and becomes especially pronounced as the number of stocks in the portfolio increases.
Shrinkage estimators of the covariance matrix significantly outperform other estimators and market benchmarks across various portfolio evaluation criteria, particularly for high-dimensional portfolios. Additionally, the shrinkage towards constant correlation model (SCCM) proves the most effective for optimal portfolio selection, compared to the shrinkage towards single index model (SSIM) and the shrinkage towards identity matrix (STIM).
The dissertation introduces a novel approach by examining how the performance of covariance matrix estimators is affected by portfolio dimensionality and how transaction costs affect out-of-sample portfolio performance. It specifically assesses the effectiveness of various estimation methods as the number of stocks in the portfolio ranges from N = 50 to N = 350, with transaction costs of 0.3% factored in at each rebalancing point.
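One common way to account for transaction costs at a rebalancing point is to charge a proportional fee on the portfolio turnover, that is, on the total weight that must be traded to move from the drifted holdings back to the target weights. The sketch below follows that convention with the 0.3% rate quoted above; whether the dissertation applies the cost in exactly this way is an assumption, and the function names are hypothetical.

```python
COST_RATE = 0.003   # 0.3% per unit of traded weight, the rate quoted in the text


def drifted_weights(weights, asset_returns):
    """Weights after one period of price moves, before any rebalancing."""
    grown = [w * (1.0 + r) for w, r in zip(weights, asset_returns)]
    total = sum(grown)
    return [g / total for g in grown]


def turnover(target_weights, current_weights):
    """Total absolute weight traded to move from current to target weights."""
    return sum(abs(t - c) for t, c in zip(target_weights, current_weights))


def net_return(gross_return, turnover_value, cost_rate=COST_RATE):
    """Portfolio return after a proportional cost on the traded weight."""
    return gross_return - cost_rate * turnover_value


# Illustrative numbers: an equal-weight portfolio whose two assets move +10% and -10%.
held = drifted_weights([0.5, 0.5], [0.10, -0.10])
traded = turnover([0.5, 0.5], held)
print(held, traded, net_return(0.0, traded))
```

Estimators that produce unstable weights generate higher turnover, so under this convention their net returns are penalized more heavily at every rebalancing point.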
The dissertation enhances the evaluation of portfolio effectiveness by incorporating a diverse range of performance metrics, including the Sharpe ratio, portfolio turnover, maximum drawdown, winning rate, and Jensen’s Alpha. This multi-dimensional approach allows for a more comprehensive assessment of estimation methods in selecting optimal portfolios, moving beyond traditional metrics like return and variance used in prior research. This effort underscores the author's commitment to analyzing the effectiveness of covariance matrix estimation methods in a more nuanced manner.
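Three of these metrics can be computed directly from a series of periodic portfolio returns, as the following minimal sketch shows. The definitions used here are common textbook versions and are assumptions: in particular, the winning rate is taken as the share of periods with a positive return, and the Sharpe ratio is left unannualized. The dissertation's own definitions may differ.

```python
import math


def max_drawdown(returns):
    """Largest peak-to-trough loss of cumulative wealth, as a positive fraction."""
    wealth, peak, worst = 1.0, 1.0, 0.0
    for r in returns:
        wealth *= 1.0 + r
        peak = max(peak, wealth)
        worst = max(worst, (peak - wealth) / peak)
    return worst


def winning_rate(returns):
    """Share of periods with a strictly positive return (assumed definition)."""
    return sum(r > 0 for r in returns) / len(returns)


def sharpe_ratio(returns, risk_free=0.0):
    """Mean excess return divided by its sample standard deviation (not annualized)."""
    excess = [r - risk_free for r in returns]
    mean = sum(excess) / len(excess)
    variance = sum((e - mean) ** 2 for e in excess) / (len(excess) - 1)
    return mean / math.sqrt(variance)


# Illustrative return series, not taken from the dissertation's data.
sample_returns = [0.10, -0.20, 0.05, -0.10, 0.25]
print(max_drawdown(sample_returns), winning_rate(sample_returns), sharpe_ratio(sample_returns))
```

Reporting these alongside return and variance captures path-dependent risk (drawdown) and consistency (winning rate) that the first two moments alone do not show.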
This dissertation's key contribution is the experimentation of various covariance matrix estimators on the Vietnam stock market, an emerging market. While many researchers and financial practitioners have utilized these estimation methods to optimize investment portfolios in developed markets like the US and Europe, there is a notable scarcity of studies focusing on emerging markets, particularly Vietnam. The empirical findings of this research will provide valuable insights for researchers and investors, highlighting the performance differences of covariance matrix estimators between emerging and developed financial markets.
Disposition of the dissertation
This dissertation consists of six chapters, with Chapter 1 outlining the problem statements, study objectives, research questions, methodology, and anticipated contributions. The subsequent chapters are structured to build upon these foundational elements.
Chapter 2, Literature Review, provides an overall review of relevant research on portfolio optimization before a specific theoretical framework and methodology are developed in the following chapters.
Chapter 3, Theoretical Framework, establishes the foundational theory for this dissertation by introducing essential preliminaries and the portfolio optimization problem, followed by a detailed discussion on covariance matrix estimations.
Chapter 4, Methodology, describes the methodology used to answer the research questions in the dissertation.
Chapter 5 presents the empirical results derived from the dissertation, focusing on the back-testing performance of covariance matrix estimators in out-of-sample scenarios. Chapter 6 concludes by summarizing the research findings and outlining potential directions for future studies.
LITERATURE REVIEW
Modern Portfolio Theory Framework
Harry Markowitz is a key figure in financial economics, renowned for his contributions to portfolio selection theory, which earned him the Nobel Prize in 1990. His seminal article, "Portfolio Selection," published in 1952 in "The Journal of Finance," laid the groundwork for modern portfolio theory (MPT). This foundational work was later expanded into the book "Portfolio Selection: Efficient Diversification of Investments" in 1959, significantly influencing investment strategies and financial analysis.
The Modern Portfolio Theory (MPT) is an investment strategy aimed at maximizing expected returns for a specified level of risk or minimizing risk for a desired return by strategically allocating asset weights. While widely utilized in the investment industry, the foundational assumptions of MPT have faced increasing scrutiny and debate in recent years.
The Modern Portfolio Theory (MPT) enhances classical quantitative models and is crucial in financial mathematical modeling. It emphasizes diversification to safeguard investment portfolios against market and specific company risks. Often referred to as Portfolio Management Theory, it aids investors in classifying, evaluating, and measuring expected risks and returns. The core concept of this theory lies in quantifying the relationship between risk and return, positing that investors deserve compensation for assuming risks.
The concept of diversification in Modern Portfolio Theory (MPT) involves selecting investment portfolios that exhibit lower risk compared to individual securities within those portfolios. This strategy effectively reduces investment risk, regardless of whether the correlation between security returns is positive or negative.
According to Modern Portfolio Theory (MPT), a security's return is viewed as a normally distributed random variable, with risk defined as the standard deviation of that return. The overall portfolio return is calculated as a weighted combination of individual securities' returns, and the total variance can be reduced through diversification if the returns among the securities are not perfectly positively correlated. Additionally, MPT operates under the assumptions that investors act rationally and that markets are efficient.
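The diversification effect described here follows from the standard formulas for portfolio return (the weighted sum of asset returns) and portfolio variance (the quadratic form of the weights with the covariance matrix). The short NumPy sketch below evaluates both for two assets with identical volatility; the function names and input numbers are illustrative assumptions.

```python
import numpy as np


def portfolio_return(weights, expected_returns):
    """Weighted combination of the individual expected returns: w' mu."""
    return float(np.dot(weights, expected_returns))


def portfolio_volatility(weights, cov):
    """Standard deviation of the portfolio return: sqrt(w' S w)."""
    weights = np.asarray(weights, dtype=float)
    return float(np.sqrt(weights @ np.asarray(cov, dtype=float) @ weights))


vol, rho = 0.20, 0.5          # illustrative inputs, not taken from the dissertation
cov = np.array([[vol ** 2, rho * vol ** 2],
                [rho * vol ** 2, vol ** 2]])
equal_weight_vol = portfolio_volatility([0.5, 0.5], cov)
print(equal_weight_vol, vol)
```

Even with a positive correlation of 0.5, the equally weighted portfolio is less volatile than either asset on its own; the benefit disappears only when the correlation equals one.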
Investing involves balancing the expected return against associated risks, as higher returns generally come with increased risk (Taleb, 2007). The Modern Portfolio Theory (MPT) provides a framework for selecting a portfolio that maximizes expected returns for a specific risk level, while also offering insights on achieving the lowest possible risk for a desired expected return.
2.1.1 Concept of risk and return
Return serves as a fundamental motivation and a key reward for any investment project, encompassing both realized returns (actual returns received) and expected returns (anticipated returns over future periods). Expected returns represent forecasted outcomes that may or may not occur, while realized returns from the past enable investors to calculate cash inflows such as dividends, interest, incentives, and capital gains. Investors can assess total profit or loss over a specified investment period, expressed as a percentage of the initial investment, which is known as cumulative return. In the context of securities investment, returns consist of dividends and capital gains or losses realized upon the sale of the securities.
In investment activities, risk is the unpredictability of future returns, representing the likelihood that actual profits will differ from expectations. It reflects the probability that the financial outcomes of investments may not align with anticipated results. Generally, investments exhibiting higher return volatility are considered riskier compared to those with lower volatility.
Risk and uncertainty are distinct concepts that must be understood clearly. Risk refers to situations where the likelihood of an event occurring or not can be quantified and measured, allowing for probabilities to be assigned based on available data. In contrast, uncertainty arises when such probabilities cannot be measured, often due to a lack of available facts and figures. Understanding this difference is crucial for effective decision-making in various contexts.
Investors cannot predict investment outcomes with certainty, but they can use statistical estimation methods to assess the risk associated with expected returns. By measuring the difference between expected and actual returns, tools like standard deviation and variance become essential for evaluating investment risk.
2.1.2 Assumptions of the modern portfolio theory
Modern portfolio theory, formulated by Markowitz, is grounded in several key assumptions about investors and markets. It posits that all investors are identical in their risk aversion and rationality, leading to a unified approach to optimal portfolio selection.
Investors aim to optimize their portfolios by minimizing risk while maximizing expected returns. They base their portfolio selections primarily on the anticipated returns and associated risks, with risk quantified as the variance of returns. Additionally, it is assumed that asset returns remain stationary over time, and investors are fully aware of the prices of all assets under consideration.
Investors can instantly and costlessly update their portfolios in response to changes in asset prices, which are considered exogenous and unaffected by individual choices. All assets are infinitely liquid, allowing for trades of any size, and investors have the option to take short positions. They can also borrow and lend at a uniform interest rate without risk, and incur no transaction costs such as taxes or fees. Additionally, investors allocate their entire budget to their portfolio, with no savings set aside.
The Modern Portfolio Theory (MPT) relies on several premises, both explicit and tacit, such as the assumption that returns are normally distributed, tax indifference, and the neglect of transaction fees. However, none of these assumptions hold completely true, which ultimately compromises the effectiveness of the MPT. A fundamental tenet of MPT is the belief in market efficiency.
According to Fabozzi et al. (2002), the primary function of Modern Portfolio Theory (MPT) is asset allocation. Investors begin by assessing potential investment assets and any associated restrictions. Next, they estimate the returns, correlations, and volatility of the investable securities. These predictions are then utilized in an optimization process to achieve a final outcome that aligns with individual expectations.
Figure 2.1: MPT investment process. Source: Fabozzi, Gupta, and Markowitz (2002)
Parameter estimation
Mean-variance optimization (MVO) is often criticized for its sensitivity to estimation error, since Markowitz's 1952 study prioritized theoretical soundness over practical application. To effectively implement MVO, accurate estimation of asset return means and covariances is essential, as these values are not known in advance. The estimated data is then used to address investors' optimization problems (Elton et al., 2012). Numerous studies indicate that this reliance on estimates can lead to significant drawbacks in the mean-variance approach, particularly due to potential estimation errors when inappropriate moments are used for input (Michaud, 1989; Chopra and Ziemba, 1993). Consequently, the optimizer often overlooks that the data inputs are merely statistical estimates, lacking certainty, which introduces flaws into the optimization process.
The traditional method for estimating asset returns and covariance relies on historical ex-post returns to derive sample estimates, based on the assumption that past data can inform future asset price trends. However, research has highlighted significant issues with this approach. DeMiguel (2009) found that using sample estimates does not guarantee the creation of mean-variance optimized portfolios that outperform equally-weighted alternatives. Similarly, Jobson and Korkie (1980) presented comparable findings, while Best and Grauer (1991) noted that estimation errors can distort the weights of optimized portfolios, leading to discrepancies between estimated and actual optimal weights.
Chopra and Ziemba (1993) posited that errors in expected returns significantly affect the out-of-sample performance of optimal portfolios more than errors in the covariance matrix. They attributed the historical focus on expected return vectors over covariance matrices to this phenomenon. However, Michaud (2012) challenged their conclusions, highlighting widespread estimation errors in research that undervalued the covariance matrix's role in returns. Michaud argued that Chopra and Ziemba's findings were based on a limited in-sample study, which inadequately addressed the implications of estimation errors in out-of-sample mean-variance optimization. Their research ultimately revealed that estimation errors in the covariance matrix could dominate the portfolio optimization process, particularly as the number of assets increases.
Estimation errors play a crucial role in mean-variance optimization (MVO), making it essential to address them for more effective portfolio optimization during out-of-sample periods. Reducing these errors is vital, as it can greatly enhance the performance of asset managers utilizing MVO strategies.
Scholars have highlighted the crucial role of forecasting the inputs of Mean-Variance Optimization (MVO), leading to a significant body of literature on forecasting asset returns and the covariance matrix. Consequently, there has been a growing demand for comprehensive research that compares the results of various forecasting methods.
This article provides an overview of various solutions for forecasting expected returns and includes a literature review focused on methods for reducing estimation errors in the covariance matrix.
Foreign research on expected return estimation
The Capital Asset Pricing Model (CAPM), introduced by William Sharpe in 1964, is a one-factor model rooted in Modern Portfolio Theory (MPT) that predicts asset returns from systematic risk. However, in 1992, Eugene Fama and Kenneth French found that the beta coefficient in CAPM inadequately explained the expected returns of American securities from 1963 to 1990. They identified two stock categories (small caps and stocks with a high book-to-market equity ratio) that consistently outperformed the market. This led to the development of a three-factor model in 1993, incorporating size and value factors, which was validated through empirical tests across developed and emerging markets. In 2014, Fama and French expanded the model to a five-factor framework by adding profitability and investment factors, enhancing its explanatory power for expected returns.
Researchers have used robust estimators to improve the accuracy of expected return estimates. Instead of relying solely on the sample mean, they recommend methods such as the trimmed mean or the winsorized mean, which adjust for extreme values. The trimmed mean excludes the k% most extreme values, while the winsorized mean replaces them with the next k% most extreme values. However, these robust estimators, including the M-estimator and S-estimator, may not yield accurate expected return estimates if asset returns are not stationary.
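The two robust estimators described above can be illustrated with a minimal sketch. It assumes that the fraction k is applied to each tail separately (the text does not specify this), and the function names and the return series are hypothetical, chosen only for illustration.

```python
def trimmed_mean(values, k):
    """Mean after dropping the fraction k of observations from each tail (assumption)."""
    data = sorted(values)
    cut = int(len(data) * k)              # number of points removed per tail
    if 2 * cut >= len(data):
        raise ValueError("k is too large for this sample")
    kept = data[cut:len(data) - cut]
    return sum(kept) / len(kept)


def winsorized_mean(values, k):
    """Mean after replacing each tail's fraction k with the nearest retained value."""
    data = sorted(values)
    n = len(data)
    cut = int(n * k)
    if 2 * cut >= n:
        raise ValueError("k is too large for this sample")
    if cut > 0:
        data[:cut] = [data[cut]] * cut              # pull up the lowest values
        data[n - cut:] = [data[n - cut - 1]] * cut  # pull down the highest values
    return sum(data) / n


# Hypothetical return series with one extreme loss and one extreme gain
returns = [-0.30, -0.02, -0.01, 0.00, 0.01, 0.01, 0.02, 0.02, 0.03, 0.40]
sample_mean = sum(returns) / len(returns)
print(sample_mean, trimmed_mean(returns, 0.10), winsorized_mean(returns, 0.10))
```

In this example both robust estimates are smaller than the sample mean, because the single large gain no longer dominates the average.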
To improve performance when returns are non-stationary and to automate the selection of the tuning parameter, researchers increasingly use shrinkage estimators. This approach recognizes that both the inherent uncertainty of returns (asset risk) and estimation risk reduce investor utility, as highlighted by Jorion (1985). The objective of the optimization problem is therefore to minimize the utility loss that arises when portfolios are selected on the basis of sample estimates rather than true values. Rather than estimating each asset's expected return in isolation, it is more effective to choose an estimator that reduces the utility loss stemming from overall parameter uncertainty. Jorion (1986) proposes a Bayes-Stein estimator, which shrinks each asset's sample mean towards the overall grand mean.
In simulations, this estimator reduces portfolio risk and outperforms portfolios constructed by
Improving the estimation of expected asset returns is a recognized way to address the flaws in Modern Portfolio Theory (MPT). However, Merton's research (1980) highlights how difficult it is to estimate these expected returns accurately. Most asset pricing models assume a stable relationship between expected asset returns and the market over time. While this assumption simplifies the estimation process, it still requires long time series for accurate calculations.
The assumption of a constant expected return, as noted by Merton (1980), is often unrealistic, and relaxing it complicates the estimation process. This leads to a new research avenue focused on selecting portfolios based on covariance matrix estimation rather than solely on expected return estimation.
Local research on expected return estimation
Local studies have focused on optimizing portfolios by estimating expected return parameters. Phuong Nguyen (2012) used a single-index model (SIM) to assess risks and determine expected returns for stocks in the construction industry. Similarly, Linh Ho (2013) applied this model to evaluate risks and expected returns for real estate stocks listed on the Ho Chi Minh Stock Exchange (HOSE). Truong and Duong (2014) employed the CAPM and the Fama-French three-factor model to improve investment portfolios on HOSE, while Tram Le (2014) also used the Fama-French model for risk measurement and expected return estimation in the Vietnamese stock market. Additionally, Nguyen Tho (2010) applied arbitrage pricing theory (APT) to analyze stock price behavior in emerging markets, including Vietnam and Thailand.
Foreign research on covariance matrix estimation
Recent research has focused on improving investment portfolio stability and minimizing risk, highlighting the limitations of traditional sample covariance matrix (SCM) estimation in portfolio optimization. Michaud (1989) identified significant shortcomings in this approach, noting that it can lead to statistical errors and become ill-conditioned when the number of observations is similar to the number of assets, a phenomenon he termed the "Markowitz enigma." Additionally, Frankfurter, Phillips, and Seagle (1971) supported Michaud's findings regarding the inefficacy of SCM estimation for Modern Portfolio Theory (MPT).
The model does not deliver superior results compared with an equally weighted portfolio, which DeMiguel (2009) called the naïve 1/N portfolio.
The single-index model (SIM) proposed by Sharpe (1963) allows researchers to estimate the covariance matrix for optimal portfolio selection more efficiently than the traditional mean-variance portfolio theory (MPT). The SIM offers three key advantages. First, it requires estimating only 2N+1 parameters compared to N(N+1)/2 parameters in MPT, making it less complex for portfolios with four or more assets. Second, it simplifies the addition of a new asset, which only requires estimating that asset's beta rather than recalculating variances and covariances for all assets. Third, it requires fewer observations (T > 2) to estimate beta for each asset, as opposed to the MPT's requirement of T > N. Research by Senneret et al. (2016) demonstrates that the SIM approach yields portfolios that are less sensitive to estimation errors, outperforming the standard method across various risk and return metrics. Nonetheless, both SIM and MPT are susceptible to estimation errors stemming from the sample mean return vector.
Elton and Gruber (1973) introduced the Constant Correlation Model (CCM) to address the limitations of the SIM, positing that all pairs of stocks share a uniform correlation equal to the historical average. This model assumes that the historical correlation matrix carries information only about the average correlation in future periods and disregards variations in pairwise correlations, which is a strong assumption. Elton et al. (2009) highlight that the CCM provides more accurate forecasts of future covariance matrices than the sample covariance matrix.
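The parameter counts quoted above can be checked with a short sketch. It simply evaluates the two expressions from the text, 2N+1 for the SIM and N(N+1)/2 for the full covariance matrix; the function names are hypothetical.

```python
def sim_parameter_count(n_assets):
    """Parameters for the SIM covariance: N betas, N residual variances, 1 market variance."""
    return 2 * n_assets + 1


def full_covariance_parameter_count(n_assets):
    """Distinct variances and covariances in an N x N symmetric covariance matrix."""
    return n_assets * (n_assets + 1) // 2


for n in (3, 4, 10, 100):
    print(n, sim_parameter_count(n), full_covariance_parameter_count(n))
```

The SIM count becomes the smaller of the two from N = 4 onward, and the gap widens quickly as N grows.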
Portfolio Selection
The mean-variance model is a common approach to portfolio selection, allowing investors to determine optimal asset weights from the mean and covariance of asset returns. However, estimating sample means and covariance matrices can introduce significant errors, particularly in the sample means. This has prompted a shift towards the global minimum-variance model, which relies solely on the covariance matrix for asset weighting. Consequently, the literature on portfolio selection has diverged into two main streams: one adhering to the traditional mean-variance model and one embracing the more recent global minimum-variance model.
In the standard mean-variance model introduced by Markowitz in 1952, the return on securities is viewed as a "random variable with Gaussian distribution", so that asset returns are characterized solely by their mean and variance. Markowitz, a pioneer in this field, published his influential work on portfolio selection in 1959. Two decades after the original model, Merton (1972) extended it to allow short sales in portfolio selection. Over the years, these foundational models have been used in numerous studies, as summarized in Table 2.1.
Table 2.1: Summarized works related to portfolio optimization
Author | Year | Paper/book/thesis title
Samuelson | 1969 | “Lifetime portfolio selection by dynamic stochastic programming”
Merton | 1969 | “Lifetime portfolio selection under uncertainty: The continuous-time case”
Pogue | 1970 | “An extension of the Markowitz portfolio selection model to include variable transaction costs, short sales, leverage policies and taxes”
Merton | 1972 | “An analytic derivation of the efficient portfolio frontier”
Fernández and Gómez | 2007 | “Portfolio selection using neural networks”
(Source: Risk and Financial Management, 2019)
The conventional mean-variance model of Markowitz is static: investors decide once, at the start of a fixed investment period. To maximize investor utility at the end of this period, Samuelson (1969) introduced a discrete-time, multi-period consumption-investment model. Merton (1969, 1971) advanced this concept with continuous-time research aimed at maximizing expected utility within a specific planning period. Pogue (1970) was the first to address mean-variance portfolio problems with transaction costs. To align with real market conditions, Xue et al. (2006) developed a mean-variance model that accounts for concave transaction costs. More recently, Liagkouras and Metaxiotis (2018) proposed a multi-stage fuzzy portfolio optimization algorithm that also considers transaction costs.
The mean-variance model has also been extended with cardinality constraints, as demonstrated by Fernández and Gómez (2007), who imposed limits on the number of assets selected and on the amounts invested. Similarly, Soleimani et al. (2009) added minimum trading lot sizes and market capitalization considerations, making the model more practical for investment strategies.
2.3.2 Global Minimum Variance Model (GMV)
The Global Minimum Variance (GMV) portfolio is the portfolio on the efficient frontier with the lowest variance. Haugen and Baker (1991) assessed the efficiency of capitalization-weighted portfolios and concluded that they were not efficient under standard conditions, even in efficient markets where investors rationally optimize the risk-return trade-off. Chopra and Ziemba (1993) found that errors in variance and covariance estimates are far less damaging than errors in expected returns, which favors the GMV model. Chan et al. (1999) highlighted the superior performance of GMV portfolios over the traditional mean-variance model of Markowitz. Jagannathan and Ma (2003) established that GMV-optimized portfolios exhibit more stable asset weights due to smaller estimation errors in the covariance matrix. Furthermore, Kempf and Memmel (2006) supported the notion that GMV portfolios yield better out-of-sample results than tangency portfolios. Lastly, DeMiguel and Nogales (2009) noted that because the GMV portfolio relies only on the covariance matrix, it is less susceptible to estimation errors than conventional models.
Recent research has highlighted the growing popularity of the Global Minimum Variance (GMV) portfolio as an effective tool for portfolio optimization. The density function of the GMV portfolio weights, derived by Okhrin and Schmid (2006) under the assumption of normally distributed returns, improves the understanding of how portfolio weights are distributed. Similarly, Clarke et al. (2006) contributed to this field by emphasizing the significance of stock returns in the context of GMV portfolios.
Under the minimum variance model, the weights of the portfolio at the left-most point of the efficient frontier are independent of expected security returns. Therefore, the equilibrium expected returns, or "active forecast returns", can be dropped, and the optimized portfolio relies solely on the covariance matrix.
Research indicates that optimizing an investment portfolio is more effective when it focuses on estimating the covariance matrix, which is less susceptible to estimation errors than expected returns. The shrinkage model improves the precision of the covariance matrix by balancing the strengths and weaknesses of traditional estimators. Various target matrices, with both linear and non-linear shrinkage, have been explored for their effectiveness across different market conditions. The next step is to apply the estimated covariance matrix in a variance model to derive an optimal portfolio, with the mean-variance and global minimum-variance models being the most commonly used by researchers and portfolio managers. The literature suggests that the global minimum-variance model tends to outperform other approaches.
THEORETICAL FRAMEWORK
Basic preliminaries
This section gives a short overview of the fundamental principles and techniques of minimum-variance optimization.
The return of a stock, denoted R, is the gain or loss on the stock over a specific period. Let $P_t$ be the stock price at time t, and consider the period from t to T, where T > t. For a stock that pays no dividends, the return is the relative price change over this period:
$$R = \frac{P_T - P_t}{P_t}$$
For a stock that pays dividends D during the period [t, T], the return is computed as:
$$R = \frac{P_T + D - P_t}{P_t}$$
At time t, the values of $P_T$ and D are unknown, so the stock return is a random variable. Its expected value is denoted E[R].
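A portfolio can consist of several stocks. In the portfolio allocation, the weight of stock i, $w_i$, is the share of the total portfolio value invested in that stock:
$$w_i = \frac{\text{value invested in stock } i}{\text{total value of the portfolio}}$$
Thus, for an asset universe of size n:
$$\sum_{i=1}^{n} w_i = 1$$
The expected return of such a portfolio, $E[R_p]$, is then calculated by:
$$E[R_p] = E\Big[\sum_{i=1}^{n} w_i R_i\Big] = \sum_{i=1}^{n} w_i E[R_i] = \sum_{i=1}^{n} w_i \mu_i = w^T \mu$$
Here $w = [w_1, \ldots, w_n]^T$ and $\mu = [\mu_1, \ldots, \mu_n]^T$ are vector notations, where $w, \mu \in \mathbb{R}^{n \times 1}$.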
Variance plays a crucial role in minimum variance optimization, serving as the primary statistic for measuring the risk or volatility of asset returns. It is the expectation of the squared deviation from the mean. For a portfolio of n assets with covariance matrix Σ, whose entries are $\sigma_{ij} = \mathrm{Cov}[R_i, R_j]$, the portfolio variance is:
$$\sigma_p^2 = \mathrm{Var}[R_p] = \sum_{i=1}^{n}\sum_{j=1}^{n} w_i w_j \sigma_{ij} = w^T \Sigma w$$
In short, the portfolio volatility is:
$$\sigma_p = \sqrt{w^T \Sigma w}$$
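As a small numerical illustration of the portfolio expected return and volatility defined in this section, the sketch below evaluates $w^T \mu$ and $\sqrt{w^T \Sigma w}$ for a two-asset portfolio. The weights, expected returns and covariance values are hypothetical and serve only to show the arithmetic.

```python
import math


def portfolio_expected_return(weights, mu):
    """w^T mu: weighted sum of the assets' expected returns."""
    return sum(w * m for w, m in zip(weights, mu))


def portfolio_variance(weights, cov):
    """w^T Sigma w: double sum of w_i * w_j * sigma_ij."""
    n = len(weights)
    return sum(weights[i] * cov[i][j] * weights[j]
               for i in range(n) for j in range(n))


# Hypothetical two-asset example
w = [0.5, 0.5]
mu = [0.08, 0.12]
cov = [[0.04, 0.01],
       [0.01, 0.09]]

expected_return = portfolio_expected_return(w, mu)
volatility = math.sqrt(portfolio_variance(w, cov))
print(expected_return, volatility)
```

The portfolio variance here is 0.0375, which is lower than the variance of either asset held alone (0.04 and 0.09); this is the diversification effect that minimum variance optimization exploits.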
Portfolio Optimization
Mean-variance optimization, rooted in Markowitz's traditional portfolio theory, has a significant drawback: it is sensitive to errors in the estimated means and covariance matrix of asset returns (Jorion, 1985). Research indicates that the covariance matrix can generally be estimated more reliably than expected returns. Evidence, including studies by Jorion (1986) and Jagannathan and Ma (2003), suggests that minimum-variance portfolios often outperform other mean-variance portfolios. This dissertation focuses on improving the efficiency of the global minimum variance portfolio (GMVP), which relies solely on the estimated covariance matrix.
In a portfolio of N risky assets with weights $w = (w_1, w_2, \ldots, w_N)'$, the weights must sum to one, written as $w'\mathbf{1} = 1$, where $\mathbf{1}$ is the vector of ones. This ensures that the total investment is fully allocated across the assets. Furthermore, the condition $w_i \ge 0$ indicates that short selling is not permitted, in line with the regulations of the Vietnamese stock market. The expected portfolio return is $\hat{\mu}_p = w'\hat{\mu} = \sum_{i=1}^{N} w_i \hat{\mu}_i$, while the portfolio variance is $\hat{\sigma}_p^2 = w'\hat{\Sigma} w$.
Under Markowitz's MPT, the minimum-variance portfolio optimization problem is a quadratic program:
$$\min_{w} \; w'\Sigma w \quad \text{subject to} \quad w'\mathbf{1} = 1 \qquad (3.2.1)$$
In which: “1 denotes a vector of ones, and Σ is the covariance matrix of N stocks”. When the non-negativity condition on the weights is not binding, problem (3.2.1) has the closed-form solution:
$$w_{GMV} = \frac{\Sigma^{-1}\mathbf{1}}{\mathbf{1}'\Sigma^{-1}\mathbf{1}}$$
To determine the optimal minimum variance portfolio, the estimated covariance matrix is passed to a quadratic optimization program that calculates the asset weights. The solution involves the inverse of the covariance matrix, typically the sample covariance matrix. However, this approach can be problematic, as sample covariance matrices are often ill-conditioned and may not be invertible, especially in high-dimensional portfolios. To address this issue, shrinkage estimators are recommended for the covariance matrix, leading to improved results in the optimization process.
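The role of the inverse covariance matrix can be illustrated with a minimal sketch of the unconstrained global minimum variance weights, $\Sigma^{-1}\mathbf{1} / (\mathbf{1}'\Sigma^{-1}\mathbf{1})$. This is an assumption-laden illustration, not the procedure used in this study: it ignores the no-short-selling condition (which requires a quadratic programming solver when it binds), it solves the linear system by plain Gaussian elimination, and the covariance values and function names are hypothetical.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(map(float, matrix[i])) + [float(rhs[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:   # crude singularity check (assumption)
            raise ValueError("covariance matrix is singular or ill-conditioned")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):       # back substitution
        tail = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x


def gmv_weights(cov):
    """Unconstrained GMV weights: Sigma^{-1} 1, rescaled to sum to one."""
    z = solve_linear_system(cov, [1.0] * len(cov))
    total = sum(z)
    return [v / total for v in z]


# Hypothetical two-asset covariance matrix
cov = [[0.04, 0.01],
       [0.01, 0.09]]
weights = gmv_weights(cov)
print(weights)
```

When the matrix is singular, as can happen when assets outnumber observations, the solver raises an error, which mirrors the invertibility problem described above.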
The estimators of covariance matrix
To measure portfolio risk, investors must assess the risk of each asset and how the assets' returns move together, with covariance serving as the key metric. According to Ledoit and Wolf (2003b), the usual Sample Covariance Matrix (SCM) is an optimal estimator under normality assumptions, making it a "best-unbiased" estimation tool. However, this maximum likelihood estimator relies heavily on the data, which works well with large samples but falls short with small ones.
Noisy data can significantly reduce the effectiveness of the sample covariance matrix (SCM) in financial analysis. Research indicates that while the SCM can perform well in certain scenarios, it often struggles with small sample sizes. To address this limitation, investors might lengthen their estimation window; however, this can bring in redundant data that does not improve forecasts. Additionally, as noted by Bengtsson and Holst (2002), when the number of assets exceeds the number of historical observations, the sample covariance matrix becomes ill-conditioned and non-invertible, posing serious challenges for portfolio optimization.
3.3.1 The sample covariance matrix (SCM)
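First, let $r_{i,t}$ denote the historical return of asset i in period t. The historical average return of asset i over the period [1, T], $\bar{r}_i$, is determined as follows:
$$\bar{r}_i = \frac{1}{T}\sum_{t=1}^{T} r_{i,t} \qquad (3.3.1)$$
Next, the sample covariance between any two assets i and j is:
$$\hat{\sigma}_{ij} = \frac{1}{T-1}\sum_{t=1}^{T} (r_{i,t} - \bar{r}_i)(r_{j,t} - \bar{r}_j) \qquad (3.3.2)$$
From equation (3.3.2), the sample covariance matrix $\hat{\Sigma}$, which shows the relationships among the N assets in the portfolio, is:
$$\hat{\Sigma} = \begin{bmatrix} \hat{\sigma}_{11} & \hat{\sigma}_{12} & \cdots & \hat{\sigma}_{1N} \\ \hat{\sigma}_{21} & \hat{\sigma}_{22} & \cdots & \hat{\sigma}_{2N} \\ \vdots & \vdots & \ddots & \vdots \\ \hat{\sigma}_{N1} & \hat{\sigma}_{N2} & \cdots & \hat{\sigma}_{NN} \end{bmatrix}$$
Moreover, the covariance matrix Σ can be written as:
$$\Sigma = \mathrm{diag}(\sigma)\, C\, \mathrm{diag}(\sigma)$$
Where: “σ denotes as a column vector of standard deviations; diag(σ) is a matrix with the elements of σ on the main diagonal; C is a correlation matrix”.
Investors can derive various covariance matrices by changing the method used to estimate the correlations of asset returns in the correlation matrix C. The estimate of σ, denoted $\hat{\sigma}$, is the sample standard deviation of historical asset returns. Note that $\hat{\sigma}$ is unaffected by the chosen correlation estimation method.
Therefore, the estimated covariance matrix obtained from historical correlations is:
$$\hat{\Sigma} = \mathrm{diag}(\hat{\sigma})\, \hat{C}\, \mathrm{diag}(\hat{\sigma})$$
The correlation matrix C is:
$$C = \begin{bmatrix} 1 & \rho_{12} & \cdots & \rho_{1N} \\ \rho_{21} & 1 & \cdots & \rho_{2N} \\ \vdots & \vdots & \ddots & \vdots \\ \rho_{N1} & \rho_{N2} & \cdots & 1 \end{bmatrix}$$
Where “$\rho_{ij}$ is the correlation between asset returns $r_i$ and $r_j$. The estimation of $\rho_{ij}$, denoted as $\hat{\rho}_{ij}$, is computed by using the pairwise sample correlations of the historical asset returns”.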
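The sample estimates of this subsection can be sketched in a few lines. The example assumes the unbiased divisor T − 1 in the sample covariance (a divisor of T is also common), and the return data and function names are hypothetical.

```python
import math


def sample_covariance(returns):
    """returns: list of T rows, each holding the N asset returns of one period."""
    t, n = len(returns), len(returns[0])
    means = [sum(row[i] for row in returns) / t for i in range(n)]
    cov = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            s = sum((row[i] - means[i]) * (row[j] - means[j])
                    for row in returns) / (t - 1)   # T - 1 divisor (assumption)
            cov[i][j] = cov[j][i] = s
    return cov


def correlation_from_covariance(cov):
    """Recover the correlation matrix C from Sigma = diag(sigma) C diag(sigma)."""
    std = [math.sqrt(cov[i][i]) for i in range(len(cov))]
    return [[cov[i][j] / (std[i] * std[j]) for j in range(len(cov))]
            for i in range(len(cov))]


# Hypothetical returns: 3 periods (rows) for 2 assets (columns)
returns = [[0.01, 0.02],
           [0.03, 0.00],
           [-0.01, 0.01]]
cov_hat = sample_covariance(returns)
corr_hat = correlation_from_covariance(cov_hat)
print(cov_hat, corr_hat)
```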
3.3.2 The single index model (SIM)
The Single-Index Model (SIM), developed by Sharpe (1963), estimates asset returns and the covariance matrix by assuming that each asset's returns are driven by overall market performance. This one-factor model reduces the noise in estimated asset returns by regressing them on market returns:
$$\hat{r}_{i,t} = \alpha_i + \beta_i \hat{r}_{m,t} + \varepsilon_{i,t}$$
Here $\hat{r}_{i,t}$ is the estimated return of asset i at time t, $\hat{r}_{m,t}$ is the estimated market return, and $\varepsilon_{i,t}$ is a random error term. The parameters $\alpha_i$ and $\beta_i$ are obtained by OLS regression. The random errors are assumed to be independent across assets, $\mathrm{Cov}[\varepsilon_i, \varepsilon_j] = 0$ for i ≠ j, and uncorrelated with the market return, $\mathrm{Cov}[\hat{r}_m, \varepsilon_i] = 0$. The errors must also follow a normal distribution with $E[\varepsilon_i] = 0$ and $\mathrm{Var}[\varepsilon_i] = \sigma_{\varepsilon_i}^2$.
Thus, with asset returns constructed from the SIM, the variances and covariances of asset returns implied by the model are:
• The variance of estimated asset return i:
$$\hat{\sigma}_i^2 = \beta_i^2 \hat{\sigma}_m^2 + \sigma_{\varepsilon_i}^2$$
• The covariance of estimated asset returns i and j:
$$\hat{\sigma}_{ij} = \beta_i \beta_j \hat{\sigma}_m^2$$
• The estimated covariance matrix of SIM:
$$\hat{\Sigma}_{SIM} = \hat{\sigma}_m^2\, \beta\beta' + \hat{D}$$
The estimated market variance is $\hat{\sigma}_m^2 = \mathrm{Var}[\hat{r}_m]$, and β is the N × 1 vector of coefficients obtained from the SIM regressions of the N assets. The regression error variances form the diagonal matrix $\hat{D}$, which has shape N × N. Because risk is measured by variance, the market variance is assumed to be strictly positive, $\hat{\sigma}_m^2 > 0$.
This study employs a single index model (SIM) with the market return as the index, in contrast to Cohen and Pogue (1967), who used industry returns. Although the single index model can be extended to a multi-factor model, George Derpanopoulos (2018) argues that adding complexity to the SIM often introduces more noise than valuable information, primarily because it is difficult to associate stock covariances with factors beyond the market itself.
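The construction of the SIM covariance matrix can be sketched as follows: each asset is regressed on the market by OLS, and the matrix is assembled as market variance times the outer product of the betas plus the diagonal matrix of residual variances. The divisor T − 2 for the residual variance, the function name and the return data are assumptions made for illustration.

```python
def sim_covariance(asset_returns, market_returns):
    """asset_returns: list of N series of length T; market_returns: series of length T."""
    t = len(market_returns)
    m_bar = sum(market_returns) / t
    var_m = sum((m - m_bar) ** 2 for m in market_returns) / (t - 1)

    betas, resid_vars = [], []
    for series in asset_returns:
        r_bar = sum(series) / t
        cov_im = sum((r - r_bar) * (m - m_bar)
                     for r, m in zip(series, market_returns)) / (t - 1)
        beta = cov_im / var_m                  # OLS slope
        alpha = r_bar - beta * m_bar           # OLS intercept
        residuals = [r - alpha - beta * m
                     for r, m in zip(series, market_returns)]
        # Residual variance with T - 2 degrees of freedom (assumption)
        resid_vars.append(sum(e * e for e in residuals) / (t - 2))
        betas.append(beta)

    n = len(betas)
    cov = [[betas[i] * betas[j] * var_m for j in range(n)] for i in range(n)]
    for i in range(n):
        cov[i][i] += resid_vars[i]             # add the diagonal matrix D
    return betas, resid_vars, var_m, cov


# Hypothetical data: asset A moves exactly twice as much as the market
market = [0.01, -0.02, 0.03, 0.00]
asset_a = [0.02, -0.04, 0.06, 0.00]
asset_b = [0.02, 0.00, 0.01, 0.01]
betas, resid_vars, var_m, cov_sim = sim_covariance([asset_a, asset_b], market)
print(betas, cov_sim)
```

Off-diagonal entries depend only on the betas and the market variance, which is why adding an asset requires no new pairwise covariance estimates.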
3.3.3 The constant correlation model (CCM)
In 1895, Karl Pearson introduced the Pearson Product-Moment Correlation Coefficient (PPMCC), a method for measuring the linear correlation between two variables. This widely used statistical tool is essential in various fields, including economics, mathematics, and the sciences.
The formula for the Pearson correlation between two variables X and Y is:
$$\rho_{X,Y} = \frac{\mathrm{Cov}(X, Y)}{\sigma_X \sigma_Y}$$
Pearson correlations underlie the Constant Correlation Model (CCM) for estimating the covariance matrix. According to Elton and Gruber (1973), this model assumes that all pairs of stocks share the same correlation, equal to the average of the sample correlations, which is then used to calculate the covariances.
The covariance matrix with constant correlation is estimated as follows. Let S denote the sample covariance matrix of asset returns, with elements $s_{ij}$. The sample correlation between stocks i and j is:
$$r_{ij} = \frac{s_{ij}}{\sqrt{s_{ii}\, s_{jj}}}$$
The average of the sample correlations is calculated as:
$$\bar{r} = \frac{2}{(N-1)N} \sum_{i=1}^{N-1} \sum_{j=i+1}^{N} r_{ij}$$
Finally, the constant correlation matrix C is defined by:
$$c_{ii} = s_{ii} \quad \text{and} \quad c_{ij} = \bar{r}\sqrt{s_{ii}\, s_{jj}}$$
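The constant correlation matrix can be computed directly from a sample covariance matrix, as the following minimal sketch shows: it averages the pairwise sample correlations, keeps the sample variances on the diagonal, and rebuilds every covariance from the common correlation. The input matrix and the function name are hypothetical.

```python
import math


def constant_correlation_target(s):
    """Return the average correlation and the constant correlation covariance matrix."""
    n = len(s)
    std = [math.sqrt(s[i][i]) for i in range(n)]
    corr_sum = 0.0
    for i in range(n - 1):
        for j in range(i + 1, n):
            corr_sum += s[i][j] / (std[i] * std[j])   # pairwise sample correlation
    r_bar = 2.0 * corr_sum / (n * (n - 1))            # average over the N(N-1)/2 pairs
    target = [[s[i][i] if i == j else r_bar * std[i] * std[j]
               for j in range(n)] for i in range(n)]
    return r_bar, target


# Hypothetical sample covariance matrix for three stocks
s = [[0.04, 0.012, 0.004],
     [0.012, 0.09, 0.030],
     [0.004, 0.030, 0.25]]
r_bar, ccm = constant_correlation_target(s)
print(r_bar, ccm)
```

In this example the pairwise correlations are 0.2, 0.04 and 0.2, so every off-diagonal entry is rebuilt from their average of about 0.147.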
3.3.4 Shrinkage towards single-index model (SSIM)
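Ledoit and Wolf (2003) introduced a shrinkage method for estimating the covariance matrix that improves its stability. The approach combines the sample covariance matrix S with a structured covariance estimator F derived from the Single-Index Model, $F = \hat{\Sigma}_{SIM}$. The shrinkage estimator is a convex linear combination of these two matrices, δF + (1−δ)S, where δ is the shrinkage constant, a number between 0 and 1 discussed later in this section. In effect, the method distributes weight between the more stable structured estimator F and the sample covariance matrix S.
Using the definition of the Frobenius norm, the loss of the shrinkage estimator is:
$$L(\delta) = \lVert \delta F + (1-\delta) S - \Sigma \rVert^2 \qquad (3.3.9)$$
This equation uses the Frobenius norm to measure the distance between the true population covariance matrix Σ and the shrunk covariance matrix $\hat{\Sigma} = \delta F + (1-\delta) S$, with the shrinkage constant δ as the variable.
Taking the expectation of equation (3.3.9) gives the risk function:
$$R(\delta) = E[L(\delta)] = \sum_{i=1}^{N}\sum_{j=1}^{N} E\big[(\delta f_{ij} + (1-\delta) s_{ij} - \sigma_{ij})^2\big]$$
$$R(\delta) = \sum_{i=1}^{N}\sum_{j=1}^{N} \big[\delta^2\, \mathrm{Var}(f_{ij}) + (1-\delta)^2\, \mathrm{Var}(s_{ij}) + 2\delta(1-\delta)\, \mathrm{Cov}(f_{ij}, s_{ij}) + \delta^2 (\phi_{ij} - \sigma_{ij})^2\big]$$
where $f_{ij}$, $s_{ij}$ and $\sigma_{ij}$ are the entries of F, S and Σ, and $\phi_{ij}$ is the expected value of $f_{ij}$.
Thus, R(δ) is the risk associated with equation (3.3.9). The investor's task is to minimize this risk, which requires the first and second derivatives of R(δ):
$$R'(\delta) = 2\sum_{i=1}^{N}\sum_{j=1}^{N} \big[\delta\, \mathrm{Var}(f_{ij}) - (1-\delta)\, \mathrm{Var}(s_{ij}) + (1-2\delta)\, \mathrm{Cov}(f_{ij}, s_{ij}) + \delta (\phi_{ij} - \sigma_{ij})^2\big]$$
$$R''(\delta) = 2\sum_{i=1}^{N}\sum_{j=1}^{N} \big[\mathrm{Var}(f_{ij} - s_{ij}) + (\phi_{ij} - \sigma_{ij})^2\big]$$
Since the second derivative R''(δ) is always larger than 0, the risk R(δ) is minimized by solving R'(δ) = 0. The shrinkage constant is then:
$$\delta^* = \frac{\sum_{i=1}^{N}\sum_{j=1}^{N} \big[\mathrm{Var}(s_{ij}) - \mathrm{Cov}(f_{ij}, s_{ij})\big]}{\sum_{i=1}^{N}\sum_{j=1}^{N} \big[\mathrm{Var}(f_{ij} - s_{ij}) + (\phi_{ij} - \sigma_{ij})^2\big]} \qquad (3.3.10)$$
Where Ledoit and Wolf (2003a) showed that:
$\pi = \sum_{i=1}^{N}\sum_{j=1}^{N} \mathrm{AsyVar}[\sqrt{T}\, s_{ij}]$ is the sum of asymptotic variances, $\rho = \sum_{i=1}^{N}\sum_{j=1}^{N} \mathrm{AsyCov}[\sqrt{T}\, f_{ij}, \sqrt{T}\, s_{ij}]$ is the sum of asymptotic covariances, and $\gamma = \sum_{i=1}^{N}\sum_{j=1}^{N} (\phi_{ij} - \sigma_{ij})^2$.
The equation (3.3.10) then becomes:
$$\delta^* = \frac{1}{T}\, \frac{\pi - \rho}{\gamma} + O\Big(\frac{1}{T^2}\Big)$$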
The optimal shrinkage intensity, or shrinkage coefficient, identified by Ledoit and Wolf represents a balance between the sample covariance matrix and the shrinkage target matrix. A higher shrinkage coefficient means that the target has a greater influence on the estimated covariance matrix. Since the effectiveness of portfolio selection relies on accurate covariance matrix estimation, which in turn depends on the shrinkage intensity, determining the optimal shrinkage coefficient is a key step of the shrinkage method.
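How the shrinkage intensity enters the estimator can be sketched as below. The example takes estimates of π, ρ and γ as given inputs (their estimation is not shown), forms δ = (π − ρ)/(γT), and clips the result to the interval [0, 1] as a practical safeguard so that the combination stays convex. All numerical values and function names are hypothetical.

```python
def shrinkage_intensity(pi_hat, rho_hat, gamma_hat, n_obs):
    """delta = (pi - rho) / (gamma * T), clipped to [0, 1] (clipping is a safeguard)."""
    kappa = (pi_hat - rho_hat) / gamma_hat
    return max(0.0, min(kappa / n_obs, 1.0))


def shrink_covariance(sample_cov, target, delta):
    """Convex combination delta * F + (1 - delta) * S, entry by entry."""
    n = len(sample_cov)
    return [[delta * target[i][j] + (1.0 - delta) * sample_cov[i][j]
             for j in range(n)] for i in range(n)]


# Hypothetical sample covariance S and structured target F
S = [[0.04, 0.02],
     [0.02, 0.09]]
F = [[0.04, 0.01],
     [0.01, 0.09]]

delta = shrinkage_intensity(pi_hat=0.5, rho_hat=0.1, gamma_hat=2.0, n_obs=100)
shrunk = shrink_covariance(S, F, 0.25)
print(delta, shrunk)
```

With δ = 0.25 the off-diagonal entry moves a quarter of the way from the sample value 0.02 towards the target value 0.01.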
3.3.5 Shrinkage towards Constant correlation Model (SCCM)
Ledoit and Wolf (2003b) proposed using the Constant Correlation Model (CCM) as the target matrix for the shrinkage method. Their findings indicated that shrinking towards the CCM yields superior performance compared to shrinking towards the Single-Index Model (SIM), while also being simpler to implement.
The shrunk covariance matrix, $\hat{\Sigma} = \delta F + (1-\delta) S$, is obtained by replacing the single-index covariance matrix F with the constant correlation covariance matrix. Here, δ denotes the shrinkage constant, which is explained further in this section. This approach distributes weight between a more stable structured estimator and the sample covariance matrix S through the shrinkage constant δ.
An objective must be chosen according to which the shrinkage coefficient is optimal.
Existing shrinkage estimators from finite-sample statistical decision theory, including those by Frost and Savarino (1986), fail when the number of variables (N) is greater than or equal to the sample size (T), because their loss functions rely on the inverse of the covariance matrix. To address this issue, a new loss function is introduced that does not depend on the inverse and is based on an intuitive quadratic measure of distance between the true and estimated covariance matrices, the Frobenius norm. Building on this methodology, Ledoit and Wolf (2003b) developed a method to estimate the shrinkage intensity.
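The quadratic distance described above can be sketched with the squared Frobenius norm, the sum of squared entries of the difference between two matrices. In practice the true covariance matrix is unknown, so the sketch below only illustrates the definition with hypothetical matrices in which the target happens to equal the true matrix; the function names are also hypothetical.

```python
def frobenius_norm_squared(z):
    """Sum of squared entries of the matrix z."""
    return sum(v * v for row in z for v in row)


def shrinkage_loss(delta, target, sample_cov, true_cov):
    """Squared Frobenius distance between delta*F + (1-delta)*S and the true matrix."""
    n = len(true_cov)
    diff = [[delta * target[i][j] + (1.0 - delta) * sample_cov[i][j] - true_cov[i][j]
             for j in range(n)] for i in range(n)]
    return frobenius_norm_squared(diff)


# Hypothetical matrices; in this toy case the target equals the true matrix
true_cov = [[0.04, 0.01],
            [0.01, 0.09]]
S = [[0.04, 0.02],
     [0.02, 0.09]]
F = [[0.04, 0.01],
     [0.01, 0.09]]

for delta in (0.0, 0.5, 1.0):
    print(delta, shrinkage_loss(delta, F, S, true_cov))
```

No matrix inverse appears anywhere in the loss, so it remains well defined even when the sample covariance matrix is singular.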
The Frobenius norm of the N × N symmetric matrix Z with entries $z_{ij}$, in which i, j = 1, …, N, is defined by:
$$\lVert Z \rVert^2 = \sum_{i=1}^{N}\sum_{j=1}^{N} z_{ij}^2$$
Based on the Frobenius norm, the distance between the shrinkage estimator of the covariance matrix and the true covariance matrix gives the quadratic loss function:
$$L(\delta) = \lVert \delta F + (1-\delta) S - \Sigma \rVert^2$$
The shrinkage coefficient δ is found by minimizing the expected value of the loss function:
$$\delta^* = \arg\min_{\delta}\; E[L(\delta)]$$
Assuming that N is fixed while T approaches infinity, Ledoit and Wolf (2003) prove that “the optimal value δ asymptotically behaves like a constant over T (up to higher-order terms)”. This constant, denoted κ, is given by the following formula:
$$\kappa = \frac{\pi - \rho}{\gamma}$$
Where: π is the sum of “asymptotic variances of the entries of sample covariance matrix” scaled by $\sqrt{T}$:
$$\pi = \sum_{i=1}^{N}\sum_{j=1}^{N} \mathrm{AsyVar}[\sqrt{T}\, s_{ij}]$$
ρ is the sum of “asymptotic covariance of the entries of the shrinkage target with the entries of sample covariance matrix”, also scaled by $\sqrt{T}$:
$$\rho = \sum_{i=1}^{N}\sum_{j=1}^{N} \mathrm{AsyCov}[\sqrt{T}\, f_{ij}, \sqrt{T}\, s_{ij}]$$
γ, which denotes the misspecification of the (population) shrinkage target, is determined as follows: