
VIETNAM - NETHERLANDS PROGRAMME FOR M.A IN DEVELOPMENT ECONOMICS

STOCHASTIC FRONTIER MODELS REVIEW WITH APPLICATIONS TO VIETNAMESE

SMALL AND MEDIUM ENTERPRISES IN METAL

MANUFACTURING INDUSTRY

A thesis submitted in partial fulfilment of the requirements for the degree of

MASTER OF ARTS IN DEVELOPMENT ECONOMICS

By

NGUYEN QUANG

Academic Supervisor:

Dr TRUONG DANG THUY

HO CHI MINH CITY, NOVEMBER 2013


ABSTRACT

The metal manufacturing industry has an important role in the economy due to the high demand for metal products, especially steel and iron, in daily life, production and, above all, construction. To help maintain and develop the benefits from this industry, it is necessary to analyze the technical efficiency level of small and medium enterprises (SMEs), which account for about 97% of Vietnamese enterprises. This study aims to estimate the technical efficiency level of Vietnamese SMEs with stochastic frontier models, using an unbalanced panel dataset covering three years: 2005, 2007 and 2009. Besides, because the literature on panel-data stochastic frontier models is divergent, this paper also reviews the popular models in order to choose a suitable one for the case of the Vietnamese metal manufacturing industry. The results show different technical efficiency levels across models, owing to the divergence among identifications of the technical efficiency concept.


TABLE OF CONTENTS

LIST OF TABLES

LIST OF FIGURES

LIST OF CHARTS

CHAPTER I: INTRODUCTION

1 Introduction

2 Research objectives

CHAPTER II: LITERATURE REVIEW

1 Efficiency measurement

2 Data Envelopment Analysis (DEA) and Stochastic Frontier Analysis (SFA)

3 The cross-sectional Stochastic Frontier Model

4 Stochastic frontier model with panel data

4.1 Time-invariant models

4.2 Time varying models

CHAPTER III: METHODOLOGY

1 Overview of Vietnamese metal manufacturing industry

2 Analytical framework

3 Research method

3.1 Estimating technical inefficiency

3.2 Variables description

3.3 Data source

CHAPTER IV: RESULT AND DISCUSSION

1 Empirical result

1.1 Cobb-Douglas functional form

1.2 Translog functional form

2 Discussion

2.1 Models without distribution assumption

2.2 The distribution of technical inefficiency

2.3 Technical inefficiency and firm-specific effects

2.4 Identification issue

CHAPTER V: CONCLUSION

BIBLIOGRAPHY


LIST OF TABLES

Table 3-1 Output and input deflators

Table 3-2 Descriptive statistics of key variables

Table 3-3 Real output and material cost values of different-sized firms

Table 4-1 Time-invariant models with Cobb-Douglas function

Table 4-2 Time-varying models with Cobb-Douglas function

Table 4-3 Determinants

Table 4-4 Time-invariant models with Translog function

Table 4-5 Time-varying models with Translog function

Table 4-6 Value of μ in models with truncated distribution

LIST OF FIGURES

Figure 2-1 Input-oriented efficiency

Figure 2-2 Output-oriented efficiency

Figure 2-3 Various types of technical inefficiency distribution

LIST OF CHARTS

Chart 3-1 Firm size and ownership type

Chart 3-2 Firm location


CHAPTER I: INTRODUCTION

1 Introduction

The rising demand for metal products (especially iron and steel) in daily life, production and, above all, the construction sector makes the role of the metal manufacturing industry important. According to the World Steel Association, at the end of 2011 the Vietnamese steel market was the seventh largest in Asia, with a growth rate in tandem with economic expansion. There is still huge potential in this industry due to growing income and an expanding trend of construction.

As reported by the Viet Nam Chamber of Commerce and Industry (VCCI), at the end of 2011, 97% of enterprises in Viet Nam were small and medium sized; they employ more than half of the domestic labor force and contribute more than 40% of GDP. This dynamic group of firms has become an important resource for economic growth in Viet Nam. However, this industry is now facing challenges due to outdated technology and heavy dependence on imported materials. For the reasons above, an analysis of the technical inefficiency level of Vietnamese small and medium enterprises (SMEs) in the metal manufacturing industry is necessary to maintain and develop the benefits from this industry.

Technical efficiency is the effectiveness with which a firm uses a given set of inputs to produce outputs. The set of highest amounts of output that can be produced from given amounts of inputs is the production frontier. Technical efficiency reflects how close a firm can get to this border: firms producing on this frontier are technically efficient, while those far below the frontier are technically inefficient. A technical efficiency analysis is often conducted by constructing a production-possibility boundary (the frontier) and then estimating the distance (the inefficiency level) of firms from that boundary.

There are two approaches to measuring technical efficiency: deterministic and stochastic. The deterministic approach, called Data Envelopment Analysis (DEA), was first introduced in Charnes, Cooper, and Rhodes (1978), which uses linear programming with data on inputs and outputs to construct the frontier. The advantage of this method is that it does not require the specification of a production function. However, being deterministic, this method assumes that there is no statistical noise in the data. The stochastic approach, called Stochastic Frontier Analysis (SFA), was first mentioned in Aigner, Lovell, and Schmidt (1977) and Meeusen and Broeck (1977). This method, contrary to DEA, requires a specific functional form for the production function and allows the data to contain noise. SFA is used more often in practice because, in many cases, the noiseless assumption is unrealistic.

Since its first appearance in Aigner et al. (1977) and Meeusen and Broeck (1977), the literature of technical efficiency has been widely developed through many studies such as Pitt and Lee (1981), Schmidt and Sickles (1984), Battese and Coelli (1988, 1992, 1995), Cornwell, Schmidt, and Sickles (1990), Kumbhakar (1990), Lee and Schmidt (1993) and Greene (2005) (see Greene (2008) for an overview). Being able to deal with various production processes, this method has become a popular tool for analyzing the performance of production units such as firms, regions and countries. Applications can be found in Battese and Corra (1977), Page Jr (1984), Bravo-Ureta and Rieger (1991), Battese (1992), Dong and Putterman (1997), Anderson, Fish, Xia, and Michello (1999) and Cullinane, Wang, Song, and Ji (2006).

Despite the fact that a rich literature on this matter has been developed over a long time, researchers at times find it difficult to choose the most appropriate model to estimate the technical efficiency level or to determine its sources. The earliest versions of these models were built to deal with cross-sectional data (Aigner et al., 1977; Meeusen & Broeck, 1977). These models need assumptions about the technical inefficiency distribution and its uncorrelatedness with the other parts of the model. Pitt and Lee (1981) and Schmidt and Sickles (1984) pointed out that technical inefficiency cannot be estimated consistently with cross-sectional data and suggested models that deal with panel data. The literature on panel data models first came with the assumption of time-invariant technical inefficiency (Battese & Coelli, 1988; Pitt & Lee, 1981; Schmidt & Sickles, 1984). Researchers later argued that it is too strict to assume technical inefficiency to be fixed through time and suggested models that allow time variation, such as Cornwell et al. (1990), Kumbhakar (1990), Lee and Schmidt (1993) and Battese and Coelli (1992). Those models solved the problem by imposing some time patterns. Nevertheless, the assumption of an unchanged time behavior was also criticized as too strict. The model with technical inefficiency effects was then created by Battese and Coelli (1995), which allows technical inefficiency to vary with time and other determinants. Greene (2005) introduces "true" fixed and random effects models, which allow inefficiency to change freely over time and separate it from other firm-specific factors.

This thesis aims to estimate the technical efficiency level of Vietnamese metal manufacturing firms with panel-data stochastic frontier models. Besides, this study also reviews those panel data models of technical inefficiency analysis and gives some implications about model choice in this field. This study uses an unbalanced panel dataset of firms in the metal manufacturing industry in the years 2005, 2007 and 2009, drawn from the Vietnamese SME survey. The results show different technical efficiency levels among those stochastic frontier models.

2 Research objectives

- To give a review of panel-data stochastic frontier models;

- To apply those models to investigate the technical efficiency of SME firms in the metal manufacturing industry in Viet Nam.


CHAPTER II: LITERATURE REVIEW

The terms productivity and efficiency need to be distinguished in the context of firm production. On the one hand, productivity covers all factors that decide how well outputs can be obtained from given amounts of inputs; it can be considered as "Total factor productivity - TFP". On the other hand, efficiency relates to the production frontier. This frontier shows the maximum output that can be produced with a given level of inputs. A firm is called technically efficient when it produces on this frontier. Firm production cannot go beyond this frontier, for this is the limitation of its performing ability. When the firm performs below this frontier, it is considered inefficient; the farther the distance, the more inefficient the firm is. Changes in productivity can be due to changes in efficiency (the firm becomes more or less technically efficient), a change in the amount and proportion of its inputs (changing its scale efficiency), a change in technical progress (change in technology level over time) or a combination of all the above factors (Coelli et al., 2005).

Efficiency measurement can be approached from two sides, inputs and outputs. Input-oriented measures relate to cost reduction (the minimum amount of inputs to produce a given amount of output). Output-oriented measures, on the other hand, consider the maximum level of output produced from a given amount of inputs. Figures 2-1 and 2-2 illustrate these two approaches. Figure 2-1 depicts an isoquant showing the minimum combinations of inputs that could be used to produce a given output. If a firm operates on this isoquant (the frontier), it is technically efficient in an input-oriented sense, because its input amounts are minimized. The iso-cost line CC' (which can be constructed when the input-price ratio is known) determines the optimal proportion of inputs in order to achieve the lowest cost. Technical efficiency (TE) can be calculated as the ratio OR/OP, and allocative efficiency (AE) equals the ratio OS/OR. The product of AE and TE expresses the overall efficiency of the firm, called economic efficiency (EE), i.e. EE = TE × AE. Figure 2-2 illustrates the case where the firm uses one input and produces one output. The f(X) curve determines the maximum output that can be obtained by using each level of input X (the frontier). The firm is technically efficient when operating on this frontier. In this situation, TE equals BD/DE.
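Collecting the ratios just described in one place (a restatement of the relationships above, with the points O, P, R, S, B, D and E referring to the figures):

\[ TE_I = \frac{OR}{OP}, \qquad AE_I = \frac{OS}{OR}, \qquad EE = TE_I \times AE_I = \frac{OS}{OP}, \qquad TE_O = \frac{BD}{DE}. \]

The input-oriented and output-oriented technical efficiency measures coincide only under constant returns to scale, a standard result noted in Coelli et al. (2005).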

Figure 2 – 1: Input-oriented efficiency

Figure 2 – 2: Output-oriented efficiency

Measurements and analyses of TE were conducted by a huge number of studies with two main approaches – Data Envelopment Analysis (DEA) and Stochastic Frontier Analysis (SFA). The next section briefly discusses these two methods.

2 Data Envelopment Analysis (DEA) and Stochastic Frontier Analysis (SFA)

a Data Envelopment Analysis (DEA)

DEA is a non-parametric method for estimating firm efficiency which was first introduced in Charnes, Cooper, and Rhodes (1978) with constant returns to scale. Later on, it was extended to allow for decreasing and variable returns to scale in Banker, Charnes, and Cooper (1984). Detailed instructions can be found in Banker et al. (1984), Charnes et al. (1978), Fare, Grosskopf, and Lovell (1994), Färe, Grosskopf, and Lovell (1985) and Ray (2004).


With n firms (called Decision Making Units – DMUs), each using m types of inputs to produce s types of outputs, the DEA model following an output-oriented measure is given, for the DMU under evaluation (indexed 0), by:

\[ \max_{u,v}\; h_0 = \frac{\sum_{r=1}^{s} u_r y_{r0}}{\sum_{i=1}^{m} v_i x_{i0}} \quad \text{subject to} \quad \frac{\sum_{r=1}^{s} u_r y_{rj}}{\sum_{i=1}^{m} v_i x_{ij}} \le 1,\; j = 1,2,\ldots,n; \qquad u_r, v_i \ge 0 \]

with i = 1, 2, …, m; r = 1, 2, …, s; j = 1, 2, …, n; x_ij and y_rj are respectively the ith input and the rth output of the jth DMU; u_r and v_i are the weights of outputs and inputs, which come from the solution of this maximization problem (Charnes et al., 1978). Using a piece-wise frontier in the spirit of Farrell (1957) and a linear programming algorithm, this method constructs a production frontier. Then, the ratio between outputs and inputs is brought into account and compared with the frontier to calculate the efficiency level of each firm.
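In practice the fractional program above is solved through an equivalent linear program. A minimal sketch of the output-oriented envelopment form is added here for illustration (it is the standard textbook dual, not a display taken from the thesis): the efficiency score of DMU 0 is 1/φ*, where

\[ \varphi^* = \max_{\varphi,\lambda}\; \varphi \quad \text{subject to} \quad \sum_{j=1}^{n} \lambda_j y_{rj} \ge \varphi\, y_{r0}\;\; (r = 1,\ldots,s), \qquad \sum_{j=1}^{n} \lambda_j x_{ij} \le x_{i0}\;\; (i = 1,\ldots,m), \qquad \lambda_j \ge 0. \]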

Although it has only attracted attention since 1978, DEA has, for many reasons, become a popular branch of efficiency analysis. Wei (2001) described this growth by listing five developments in DEA research. Studies using DEA have been conducted in almost all industries, in both the private and the public sector. Moreover, numerical methods and supporting computer programs have grown in both number and quality. Over time, new DEA models have been discussed and established, such as the additive model, the log-type DEA model and the stochastic DEA model. Besides, the economic and management background of DEA has been analyzed more carefully and deeply, strengthening the basis for applications of this model. Mathematical theories related to DEA have also been promoted by many mathematicians. Those factors gave rise to progress in both theoretical improvements and empirical applications of this non-parametric method.

b Stochastic Frontier Analysis (SFA):

Aigner et al. (1977) and Meeusen and Broeck (1977) suggested the stochastic production frontier method to measure firms' efficiency. The model can be described mathematically as below:

\[ \ln y_i = x_i'\beta + v_i - u_i \]

with y_i the output of firm i, x_i the vector of inputs and β the vector of parameters to be estimated. The last two factors, v_i and u_i, are the two error terms: v_i is random statistical noise and is assumed to have a normal distribution with zero mean; u_i is a non-negative term indicating inefficiency, which keeps the firm from producing on its frontier. There are different assumptions about u_i's distribution, such as the half-normal distribution (Aigner et al., 1977), the exponential distribution (Meeusen & Broeck, 1977), the gamma distribution (Greene, 1990) or a non-negative truncation of N(μ, σ_u²) (Battese & Coelli, 1988, 1992, 1995).

c Trade-off between DEA and SFA

Differing in their approach, DEA and SFA have their own advantages and drawbacks. This implies that when choosing between DEA and SFA, researchers must make some trade-offs. Being a non-parametric method, DEA is a deterministic approach without the specification of a production function, while SFA is a stochastic technique using econometric (parametric) tools and requires a model specification (Ray & Mukherjee, 1995). From that key difference, DEA is considered to be non-statistical, which assumes that the data have no noise. Data noise can come from measurement errors or random factors which cannot be controlled by firms, so this restriction seems unrealistic in practice. SFA is statistical, so it allows for and takes into account statistical noise. In other words, it is more flexible with real-world data, in which random factors and errors in collection are unavoidable. But using SFA requires assumptions about the specification of the model, functional forms, and the distribution of parameters and error terms (Wagstaff, 1989). In DEA, every factor that keeps the firm away from its frontier is regarded as inefficiency, while in SFA the residual is decomposed into two components: one part, which is not under the control of the firm itself, is counted as noise and has zero mean; the other part, known as inefficiency, is the weakness of the firm which makes it produce below the frontier. So, generally, the efficiency measured from SFA will be relatively higher (Ferrier & Lovell, 1990).

DEA has the advantage of being applicable in various complicated production conditions. Without the requirement of a definite production function, it helps simplify the linkage from inputs to outputs of a production process. Without statistical properties, however, no test can be used to assess DEA's goodness of fit or specification. In spite of having troubles with model specification, SFA still has econometric tools to test whether the model is suitable or not. The most beneficial advantage of SFA is its capability of dealing with statistical noise. Generally, for industries where the production process is controlled strictly, DEA seems to be the better choice for measuring efficiency, because random fluctuation in these industries is minimized and the production process is very stable (from a given amount of inputs, the number and quality of outputs can be determined precisely). Meanwhile, SFA tends to be suitable for industries in which noise is inevitable, where firms have to bear the impacts of random fluctuations. In the case of this thesis, firms in the metal manufacturing industry are influenced by the markets of both inputs and outputs, both domestic and abroad, and by changes in policies. Given the nature of the industry analyzed, SFA is the better method to apply. The next part discusses the SFA method in detail, for both cross-sectional and panel data models.

3 The cross-sectional Stochastic Frontier Model:

The cross-sectional stochastic frontier model in Aigner et al. (1977) can be described as:

\[ \ln y_i = x_i'\beta + v_i - u_i \]

with v_i the random noise and u_i technical inefficiency (u_i ≥ 0). To distinguish these two components of the residual, some assumptions are necessary. The first assumption concerns the distribution of u_i, while v_i follows a symmetric normal distribution. Because u_i represents the distance that keeps firms below the frontier, its value is non-negative. As mentioned above, the suggested distributions include the half-normal (Aigner et al., 1977), exponential (Meeusen & Broeck, 1977), gamma (Greene, 1990) or a non-negative truncation of N(μ, σ_u²) (Battese & Coelli, 1988, 1992, 1995).

Both of the two main econometric estimation methods can be used for the technical inefficiency calculation – Ordinary Least Squares (OLS) and Maximum Likelihood (ML). But because the error term includes two components, u with an asymmetric distribution and v with a symmetric distribution, the composed error ε = v − u does not have a normal distribution and E(ε) = −E(u) < 0. This biases the OLS intercept downward. To clarify, consider a regression with just the intercept β_0: y = β_0 + ε; the estimator for β_0 will be ȳ, and from the equation above E(ȳ) = β_0 + E(ε) = β_0 − E(u), which does not equal β_0. Winsten (1957) suggested a method called Corrected Ordinary Least Squares (COLS), and Afriat (1972) and Richmond (1974) offered the method of Modified Ordinary Least Squares (MOLS), to solve this bias problem. These two methods correct or modify the intercept upward by adding the maximum or the average value of the OLS residuals, respectively. COLS and MOLS have some problems, such as estimates that lack a statistical interpretation (Mastromarco, 2007). ML, however, has desirable asymptotic properties and is able to deal with the asymmetrically distributed residual, so it is used more frequently than OLS.
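As a compact illustration of the corrections just mentioned (a sketch in the notation above, not a display from the thesis):

\[ \hat\beta_0^{COLS} = \hat\beta_0^{OLS} + \max_j \hat e_j, \qquad \hat u_i = \max_j \hat e_j - \hat e_i \ge 0; \qquad \hat\beta_0^{MOLS} = \hat\beta_0^{OLS} + \hat E(u), \]

where ê_i are the OLS residuals and, for example, Ê(u) = σ̂_u √(2/π) under a half-normal assumption. COLS guarantees that the shifted frontier bounds all observations, whereas MOLS does not, which is one source of the interpretation problems noted above.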


With technical inefficiency u_i following a half-normal distribution, i.e. u_i ~ N⁺(0, σ_u²) (Aigner et al., 1977), the log-likelihood function is:

\[ \ln L = -\frac{N}{2}\ln\!\left(\frac{\pi\sigma_s^2}{2}\right) - \frac{1}{2\sigma_s^2}\sum_{i=1}^{N}\varepsilon_i^2 + \sum_{i=1}^{N}\ln\Phi\!\left(-\frac{\varepsilon_i}{\sigma_s}\sqrt{\frac{\gamma}{1-\gamma}}\right) \]

with σ_s² = σ_v² + σ_u² and γ = σ_u²/σ_s². If γ = 0, deviations from the frontier are entirely statistical noise (firms are fully efficient); if γ = 1, all deviation is attributed to inefficiency. In the equation above, ε_i = ln y_i − x_i'β is the composed residual and Φ(x) is the cumulative distribution function (cdf) of a N(0,1) random variable evaluated at x (Coelli et al., 2005). This function can be maximized using the iterative optimization procedure in Judge, Hill, Griffiths, Lutkepohl, and Lee (1982), as cited in Coelli et al. (2005).

The log-likelihood function for the exponential distribution of u_i takes an analogous form (see Coelli et al., 2005, for its expression), as does the case of the truncated normal distribution, i.e. u_i ~ N⁺(μ, σ_u²), which generalizes the half-normal case by allowing a non-zero mode of inefficiency (Battese & Coelli, 1988).


Figure 2 – 3: Various types of technical inefficiency distribution

Figure 2-3 illustrates the probability density functions of the four types of distribution of u. Obviously, there are restrictions with the gamma, exponential and half-normal distributions. With these distributions, because most observations are located in the area with low values of u (technical inefficiency), one would conclude that the efficiency level of firms is rather high (the inefficiency level is low). Put differently, most firms are highly efficient. This can be untrue for many industries, in which high efficiency is not realistic. The truncated normal distribution is more flexible, since it allows inefficiency to concentrate around almost any positive point, and can therefore describe a wider range of industries.

Consider a Cobb-Douglas production function as:

\[ y_i = \exp(x_i'\beta + v_i - u_i) \]

The technical efficiency level of a firm can be calculated as the ratio of observed output (y_i) to the maximum feasible output y_i*, which is the output when the firm is fully efficient, i.e. when the value of u_i is zero:

\[ TE_i = \frac{y_i}{y_i^*} = \frac{\exp(x_i'\beta + v_i - u_i)}{\exp(x_i'\beta + v_i)} = \exp(-u_i) \]

Moreover, technical inefficiency (or efficiency) can be analyzed together with some determinants using the following regression equation:

\[ u_i = \delta_0 + z_i'\delta + w_i \qquad (2.3.7) \]

with z_i the vector of determinants of u_i and δ the vector of parameters to be estimated. The distribution of w_i is the truncation of the normal distribution N(0, σ_w²) (Battese & Coelli, 1995). This is called the Technical Inefficiency Model and can be estimated simultaneously with the stochastic frontier.

4 Stochastic frontier model with panel data

There are three problems arising when we use the stochastic frontier model with cross-sectional data (Schmidt & Sickles, 1984). The first is the inconsistency in estimating technical inefficiency. Most studies in this field use the method in Jondrow, Lovell, Materov, and Schmidt (1982) to predict the technical inefficiency level for each firm in the sample. The formula is:

\[ E[u_i \mid \varepsilon_i] = \mu_i^* + \sigma^*\,\frac{\phi(\mu_i^*/\sigma^*)}{\Phi(\mu_i^*/\sigma^*)} \]

where φ and Φ are the standard normal density and cumulative density function respectively, μ_i* = −ε_i σ_u²/σ², σ*² = σ_u² σ_v²/σ² and σ² = σ_u² + σ_v². Because the frontier parameters and variance components must themselves be estimated, this prediction also contains estimation error; one can take this bias into account, but it is very complicated to do. That kind of bias disappears asymptotically and can be ignored in a large sample. However, essentially, the inefficiency of an individual firm is independent of the sample size, so its technical inefficiency level is still estimated inconsistently (Schmidt and Sickles, 1984). The second problem is the ambiguity in the distribution of u, which is needed to guarantee independence between technical inefficiency (u) and statistical noise (v). Without a strong distributional assumption, it is impossible to decompose the overall error term (ε) into inefficiency (u) and statistical noise (v); however, with cross-sectional data, the robustness of this assumption is hard to test. The third problem is the assumption of the uncorrelatedness of u with the other regressors in the model. This endogeneity problem causes biases in the model. Schmidt and Sickles (1984) suggest that the endogeneity is unavoidable, because in the long run the firm realizes its inefficiency level and adjusts its use of inputs to become more efficient.

Panel data models (with data from N firms over T periods) can help avoid these three weaknesses (Greene, 2008). Firstly, more observations over time (the ideal case is when we have long enough time series, T → ∞) help estimate technical inefficiency more consistently. Secondly, by isolating technical inefficiency and treating it as a fixed effect, the panel data model becomes distribution-free (the distributional assumption is now optional) (Greene, 2008). Finally, the uncorrelatedness assumption is also relaxed, because some panel models can take the effects of this correlation into account. The next section describes in detail those panel data stochastic frontier models, which have been developed over a long period since their first appearance.

4.1 Time-invariant models

a Within estimation with fixed effects and GLS estimation with random effects from Schmidt and Sickles (1984)

From the discussion above, Schmidt and Sickles (1984) suggest the use of panel data to estimate (time-invariant) technical inefficiency with both fixed and random effects. The model is described as:

\[ \ln y_{it} = \alpha + x_{it}'\beta + v_{it} - u_i \]

*Note: Schmidt and Sickles (1984) use a log-linear function.

with v_it uncorrelated with x_it and u_i. The within estimator uses dummy variables to estimate a separate intercept for each firm, which stands for its own technical inefficiency. This method has advantages because it needs neither an assumption about the uncorrelatedness between u_i and the other variables nor an assumption about u_i's distribution. After estimation, each firm's effect is compared with the highest in the sample and inefficiency is estimated as û_i = max_j(α̂_j) − α̂_i.

The authors suggest a large number of firms in order to have an exact estimate of the most efficient firm in the sample (the ideal case is an extensive number of firms over a considerable number of time periods). Because this method is simply a fixed effects estimation using panel data, it includes in technical inefficiency all time-invariant but firm-varying effects (as Schmidt and Sickles (1984) mention, capital stock is one example – if the value of the capital stock stays unchanged over time, the fixed effects model will include it in the firm's specific intercept), which cannot be considered inefficiency.
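A minimal sketch of this within-estimator recipe in Stata, since that is the software used later in the thesis (the variable names lny, lnk, lnl, lnm, lni, firm and year are hypothetical placeholders for log output, log inputs and the panel identifiers):

    xtset firm year
    * within (fixed effects) regression of log output on log inputs
    xtreg lny lnk lnl lnm lni, fe
    * recover the estimated firm effects and benchmark against the best firm
    predict alpha_i, u
    egen alpha_max = max(alpha_i)
    generate u_hat = alpha_max - alpha_i    // Schmidt-Sickles inefficiency
    generate te_hat = exp(-u_hat)           // technical efficiency in the log model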

Given the weaknesses of the within estimator mentioned above, the authors also consider a GLS (random effects) estimator that assumes the uncorrelatedness of u_i with the regressors. GLS has the ability to handle time-invariant regressors separately, which the within estimator cannot; however, this strong assumption needs to be tested. Given the matter of uncorrelatedness and distribution assumptions, the authors suggest two other methods: the first is the estimation of Hausman and Taylor (1981), which relaxes the uncorrelatedness assumption, and the second is maximum likelihood estimation, which goes further by assuming a specific distribution of u_i.

The two models considered above are among the simplest approaches to the concept of technical efficiency. With their criticism of the inconsistency in estimating the technical inefficiency level, Schmidt and Sickles (1984) suggest the use of fixed and random effects models, which give more consistent estimates of technical inefficiency when T is large for a given N. However, the lack of a distribution makes it hard to estimate true inefficiency apart from other firm-specific factors. Earlier, Pitt and Lee (1981) had combined a half-normal distribution with maximum likelihood estimation; their model is described in the next section.

b The model with time-invariant efficiency in Pitt and Lee (1981)

In this paper, panel data from the Indonesian weaving industry were used to estimate the technical inefficiency level and its sources. The hypothesis of whether technical inefficiency is time-invariant or time-varying was tested using three different models. Three cases were suggested by the authors. The first case is when u is fixed through time and only varies among individuals, which means it is indexed by i only (u_i), as described below:

\[ y_{it} = \alpha + x_{it}'\beta + v_{it} - u_i \qquad (2.4.3) \]

*Note: Pitt and Lee (1981) use a linear function.

In the second case, technical inefficiency is independent over time and among individuals, which leads back to the cross-sectional model as in Aigner et al. (1977). That is:

\[ y_{it} = \alpha + x_{it}'\beta + v_{it} - u_{it}, \qquad u_{it} \text{ independent over } i \text{ and } t \qquad (2.4.4) \]

The final case is intermediate between these two, where technical inefficiency is assumed to be correlated over time. That is:

\[ y_{it} = \alpha + x_{it}'\beta + v_{it} - u_{it}, \qquad u_{it} \text{ correlated over } t \qquad (2.4.5) \]


The first and second models are estimated using the maximum likelihood method, while the intermediate model is estimated with generalized least squares (because the maximum likelihood procedure for the last case is intractable). The comparison between the first two models and model three is conducted by a χ² test which can be found in Jöreskog and Goldberger (1972). The test suggests that the last model is appropriate (which implies technical inefficiency is time-varying). The measure of technical inefficiency for each firm is not mentioned in the paper; however, it can be obtained by the method of Jondrow et al. (1982), which infers the value of each u_i from the value of each ε_i.

Although the last model is shown to be more precise, it does not take the distribution of technical inefficiency into account. Moreover, it supplies no measure of inefficiency. Thus, generally, the idea proposed by Pitt and Lee (1981) hinges on a model with time-invariant inefficiency following a half-normal distribution, and suggests further research into time-varying inefficiency. However, as mentioned above, a half-normal distribution is sometimes unreasonable. Battese and Coelli (1988) suggest a more general distribution of u – the truncated normal distribution. The model is discussed in detail in the next section.

c The model with truncated normal distribution in Battese and Coelli (1988)

Battese and Coelli (1988) propose a model in which technical inefficiency follows a truncated normal distribution, developed from Stevenson (1980), for the estimation of a stochastic production frontier. Because of the limited availability of data (3 years), technical inefficiency is assumed to be time-invariant. The truncated normal distribution generalizes the older ones (the half-normal introduced in Pitt and Lee (1981) and Schmidt and Sickles (1984)), because when μ = 0 the distribution becomes half-normal. With the development in calculating the likelihood function from Stevenson (1980), the model is estimated with the maximum likelihood method. The model can be described as:

\[ \ln y_{it} = x_{it}'\beta + v_{it} - u_i, \qquad u_i \sim N^{+}(\mu, \sigma_u^2) \]

*Note: Battese and Coelli (1988) use a Cobb-Douglas function.

An extensive contribution of this paper to the field of stochastic frontiers is its approach to estimating technical efficiency at both the industry level and the firm level for the logarithmic case (the Cobb-Douglas functional form in the study). Instead of using the mean of technical inefficiency, E(u_i | ε_i), and calculating the efficiency level as 1 − E(u_i | ε_i) as in Jondrow et al. (1982), the authors suggest that the technical efficiency level should be obtained in the form E[exp(−u_i) | ε_i] in the logarithmic case. The formula for the technical efficiency level is then derived using the properties of the truncated normal distribution of u_i.

A common suggestion from the studies mentioned above is the research direction into the time-varying characteristics of u. Pitt and Lee (1981), based on their empirical evidence, suggest further research into time-varying technical inefficiency. Schmidt and Sickles (1984) also state that firms will recognize their inefficiency level in the long run and change themselves to become more efficient. The lack of long-period data makes Battese and Coelli (1988) assume u to be fixed through time; however, in their hints for future research, they also suggest models that allow inefficiency to vary over time. To relax the inflexible assumption of time-invariant inefficiency, models with time-varying inefficiency arose. The next section considers those models.

4.2 The time varying models

a The model of Cornwell et al (1990)

As mentioned above, the problems of technical inefficiency's distribution and of whether it is uncorrelated with the inputs are treated differently among studies. Once those assumptions about uncorrelatedness and distribution are made, they easily look too strong and become a weakness of the study. Panel data can help relax those assumptions, but at the cost of treating technical inefficiency as fixed through time. Once again, the assumption of time-invariant efficiency is too strong (Cornwell et al., 1990). By regarding the firm effect as a function of time with parameters that alter across firms, Cornwell et al. (1990) create a model that changes the fixed firm effect into a flexible firm effect which can vary over time. The model can be described as:

\[ \ln y_{it} = \alpha_{it} + x_{it}'\beta + v_{it} \]

*Note: Cornwell et al. (1990) use a Cobb-Douglas function.

with α_it the time-varying firm effect, which follows a function of time. The authors propose a quadratic function of time with parameters that vary across firms,

\[ \alpha_{it} = \theta_{i0} + \theta_{i1}\, t + \theta_{i2}\, t^2 \]

which allows firm effects to change across firms and over time. The model can be estimated by the within estimator, GLS or an efficient instrumental variables estimator. The residuals from that estimation are then regressed on a quadratic function of time, and the firm-specific temporal effect is estimated using the coefficients of the latter regression. Using a method similar to the one in Schmidt and Sickles (1984), the authors calculate the firm-specific temporal inefficiency level:

\[ \hat u_{it} = \hat\alpha_t - \hat\alpha_{it}, \qquad \hat\alpha_t = \max_j(\hat\alpha_{jt}) \]

which compares the specific effect of each firm to that of the most efficient firm in that year.

This model adapts well to time-varying technical inefficiency. However, by using parameters that vary across firms (N × 3 parameters), its degrees of freedom are heavily affected in small samples (especially samples with small T). The models of Kumbhakar (1990), Battese and Coelli (1992) and Lee and Schmidt (1993), which include fewer parameters and use a specific distribution of u to capture the time-varying inefficiency of firms, are discussed in the following section.

b The model of Kumbhakar (1990), Battese and Coelli (1992) and Lee and Schmidt (1993)

Kumbhakar (1990) considers time-varying inefficiency as a function of time and of time-invariant inefficiency. The model can be described as:

\[ \ln y_{it} = \alpha + x_{it}'\beta + v_{it} - u_{it} \]

*Note: Kumbhakar (1990) uses a Cobb-Douglas function.

with u_it = b(t) u_i, where u_i is fixed through time but varies across firms and follows a half-normal distribution. The suggested time function is:

\[ b(t) = \left[1 + \exp(bt + ct^2)\right]^{-1} \]

The fact that b(t) ≥ 0 makes u_it always non-negative in the production function. The values of b and c decide whether b(t) is monotonically increasing or decreasing and whether it is concave or convex. Thus, the data used to estimate the model can determine the time behavior of b(t) and also of u_it. We can then easily test the functional form of b(t) by an LR test with the null hypothesis b = 0, c = 0 or b = c = 0. The model is then estimated by the ML method with the likelihood function given in the paper. After estimating b̂(t) and û_i, the temporal technical inefficiency for each firm is calculated as û_it = b̂(t) × û_i.

Following the same idea as Kumbhakar (1990), the model in Battese and Coelli (1992) also treats technical inefficiency as the product of a time function and a time-invariant, firm-specific component, u_it = exp[−η(t − T)] u_i, where the value of η determines the time behavior of technical inefficiency: as t increases, u_it will increase, remain constant or decrease if η < 0, η = 0 or η > 0, respectively. Thus the functional form of technical inefficiency can be decided by the data. This approach, however, is simpler and uses fewer parameters than the one in Kumbhakar (1990).

The model in Lee and Schmidt (1993) replaces the time function of the two previous studies with a set of time dummy variables. The model can be described as:

\[ y_{it} = \alpha + x_{it}'\beta + v_{it} - u_{it}, \qquad u_{it} = \delta_t\, u_i \qquad (2.4.12) \]

*Note: Lee and Schmidt (1993) use a linear function.

with u_i the firm's time-invariant technical inefficiency and δ_t parameters attached to the time dummies. The authors suggest that, by doing this, the time pattern is not restricted to a specific functional form of time. However, the number of parameters can be large (even though smaller than in Cornwell et al. (1990)) when T is large, so the authors recommend this method when the time series is not too long. Since it does not use any distribution of technical inefficiency, the within and GLS estimators are applied to estimate the model.
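For reference, the three time patterns just described can be collected in one line (notation as above; the Lee–Schmidt pattern is written here with the coefficients δ_t on the time dummies):

\[ \text{Kumbhakar (1990): } u_{it} = \left[1+\exp(bt+ct^2)\right]^{-1} u_i; \qquad \text{Battese–Coelli (1992): } u_{it} = \exp[-\eta(t-T)]\, u_i; \qquad \text{Lee–Schmidt (1993): } u_{it} = \delta_t\, u_i. \]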

Generally, the three models mentioned above adapt well to time-varying technical inefficiency and relax the strong time-invariance assumption. However, they do not consider the determinants of technical inefficiency. This matter is first regarded in the study of Pitt and Lee (1981), with the purpose of finding the sources of technical inefficiency. This kind of model, called the technical inefficiency effects model (TIEM), has been developed through the research of Kumbhakar, Ghosh, and McGuckin (1991), Reifschneider and Stevenson (1991) and Huang and Liu (1994). The next section describes the model of Battese and Coelli (1995), a popular model which includes both a stochastic frontier model and a technical inefficiency effects model.

c The model of Battese and Coelli (1995) with technical inefficiency effects model

The model suggested in this paper includes a stochastic frontier model and a technical inefficiency effects model. Theoretically, the stochastic frontier model estimates the technical inefficiency of firms from the data on outputs and inputs, and the technical inefficiency effects model then regresses u_it on other variables which can be considered explanatory factors associated with it. The stochastic frontier model for panel data can be described as:

\[ y_{it} = \exp(x_{it}'\beta + v_{it} - u_{it}) \qquad (2.4.13) \]

*Note: Battese and Coelli (1995) use a Cobb-Douglas function.

with x_it'β the value of a specific functional form, v_it symmetric normally distributed statistical noise, and u_it following a truncated normal distribution with mean z_it'δ and variance σ_u², where z_it is a vector of explanatory variables which can be considered sources of technical inefficiency.

The two models are estimated simultaneously using the maximum likelihood method, given the truncated normal distribution of u_it. In this paper, Battese and Coelli use micro panel data from Indian villages to apply the model. A time variable is included in the stochastic frontier model to take technical progress into account (Hicksian neutral technological change), while a time variable is used in the technical inefficiency model to capture the time-varying character of u_it (here a linear relation). With a procedure similar to their papers of 1988, 1992 and 1993, the authors use a log-likelihood ratio test to examine the existence of technical inefficiency, the functional form of the stochastic frontier model and the technical inefficiency effects model.

Having the ability to take into account the impacts of explanatory factors on technical inefficiency, the Battese and Coelli (1995) model is applied widely to analyze technical efficiency and its determinants. This application is useful for discovering factors that help firms gain efficiency; thus, it becomes a powerful tool for policy recommendation. However, the fact that technical inefficiency can be explained by other factors raises the problem of biases in estimating the whole model. Greene (2005) seriously criticizes this matter and proposes a new approach, called the "true" fixed effects model and the "true" random effects model. Those models are described in detail below.

d “True” fixed effects model and “true” random effects model (Greene, 2005)

Greene (2005) mentions two shortcomings of the fixed effects and random effects approaches mentioned above. The first is the strong assumption about the time pattern of technical inefficiency. The models of Pitt and Lee (1981), Schmidt and Sickles (1984) and Battese and Coelli (1988) all assume time-invariant technical inefficiency. This assumption becomes too strong, especially when using panel data with a large number of time periods. Cornwell et al. (1990) propose a model to deal with this problem by using a time function with parameters that change across firms; in this way, the number of parameters is large and this makes the model inefficient. Later papers such as Battese and Coelli (1992), Lee and Schmidt (1993) and Kumbhakar (1990) solve the problem by adding a time-behavior function b(t), so that u_it = b(t) u_i. Despite the various functional forms of b(t), the assumption that technical inefficiency follows a specific time pattern, again, seems too strong (Greene, 2005).

The second matter is the assumption that u is uncorrelated with the other variables in the model. The fixed effects models described earlier do not need this assumption: by imposing a separate intercept for each firm, they allow the "firm effect" to be correlated with the other variables. However, Greene criticizes that, by doing this, one can only compute technical inefficiency by comparison with the "best" firm in the sample. Besides, he also states that the "firm effect" from this method includes heterogeneity that is not related to inefficiency. In his view, technical inefficiency need not contain time-invariant effects and should vary freely through time (Greene, 2008). This kind of heterogeneity is also considered in the study of Farsi, Filippini, and Kuenzle (2003) as factors beyond the control of firms. The authors give examples of factors that belong to the business environment (for example, network effects in network industries) or relate to output characteristics, such as the severity of illness in the healthcare industry or demand fluctuations in electricity utilities. In the healthcare industry, different hospitals must treat different kinds of disease with different severity levels. If we take the number of lives saved by a hospital as the representative output, then this number will be low in hospitals which mainly treat minor diseases and vice versa; thus, those hospitals will appear closer to the frontier than others treating major diseases. One could conclude that the hospitals treating minor diseases are more efficient than the hospitals treating major diseases, yet if we let them cure the same number of patients with the same level of severity, we cannot be sure which group would save more. Therefore, taking those effects into the technical efficiency level is not reasonable. From those ideas, Greene (2005) suggests the "true" fixed and random effects models, which separate that latent heterogeneity from inefficiency.

The "true" fixed effects model is described as:

\[ \ln y_{it} = \alpha_i + x_{it}'\beta + v_{it} - u_{it} \qquad (2.4.14) \]

with α_i the firm-specific constant. Greene also introduces a maximization method that allows all coefficients to be estimated simultaneously by maximum likelihood, given a specific distribution of u_it. Meanwhile, the "true" random effects model can be written as:

\[ \ln y_{it} = (\alpha + w_i) + x_{it}'\beta + v_{it} - u_{it} \qquad (2.4.15) \]

*Note: the functional forms of the production function in equations (2.4.14) and (2.4.15) are the ones used in Greene (2005) to introduce the models; in his examples, he uses the Cobb-Douglas form.

with w_i a random constant term that varies across firms.
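To make the contrast with the earlier fixed effects treatment explicit (a summary in the notation used above, not a display from the thesis): in Schmidt and Sickles (1984) the whole firm-specific intercept is read as inefficiency, whereas in Greene's "true" fixed effects model the intercept absorbs time-invariant heterogeneity and inefficiency is left in a separate time-varying term:

\[ \text{Schmidt–Sickles: } \ln y_{it} = \alpha_i + x_{it}'\beta + v_{it}, \quad \hat u_i = \max_j \hat\alpha_j - \hat\alpha_i; \qquad \text{Greene TFE: } \ln y_{it} = \alpha_i + x_{it}'\beta + v_{it} - u_{it}. \]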

As we can see, the panel data models mentioned above can be divided into two groups by type of approach. Those with a fixed effects approach, such as the fixed effects model in Schmidt and Sickles (1984), Cornwell et al. (1990), Lee and Schmidt (1993) and Greene (2005), do not require assumptions on the uncorrelatedness between technical inefficiency and the other parts of the model. Those with a random effects approach, such as the models in Schmidt and Sickles (1984) (random effects model), Pitt and Lee (1981), Battese and Coelli (1988, 1992, 1995), Kumbhakar (1990) and the "true" random effects model in Greene (2005), require technical inefficiency to be uncorrelated with the rest of the model. One can also divide them into time-invariant and time-varying models, as mentioned above. Coming from different points of view, they have their own strengths and weaknesses that make them suitable for different situations.

The literature on the cross-sectional stochastic frontier model is quite settled, and its simplicity and weaknesses attract less attention. As the need for a more careful analysis of the nature of firm efficiency arises and data in panel form are now widely attainable, researchers tend to rely on panel data models despite their complexity. In line with the intention stated for this thesis, I will only apply the panel data models in my empirical process. The purpose is to compare them with each other and to find out the differences that make them more or less suitable in specific circumstances. The methodology is described in detail in the section below.


CHAPTER III: METHODOLOGY

1 Overview of Vietnamese metal manufacturing industry

Firms in the sample used in this study are divided into two categories according to their main products: basic metal manufacturing firms and fabricated metal manufacturing (except machinery and equipment) firms. Details of this classification can be found in the International Standard Industrial Classification of All Economic Activities (ISIC), Revision 4. Firms in the first group are involved in the activities of smelting and refining ferrous and non-ferrous metals. Those firms use metallurgical techniques with materials from the mining industry such as metal ore, pig iron or scrap. Taking part in this industry requires large investments in physical assets; thus, in this sample of small and medium enterprises, this group accounts for only 9% of the firms. The second group manufactures structural metal products, metal container-type objects and steam generators. Producing more popular products, this group accounts for about 91% of the firms.

The metal manufacturing industry in Viet Nam has much potential due to the high demand for metal products for daily use, production and construction. In Viet Nam's young, developing economy, the metal manufacturing industry is still immature and most products are used for construction. According to the World Steel Association, about 80% of iron and steel materials are used for construction. Besides, the rising domestic demand for metal materials in the manufacturing of machinery, motors, automobiles and other consumer goods can also be considered an important condition for the development of the metal manufacturing industry. However, along with the depressed state of the Vietnamese economy in recent years, the metal manufacturing industry also faces many difficulties. The cut in public construction investment due to the government budget deficit strongly decreases the demand for metal materials for construction. According to the Vietnamese Steel Association, steel consumption fell about nine percent in 2012. The rising prices of inputs such as electricity, water and labor also impose many hardships on this industry.

Due to the importance of the metal manufacturing industry, this study is conducted with the objective of analyzing the technical efficiency level of firms in this sector. However, most observations in this sample are micro and household firms (74.5%), and medium-sized firms account for only four percent. Moreover, due to data availability, the dataset used here covers the period from 2005 to 2009, during which economic conditions may have differed from the current situation. For those reasons, there is a high probability of sample bias if this study draws conclusions about the industry (the population). Thus, as a reminder to readers about the precision of the conclusions from this study, the results should be considered carefully when used for the purpose of policy recommendation.

2 Analytical framework

The panel-data stochastic frontier models presented in Chapter II are applied to estimate the technical inefficiency level of Vietnamese SMEs in the metal manufacturing industry. The production function for the model is estimated with input and output data in two different functional forms – Cobb-Douglas and Translog. For the case of the technical inefficiency effects model in Battese and Coelli (1995), a group of firm-specific variables is added to the model; those variables can be considered sources or determinants of technical inefficiency. The results are then compared among models to find out the impact of each assumption and model specification on the way technical efficiency is determined.

3 Research method

3.1 Estimating technical inefficiency:

Firm efficiency will be calculated with the stochastic frontier model. As noted above, an important step in using the stochastic frontier model is choosing a suitable functional form to build up the frontier. Several types of production function can be considered, such as the Linear, Cobb-Douglas, Quadratic, Normalized Quadratic, Translog, Generalized Leontief and Constant Elasticity of Substitution (CES) forms (see Griffin, Montgomery, and Rister (1987) for a review). Coelli et al. (2005) emphasize that a good functional form should be flexible, linear in parameters, regular and parsimonious. An i-th-order flexible functional form is one that has enough parameters for an i-th-order differential approximation. Among the functions mentioned above, the Linear and Cobb-Douglas forms are first-order flexible; all the rest are second-order flexible. Most of the production functions considered above are linear in parameters; the Cobb-Douglas and Translog functions become linear in parameters when we take the logarithm of both sides of the equation. A regular functional form is one that satisfies the economic regularity properties of a production function by its own nature or with some simple restrictions. Finally, a parsimonious function can be understood as the simplest function which can adequately solve the problem. A flexible function is less likely to impose assumptions or restrictions on the properties of the production function, while a parsimonious function saves degrees of freedom. In choosing the functional form, researchers always face the trade-off between flexibility and parsimony.

Researchers usually choose Cobb-Douglas for its parsimony (and sometimes tractability), while choosing Translog for its flexibility. Being less flexible, the Cobb-Douglas functional form imposes constant production elasticities and a constant elasticity of factor substitution, while the Translog functional form does not; the Translog form therefore makes the properties of the production function testable and is considered more realistic and less restrictive. Nonetheless, it still has some weaknesses. The cross and squared terms in the Translog model increase the number of parameters, and correlation among those terms is highly likely. Furthermore, if the number of observations is not large enough, this increase in parameters reduces the degrees of freedom. For comparison, the Cobb-Douglas and Translog functions can be described in a specification with four inputs – capital (K), labor (L), materials (M) and indirect cost (I) – as below:

Cobb-Douglas functional form:

\[ \ln Y_{it} = \beta_0 + \beta_1 \ln K_{it} + \beta_2 \ln L_{it} + \beta_3 \ln M_{it} + \beta_4 \ln I_{it} + v_{it} - u_{it} \]

Translog functional form:

\[ \ln Y_{it} = \beta_0 + \beta_1 \ln K_{it} + \beta_2 \ln L_{it} + \beta_3 \ln M_{it} + \beta_4 \ln I_{it} + \beta_5 (\ln K_{it})^2 + \beta_6 (\ln L_{it})^2 + \beta_7 (\ln M_{it})^2 + \beta_8 (\ln I_{it})^2 + \beta_9 \ln K_{it}\ln L_{it} + \beta_{10} \ln K_{it}\ln M_{it} + \beta_{11} \ln K_{it}\ln I_{it} + \beta_{12} \ln L_{it}\ln M_{it} + \beta_{13} \ln L_{it}\ln I_{it} + \beta_{14} \ln M_{it}\ln I_{it} + v_{it} - u_{it} \]

With the subscripts, i denotes firms and t denotes time periods; Y is output; K is capital input; L is labor input; M is materials; I is indirect costs; v_it stands for statistical noise, which follows N(0, σ_v²); u_it stands for technical inefficiency, which follows one of the non-negative distributions mentioned above.

Obviously, without the squared and interaction terms, the Translog function becomes the Cobb-Douglas function. The Cobb-Douglas functional form has constant proportionate returns to scale and a constant elasticity of factor substitution, and all pairs of inputs are assumed to be complementary. Those assumptions make it more restrictive.
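To see one of these restrictions concretely (a short derivation added for illustration, using the coefficient numbering of the two forms written above): in the Cobb-Douglas form the output elasticity of capital is a constant, whereas in the Translog form it varies with the input mix, which is precisely what the second-order terms allow us to test:

\[ \frac{\partial \ln Y_{it}}{\partial \ln K_{it}} = \beta_1 \;\text{(Cobb-Douglas)}, \qquad \frac{\partial \ln Y_{it}}{\partial \ln K_{it}} = \beta_1 + 2\beta_5 \ln K_{it} + \beta_9 \ln L_{it} + \beta_{10} \ln M_{it} + \beta_{11} \ln I_{it} \;\text{(Translog)}. \]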

A likelihood ratio test (LR) can be used to test for goodness of fit between these two functional forms with:


H_0: β_5 = β_6 = β_7 = β_8 = β_9 = β_10 = β_11 = β_12 = β_13 = β_14 = 0;
H_1: otherwise.

Where ln L(H_0) is the log-likelihood value of the null model (H_0) and ln L(H_1) is the log-likelihood value of the alternative model (H_1), the test statistic is given by:

\[ LR = -2\left[\ln L(H_0) - \ln L(H_1)\right] \]

The test statistic approximately follows a chi-squared distribution with degrees of freedom (df) equal to the difference between the df of the null model and the df of the alternative model.
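As a worked example of this rule for the particular comparison above (illustrative only; no results from the thesis are used): the Translog adds the ten second-order coefficients β_5, …, β_14, so

\[ LR = -2\left[\ln L_{CD} - \ln L_{TL}\right] \sim \chi^2_{10}, \qquad \chi^2_{10,\,0.95} \approx 18.31, \]

and the Cobb-Douglas restriction is rejected at the 5% level whenever the computed LR statistic exceeds 18.31.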

In this thesis, two computer programs are used to estimate the stochastic frontier models: STATA and FRONTIER 4.1. Commands for estimating stochastic frontier models in STATA have developed considerably since the model's appearance: "frontier" for cross-sectional data and "xtfrontier" for panel data are the standard STATA commands for estimating technical efficiency. "frontier" can deal with models whose u follows a half-normal, truncated normal, exponential or gamma distribution, while "xtfrontier" can treat the models in Battese and Coelli (1988) (time-invariant) and Battese and Coelli (1992) (time-varying). With only those two commands, users face many troubles testing other models. Fortunately, thanks to the recent paper of Belotti, Daidone, Ilardi, and Atella (2012) and their contribution in building the "sfcross" and "sfpanel" commands, STATA now grants full capability to use the models mentioned above with simple syntax. The second program is FRONTIER 4.1 from Coelli (1996), which can deal with the models in Battese and Coelli (1992) and Battese and Coelli (1995).
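As an illustration of how these commands fit together (a sketch only: the variable names lny, lnk, lnl, lnm, lni, z1 and z2 are hypothetical, the panel is assumed to be xtset as shown earlier, and the sfpanel model names and options are given as I recall them from Belotti et al. (2012); consult that paper for the exact syntax):

    * cross-sectional frontier, half-normal inefficiency (Aigner et al., 1977)
    frontier lny lnk lnl lnm lni, distribution(hnormal)
    * time-invariant truncated-normal model (Battese & Coelli, 1988)
    xtfrontier lny lnk lnl lnm lni, ti
    * time-varying decay model (Battese & Coelli, 1992)
    xtfrontier lny lnk lnl lnm lni, tvd
    * further models via the user-written sfpanel command, e.g. the
    * Battese & Coelli (1995) inefficiency-effects model (z1, z2 are
    * hypothetical determinants) and Greene's "true" effects models
    sfpanel lny lnk lnl lnm lni, model(bc95) emean(z1 z2)
    sfpanel lny lnk lnl lnm lni, model(tfe)
    sfpanel lny lnk lnl lnm lni, model(tre)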

For the logarithmic functional forms (Cobb-Douglas and Translog), technical efficiency is calculated from the value of u_it in two different ways. The first method is the one in Jondrow et al. (1982), which gives the formula TE_it = exp[−E(u_it | ε_it)]. The second method is the one in Battese and Coelli (1988), which suggests the formula TE_it = E[exp(−u_it) | ε_it]. TE calculated from either equation is a positive value less than one. It is the ratio of actual output to the maximum level of output when there is no inefficiency:

\[ TE_{it} = \frac{y_{it}}{y_{it}^*} = \frac{\exp(x_{it}'\beta + v_{it} - u_{it})}{\exp(x_{it}'\beta + v_{it})} \]

It means TE is a comparison between the output of the real firm and the output of a fully efficient firm.
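In Stata, both measures are available as postestimation predictions after frontier or xtfrontier; a minimal sketch, assuming the standard postestimation options and the hypothetical variable names used above:

    * Jondrow et al. (1982) point estimate of inefficiency, E(u|e)
    predict u_jlms, u
    * Battese & Coelli (1988) technical efficiency, E[exp(-u)|e]
    predict te_bc, te
    * the Jondrow et al. efficiency measure can then be formed as exp(-E(u|e))
    generate te_jlms = exp(-u_jlms)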


The estimation process for the time-invariant fixed and random effects models as in Schmidt and Sickles (1984) is similar to panel regressions with fixed and random effects, so those models are estimated using least squares methods (the within estimator and GLS). After estimation, each firm's effect is compared with the highest in the sample and inefficiency is estimated as û_i = max_j(α̂_j) − α̂_i, i.e. by comparison with the best firm in the sample, which has a technical efficiency level of 100% (u = 0). This is also the procedure for the other models estimated by least squares methods, such as the ones in Cornwell et al. (1990) and Lee and Schmidt (1993). Those fixed effects models include a large number of parameters, which gives rise to biases coming from the incidental parameters problem. In particular, the model in Cornwell et al. (1990) includes N × 3 parameters in the time function of technical inefficiency; so, for the dataset used in this study, which has only 3 time periods, this model cannot be applied. Moreover, the command created by Belotti et al. (2012) for the model in Lee and Schmidt (1993) does not calculate the technical efficiency level of firms precisely: the model in Lee and Schmidt (1993) compares that level to the highest level in each year, while the command compares it to the highest level across all years, which leads to some confusion. Thus the technical efficiency level calculated by this model will not be shown in our results.

The models in Pitt and Lee (1981) and Battese and Coelli (1988) are both estimated by the maximum likelihood method. The difference comes from their distributional assumptions: the former assumes u_i is half-normally distributed, while the latter assumes a truncated normal distribution. The likelihood functions of these two models can be found in their original papers. The latter is more general, since it includes one more parameter – μ, the mean of the normal distribution that is truncated; the former is essentially a special case of the latter when μ = 0.

The models in Kumbhakar (1990) and Battese and Coelli (1992) share some properties. Both are estimated with the maximum likelihood method and treat technical inefficiency as a function of time. The former uses the time function u_it = b(t) u_i with b(t) = [1 + exp(bt + ct²)]⁻¹, where u_i is fixed through time but differs across firms and follows a half-normal distribution, while the latter uses the function u_it = η_t u_i with η_t = exp[−η(t − T)] and u_i ~ |N(μ, σ_u²)| (a normal distribution truncated at zero). Those functional forms let the data decide the time behavior of u_it. The one in Battese and Coelli (1992) is simpler to calculate, but the one in Kumbhakar (1990) is more flexible in showing the dynamics of technical inefficiency.
