J Prod Anal (2017) 47:83–101
DOI 10.1007/s11123-017-0491-9
An international comparison of educational systems: a temporal
analysis in presence of bad outputs
Published online: 25 January 2017
© Springer Science+Business Media New York 2017
index to measure performance change in the educational
systems of 29 countries/economies participating in PISA
2003 and 2012 for students at age 15 in the disciplines of
mathematics and reading. This methodology is particularly
appropriate both for its desirable properties and for its
suitability for the educational context. Results indicate a
during this period. This improvement is mainly due a
technological change observed. Nevertheless, a deeper
scrutiny at the country level shows that results varied
remarkably among countries.
1 Introduction
In a world characterized by rapid technological change and the importance of innovation processes, the level of academic attainment that students can achieve is essential to improving the levels of wealth and welfare of the citizens in
therefore unsurprising to see a growing concern about the
critical to effective planning of educational policies, and the assessment of educational reforms.
In this vein, the OECD Programme for International Student Assessment (PISA) recently published the results
carried out every three years and evaluates education
skills and knowledge; it also provides vital information on other relevant factors (related to students' background, school system and the learning environment) that can affect the learning process.
While the results obtained by a given country in a standardized test (such as PISA, or the Trends in International Mathematics and Science Study, TIMSS) are a good
cannot be regarded as a performance indicator for their educational systems and, therefore, their school authorities. The main limitations of these standardized international tests are as follows: (i) the assessment of an organization's performance (in this particular case, a country) does not
* Emili Tortosa-Ausina
tortosa@uji.es
1 Universitat Autònoma de Barcelona, Bellaterra (Barcelona) 08193,
Spain
2 Universidad Diego Portales, Avenida Santa Clara 797,
Huechuraba, Santiago, Chile
3 Universitat Jaume I, Avenida de Vicent Sos Baynat, s/n, Castellón
12071 Castelló, Spain
1 Around 510,000 students in 65 economies took part in the PISA 2012 assessment of reading, mathematics and science, representing about 28 million 15-year-olds globally. Complete information about PISA and databases can be found at https://www.oecd.org/pisa/
depend exclusively on outcome variables; instead, we can
of the educational process; the results achieved (output)
during this process are a consequence of the resources used,
the process itself, and environmental variables beyond
for a given country, the measure of the results of the
educational process should not be constrained to the knowledge
students acquire at school, but should also include other
outcomes such as the standard deviation of test scores (an
undesirable outcome of the educational process, in terms of
educational inequality); and (iii) when measuring students'
educational achievements at a given point in time, it is
difficult to disentangle how much achievement is attributable to
the student herself, to her family, or to the strategies applied
by previous educational authorities.
Consistent with this, over the last few years there has
been a growing interest in assessing and comparing the
performance of educational systems in different countries
this issue considered aggregate data for different samples of
countries participating in international tests. These include
using Data Envelopment Analysis (DEA) to analyze the
systems for 31 countries with data from TIMSS 1999
Specifically, these authors apply directional distance functions
variables to resource variables used in the educational
outputs of academic achievement and bad (or undesirable)
outputs arising from educational inequality. Their results
show that it is feasible for a higher education system to
simultaneously, obtain low inequality levels; however, they found that in most instances both dimensions required
A second approach includes studies that compare the performance of educational systems in different countries using either school- or student-level data. The study by
school performance using the metafrontier framework to
schools of 16 European countries participating in PIRLS
2011. They also consider an extension of the conditional nonparametric robust approach to test for the potential
findings are that rankings of countries based on academic
controlling for data on school inputs involved in the educational process, and that heterogeneity across countries is more relevant than among schools. In addition to this, the study
Latin American countries by means of corrected ordinary
European countries participating in PISA 2003.
However, to obtain a fuller evaluation of educational
above, could constitute a third limitation of previous research initiatives. Measuring this change is critical, since
achievement needs to be measured, but also their progress, and how much of this progress is attributable to the educational system itself or to external factors. This particular research area in education economics refers to these measures as growth studies, which require at least two evaluations at different points in time.
To our knowledge, only two studies have analyzed how performance has changed over time, as well as the
2016a). Agasisti (2014) uses data from PISA to compare
score is considered as the output of the education processes
Data Envelopment Analysis (DEA) approach. In a second
variables. Finally, Malmquist indexes are calculated to
stable because of the action of two contrary forces: a slight
2 Recent literature reviews on efficiency in education include De
Witte and López-Torres (2017), Johnes (2015), Grosskopf et al.
(2014), Emrouznejad et al. (2010) and, a bit more distant in time,
Johnes (2004) and Worthington (2001). In several of these studies,
among other issues, the authors thoroughly review the studies that
have dealt with the issue of efficiency in education, listing the inputs,
outputs and environmental/contextual variables, considering the
different levels of analysis (university, school/high school, district/
county/city, or country), as well as the different methodological
approaches. In addition, some authors (De Witte and López-Torres
2017) make an explicit attempt to link the standard economics of
education literature and the (nonparametric) efficiency literature.
3 See also the recent contribution by Aparicio et al. (2016a), in which
the Malmquist index is applied to different samples of PISA data
(2006, 2009 and 2012).
Aparicio et al. (2016a) also use data from PISA 2006, 2009
and 2012 to compare the performance of public and private
government-dependent secondary schools in the Basque
Country (an Autonomous Region of Spain). These authors
propose a new pseudo-panel Malmquist index to analyze
results suggest that performance was persistently and
review on international performance of education systems.
Therefore, in accordance with the rationale presented above,
desirable properties of a good education system would include
not only its ability to obtain high average academic achievement
among its students, but also to ensure that all its students make
progress. To achieve this, strategies must be developed that
enable relatively disadvantaged students to also make progress
and reach basic standards. Hence, an educational system that
evolves satisfactorily will be one that improves the average
student academic achievement while simultaneously
minimizing the standard deviation of test scores (the educational
inequality). Similarly, changes in the endowment of resources
used by the system will indicate whether the changes in the
level of educational achievement (either positive or negative)
are due to technical change, which might be attributable to an
improvement in the educational resources available, or to
To explore these issues, researchers have proposed a
variety of measures to evaluate performance change over
related proposals (closer to the ones we will consider here)
(to achieve educational objectives), in this study we model
both good and bad outputs, using the global non-radial
Malmquist index (hereafter GNRMI), similar to the
appropriate for its highly desirable properties; it also suits
our context because it incorporates bad outputs which,
ideally, educational systems should minimize while
simultaneously maximizing the good/desirable outputs.
The global non-radial Malmquist index is used to measure
performance change in the educational systems of 29
countries (21 OECD countries and 8 OECD partner countries)
participating in PISA 2003 and 2012 for students at age 15 in
the disciplines of mathematics and reading. The results can
be interpreted globally or by evaluating the decomposition of
the global non-radial Malmquist index into its two
change (EC). On average, results indicate a positive
evolution in educational performance between 2003 and 2012,
the negative technological change observed. Nevertheless, results also varied remarkably among countries.
The paper is organized as follows. After this introduction, Section 2 describes the methodological aspects of the global non-radial Malmquist index and its decomposition to evaluate the performance of education systems over time. The data used for the analysis of educational systems are presented in Section 3. The main results are presented in Section 4, and Section 5 outlines the principal conclusions.
2 Methodology
2.1 Modeling the educational performance over time
used to explain the change in factor productivity as a result of
apply it to the case of directional distance functions (DDF)
that incorporate the environmental impact of the units analyzed by considering the bad or undesirable outputs of the
the Malmquist-Luenberger Index (ML). The application of the ML has often been related to radial expansions of good
probably to avoid the problems of translation invariance
concept of non-radial directional distance function (NDDF) where potential improvements are determined individually for each good and bad output as well as for each input. Dynamic analysis has also been applied to NDDF in Zhang
Most of the former temporal indices suffer from two
is not assured. This property refers to the fact that the change
in productivity over a period can be explained by the product
of changes in productivity in the different sub-periods within
it. Secondly, there is a possibility of infeasibilities in the calculation of the cross-distance functions necessary to
condition that technical change be Hicks-neutral to ensure
ensure the absence of feasibility problems (Xue and Harker
Malmquist index known as the global Malmquist index
Similarly, Oh (2010) adapted the Malmquist-Luenberger index to achieve the same properties, leading to the global Malmquist-Luenberger index. This paper proposes a global non-radial Malmquist index similar to that proposed by
temporal analysis of education system performance. The reason for the choice of this index is that, apart from its desirable properties, it incorporates bad outputs, which educational systems should minimize while maximizing the good outputs. This index is therefore particularly
Let k be the countries with available information on their educational systems for t years, where m good outputs were produced, and h bad outputs generated from the
$$
T = \Big\{ (X, Y, B) : \sum_{j=1}^{k} \lambda_j Y_j \ge Y;\ \sum_{j=1}^{k} \lambda_j X_j \le X;\ \sum_{j=1}^{k} \lambda_j B_j \le B;\ \lambda \ge 0 \Big\} \tag{1}
$$
Various approaches to integrate the undesirable outputs
The most popular approach is probably to consider the bad outputs as weakly disposable (basically modifying the restrictions in order to accept proportional reductions in the bad as well as in the good outputs). For more details on this
However, the debate on the problems and the solutions of this option is far from over; see, for instance, Kuosmanen
others. Another possibility is to convert the undesirable bad outputs into desirable (i.e. strongly disposable) good
that perhaps the most intuitive option is to consider the bad outputs as strongly disposable inputs. Because of its simplicity, this option was selected in our proposal.
measured by the following non-radial directional distance
(2)
attainable increases in the good outputs as well as the maximum decreases for both bad outputs and inputs over
representing inefficiency measures for inputs and outputs.
This approach will lead to an evaluation where each
educational system will be assessed in the direction that is more
favorable to it, without assuming ex-ante any desirable
approaching direction towards the frontier. Consequently, the
or strategy. In the case of analyzing the performance of
educational systems, an output-oriented approach seems to be
maximizing its educational level with the available resources.
Our main focus is then on the outputs side. For this reason the
directions for improvement for both types of outputs. Finally,
introduction of some value judgments on the importance of
the outputs. The weights are usually assumed to be equal for
each input and/or output, as we assume in our case.
Nevertheless, there are other alternatives, such as the one proposed
weight to each category of inputs and outputs and then
distributing them equally among the number of variables
included in each category. For this reason, the maximum
degree of generality for the formulation of weights has been
improving either good or bad outputs in the g direction and
consequently the educational system is located on the frontier
order to facilitate comparisons with a conventional distance
each country as follows:
v ¼1βb
r¼1βy
rwyr
ð3Þ
for the good and bad outputs, respectively. Clearly,
equal to one, while the smaller the values, the greater the
distance to the frontier. This index will also allow us to
propose a global non-radial Malmquist index based on the
analysis
In our empirical application the inputs and outputs,
fixed when it can be assumed that it remains constant to
scale their underlying volume variables by a non-negative
DEA has important implications, especially in modeling the
returns to scale exhibited by the technology (Golany and
variable returns to scale (VRS) in the cases where ratio
variables are present. The rationale is that, under constant returns to scale (CRS), proportionality in the variation
of inputs and outputs is also assumed when increasing or decreasing the size of a decision-making unit, something that does not occur when a ratio is scaled by a constant. Instead, by assuming VRS this problem is mitigated since the need for scaling is lower. However, only recently have
variables in DEA. They proposed several solutions to properly model CRS and VRS when ratio variables are present. In the latter case (VRS), the proposal converges with the FDH technology when all the model variables are ratios, as in our case. For this reason, we have considered this technology by
models are especially appropriate when the convexity
convexity of production correspondences in economic theory is
importance of indivisibilities in selecting the technology. This argument has often been applied against using convex
Apart from the technical reasons arising from the presence of ratio variables, when comparing countries the convexity assumption is probably more debatable from a conceptual point of view. Although it is well known that the discriminant capacity of FDH models is generally reduced for small samples, we consider it to be the most appropriate methodological alternative given the variables used and the nature of the units analyzed. However, in order to check the robustness of the results, we also made the calculations under the DEA-VRS technology. A similar approach has
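$$
\begin{aligned}
\vec{D}^{u}(X^{p}, Y^{p}, B^{p}; g) = \max\ & \sum_{r=1}^{m} w^{y}_{r}\beta^{y}_{r} + \sum_{v=1}^{h} w^{b}_{v}\beta^{b}_{v} + \sum_{s=1}^{n} w^{x}_{s}\beta^{x}_{s} \\
\text{s.t.}\quad & \sum_{j=1}^{k} \lambda_{j} y^{u}_{rj} \ge y^{p}_{ro} + \beta^{y}_{r} g_{yr}, \quad r = 1, \dots, m \\
& \sum_{j=1}^{k} \lambda_{j} b^{u}_{vj} \le b^{p}_{vo} - \beta^{b}_{v} g_{bv}, \quad v = 1, \dots, h \\
& \sum_{j=1}^{k} \lambda_{j} x^{u}_{sj} \le x^{p}_{so} - \beta^{x}_{s} g_{xs}, \quad s = 1, \dots, n \\
& \sum_{j=1}^{k} \lambda_{j} = 1 \\
& \beta^{y}_{r}, \beta^{b}_{v}, \beta^{x}_{s} \ge 0; \quad \lambda_{j} \in \{0, 1\}
\end{aligned}
\tag{4}
$$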
where $\lambda_j$ is the intensity vector and $y^{u}_{rj}$, $b^{u}_{vj}$ and $x^{u}_{sj}$ are the good output r,
bad output v and input s, respectively, for unit j in year u
the following linear programming problem:
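$$
\begin{aligned}
\vec{D}^{g}(X^{p}, Y^{p}, B^{p}; g) = \max\ & \sum_{r=1}^{m} w^{y}_{r}\beta^{y}_{r} + \sum_{v=1}^{h} w^{b}_{v}\beta^{b}_{v} + \sum_{s=1}^{n} w^{x}_{s}\beta^{x}_{s} \\
\text{s.t.}\quad & \sum_{j=1}^{k}\sum_{u=1}^{t} \lambda^{u}_{j} y^{u}_{rj} \ge y^{p}_{ro} + \beta^{y}_{r} g_{yr}, \quad r = 1, \dots, m \\
& \sum_{j=1}^{k}\sum_{u=1}^{t} \lambda^{u}_{j} b^{u}_{vj} \le b^{p}_{vo} - \beta^{b}_{v} g_{bv}, \quad v = 1, \dots, h \\
& \sum_{j=1}^{k}\sum_{u=1}^{t} \lambda^{u}_{j} x^{u}_{sj} \le x^{p}_{so} - \beta^{x}_{s} g_{xs}, \quad s = 1, \dots, n \\
& \sum_{j=1}^{k}\sum_{u=1}^{t} \lambda^{u}_{j} = 1 \\
& \beta^{y}_{r}, \beta^{b}_{v}, \beta^{x}_{s} \ge 0; \quad \lambda^{u}_{j} \in \{0, 1\}
\end{aligned}
\tag{5}
$$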
should be removed from linear programs (4) and (5)
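As a rough illustration of program (4) under the FDH technology, the sketch below evaluates one educational system by enumeration: because the intensities are binary and sum to one, each feasible solution picks a single benchmark unit, and for that benchmark every β can be maximised separately. The output orientation (inputs only constrain the benchmark and carry no β term), the function name `fdh_nddf`, and all data values are assumptions made for this example, not the authors' implementation.

```python
import numpy as np


def fdh_nddf(x_o, y_o, b_o, X, Y, B, w_y, w_b, g_y, g_b):
    """Output-oriented non-radial DDF against an FDH reference set.

    X, Y, B hold one reference unit per row (inputs, good outputs, bad outputs).
    Returns (score, beta_y, beta_b), or None if no reference unit dominates
    the evaluated unit (an infeasible cross-period evaluation).
    """
    x_o, y_o, b_o = (np.asarray(a, dtype=float) for a in (x_o, y_o, b_o))
    X, Y, B = (np.asarray(a, dtype=float) for a in (X, Y, B))
    w_y, w_b, g_y, g_b = (np.asarray(a, dtype=float) for a in (w_y, w_b, g_y, g_b))
    best = None
    for x_j, y_j, b_j in zip(X, Y, B):
        # Binary lambdas summing to one select exactly one benchmark unit j.
        dominates = (np.all(x_j <= x_o) and np.all(y_j >= y_o)
                     and np.all(b_j <= b_o))
        if not dominates:
            continue
        # For a fixed benchmark, each beta is maximised independently.
        beta_y = (y_j - y_o) / g_y
        beta_b = (b_o - b_j) / g_b
        score = float(w_y @ beta_y + w_b @ beta_b)
        if best is None or score > best[0]:
            best = (score, beta_y, beta_b)
    return best


# Hypothetical data: 3 systems, 1 input, 2 good outputs (mean scores),
# 2 bad outputs (score standard deviations).
X = [[6.0], [6.0], [7.0]]
Y = [[500.0, 490.0], [520.0, 500.0], [540.0, 530.0]]
B = [[90.0, 95.0], [85.0, 90.0], [80.0, 85.0]]
w = [0.25, 0.25]  # equal weights across the four outputs (assumption)

# Evaluate the first system, using its own observed values as direction g.
score_a, beta_y_a, beta_b_a = fdh_nddf(X[0], Y[0], B[0], X, Y, B,
                                       w, w, g_y=Y[0], g_b=B[0])
```

A global reference set in the spirit of program (5) could be obtained by stacking the rows of all years (for instance with `np.vstack`) before calling the same function; a contemporaneous evaluation passes only the rows of one year.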
(GNRMI) as:
$$
\frac{FPI^{g}(t+1)}{FPI^{g}(t)} = \frac{FPI^{t+1}(t+1)}{FPI^{t}(t)} \times \left[ \frac{FPI^{g}(t+1) / FPI^{t+1}(t+1)}{FPI^{g}(t) / FPI^{t}(t)} \right] = EC \times BPC \tag{6}
$$
than one means an increase in productivity, while a value
less than unity shows a decline in productivity during the
other words, the unit is closer to its contemporary frontier in
inversely. The term BPC (best-practice gap change) is a
measure of technological change in the period, that is, of
how contemporary frontiers have shifted in the period with
respect to the global frontier.
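The arithmetic of this decomposition can be sketched in a few lines. The snippet assumes that the index is the ratio of the performance index measured against the global frontier in t+1 and in t, that EC compares the contemporaneous indices, and that BPC compares the gaps between each contemporaneous frontier and the global one; the function name and the four input values are hypothetical.

```python
def gnrmi_decomposition(fpi_t_t, fpi_t1_t1, fpi_g_t, fpi_g_t1):
    """Return (GNRMI, EC, BPC) from four performance indices.

    fpi_t_t   : index of the period-t observation against the period-t frontier
    fpi_t1_t1 : index of the period-(t+1) observation against its own frontier
    fpi_g_t   : index of the period-t observation against the global frontier
    fpi_g_t1  : index of the period-(t+1) observation against the global frontier
    """
    ec = fpi_t1_t1 / fpi_t_t                             # efficiency change
    bpc = (fpi_g_t1 / fpi_t1_t1) / (fpi_g_t / fpi_t_t)   # best-practice gap change
    gnrmi = fpi_g_t1 / fpi_g_t                           # overall change
    return gnrmi, ec, bpc


# Hypothetical values for one educational system.
gnrmi, ec, bpc = gnrmi_decomposition(fpi_t_t=0.90, fpi_t1_t1=0.95,
                                     fpi_g_t=0.80, fpi_g_t1=0.88)
```

In this made-up case all three values exceed one, so the system would have moved closer both to its contemporaneous frontier and to the global frontier; the product of EC and BPC reproduces the overall index by construction.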
2.2 Bipartite decomposition of the relative contributions
to educational performance
In accordance with the expressions detailed in the previous section, the global non-radial Malmquist index (GNRMI) is decomposed into efficiency change (EC) and best-practice gap change (BPC). Apart from analyzing how the different components contribute to the overall change of GNRMI on average, we can also consider a distribution dynamics approach to analyze what the largest contributors
to the variation in performance are, as measured by GNRMI
use nonparametric density estimation, based on kernel smoothing.
We rewrite expression (6) above as follows:
indicate that the change in educational achievement is obtained by successively multiplying its three components. This, in turn, allows us to construct counterfactual distributions by sequentially introducing each of the factors
which isolates the effect on the distribution of changes due
change in educational achievement (gnrmi).
Analogously, for extending this sequential decomposition, we would proceed as follows:
(9)
We can consider this sequential decomposition in a different order. In such a case, the counterfactual educational achievement change attributable to best practice gap change would be:
which, in this case, isolates the effect on the distribution of best practice gap changes only, assuming EC does not contribute to the change in educational achievement (gnrmi). Then expression (9) would become:
(11)
We refer to the decomposition in both expressions (9) and (11) as the bipartite decomposition of the relative contributions to the changes in the distribution of educational performance.
Although the use of these counterfactual distributions is
education in general, their use is more frequent in other
contexts such as impact evaluation. In our case, we have
was based on combining the distribution dynamics
(deterministic) frontier production function literature (Färe
the model, in order to account for relevant issues in
economics such as the contributions of human capital
growth and convergence.
densities can be estimated via kernel smoothing, which
entails two unequally important decisions, the choice of
kernel and the choice of bandwidth (h), which tunes the
tend to smooth more, revealing fewer data particularities,
low values of h tend to smooth less, providing more detail
the kernel, we chose a popular alternative, the Gaussian
kernel. Although other choices are also possible (e.g.,
outcome is much lower than that of the bandwidth. In this case,
the available literature is lengthy, and we have
considering both a global bandwidth (the amount of smoothing
is the same at all data points) and a local bandwidth (the
amount of smoothing varies locally, depending on the
structure of the data at a given point). For the former, we
whereas for the local bandwidth estimator we followed
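A minimal sketch of the kernel smoothing step, assuming a Gaussian kernel with a single global bandwidth h supplied by the user (no bandwidth selection rule is implemented here). The EC and BPC values are invented for illustration, and treating the EC-only series as a counterfactual distribution is an assumption about how the sequential decomposition would be applied.

```python
import numpy as np


def gaussian_kde(sample, grid, h):
    """Gaussian kernel density estimate of `sample` evaluated on `grid`.

    h is a global bandwidth: the same amount of smoothing at every point.
    """
    sample = np.asarray(sample, dtype=float)
    grid = np.asarray(grid, dtype=float)
    u = (grid[:, None] - sample[None, :]) / h        # scaled distances
    kernel = np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)
    return kernel.sum(axis=1) / (sample.size * h)


# Hypothetical component values for six educational systems.
ec = np.array([1.02, 0.98, 1.05, 1.10, 0.95, 1.01])
bpc = np.array([0.99, 1.03, 0.97, 1.00, 1.04, 0.98])
gnrmi = ec * bpc

grid = np.linspace(0.5, 1.6, 1101)
h = 0.05  # larger h smooths more, smaller h shows more detail

dens_ec_only = gaussian_kde(ec, grid, h)   # counterfactual: only EC at work
dens_actual = gaussian_kde(gnrmi, grid, h)  # both components at work

# Riemann-sum check that the estimated density integrates to about one.
area = float(np.sum(dens_actual) * (grid[1] - grid[0]))
```

Comparing `dens_ec_only` with `dens_actual` on the same grid gives a visual sense of how much of the shape of the distribution of performance change is accounted for by one component alone.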
3 Data, inputs and outputs
This study considers information from the educational
systems of the 29 countries (21 OECD member countries
Programme for International Student Assessment (PISA) for
years 2003 and 2012. PISA has been operating since 2000
and assesses average results for 15-year-old students
between 7th and 12th grade. PISA seeks to determine the
extent to which students have acquired the competencies
knowledge society. To do so, every three years, the PISA
literacy in terms of general competencies, that is, how well students can apply the knowledge and skills they have
edition of this study. Around 510,000 students participated and 65 countries/economies took part. In PISA
As noted above, the methodology described in the preceding section is used to evaluate the change in the performance of educational systems in achieving educational goals. To do this we consider, in line with the previous
not only one that obtains high results (on average) in terms
ensure all its students make progress. To do this, it must also develop strategies that enable its most disadvantaged students to advance and reach standards. Therefore, an educational system that evolves satisfactorily will be one that can improve the average academic achievement of its students while at the same time minimizing the differences among them. Similarly, the change in the allocation of resources used will reveal whether these changes in achieving educational objectives (either positive or negative) are due to a technical change (due to an improvement
in the provision of the resources allocated for educational
In this line, therefore, our selection of variables considers
as good outputs the average academic achievement of students in each country in the mathematics test and in the reading test. Including these two subject areas eliminates any potential specialization bias by the participating countries in either of these subjects, and follows the logic of
is the comparability of the data used, since it comes from
p. 159) states that for PISA 2012 the decision was made to report the reading, mathematics and science scores on these previously developed scales. That is, the reading
4 Some excellent monographs on this issue are those by Silverman
(1986), Scott (1992), Li and Racine (2007) and, more recently,
Henderson and Parmeter (2015).
5 The international contractor in each country randomly selects schools for participation in PISA. At these schools, the test is given to students between the ages of 15 years 3 months and 16 years 2 months
at the time of the test, rather than to students in a specific year of school; this age represents the end of compulsory education in most participating countries. In general, each version of PISA considers a minimum of 150 schools per participant country/economy (or all the schools if there are fewer than 150 schools in that country/economy). Within each participating school, a sample of students, usually numbering 35, is selected with equal probability (all students take that test if there are fewer than 35 in the school and with a minimum of
20 students so as to guarantee the validity of the test within and among schools). In total, in each country a minimum of 4500 students are tested.
scales used for PISA 2000, PISA 2003, PISA 2006, PISA
2009 and PISA 2012 are directly comparable. The PISA 2012
mathematics reporting scale is directly comparable to PISA
2003, PISA 2006 and PISA 2009, and the science reporting
scale is directly comparable to the PISA 2006 and PISA
2009 scales. Therefore, the scores from the mathematics and
reading scales used in this study are directly comparable for
years 2003 and 2012. The same argument is also used in
PISA data they use.
As educational inequality (bad outputs) variables, the
study considers the average standard deviation of the results
the students in each country obtain in these two disciplines,
and also considers them separately. The concept of
However, despite public policies that prioritize the
importance of quality and equity in the provision of education,
efficiency in education systems.
An analysis of the data shows that for the sample
countries both disciplines evolved positively, on average,
between 2003 and 2012. This improvement is greater in
reading than in mathematics (7-point improvement vs 3
points, respectively), and occurs not only in terms of
with a 3-point standard deviation decrease in reading
and a 2-point standard deviation decrease in mathematics.
At the country level, the data show a high correlation in
the results for academic achievement between the two
disciplines for each of the two assessments (2003 and 2012),
0.894). In mathematics, the countries with the highest
points), Tunisia (29 points) and Mexico (28 points). By
countries with the greatest improvements were Japan (40
points), Hong Kong-China (35 points) and the Russian
Federation (33 points), whereas those with the sharpest
declines were Sweden (31 points), Uruguay (23 points) and
Finland (19 points).
Despite the high correlation in the results for both
is not necessarily coincidental. For example, in 2012 there
for Macao-China, Slovak Republic, Austria, South Korea,
Hong Kong, the Russian Federation and the Czech Republic. In
contrast, some countries (Ireland, Greece, USA, New
Zealand and Hungary) show a specialization in reading
in 2012.
Similarly, it appears that high levels of academic achievement can be obtained with low inequality levels. For example, in the case of mathematics in 2012, countries where this occurred include Finland, Canada, Ireland, Latvia and Spain, whereas in the case of reading, the countries were South Korea, Hong Kong, Ireland, Poland, Macao-China, USA, and the Czech Republic.
For the selection of input variables, we mainly chose those which have been most frequently used in empirical
international comparisons of educational systems performance, for which information was available for both years. In general, these studies consider:
involved in the teaching-learning process (e.g.
proxy the ratio of teachers per 100 students in secondary education.
buildings and grounds; heating/cooling and lighting systems; and instructional space (De Jorge and Santín
index of quality of physical infrastructure provided by PISA.
students and their families (e.g. Agasisti and Zoido
measure this, PISA created the index of economic, social and cultural status (ESCS). As reported on the
the following variables: the International Socio-Economic Index of Occupational Status (ISEI); the
converted into years of schooling; the PISA index of family wealth; the PISA index of home educational resources; and the PISA index of possessions related
The inputs were chosen at school level rather than at macroeconomic level (e.g. expenditure in secondary
impact of expenditures on the achievement of the students participating in the PISA sample.
The evolution between the two assessments (2003 and 2012) for these input variables for the sample average shows a disparate behavior. On the one hand, an improvement is seen in both the ratio of teachers per 100 students (average value increases from 6.48 to 7.61 teachers per