The Center for Urban Land Economics Research
The University of Wisconsin
975 University Avenue
Madison, WI 53706-1323
smalpezzi@bus.wisc.edu
http://wiscinfo.doit.wisc.edu/realestate
Stephen Malpezzi is Associate Professor and Wangard Faculty Scholar in the Department of Real Estate and Urban Land Economics, and an associate member of the Department of Urban and Regional Planning, of the University of Wisconsin-Madison. Wen-Kai (Kevin) Guo is a Ph.D. candidate in Food Science at the University of Wisconsin-Madison.
Comments on this and closely related work have been provided by Alain Bertaud, Michael Carliner, Mark Eppli, Richard Green, James Shilling, Kerry Vandell, Anthony Yezer and participants at the Homer Hoyt Institute/Weimer School's January 1999 and January 2000 sessions as well as the June 1999 Midyear meeting of the American Real Estate and Urban Economics Association. Comments and criticisms of this paper are welcome.
The research we describe was supported by the University of Wisconsin's Graduate School, the Wangard Faculty Scholarship, and the UW Center for Urban Land Economics Research. Opinions in this paper are those of the authors, and do not reflect the views of any of the above individuals, or of any institution.
Economists and other social scientists have tried to study urban form more or less rigorously for the better part of two centuries. The earliest commonly cited work is that by German scholars such as von Thunen (1826) and somewhat later work by Lösch (1944). In this century pioneering English-speaking scholars include Clark (1951), Hoyt (1939, 1966), and Burgess (1925). The 60s and 70s saw a further flowering with work such as Alonso (1964), Mills (1972), Muth (1969), and Wheaton (1977) among many others. Song (1996) provides a particularly nice discussion of some alternative measures. Excellent reviews and extensions of this large literature can be found in Anas, Arnott and Small (1998), Fujita (1989), and Turnbull (1995).
The economics of location has been a fertile academic field for some forty years, and is enjoying a resurgence due partly to the high profile of recent work on the economics of location by generalist economists such as Krugman (1991). But this academic resurgence is nothing compared to the eruption of interest in urban form by a wider range of political and social commentators. One watershed was certainly journalist Joel Garreau’s excellent popularization Edge City (1991), which broadened the audience for discussion of urban form. But the real explosion of interest in urban form has been due to the growing concern about, and exploding reference to, urban sprawl.
What environmental activists and now many others call sprawl is certainly not new to urban economists. The phenomenon of rapid growth on the periphery of the city is something that has been a core feature of most of the literature mentioned above, and the much larger literature that lies behind it. To give just one example, Edwin Mills’ classic 1972 book studies the decentralization of urban population in a large sample of U.S. cities going back to the latter part of the 19th century. Sam Bass Warner’s Streetcar Suburbs, coming out of a different scholarly tradition, is another well-known early examination of decentralization.
Perhaps the first dividing line between urban economists and many other urban observers is the use of the word sprawl, with its pejorative connotations. While a number of authors have used the term sprawl in the academic literature (see references below), most urban economists have preferred less value-laden terms, such as urban decentralization (Mills 1999) or accessibility (Song 1996). But economists have lost the lexicographic battle. To give just one example, we did a simple Internet search on the term "urban decentralization" using Infoseek.com. The search engine returned 27 hits. We repeated an identical search using the term "urban sprawl." The engine returned 5,946 hits.
With all the extraordinary attention paid to sprawl, it is quite interesting that only recently have some of those involved in the policy discussion attempted to define it. Consider the following quotation:
sprawl in all its forms is seldom satisfactorily defined. Urban sprawl is often discussed without an associated definition at all. Some writers make no attempt at all at definition, while others engage in little more than emotional rhetoric, as in "the great urban explosion (which) has scattered pieces of debris over the countryside for miles around the crumbling centre a destruction of the qualities of the city"
Despite its contemporary relevance, the quotation is from a paper by Robert Harvey and W.A.V. Clark, written 35 years ago.1
Our view is that discussions about sprawl, whether academic or policy oriented, are greatly hampered by loose definition and inadequate measurement. Our intention in this paper is to contribute toward improving the measurement of sprawl. Using a consistent data source, we compute some familiar measures such as average population density of the metropolitan area, and the familiar negative exponential density gradient. We also compute some less often used measures, such as those based on gravity models, and some which are fairly new, including measures based on order statistics, measures of fit of various models, and measures that incorporate the notion of autocorrelation.
We also investigate the use of data reduction techniques to collapse some of these disparate measures into a univariate index. We also evaluate how well each one of our measures incorporates the information contained in the others, i.e., to get some sense of "which measure is best." We also estimate simple models of the determinants of each of the measures, using right-hand side variables suggested by the urban economics literature as well as some of the sociological explanations for decentralization.2
While we investigate a large number of measures, we certainly do not exhaust all possible measures. For example, our data tell us little directly about how much space within a tract is devoted to one land use or another, or how much space is privately owned vs. publicly owned. These are important components of some people’s views of sprawl.
1 Harvey and Clark (1965). Harvey and Clark took the internal quotation from Pearson (1957).
2 The simple models are estimated with an eye towards facilitating comparisons and validating measures of urban form. They are best viewed as exploratory. See our companion paper, Malpezzi (2000), for a more detailed model, albeit with the estimation focused on a single measure.
Our explanatory models are exploratory and simple. We mention here and discuss again below the fact that more complete models would deal with the potential endogeneity of some of our right hand side variables, as well as utilize a more complete vector of determinants. For example, for tractability this paper abstracts from the effect of housing prices as an equilibrating mechanism, in contradistinction to Malpezzi and Kung (1998), which argues that housing price gradients and location may be jointly determined. We also limit ourselves to data from the 1990 Census, that is, our measures are developed from a single cross section. Measures that focus on tract level changes are only one class of many possible extensions to our measurement effort here.
Previous Research
The literature on urban form is huge. Our intention in this paper is to focus on measurement, so our review here is extremely selective. Readers interested in a broader review of the literature on sprawl are referred to Malpezzi (2000), Ewing (1997), and Gordon and Richardson (1997) among others. Those who wish a more detailed review of the academic approach to urban form should consult Anas, Arnott and Small (1998), as well as McDonald (1989), Wheaton (1979), and Ingram (1979).
Surely the simplest measure of sprawl, and one used any number of times by urban economists and others, is the average density of the metropolitan area. Brueckner and Fansler (1983) and Peiser (1989) are among well-known papers by urban economists that use this measure.
Far too many papers to cite focus on the negative exponential density gradient and its many derivatives and extensions. According to Greene and Barnbrock (1978), the first to use the negative exponential function was the German scholar Bleicher (1892).3 In many respects, the function was popularized by the work of Colin Clark in geography, and later by Edwin Mills, Richard Muth, and others in economics. Many authors have noted that the monocentric negative exponential is not always a terribly good fit for many metropolitan communities; see Richardson (1988), Kau and Lee (1976), and Kain and Apgar (1979) for examples. Here we note the following key points. (1) Without doubt the univariate negative exponential fits some cities reasonably well, and others quite badly. (2) Despite this, it is still often used, partly because of the advantages of having a univariate index of decentralization or sprawl (see, for example, Jordan, Ross and Usowski 1998). (3) It is apparently neglected in the literature that the measure of fit of such a simple univariate model is in its own way a measure of sprawl, as we will discuss below.
3 Although McDonald, in his excellent (1989) review, suggests that Stewart (1947) apparently first fit the negative exponential form described here. Most observers would
Most individual papers that measure decentralization, or 'sprawl,' focus on one measure or at most a few measures. There are exceptions, of course. Several papers have examined differences among functions theoretically and using simulation methods. Ingram (1971) examined average distances, negative exponential functions, and a linear reciprocal function, among others, and Guy (1983) examined a number of accessibility measures in a broadly similar fashion. Broadly speaking, these papers clarify differences among candidate functions, and tell us which might best capture stylized facts of observed patterns, but do not offer empirical tests per se. Some papers have tested a limited number of specific hypotheses, e.g., whether a single parameter exponential function performs as well as some flexible form (e.g., Kau and Lee).
In many respects Song (1996) is a paper that parallels this one, in that it uses actual data to test a wide range of alternative forms. Song estimates a variety of gravity, distance and exponential models using tract level data from Reno, Nevada. Best-fit criteria suggest that gravity measures and, especially, a negative exponential measure, perform much better than linear distance measures. As Song is careful to note, results from a single metro area are suggestive, but it remains to examine other forms, and especially to test forms across other metropolitan areas. Most analysts would admit the possibility of differing performance for a given estimator in, say, Los Angeles compared to Boston or Portland, for example. In our paper, we follow Song in examining a range of possible measures of urban form, but rather than focus on a single location, we examine a wide range of U.S. metropolitan areas.
More recently, Galster et al. (2000) have independently undertaken an exercise in some respects similar to ours. Galster and colleagues estimate a series of measures of urban form for a dozen large MSAs (in contrast to our measures, for some 300). Later, we will briefly compare our results to Galster et al.'s, and to Song's.
The Measurement of Urban Form
In this section we discuss some measures of sprawl and related measures of urban form.
First we introduce some notation. We use a capital P to indicate the population of a metropolitan area, and small p to indicate the population of a tract. Capital A denotes the area of the metropolitan area, and small a denotes the area of the tract. Capital D refers to the density of the metropolitan area, i.e., D=P/A, and small d, d=p/a, is the density of a tract. Distance from the city center is denoted by the letter u. Letters i and j index tracts within a metropolitan area, and k indexes metropolitan areas themselves. Generally we construct a measure for each metropolitan area, and for notational simplicity we usually drop the subscript k.
Average Density
The most common measure is average density in the metropolitan area:
(1) Average MSA density; for each MSA, D=P/A. In our database of MSA results, described below, this variable is denoted MSADENS.
While the measure is widely used, its limitations are obvious. Consider two different single-county MSAs of equal area and equal population. Suppose the first contains nearly all of its population in a city covering, say, a fourth of the area of the county, the rest of which is rural and lightly settled. Suppose the other MSA has a uniform population distribution. Our measure, average density, is the same. But most observers would consider the second MSA as exhibiting more "sprawl" than the first.
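The two-MSA comparison can be made concrete with a small sketch. The tract populations and areas below are invented for illustration only; `msa_density` and `tract_densities` are hypothetical helper names, not variables from our database.

```python
# Two hypothetical single-county MSAs with equal area and equal population.
# Each tract is (population, area in square miles); all numbers are invented.
concentrated = [(9000, 25.0), (400, 25.0), (300, 25.0), (300, 25.0)]
uniform = [(2500, 25.0)] * 4


def msa_density(tracts):
    """Average MSA density, D = P / A."""
    total_pop = sum(p for p, _ in tracts)
    total_area = sum(a for _, a in tracts)
    return total_pop / total_area


def tract_densities(tracts):
    """Tract densities, d = p / a."""
    return [p / a for p, a in tracts]


d_conc = msa_density(concentrated)
d_unif = msa_density(uniform)
print(d_conc, d_unif)                 # identical average densities
print(tract_densities(concentrated))  # very uneven tract densities
print(tract_densities(uniform))       # every tract at the MSA average
```

Both MSAs report the same D even though their tract-level densities differ sharply, which is the limitation described above.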
Of course there is no reason to limit ourselves to average densities. Other moments, and nonparametric measures, can also be considered, as below.
Alternative Density Moments
In this paper we construct several new indicators of population density gradients, based on the densities of the Census tracts in each MSA. The starting point for each MSA is to compute these tract densities, and then to sort tracts by descending density. We then construct several indicators of "sprawl", one for each MSA:
(3) Minimum tract density, DENMIN = min(di)
(4) DENMED: the density of the "median tract weighted by population," that is, median(di) when tracts are sorted by density: the density of the tract containing the median person in the MSA. For example, suppose the population of the MSA is 100 people, in 7 tracts:
The median person is "contained" in tract 2, so DENMED=9.
(5), (6): DENQ1 and DENQ3, the corresponding 1st and 3rd quartiles of tract density, constructed as above.
(7), (8): DENP10 and DENP90, the corresponding 10th and 90th percentiles of tract density, constructed as above.
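A minimal sketch of the population-weighted order statistics follows. The seven tracts are invented numbers chosen only to be consistent with the 100-person example in the text (median person in tract 2, density 9); they are not the original table. The function name `weighted_density_percentile` is hypothetical, and whether quartiles are counted from the densest or the least dense tract is an assumption here (we count from the densest, following the descending sort).

```python
def weighted_density_percentile(tracts, share):
    """Density of the tract containing the person at cumulative population
    share `share` (a fraction in (0, 1]), with tracts sorted by descending
    density. `tracts` is a list of (population, area) pairs."""
    ranked = sorted(tracts, key=lambda t: t[0] / t[1], reverse=True)
    total = sum(pop for pop, _ in ranked)
    cumulative = 0
    for pop, area in ranked:
        cumulative += pop
        if cumulative >= share * total:
            return pop / area
    raise ValueError("share must be in (0, 1]")


# Hypothetical MSA: 100 people in 7 tracts, given as (population, area).
example = [(30, 2.0), (27, 3.0), (16, 2.0), (12, 2.0),
           (8, 2.0), (5, 5.0), (2, 4.0)]

denmed = weighted_density_percentile(example, 0.50)
print(denmed)  # density of the tract holding the 50th person
```

The same function with shares of 0.25 and 0.75 gives quartile-style measures under the stated ordering assumption.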
Measures of Dispersion in Tract Densities
(9): DENCV, the coefficient of variation of tract densities.
(10) DENGINI, a Gini coefficient measuring variation in tract density, constructed as follows. Sort tracts within each MSA in descending density. Compute cumulative population in percent, CPP, and cumulative area in percent, CAP, as you move down tracts. Compute the difference between CPP and CAP for each tract. Then sum over all tracts within each MSA.
(11) Theil's information measure, an alternative to the Gini coefficient:
$$\mathrm{DENTHEIL} = \sum_i \frac{a_i}{A} \log\left(\frac{a_i/A}{p_i/P}\right)$$
where a_i is the area of the ith tract, A is the MSA area, p_i is the population of the ith tract, and P is the MSA population.
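The three dispersion measures can be sketched as follows. The Gini-style sum follows the verbal recipe above literally, with no further normalization. The Theil expression used here, the sum over tracts of (a/A) log[(a/A)/(p/P)], is an assumed form based on the variable definitions, and the choice of an unweighted population standard deviation for the coefficient of variation is likewise an assumption. The function name and data are hypothetical.

```python
import math
import statistics


def dispersion_measures(tracts):
    """Return (cv, gini_sum, theil) for a list of (population, area) tracts.
    All populations and areas must be positive."""
    total_pop = sum(p for p, _ in tracts)
    total_area = sum(a for _, a in tracts)
    densities = [p / a for p, a in tracts]

    # DENCV-style measure: standard deviation of tract densities over their mean.
    cv = statistics.pstdev(densities) / statistics.mean(densities)

    # DENGINI-style measure: sort by descending density, accumulate population
    # and area shares in percent, and sum the differences CPP - CAP.
    ranked = sorted(tracts, key=lambda t: t[0] / t[1], reverse=True)
    cpp = cap = gini_sum = 0.0
    for pop, area in ranked:
        cpp += 100.0 * pop / total_pop
        cap += 100.0 * area / total_area
        gini_sum += cpp - cap

    # Assumed Theil form: sum of (a/A) * log((a/A) / (p/P)).
    theil = sum((a / total_area) * math.log((a / total_area) / (p / total_pop))
                for p, a in tracts)
    return cv, gini_sum, theil


even = [(100, 10.0)] * 4             # identical tracts: no dispersion
uneven = [(900, 10.0), (100, 30.0)]  # most people on a quarter of the land
print(dispersion_measures(even))
print(dispersion_measures(uneven))
```

All three measures are zero when every tract has the same density and rise as population concentrates on a small share of the land.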
Population Density Gradients
The measure of city form that has been most often studied by urban economists is the population density gradient from a negative exponential function, often associated with the pioneering work of Alonso, Muth and Mills, but as noted earlier first popularized among urban scholars by the geographer Colin Clark. More specifically, the population density of a city is hypothesized to follow
$$d(u) = d_0 e^{-\gamma u}$$
where d(u) is population density at distance u from the center, d_0 is density at the center, and γ is the gradient.
Among the other attractive properties of this measure, density is characterized by two parameters, with a particular emphasis on γ, which simplifies second stage analysis. The function is easily estimable with OLS regression by taking logs:
ln d(u) = ln d0 - γu + ε
which can then be readily estimated with, say, density data from Census tracts, once distance of each tract from the central business district (CBD) is measured. Thus, we construct measures
(12) our gradient, gamma (denoted KMB1_1 in the database), and
(13) density at the center, d0 (denoted INTB_1).
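As an illustration, the log-linear regression can be sketched with ordinary least squares on synthetic data. The function name `fit_density_gradient` and the synthetic tracts are hypothetical; the sketch returns d0 itself, whereas the regression intercept is ln d0.

```python
import math


def fit_density_gradient(distances, densities):
    """OLS of ln(density) on distance from the CBD.
    Returns (d0, gamma, r_squared), where ln d = ln d0 - gamma * u."""
    y = [math.log(d) for d in densities]
    n = len(y)
    mean_u = sum(distances) / n
    mean_y = sum(y) / n
    sxx = sum((u - mean_u) ** 2 for u in distances)
    sxy = sum((u - mean_u) * (v - mean_y) for u, v in zip(distances, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_u
    ss_res = sum((v - intercept - slope * u) ** 2 for u, v in zip(distances, y))
    ss_tot = sum((v - mean_y) ** 2 for v in y)
    return math.exp(intercept), -slope, 1.0 - ss_res / ss_tot


# Synthetic tracts lying exactly on d(u) = 5000 * exp(-0.2 u).
distances = [float(k) for k in range(1, 11)]
densities = [5000.0 * math.exp(-0.2 * u) for u in distances]
d0, gamma, rsq = fit_density_gradient(distances, densities)
print(d0, gamma, rsq)
```

With real tract data the fit will of course be imperfect, and the R-squared returned here is the fit statistic discussed later as a measure in its own right.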
The exponential density function is sufficiently important to warrant brief discussion. This particular form has the virtue of being derivable from a simple model of a city, albeit one with several restrictive assumptions, e.g., a monocentric city, constant returns Cobb-Douglas production functions for housing, consumers with identical tastes and incomes, and unit price elasticity of demand for housing.
As is well known, the standard urban model of Alonso, Muth and Mills predicts that population density gradients will fall in absolute value as incomes rise, the city grows, and transport costs fall. Extensions to the model permit gradients to change with location-specific amenities as well (Follain and Malpezzi 1981).
The negative exponential function often fits the data rather well, for such a simple function in a world of complex cities. Sometimes it does not fit well, as we will confirm. Many authors have experimented with more flexible forms, such as power terms in distance on the right hand side (of which more below).
The world is divided up into two kinds of people: those who find the simple form informative and useful, despite its shortcomings (e.g., Muth 1985), and those who believe these shortcomings too serious to set aside (e.g., Richardson 1988).4 In fact, given the predicted flattening of population density gradients as cities grow and economies develop, it can be argued that the monocentric model on which it rests contains the seeds of its own destruction; and that a gradual deterioration of the fit of the model is itself consistent with the underlying model.
Extensions of the Simple Exponential Gradient
As already noted, the simplest and most widely used model for estimation is:
ln d = a + b u
where d is the tract's density, and u is distance from the center. We are relying on this simple model for our second stage work, but we have also computed three additional models, with right hand side variables:
(14) A quadratic model, i.e., with terms u and u^2
(15) A cubic model, u, u^2, and u^3
(16) A fourth power model, u, u^2, u^3 and u^4
In our database of results, these coefficients are represented by variables KMBa_b, where a represents the order of the model, and b represents which term. For example, KMB3_1 is the coefficient of linear distance in the cubic model. The intercepts from these models are denoted INTB_a, where a is again the order of the model.
Measuring Discontiguity
A simple and natural measure of discontiguity is:
(17) The R^2 statistic from the univariate density gradient regressions, denoted RSQ_1.
Consider the two panels of Figure 1. Panel A shows a very highly stylized city with a given density gradient, as does Panel B. In Panel A, we have drawn a pattern consistent with very contiguous density patterns as one moves from the center of the city outwards. The second panel shows a city with the same density gradient, but a much more discontiguous pattern. The R-squared from the density gradient regressions is a natural measure of this discontiguity. However, it should be noted that a low R-squared is a necessary but not sufficient condition for such discontiguity.
4 The world is also divided up into people who divide the world into two kinds of people, and people who don't, but that's another paper.
To see this, consider a city where the density gradient is very contiguous from tract to tract, but assume that the gradient varies by direction as well as distance from the CBD. An example would be a metro area in which the gradient declines very rapidly with distance in one direction, but very slowly in another. Suppose this difference is very systematic, and density changes slowly as one rotates from left to right around the central point of the city. Such a city would not be truly discontiguous by most people’s thinking, but would have a low R-squared for a simple two-parameter density regression of the usual kind, where it is maintained that density varies with distance but not direction. Of course it would be possible to estimate distance density gradients that vary by direction as well as distance (see Follain and Gross 1983), but undertaking such an exercise would require resources beyond our present ones.
(18) The difference between the R^2 statistic from the univariate density gradient regressions and the R^2 statistic from the fourth power density gradient regressions, DRSQ1_4. This is another variation on the preceding theme: if the univariate model is a good one, then adding successive power terms adds little to the explanatory power of the regression. If, on the other hand, the improvement in fit is large, the simple model is less satisfactory.
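The two fit-based measures can be sketched by fitting polynomials in distance to log density and comparing R-squared values. This is a minimal sketch using normal equations and Gaussian elimination rather than a statistical package. The helper names and the synthetic data are hypothetical, and the sign convention (fourth-power R-squared minus univariate R-squared) is an assumption.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def poly_r_squared(u, log_d, order):
    """R-squared from OLS of ln(density) on u, u^2, ..., u^order."""
    rows = [[ui ** k for k in range(order + 1)] for ui in u]
    size = order + 1
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(size)]
           for i in range(size)]
    xty = [sum(r[i] * y for r, y in zip(rows, log_d)) for i in range(size)]
    beta = solve(xtx, xty)
    fitted = [sum(b * x for b, x in zip(beta, r)) for r in rows]
    mean_y = sum(log_d) / len(log_d)
    ss_res = sum((y - f) ** 2 for y, f in zip(log_d, fitted))
    ss_tot = sum((y - mean_y) ** 2 for y in log_d)
    return 1.0 - ss_res / ss_tot


# Synthetic log densities with curvature that a straight line cannot capture.
distances = [0.1 * k for k in range(1, 21)]
log_density = [8.0 - 1.5 * u + 0.4 * u ** 2 for u in distances]

rsq_1 = poly_r_squared(distances, log_density, 1)  # analogue of RSQ_1
rsq_4 = poly_r_squared(distances, log_density, 4)
drsq1_4 = rsq_4 - rsq_1                            # analogue of DRSQ1_4
print(rsq_1, rsq_4, drsq1_4)
```

Normal equations become ill-conditioned when distances are large, so rescaling distance before fitting higher-order terms would be prudent with real data.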
Measures of Spatial Autocorrelation
The R-squared measure from the negative exponential regression will capture some of this, but we can construct examples where, for example, spatial autocorrelation is high but the R-squared is low. Consider a case where spatial autocorrelation is high and positive (i.e., very little sprawl on this element) but where density varies tremendously by radial location (direction north or south, for example). Since our simple negative exponential model imposes that density is a function of distance but not direction, we would find a low R-squared even though the spatial autocorrelation in such a city could be high. A more generalized measure of such autocorrelation is therefore desirable.
(19) Moran’s I (denoted MORAN_I). One commonly used measure is Moran’s I, which is effectively a correlation coefficient, constructed using a weighting matrix where weights depend upon location. More specifically, the formula as usually written is:
$$I = \frac{n}{\sum_i \sum_j c_{ij}} \cdot \frac{\sum_i \sum_j c_{ij} (d_i - D)(d_j - D)}{\sum_i (d_i - D)^2}$$
where n is the number of tracts, and C is an n by n matrix, with elements c_ij, that incorporates the information of which tracts are contiguous. Specifically, each row represents a tract, and contiguous tracts have ones entered in their corresponding columns. Other elements of C are zero.
One practical difficulty in computing Moran’s I for a large number of places (tracts and MSAs) is to develop an algorithm to compute the matrix C. In this version of our paper we use an algorithm from Isaaks and Srivastava (1990) which uses a quadratic approximation. That is, when distances are small, the elements of C are approximately 1, but because the elements are quadratic in distance, they rapidly approach 0 as distances get larger.
Other measures of spatial autocorrelation are possible, of course. Moran's I is isotropic, i.e., it depends on distance from the tract in question, but not direction. To the extent that direction as well as distance does matter, an anisotropic measure that accounts for direction would be a natural extension. Dubin, Pace and Thibodeau (1998) and Gillen, Thibodeau and Wachter (1999) discuss the use of such anisotropic indexes for single metropolitan areas. Computational difficulties have so far kept us from producing such an index for our full set of metropolitan areas, but given resources, this would be an appropriate extension for future work.
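A minimal sketch of Moran's I in its textbook form follows. It uses a simple binary contiguity matrix and deviations from the mean of the tract values; both are assumptions for illustration, and the sketch does not implement the quadratic distance-based weights described above. The four-tract "city" is invented.

```python
def morans_i(values, weights):
    """Textbook Moran's I for a list of values and an n-by-n weight matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    weight_sum = sum(sum(row) for row in weights)
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    return (n / weight_sum) * (cross / sum(d * d for d in dev))


# Four hypothetical tracts in a row; each is contiguous with its neighbors.
contiguity = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

clustered = morans_i([10, 9, 2, 1], contiguity)    # similar densities adjoin
alternating = morans_i([10, 1, 9, 2], contiguity)  # dense and sparse alternate
print(clustered, alternating)
```

The clustered arrangement yields a positive statistic and the alternating, leapfrog-like arrangement a negative one, even though both use nearly the same set of tract densities.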
Compactness
In Bertaud and Malpezzi (1999), Alain Bertaud developed a compactness index, rho, which is the ratio between the average distance per person to the CBD, and the average distance to the center of gravity of a cylindrical city whose circular base would be equal to the built-up area, and whose height will be the average population density:
$$\rho = \frac{\sum_i d_i w_i}{C}$$
where rho is the index, d is the distance of the ith tract from the CBD, weighted by the tract's share of the city's population, w; and C is the similar, hypothetical calculation for a cylindrical city of equivalent population and built up area. A city of area X for which the average distance per person to the CBD is equal to the average distance to the central axis of a cylinder whose base is equal to X would have a compactness index of 1.
In this paper we use the simpler weighted average of distances from one set of points in a metropolitan area to another to compute two distance measures. Distances are corrected for the earth's curvature.
(20) DCENTAVG is the weighted average distance to the center, where tract populations are the weights.
(21) DCENTMED is the weighted median distance to the center, where tract populations are the weights.
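The two distance measures can be sketched as below. The haversine great-circle formula is used here as one common way to correct for the earth's curvature; that choice, the earth radius constant, the function names, and the coordinates are all assumptions for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean earth radius, an approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    h = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def distance_measures(tracts, center):
    """Population-weighted mean and median distance to the center.
    tracts: list of (population, lat, lon); center: (lat, lon)."""
    pairs = sorted((haversine_km(lat, lon, center[0], center[1]), pop)
                   for pop, lat, lon in tracts)
    total = sum(pop for _, pop in pairs)
    weighted_mean = sum(dist * pop for dist, pop in pairs) / total
    weighted_median = pairs[-1][0]
    cumulative = 0
    for dist, pop in pairs:
        cumulative += pop
        if cumulative >= total / 2:
            weighted_median = dist
            break
    return weighted_mean, weighted_median


# Hypothetical MSA: 300 people at the center, 100 people one degree north.
center = (43.0, -89.0)
tracts = [(300, 43.0, -89.0), (100, 44.0, -89.0)]
dcentavg, dcentmed = distance_measures(tracts, center)
print(dcentavg, dcentmed)
```

The weighted median is insensitive to a small outlying population, while the weighted mean is pulled outward by it, which is one reason to report both.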
Of course, in the modern city, many if not most employment and shopping destinations are no longer in the city. Gravity based measures are conceptually similar measures that are less CBD-focused.
Gravity Based Measures
Gravity measures were popularized by Lowry (1964) among others. Song (1996) presents several such measures. The general form of a gravity model takes its form from Newton’s Law of Gravitation. However, several variants can be obtained by various choices of the power terms and so on involved. Song discusses and estimates a wide range of these for a single metropolitan area. Given the difficulty of estimating and comparing gravity measures across metropolitan areas if one permitted the exponents to be chosen by the data, we prefer to pick two common and simple assumptions, one where terms are linear and the other exponential.
(22) The linear gravity function, GRAVLIN, is the weighted average distance from the center of each tract to every other tract, in turn in fact the same as above:
(23) The exponential gravity function, GRAVEXP, can be written:
Combining Measures
Given the multidimensionality of sprawl, sprawl is a natural candidate for data reduction techniques. We used the well-known method of principal components.
Principal components analysis derives a vector p = Za', where Z is a matrix whose columns consist of n observations on K variables and a' is a K-element vector of weights (an eigenvector). We choose the a's so that the variance of p is maximized subject to the normalization condition that a1 + a2 + ... + aK = 1. Given inevitable collinearity, we choose 12 of our 22 spatial measures as elements of Z:
INTB_1, the intercept from the simple univariate exponential model;
KMB1_1, the coefficient of distance in the simple exponential model;
RSQ_1, the R-squared from the simple exponential model;
DRSQ1_4, the improvement in R-squared from the univariate exponential model to the fourth power model;
DENCV, the coefficient of variation of tract densities;
DENMED, the density of the median tract, when tracts are ordered by density;
DENP90, the density of the 90th percentile tract, ordered by density;
DENGINI, the Gini coefficient of tract densities;
MSADENS, the average density of the entire MSA;
DCENTMED, the weighted median distance to the center of the MSA;
GRAVLIN, the linear gravity measure;
GRAVEXP, the exponential gravity measure.
We extract three principal component measures from these data, and label them PC1, PC2 and PC3.
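To illustrate the data reduction step, the sketch below extracts a first principal component from standardized measures by power iteration on their correlation matrix. It is not the procedure used to produce PC1 to PC3; standardizing the inputs, normalizing the weights to unit length, and the tiny three-measure dataset are all assumptions, and the function name is hypothetical.

```python
import math


def first_principal_component(rows, iterations=500):
    """rows: list of observations, each a list of K measures.
    Returns (weights, scores) for the first principal component of the
    standardized data, with weights normalized to unit length."""
    n, k = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(k)]
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
           for j in range(k)]
    z = [[(r[j] - means[j]) / sds[j] for j in range(k)] for r in rows]
    corr = [[sum(zr[i] * zr[j] for zr in z) / (n - 1) for j in range(k)]
            for i in range(k)]
    # Power iteration converges to the eigenvector with the largest eigenvalue.
    w = [1.0 / math.sqrt(k)] * k
    for _ in range(iterations):
        nxt = [sum(corr[i][j] * w[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(v * v for v in nxt))
        w = [v / norm for v in nxt]
    scores = [sum(wj * zj for wj, zj in zip(w, zr)) for zr in z]
    return w, scores


# Five hypothetical MSAs, three invented measures: the first two move
# together and the third moves in the opposite direction.
data = [
    [1.0, 2.1, 5.0],
    [2.0, 3.9, 3.0],
    [3.0, 6.2, 4.0],
    [4.0, 8.1, 1.0],
    [5.0, 9.8, 2.0],
]
weights, scores = first_principal_component(data)
print(weights)
print(scores)
```

Further components could be obtained by deflating the correlation matrix and repeating the iteration; in practice a standard statistical package would be the natural tool.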
Other Possible Measures
We have seen that in addition to the traditional gradient measure, many measures of urban form have been put forward and studied. The simplest, of course, is the average density of the city or metropolitan area. We have proposed a fairly large set of other measures to include here, but we have not exhausted the possibilities. Others include measures such as functions based on densities other than the negative exponential, such as the normal density (Ingram 1971; Pirie 1979; Allen et al. 1993).
Many additional measures could be developed using techniques developed by urban geographers and others for the analysis of data exhibiting spatial autocorrelation. Moran's I, which we have computed, is one such measure but there are others. Anselin and Florax (1995), Pace and Gillen (forthcoming) and Pace, Barry and Sirmans (1998) describe these techniques in greater detail.
A few papers have examined land use conditions on the fringe as opposed to the metropolitan area as a whole (see Brown, Phillips and Roberts (1981)).
Also, we note that the American Housing Survey has data on land area for single-family houses. To our knowledge, no one has used this data in the analysis of sprawl. For example, median lot size of single-family homes built in the last five years would be one possible indicator that could be constructed. We did undertake some preliminary work with this data. Problems arose from the fact that such data are only reported for single-family units. But the biggest problem with this potential measure is that preliminary analysis of AHS data tells us that there are many missing observations, and further (and more worryingly) missing plot area is correlated with other housing characteristics and income, suggesting potential biases in measures created from this dataset. Further work along these lines remains for future research.
The Determinants of Urban Form and "Sprawl"
In order to evaluate different measures, we intend to estimate simple least squares models of the determinants of sprawl. What do theory and previous research suggest those determinants might be?
The well-known "standard urban model" of Alonso (1964), Muth (1969) and Mills (1972) postulates a representative consumer who maximizes utility, a function of housing (H) and a unit priced numeraire nonhousing good, subject to a budget constraint that explicitly includes commuting costs as well as the prices of housing (P) and nonhousing (1). It is easy to show that equilibrium requires that change in commuting costs from a movement towards or away from a CBD or other employment node equals the change in rent from such a movement. For such a representative consumer:
$$\Delta P(u)\, H(u) = -t\, \Delta u$$
where u is distance from the CBD and t is the cost of transport. This equilibrium condition can be rearranged to show the shape of the housing price function:
$$\frac{\Delta P(u)}{\Delta u} = -\frac{t}{H(u)}$$
Now consider two consumers, one rich and one poor. Assume H is a normal good. If (for the moment) t is the same for both consumers but H is bigger for the rich (at every u), the rich bid rent function will be flatter. The rich will live in the suburbs and the poor in the center. Even if t also increases with income (as is more realistic), as long as increases in H are "large" relative to increases in t, this result holds.5 Also, as incomes rise generally, the envelope of all such bid rents will flatten. Also, clearly, as transport costs fall, bid rents will flatten.
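A small numerical sketch of the two-consumer comparison follows, assuming the equilibrium condition takes the usual form in which the slope of the housing price function equals minus transport cost divided by housing consumption. The function name and all numbers are hypothetical.

```python
def bid_rent_slope(transport_cost, housing):
    """Slope of the housing price function with respect to distance,
    assumed to equal -t / H(u)."""
    return -transport_cost / housing


# Same transport cost, but the rich household consumes twice the housing.
poor = bid_rent_slope(transport_cost=1.0, housing=1.0)
rich = bid_rent_slope(transport_cost=1.0, housing=2.0)

# Transport cost also rises with income, but less than housing does.
rich_costly_time = bid_rent_slope(transport_cost=1.5, housing=2.0)

print(poor, rich, rich_costly_time)
```

In both cases the rich household's bid rent is flatter than the poor household's, so the rich outbid the poor at distant locations; the ordering would reverse if transport cost rose proportionally more than housing consumption.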
The standard urban model has a competitor, which is sometimes called the "Blight Flight" model (Follain and Malpezzi 1981). As presented in the U.S. literature, the Blight Flight Model has a negative tone. People have left the cities not because they preferred suburban living a la the standard model, but because the cities themselves have become less desirable places to live. As U.S. cities became more and more the habitat of low-income households and black households, the argument goes, housing and neighborhood quality declined and white middle-to-upper income households fled to the suburbs.
While "Blight Flight" explanations focus on negative amenities such as crime and fiscal stress, the models are easily generalizable to positive amenities such as high quality schools. Blight Flight can be generalized and formalized by adding a vector of localized amenities (and disamenities) to the standard urban model above. See, for example, Li and Brown (1980), Diamond and Tolley (1982), and many subsequent applications.
Of course other causes can be considered, for example the degree of monocentricity, opportunity cost of land in non-urban uses, and the industrial structure of a city (manufacturing implies different land use patterns than office work, for example). Mills (1999) has a nice discussion of these.
Data
5 But see Wheaton (1977) and Glaeser et al. (2000) for dissenting views.